Introduction
Cognitive aids or checklists are tools used by anesthesiologists and other team members to facilitate a timely and appropriate response to an emergency during surgery [1–3]. Many previous publications have supported the utility of cognitive aids, especially during rare events [4–7]. Malignant hyperthermia (MH) is a rare but potentially deadly crisis that may occur in the perioperative period when susceptible patients are exposed to volatile anesthetics or succinylcholine [8]. Fortunately, MH has a known treatment plan, and several cognitive aids detailing the steps necessary for a proper emergency response are currently available [3–5, 9].
The methodology used to develop cognitive aids is not standardized. A systematic review in 2013 by Marshall revealed that only a single article out of 22 articles published on cognitive aids described “an iterative method that is the standard for other medical devices,” and only 11 of the 22 cognitive aid articles included any description of the design process [2]. Studies of cognitive aids have focused primarily on implementation and before/after assessments of performance during simulated crises but have rarely assessed the effectiveness of the cognitive aid design [2] or speed of the user’s response [3].
Eye tracking technology has been used to a limited extent in anesthesiology [10, 11]. The ability to track participants’ voluntary and involuntary eye movements, quantify gaze fixations within an area of interest (AOI), and measure speed of response may be particularly applicable to assessing and improving the effectiveness of cognitive aid designs. The primary aim of this eye tracking study is to compare the accessibility of five MH cognitive aids by measuring participants’ time to answer three standardized questions.
Materials and Methods
This study was approved by the Institutional Review Board and Veterans Affairs research committee, and written informed consent was obtained from all participants.
Participants
From July to September of 2017, we recruited a convenience sample of attending anesthesiologists from a single university-affiliated Veterans Affairs hospital to voluntarily participate in the study. Inclusion criteria were as follows: 1) having hospital privileges in good standing to practice anesthesiology; 2) active medical license; and 3) board certification in anesthesiology. Recruits were excluded if the eye tracking system could not be successfully calibrated [10]. No participants received remuneration.
Baseline data
All participants completed a brief survey to establish basic demographic information, assess years of clinical anesthesiology experience and years working at the study institution, and report residency training site. All participant data and performance analyses were anonymized and kept confidential.
Experimental setting
As previously described [10], each participant was seated in a private office in front of a 127-centimeter plasma, high-definition (1080p) television screen (Panasonic, Japan). The distance between the seated participant and the television screen was fixed for all participants at 142 centimeters. Lighting was constant for all participants with half of the overhead lights in the office turned off to minimize glare and the other half turned on during data collection; calibration and subsequent measurement using the eye tracking system require some amount of ambient light.
Eye tracking system calibration
All participants were fitted with Tobii Glasses 2 eye tracking glasses (Tobii, Sweden). Eye tracking glasses utilize corneal reflection to determine the focus of the subject’s gaze [12]. Calibration required each participant to focus on a card with a black and white target (included with the system) held at arm’s length while wearing the glasses and running the calibration function on the eye tracking software (Tobii Pro Glasses Controller). If the initial attempt at calibration was unsuccessful, nose bridges of different sizes (included with the Tobii eye tracking system) were tested sequentially until the individual successfully achieved calibration; otherwise, the participant was excluded from further study procedures.
Examination
After calibration, participants were given scripted instructions. They were shown a series of slides (PowerPoint, Microsoft Office, USA), each of which displayed one MH cognitive aid (Supplemental Content). Three questions were asked about each of the five cognitive aids based on content common to all cognitive aids: 1) “Please read the section that includes the dosing regimen for the rescue medication”; 2) “Please read the section that includes treatment for hyperkalemia”; and 3) “Please read the hotline number for reporting”. Questions and answers were developed and iteratively tested by five investigators who were not study recruits. The answer to each question was always a specific location on the cognitive aid, defined as the AOI. The five MH cognitive aids selected were freely available [1] and previously published by the following groups: 1) The Association of Anaesthetists of Great Britain and Ireland (AAGBI) [13, 14]; 2) Ariadne Labs [5]; 3) the Malignant Hyperthermia Association of the United States (MHAUS) [9]; 4) the Society for Pediatric Anesthesia (SPA) [3]; and 5) Stanford Anesthesia Cognitive Aid Group [4]. The sequence of slides (cognitive aids and questions) was ordered randomly using an online tool (http://www.randomizer.org).
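The random ordering of slides described above can be illustrated with a short sketch. The study used an online tool; the Python code below is only a hypothetical equivalent, and the function name `randomized_sequence`, the question labels, and the use of a seed are assumptions introduced for illustration. Whether the order was drawn once or separately for each participant is not stated here, so the seed is left as a parameter.

```python
import random

# Labels are illustrative stand-ins for the five aids and three questions.
AIDS = ["AAGBI", "Ariadne", "MHAUS", "SPA", "Stanford"]
QUESTIONS = ["dantrolene dosing", "hyperkalemia treatment", "hotline number"]


def randomized_sequence(seed):
    """Return all 15 (aid, question) slides in a reproducible random order."""
    rng = random.Random(seed)  # a fixed seed makes the order reproducible
    slides = [(aid, question) for aid in AIDS for question in QUESTIONS]
    rng.shuffle(slides)        # in-place uniform shuffle
    return slides


if __name__ == "__main__":
    for position, (aid, question) in enumerate(randomized_sequence(seed=2017), 1):
        print(position, aid, question)
```

Recording the seed alongside the data would allow the presentation order to be reconstructed later.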
Outcomes
Tobii Glasses 2 was used to record gaze data and produce eye tracking recordings for all participants. Tobii Pro Lab was used to designate the AOI, calculate metrics, record viewing time, and produce image maps.
Fig. 1 demonstrates the heat map feature of the eye tracking system. The heat map combines the number of gaze fixations and fixation duration for all participants. Recordings were reviewed and annotated to measure the time required for participants to locate objects on the cognitive aid to provide an answer; the cumulative time to answer (measured in seconds) was the primary outcome. Secondary outcomes included cumulative time to first fixation (seconds) using the I-VT fixation filter, time to answer each question individually, and time to first fixation for each question individually.
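To make the time-to-first-fixation metric concrete, the following is a minimal sketch of a velocity-threshold fixation classifier followed by an AOI lookup. It is not the Tobii Pro Lab implementation: the function names, the 30 degrees/second threshold, the 60 ms minimum duration, and the rectangular AOI are all assumptions chosen for illustration, and preprocessing steps that commercial filters typically apply (such as noise reduction or merging of adjacent fixations) are omitted.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Fixation:
    start: float  # seconds
    end: float    # seconds
    x: float      # mean horizontal gaze position (degrees)
    y: float      # mean vertical gaze position (degrees)


def ivt_fixations(samples, velocity_threshold=30.0, min_duration=0.06):
    """Group gaze samples (t, x, y) into fixations by velocity threshold.

    The threshold (deg/s) and minimum duration (s) are assumed values.
    """
    fixations, group = [], []

    def flush():
        # Keep the current run of slow samples only if it lasted long enough.
        if group and group[-1][0] - group[0][0] >= min_duration:
            n = len(group)
            fixations.append(Fixation(group[0][0], group[-1][0],
                                      sum(s[1] for s in group) / n,
                                      sum(s[2] for s in group) / n))
        group.clear()

    for prev, cur in zip(samples, samples[1:]):
        speed = hypot(cur[1] - prev[1], cur[2] - prev[2]) / (cur[0] - prev[0])
        if speed < velocity_threshold:
            if not group:
                group.append(prev)
            group.append(cur)
        else:
            flush()  # a fast (saccadic) movement ends the current fixation
    flush()
    return fixations


def time_to_first_fixation(fixations, aoi, onset=0.0):
    """Seconds from stimulus onset to the first fixation inside the AOI.

    aoi is (x_min, y_min, x_max, y_max); returns None if never fixated.
    """
    x_min, y_min, x_max, y_max = aoi
    for f in fixations:
        if f.start >= onset and x_min <= f.x <= x_max and y_min <= f.y <= y_max:
            return f.start - onset
    return None
```

Summing the per-question values for one cognitive aid would give a cumulative time to first fixation of the kind reported as a secondary outcome.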
Statistical analysis
Statistical analyses were performed using NCSS-PASS software (USA). Descriptive statistics were computed, and normality was assessed using the Kolmogorov-Smirnov test. Cumulative time to answer and total time to first fixation were plotted on Kaplan-Meier survival curves and analyzed using the log-rank test. Pairwise comparisons were conducted using log-rank tests with reporting of Mantel-Haenszel probability levels. Since data were not normally distributed, we analyzed years of practice and years at the study institution against cumulative time to answer and cumulative time to first fixation using Spearman’s correlation coefficient. All P values were two-sided, and a P value < 0.05 was considered statistically significant.
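For readers unfamiliar with survival methods applied to response times, the sketch below shows how a Kaplan-Meier curve and a two-group log-rank test can be computed. The analyses in this study were run in NCSS-PASS; this pure-Python version is only an illustration, the function names are hypothetical, and every observation is assumed to be an observed event unless an `events` list says otherwise.

```python
import math


def kaplan_meier(times, events=None):
    """Return [(t, S(t))] at each distinct event time (1 = event, 0 = censored)."""
    events = events if events is not None else [1] * len(times)
    data = sorted(zip(times, events))
    at_risk, survival, curve, i = len(data), 1.0, [], 0
    while i < len(data):
        t, deaths, leaving = data[i][0], 0, 0
        while i < len(data) and data[i][0] == t:  # handle tied times together
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def logrank_test(times_a, times_b, events_a=None, events_b=None):
    """Two-group log-rank test; returns (chi-square with 1 df, two-sided P)."""
    a = list(zip(times_a, events_a or [1] * len(times_a)))
    b = list(zip(times_b, events_b or [1] * len(times_b)))
    observed_minus_expected, variance = 0.0, 0.0
    for t in sorted({ti for ti, e in a + b if e}):
        n_a = sum(1 for ti, _ in a if ti >= t)   # at risk in group A
        n_b = sum(1 for ti, _ in b if ti >= t)   # at risk in group B
        d_a = sum(e for ti, e in a if ti == t)   # events in group A at t
        d = d_a + sum(e for ti, e in b if ti == t)
        n = n_a + n_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:  # hypergeometric variance of the event count in group A
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))
```

Here the "event" is the participant giving the answer (or first fixating the AOI), so a curve that drops sooner corresponds to a cognitive aid on which information was found faster.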
Results
Twelve participants were recruited and provided written consent. Participants were 50% male (6 of 12) with (mean [10th–90th percentiles]) years in practice of 10 (3–27) and years at the current institution of 6 (2–18); 7 of 12 completed anesthesiology residency at Stanford University (58%). All participants successfully achieved calibration and completed all study procedures.
Primary outcome
For the primary outcome, the survival curves for cumulative time to answer differed among the five cognitive aids (P < 0.001). Participants demonstrated the shortest cumulative time to answer when viewing the SPA cognitive aid compared to the four others (P < 0.001 for SPA vs. Ariadne, SPA vs. MHAUS, SPA vs. Stanford, and SPA vs. AAGBI; Fig. 2).
Secondary outcomes
Survival analyses for time to answer individual questions showed the following: 1) for dantrolene dosing, the overall distribution of curves differed (P = 0.028); Stanford was longer than MHAUS (P = 0.001) and SPA (P = 0.006); 2) for hyperkalemia, the overall distribution of curves differed (P = 0.002) with SPA being shorter than MHAUS (P < 0.001), Stanford (P = 0.032), and AAGBI (P = 0.025), and Ariadne being shorter than MHAUS (P = 0.004); and 3) for the hotline number, the overall distribution of curves differed (P < 0.001); SPA was shorter than all four of the other cognitive aids (P < 0.001) in every pairwise comparison.
Survival curves for cumulative time to first fixation within the AOI (Fig. 2) also showed differences (P < 0.001) with pairwise differences between SPA and Ariadne (P = 0.003), SPA and MHAUS (P = 0.002), as well as between SPA and Stanford (P < 0.001). Survival analyses of time to first fixation for individual questions showed the following: 1) for dantrolene dosing (overall distribution P < 0.001), SPA was shorter than Ariadne (P = 0.006) and Stanford (P < 0.001), and MHAUS was shorter than Stanford (P = 0.002); 2) for hyperkalemia (overall distribution P = 0.006), AAGBI was shorter than MHAUS (P < 0.001) and Stanford (P = 0.003); and 3) for the hotline number (overall distribution P < 0.001), SPA was shorter than Ariadne (P = 0.016), MHAUS (P = 0.002), and Stanford (P < 0.001).
Years of practice (i.e., experience) and years working at the study institution did not demonstrate statistically significant correlations with either of the composite outcomes: cumulative time to answer (P = 0.528 and P = 0.762, respectively) and cumulative time to first fixation (P = 0.485 and P = 0.207, respectively).
Discussion
Our eye tracking analyses show that use of the MH cognitive aid designed by SPA results in the shortest cumulative time to answer three relevant standardized content questions when compared to four other cognitive aids currently available. This is the first study to utilize eye tracking technology in comparative evaluation of cognitive aid design, and our experience suggests that there may be additional applications of eye tracking technology in healthcare and medical education.
Our results are particularly interesting since more than half of participants completed residency at Stanford, and all participants work at a Stanford-affiliated hospital. Despite having Stanford emergency manuals [4] in every operating room at the study institution, the Stanford MH cognitive aid did not provide the best performance. We believe that this apparent lack of institutional bias supports the generalizability of our results. We also note that years of anesthesiology experience showed no correlation with the outcomes of our cognitive aid comparison using eye tracking technology. Unfortunately, routine use of cognitive aids in clinical practice may be as low as 7%, and poor design is considered one contributory factor [15].
Potentially advantageous design features of the SPA cognitive aid (Fig. 1) include a single page and simple typescript with minimal use of single color blocking. Previous work in the use of eye tracking in anesthesiology suggests that “visually salient” regions within an image (i.e., areas of high contrast or color) may distract from “cognitively salient” points (i.e., areas of value) [11]. Another strong design feature of the SPA cognitive aid is its linear layout, which has been shown to facilitate better team performance during a simulated crisis when compared to branched cognitive aid designs [16]. The use of eye tracking in the context of evaluating participants’ use of cognitive aids may represent a means to collect objective data on the results of thought processes that have been previously limited to subjective assessment [10].
There are important limitations to this study. First, we recruited a small convenience sample based at a single institution. Given a lack of previous studies involving eye tracking in cognitive aid evaluation, however, we did not have sufficient data with which to propose an expected difference in performance. Using the experimental design described previously [10], the eye tracking system was set up in a specific office location; this justified the single-site recruitment since the study site had to be accessible to all participants. Second, generalizability is limited to the set of cognitive aids and topic studied, and the results should not be extrapolated to other cognitive aid designs or topics. All cognitive aids employed in this study were publicly available and produced by reputable sources. Lastly, metrics related to cognitive aid performance using eye tracking in this experimental setting may not translate to performance in other real-life settings such as clinical simulation or actual clinical care. The differences in eye tracking metrics between cognitive aids can be measured in seconds, and we do not yet know if these differences are clinically relevant. Future studies should build on the results of the present study and compare different cognitive aids in simulated and clinical practice.
In summary, eye tracking technology may provide useful data in the design of future cognitive aids. This represents a new application within the field of medicine and warrants further research.