Comparisons of Weighted NASA-TLX and SURG-TLX for Simulated Laparoscopic Tasks
Time: Thursday, April 15, 2:00pm - 3:00pm EDT
Location: Education and Simulation
The National Aeronautics and Space Administration Task Load Index (NASA-TLX) and the Surgery Task Load Index (SURG-TLX) are two subjective instruments for measuring workload. NASA-TLX, which measures six factors, was developed primarily for aviation and has been widely applied in other fields (Hart, 2006; Hart & Staveland, 1988). SURG-TLX, which measures six dimensions, was created in 2011 by retaining three dimensions from the NASA-TLX and replacing the other three (Wilson et al., 2011). The three shared dimensions are Mental demand, Physical demand, and Temporal demand. The three dimensions created for the SURG-TLX are Task complexity, Situational stress, and Distractions, which replaced the original NASA-TLX factors of Performance, Effort, and Frustration. Both instruments produce unweighted and weighted (pairwise comparison) workload measurements (Hart & Staveland, 1988; Wilson et al., 2011). In the pairwise comparison weighting method, subjects are shown every pair of factors/dimensions and, based on their own experience, select the one in each pair that contributes more to workload. Research has shown a high correlation between the weighted and unweighted NASA-TLX (Byers, Bittner, & Hill, 1989; Dickinson, Byblow, & Ryan, 1993). No study has compared the weighted and unweighted SURG-TLX, although the weighted SURG-TLX is calculated in the same way as the weighted NASA-TLX.
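The pairwise comparison weighting described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's analysis code: the function names, the ratings, and the rule used to fill in the pairwise choices (always pick the higher-rated factor) are all hypothetical. It assumes the standard tally scheme in which six factors yield 15 pairs, each factor's weight is the number of times it was chosen, and the weighted score is the weight-averaged rating.

```python
from itertools import combinations

NASA_FACTORS = [
    "Mental demand", "Physical demand", "Temporal demand",
    "Performance", "Effort", "Frustration",
]


def pairwise_weights(factors, choices):
    """Count how often each factor was chosen across all pairs.

    `choices` maps each (a, b) pair from itertools.combinations
    to the factor the subject picked for that pair.
    """
    weights = {f: 0 for f in factors}
    for pair in combinations(factors, 2):  # 15 pairs for 6 factors
        winner = choices[pair]
        if winner not in pair:
            raise ValueError(f"{winner!r} is not part of pair {pair}")
        weights[winner] += 1
    return weights


def unweighted_score(ratings):
    """Plain mean of the factor ratings."""
    return sum(ratings.values()) / len(ratings)


def weighted_score(ratings, weights):
    """Mean of the ratings, weighted by the pairwise tallies."""
    total = sum(weights.values())
    return sum(ratings[f] * weights[f] for f in ratings) / total


# Hypothetical ratings on a 0-100 scale (not data from the study).
ratings = {
    "Mental demand": 70, "Physical demand": 40, "Temporal demand": 50,
    "Performance": 30, "Effort": 60, "Frustration": 20,
}

# Hypothetical choices: the subject always picks the higher-rated factor.
choices = {
    (a, b): a if ratings[a] >= ratings[b] else b
    for a, b in combinations(NASA_FACTORS, 2)
}

weights = pairwise_weights(NASA_FACTORS, choices)
raw = unweighted_score(ratings)
weighted = weighted_score(ratings, weights)
print(f"unweighted={raw:.1f}, weighted={weighted:.1f}")
```

With these made-up inputs the weighted score is higher than the unweighted mean, because the highly rated factors also receive the largest weights. The same functions apply to the SURG-TLX by swapping in its six dimension names.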
The objective was to compare several weighted overall workload scores with the unweighted overall workload score for both NASA-TLX and SURG-TLX, and to assess the sensitivity of all TLX scores to experimental and demographic factors. It was expected that, despite the high correlation between weighted and unweighted TLX scores, they would still differ in overall score and in sensitivity to different factors.
To better assess both NASA-TLX and SURG-TLX, two types of simulated laparoscopic single-site surgery tasks (a peg transfer task and a clock transfer task) were performed under four methods: conventional laparoscopy (CL), unaltered LESS with or without intracorporeal crossing of instruments (UL), physically correct LESS with extracorporeal crossing of hands (PL), and visually altered LESS with intracorporeal crossing of instruments (VL). Twenty-three medical students and two residents from a midwestern hospital participated in the study. After each trial, participants rated the subjective workload they experienced using a combined instrument containing all nine unique factors/dimensions from NASA-TLX and SURG-TLX. Six weighting methods were applied to each of NASA-TLX and SURG-TLX, and the resulting scores were compared with the unweighted overall TLX score:
• P-indNASA - Overall NASA-TLX workload with individual score pairwise comparison weights
• NASAEx - Overall NASA-TLX workload with expert pairwise comparison weights
• AWPNASA - Overall NASA-TLX workload with group pairwise comparison weights on average score
• WAPNASA - Overall NASA-TLX workload with group average weights on individual pairwise comparison weights
• AWRNASA - Overall NASA-TLX workload with group rating method weights on average score
• WARNASA - Overall NASA-TLX workload with group average weights on individual rating method weights
• P-indSURG - Overall SURG-TLX workload with individual score pairwise comparison weights
• SURGEx - Overall SURG-TLX workload with expert pairwise comparison weights
• AWPSURG - Overall SURG-TLX workload with group pairwise comparison weights on average score
• WAPSURG - Overall SURG-TLX workload with group average weights on individual pairwise comparison weights
• AWRSURG - Overall SURG-TLX workload with group rating method weights on average score
• WARSURG - Overall SURG-TLX workload with group average weights on individual rating method weights
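The abstract does not spell out how the group weights in the list above were computed. As a rough sketch of one possible reading, the Python below averages each dimension's pairwise tally across participants and applies the averaged weights to an individual's ratings, alongside that individual's own weights for comparison. The function names, the two participants, and all numbers are hypothetical, and the authors' actual procedure may differ.

```python
SURG_DIMENSIONS = [
    "Mental demand", "Physical demand", "Temporal demand",
    "Task complexity", "Situational stress", "Distractions",
]


def average_weights(individual_weights):
    """Average each dimension's weight across participants (assumed scheme)."""
    n = len(individual_weights)
    return {
        d: sum(w[d] for w in individual_weights) / n
        for d in SURG_DIMENSIONS
    }


def weighted_score(ratings, weights):
    """Mean of the ratings, weighted by the given weights."""
    total = sum(weights.values())
    return sum(ratings[d] * weights[d] for d in SURG_DIMENSIONS) / total


# Hypothetical pairwise tallies for two participants (each sums to 15).
p1_weights = dict(zip(SURG_DIMENSIONS, [5, 4, 3, 2, 1, 0]))
p2_weights = dict(zip(SURG_DIMENSIONS, [1, 3, 2, 5, 4, 0]))

# Hypothetical ratings for participant 1 on a 0-100 scale.
p1_ratings = dict(zip(SURG_DIMENSIONS, [60, 50, 40, 70, 30, 10]))

group_weights = average_weights([p1_weights, p2_weights])

individual_score = weighted_score(p1_ratings, p1_weights)
group_score = weighted_score(p1_ratings, group_weights)
print(f"individual weights: {individual_score:.2f}")
print(f"group weights:      {group_score:.2f}")
```

The two scores use the same ratings and differ only in whose weights are applied, which is the kind of contrast the weighting methods in the list are meant to explore.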
The statistical analyses were completed for the tasks and methods condition, the tasks only condition, and the methods only condition. All analyses were completed using Minitab (V18, Minitab, LLC., State College, PA). The significance level was 0.05.
For both NASA-TLX and SURG-TLX, the unweighted TLX scores were highly correlated with all of the weighted TLX scores in the tasks and methods, tasks only, and methods only analyses (r>0.950, p<0.001). However, principal component analysis (PCA) showed that the TLX scores fell into distinct groups. The unweighted TLX and expert weighted TLX scores differed from the other weighted TLX scores in almost all cases (full descriptive and inferential results could not be included in this submission format).
NASA-TLX had higher scores and better sensitivity to demographic and experimental setting factors than SURG-TLX, and for both instruments the weighted TLX scores showed better sensitivity than the unweighted TLX scores in some analyses. In the tasks and methods condition analysis, no weighted NASA-TLX or SURG-TLX score showed better sensitivity than the unweighted overall TLX scores. However, in the tasks only and methods only analyses, weighted TLX scores were sensitive to some factors (p<0.05 in the mixed effect models) to which the unweighted TLX scores were not. For example, the following scores were significantly sensitive to age: SURGEx (peg transfer task: p=0.048; clock transfer task: p=0.045; UL: p=0.032; PL: p=0.04), AWPSURG (clock transfer task: p=0.038; UL: p=0.034), WAPSURG (clock transfer task: p=0.048; UL: p=0.047), AWRSURG (clock transfer task: p=0.045; UL: p=0.045), and WARSURG (clock transfer task: p=0.045; UL: p=0.041), while the unweighted SURG-TLX was not significantly sensitive to age. Some weighted NASA-TLX scores showed better sensitivity to BMI, thumb length, and pointer finger length (p<0.05) than the unweighted NASA-TLX scores (p>0.05). Meanwhile, the expert weighted scores for both NASA-TLX and SURG-TLX tended to be more sensitive to task than the unweighted TLX scores, which were not significant (p>0.05); for example, SURGEx was sensitive to task for CL (p=0.007) and NASAEx was sensitive to task for PL (p=0.03). In none of the analyses were the unweighted TLX scores significantly more sensitive to a factor than the weighted TLX scores.
NASA-TLX measured the simulated surgical workload differently from, and more sensitively than, the SURG-TLX. The unweighted TLX scores differed from the weighted TLX scores. Although it is hard to select the best weighting method for either NASA-TLX or SURG-TLX because the weighted TLX scores are highly correlated and similar to one another, the weighted TLX scores showed better sensitivity than the unweighted TLX scores for both instruments. The weighted TLX scores therefore show potential to improve on the unweighted TLX scores despite the high correlation between them. Future research is needed to develop a better weighting method for subjective workload instruments or to identify which weighting method works better for NASA-TLX or SURG-TLX.