Does the experience level of the radiologist, assessment in consensus, or the addition of the abduction and external rotation view improve the diagnostic reproducibility and accuracy of MRA of the shoulder?
This study prospectively evaluated the influence of observer experience, consensus assessment, and the abduction and external rotation (ABER) view on the diagnostic performance of magnetic resonance arthrography (MRA) in patients with traumatic anterior shoulder instability (TASI).

Fifty-eight MRA examinations (51 of which had additional ABER views) were assessed by six radiologists (R1-R6) and three teams (T1-T3) with different experience levels, using a standardized scoring form covering seven lesion types. The findings of 45 of the 58 MRA examinations were surgically confirmed. Kappa coefficients, sensitivity, specificity, and differences in percent agreement or percent correct diagnosis (p-value, McNemar's test) were calculated per lesion and overall across the seven lesion types to assess diagnostic reproducibility and accuracy.

Overall kappa ranged from poor (κ = 0.17) to moderate (κ = 0.53), sensitivity from 30.6% to 63.5%, and specificity from 73.6% to 89.9%. Overall, the most experienced radiologists (R1-R2) and teams (T1-T2) agreed significantly more than the less experienced radiologists (R3-R4: p = 0.014; R5-R6: p = 0.018) and teams (T2-T3: p = 0.007). The most experienced radiologists (R1, R2, R3) and teams (T1, T2) were also consistently more accurate than the less experienced radiologists (R4, R5, R6) and team (T3). Significant differences were found between R1 and R4 (p = 0.012), R3 and R4 (p = 0.03), and T2 and T3 (p = 0.014). The overall performance of consensus assessment was systematically higher than that of individual assessment. Significant differences were established between T1-T2 and radiologists R3-R4 (p < 0.001, p = 0.001) and between T2 and R3 (p < 0.001/p = 0.001) or R4 (p = 0.050). No significant overall differences were found between the radiologists' assessments with and without ABER.

The addition of ABER does not significantly improve overall diagnostic performance. The radiologist's experience level and consensus assessment do contribute to higher reproducibility and accuracy.
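For readers unfamiliar with the statistics named in the methods, the following is a minimal Python sketch of how kappa, sensitivity, specificity, and a McNemar comparison of paired correct diagnoses could be computed for binary lesion scores. It is not the authors' analysis code. The function names and the toy score lists are hypothetical and do not reproduce study data, and the exact (binomial) form of McNemar's test is an assumption, since the abstract does not state which variant was used.

```python
from math import comb


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' binary scores (1 = lesion present)."""
    n = len(rater_a)
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    p_a, p_b = sum(rater_a) / n, sum(rater_b) / n
    # Agreement expected by chance from each rater's marginal rates.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)


def sensitivity_specificity(scores, reference):
    """Sensitivity and specificity against a reference standard (e.g. surgery)."""
    tp = sum(s == 1 and r == 1 for s, r in zip(scores, reference))
    fn = sum(s == 0 and r == 1 for s, r in zip(scores, reference))
    tn = sum(s == 0 and r == 0 for s, r in zip(scores, reference))
    fp = sum(s == 1 and r == 0 for s, r in zip(scores, reference))
    return tp / (tp + fn), tn / (tn + fp)


def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar p-value from paired correct/incorrect flags."""
    only_a = sum(a and not b for a, b in zip(correct_a, correct_b))
    only_b = sum(b and not a for a, b in zip(correct_a, correct_b))
    n = only_a + only_b  # discordant pairs only
    if n == 0:
        return 1.0
    k = min(only_a, only_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy data: 10 cases, 1 = lesion present. Not study data.
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
reader_a = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
reader_b = [1, 0, 0, 0, 0, 0, 1, 1, 0, 1]

kappa = cohen_kappa(reader_a, reader_b)
sens_a, spec_a = sensitivity_specificity(reader_a, reference)
sens_b, spec_b = sensitivity_specificity(reader_b, reference)
p_value = mcnemar_exact(
    [s == r for s, r in zip(reader_a, reference)],
    [s == r for s, r in zip(reader_b, reference)],
)
print(kappa, (sens_a, spec_a), (sens_b, spec_b), p_value)
```

Only the discordant pairs (cases one reader scored correctly and the other did not) enter McNemar's test, which is why it suits paired comparisons of readers on the same examinations.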