Wrist arthroscopy is generally considered the reference standard in the diagnosis of triangular fibrocartilage complex (TFCC) injuries. There is a paucity of data examining the reliability of wrist arthroscopy as a diagnostic modality for TFCC injuries. The goal of this study was to evaluate the interobserver and intraobserver reliability of the diagnosis of TFCC pathology during wrist arthroscopy.
Twenty-five intraoperative digital videos were captured by the senior author during diagnostic and surgical arthroscopy of the wrist joint for known or suspected articular pathology. The senior author (P.K.B.) confirmed TFCC resilience on visual inspection and ballottement (trampoline effect) to make the diagnosis. Two videos were excluded for poor quality and inadequate visualization. Three hand surgeons subsequently reviewed the remaining 23 videos in a blinded fashion at 2 time points separated by 4 weeks. The reviewers determined if the trampoline test was positive and if a TFCC tear was present. Tears were classified using a morphologic classification. Statistical measures of reliability including percentage agreement and κ coefficients were calculated.
Agreement between observers for the presence or absence of a tear was 66.7%. The average intraobserver agreement regarding the presence or absence of a tear was 67.4%. The kappa value for interobserver agreement was 0.33, whereas the intrarater agreement was 0.88. The 3 reviewers identified an average of 11.3 positive trampoline tests. Agreement between observers for a positive trampoline test was 65.2%. The average percentage of intraobserver agreement regarding a positive trampoline test was 49.3%. In cases where all 3 reviewers agreed on the presence of a TFCC tear, the agreement regarding tear location was 76.6%.
Wrist arthroscopy remains instrumental in the treatment of TFCC tears. However, given the low inter-rater reliability in the assessment of these tears, reconsideration should be given to arthroscopy as the reference standard in the diagnosis of these tears.
Wrist arthroscopy is commonly used in the diagnosis and treatment of ulnar-sided wrist pain, and tears of the triangular fibrocartilage complex (TFCC) are frequently the focus of this intervention. Although magnetic resonance imaging is an important diagnostic modality in assessing ulnar-sided wrist pain, several studies have shown that it lacks sensitivity in detecting peripheral TFCC lesions.
The goal of this study was to evaluate the interobserver and intraobserver reliability of the diagnosis of TFCC pathology during wrist arthroscopy. We hypothesized that, during arthroscopic evaluation of the TFCC, observers would have poor reliability regarding the presence or absence of a tear.
Institutional review board approval was obtained before the beginning of the study. Patients who underwent diagnostic and therapeutic arthroscopy of the wrist joint for known or suspected articular pathology, including intercarpal ligament injuries and tears of the TFCC, were eligible for inclusion. No patients in this group were treated arthroscopically for a fracture of the distal radius. In all cases, the arthroscopy was captured using digital video by the senior author (P.K.B.). The study group consisted of 25 patients (12 women and 13 men) with an average age of 45 years (range, 21–60 y).
All of the procedures were performed with patients under sedation with regional anesthesia. A traction tower (ConMed, Largo, FL) with 10 to 15 lbs of distraction was used in all cases. All arthroscopies were performed with a 1.9- or 2.7-mm, 30° arthroscope (Stryker, Kalamazoo, MI), which was introduced into the 3-4 portal by the standard technique. Outflow was established through the 6-U portal, and the working portals included the 6-R or 4-5 portal. All videos included a systematic arthroscopic wrist examination as described by Löw et al.
Integrity of the TFCC was evaluated by visual inspection, and by using the “trampoline” test, which evaluates the tautness or laxity of the TFCC with ballottement of the articular disk using a surgical probe (Fig. 1).
Twenty-five videos were then assessed for quality and edited by the senior author (P.K.B.) to include only the ulnar side of the wrist. Two videos were excluded for poor quality, inadequate visualization, or inadequate probing of pathologic lesions.
Twenty-three videos averaging 50 seconds in length were chosen for analysis.
Three additional fellowship-trained hand/upper extremity orthopedic surgeons with extensive experience in wrist arthroscopy participated in this study as video reviewers. The surgeons, who averaged 16 years (range, 7–27 y) in postfellowship practice, reviewed the videos using a personal computer. Each of the 3 reviewers performs between 20 and 30 arthroscopic surgeries per year. The reviewers were blinded to the patients’ history and the diagnosis of the treating surgeon. They assessed whether each video was adequate, and documented the presence or absence of a TFCC tear based on their overall evaluation of the video. If a tear was determined to be present, reviewers were then asked to classify the TFCC tear according to location: central, radial, or ulnar peripheral (Fig. 2). The videos were reordered, and then 4 weeks later each surgeon reviewed them a second time to evaluate intraobserver reliability.
Measures of reliability including percentage agreement and κ coefficient were calculated.
κ coefficients are generally a better measure of reliability than percentage agreement because they correct for agreement expected by chance. Interobserver agreement between multiple respondents surveyed at the same time was assessed using Fleiss’ κ coefficient for multiple raters. Intraobserver agreement was evaluated by Cohen’s κ coefficient.
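The three measures named above can be illustrated with a minimal Python sketch. The source does not state which statistical software was used, so this is not the authors' implementation: the function names and the small rating arrays are hypothetical, chosen only to show how percentage agreement, Cohen's κ, and Fleiss' κ are computed from binary tear/no-tear ratings.

```python
from collections import Counter


def full_agreement(ratings):
    """Fraction of videos on which every rater gave the same label."""
    return sum(len(set(row)) == 1 for row in ratings) / len(ratings)


def cohen_kappa(first, second):
    """Cohen's kappa for two readings of the same videos (e.g. one rater, two sessions).

    Undefined (division by zero) if chance agreement equals 1,
    i.e. both readings use a single identical label throughout.
    """
    n = len(first)
    p_obs = sum(a == b for a, b in zip(first, second)) / n
    c1, c2 = Counter(first), Counter(second)
    # Chance agreement from the marginal label frequencies of each reading.
    p_exp = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)


def fleiss_kappa(ratings):
    """Fleiss' kappa; each row holds the labels all raters gave to one video.

    Assumes every video was rated by the same number of raters (at least 2).
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    label_totals = Counter()
    p_bar = 0.0
    for row in ratings:
        counts = Counter(row)
        label_totals.update(counts)
        # Proportion of agreeing rater pairs for this video.
        p_bar += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    p_bar /= n_subjects
    total = n_subjects * n_raters
    p_exp = sum((t / total) ** 2 for t in label_totals.values())
    return (p_bar - p_exp) / (1 - p_exp)


# Hypothetical data: 6 videos, 3 raters, 1 = tear present, 0 = no tear.
sample_ratings = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
# Hypothetical data: one rater's first and second reading of 6 videos.
session_1 = [1, 1, 0, 1, 1, 0]
session_2 = [1, 0, 0, 1, 1, 1]

print(full_agreement(sample_ratings))
print(fleiss_kappa(sample_ratings))
print(cohen_kappa(session_1, session_2))
```

In this made-up example, full agreement is two thirds for both data sets, yet the κ values differ because κ subtracts the agreement expected by chance from the observed agreement. That is the property the comparison with percentage agreement relies on.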
All surgeons agreed that all the videos were of adequate quality. The percentage of full agreement for all 3 observers for the presence or absence of a tear was 67% (95% confidence interval [CI], 58% to 74%). The average percentage of intraobserver agreement regarding the presence or absence of a tear was 89% (95% CI, 80% to 96%). The kappa value for interobserver agreement for all 3 observers was 0.26 (95% CI, 0.08–0.43), whereas the agreement within each observer was 0.74 (95% CI, 0.61–0.93).
The percentage of full agreement for all 3 observers for a positive trampoline test was 65% (95% CI, 60% to 76%). The average percentage of intraobserver agreement regarding a positive trampoline test was 93% (95% CI, 87% to 96%). The kappa value for agreement for all 3 observers was 0.33 (95% CI, 0.13–0.45), whereas the agreement within each observer was 0.68 (95% CI, 0.50–0.85).
In cases where all 3 reviewers agreed on the presence of a TFCC tear, the agreement regarding tear location was 76% in the first evaluation and 77% in the second, for an average of 77% (95% CI, 50% to 93%). The κ coefficient was 0.44 (95% CI, 0.32–0.65). When there was agreement with regard to tear location, central tears accounted for 43 of 45 tears (92%). The average percentage of intraobserver agreement regarding tear location was 93% (95% CI, 71% to 99%), with a κ coefficient of 0.74 (95% CI, 0.59–0.99).
Diagnosing the source of ulnar-sided wrist pain, and identifying TFCC tears specifically, can be a challenging problem. This is largely due to the complexity of the anatomy of the ulnar wrist, the multiple overlapping structures that can be causes of pain, and the relatively small size of the TFCC. Accurate diagnosis by physical examination and diagnostic studies can be difficult. As such, arthroscopy remains the reference standard for the diagnosis of TFCC tears.
We found that hand surgeons disagreed on whether a tear was present in approximately one third of the cases evaluated. Similarly, the reviewers disagreed one third of the time on whether there was a positive trampoline test. These levels of agreement remained similar while evaluating intraobserver reliability. When a tear was present, using a simplified classification scheme for the location of a tear (central, radial, or ulnar peripheral) resulted in slightly higher agreement between (76.6%) and within (93%) observers. When there was agreement between observers regarding tear location, central tears accounted for 92% of the cases.
Few studies have previously examined the diagnostic reliability of wrist arthroscopy, but they have reported similar rates of disagreement. Löw et al reviewed the reliability of the arthroscopic video review when added to photo documentation, in wrist arthroscopy with 2 reviewers. They found that videos, in addition to the arthroscopic photographs, increased the reliability when evaluating most articular pathology in the wrist joint, including the TFCC. The authors found fair interobserver agreement for both the presence of a TFCC tear and a positive trampoline test when evaluating both photographs and the videos. The actual rate of agreement between observers evaluating the arthroscopic videos alone cannot be ascertained based on the data presented, and no data regarding tear location were presented. A related study sought to examine the relationship between video length and interobserver reliability in 100 wrist arthroscopies. Overall agreement in the videos was highly variable depending on the illustrated structure. The authors concluded that longer videos led to increased reliability.
The reliability of arthroscopy as a diagnostic modality in large joints has been shown to be inconsistent. There have been several studies evaluating the inter-rater reliability for grading chondral lesions in the knee. Marx et al concluded that arthroscopic classification of articular cartilage lesions is reliable and reproducible, as observed agreement ranged from 81% to 94%. However, subsequent studies have shown that the interobserver reliability of the arthroscopic grading of cartilage lesions is poor.
Similar studies have been performed to evaluate the intraobserver and interobserver reliability of hip and shoulder arthroscopy with varying results. Arthroscopic classification of labral disease using the Beck classification demonstrated an overall agreement rate of 81.7%. Intraobserver reliability showed a similar level of reliability of 80.6%.
There is considerable overlap between traumatic lesions and degenerative lesions because the appearance of a traumatic lesion might change to a more degenerative appearance with time. By simplifying the classification of TFCC tears and focusing on the location and morphology of the tears, we chose to emphasize how the tears would be treated rather than their etiology. In doing so, intraobserver agreement regarding tear location was substantial, likely due in large part to the smaller number of types in this classification.
There are several shortcomings to this study. First, we used video to identify the presence and location of a tear. There is likely a tactile component to arthroscopic evaluation that video analysis does not allow for, and our results would potentially be different if each observer had been able to perform the arthroscopy and assess the tear directly. Second, the reviewers did not have the opportunity to examine the patients or the diagnostic studies. Third, the majority of the patients who underwent surgery had either suspected or established articular pathology, and as such our study group is not representative of the prevalence of TFCC pathology in the general population, which could lead to spectrum bias. This bias may have led to a spurious increase in the performance of arthroscopy, and agreement may have even been lower if a less selected patient cohort had been studied. Finally, kappa values and their applicability are highly context dependent. In some settings (eg, evaluating the results of a diagnostic test) anything less than excellent agreement would be inadequate. We do believe that in the setting of this study, κ coefficients do provide an adequate and applicable measure of agreement.
In conclusion, the interobserver and intraobserver reliability of the arthroscopic diagnosis of TFCC tears is inconsistent, with lack of agreement between and within observers approximately one third of the time regarding the presence or absence of a tear. Using a morphologic classification scheme, disagreement between and within observers as to the location of the tear occurs in approximately one quarter of cases. Arthroscopy remains a valuable tool in the treatment of TFCC tears. However, given the considerable disagreement between and within experienced observers, reconsideration should be given to its status as the reference standard in the diagnosis of these tears.