Maximal exercise testing with respiratory gas analysis is the reference technique for determining exercise capacity, allowing for measurement of VO2max, ventilatory threshold (VT) and respiratory compensation threshold (RCT). Such testing is often impractical because of the cost and technological sophistication required for respiratory gas analysis. Within the past decade, the Talk Test (TT) has been shown to be a useful surrogate of gas exchange thresholds in a variety of populations (Dehart-Beverley et al., 2000; Foster et al., 2008; Recalde et al., 2002; Voelker et al., 2002). Cannon et al. (2004) demonstrated that when patients who subsequently developed exertional ischemia were able to speak comfortably, they were unlikely to have ECG evidence of myocardial ischemia. This suggests that the TT has the potential to minimize the risk of catastrophic events during exercise training. Further, the TT has been shown to be useful for 'translating' incremental exercise test results into absolute training intensities in a variety of populations including cardiac patients, sedentary individuals and well-trained individuals (Brawner et al., 2006; Foster et al., 2009). These data suggest that the TT is a safe, valid, and simple way of determining exercise intensity in populations where the use of maximal exercise testing may be impractical or where gas exchange technology is unavailable. Despite the strength of the data supporting its use, there are no data on the reproducibility of the TT. The purpose of the present study was to determine the reproducibility of the TT compared to respiratory gas exchange measurements.
Healthy volunteers (10 ♂: PPO = 280 ± 33W, 14 ♀: PPO = 211 ± 24W) provided written informed consent to the protocol, which was approved by the university ethics committee. Each performed 4 randomly ordered cycle ergometer exercise tests (25W + 25W per 2 min). Two tests included measurements of respiratory gas exchange and two used the TT.
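As an illustration of the incremental protocol described above, the following minimal sketch maps elapsed test time to the prescribed power output. It assumes that "25W + 25W per 2 min" means a first stage at 25W followed by discrete 25W steps every 2 min (not a continuous ramp). The function name `stage_power_w` and its parameters are hypothetical and are not taken from the study.

```python
def stage_power_w(elapsed_min, start_w=25, step_w=25, stage_min=2.0):
    """Return the prescribed power output (W) at a given elapsed time.

    Assumption: the protocol is a step protocol, so power is constant
    within each stage and rises by `step_w` at every stage boundary.
    """
    if elapsed_min < 0:
        raise ValueError("elapsed_min must be non-negative")
    # Stage 0 covers 0 to <2 min, stage 1 covers 2 to <4 min, and so on.
    stage_index = int(elapsed_min // stage_min)
    return start_w + step_w * stage_index


if __name__ == "__main__":
    # Print the prescribed power at the start of the first six stages.
    for minute in range(0, 12, 2):
        print(minute, "min:", stage_power_w(minute), "W")
```

Under this assumption, a subject who reaches fatigue during the stage that begins at 20 min would be working at 275W, which is in the range of the peak power outputs reported for the male volunteers.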
Respiratory gas exchange (GE) was measured using open circuit spirometry with a mixing chamber based system, using 30s data integration (AEI, Pittsburgh, PA). VT and RCT were defined using both v-slope and ventilatory equivalents. At 1.5 min into each stage of the TT protocols, the subject was asked to recite the “Pledge of Allegiance”, a standard speech-provoking paragraph familiar to most individuals in the US (Foster et al., 2008). Subjects were then asked, “Can you speak comfortably?” and were instructed to respond “Yes” (POS), “Yes…but” (EQ), or “No” (NEG). All tests were continued to fatigue. Statistical comparisons of the reproducibility of responses were made using repeated measures ANOVA and intraclass correlation (ICC). An ICC of 0.7 - 0.9 was taken to indicate a “high correlation” between tests, and an ICC of 0.9 - 1.0 a “very high correlation” between tests.
No statistically significant differences (p > 0.05) were seen between first and second tests within method (GE or TT) for any marker: VT (140 ± 46 vs 141 ± 39W), RCT (191 ± 45 vs 194 ± 43W) and max GE (238 ± 44 vs 237 ± 46W), or EQ (154 ± 36 vs 153 ± 40W), NEG (204 ± 43 vs 204 ± 44W) and max TT (242 ± 45 vs 241 ± 43W). A high correlation between GE tests was seen for VT (ICC = 0.72) and RCT (ICC = 0.89) (Figure 1). Similarly, there was a very high correlation between TT tests for the EQ stage (ICC = 0.94) and NEG stage (ICC = 0.97) (Figure 2). Thus, both at the level of corresponding mean values and at the level of intraclass correlation, the reproducibility of both the TT and GE techniques for 'threshold determination' is high to very high. There were, however, significant differences between the GE and TT markers: VT (141 ± 39W) vs EQ (154 ± 37W) and RCT (193 ± 42W) vs NEG (204 ± 42W).
There were no significant differences between GE and TT for maximal PO (238 ± 44 vs 242 ± 44W).
Past research using the TT has suggested a strong correlation between TT-EQ and GE-VT, as well as between TT-NEG and GE-RCT (Dehart-Beverley et al., 2000; Foster et al., 2008; Recalde et al., 2002; Voelker et al., 2002). Data from this study suggest that TT-EQ overestimates GE-VT and that TT-NEG overestimates GE-RCT. The correspondence between TT-EQ and GE-VT was particularly weak at low values of power output (PO < 100W). Other studies have shown that subjects often over-report their power output at the EQ and NEG stages of the TT compared to experienced investigators, which may partially explain this phenomenon (Thiel et al., 2011; Zanettini et al., 2013). In these studies, subjects reported that they could speak comfortably at higher intensities than the investigator (trained in conducting the TT) believed they could. Although we have generally reported a good relationship between GE and TT markers of 'threshold intensities', similar results were also evident in earlier work from our laboratory (Foster et al., 2008), where the TT was most often mistaken when subjects indicated that they could still speak comfortably at intensities where they were predicted not to be able to do so. This suggests that the 2 min stage duration used in this and other studies may be too short to allow adequate matching of TT and GE responses. However, despite the tendency of the TT to overestimate GE markers, the results of this study suggest that estimates of PO at VT and RCT using GE, and at the EQ and NEG stages of the TT, are highly reproducible within method.
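For readers who wish to reproduce this kind of test-retest analysis, the sketch below shows one way an intraclass correlation could be computed and labelled with the bands used in this study. The study does not state which ICC form was used, so the two-way random effects, absolute agreement, single measurement form, ICC(2,1), is an assumption here. The function names `icc_2_1` and `describe_icc` are hypothetical, the example power outputs are invented purely for illustration and are not study data, and treating an ICC of exactly 0.9 as “very high” is likewise an assumption.

```python
def icc_2_1(scores):
    """ICC(2,1) from a list of rows, one row per subject, one column per test.

    Assumption: two-way random effects, absolute agreement, single
    measurement. The study may have used a different ICC form.
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of repeated tests per subject
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares from the two-way ANOVA decomposition.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-subject mean square
    msc = ss_cols / (k - 1)                # between-test mean square
    mse = ss_err / ((n - 1) * (k - 1))     # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def describe_icc(icc):
    """Label an ICC using the bands quoted in the text."""
    if 0.9 <= icc <= 1.0:
        return "very high"
    if 0.7 <= icc < 0.9:
        return "high"
    return "not classified"  # the text defines no label outside 0.7 - 1.0


if __name__ == "__main__":
    # Hypothetical power outputs (W) at the EQ stage: [test 1, test 2].
    example = [[100, 125], [150, 150], [200, 175], [125, 125]]
    value = icc_2_1(example)
    print(round(value, 2), describe_icc(value))
```

Because the absolute agreement form penalises a systematic shift between the first and second test, it is a conservative choice for a reproducibility question; a consistency form would ignore such a shift.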