Research article - (2015) 14, 675-680
Inter-Rater Reliability and Validity of the Australian Football League’s Kicking and Handball Tests
Ashley J. Cripps1, Luke S. Hopper2, Christopher Joyce1
Key words: Talent identification, skills test, coaches’ perceptions
Participants
Male athletes (n = 121, age = 15.7 ± 0.3 years, height = 1.77 ± 0.07 m, mass = 69.17 ± 8.08 kg) were recruited from seven semi-elite under 16 (U16) Western Australian Football League teams. Athletes and their guardians were given written information sheets detailing the potential risks associated with the study and subsequently provided written informed consent. Coaches (n = 7) from each of the teams were also recruited to give a subjective assessment of the skill efficiency of athletes within their team, rating the skills of each athlete on a 1-5 Likert scale. Further detail regarding the coaches’ perceptions of skill is provided later. Assessors for the tests were all university students with varying levels of exposure to Australian Football. Assessors were briefed on the tests’ purpose and scoring criteria prior to commencement. To further familiarise the assessors with each test, they were also required to watch the test being conducted once before being allowed to score it. Ethics approval was granted by the University’s Human Research Ethics Committee.
Procedures
The test procedures for both skill tests are provided by the AFL (Sheehan, …).
For the kicking test, two student assessors stood approximately 35 m from the kick line in order to best assess the kicks. The assessors stood two metres apart beside the designated scoring position and were instructed not to communicate results to each other. Assessors were instructed to judge each kick on the AFL’s scoring criteria. One point was subtracted from the possible five points for each kick if: the kick execution took longer than three seconds (monitored by the assessors using a stopwatch, from the time of hearing the call from the feeder to skill execution), the kick was executed beyond the kick line, or the kick was executed incorrectly (unconventional flight and/or spin). If the participant kicked to the wrong target, a score of zero was given.
For the handball test, two student assessors stood 5 m behind the feeder to assess the handballs. The assessors stood two metres apart beside the designated scoring position and were instructed not to communicate results to each other. Assessors were instructed to judge the take and handball on the AFL’s scoring criteria. One point was subtracted if: the ball gather and handball took longer than three seconds to be executed (monitored by the assessors using a stopwatch, from the time of hearing the call from the feeder to skill execution), or the handball was completed beyond the release line. The delivery was given a score of zero if the participant handballed to the wrong target.
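The deduction rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the AFL’s scoring sheet: the accuracy criteria that produce the starting score are not reproduced in this text, so `base_points` is a hypothetical input, and the sketch assumes that deductions stack and that a score cannot fall below zero.

```python
from dataclasses import dataclass


@dataclass
class Disposal:
    base_points: int                   # hypothetical: accuracy score (0-5) from the AFL criteria
    seconds: float                     # time from the feeder's call to skill execution
    beyond_line: bool                  # executed beyond the kick line / release line
    wrong_target: bool                 # delivered to the wrong target
    incorrect_execution: bool = False  # kicking test only: unconventional flight and/or spin


def score_disposal(d: Disposal) -> int:
    """Apply the one-point deductions and the wrong-target rule to a single disposal."""
    if d.wrong_target:
        return 0  # a wrong-target delivery is scored zero regardless of quality
    # Assumption: each breached condition costs one point and deductions stack.
    penalties = sum([d.seconds > 3.0, d.beyond_line, d.incorrect_execution])
    # Assumption: the score is floored at zero.
    return max(0, d.base_points - penalties)


clean_kick = score_disposal(Disposal(5, 2.5, False, False))
slow_kick_over_line = score_disposal(Disposal(5, 3.4, True, False))
wrong_target_handball = score_disposal(Disposal(5, 1.8, False, True))
```

For the handball test, `incorrect_execution` is simply left at its default, since the text lists only the time and release-line deductions for that test.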
Coaches’ perceptions of the athletes
Prior to receiving the results of the tests, the athletes’ coaches were asked to rate athletes from their team on a 1-5 Likert scale for kicking efficiency, handball efficiency, and clean hands (the ability to take the ball cleanly either in the air or on the ground), with ratings listed as: 5 rare, 4 excellent, 3 good, 2 marginal and 1 poor, in accordance with the AFL youth coaching manual (2004). Outcome descriptors were attached to the 1-5 rating scale. For example, when assessing kicking and handball ability, a rating of 5 was given if the athlete was considered very accurate on both the dominant and non-dominant sides and when under pressure; the athlete was also required to be a very good decision maker. Coaches were also asked to categorise athletes as right (n = 102) or left (n = 19) side dominant. If they were unsure, they were instructed to leave the field blank. Athletes whose dominance was left blank (n = 8) were then excluded from the analysis.
Data analysis
The kicking and handball tests were assessed for inter-rater reliability, content validity and concurrent validity. Inter-rater reliability was examined using the subjective scores provided by two independent assessors, who both rated every disposal using the scoring procedure developed by the AFL. Content validity was assessed by examining the sensitivity of the scoring outcomes to laterality across a range of Australian Football specific distances. Concurrent validity was assessed by comparing the scores from both tests to coaches’ perceptions of skill efficiency. For the kicking test, the coaches’ perception of each athlete’s kicking ability was compared directly to that athlete’s test score. For the handball test, because the test examines both the ability to receive the ball cleanly and the ability to handball efficiently, the coaches’ ratings of clean hands and handball efficiency were summed and compared to the test score.
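The concurrent validity comparison for the handball test can be illustrated as follows. The text does not name the statistic used to compare coach ratings with test scores, so the Spearman rank correlation below is an assumption chosen because Likert ratings are ordinal; the function names and the example numbers are hypothetical and are not study data.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up ratings for four athletes (1-5 Likert) and made-up handball test totals.
clean_hands = [3, 4, 2, 5]
handball_efficiency = [3, 5, 2, 4]
combined_rating = [c + h for c, h in zip(clean_hands, handball_efficiency)]
handball_test_score = [20, 27, 15, 26]

rho = spearman(combined_rating, handball_test_score)
```

For the kicking test the same comparison would use the single kicking rating in place of `combined_rating`.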
Statistical analysis
Statistical analyses were carried out using SPSS software (Version 22.0, SPSS Inc., USA). Inter-rater reliability was assessed using both relative and absolute measures. Relative reliability was calculated by comparing the total scores given by the two assessors using intra-class correlation coefficients (ICC). Absolute reliability was calculated using the 95% limits of agreement (LOA) method developed by Bland and Altman (…). Scores were reported as means and standard deviations. Multivariate analysis of variance (MANOVA) was used to examine the main effect of “laterality” (two levels: dominant and non-dominant) on the skills test variables. Cohen’s d effect sizes (ES) were calculated, with an ES of 0.20 considered small, 0.50 medium, and 0.80 large (Cohen, …).
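The reliability and effect size measures named above can be sketched in plain Python. This is a minimal illustration rather than a reproduction of the SPSS analysis: the text does not state which ICC model was used, so the two-way random effects, absolute agreement, single-rater form, ICC(2,1), is an assumption, as is the pooled standard deviation in Cohen’s d and the "trivial" label for effects below 0.20. All function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def icc_2_1(ratings):
    """ICC(2,1) for a list of rows (one per athlete), one column per assessor."""
    n, k = len(ratings), len(ratings[0])
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n) for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def limits_of_agreement(rater_a, rater_b):
    """Bland-Altman bias and 95% limits: mean difference +/- 1.96 SD of differences."""
    diffs = [a - b for a, b in zip(rater_a, rater_b)]
    bias, sd = mean(diffs), stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def cohens_d(x, y):
    """Cohen's d using the pooled standard deviation of the two samples."""
    nx, ny = len(x), len(y)
    pooled = sqrt(((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2))
    return (mean(x) - mean(y)) / pooled


def effect_label(d):
    """Label an effect size using the 0.20 / 0.50 / 0.80 thresholds."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "trivial"  # assumption: the text gives no label below 0.20


# Made-up totals: assessor B scores every athlete exactly one point higher.
offset_icc = icc_2_1([[1, 2], [2, 3], [3, 4]])
bias, lower, upper = limits_of_agreement([10, 12, 14, 16], [9, 12, 13, 16])
d = cohens_d([2, 4, 6], [1, 3, 5])
```

In the made-up offset example the two assessors rank athletes identically yet the ICC is only 2/3, which shows why an absolute agreement form is sensitive to a systematic difference between assessors.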
Results
Inter-rater reliability was strong for both the kicking test (ICC = 0.96, …) and the handball test. Pillai’s trace (V) revealed a significant effect of laterality on the kicking test (V = 0.10, F(3, 252) = 9.63, …). A number of delivery errors were made by the athletes in both tests, whereby the athlete passed to the wrong target. A total of 25 errors were made in the kicking test (3.23%) and 95 in the handball test (12.27%).
Discussion
Inter-rater reliability
Relative and absolute inter-rater reliability for both the kicking and handball tests was shown to be strong. The results of this study therefore suggest that the use of inexperienced assessors to administer the AFL’s skills tests will not affect the reliability of the tests’ scoring outcomes. Further, considering that the assessors came from varied and somewhat inexperienced football backgrounds, it is reasonable to assume that employing assessors with greater assessment experience, such as those used at the National Draft Combine, would further improve the reliability of the tests. There was a high number of delivery errors in the handball test. These errors may have slightly elevated the test’s reliability measures, because a delivery to the wrong target is automatically scored zero, which removes the opportunity for scoring variability between assessors. However, given the strength of the findings in the reliability analysis, these effects are likely to be minimal.
Validity of AFL skills tests
This study produced mixed results for content validity. Scoring outcomes for the kicking test showed a significant ability to differentiate between accuracy on dominant and non-dominant foot kicks across varying Australian Football specific distances, whereas the handball test was only able to significantly differentiate between laterality, with inconsistent results apparent when examining the effects of distance. As with most skill tests, the AFL’s skills tests are closed-skill tests and are unable to examine every component of the complex task assessed (Robertson et al., …).
The AFL’s handball test did not show the same level of content validity demonstrated by the kicking test. Whilst the test was able to differentiate between dominant and non-dominant disposals, it failed to consistently differentiate between target distances. This may be due to the short (6 m) and medium (8 m) distances not being long enough, or to the task itself being too simple, to elicit meaningful changes in accuracy. Further research is needed to confirm that the handball test provides a valid means of handball skill assessment.
Both the kicking and handball tests demonstrated poor concurrent validity, suggesting that the AFL skills test results are not representative of coaches’ perceptions of athletes’ kicking and handball skills. This is likely due to the tests’ inability to replicate all match related skill demands. In matches, other factors are likely to influence an athlete’s skill efficiency by both hand and foot, for example opposition pressure, decision making, and fatigue. Coaches should therefore be cautious when using test results to predict match related skill outcomes.
An identified weakness of the handball test is that it examines two independent skill outcomes but reports only a single score. This means that, when examining the scoring outcomes, it is impossible to tell in which of the two skills the player excelled or scored poorly. For example, a player may have fumbled the ball but executed an excellent disposal, or taken the ball cleanly but executed a poor disposal. In both cases the scoring outcome would not identify which skill the player performed well and which they did not. A simple suggestion to eliminate this issue is to incorporate two scoring protocols, one for the clean-hands component of the test and a second for the disposal outcome. A further suggestion, aimed at reducing delivery errors in the test, is to adopt a pre-determined delivery pattern. This may reduce errors associated with the athlete mishearing calls or making decision errors.
This study was limited to assessments of partial content and concurrent validity. Further validity assessments, such as the tests’ ability to discriminate between athletes of higher and lower playing abilities, are necessary to confirm the utility of the skills tests. Another limitation of this study was that the kicking and handball tests were originally designed to be used at the AFL National Draft Combine with athletes of eligible draft age (at least 18 years of age before 31st December of the relevant selection year), whereas the athletes we recruited were around two years younger than the athletes who would typically perform the tests. Further assessments of the tests’ validity should therefore be conducted with athletes of eligible draft age.
Conclusion
Both the AFL’s kicking and handball tests demonstrated acceptable levels of relative and absolute inter-rater reliability. The kicking test was also shown to demonstrate partial content validity, being able to discriminate between dominant and non-dominant disposals across a range of Australian Football specific distances. The AFL’s handball test was also able to discriminate between laterality; however, it could not consistently discriminate between disposal distances. Both tests demonstrated poor concurrent validity when compared to coaches’ perceptions of skill. The AFL’s kicking test may provide an appropriate means of assessing development athletes’ kicking skills and providing them with feedback, with further research required to establish whether the handball test is appropriate to do the same. Future research should establish whether both tests can differentiate between athletes of higher and lower playing abilities and whether performance in the skill tests improves with age.
ACKNOWLEDGEMENTS
The authors would like to thank the Western Australian Football League and the University of Western Australia for supporting the research project. The research project received no external financial assistance. None of the authors have any conflicts of interest to declare.