Evaluating Methods for Imputing Missing Data from Longitudinal Monitoring of Athlete Workload

Research article - (2021) 20, 188 - 196
DOI: https://doi.org/10.52082/jssm.2021.188

Lauren C. Benson¹,², Carlyn Stilling², Oluwatoyosi B.A. Owoeye²,³, Carolyn A. Emery²,⁴,⁵,⁶

¹United States Olympic & Paralympic Committee, Colorado Springs, CO, United States
²Sport Injury Prevention Research Centre, Faculty of Kinesiology, University of Calgary, Calgary, Canada
³Department of Physical Therapy and Athletic Training, Doisy College of Health Sciences, Saint Louis University, Saint Louis, MO, United States
⁴Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Canada
⁵McCaig Bone and Joint Institute, Cumming School of Medicine, University of Calgary, Calgary, Canada
⁶Departments of Community Health Sciences and Pediatrics, Cumming School of Medicine, University of Calgary, Calgary, Canada

Lauren C. Benson
✉ 2500 University Dr NW, Calgary, AB, T2N 1N4, Canada; 403-220-2170
Email: lauren.benson@ucalgary.ca

Received: 24-09-2020 -- Accepted: 26-01-2021
Published (online): 05-03-2021

ABSTRACT

Missing data can influence calculations of accumulated athlete workload. The objectives of this study were to identify the best single imputation methods and to examine workload trends using multiple imputation. External (jumps per hour) and internal (rating of perceived exertion; RPE) workload were recorded for 93 (45 females, 48 males) high school basketball players throughout a season. Recorded data were simulated as missing and imputed using ten imputation methods based on the context of the individual, team and session. Both single imputation and machine learning methods were used to impute the simulated missing data. The difference between the imputed data and the actual workload values was computed as root mean squared error (RMSE). A generalized estimating equation determined the effect of imputation method on RMSE. Multiple imputation of the original dataset, with all known and actual missing workload data, was used to examine trends in longitudinal workload data. Following multiple imputation, a Pearson correlation evaluated the longitudinal association between jump count and session RPE (sRPE) over the season. A single imputation method based on the specific context of the session for which data are missing (team mean) was outperformed only by methods that combine information about the session and the individual (machine learning models). There was a significant and strong association between jump count and sRPE in both the original data and the datasets imputed using multiple imputation. The amount and nature of the missing data should be considered when choosing a method for single imputation of workload data in youth basketball. Multiple imputation using several predictor variables in a regression model can be used for analyses where workload is accumulated across an entire season.

Key words: Jump count, imputation, training load, machine learning, basketball

Key Points

The error associated with single imputation of missing workload data depends on the method used.
Single imputation methods based on the specific context of the session for which data are missing (e.g., team mean) and methods that combine information about the session and the individual (e.g., machine learning models) have the smallest imputation error.
Multiple imputation using several predictor variables in a regression model can be used for analyses where workload is accumulated across an entire season.
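
The evaluation idea described in the abstract (hide a known value, impute it, and score the imputation by RMSE) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy jumps-per-hour values are invented, the function names are hypothetical, and the leave-one-out masking is a simplification rather than the study's actual simulation procedure. It compares only two simple single imputation methods, a team mean for the session and an individual mean across sessions.

```python
import math

# Hypothetical toy data (not from the study): jumps per hour,
# indexed as workload[session][player].
workload = {
    "s1": {"A": 40.0, "B": 50.0, "C": 60.0},
    "s2": {"A": 20.0, "B": 30.0, "C": 40.0},
    "s3": {"A": 60.0, "B": 70.0, "C": 80.0},
}

def team_mean(data, session, player):
    """Impute with the mean of the other players in the same session."""
    others = [v for p, v in data[session].items() if p != player]
    return sum(others) / len(others)

def individual_mean(data, session, player):
    """Impute with the same player's mean over all other sessions."""
    others = [vals[player] for s, vals in data.items() if s != session]
    return sum(others) / len(others)

def rmse(pairs):
    """Root mean squared error over (imputed, actual) pairs."""
    return math.sqrt(sum((imp - act) ** 2 for imp, act in pairs) / len(pairs))

def evaluate(data, impute):
    """Treat each known value as missing in turn, impute it, and score."""
    pairs = [
        (impute(data, session, player), actual)
        for session, vals in data.items()
        for player, actual in vals.items()
    ]
    return rmse(pairs)

team_rmse = evaluate(workload, team_mean)
individual_rmse = evaluate(workload, individual_mean)
print(f"team mean RMSE: {team_rmse:.2f}")
print(f"individual mean RMSE: {individual_rmse:.2f}")
```

In this toy data the sessions differ from each other more than the players do, so the team mean yields the smaller RMSE. That outcome is a property of the invented numbers and should not be read as a result of the study.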
