
J Appl Physiol 115: 1229–1236, 2013. First published August 29, 2013; doi:10.1152/japplphysiol.01443.2012.

Neural network versus activity-specific prediction equations for energy expenditure estimation in children

Nicole Ruch,1 Franziska Joss,2 Gerda Jimmy,1 Katarina Melzer,1 Johanna Hänggi,3 and Urs Mäder1

1Swiss Federal Institute of Sport, Magglingen, Switzerland; 2Swiss Federal Institute of Technology, Zürich, Switzerland; and 3School for Teacher Education, University of Applied Sciences and Arts Northwestern Switzerland, Brugg, Switzerland

Submitted 3 December 2012; accepted in final form 27 August 2013

Ruch N, Joss F, Jimmy G, Melzer K, Hänggi J, Mäder U. Neural network versus activity-specific prediction equations for energy expenditure estimation in children. J Appl Physiol 115: 1229–1236, 2013. First published August 29, 2013; doi:10.1152/japplphysiol.01443.2012. The aim of this study was to compare the energy expenditure (EE) estimations of activity-specific prediction equations (ASPE) and of an artificial neural network (ANNEE) based on accelerometry with measured EE. Forty-three children (age: 9.8 ± 2.4 yr) performed eight different activities. They were equipped with one tri-axial accelerometer that collected data in 1-s epochs and a portable gas analyzer. The ASPE and the ANNEE were trained to estimate the EE by including accelerometry, age, gender, and weight of the participants. To provide the activity-specific information, a decision tree was trained to recognize the type of activity through accelerometer data. The ASPE were applied to the activity-type-specific data recognized by the tree (Tree-ASPE). The Tree-ASPE precisely estimated the EE of all activities except cycling [bias: −1.13 ± 1.33 metabolic equivalent (MET)] and walking (bias: 0.29 ± 0.64 MET; P < 0.05). The ANNEE overestimated the EE of stationary activities (bias: 0.31 ± 0.47 MET) and walking (bias: 0.61 ± 0.72 MET) and underestimated the EE of cycling (bias: −0.90 ± 1.18 MET; P < 0.05). Biases of EE in stationary activities (ANNEE: 0.31 ± 0.47 MET, Tree-ASPE: 0.08 ± 0.21 MET) and walking (ANNEE: 0.61 ± 0.72 MET, Tree-ASPE: 0.29 ± 0.64 MET) were significantly smaller in the Tree-ASPE than in the ANNEE (P < 0.05). The Tree-ASPE was more precise in estimating the EE than the ANNEE. The use of activity-type-specific information for subsequent EE prediction equations might be a promising approach for future studies.
automated pattern recognition; energy metabolism; child; physical activity; accelerometer

Address for reprint requests and other correspondence: N. Ruch, Swiss Federal Institute of Sport Magglingen, Hauptstrasse 247, 2532 Magglingen, Switzerland (e-mail: [email protected]). http://www.jappl.org

THE ASSESSMENT OF physical activity (PA) behavior is important in studies investigating changes in children’s PA levels over time, the effectiveness of PA interventions, or the relationship between PA and health. Accelerometers are currently the devices of choice to quantify PA behavior in terms of duration, frequency, activity type, and intensity. The process used to convert direct signals (i.e., acceleration) of wearable monitors into other established measurement units, such as energy expenditure (EE), is referred to as value calibration (3). Until recently, single linear regression equations were the most widely used approach to relate accelerometer measurements to EE in calibration studies (17, 23, 31, 33). However, the EE of certain activities was reported to be estimated inaccurately because of the low accelerations in these activities (17, 30, 32). The prior identification of activity categories was shown recently to improve EE estimation by applying metabolic equivalent (MET) values (7) or activity-specific prediction equations (ASPE) to the data of the respective activity type (8–10). On the other hand, estimation of EE using a support vector machine algorithm (26) or an artificial neural network (ANN) was suggested recently (18, 28, 35). ASPE and an ANN were shown to be more accurate than conventional single prediction equations (SPE) (9, 10, 18, 35). There is an ongoing debate about whether an ANN or ASPE provides a better method for processing accelerometer data into EE (5, 16). However, these methods have not yet been compared directly. The aim of the present study was to compare the EE estimations of ASPE that were applied to the activity-type-specific data determined by a decision tree (Tree-ASPE) and the EE estimation of an ANN with measured EE values.

METHOD

The study included 22 girls (age: 9.8 ± 2.4 yr, weight: 33.4 ± 10.3 kg, height: 1.4 ± 0.2 m) and 21 boys (age: 9.8 ± 2.2 yr, weight: 36.4 ± 12.7 kg, height: 1.40 ± 0.15 m). Parents and children gave their written, informed consent and assent, respectively. The study was approved by the regional Ethics Committee. Height of the children was measured to the nearest 0.5 cm using a stadiometer (Model 213; Seca, Hamburg, Germany), and weight was measured to the nearest 0.1 kg using a standardized digital scale (Model 861; Seca) with the children wearing light clothing and no shoes. Then, the children were equipped with a portable gas analyzer mounted on the chest and a tri-axial accelerometer attached to the right hip with an elastic belt. The children were then asked to perform eight activities: sitting, standing, walking, running, cycling, riding a city scooter (a skateboard with a handle), jumping, and crawling. The activities were chosen according to an observational study that determined the main activity classes for the development of an activity-type classification system (25). The boys (girls) in that study were 10.6 ± 0.8 yr (10.9 ± 1.0 yr) old, 1.5 ± 0.1 m (1.5 ± 0.1 m) tall, and weighed 37.6 ± 6.9 kg (38.0 ± 6.1 kg). Sitting and standing were performed at the beginning of the measurements; all other activities were performed subsequently in randomized order. The children were asked to perform all activities at their own moderate pace. For crawling, they were asked to crawl while balancing a table-tennis ball on a spoon in one hand. For jumping, they were asked to jump over a rope lying on the floor at a pace of one jump per second to obtain a continuous activity. All activities were performed for 3.5 min each. The selected duration was reported to be sufficient to obtain steady-state conditions during EE measurements (22, 38).
Coefficients of variation < 10% were reached in all children in the activities sitting, standing, cycling, running, and jumping. During walking, riding a city scooter, and crawling, one, one, and three children, respectively, reached values between 10% and 15%, and during riding a city scooter and crawling, one child each reached values between 15% and 20%. The data of the latter were discarded from the analysis. All activities were separated by a break of ~3 min. Oxygen consumption (VO2) and

8750-7587/13 Copyright © 2013 the American Physiological Society


carbon dioxide production (CO2) were measured with a portable gas analyzer (MetaMax 3B; Cortex Biophysik, Leipzig, Germany). A pediatric facemask (Hans Rudolph, Shawnee, KS) was placed over each participant’s mouth and nose and was wired to a recording device worn on the chest of the participant. The recorded data were downloaded to a laptop with the corresponding software (MetaSoft, Version 3.9.8; Cortex Biophysik). Before every measurement, the gas analyzer was calibrated according to the manufacturer’s guidelines. The validity, reliability, and stability of the device were reported elsewhere (37). VO2 and CO2 values were averaged over the last minute of each activity, after the data had been inspected visually and the final 5–10 s, during which the EE was decreasing, had been cut off. Data were converted to EE (kJ/min) by applying the EE equation provided by Elia and Livesey (15). MET levels were obtained by dividing activity EE by the resting EE (REE) to allow comparisons with previous research. REE was predicted using Schofield’s regression for children (24, 27). A tri-axial accelerometer (GT3X; ActiGraph, Pensacola, FL), which measures accelerations with a full range of ±3 g, was used (12). The accelerometer includes a 12-bit analog-to-digital converter that digitizes the signal at a sampling rate of 30 Hz. The amplitude is proportional to the energy of the acceleration during a given period. The device band-pass filters and sums the acceleration using the manufacturer’s proprietary algorithm and was programmed to store the resulting accelerometer counts for each second. The step count and inclinometer functions of the GT3X were activated when the device was initialized with the corresponding software (ActiLife 6; ActiGraph). Each second of the same last minute used in the EE measurement was labeled according to the performed activity type.
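The averaging and MET steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' processing: it assumes the EE signal has already been resampled to a fixed period, it replaces the visual inspection with a fixed `trim_s` cut-off, and the function and parameter names are hypothetical. The REE value is taken as an input because the Schofield coefficients are not given in the text.

```python
from statistics import mean


def steady_state_mean(samples, sample_period_s=1.0, trim_s=10.0, window_s=60.0):
    """Average the last `window_s` seconds that remain after dropping the
    final `trim_s` seconds (assumed fixed here; the study trimmed by eye)."""
    n_trim = int(round(trim_s / sample_period_s))
    n_window = int(round(window_s / sample_period_s))
    kept = list(samples[:len(samples) - n_trim]) if n_trim > 0 else list(samples)
    if len(kept) < n_window:
        raise ValueError("not enough samples for the averaging window")
    return mean(kept[-n_window:])


def to_met(activity_ee_kj_min, resting_ee_kj_min):
    """MET level = activity EE divided by (predicted) resting EE."""
    if resting_ee_kj_min <= 0:
        raise ValueError("resting EE must be positive")
    return activity_ee_kj_min / resting_ee_kj_min
```

For example, an activity EE of 12 kJ/min with a predicted REE of 4 kJ/min corresponds to 3 MET.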
Features used during this minute included the 1-s accelerometer counts of all three axes (x, y, z), the vector magnitude (VM) of the three axes, $\mathrm{VM} = \sqrt{x^2 + y^2 + z^2}$, the steps, and the classification of postures provided by the inclinometer function. To provide the activity-specific information for the ASPE, a recursive decision tree was trained with these features using a minimal number of 200 data points in each node of the tree. The final size of the tree was determined by




a cost-complexity function showing that the optimum between misclassification and tree size was reached at a split of 20 (Fig. 1). For a comparison of the activity-type classification with the decision tree, a feed-forward ANN with one hidden layer that recognizes the activity classes (ANNClass) was developed with the same features as the tree and was used to classify the performed activities. An ANN implements linear discriminants but in a space where the inputs have been mapped nonlinearly. To determine the best set of parameters, all combinations of weight decay between 0.1 and 0.9 and the number of hidden neurons between 1 and 20 were tested with 1,000 training cycles using the leave-one-subject-out method. The best overall accuracy was reached using a weight decay of 0.4 and 18 hidden neurons.

For the development of ASPE, the accelerometer data were aggregated by their means for activity type and subject and were aligned to the averaged EE data. Multiple regression analysis was used to estimate the effect of sex, age, body weight, tri-axial acceleration, vector magnitude, steps, and inclinometer on EE with backward exclusion of single predictors. The personal factors were included as they were shown to affect EE significantly (14, 27). After the decision tree determined the activity type, the respective ASPE were applied to the activity-type-specific data, resulting in the EE estimated by the Tree-ASPE.

A feed-forward ANN with one hidden layer was used to estimate the EE (ANNEE) of the activities using the same features as the ASPE. To determine the best set of parameters, all combinations of weight decay between 0.1 and 0.9 and the number of hidden neurons between 1 and 20 were tested with 1,000 training cycles using the leave-one-subject-out method. A model with a weight decay of 0.2 and 12 hidden neurons was found to provide the lowest total root mean squared error (RMSE) in the EE estimation.
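The parameter search described above combines a grid over weight decay and hidden-layer size with leave-one-subject-out splits. A minimal Python sketch of that control flow follows. It is not the authors' R code: the 0.1 step of the weight-decay grid is an assumption (the text gives only the range), and `fit_and_score` is a hypothetical callback that would train a network on the training indices and return an error on the held-out subject.

```python
from itertools import product


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


def parameter_grid():
    """All (weight_decay, n_hidden) pairs; the 0.1 decay step is assumed."""
    decays = [round(0.1 * k, 1) for k in range(1, 10)]  # 0.1 ... 0.9
    hidden = range(1, 21)                               # 1 ... 20
    return list(product(decays, hidden))


def select_parameters(subject_ids, fit_and_score):
    """Return (mean_error, weight_decay, n_hidden) with the lowest mean
    held-out error; fit_and_score(decay, n_hidden, train, test) is a
    hypothetical user-supplied function (lower is better)."""
    best = None
    for decay, n_hidden in parameter_grid():
        errors = [fit_and_score(decay, n_hidden, train, test)
                  for _, train, test in leave_one_subject_out(subject_ids)]
        score = sum(errors) / len(errors)
        if best is None or score < best[0]:
            best = (score, decay, n_hidden)
    return best
```

Splitting by subject rather than by sample keeps all seconds of one child on the same side of the split, which is the point of the leave-one-subject-out design.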
Using multiple regression with backward exclusion of variables, an additional SPE was developed with the same features as the Tree-ASPE for comparison with the Tree-ASPE approach. In addition, three previous prediction equations were tested

Fig. 1. Decision tree used for activity classification. Decision criteria refer to the left branch of the respective split. VM = vector magnitude (counts/second), vertical = vertical acceleration (counts/second), ant_post = antero-posterior axis (counts/second), medio_lat = medio-lateral axis (counts/second), steps = steps/second. Inclinometer contains levels 1–3. These levels are an a priori position classification according to the manufacturer (1 = standing, 2 = lying, 3 = sitting).


on the data of this study, including one single-regression model developed by Freedson et al. (17) for vertical, 60-s accelerometer data and two two-regression models developed by Crouter et al. (10) using vertical accelerometer data or VM over 10-s intervals. For the analysis with the previous regression models (10, 17), the VM was calculated for each second; the 1-s epochs of the vertical axis and the vector magnitude were then converted to counts/10 s and counts/minute, and the coefficient of variation was calculated for use with the regressions of Crouter et al. (10). Leave-one-subject-out cross-validation (29) was used to evaluate the EE estimation of all methods and the classification of the decision tree and the ANNClass. For the descriptive analysis of the EE estimates, means and SD were used. Mean bias and RMSE of the estimated and measured EE were determined for each activity. To compare the prediction of all EE estimation methods with measured values, a nonparametric Wilcoxon rank-sum test with Bonferroni adjustments for multiple comparisons was used to account for the different activity classes. This nonparametric test was used because the analysis of quantile-quantile plots revealed non-normally distributed EE estimates in the different activity classes. The same procedure was used to compare the biases of estimated and measured EE values between the Tree-ASPE and all other EE estimation methods. A confusion matrix was used to show proportions of correctly assigned and misclassified activity types. The proportions of data that were classified correctly and misclassified by the decision tree were compared with those of the ANNClass using a χ² test. All statistical




analyses were performed in R (The R Project for Statistical Computing, Version 2.14.0; The R Foundation for Statistical Computing, Vienna, Austria). The decision tree and the ANNs were developed and tested using the packages rpart (Recursive Partitioning) and nnet (Neural Networks), respectively.

RESULTS

In the ASPE, a large part of the variance was explained by the regression equations for running, jumping, and crawling (R² = 0.80–0.83; P < 0.001). Moderate R² values were found for the remaining activities (R² = 0.59–0.68; P < 0.001; Table 1). Acceleration values were statistically excluded in the models for sitting and standing. In the other activities, different combinations of acceleration features, such as VM, steps, and the individual axes, were included besides the personal characteristics of the participants. The EE estimated by the Tree-ASPE did not differ significantly from the measured EE in six activities (Fig. 2). However, the Tree-ASPE significantly overestimated the EE of walking (P < 0.05) and underestimated the EE of cycling (P < 0.05). The ANNEE significantly overestimated the EE of stationary activities and walking and underestimated the EE of cycling (P < 0.05; Fig. 2). The biases of the Tree-ASPE were significantly lower than the biases of the ANNEE during stationary activities and walking (P < 0.05; Table 2). Total

Table 1. Activity-specific prediction equations (ASPE) and a single prediction equation (SPE) for the estimation of energy expenditure (EE; kJ/min) from accelerometer and anthropometric data

| Activity class | Independent | Coefficient | Par R² | R² | SEE, kJ/min | P |
|---|---|---|---|---|---|---|
| SPE | Intercept | −1.880 | | 0.729 | 4.64 | <0.001 |
| | VM | 0.086 | 0.602 | | | |
| | Weight | 0.259 | 0.102 | | | |
| | Steps | 1.955 | 0.025 | | | |
| Sitting | Intercept | 1.412 | | 0.614 | 0.86 | <0.001 |
| | Sex (0/1) | 0.472 | 0.055 | | | |
| | Age | 0.148 | 0.245 | | | |
| | Weight | 0.062 | 0.314 | | | |
| Standing | Intercept | 2.613 | | 0.627 | 0.72 | <0.001 |
| | Weight | 0.083 | 0.627 | | | |
| Walking | Intercept | −1.24 | | 0.642 | 2.09 | <0.001 |
| | Steps | 1.529 | 0.019 | | | |
| | VM | 0.084 | 0.194 | | | |
| | Weight | 0.172 | 0.435 | | | |
| Running | Intercept | 1.325 | | 0.829 | 3.14 | <0.001 |
| | Vertical | 0.091 | 0.094 | | | |
| | Antero-posterior | −0.071 | 0.008 | | | |
| | Sex (0/1) | 2.180 | 0.021 | | | |
| | Weight | 0.436 | 0.706 | | | |
| Jumping | Intercept | 6.23 | | 0.800 | 3.83 | <0.001 |
| | Steps | −10.82 | 0.011 | | | |
| | VM | 0.107 | 0.160 | | | |
| | Weight | 0.507 | 0.629 | | | |
| Scooter | Intercept | −8.534 | | 0.683 | 3.70 | <0.001 |
| | Antero-posterior | 0.104 | 0.061 | | | |
| | Medio-lateral | 0.173 | 0.246 | | | |
| | Steps | 7.768 | 0.154 | | | |
| | Weight | 0.252 | 0.222 | | | |
| Cycling | Intercept | 6.106 | | 0.590 | 3.68 | <0.001 |
| | Weight | 0.223 | 0.190 | | | |
| | Vertical | 0.534 | 0.319 | | | |
| | Antero-posterior | 0.279 | 0.021 | | | |
| | VM | −0.233 | 0.060 | | | |
| Crawling | Intercept | −2.596 | | 0.818 | 1.89 | <0.001 |
| | Vertical | 0.053 | 0.018 | | | |
| | Steps | 6.91 | 0.056 | | | |
| | Weight | 0.302 | 0.744 | | | |

Par R² = partial R²; SEE = standard error of the estimate; VM = vector magnitude; Sex: 0 = girls, 1 = boys; Scooter = riding a city scooter.
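As a worked illustration of how the equations in Table 1 are applied, the following Python sketch evaluates the sitting and standing ASPE, which contain only personal characteristics. The coefficients are copied from Table 1; the units of the predictors (age in years, weight in kilograms) are an assumption based on the units reported in the Method section, and the function name is hypothetical.

```python
# Coefficients copied from Table 1 (EE in kJ/min).
# Assumed predictor units: age in years, weight in kg, sex 0 = girl, 1 = boy.
ASPE = {
    "sitting": {"intercept": 1.412, "sex": 0.472, "age": 0.148, "weight": 0.062},
    "standing": {"intercept": 2.613, "weight": 0.083},
}


def predict_ee(activity, predictors):
    """Evaluate the linear prediction equation of one activity class."""
    coeffs = ASPE[activity]
    ee = coeffs["intercept"]
    for name, beta in coeffs.items():
        if name != "intercept":
            ee += beta * predictors[name]
    return ee


# Hypothetical participant: a 10-yr-old girl weighing 35 kg.
child = {"sex": 0, "age": 10.0, "weight": 35.0}
sitting_ee = predict_ee("sitting", child)    # 1.412 + 0 + 1.48 + 2.17
standing_ee = predict_ee("standing", child)  # 2.613 + 2.905
```

For this hypothetical child the two equations give about 5.06 kJ/min for sitting and 5.52 kJ/min for standing.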


Fig. 2. Energy expenditure (EE), measured and estimated by decision tree activity-specific prediction equations (Tree-ASPE), an artificial neural network (ANNEE), and a single prediction equation (SPE). Scooter = riding a city scooter. The bottom and top of the box are the 1st and 3rd quartiles, respectively. The ends of the whiskers are ±1.58 of the interquartile range. *Differs significantly from the measured value (P < 0.05). MET = metabolic equivalent.

RMSEs over all activity types were 0.67 ± 0.42 MET and 0.94 ± 0.60 MET for the Tree-ASPE and ANNEE, respectively. RMSE was significantly larger in the ANNEE than in the Tree-ASPE during sitting, standing, walking, and crawling (P < 0.05). The SPE significantly overestimated the EE of stationary activities, walking, and crawling and underestimated the EE during riding a city scooter and cycling compared with measured values (P < 0.05; Fig. 2). Biases of the SPE were significantly different from the biases of the Tree-ASPE during sitting, standing, walking, and cycling (P < 0.05). The total RMSE of the SPE was 1.24 ± 0.83 MET. During sitting, standing, walking, and cycling, the RMSE was significantly different between the SPE and the Tree-ASPE (P < 0.05). The Freedson equation (17) significantly overestimated the EE of all activities (P < 0.05) except riding a city scooter (bias: 0.3 ± 1.4 MET) and cycling, which it significantly underestimated (bias: −2.0 ± 1.1 MET; P < 0.05), compared with measured values (Fig. 3). Biases of the Freedson equation were significantly larger than those of the Tree-ASPE during all activities except riding a city scooter (bias: 0.3 ± 1.4 MET; P < 0.05; Table 3). The EE estimated by the Crouter equation (10) using vertical acceleration was significantly different from measured values during stationary activities, jumping, riding a city scooter, and cycling (P < 0.05). The Crouter equation (10) that used VM for the EE prediction resulted in accurate EE estimates in walking (bias: −0.5 ± 0.6 MET), running (bias: −1.0 ± 1.1 MET), and riding a city scooter (bias: −0.6 ± 1.2 MET; Fig. 3). Biases of the Crouter equation (10) using vertical acceleration were significantly

different from the biases of the Tree-ASPE, except for running (bias: −0.4 ± 1.7 MET) and crawling (bias: −0.04 ± 1.0 MET; P < 0.05). Biases of the Crouter equation (10) using VM were significantly different from the biases of the Tree-ASPE in all activities (P < 0.05). RMSE of the Freedson (17) and Crouter (10) equations was significantly larger than the RMSE of the Tree-ASPE during all activities except riding a city scooter and, for the Crouter equations (10), walking (P < 0.05). The decision tree and the ANNClass correctly classified 60.5% and 63.8% of the data, respectively. Proportions of data that were classified correctly and misclassified were not significantly different between the two classification procedures (P = 0.66). Stationary activities, walking, running, and jumping were the activities with the highest recognition rates in the decision tree and the ANNClass (Table 4). Crawling was recognized at higher rates by the ANNClass than by the decision tree. Riding a city scooter was most commonly misclassified as walking, crawling, and cycling by both classifiers, whereas crawling was mostly mistaken for riding a city scooter, walking, and cycling. Cycling was misclassified mainly as a stationary activity and as riding a city scooter by both classifiers.

DISCUSSION

This study compared a Tree-ASPE with an ANNEE regarding the accuracy of their estimation of children’s EE, based on accelerometer data. Results showed that the Tree-ASPE was more accurate in estimating the EE than the ANNEE, indicating

Table 2. Bias and root mean squared error (RMSE) of EE, measured and estimated by decision tree ASPE (Tree-ASPE), an artificial neural network (ANNEE), and an SPE. Mean bias and RMSE are given in kJ/min (MET).

| Activity | Mean bias, Tree-ASPE | Mean bias, ANNEE | Mean bias, SPE | RMSE, Tree-ASPE | RMSE, ANNEE | RMSE, SPE |
|---|---|---|---|---|---|---|
| Sitting | 0.3 ± 0.9 (0.1 ± 0.2) | 1.3 ± 3.2 (0.3 ± 0.6) | 2.1 ± 2.2 (0.5 ± 0.5)* | 0.9 ± 0.6 (0.2 ± 0.2) | 2.0 ± 1.5 (0.5 ± 0.4)* | 2.2 ± 2.1 (0.6 ± 0.4)* |
| Standing | 0.3 ± 0.8 (0.1 ± 0.3) | 1.6 ± 3.4 (0.4 ± 0.8) | 1.73 ± 2.11 (0.4 ± 0.5)* | 0.7 ± 0.6 (0.2 ± 0.2) | 2.1 ± 2.2 (0.5 ± 0.5)* | 2.1 ± 1.8 (0.5 ± 0.3)* |
| Walking | 1.1 ± 2.5 (0.3 ± 0.6) | 3.1 ± 4.4 (0.8 ± 0.9)* | 3.20 ± 2.25 (0.8 ± 0.6)* | 2.1 ± 1.7 (0.6 ± 0.4) | 3.4 ± 2.8 (0.9 ± 0.8)* | 3.3 ± 2.1 (0.9 ± 0.5)* |
| Running | −1.2 ± 3.4 (−0.3 ± 1.0) | −0.6 ± 4.0 (−0.2 ± 1.1) | −1.8 ± 4.0 (−0.4 ± 1.1) | 2.7 ± 2.6 (0.7 ± 0.7) | 3.6 ± 4.0 (0.9 ± 0.9) | 3.4 ± 2.8 (0.9 ± 0.7)* |
| Jumping | 0.4 ± 3.6 (0.2 ± 0.9) | −0.5 ± 5.8 (−0.2 ± 1.4) | 1.3 ± 5.2 (−0.5 ± 1.4) | 2.8 ± 2.4 (0.8 ± 0.6) | 4.5 ± 4.6 (1.1 ± 0.9) | 4.5 ± 2.9 (1.2 ± 0.8)* |
| Scooter | 0.3 ± 4.6 (0.03 ± 1.2) | −0.4 ± 5.8 (−0.2 ± 1.4) | −1.8 ± 4.1 (−0.5 ± 1.1) | 3.5 ± 2.9 (0.9 ± 0.7) | 4.0 ± 3.4 (1.1 ± 0.9) | 3.4 ± 2.9 (0.9 ± 0.7) |
| Cycling | −4.2 ± 5.0 (−1.1 ± 1.3) | −2.5 ± 5.6 (−0.8 ± 1.3) | −6.7 ± 4.3 (−1.9 ± 1.3)* | 4.8 ± 4.1 (1.3 ± 1.1) | 4.3 ± 3.6 (1.2 ± 1.0) | 6.8 ± 4.1 (1.9 ± 1.2)* |
| Crawling | 0.8 ± 2.1 (0.3 ± 0.6) | 2.3 ± 4.0 (0.6 ± 0.9)* | 1.9 ± 2.2 (0.6 ± 0.6) | 1.7 ± 1.5 (0.5 ± 0.4) | 3.2 ± 2.7 (0.9 ± 0.7)* | 2.4 ± 1.7 (0.7 ± 0.5)* |

Values are means ± SD. *Differs significantly from the ASPE (P < 0.05). MET = metabolic equivalent of task.
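The two error measures reported in Table 2 can be written out as a short Python sketch. It uses the usual definitions (bias = estimated minus measured, RMSE = square root of the mean squared error). How the SD of the RMSE in Table 2 was obtained is not stated in the text, so this sketch returns a single RMSE per set of paired values; the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def bias_and_rmse(estimated, measured):
    """Mean bias, SD of the bias, and RMSE for paired EE values."""
    if len(estimated) != len(measured):
        raise ValueError("estimated and measured must have the same length")
    errors = [e - m for e, m in zip(estimated, measured)]
    return {
        "mean_bias": mean(errors),                      # > 0 means overestimation
        "sd_bias": stdev(errors),                       # sample SD of the errors
        "rmse": sqrt(mean([x * x for x in errors])),
    }
```

A positive mean bias corresponds to overestimation of the measured EE and a negative one to underestimation, which matches the sign convention used for cycling in the text.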


Fig. 3. EE estimated by a previous SPE model [Freedson et al. (17)] and two 2-regression models using vertical acceleration (Crouter ver) or vector magnitude (Crouter vm) [Crouter et al. (10)]. The bottom and top of the box are the 1st and 3rd quartiles, respectively. The ends of the whiskers are ±1.58 of the interquartile range. *Differs significantly from the measured value (P < 0.05).

that prior activity recognition might be useful for estimating the EE of different activities accurately. This was supported by the finding that an SPE developed within the same dataset and a previously developed SPE (17) estimated the EE less accurately than the Tree-ASPE. Two previously published two-regression models estimated the EE of locomotor activities accurately; however, they were less precise in estimating the EE in other activities. This study provided a direct comparison of the Tree-ASPE with an ANNEE and several previous methods and revealed that using activity-specific information might be a promising approach for the accurate prediction of EE of different activities in children. The Tree-ASPE was shown to be accurate in the estimation of the EE of more activities than the ANNEE. It was found that the bias during sitting and walking was significantly lower in the Tree-ASPE than in the ANNEE. Similarly, the RMSE was higher during sitting, standing, walking, and crawling in the ANNEE than in the Tree-ASPE. However, both methods underestimated the EE of cycling and overestimated the EE of walking. In addition, the ANNEE seemed to be overly adaptive to the more vigorous activities and to neglect the static activities. The Tree-ASPE estimated the EE for each activity separately and therefore may have been more sensitive to their individual intensities. In another study that used an ANNEE (35), an RMSE of 0.9–1.1 MET was achieved, which is comparable with the results of the ANNEE of our study. These authors reported decreasing RMSE in the EE estimation of their ANNEE when the features were sampled over 60 s instead of 10 s. Hence, the RMSE in our study might also drop under such conditions, but

this might be true for both the ANNEE and the Tree-ASPE. Therefore, in the present study, the Tree-ASPE was an accurate method for estimating the EE of the most common daily activities in children, and it might be preferred to an ANN as the more accurate method. The application of ASPE requires prior activity recognition. This two-step analysis might be regarded as complicated. An ANNEE, on the other hand, is an easily applied, straightforward approach to investigating the EE using accelerometer data. However, before the application of an ANN, the complexity of the model has to be determined. The results of the ANN might not be generalizable if too many parameters are used; however, if too few parameters are used, the training data cannot be fitted adequately (13). If the patterns are separated well or are linearly separable, few hidden neurons are needed. In contrast, more hidden neurons are needed if the patterns are highly interspersed. Therefore, parameters have to be chosen carefully, although there is no foolproof method for determining the number of hidden neurons (13). In the present study, the parameters of the ANNEE were determined using leave-one-subject-out validation. Overfitting was therefore unlikely, which was underlined by the lower number of hidden neurons compared with previous studies (18, 30). Those studies estimated the EE of a large variety of activities, which might have resulted in nonlinear separation, increasing the number of hidden neurons. Finally, as the parameters used in an ANN depend on the training data, it is difficult to compare them between studies if different activities were performed. However, cross-validating these parameters, as was done in the present study, is important, as there is no gold standard for

Table 3. Bias and root mean squared error (RMSE) of the Freedson equation (17) and the Crouter equations (10). Mean bias and RMSE are given in kJ/min (MET).

| Activity | Mean bias, Freedson | Mean bias, Crouter vertical | Mean bias, Crouter vm | RMSE, Freedson | RMSE, Crouter vertical | RMSE, Crouter vm |
|---|---|---|---|---|---|---|
| Sitting | 1.6 ± 1.1 (0.5 ± 0.4)* | −1.7 ± 0.8 (−0.4 ± 0.2)* | −1.6 ± 1.0 (−0.4 ± 0.2)* | 1.7 ± 0.9 (0.5 ± 0.3)* | 1.6 ± 0.9 (0.4 ± 0.2)* | 1.6 ± 0.9 (0.4 ± 0.2)* |
| Standing | 1.3 ± 1.0 (0.4 ± 0.3)* | −1.7 ± 0.8 (−0.5 ± 0.2)* | −1.8 ± 0.8 (−0.5 ± 0.2)* | 1.4 ± 0.9 (0.4 ± 0.3)* | 1.8 ± 0.8 (0.5 ± 0.2)* | 1.8 ± 0.8 (0.5 ± 0.2)* |
| Walking | 3.5 ± 3.0 (1.0 ± 0.8)* | −0.7 ± 2.1 (−0.2 ± 0.6)* | −1.7 ± 2.0 (−0.5 ± 0.6)* | 3.8 ± 2.5 (1.0 ± 0.7)* | 1.7 ± 1.4 (0.5 ± 0.4) | 2.0 ± 1.7 (0.6 ± 0.5) |
| Running | 8.8 ± 5.8 (2.5 ± 1.6)* | −1.5 ± 7.0 (−0.4 ± 1.7) | −3.6 ± 3.9 (−1.0 ± 1.0)* | 9.1 ± 5.4 (2.5 ± 1.5)* | 5.1 ± 4.8 (1.4 ± 1.1)* | 4.4 ± 3.1 (1.2 ± 0.8)* |
| Jumping | 24.5 ± 8.0 (6.7 ± 2.2)* | 21.9 ± 14.9 (5.8 ± 3.8)* | 4.3 ± 5.4 (1.2 ± 1.4)* | 24.5 ± 8.0 (6.7 ± 2.2)* | 21.9 ± 14.8 (5.9 ± 3.7)* | 6.0 ± 3.6 (1.6 ± 0.9)* |
| Scooter | 0.9 ± 4.9 (0.3 ± 1.4) | −2.6 ± 4.3 (−0.7 ± 1.2)* | −2.1 ± 3.9 (−0.6 ± 1.1)* | 3.8 ± 3.1 (1.1 ± 0.9) | 3.8 ± 3.3 (1.0 ± 0.9) | 3.4 ± 2.8 (0.9 ± 0.7) |
| Cycling | −7.5 ± 4.3 (−2.0 ± 1.1)* | −8.6 ± 4.4 (−2.3 ± 1.1)* | −8.8 ± 4.8 (−2.4 ± 1.3)* | 7.6 ± 4.3 (2.0 ± 1.1)* | 8.6 ± 4.4 (2.3 ± 1.1)* | 8.9 ± 4.8 (2.4 ± 1.3)* |
| Crawling | 2.2 ± 4.6 (0.7 ± 1.3)* | −0.5 ± 3.8 (−0.04 ± 1.0) | 2.6 ± 2.8 (0.8 ± 0.8)* | 4.2 ± 2.8 (1.2 ± 0.9)* | 2.9 ± 2.5 (0.8 ± 0.6)* | 3.1 ± 2.3 (0.9 ± 0.7)* |

Values are means ± SD. *Differs significantly from the ASPE (P < 0.05). Freedson = Freedson et al. (17) model; Crouter vertical = Crouter et al. (10) vertical acceleration model; Crouter vm = Crouter et al. (10) VM model.
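Applying the earlier equations required the 1-s counts to be re-aggregated into 10-s and 60-s epochs and a coefficient of variation to be computed, as described in the Method section. The Python sketch below shows one plausible way to do this. It is an assumption-laden illustration: summing counts within non-overlapping windows, discarding an incomplete final window, and using the population SD are choices made here, not details given in the text, and the published regression coefficients themselves are deliberately not reproduced.

```python
from statistics import mean, pstdev


def aggregate_counts(counts_1s, window_s):
    """Sum 1-s counts into non-overlapping windows of `window_s` seconds;
    an incomplete trailing window is discarded (assumption)."""
    n_full = len(counts_1s) // window_s
    return [sum(counts_1s[i * window_s:(i + 1) * window_s]) for i in range(n_full)]


def coefficient_of_variation(values):
    """Coefficient of variation in percent (population SD assumed)."""
    m = mean(values)
    if m == 0:
        return 0.0
    return 100.0 * pstdev(values) / m


# Hypothetical minute of constant activity at 1 count/s.
minute = [1] * 60
counts_per_10s = aggregate_counts(minute, 10)
counts_per_min = aggregate_counts(minute, 60)
```

A perfectly steady signal such as the hypothetical minute above yields a coefficient of variation of 0% across its 10-s windows, whereas intermittent movement yields larger values.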


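Table 4. Classification results of a recursive classification tree (Tree) and an ANNClass. Rows give the observed activity and the method; columns give the predicted activity (% of the observed data).

| Observed | Method | Sitting, % | Standing, % | Walking, % | Crawling, % | Scooter, % | Jumping, % | Running, % | Cycling, % |
|---|---|---|---|---|---|---|---|---|---|
| Sitting | Tree | 38.9 | 58.3 | 0 | 0 | 0.4 | 0 | 0 | 2.4 |
| Sitting | ANNClass | 36.2 | 61.2 | 0.0 | 0.0 | 0.1 | 0.0 | 0.0 | 2.5 |
| Standing | Tree | 50.7 | 42.7 | 0 | 0 | 1.4 | 0 | 0 | 5.1 |
| Standing | ANNClass | 34.3 | 60.5 | 0.0 | 0.0 | 0.6 | 0.0 | 0.1 | 4.5 |
| Walking | Tree | 0 | 0.1 | 79.9 | 3.6 | 9.0 | 0.9 | 1.3 | 5.3 |
| Walking | ANNClass | 0.0 | 0.2 | 79.9 | 4.1 | 9.9 | 0.4 | 1.8 | 3.8 |
| Crawling | Tree | 0.2 | 0.7 | 7.1 | 66.7 | 16.6 | 1.9 | 0 | 6.8 |
| Crawling | ANNClass | 0.1 | 0.9 | 5.0 | 75.0 | 14.2 | 0.7 | 0.1 | 4.1 |
| Scooter | Tree | 0.9 | 1.9 | 14.6 | 19.6 | 35.3 | 8.0 | 2.8 | 16.9 |
| Scooter | ANNClass | 0.5 | 3.4 | 15.6 | 20.2 | 39.5 | 5.4 | 2.1 | 13.3 |
| Jumping | Tree | 0.3 | 0.5 | 2.9 | 1.4 | 7.1 | 80.7 | 5.9 | 1.2 |
| Jumping | ANNClass | 0.3 | 0.5 | 2.7 | 2.2 | 6.7 | 79.8 | 7.1 | 0.7 |
| Running | Tree | 0 | 0.4 | 2.7 | 0.8 | 4.5 | 13.1 | 78.3 | 0.2 |
| Running | ANNClass | 0.0 | 0.4 | 2.7 | 0.9 | 4.2 | 11.5 | 80.1 | 0.2 |
| Cycling | Tree | 12.0 | 3.1 | 5.9 | 7.7 | 8.9 | 0 | 0.6 | 61.8 |
| Cycling | ANNClass | 11.0 | 3.5 | 6.3 | 8.4 | 10.5 | 0.1 | 0.5 | 59.7 |

Cells in which the predicted class equals the observed class give the percentage of correctly classified data. ANNClass = ANN that recognizes the activity classes.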
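A confusion matrix of the kind shown in Table 4, with each observed class expressed as row percentages, can be computed from paired labels as in the following Python sketch. The function names and the toy labels are hypothetical; the sketch only illustrates the bookkeeping and does not reproduce the study data.

```python
from collections import Counter


def confusion_percentages(observed, predicted, classes):
    """Row-normalized confusion matrix: for every observed class, the
    percentage of its samples assigned to each predicted class."""
    counts = Counter(zip(observed, predicted))
    table = {}
    for obs in classes:
        row_total = sum(counts[(obs, pred)] for pred in classes)
        table[obs] = {
            pred: (100.0 * counts[(obs, pred)] / row_total if row_total else 0.0)
            for pred in classes
        }
    return table


def overall_accuracy(observed, predicted):
    """Percentage of samples whose predicted class equals the observed class."""
    hits = sum(1 for o, p in zip(observed, predicted) if o == p)
    return 100.0 * hits / len(observed)


# Toy example with two classes (hypothetical labels).
obs = ["sitting", "sitting", "standing", "standing"]
pred = ["sitting", "standing", "standing", "standing"]
matrix = confusion_percentages(obs, pred, ["sitting", "standing"])
accuracy = overall_accuracy(obs, pred)
```

Because each row is normalized by the number of observed samples in that class, the rows sum to 100%, as they do in Table 4 up to rounding.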

determining their dimensions. For comparison between different studies, an ANNEE would have to be trained with the same parameters and the same training data. Furthermore, the reproducibility of an ANN for activities not included in the dataset was discussed recently (5). However, for large sample studies that require a convenient, straightforward approach, an ANN could be considered, given the inherent limitations in its interpretation. If the output should be robust and interpretable, standard statistical methods, such as ASPE, allow an easily accessible and detailed interpretation of the EE estimation. The Tree-ASPE increased the accuracy of the EE estimation in contrast to an SPE developed within the same dataset, as indicated by the lower biases and RMSE in several activities. A recent study (20), which compared two-regression models that distinguished between play and locomotor activities with an SPE, found no advantage of the two-regression approach when both methods were developed within the same dataset. However, for its validation analysis, that study used one Bland-Altman plot for all activities and did not reveal the biases within single activities (20). Therefore, there might have been improvements within single activities regarding the EE estimation when using the activity-specific two-regression approach. The results of the present study indicate that an SPE might estimate the EE of certain activities inaccurately. A prior activity recognition with subsequent activity-specific estimation of EE might therefore enhance the accuracy of EE estimation. The overestimation of EE during walking and running using the Freedson equation (17) was in line with previous research that showed that EE of slow and fast walking and slow running was overestimated by the Freedson equation compared with measured values (34).
This might be due to the equation’s development with locomotor data collected on the treadmill, which might have generated lower accelerations than the same activities produce in the field. Except during riding a city scooter, RMSE were larger than in the Tree-ASPE, indicating that the estimation of EE with prior activity-type recognition increased the accuracy of EE estimation compared with this previous EE prediction equation (17). The Crouter equations (10) seemed to estimate the EE of locomotor activities accurately, even though RMSE were larger than in the Tree-ASPE during running. These equations were less accurate for the nonlocomotor activities included in the present study. The inaccuracies of these previously developed equations might be explained, in part, by the use of other activities during their development. The second prediction equation, including household and game activities, was probably not sufficiently activity specific for an accurate EE estimation in the present study. Further subdividing the activities might therefore be more effective than using only two regressions. As these previous equations (10, 17) were developed originally with different activities and in other populations, they were at a disadvantage in predicting EE compared with the models developed within the present study. Therefore, the comparison of the outcome of the Tree-ASPE with an SPE developed within the same study sample, as was shown in the present study and in a previous study (20), might be more meaningful. As the ANNClass and the decision tree performed similarly during activity-type classification, a combination of ANNClass and the ASPE might also be considered as a combination of an activity-type classification and subsequent EE estimation. An ANNClass was reported to be more flexible and adaptive than a decision tree in applications where the functional form is nonlinear or the data are noisy (11). However, the included activities probably did not require an extreme adaptation, and therefore, the decision tree and the ANNClass were similarly accurate. Furthermore, high recognition rates were reported in a study using a decision tree on the data of six different activities in adults (4). Therefore, the activity recognition of decision trees seems to be comparable with that of an ANNClass, and as decision trees can be easily interpreted, they might even be the preferred method compared with an ANNClass.
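The two-step idea discussed above, in which any activity classifier is followed by the matching prediction equation, can be sketched in Python as a function that takes the classifier as an argument. The standing and walking coefficients are copied from Table 1; the threshold classifier is a purely hypothetical stand-in (it is neither the tree in Fig. 1 nor the ANNClass), and the units of the features are assumed to be counts/second, steps/second, and kilograms.

```python
# Coefficients copied from Table 1 (EE in kJ/min); feature units are assumed.
EQUATIONS = {
    "standing": {"intercept": 2.613, "weight": 0.083},
    "walking": {"intercept": -1.24, "steps": 1.529, "vm": 0.084, "weight": 0.172},
}


def two_step_ee(features, classify, equations):
    """Step 1: classify the activity. Step 2: apply its prediction equation."""
    activity = classify(features)
    coeffs = equations[activity]
    ee = coeffs["intercept"] + sum(
        beta * features[name] for name, beta in coeffs.items() if name != "intercept"
    )
    return activity, ee


def toy_classifier(features):
    """Hypothetical placeholder rule, NOT the decision tree of Fig. 1."""
    return "standing" if features["vm"] < 1.0 else "walking"


still = two_step_ee({"vm": 0.0, "steps": 0.0, "weight": 35.0}, toy_classifier, EQUATIONS)
moving = two_step_ee({"vm": 50.0, "steps": 2.0, "weight": 35.0}, toy_classifier, EQUATIONS)
```

Because the classifier is passed in as a function, a decision tree and an ANNClass could be exchanged without touching the equations, which is the combination the paragraph above suggests.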
Our study achieved lower recognition rates (74.2–75.3%) than other studies in which an ANNClass was applied to the vertical accelerometer data (76.8–88.4%) (11a, 35). As in our study, those studies investigated the recognition of walking and running among other activities. They reached high recognition rates during walking (92.0–93.9%), whereas our study reached rates between 79.4% and 79.8%. During running, the classifiers used in the present study reached 78.2–80.6%, whereas those used in previous studies reached 75.0–86.9%. Therefore, compared with other studies, the misclassifications of walking introduced a larger error into our model than those of running. As crawling and riding a city scooter were often confounded with other activities, such as walking, inclusion of these child-specific activities might have lowered the overall recognition rate compared with previous studies (11a, 35). Whereas our study used a 1-s epoch, these studies extracted several variables from the 1-s count data of the vertical axis over 10- to 60-s periods. This might have leveled out any extreme values and therefore improved their recognition rate (11a, 35). However, as children’s PA behavior is very spontaneous and intermittent (1, 2), the use of an epoch of 1 s, as in the present study, accounts for child-specific exercise length. Trost et al. (35) merged a wide variety of activities into a few classes, such as sedentary behavior, walking, running, and two groupings, called light-to-moderate household chores and games and moderate-to-vigorous games. This procedure most likely improved their recognition results, as the model was more tolerant than the activity-type-specific classification performed in the present study. Detection of the short activity sequences and the precise type of children’s PA, as in the present study, might be important with regard to the development of interventions, according to children’s activity-type-specific preferences and to activity guidelines that aim at health effects, such as bone health, that are affected by certain activity types, such as running or jumping (39).

The present study compared a Tree-ASPE with an ANNEE regarding their estimation of EE, based on data of a simple and well-accepted accelerometer. The study was conducted under laboratory conditions, and therefore, the performance of the methods may decrease under field conditions, as was reported previously (19).
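The contrast drawn above between classifying each 1-s epoch and extracting variables over 10- to 60-s periods can be illustrated with a short sketch. The features chosen here (mean, standard deviation, maximum) and the count values are assumptions for illustration; the variables actually used in the cited studies (11a, 35) are not specified in this text.

```python
from statistics import mean, pstdev

def window_features(counts, window_s=10):
    """Summarize 1-s counts over non-overlapping windows of window_s seconds."""
    features = []
    for start in range(0, len(counts) - window_s + 1, window_s):
        window = counts[start:start + window_s]
        features.append({
            "mean": mean(window),
            "sd": pstdev(window),
            "max": max(window),
        })
    return features

# Made-up 1-s counts: a 3-s burst of movement inside otherwise quiet behavior,
# followed by 10 s of constant low-level movement.
counts = [0, 0, 0, 90, 100, 80, 0, 0, 0, 0,
          5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
feats = window_features(counts, window_s=10)
for f in feats:
    print(f)
```

In the first window, the 3-s burst peaking at 100 counts is reduced to a window mean of 27, which is the leveling-out effect described above. A 1-s epoch preserves the burst, at the cost of having no within-window variability to use as a feature.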
Since REE was not measured but estimated by prediction equations, as in similar studies (10, 35), it was not included in the prediction of EE, although REE would explain more of the variance of EE than body weight does (6). To counter this, we included the age, body weight, and gender of the participants in the ANNEE and the ASPE to account for the body-size variables that have the largest effect on EE. Freedson et al. (17) argued that an ANN depends strongly on the amount of data used for training. As the number of activities in our study was limited, this might have reduced the capacity of the ANN to estimate the EE accurately. However, the number of activities was also small for the Tree-ASPE and the other regression models, and the comparison between them might therefore be valid. As there were no sport or game activities included, the generalization to such activities is limited. The classification of each second of activity on the basis of 1-s accelerometer counts offers no possibility for selecting features that reflect the variability of movements. The use of raw accelerometer data in the future might therefore lead to more accurate results when the variability is represented in the features.

To our knowledge, this study is the first to compare directly the EE estimation of a Tree-ASPE with that of an ANNEE and with previous prediction equations (10, 17). The Tree-ASPE was more accurate in estimating the EE than all other methods. This first insight into the outcome of these methods revealed that using activity-specific information during EE estimation might be a promising approach. It is essential for future studies to include more activities and to validate such methods in the field, as proposed in recent studies (6, 7).
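The accuracy comparisons in this discussion rest on the bias (mean ± SD) and the RMSE between estimated and measured EE. A minimal sketch of both measures follows, assuming that bias is defined as estimated minus measured EE, so that a positive bias indicates overestimation, and using made-up MET values rather than study data.

```python
from math import sqrt
from statistics import mean, stdev

def bias_and_rmse(estimated, measured):
    """Return (mean bias, SD of bias, RMSE) for paired EE values in MET."""
    errors = [e - m for e, m in zip(estimated, measured)]
    rmse = sqrt(mean([err ** 2 for err in errors]))
    return mean(errors), stdev(errors), rmse

# Made-up paired values for one activity (illustration only).
estimated = [2.0, 3.0, 4.0]
measured = [1.0, 3.0, 5.0]
bias, bias_sd, rmse = bias_and_rmse(estimated, measured)
print(bias, bias_sd, round(rmse, 3))
```

In this toy case, over- and underestimation cancel, so the mean bias is zero while the RMSE is not. This is why a method can show no significant bias for an activity and still have a larger RMSE than another method.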




Conclusion. The aim of this study was to compare the EE estimation of ASPE, applied to the activity types determined by a decision tree, with that of an ANNEE and with measured EE. The study demonstrated the potential of this method compared with an ANNEE, which was less precise in estimating EE for the included activities. Furthermore, ASPE might be preferred, as they provide an easily accessible and detailed interpretation of their EE estimation. An SPE developed within the same dataset and a previously developed SPE estimated the EE less accurately than the Tree-ASPE and should be used carefully for EE estimation. Two previous two-regression models estimated the EE of mainly locomotor activities accurately. Further subdividing the activities might therefore lead to more precise EE estimation rather than using only two regressions. Therefore, the use of activity-type-specific information for subsequent prediction equations might be a promising approach for future studies. Critical aspects for the choice of the appropriate method might also include the generalizability, transparency, and interpretability of the results. Future research should further validate such classification models with subsequent EE estimation in free-living conditions.

ACKNOWLEDGMENTS

We give special thanks to all of the children who participated voluntarily and with motivation.

DISCLOSURES

The authors declare no conflicts of interest.

AUTHOR CONTRIBUTIONS

Author contributions: N.R., G.J., and U.M. conception and design of research; N.R. and F.J. performed experiments; N.R. and F.J. analyzed data; N.R. and F.J. interpreted results of experiments; N.R. and F.J. prepared figures; N.R. and F.J. drafted manuscript; N.R., G.J., K.M., J.H., and U.M. edited and revised manuscript; N.R., F.J., G.J., K.M., J.H., and U.M. approved final version of manuscript.

REFERENCES

1. Bailey RC, Olson J, Pepper SL, Porszasz J, Barstow TJ, Cooper DM. The level and tempo of children’s physical activities: an observational study. Med Sci Sports Exerc 27: 1033–1041, 1995.
2. Baquet G, Stratton G, Van Praagh E, Berthoin S. Improving physical activity assessment in prepubertal children with high-frequency accelerometry monitoring: a methodological issue. Prev Med 44: 143–147, 2007.
3. Bassett DR Jr, Rowlands A, Trost SG. Calibration and validation of wearable monitors. Med Sci Sports Exerc 44: 32–38, 2012.
4. Bonomi AG, Goris AH, Yin B, Westerterp KR. Detection of type, duration, and intensity of physical activity using an accelerometer. Med Sci Sports Exerc 41: 1770–1777, 2009.
5. Bonomi AG, Plasqui G. “Divide and conquer”: assessing energy expenditure following physical activity type classification. J Appl Physiol 112: 932, 2012.
6. Bonomi AG, Plasqui G, Goris AH, Westerterp KR. Estimation of free-living energy expenditure using a novel activity monitor designed to minimize obtrusiveness. Obesity (Silver Spring) 18: 1845–1851, 2010.
7. Bonomi AG, Plasqui G, Goris AH, Westerterp KR. Improving assessment of daily energy expenditure by identifying types of physical activity with a single accelerometer. J Appl Physiol 107: 655–661, 2009.
8. Brandes M, Van Hees VT, Hannover V, Brage S. Estimating energy expenditure from raw accelerometry in three types of locomotion. Med Sci Sports Exerc 44: 2235–2242, 2012.
9. Crouter S. A new 2-regression model for the Actical accelerometer. Br J Sports Med 42: 217–224, 2008.
10. Crouter SE, Horton M, Bassett DR Jr. Use of a two-regression model for estimating energy expenditure in children. Med Sci Sports Exerc 44: 1177–1185, 2012.


11. Curram SP, Mingers J. Neural networks, decision tree induction and discriminant analysis: an empirical comparison. J Opl Res Soc 45: 440–450, 1994.
11a. de Vries SI, Engels M, Galindo Garre F. Identification of children’s activity type with accelerometer-based neural networks. Med Sci Sports Exerc 43: 1994–1999, 2011.
12. Dinesh J, Freedson P. ActiGraph and Actical physical activity monitors: a peek under the hood. Med Sci Sports Exerc 44: S86–S89, 2012.
13. Duda R, Hart PE, Stork DK. Multilayer neural networks. In: Pattern Classification. New York: Wiley-Interscience, John Wiley and Sons, 2001.
14. Ekelund U, Yngve A, Brage S, Westerterp K, Sjöström M. Body movement and physical activity energy expenditure in children and adolescents: how to adjust for differences in body size and age. Am J Clin Nutr 79: 851–856, 2004.
15. Elia M, Livesey G. Energy expenditure and fuel selection in biological systems: the theory and practice of calculations based on indirect calorimetry and tracer methods. World Rev Nutr Diet 70: 68–131, 1992.
16. Freedson P, Lyden K, Kozey Keadle SL, Staudenmayer J. Reply to Bonomi and Plasqui. J Appl Physiol 112: 933, 2012.
17. Freedson P, Pober D, Janz KF. Calibration of accelerometer output for children. Med Sci Sports Exerc 37, Suppl 11: S523–S530, 2005.
18. Freedson PS, Lyden K, Kozey-Keadle S, Staudenmayer J. Evaluation of artificial neural network algorithms for predicting METs and activity type from accelerometer data: validation on an independent sample. J Appl Physiol 111: 1804–1812, 2011.
19. Gyllensten IC, Bonomi AG. Identifying types of physical activity with a single accelerometer: evaluating laboratory-trained algorithms in daily life. IEEE Trans Biomed Eng 58: 2656–2663, 2011.
20. Jimmy G, Seiler R, Mäder U. Development and validation of energy expenditure prediction models based on GT3X accelerometer data in 5- to 9-year-old children. J Phys Act Health 10: 1057–1067, 2013.
22. Pearce DH, Milhorn HT Jr. Dynamic and steady-state respiratory responses to bicycle exercise. J Appl Physiol 42: 959–967, 1977.
23. Puyau MR, Adolph AL, Vohra FA, Zakeri I, Butte NF. Prediction of activity energy expenditure using accelerometers in children. Med Sci Sports Exerc 36: 1625–1631, 2004.
24. Rodriguez G, Beghin L, Michaud L, Moreno LA, Turck D, Gottrand F. Comparison of the TriTrac-R3D accelerometer and a self-report activity diary with heart-rate monitoring for the assessment of energy expenditure in children. Br J Nutr 87: 623–631, 2002.




25. Ruch N, Rumo M, Mäder U. Recognition of activities in children by two uniaxial accelerometers in free-living conditions. Eur J Appl Physiol 111: 1917–1927, 2011.
26. Sazonova N, Browning RC, Sazonov E. Accurate prediction of energy expenditure using a shoe-based activity monitor. Med Sci Sports Exerc 43: 1312–1321, 2011.
27. Schofield WN. Predicting basal metabolic rate, new standards and review of previous work. Hum Nutr Clin Nutr 39, Suppl 1: 5–41, 1985.
28. Staudenmayer J, Pober D, Crouter S, Bassett D, Freedson P. An artificial neural network to estimate physical activity energy expenditure and identify physical activity type from an accelerometer. J Appl Physiol 107: 1300–1307, 2009.
29. Staudenmayer J, Zhu W, Catellier DJ. Statistical consideration in the analysis of accelerometry-based activity monitor data. Med Sci Sports Exerc 44: 61–67, 2012.
30. Tanaka C, Tanaka S, Kawahara J, Midorikawa T. Triaxial accelerometry for assessment of physical activity in young children. Obesity (Silver Spring) 15: 1233–1241, 2007.
31. Treuth MS, Schmitz K, Catellier DJ, McMurray RG, Murray DM, Almeida MJ, Going S, Norman JE, Pate R. Defining accelerometer thresholds for activity intensities in adolescent girls. Med Sci Sports Exerc 36: 1259–1266, 2004.
32. Trost SG, Pate RR, Sallis JF, Freedson PS, Taylor WC, Dowda M, Sirard J. Age and gender differences in objectively measured physical activity in youth. Med Sci Sports Exerc 34: 350–355, 2002.
33. Trost SG, Ward DS, Moorehead SM, Watson PD, Riner W, Burke JR. Validity of the Computer Science and Applications (CSA) activity monitor in children. Med Sci Sports Exerc 30: 629–633, 1998.
34. Trost SG, Way R, Okely AD. Predictive validity of three ActiGraph energy expenditure equations for children. Med Sci Sports Exerc 38: 380–387, 2006.
35. Trost SG, Wong W, Pfeiffer KA, Zheng Y. Artificial neural networks to predict activity type and energy expenditure in youth. Med Sci Sports Exerc 44: 1801–1809, 2012.
37. Vogler AJ, Rice AJ, Gore CJ. Validity and reliability of the Cortex MetaMax3B portable metabolic system. J Sports Sci 28: 733–742, 2010.
38. Whipp BJ, Ward SA, Lamarra N, Davis JA, Wassermann K. Parameters of ventilatory and gas exchange dynamics during exercise. J Appl Physiol 52: 1506–1513, 1982.
39. World Health Organization. Global Recommendations on Physical Activity for Health. Geneva, Switzerland: World Health Organization, 2010.
