This is interesting data, and we can poke around a little to see if any patterns emerge. One of the first things to note is that the mean Fran times reported on these pages are incorrect. The mean Fran time for men is actually 2.54 (not 2.56), and the mean Fran time for women is 4.33 (not 4.36).
One of the first things I did was compute the BMI, since we have the weight and height data. For those who are curious, with weight in pounds and height in inches (hence the factor of 703), the BMI is calculated as:
BMI = (weight * 703)/height^2
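As a quick sketch of that formula in Python: the function name and the sample athlete below are made up for illustration and are not taken from the data, and the weight is assumed to be in pounds and the height in inches.

```python
def bmi(weight_lb: float, height_in: float) -> float:
    """Body mass index from weight in pounds and height in inches."""
    if height_in <= 0:
        raise ValueError("height must be positive")
    return weight_lb * 703 / height_in ** 2


# Hypothetical athlete (not from the data set): 180 lb and 70 in tall.
print(round(bmi(180, 70), 1))  # prints 25.8
```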
So, now we can look at the correlations between all of the factors that could influence a Fran time. The visuals below are called scatterplot matrices (or sploms) because they plot the scatter of multiple pairs of variables at one time. The smooth line through each plot is called a "lowess smoother". It is a locally weighted regression curve: at each point it fits a line to the nearby data only, so it follows the local trend without assuming a straight-line relationship. Here are two sploms, one for men and the other for women.
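For anyone curious about what a lowess smoother does under the hood, here is a minimal sketch in Python. It is a simplified version (one pass, tricube weights, no robustness iterations), not the routine that drew the plots; the function name, the `frac` default, and the sample numbers are all assumptions for illustration.

```python
def lowess_fit(xs, ys, frac=0.67):
    """Simplified lowess: a weighted straight-line fit around each x value."""
    n = len(xs)
    k = max(2, int(round(frac * n)))  # number of neighbours used per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest point.
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12
        # Tricube weights: near points count most, points beyond h count zero.
        ws = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        # Value of the local weighted regression line at x0.
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Made-up heights (inches) and Fran times (minutes), NOT the real data.
heights = [64, 66, 67, 69, 70, 72, 73, 75]
times = [2.6, 2.4, 2.9, 2.5, 2.7, 2.6, 3.0, 2.8]
for h_in, smooth in zip(heights, lowess_fit(heights, times)):
    print(h_in, round(smooth, 2))
```

A smaller `frac` uses fewer neighbours per fit and gives a wigglier line; a larger one gives a smoother line.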


Essentially, what we see here is that the scatter is all over the place. There does not seem to be a strong relationship between any of the variables and Fran. But we should look at the data more formally and compute the Pearson correlation coefficient.
For the men, the correlations are:
BMI and Fran = -.13
Height and Fran = .21
Weight and Fran = .08
Age and Fran = -.02
For the women, the correlations are:
BMI and Fran = .008
Height and Fran = -.07
Weight and Fran = -.04
Age and Fran = -.25
Here, we see that none of these factors bears any real relationship to Fran times. There is a small negative relationship with Age among women. That is, women who are older tend to have faster Fran times. But this relationship is weak, so the data do not support that inference very well. For the men, there is a small positive correlation between Height and Fran, indicating that taller men have slightly slower Fran times. But again, that relationship is very weak, and so we really cannot make the claim that such a relationship truly exists.
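For reference, the Pearson correlation can be computed by hand as in the sketch below. The function name is made up for illustration, and the numbers at the bottom are invented sample values, not the athletes' data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Cross-products and squared deviations from the two means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


# Made-up heights (inches) and Fran times (minutes), NOT the real data.
heights = [66, 68, 70, 72, 74]
fran_times = [2.4, 2.9, 2.5, 2.6, 3.0]
print(round(pearson_r(heights, fran_times), 2))
```

Squaring r gives the proportion of variance the two variables share, so even the largest correlation reported here, -.25, corresponds to only about 6% (0.25^2 = 0.0625).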
Another thing we can do is see whether any of these factors accounts for substantial variability in the Fran times. Here we can use a random effects analysis of variance to look at the variability accounted for by the different factors. The random effects ANOVA is used because these individuals are drawn from a population of crossfitters, and so we can get more appropriate estimates of the variances than if a fixed effects ANOVA were used. Ideally, I should use a mixed linear model because the data are unbalanced. That is, there are some missing values in the data. But the imbalance is slight, and so the estimates will be approximately the same.
Here, I look only at the main effects, and not interactions between the variables. It would be reasonable to examine the interaction between height and weight. But, in principle, the BMI captures this interaction, so we can use the main effect from BMI as a proxy for that interaction.
Looking at the results for the men, we see the following:
Analysis of Variance Table
Response: fran
          Df Sum Sq Mean Sq F value  Pr(>F)
age        1  0.015   0.015  0.0308 0.86123
height     1  1.357   1.357  2.8731 0.09458 .
weight     1  0.362   0.362  0.7677 0.38398
bmi        1  2.064   2.064  4.3709 0.04024 *
Residuals 69 32.580   0.472
This shows there is a significant effect for BMI. But BMI accounts for only about 5.6% of the variability in scores. This is so small that we really cannot make the claim that any of these factors influences Fran performance. The residual, on the other hand, accounts for 90% of the total variability in Fran times. This means that one or more unobserved characteristics of the athletes, not accounted for in this model, have a strong influence on Fran.
Now, I do the same thing for the women.
Analysis of Variance Table
Response: fran
          Df    Sum Sq   Mean Sq   F value  Pr(>F)
age        1    11.910    11.910    4.1800 0.04516 *
height     1     1.012     1.012    0.3551 0.55343
weight     1 1.189e-04 1.189e-04 4.173e-05 0.99487
bmi        1     2.066     2.066    0.7252 0.39772
Residuals 62   176.658     2.849
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Here we see that Age has a significant effect. But, that effect accounts for only about 6% of the variability in Fran times. Again, the residual (or something unobserved) accounts for 92% of the variability in scores.
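One way to arrive at percentages like these is to divide each term's sum of squares by the total sum of squares. The sketch below assumes that reading (it may differ slightly from a variance-components calculation), and the function name is made up for illustration. The numbers are the Sum Sq columns copied from the two ANOVA tables above.

```python
# Sums of squares copied from the two ANOVA tables (men, then women).
men = {"age": 0.015, "height": 1.357, "weight": 0.362,
       "bmi": 2.064, "Residuals": 32.580}
women = {"age": 11.910, "height": 1.012, "weight": 1.189e-04,
         "bmi": 2.066, "Residuals": 176.658}


def share_of_variability(sum_sq):
    """Return each term's sum of squares as a fraction of the total."""
    total = sum(sum_sq.values())
    return {term: ss / total for term, ss in sum_sq.items()}


for label, table in (("men", men), ("women", women)):
    print(label)
    for term, share in share_of_variability(table).items():
        print(f"  {term:<9} {share:6.1%}")
```

On this reading, BMI comes out at a little under 6% for the men and Age at about 6% for the women, with the residual at roughly 90% and 92%.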
Clearly, some athletes are better at Fran than others. The times bear this out. But if we entertain the question, "does a shorter athlete have a better Fran time than a taller athlete?", and so on with the other variables, we can see that the answer, IN THIS SAMPLE, is no.
Keep in mind, these times are from the best among us--those elite folks who are moving on to Aromas. We might find something different if we had data on random crossfitters from around the world. But, we don't.
So, if we assume these athletes are like the rest of us, we then need to assume that there is some unobserved characteristic that accounts for faster Fran times. My guess is that it is heart and guts. Those of you who can power through, not drop the bar, and ignore the pain have the best Fran times. Of course, a nice butterfly kip helps, too.