Sunday, October 25, 2009

Getting In: The Regression Analysis

For a coach, there is nothing more annoying than waiting for your athletes to get in the water.  Most coaches can guesstimate pretty accurately how long it will take each athlete to get in.  I think we can do better than guesstimation.  I think we should run a regression analysis on this one.

When I was in grad school, I took a course on quantitative analysis that introduced me to the concept of regression analysis, and it was love at first sample collection.  Basically, it’s a statistical tool that uses a mathematical equation to estimate how much influence various factors (independent variables) have on a particular outcome (the dependent variable).  You can use it to figure out things like which demographic factors (age, gender, even eye color) best predict someone’s buying behavior.  You collect as many samples as possible that measure the factors and outcomes, then plug the sample data into the regression equation and “run” the equation.  The results tell you how strongly each factor is tied to the outcome and whether that tie is statistically significant.
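For the mathematically curious, the equation in question is usually written along these lines.  This is the standard textbook form of a multiple linear regression, and the symbols are just conventional labels, not anything specific to the course:

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \varepsilon_i
```

Here y_i is the outcome for sample i, each x_ij is one of the measured factors, each beta_j says how much the outcome moves when factor j changes and the others hold still, and epsilon_i is whatever the factors fail to explain.  Assuming the ordinary least squares variety (the most common one), “running” the equation means picking the betas that make the sum of the squared leftovers as small as possible.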

I took to using regression analyses for more useful things like predicting when a certain classmate was going to wear too much perfume (Thursdays, cloudy weather).  I really impressed my classmates, though, when I used a regression analysis to break up with a boyfriend. 

I told him I could build a regression analysis that would predict the next time he’d behave like a total farking icehole.  Just the threat of running a regression was enough to finish off the relationship (which was the intended outcome), but a rough run of the numbers did find that proximity to an exam period had the lowest p-value (i.e., it was the most statistically significant factor).

Life with Mr. Coach has not yielded as many opportunities for constructing regression analyses, mostly because he’s something of an open book when it comes to his behavior.  That’s nice for the health of our marriage, but a little boring for my Inner Statistician. 

However, over the last year I’ve realized I have a prime opportunity to create a regression analysis with “getting in the water” behavior.

We have access to a wide range of swimmers in our life -- college, high-school, masters, age-group -- and they come loaded with juicy demographic information like gender, age, time zone of birthplace, birth order, whether they’re more of a linear thinker (math/science/business) or an abstract thinker (arts/humanities), whether they’re romantically involved with anyone also in the vicinity of the pool, what events/distances they swim, and their grade point average.

The idea is to see which factors have the strongest link to the amount of time it takes for a swimmer to get in the water (as measured from the moment at which the swimmer appears within eyesight of a coach already on the pool deck).
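Here is a minimal sketch of what that could look like in Python, assuming an ordinary least squares fit with NumPy.  Every number in it is invented purely for illustration (no swimmers were timed), and the three factors, the variable names and the `fit_ols` helper are hypothetical choices, not a finished study design:

```python
import numpy as np

# Hypothetical sample: every number below is invented for illustration only.
# Columns: age (years), is_sprinter (1 = yes, 0 = no), poolside_romances (count)
factors = np.array([
    [10, 0, 0],
    [14, 1, 0],
    [17, 0, 1],
    [20, 1, 2],
    [20, 1, 3],
    [34, 0, 0],
    [45, 0, 1],
    [56, 0, 0],
], dtype=float)

# Outcome: seconds from "swimmer spotted on deck" to "swimmer in the water".
seconds_to_get_in = np.array([35, 120, 150, 400, 480, 90, 110, 40], dtype=float)


def fit_ols(x, y):
    """Ordinary least squares fit; returns [intercept, coef_1, ..., coef_k]."""
    # Prepend a column of ones so the model can learn an intercept.
    design = np.column_stack([np.ones(len(x)), x])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs


coefs = fit_ols(factors, seconds_to_get_in)
names = ["intercept", "age", "is_sprinter", "poolside_romances"]
for name, value in zip(names, coefs):
    print(f"{name:>18}: {value:8.2f}")
```

Each printed coefficient is the estimated change in seconds for a one-unit change in that factor, holding the others fixed.  This sketch stops at the coefficients; getting p-values takes a bit more machinery, which a statistics package such as statsmodels will report for you.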

Based on experiential evidence (because I have been fine-tuning this during the last year), I’m going to hypothesize that the factor profile of the swimmer who takes the least amount of time to get in the water is going to be either a 10-year-old female oldest-child IMer who gets straight As in school or a 56-year-old male science professor who drives a fuel-efficient sub-compact.

Conversely, I predict that the athlete who takes the longest to get in will be a 20-year-old male middle child/linear thinker/sprinter who has been romantically involved with two or more people also in the pool vicinity.

Let the sample gathering begin!

1 comment:

  1. Oh my!!!

    I do not know where to begin...

    You know why statisticians always travel in pairs, don't you?

    So they don't get beat up!


