Statistical Modeling

In a past life, one that he hopes some day to get back to, Dr. Charles did some computer simulations of problems in statistics. More specifically, these were problems in measurement theory, where researchers worry about problems in measurement and how those problems affect the relationships between measures and the types of inferences you can draw about measures.

Correction for Attenuation Due to Measurement Error

Modeling of Perception / Animal Behavior

Rat Pup Huddling

Correction for Attenuation Due to Measurement Error

Historically speaking, statistics served three purposes: growing better crops, brewing better Guinness, and finding differences between people. The last purpose can lead into surprisingly interesting conversations about positive eugenics... but that's another story. Scientists interested in studying mental traits became very sensitive to the difficulty of accurate measurement, and came up with several types of theories to explain what is happening in measurement situations. These models apply to any type of measurement, from using a ruler, to a chemical assay, to an intelligence test; psychologists just happened to be some of the first modern scientists who were concerned with the deeper problems, and out of this concern grew the still-thriving field of psychometrics. The first measurement model that psychologists came up with is called "True-Score Theory". The central thesis of this theory is that any observed measurement (X) consists of the true value of the thing-to-be-measured (T), plus some amount of measurement error (E): X = T + E.

This is nothing too revolutionary, except for the implication that we are always interested in T, but we can only measure X. What we do with true-score theory is to substitute T+E into all the traditional statistical equations. This leads to some nasty math, but if we assume that the error is random, a lot of it simplifies quite nicely. One of the earliest findings in this area was that if you substitute T+E for X in the formula for a correlation, you find that observed correlations should be systematically smaller than true-score correlations, on average. The more measurement error, the more the observed data underestimate the true correlation - the observed correlation is attenuated. If you know how much error there was in your measurement, then you can correct for the underestimation - you can correct for the attenuation due to measurement error. This is a hundred-year-old finding.
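To make the attenuation concrete, here is a minimal simulation sketch in Python. It is an illustration only, not code from the published work: the function names (`pearson`, `simulate_attenuation`), the sample size, and the reliability values are all made up for the example. It assumes normally distributed true scores and random errors, treats reliability as the share of observed variance that comes from true scores, and applies the classic correction of dividing the observed correlation by the square root of the product of the two reliabilities.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_attenuation(n=20000, true_r=0.6, rel_x=0.7, rel_y=0.8, seed=1):
    """Return (true-score r, observed r, corrected r) for one simulated sample.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # True scores T: unit variance, correlated at true_r.
    tx = [rng.gauss(0, 1) for _ in range(n)]
    ty = [true_r * t + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1) for t in tx]
    # Random error E, sized so that var(T) / var(X) equals the reliability.
    ex_sd = math.sqrt(1 / rel_x - 1)
    ey_sd = math.sqrt(1 / rel_y - 1)
    # Observed scores: X = T + E.
    x = [t + rng.gauss(0, ex_sd) for t in tx]
    y = [t + rng.gauss(0, ey_sd) for t in ty]
    observed = pearson(x, y)
    # Correction for attenuation, using the (here known) reliabilities.
    corrected = observed / math.sqrt(rel_x * rel_y)
    return pearson(tx, ty), observed, corrected


if __name__ == "__main__":
    true_score_r, observed_r, corrected_r = simulate_attenuation()
    print(f"true-score r = {true_score_r:.3f}")
    print(f"observed r   = {observed_r:.3f}")
    print(f"corrected r  = {corrected_r:.3f}")
```

With these settings the observed correlation should come out noticeably below the true-score correlation, and the corrected value should land close to it. In real data the reliabilities are themselves estimates, which is part of what makes the uncertainty of the corrected value hard to pin down.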

One limitation to using this correction, in the context of modern hypothesis-testing based psychology, was an inability to draw confidence intervals around the corrected correlations. Several possible means of calculation had been offered over the years, but frankly the math is crazy difficult, and so no one had fully derived the needed formula. This seemed like it would be a good warm-up for working on agent-based modeling (see Rat Pup Huddling, below), so I decided to try to tackle the problem with raw computing power. Through a series of large-scale computer simulations, I created and tested a formula for calculating the size of the confidence intervals around a corrected correlation.
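The brute-force idea can be sketched in a few lines of Python. To be clear, this is not the formula or the simulation design from the paper below; it only shows how repeated simulation reveals the spread of corrected correlations around a known true value. Everything specific here is an assumption for illustration: the function names, the sample size, the reliabilities, and the choice to estimate each reliability as the correlation between two parallel forms of the same measure.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def one_corrected_r(rng, n, true_r, rel_x, rel_y):
    """Simulate one study and return its corrected correlation (or None)."""
    ex_sd = math.sqrt(1 / rel_x - 1)
    ey_sd = math.sqrt(1 / rel_y - 1)
    tx = [rng.gauss(0, 1) for _ in range(n)]
    ty = [true_r * t + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1) for t in tx]
    # Two parallel forms per trait: same true score, independent errors.
    x1 = [t + rng.gauss(0, ex_sd) for t in tx]
    x2 = [t + rng.gauss(0, ex_sd) for t in tx]
    y1 = [t + rng.gauss(0, ey_sd) for t in ty]
    y2 = [t + rng.gauss(0, ey_sd) for t in ty]
    # Sample reliabilities, estimated as parallel-forms correlations.
    rxx, ryy = pearson(x1, x2), pearson(y1, y2)
    if rxx <= 0 or ryy <= 0:
        return None  # correction undefined for this sample
    return pearson(x1, y1) / math.sqrt(rxx * ryy)


def empirical_interval(reps=1000, n=100, true_r=0.5, rel_x=0.8, rel_y=0.8, seed=7):
    """Central 95% range of corrected correlations across simulated studies."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        r = one_corrected_r(rng, n, true_r, rel_x, rel_y)
        if r is not None:
            values.append(r)
    values.sort()
    lower = values[int(0.025 * len(values))]
    upper = values[int(0.975 * len(values)) - 1]
    return lower, upper


if __name__ == "__main__":
    low, high = empirical_interval()
    print(f"95% of corrected correlations fell between {low:.3f} and {high:.3f}")
```

Running this across many combinations of sample size, true correlation, and reliability is the kind of grid against which a candidate interval formula could be checked.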

Charles, E. P. (2005). The Correction for Attenuation Due to Measurement Error: Clarifying Concepts and Creating Confidence Sets. Psychological Methods, 10, 206-226.

Modeling of Animal Behavior

I have performed computer simulations to explore the evolution of altruistic behavior (with the first such study published here). I also have plans to do some simulations demonstrating how an understanding of measurement error can lead to more sophisticated models in Animal Behavior. This work ties in with my interests in ecological psychology and perception, because the process of perceiving can be thought of as a process of measurement. That might or might not be a good way to think about perception, but it is a pretty common way to do so. Despite how common such thinking is, few seem to realize that it leads one straight into the full range of theoretical difficulties of measurement.

More coming soon.

Rat Pup Huddling

In a past life I was going to be an agent-based modeler, working with Jeff Schank at UC Davis, who spent many years modeling rat pup huddling. My main interest in the work was that it showed how a group of organisms could perform very complex behavior, even when no individual organism knew what it was doing, or had access to sufficient information to coordinate what it was doing. This is a special case of the phenomenon where groups of simple systems can produce complex, intelligent actions.

What were we modeling?
Rat pups cannot thermo-regulate successfully on their own. Each rat pup can raise its temperature a little bit, but they cannot maintain temperature for long periods of time if the environment is even a little bit chilly (like it is in any underground nest-cave). Groups of rat pups, huddled together, can keep themselves warm enough, and can even overheat. Thus a nest of rat pups maintains the temperature needed to stay alive by continuously forming different sized huddles. The individual pups can feel when they are touching things, they can swivel their torsos back and forth, and they can push forward with their legs. Oh, and they are sensitive to their own internal temperature. But that is about it... the pups are blind, and they also lack any knowledge of how warm the group is as a whole, or how many pups are in any given huddle - and they certainly don't 'know' that they are trying to form ideal-sized dynamic group distributions - and even if they did, there is nothing they can do to coordinate the movements of other pups. Instead, each pup must make a simple decision: Stay where I am, or move.

For a little more context: Schank built a special device while post-docing with Jeff Alberts at Indiana University, which kept the pups on a flat surface of uniform temperature, and set about observing what the pups did. Many things were measured, including the direction each pup was oriented and whether it was moving (change between frames of video), but of particular interest was the distribution of huddles - e.g., out of ten pups, at this moment, there are three groups of 3, 3, and 4 pups. (Incidentally, this also leads into cool investigations of how you measure synchrony, which is not at all intuitive, and to proof that menstrual synchrony does not exist, such as the human data presented here, but that's a different story.)

Schank then built probabilistic models of the simple stay or move decision rule, deployed the models in digital rat pups in a digital apparatus, and tuned them through Darwinian algorithms. He knew what the distribution of huddles looked like in real rat-pup huddles, and he ran tens of thousands of simulations with different values of the model variables to see how well they matched the desired pattern. Those values that produced better matches were allowed to 'reproduce' with minor variation, and tens of thousands of generations later you end up with a set of variables (or a few sets of variables) that produce very good matches.
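The general recipe (a probabilistic stay-or-move rule, tuned by selection against an observed huddle pattern) can be sketched with a toy example in Python. This is emphatically not Schank's model: the one-dimensional ring of cells, the two parameters (`p_alone`, `p_contact`), the use of the mean number of huddles as the only summary statistic, and all the numeric settings are assumptions invented to keep the sketch short.

```python
import random

CELLS, PUPS, STEPS = 30, 8, 60  # toy arena: a ring of cells, one pup per cell


def count_huddles(occupied):
    """Number of contiguous runs of occupied cells on the ring."""
    # A run starts at every occupied cell whose left neighbor is empty.
    return sum(1 for c in occupied if (c - 1) % CELLS not in occupied)


def simulate(p_alone, p_contact, rng):
    """Run the toy model once; return the mean number of huddles per step."""
    occupied = set(rng.sample(range(CELLS), PUPS))
    total = 0
    for _ in range(STEPS):
        for cell in sorted(occupied):
            touching = ((cell - 1) % CELLS in occupied
                        or (cell + 1) % CELLS in occupied)
            # The stay-or-move rule: move probability depends only on contact.
            p_move = p_contact if touching else p_alone
            if rng.random() < p_move:
                target = (cell + rng.choice((-1, 1))) % CELLS
                if target not in occupied:
                    occupied.remove(cell)
                    occupied.add(target)
        total += count_huddles(occupied)
    return total / STEPS


def fitness(params, target, rng, runs=5):
    """Negative distance between simulated and target mean huddle count."""
    mean = sum(simulate(params[0], params[1], rng) for _ in range(runs)) / runs
    return -abs(mean - target)


def evolve(target, generations=15, pop_size=12, seed=0):
    """Keep the better half each generation; refill with mutated copies."""
    rng = random.Random(seed)
    population = [(rng.random(), rng.random()) for _ in range(pop_size)]
    best = population[0]
    for _ in range(generations):
        scored = sorted(population, key=lambda p: fitness(p, target, rng),
                        reverse=True)
        survivors = scored[: pop_size // 2]
        best = survivors[0]
        # 'Reproduce' with minor variation, clamped to valid probabilities.
        children = [tuple(min(1.0, max(0.0, v + rng.gauss(0, 0.05))) for v in p)
                    for p in survivors]
        population = survivors + children
    return best


if __name__ == "__main__":
    p_alone, p_contact = evolve(target=4.0)
    print(f"best parameters: p_alone={p_alone:.2f}, p_contact={p_contact:.2f}")
```

The target value of 4.0 huddles is a stand-in for an observed pattern. The real work matched whole distributions of huddle sizes and used far richer models, but the loop has the same shape: simulate, score against the data, keep the better parameter sets, mutate, repeat.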

The equations are pretty nasty. For example, the probability of a 10-day-old pup being active is predicted by the following equations (taken from here):

What were the findings?
The most basic finding is in terms of which potential variables are significant in the final model. Schank looked at 7-day-old and 10-day-old rat pups. There is significant developmental change in those three days, and the behavior of the rat pups reflects that. The models show that the behavior of the 7-day-old pups is not coupled with the behavior of other members of the group. The behavior of 10-day-old pups is coupled: it is a function of the activity of the pups they are in contact with. The equations change tremendously, meaning that the underlying dynamics that maintain system stability are changing dramatically. This is amazing, not because there are dynamics in development, but because the system is so dynamic without failure. Recall that if the system malfunctions for even a small amount of time, the whole group of pups is dead. (Not in the lab, the lab pups are safe, but in the wild the whole group would be dead.) If the individual pups fail to join and leave huddles at the right time, all pups relying on the huddle for heat will be out of luck. This is an excellent example of a situation in which natural selection must favor developmental systems that maintain functionality.

This is good, basic-science work towards better understanding behavioral development. It not only tells us about the dynamics of individual behaviors, but also about the dynamics of group coordination. It demonstrates that complex, intelligent, adaptive behaviors are possible within individuals and within groups with no 'knowledge' of what is being done, and it investigates a mechanism that could account for such adaptive behavior (probabilistic behavior deployment). This has implications for how we might envision the role of multiple body parts (e.g., multiple brain regions) in guiding behavior. 

While I never went too far with this work, it led me to the skill-set needed for the other simulation work described above.