Statistical Modeling
In a past life, which he hopes some day to get back to,
Dr. Charles did some computer simulations of problems
in statistics. More specifically, problems in
measurement theory, where researchers worry about
problems in measurement and how those problems affect
the relationships between measures and the types of
inferences you can draw about measures.
Correction for Attenuation Due to Measurement Error
Modeling of Perception / Animal Behavior
Rat Pup Huddling
Historically speaking, statistics served three
purposes: growing better
crops, brewing better
Guinness, and finding differences between people.
The last factor can lead into surprisingly interesting
conversations about positive
eugenics... but that's another story. Scientists
interested in studying mental traits became very
sensitive to the difficulty of accurate measurement,
and came up with several types of theories to explain
what is happening in measurement situations. These
models apply to any type of measurement, from using a
ruler, to a chemical assay, to an intelligence test;
psychologists just happened to be some of the first
modern scientists who were concerned with the deeper
problems, and out of this concern grew the
still-thriving field of psychometrics. The first
measurement model that psychologists came up with is
called "True-Score Theory". The central thesis of this
theory is that any observed measurement (X) consists of
the true value of the thing-to-be-measured (T), plus
some amount of measurement error (E): X = T + E.
This is nothing too revolutionary, except for the
implication that we are always interested in T, but we
can only measure X. What we do with true-score theory
is to substitute T + E into all the traditional
statistical equations. This leads to some nasty math,
but if we assume that the error is random, a lot of it
simplifies quite nicely. One of the earliest findings
in this area was that if you substitute T + E for X in
the formula for a correlation, you find that observed
correlations should be systematically smaller than
true-score correlations, on average. The more
measurement error, the more the observed data
underestimate the true correlation: the observed
correlation is attenuated. If you know how much
error there was in your measurement, then you
can correct for the underestimation; you can correct
for the attenuation due to measurement error. This is
a hundred-year-old finding.
One limitation to using this correction, in the
context of modern hypothesis-testing-based psychology,
was an inability to draw confidence intervals around
the corrected correlations. Several possible means of
calculation had been offered over the years, but
frankly the math is crazy difficult, and so no one had
fully derived the needed formula. This seemed like it
would be a good warm-up for working on agent-based
modeling (see Rat Pup Huddling, below), so I decided
to try to tackle the problem with raw computing power.
Through a series of large-scale computer simulations,
I created and tested a formula for calculating the
size of the confidence intervals around a corrected
correlation.
Charles, E. P. (2005). The correction for attenuation
due to measurement error: Clarifying concepts and
creating confidence sets. Psychological Methods, 10,
206-226.
I have performed computer simulations to explore the
evolution of altruistic behavior (with the first such
study published here).
I also have plans to do some simulations demonstrating
how an understanding of measurement error can lead
to more sophisticated models in Animal Behavior. This
work ties in with my interests in ecological
psychology and perception, because the process of
perceiving can be thought of as a process of
measurement. That might or might not be a good way to
think about perception, but it is a pretty common way
to do so. Despite how common such thinking is, few
seem to realize that it leads one straight into the
full range of theoretical difficulties of measurement.
More coming soon.
In a past life I was going to be an agent-based
modeler, working with Jeff
Schank at UC Davis, who spent many years
modeling rat pup huddling. My main interest in the
work was that it showed how a group of organisms could
perform very complex behavior, even when no individual
organism knew what it was doing, or had access to
sufficient information to coordinate what it was
doing. This is a special case of the phenomenon where
groups of simple systems can produce complex,
intelligent actions.
What were we modeling?
Rat pups cannot thermoregulate successfully on their
own. Each rat pup can raise its temperature a little
bit, but they cannot maintain temperature for long
periods of time if the environment is even a little
bit chilly (like it is in any underground nest cave).
Groups of rat pups, huddled together, can keep
themselves warm enough, and can even overheat. Thus a
nest of rat pups maintains the temperature needed to
stay alive by continuously forming different sized
huddles. The individual pups can feel when they are
touching things, they can swivel their torsos back and
forth, and they can push forward with their legs, oh,
and they are sensitive to their own internal
temperature. But that is about it... the pups
are blind, and they also lack any knowledge of how
warm the group is as a whole, or how many pups are in
any given huddle, and they certainly don't 'know'
that they are trying to form ideal-sized dynamic group
distributions; and even if they did, there is nothing
they can do to coordinate the movements of other pups.
Instead, each pup must make a simple decision: Stay
where I am, or move.
For a little more context: while postdocing with Jeff
Alberts at Indiana University, Schank built a special
device that kept the pups on a flat surface of uniform
temperature, and set about observing what the pups
did. Many things were
measured, including the direction each pup was oriented and
whether it was moving (change between frames of
videos), but of particular interest was the
distribution of huddles: e.g., out of ten pups, at
this moment, there are three groups of 3, 3, and 4
pups. (Incidentally, this also leads into a cool
investigation of how
you measure synchrony, which is not at all
intuitive, and to a proof that menstrual
synchrony does not exist, such as the human
data presented here,
but that's a different story.)
Schank then built probabilistic models of the simple
stay or move decision rule, deployed the models in
digital rat pups in a digital apparatus, and tuned
them through Darwinian
algorithms. He knew what the distribution of
huddles looked like in real rat-pup huddles, and he
ran tens of thousands of simulations with different
values of the model variables to see how well they
matched the desired pattern. Those values that
produced better matches were allowed to 'reproduce'
with minor variation, and tens of thousands of
generations later you end up with a set of variables
(or a few sets of variables) that produce very good
matches.
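The flavor of that model-plus-tuning loop can be conveyed with a toy sketch. This is entirely my own construction, not Schank's model: the one-dimensional arena, the single stay-probability parameter, the target huddle size, and the fitness rule are all invented for illustration.

```python
import random

# Toy stay-or-move agents plus a crude evolutionary tuning loop.
# Everything here is a made-up stand-in for the real model.

random.seed(0)
N_PUPS, STEPS, ARENA = 10, 200, 12

def run(p_stay_per_contact):
    """One simulation; returns the size-weighted mean huddle size."""
    pos = [random.randrange(ARENA) for _ in range(N_PUPS)]  # 1-D arena
    for _ in range(STEPS):
        for i in range(N_PUPS):
            contacts = sum(1 for j in range(N_PUPS)
                           if j != i and abs(pos[i] - pos[j]) <= 1)
            # More contacts means warmer, so more likely to stay put.
            if random.random() > min(1.0, p_stay_per_contact * contacts):
                pos[i] = (pos[i] + random.choice([-1, 1])) % ARENA
    # Group pups into huddles: within 1 unit of each other, transitively.
    sizes, seen = [], set()
    for i in range(N_PUPS):
        if i in seen:
            continue
        group, stack = set(), [i]
        while stack:
            k = stack.pop()
            if k in group:
                continue
            group.add(k)
            stack.extend(j for j in range(N_PUPS) if j not in group
                         and abs(pos[k] - pos[j]) <= 1)
        seen |= group
        sizes.append(len(group))
    return sum(s * s for s in sizes) / N_PUPS  # size-weighted mean

TARGET = 4.0  # pretend real pups average huddles of about this size

# Crude 'Darwinian' tuning: keep mutating the best parameter so far,
# keeping mutants whose huddle distribution better matches the target.
best_p, best_miss = 0.5, abs(run(0.5) - TARGET)
for _ in range(30):
    cand = min(1.0, max(0.0, best_p + random.gauss(0, 0.1)))
    miss = abs(run(cand) - TARGET)
    if miss < best_miss:
        best_p, best_miss = cand, miss
print(f"tuned p_stay_per_contact = {best_p:.2f} (miss = {best_miss:.2f})")
```

The key point the sketch preserves is that no agent has any global information: each pup sees only its own contacts, yet the parameter search selects rules whose group-level huddle pattern matches the target.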
The equations are pretty nasty; for example, the
probability of a 10-day-old pup being active is
predicted by the equations taken from here.
What were the findings?
The most basic finding is in terms of which potential
variables are significant in the final model. Schank
looked at 7-day-old and 10-day-old rat pups. There is
significant developmental change in those three days,
and the behavior of the rat pups reflects that. The
models show that the behavior of the 7-day-old pups is
not coupled with the behavior of other members of the
group. The behavior of 10-day-old pups is coupled;
it is a function of the activity of those pups it is in
contact with. The equations change tremendously,
meaning that the underlying dynamics that maintain
system stability are changing dramatically. This is
amazing, not because there are dynamics in
development, but because the system is so dynamic
without failure. Recall that if the system
malfunctions for even a small amount of time, the
whole group of pups is dead. (Not in the lab, the lab
pups are safe, but in the wild the whole group would
be dead.) If the individual pups fail to join and
leave huddles at the right time, all pups relying on
the huddle for heat will be out of luck. This is an
excellent example of a situation in which natural
selection must favor developmental systems that
maintain functionality.
This is good, basicscience work towards better
understanding behavioral development. It not only
tells us about the dynamics of individual behaviors,
but also about the dynamics of group coordination. It
demonstrates that complex, intelligent, adaptive
behaviors are possible within individuals and within
groups with no 'knowledge'
of what is being done, and it investigates a mechanism
that could account for such adaptive behavior
(probabilistic behavior deployment). This has
implications for how we might envision the role of
multiple body parts (e.g., multiple brain regions) in
guiding behavior.
While I never went too far with this work, it led me
to the skill set needed for the other simulation work
described above.
