The Dichotomy Enigma
What this series of short postings will say about information loss when tests are scored right/wrong is already well known in educational and testing circles. Even the possible solutions being proposed are not entirely unique. It is the hope of Better Schooling Systems [1], a non-profit service corporation, that the way these ideas are combined will be seen as a workable solution to a long-standing dilemma in education.
The Normal distribution is the limiting form of the Binomial expansion (a + b)^n as n → ∞.
This observation means that the distribution can be applied validly to any data, including test data, only if those data form a true dichotomy.
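The relationship between the two distributions can be checked numerically. The following is a minimal sketch, not part of the original research: it assumes each test item is an independent two-outcome trial with the same success probability p (exactly the dichotomy assumption under discussion), and the function names are ours. It compares the Binomial probabilities with the Normal curve that has the same mean and standard deviation.

```python
import math

def binomial_pmf(n, k, p=0.5):
    """Probability of exactly k successes in n independent two-outcome trials.

    Computed through logarithms so that large n does not overflow.
    """
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def normal_pdf(x, mean, sd):
    """Density of the Normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def max_gap(n, p=0.5):
    """Largest absolute difference between the Binomial probabilities
    and the matching Normal curve, over all possible scores 0..n."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return max(abs(binomial_pmf(n, k, p) - normal_pdf(k, mean, sd))
               for k in range(n + 1))

if __name__ == "__main__":
    # Hypothetical test lengths; the gap shrinks as n grows.
    for n in (20, 100, 500):
        print(n, max_gap(n))
```

The approximation only describes test scores if every item really has two outcomes of a single kind each. If "right" or "wrong" hides several distinct kinds of answer, the two-outcome assumption built into `binomial_pmf` no longer holds.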
We normally score students’ answers on tests to be either right or wrong. Is it a legitimate question to ask whether this scoring provides a true dichotomy? If answers do not form a dichotomy, the use of this mathematical model to interpret test results may be questionable.
Here is an example of the principle behind “Right-Wrong” scoring. In this coin toss representing 20 items, we have 7 “right” answers (heads). These are put into the student’s “Piggy Bank” as his “earnings” on the test, and the rest disappear into “hyperspace.”
The initial problem our research has been addressing can be summarized as follows:
1. Substituting Memorization for Understanding.
It is immediately clear to computing professionals that computers understand nothing. The intelligence is not in the machine; it is in the program. Can we say the same thing about our students?
Consider this simple study. In the research program that led to establishing Better Schooling Systems, it seemed that we should be able to determine the reasoning behind “right” answers more easily than the reasoning behind “wrong” ones. Harley Miki [2] and I conducted an experiment to this end. Our thought was that there are two probable sources for answers that meet expectations. One is simple recall without understanding. The other is understanding. A highly respected teacher of mine once claimed that if you understood, you didn’t need to remember.
Harley was a mathematics teacher in a local middle school. He undertook to pre-test and post-test a unit in math. We then interviewed a group of volunteer students. We scored the tests of these students in the conventional way. We scored the interviews, which were derived from the test questions, as “understood the concept” or “did not understand the concept.” We then cross-tabulated the results in a 2 by 2 discrepancy table for both the pre-test and the post-test. The Phi statistic provides a simple test of how closely the two classifications in the four cells of such a table are related.
What were the results? For both tests, understanding the concept and giving right answers were statistically unrelated to each other. Scores increased substantially from pre- to post-testing in both dimensions, and the understanding score increased more than the right-answers score. However, the two remained unrelated [3]. Perhaps, once understanding has been consolidated, these two will come together; perhaps not.
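The cross-tabulation described above can be sketched in a few lines. This is only an illustration of how the Phi statistic is computed for a 2 by 2 table; the cell counts below are invented for the example and are not the data from the study, and the function name is ours.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2 by 2 table laid out as:

                        understood   did not understand
        right answer        a                b
        wrong answer        c                d

    Returns a value between -1 and 1; 0 means the two
    classifications are unrelated.
    """
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("every row and column needs at least one case")
    return (a * d - b * c) / denominator

if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    unrelated = phi_coefficient(10, 10, 10, 10)  # right answers tell us nothing
    related = phi_coefficient(18, 2, 3, 17)      # right answers track understanding
    print(unrelated, related)
```

A Phi near zero, as in the first hypothetical table, is the pattern the study reports: knowing that an answer was right tells us nothing about whether the concept was understood.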

Does this finding mean that we cannot assume students understand merely because their answers are the ones we wish to hear, read or see?
As a way to resolve this problem in science, a number of researchers have developed concept inventories [4]. Does this observation apply to subjects other than mathematics and science? The solution we are proposing may apply to other content areas and may be applicable between disciplines as well.
Did the student who obtained seven heads in the coin toss above get a low score (heads being the “Right” answer), or is this conclusion meaningless in the light of what we just said?
2. A Dicey Alternative
The array of dice shown here is a better way to envision four-option multiple-choice tests. We have just seen that right answers can come from memorization or from understanding. This means that these answers have at least two possible meanings. The Ru dice (right answers from understanding) are arranged on the left. The Rm dice (right answers from memory) are arranged on the right because we are presuming that giving an answer from memory, without understanding, requires less mental effort than any thoughtful attempt to solve the problem. The wrong answers (W1, W2, W3) are arranged in descending order of selection maturity (W3 on the left) to illustrate the inclusion of these answers in the answering model. At the extreme left is W3+, which represents the few answers that go beyond the intention of the question because the student is profoundly informed on that item, leading to a valid “wrong” answer. This student also obtained a right-wrong score of seven out of twenty, if we count the W3+ answer as acceptable.
The dice-throwing model describes the answering of a four-option multiple-choice item better than the coin-toss model does. It is not quite this simple, however. Learning changes the probabilities of which facet of the dice lands upward. That is, answering is not random. Instead, these dice are “loaded,” and the loading changes with learning. Increased understanding increases the probability of higher-order errors. The reverse pattern also occurs.
The illustration shown here with these dice would be of a somewhat above-average 14-year-old, if the most common age for choosing “Right” answers is sixteen. Since the numbers on the Ru facets are item numbers, this outcome would be the result of a twenty-item multiple-choice test. Compare this interpretation with the “seven right answers” coin toss and decide for yourself which option provides more information to teachers, their students and others who need to know.
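The loaded-dice idea can be sketched as a small simulation. Everything numerical here is an assumption made for illustration: the two sets of loadings (EARLY and LATER) are hypothetical and are not estimates from BSS research, and the function names are ours. The sketch shows only the structural point, that a right-wrong score keeps one number while the answer profile keeps the category of every answer.

```python
import random
from collections import Counter

# Hypothetical loadings (probability of each facet landing upward).
# They are invented for illustration and each set sums to 1.
EARLY = {"W1": 0.40, "W2": 0.25, "W3": 0.10, "W3+": 0.00, "Ru": 0.05, "Rm": 0.20}
LATER = {"W1": 0.10, "W2": 0.15, "W3": 0.30, "W3+": 0.05, "Ru": 0.30, "Rm": 0.10}

def answer_test(loading, n_items=20, seed=0):
    """Throw one loaded die per item and count the facets that come up."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    faces = list(loading)
    weights = [loading[face] for face in faces]
    return Counter(rng.choices(faces, weights=weights, k=n_items))

def right_wrong_score(profile, accept_w3_plus=True):
    """Collapse a profile to the conventional score.

    Ru and Rm are merged, and every wrong-answer category is discarded.
    """
    score = profile["Ru"] + profile["Rm"]
    if accept_w3_plus:
        score += profile["W3+"]
    return score

if __name__ == "__main__":
    for label, loading in (("early", EARLY), ("later", LATER)):
        profile = answer_test(loading, seed=1)
        print(label, dict(profile), "score:", right_wrong_score(profile))
```

Two very different students can receive the same conventional score. For example, `Counter({"Ru": 7, "W3": 13})` and `Counter({"Rm": 7, "W1": 13})` both collapse to seven out of twenty, although the first profile suggests understanding with mature errors and the second suggests recall with immature ones.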
3. A Flashy Conclusion
There are two kinds of facts: “Facts of opinion” and “Facts of observation.” The progress of science has been the overthrow of opinions when observations proved them wrong. The research done by BSS has shown conclusively that the opinion that multiple-choice tests (and many other tests) should be scored “Right-Wrong” has been overthrown by careful analyses of the multiple meanings embedded in the “Right” answer category and, even more conclusively, by the discovery by BSS that there is a strong developmental influence upon which “Wrong” answers are chosen. The facts of observation speak for themselves. There are many examples of these observations throughout this Website.
1. Better Schooling Systems, P.O. Box 12833, Pittsburgh, PA 15241
2. Powell, J. C. and Miki, H. (1985). Answer anomalies, how serious? Paper presented to the Psychometric Society, Nashville, TN.
3. Does this observation mean that right answers may lack the homogeneity needed to be one pole in a true dichotomy? Does this explain why NCLB seems to be less successful than hoped? (See: Ho, A. D. (2007). Discrepancies Between Score Trends from NAEP and State Tests: A Scale-Invariant Perspective. Educational Measurement: Issues and Practice, 26(4), 11–20.)
4. Halloun, I. and Hestenes, D. L. (1985). Common sense concepts about motion. American Journal of Physics, 53(11), 1056–1065. These researchers found that their students had serious misconceptions about the Laws of Motion as these applied in everyday life. Much good work has been done since then within particular science contexts. Our position is that this problem may not be confined to scientific concepts.