Perspectives 6: Uniform surjective grading?

June 1, 2008

I’m somewhat disconcerted by this recent article that appeared in USA Today. I wouldn’t have noticed it, except for the fact that it was critiqued to some extent by Mark Chu-Carroll and various commenters over at Good Math, Bad Math. I might be somewhat happier if I hadn’t noticed it.

The proposal to impose a 50 point minimum grade for an assignment (on a 100 point scale) is problematic for a number of reasons. Some of these are practical, and can be seen clearly by considering some analogous scenarios. I’m fairly certain that it would be in some people’s interest to impose similar artificial bounds in other situations, but such a proposition would be a complete non-starter. Declaring an NHL player’s plus-minus rating to have a minimum value of 4 would clearly invalidate the descriptive value of the statistic. Declaring that employees can be given no worse than a “satisfactory” review would have interesting repercussions in the corporate environment, since it would effectively eliminate the ability to fire someone for non-performance of their job, or even incompetence.

Apparently, however, mathematical competence is not required in this arena. According to the article, the argument in favor of this artificial minimum is this:

Other letter grades — A, B, C and D — are broken down in increments of 10 from 60 to 100, but there is a 59-point spread between D and F, a gap that can often make it mathematically impossible for some failing students to ever catch up.

“It’s a classic mathematical dilemma: that the students have a six times greater chance of getting an F,” says Douglas Reeves, founder of The Leadership and Learning Center, a Colorado-based educational think tank who has written on the topic.

Last time I checked, the closed interval S := \left[0,100\right] in the integers contains 101 elements, and a standard 90-80-70-60 scale allocates them in such a way that there is a 60 point spread between the minimum scores meriting an F and a D. Feel free to check my arithmetic, but even a hand calculator can usually get 60 - 0 = 60 correct. Apparently correct arithmetic is not important in an argument supporting how grades should be computed.

I’m also astounded—perhaps I shouldn’t be, but I am—that the head of a think tank would propose that statement as a “classic mathematical dilemma”. It certainly appears to contain two classic mathematical errors, however: one being a fencepost error, and the other being an unsupported assumption that grades are uniformly distributed over S. If the grades obtained by students were distributed uniformly over S, then the expected value of student grades would be 50. So far as I am aware, neither high school graduation rates nor university retention and graduation rates support such a claim. Moreover, we should expect strictly more values corresponding to an A than to any of the grades B, C, or D on a given assignment or exam, since it contains one more element of S than the others do. Would anyone care to provide the empirical evidence justifying such a result?
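The counts in the last two paragraphs are easy to check directly. Here is a minimal Python sketch, assuming the 90-80-70-60 convention described above; the helper name `letter` is hypothetical and introduced only for this illustration.

```python
from collections import Counter

def letter(score: int) -> str:
    """Map an integer score in [0, 100] to a letter on the 90-80-70-60 scale."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"

scores = range(0, 101)                       # the closed interval S
counts = Counter(letter(s) for s in scores)  # how many scores map to each letter
uniform_mean = sum(scores) / len(scores)     # expected value if grades were uniform on S

print(len(scores))    # 101
print(dict(counts))   # {'F': 60, 'D': 10, 'C': 10, 'B': 10, 'A': 11}
print(uniform_mean)   # 50.0
```

The F range holds 60 of the 101 scores (0 through 59), the A range holds 11, and the other three letters hold 10 each.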

As unsettling as these errors are, I find the sidebar to be more disquieting. The examples it provides for calculating grades do not correspond to the general assumption in the article that a letter grade of F is recorded, and then considered to be 0 at some later point. Rather, it shows what happens when we have recorded the scores on a 100-point scale, and then compute the mean with the actual score or with a false minimum of 50. In these comparisons, the recorded failing grade is not in general a zero.

Let’s take a slightly closer look at what is happening here. There are two grading scales in use here: the closed interval S=\left[0,100\right] of integer scores, and the set L=\left\{\mathrm{A,B,C,D,F}\right\} of letter grades. The 90-80-70-60 convention establishes a surjective mapping \varphi: S\to L that preserves the standard order on each of these sets. However, \varphi is not injective, so we cannot define a consistent arithmetic on L in terms of \varphi^{-1}. The argument provided in support of this 50-minimum grading scale is, in effect, an argument about how a representative of the preimage \varphi^{-1}\left(x\right) should be selected for each x\in L.

Unfortunately, this argument does not go through if we already have scores recorded. We are not free to select any element of the preimage. At least in the sciences, there’s a term for such an operation: it’s called falsification of data. Viewed this way, the implications of the grading policies being adopted by some school districts are, at best, disturbing.
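To make the point concrete, here is a small Python sketch. The list of recorded scores is made up for illustration and is not taken from the article's sidebar; the 60-point cutoff for an F follows the 90-80-70-60 convention above.

```python
from statistics import mean

recorded = [85, 72, 30, 0, 64]              # hypothetical recorded scores
floored = [max(s, 50) for s in recorded]    # the proposed 50-point minimum

actual_mean = mean(recorded)
floored_mean = mean(floored)

# Scores that were altered by the floor, as (recorded, replaced-with) pairs.
changed = [(r, f) for r, f in zip(recorded, floored) if r != f]

# Every altered score is still an F both before and after the change.
still_failing = all(r < 60 and f < 60 for r, f in changed)

print(actual_mean, floored_mean)   # 50.2 64.2
print(changed)                     # [(30, 50), (0, 50)]
print(still_failing)               # True
```

The letter grades are untouched, since 30, 0, and 50 all lie in the preimage of F, but the recorded data, and therefore the mean, has been changed.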

Copyright © 2008 Michael L. McCliment.

Perspectives 5: “Teaching to the test”

May 25, 2008
Someone, anyone, whoever…please explain to me why it is wrong to teach to a test.

—Jon-Paul @ American Age,
Teaching to the Test!

Whether it is “wrong” to teach to a test is, to a large degree, a value judgment; it depends fundamentally on what goals we are striving to achieve. I don’t agree with the assertion that it is “eminently sound pedagogy”, but again this may well depend on our objectives. I won’t attempt to explain why it is wrong to teach to a test, but I will explain why I judge it to be wrong.

Jon-Paul points out that

There is a distinct difference between teaching to the broad body of skills and knowledge that a test represents (good), and teaching to the exact items that will appear on the standardized test (indefensible and illegal).

The distinction he makes is important. I have observed a few instances where an instructor for a university course answered a student’s question with a response like “don’t worry about that right now, there’s no question on the test about it.” Jon-Paul’s conception of “good” teaching avoids this type of teaching to the test. Even with this conception, however, I still have two objections to such teaching. One objection is the use of the preposition to and what it implies about goals and achievement. The other concerns the relationship between a body of skills and knowledge on the one hand, and a test on the other.

The preposition to

The complete entry for to in the OED runs to about 8 pages, which I won’t reproduce here. The principal senses of to (A) are:

I. Expressing a spatial or local relation.
II. Expressing a relation in time.
III. Expressing the relation of purpose, destination, result, effect, resulting condition or status.
IV. Followed by a word or phrase expressing a limit in extent, amount, or degree.
V. Indicating addition, attachment, accompaniment, appurtenance, possession.
VI. Expressing relation to a standard or to a stated term or point.
VII. Expressing relations in which the sense of direction tends to blend with the dative.
VIII. Supplying the place of the dative in various other languages and in the earlier stages of English itself.

For those not familiar with it, the dative is essentially a way of indicating that the noun phrase is an indirect object. Across the board, to conveys a sense of limit. It suggests that just enough be taught to reach that limit.

Here’s a situation that is quite similar: preparing to give a presentation. When we present something (in a business meeting, at a conference, in a classroom, or wherever), we will be evaluated on the performance of a single act, just as students are evaluated on a single act of taking a test. When I prepare to give such a presentation, I don’t gather “just enough” information to put the presentation together. No good presenter does. Why not? There are several reasons to gather more information than we strictly need for a presentation, including being able to respond to questions that may be asked. But more importantly, this extra information actively helps us to master the material we will be presenting.

In science and engineering, we constantly look at the boundary conditions on any system we study, not just the interior behavior. We examine the boundary conditions because they help us to understand the purpose and behavior of the system. In software development, understanding the boundaries of a software system is often the key to understanding why certain pieces of it are the way they are. I’ve found the same to be true with any system of knowledge we may care to describe. I suspect that teaching “to” a test may well predispose us to teaching prescriptive rules rather than helping students to achieve understanding or mastery of a subject.

The relationship between tests and a body of knowledge

Jon-Paul’s comment suggested that good instructors will be “teaching to the broad body of skills and knowledge that a test represents”, and construes this to be equivalent to “teaching to the test”. Not only are the two not equivalent, but the skills and knowledge that a test represents do not cover the range of skills and knowledge that should be acquired by a student.

Tests have a very specific function: they serve as a measurement instrument to help teachers and other education professionals to assess a student’s level of mastery of a subject. This has a couple of implications. The first of these is a general principle that the measurement instrument should not influence what is being measured, because it compromises the measurements made; in this case, it undermines the purported value of testing and, in particular, of standardized tests. A second implication is that, as an assessment instrument, it has been designed to focus on certain aspects of some portion of what it seeks to measure. Moreover, tests are notoriously limited to those aspects of a subject that are “testable”. Both the design focus and the criterion of testability limit what can be represented by a test.

The reason that I find teaching to the test so particularly wrong is that, in my assessment, it nurtures a culture of underachievement, restricts the development of a student’s ability to think critically, fosters an inability for people to connect their knowledge to any real application of that knowledge, and disrupts the only defensible reason for inflicting tests on people in the first place. Tests need to reflect the subject matter that is being taught, but cannot mediate between the subject matter and the pedagogic choices about its presentation. As I said at the outset, however, it all depends on the goals one wants to achieve.

Copyright © 2008 Michael L. McCliment

Perspectives 4: Predictably random

May 18, 2008

I don’t follow the churn of technology news closely anymore. So, every once in a while, one of the webcomics that I read will send me scurrying to Google one topic or another. The most recent of these was Security Holes at xkcd:

Security Holes

My immediate reaction was “Huh? What fiasco?” This was followed almost immediately by a help-me-please-I’ve-been-living-under-a-rock trip to Google. A quick search not only gave me the answer, but caused me to choke on my coffee. You see, apparently someone actually removed the code that seeded the random number generator!

For those of you who don’t already know, let me explain why I choked on my coffee. Let’s start with a story. Once upon a time, in my very first programming class, we were given a programming assignment that introduced us to screen coordinates. Our goal was to “fill the sky with stars”—in other words, we needed to put a bunch of random white points on a black background. Not particularly difficult, unless either

  1. you didn’t get the fact that the origin is at the top-left of the screen, and the positive y-axis points down (forgetting this plotted the point somewhere off the screen), or
  2. you didn’t get how to scale a floating-point number in the interval \left(0, 1\right) to get an integer in the interval \left[1, n-1\right] (getting this wrong also plotted points off the screen, often at the origin).
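The scaling step in the second item can be sketched as follows. This is a Python illustration rather than the language used in that class; the function name `scale_to_int` and the 640-pixel width are assumptions made for the example. Note that Python's `random.random()` returns values in the half-open interval [0, 1), which the formula also handles.

```python
import math
import random

def scale_to_int(u: float, n: int) -> int:
    """Map u in [0, 1) to an integer in [1, n - 1]."""
    # u * (n - 1) lies in [0, n - 1), so its floor lies in {0, ..., n - 2}.
    return 1 + math.floor(u * (n - 1))

width = 640  # hypothetical screen width in pixels
xs = [scale_to_int(random.random(), width) for _ in range(10_000)]

print(min(xs) >= 1, max(xs) <= width - 1)   # True True
```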

The trickier part was this: I correctly used the random number generator, but every time I ran the program it gave me the exact same pattern of “stars”. The problem is that nearly all random number generators on computers don’t do what they claim; there’s not anything actually random about the process used to generate the numbers.

What computers actually have are pseudo-random number generators (PRNGs). Here’s the thing about a PRNG: it actually returns elements from a fixed, finite sequence of numbers. (There’s a good writeup on how PRNGs work at …) However it happens to be implemented, a PRNG is characterized by a sequence

\sigma\left(0\right), \sigma\left(1\right), \dots, \sigma\left(s-1\right),

where s is the length of the sequence. If it has just given you \sigma\left(i\right), it will give you \sigma\left(\left(i+1\right) \bmod s\right) the next time you call it. Each time that I ran my stars program, the PRNG started at the same point in the sequence, and there was no degree of randomness from one run to the next.
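One simple way to produce such a sequence is a linear congruential generator. The Python sketch below is only an illustration with toy parameters (a = 5, c = 3, m = 16 are chosen here so the whole cycle fits on one line); real generators use far longer sequences, and this is not a claim about how any particular library implements its PRNG.

```python
def lcg_sequence(a: int, c: int, m: int, start: int) -> list[int]:
    """Return the states of x -> (a*x + c) % m, from `start` until a state repeats."""
    seen = []
    x = start
    while x not in seen:
        seen.append(x)
        x = (a * x + c) % m
    return seen

sigma = lcg_sequence(a=5, c=3, m=16, start=0)
s = len(sigma)

def next_value(i: int) -> int:
    """The value returned after sigma[i]: the index wraps around modulo s."""
    return sigma[(i + 1) % s]

print(sigma)               # [0, 3, 2, 13, 4, 7, 6, 1, 8, 11, 10, 5, 12, 15, 14, 9]
print(next_value(s - 1))   # 0, back at the start of the sequence
```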

PRNGs generally have one (or more) ways to make the numbers they return appear to be more random than they are. I mentioned above how it decides on the next number to return, but not how to get the first number. This is where the seed enters into the process. When you seed the PRNG, you tell it where to start in \sigma. Seeding it with a constant doesn’t help, because that is also exceptionally predictable. Usually, the PRNG will be seeded with something that is at least hard to guess, like the millisecond portion of the time that it is seeded. (When I fixed my stars program, I set the seed to the current time of day.)
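The effect of the seed is easy to see in a few lines. This is a Python sketch of the stars program, not the original assignment; the function name `stars` and the screen dimensions are assumptions made for the example.

```python
import random
import time

def stars(seed: int, count: int = 5, width: int = 640, height: int = 480):
    """Return `count` pseudo-random (x, y) screen positions for a given seed."""
    rng = random.Random(seed)   # a private generator, started at a point chosen by the seed
    return [(rng.randrange(width), rng.randrange(height)) for _ in range(count)]

fixed_a = stars(seed=42)
fixed_b = stars(seed=42)              # same constant seed: the same "sky" every run
timed = stars(seed=time.time_ns())    # seeded from the clock: differs from run to run

print(fixed_a == fixed_b)   # True
```

A clock-based seed is enough for a sky full of stars, but it is still guessable. For anything security-related, Python's `secrets` module draws from the operating system's randomness source instead.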

The thing is, this is basic. I’m not entirely sure how you can learn programming and not learn about the consequences of pseudo-random generation, and what you have to do for it to be of any use at all. It comes up in graphics programming, computer simulation, sampling, security, and many other areas. Seeding the PRNG is like putting gas in the car before driving out to a remote camping spot for the weekend. It’s like measuring someone for a bridesmaid’s dress before cutting into the $40-per-yard material that it’ll be made from. Sure, you can fail to do so; but the consequences of this failure…

So there you have it—that’s why I choked on my coffee.

Copyright © 2008 Michael L. McCliment.
Security Holes used under a Creative Commons Attribution-NonCommercial license.

Perspectives 3: Mathematics and the other

May 11, 2008

I seem to have a preoccupation with the boundary between mathematics and what we can call the other—subjects that aren’t mathematics, but that interact with it in interesting ways.

I’ll try to be clearer about this boundary. To some extent, we would be hard-pressed to find a subject that doesn’t interact with mathematics in some capacity. Reporting survey results invokes statistics, even if only for descriptive purposes. The simple reporting of survey results is an obvious situation where mathematics “says something” about the other.

Example: Linguistic accommodation

For a less superficial situation, let’s consider a distinctly non-mathematical phenomenon. One of Mark Liberman’s recent posts over at Language Log raised questions about dialect features and linguistic accommodation in recent speeches by the presidential hopefuls in the US. On the surface, it wouldn’t seem that mathematics has much to say on these topics. We could, of course, adapt the corpus analysis techniques used in forensic linguistics and try to analyze the extent to which a particular speech by a candidate fits with the corpus of other speeches made by the same candidate, in which case inferential statistics would have a great deal to say about the subject.

There is also a more fundamental way in which mathematics plays a role in this discussion. If a speaker engages in linguistic accommodation, they need to consciously modify various features of their speech, adapting it to the prevailing characteristics of their target audience. For example, if someone from Louisiana were addressing people in Boston, they might adapt the phonetic aspects of their speech (e.g., adopting a non-rhotic pronunciation) and make different lexical selections (e.g., using soda rather than coke). If they were addressing people in Albany, they would likely avoid the non-rhotic pronunciation, but would still use soda rather than coke when asking for a drink. For those choices that correlate primarily with geography (rather than, say, occupation or income level), linguistic maps like the Generic Names for Soft Drinks that prompt the speaker from Louisiana to use soda as a linguistic accommodation can be useful. When we look at the map, the colors provide a clear division into geographic regions that use each term. The boundaries between these regions are called isoglosses, and identify where there is a change in a specific dialectal feature. These isoglosses are determined by the statistical distribution of particular linguistic phenomena, which will be estimated by sampling some of the speech of the residents in each area. The success of a speaker’s linguistic accommodation will depend on how well they respect such statistical distributions: substituting pop for coke in Boston isn’t particularly accommodating, nor is using a non-rhotic pronunciation in Albany.

This example suggests that mathematics both informs the act of linguistic accommodation (via the identification of isoglosses that allow selection of appropriate linguistic features) and permits an evaluation of that act (via a comparison between a particular speech and a given corpus of speeches). Both the informative and evaluative aspects allow mathematics to contribute to the subject of linguistic accommodation.

A more direct and structured interaction between mathematics and the other shows up in mathematical modeling. Rutherford Aris (Mathematical Modeling: A Chemical Engineer’s Perspective, p. 3) suggests that

A mathematical model is a representation, in mathematical terms, of certain aspects of a nonmathematical system. The arts and crafts of mathematical modeling are exhibited in the construction of models that not only are consistent in themselves and mirror the behavior of their prototype, but also serve some exterior purpose.

The mathematics in our previous example involves a representation only of the observed distributional properties of the nonmathematical system. A mathematical model goes further in that it represents some aspects of the actual system. The basic idea is familiar to anyone who has encountered so-called word problems (in the same way that anyone who has encountered “See Spot. See Spot run.” is familiar with the basic idea of reading).

Example: Mixing tank problem

A typical introductory example is a mixing tank problem, such as this one (from Sanchez, Allen, and Kyner, Differential Equations 2nd ed., p. 13):

A 1000-liter tank contains a mixture of water and chlorine. In order to reduce the concentration of chlorine in the tank, fresh water is pumped in at a rate of 6 liters per second. The fluid is well stirred and pumped out at a rate of 8 liters per second. If the initial concentration of chlorine is 0.02 grams per liter, find the amount of chlorine in the tank as a function of t and the interval of validity of the mathematical model.

When we model this type of situation, we identify a particular nonmathematical system (the mixing tank), some observable properties of the system (flow rates, tank size, and initial concentration), and we try to find a mathematical relationship that holds among those properties (in this case, it’s a particular differential equation).
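As a sketch of how that relationship might be set up: the symbols A(t) (grams of chlorine in the tank) and V(t) (liters of fluid in the tank) are introduced here and do not appear in the problem statement, and I am assuming the tank starts full at 1000 liters. Fresh water carries no chlorine in, and the well-stirred outflow carries chlorine out at 8 liters per second times the current concentration A/V.

```latex
V(t) = 1000 + (6 - 8)\,t = 1000 - 2t, \qquad A(0) = 0.02 \times 1000 = 20

\frac{dA}{dt} = -8\,\frac{A(t)}{V(t)} = -\frac{8A}{1000 - 2t}

\ln A = 4\ln\left(1000 - 2t\right) + C
\quad\Longrightarrow\quad
A(t) = 20\left(\frac{1000 - 2t}{1000}\right)^{4} = 20\left(1 - \frac{t}{500}\right)^{4}, \qquad 0 \le t < 500
```

Under these assumptions the interval of validity ends at t = 500 seconds, when V(t) reaches zero and the tank is empty.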

A critical element of any mathematical model is the identification of boundary conditions—essentially, observable properties that occur where the nonmathematical system meets the environment in which it is embedded. Aris makes the following observation (p. 13):

When Amundson taught the graduate course in mathematics for chemical engineering, he always insisted that “all boundary conditions arise from nature.” He meant, I think, that a lot of simplification and imagination goes into the model itself, but the boundary conditions have to mirror the links between the system and its environment very faithfully. Thus if we have no doubt that the feed does get into the reactor, then we must have a condition that ensures this in the model. We probably do not wish to model the hydrodynamics of the entrance region, but the inlet must be an inlet.

The points of correspondence between the system to be modeled and the observable phenomena at the boundary of the system are fundamental. If the correspondence fails, then the mathematical model will fail to say anything about the (nonmathematical) system; the situation will be somewhat analogous to the speaker from Louisiana substituting pop for coke while they’re visiting Boston.

The relationship between mathematics and the other does not need to involve quantification. Contemporary mathematics includes the study of systematic relationships and patterns (as I’ve already discussed to some extent here).

Example: Data flow diagrams

A common tool in the software developer’s toolbox since the 1970’s is a data flow diagram (DFD). A DFD indicates the interaction between two types of objects of which a system is composed: data and processes. Each process can use certain data (its inputs), and produce other data (its outputs); in the DFD, inputs are represented by an arrow from the input data to the process that will use it, and outputs are represented by an arrow from the process to the data that it produces. From a mathematical standpoint, this is known as a directed graph.

In a structured analysis and design approach to software engineering, the system will be modeled not with a single DFD, but with several of them in a hierarchic structure. The top level often represents the entire system as a single process, and indicates what inputs it receives (and possibly who, or what organizational unit, provides these inputs) and what it produces (and possibly who, or what organizational unit, uses these products). At this level, the arrows on the diagram are completely analogous to the boundary conditions in mathematical models. The next level of DFD “explodes” the process, showing how it is to be decomposed into smaller processes that interact with one another in order to realize the system. Each of these smaller processes can then be exploded to show a finer level of detail. At some point, the process will be simple enough that the designer can simply specify what the code is to accomplish; this is the code that will be written by the programmers.

It is not uncommon to be in a position where we have an existing software system that we need to understand, but don’t have reliable documentation for. Unfortunately, the code does not provide us with this organizational and conceptual information about the system. What we can do is to identify the fine-grained processes, and their inputs and outputs. This provides one huge DFD (which is identified with a directed graph) with no hierarchic organization, which isn’t really what we want. We can impose a hierarchic organization on this by grouping some of the processes together into a new process, and ignoring the internal structure of the new process. In graph-theoretic terms, we are simply contracting the arcs in the directed graph.
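The grouping step can be sketched in a few lines of Python. The node names below are a made-up order-processing example, and `contract` is a hypothetical helper: it merges a chosen set of nodes into one, which matches contracting the arcs among them when the group is connected.

```python
def contract(edges: set[tuple[str, str]], group: set[str], name: str) -> set[tuple[str, str]]:
    """Merge every node in `group` into a single node `name`, dropping internal arcs."""
    def rename(node: str) -> str:
        return name if node in group else node

    merged = {(rename(src), rename(dst)) for src, dst in edges}
    # Arcs that now start and end at `name` were internal to the group; ignore them.
    return {(src, dst) for src, dst in merged if src != dst}

# A flat DFD: data and processes are nodes, arrows are directed arcs.
flat = {
    ("orders", "validate"),
    ("validate", "valid_orders"),
    ("valid_orders", "price"),
    ("price", "invoice"),
}

higher_level = contract(flat, group={"validate", "valid_orders", "price"}, name="billing")
print(sorted(higher_level))   # [('billing', 'invoice'), ('orders', 'billing')]
```

What survives the contraction is exactly the new process's inputs and outputs, which is the information the next level up in the hierarchy needs.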

This example, like the other two I discussed above, illustrates a way in which mathematics can interact with the other, this time without involving any notion of quantity.

Just as there is a boundary between a nonmathematical system and its environment, there is a boundary between mathematics and the other. Frequently, we use mathematics as a tool for describing, analyzing, and modeling the other. How successfully we do this depends, I think, on how faithfully the mathematics reflects the systematic properties of the other.

Copyright © 2008 Michael L. McCliment.

Perspectives 2: Utilitarian mathematics

May 4, 2008

I discovered Paul Lockhart’s A Mathematician’s Lament via Isabel Lugo’s post at God Plays Dice. That post also introduced me to Keith Devlin’s monthly column Devlin’s Angle; the latest installment responds to feedback that he and Lockhart have received since the publication of the March column in which the Lament was published.

One of Devlin’s reactions to the Lament is that it doesn’t serve the needs of most people who are learning mathematics. He writes that

[Mathematics] is one of the most influential and successful cognitive technologies the world has ever seen. Tens of thousands of professionals the world over use mathematics every day, in science, engineering, business, commerce, and so on. They are good at it, but their main interest is in its use, not its internal workings. For them, mathematics is a tool. … Industry needs few employees who understand what a derivative or an integral are, but it needs many people who can solve a differential equation.

The question of use—of mathematics as a tool—is a recurring theme in Devlin’s column. The June 2007 installment, for example, started off this way:

One problem with teaching mathematics in the K-12 system – and I see it as a major difficulty – is that there is virtually nothing the pupils learn that has a non-trivial application in today’s world. The most a teacher can tell a student who enquires, entirely reasonably, “How is this useful?” is that almost all mathematics finds uses, in many cases important ones, and that what they learn in school leads on to mathematics that definitely is used.

Things change dramatically around the sophomore university level, when almost everything a student learns has significant applications.

I am not arguing that utility is the only or even the primary reason for teaching math. But the question of utility is a valid one that deserves an answer, and there really isn’t a good one. For many school pupils, and often their parents, the lack of a good answer is enough to persuade them to give up on math and focus their efforts elsewhere.

Utility, unfortunately, is particularly tricky to evaluate. It almost always depends on the context in which it is applied. For example, floating point computations on computers generally have a high degree of utility. They are our best approximation to the real numbers; they are used in nearly all numerical approximation algorithms. However, their utility drops significantly when they’re used to represent money.
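The drop in utility shows up with very small amounts. A minimal Python sketch, using the standard `decimal` module as one possible exact alternative (the variable names are just for illustration):

```python
from decimal import Decimal

# A dime plus two dimes, as binary floating-point numbers.
float_total = 0.10 + 0.20
# The same sum with exact decimal arithmetic.
decimal_total = Decimal("0.10") + Decimal("0.20")

print(float_total)                       # 0.30000000000000004
print(float_total == 0.30)               # False
print(decimal_total == Decimal("0.30"))  # True
```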

There’s another example that I want to talk about briefly. Let’s consider Isabel’s comment on proofs in high school geometry:

[Lockhart’s] most annoyed at the fact that when school mathematics does teach the idea of “proof” — which for traditional reasons is done in geometry classes — the proofs that students produce are so different from real proofs as to be unrecognizable. For those of you who don’t know, students in American schools are subjected to something called a “two-column proof” in which a sequence of statements is made in one column, and justifications for those statements is made in an adjacent column. I can imagine how this would be useful for very carefully checking if a proof is correct. But anybody who was actually reading such a proof in an attempt to learn something would probably attempt to translate it back to natural language first! I suspect that the real reason such “proofs” are so common in such courses is because they’re much easier to grade than a proof actually written out in sentences and paragraphs.

This type of proof introduces students to the important concept of traceability, which has a very high degree of utility in software development. Software systems are frequently large, complex, and expensive to build. (Software projects also fail, much more frequently than most people outside the software industry realize.) The typical software development project needs to meet specific goals, which are the requirements to be met by the software system when the project is completed. As the software is designed, there are two questions that need to be answered:

  1. Are all the requirements being met?
  2. Why are we building each of these components? What requirement do they help us satisfy?

The two-column proof in high-school geometry traces the asserted geometric property to a supporting justification for that assertion. This is precisely the idea of traceability that is needed to answer these fundamental questions in software development. Once the design for the software system is completed, we need to consider each requirement as an assertion that the software will have a particular property. Elements of the design must be offered as support justifying that assertion. Likewise, as one check to help ensure that the design hasn’t introduced a mass of unnecessary work, the components can be treated as assertions to be justified by the requirements.
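Both checks amount to reading a two-column table in each direction. Here is a small Python sketch; the requirement IDs, descriptions, and component names are all invented for the example and do not describe any real project.

```python
# Hypothetical requirements: ID -> statement.
requirements = {
    "R1": "export report as PDF",
    "R2": "log every login attempt",
    "R3": "encrypt stored passwords",
}

# Hypothetical design: component -> the requirements it is offered in support of.
design = {
    "PdfWriter": {"R1"},
    "AuditLog": {"R2"},
    "ThemeEngine": set(),   # supports no requirement
}

covered = set().union(*design.values())

# Question 1: are all the requirements being met?
unmet_requirements = sorted(set(requirements) - covered)
# Question 2: which components have no requirement justifying them?
unjustified_components = sorted(
    name for name, supports in design.items() if not (supports & set(requirements))
)

print(unmet_requirements)       # ['R3']
print(unjustified_components)   # ['ThemeEngine']
```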

From the standpoint of understanding a proof in geometry, it may be entirely reasonable to “translate” the proof into paragraphs. An itemized list of requirements, however, may consist of several hundred individual items; these may be supported by a design that specifies the construction of a couple thousand components of various types. When these are written out in sentences and paragraphs, it is simply too easy to miss important aspects of the requirements and the design, which often incorporate complex relationships that cannot be captured in a linear “sentence and paragraph” format.

Now, consider the utility of a two-column proof when we have multiple objectives: an understanding of the geometric argument and an understanding of traceability. The two-column proof may have relatively low utility in terms of mathematical understanding, but relatively high utility in terms of understanding complex systems.

There is one more potential issue with this utilitarian view of mathematics. If we return to Devlin’s column, we find that his January and February installments from this year actually suggest that the United States should be outsourcing the “routine mathematics”—presumably this refers to things like solving differential equations, since these are situations where we use mathematics as a non-trivial tool. (The trivial applications can generally be handed off to handheld calculators, spreadsheets, or computer math programs; there is no reason to outsource those tasks that are commonly handled by a microchip.)

This line of reasoning suggests some rather serious issues. If the mathematics taught in the K-12 environment has no real applications (Devlin’s June 2007 contention), and we outsource the routine mathematics that arises when we consider mathematics as a tool to achieve some other goal (his May 2008 suggestion), the argument that learning mathematics has any utility becomes particularly problematic.

Copyright © 2008 Michael L. McCliment.

Perspectives 1: Floating pennies

April 27, 2008

Currency has been bothering me for a long time. Or, more precisely, how currency gets represented in computer programs and databases has. The reason it bothers me is that a lot of software is written so that it can’t correctly represent a penny. I’m not saying that computer hardware can’t deal accurately with currencies, nor even that software can’t be written to deal accurately with currencies. The problem is that the software is written in such a way that it has to approximate the value of a penny, and doesn’t get it quite right.

So, if computers and software can represent money accurately, why would programmers choose to use approximate representations? To answer this question, we need to consider how currencies, number systems, and data types relate to one another.

Let’s start with number systems. By the time they enter a university, most people are familiar with the natural numbers \mathbb{N} (including 0), the integers \mathbb{Z}, the rational numbers \mathbb{Q}, and the real numbers \mathbb{R}; some people are even familiar with the complex numbers \mathbb{C}. For the most part, we don’t worry too much about the formal construction of these number systems, and we learn that there’s a strict containment relationship among them:

\mathbb{N} \subset \mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R} \subset \mathbb{C}.

In many common contexts where we see numbers written down, we’re not told which of these number systems the number is taken from. Rather, we have to infer the number system from the form in which the number is written and the context in which we find it. I suspect that most people, when they see a number written with a decimal point (example: 2.32), are going to automatically think of it as a real number. It’s obviously not an integer, and it doesn’t look like the fraction notation that we’re taught when we first learn about rational numbers.

Let’s consider how currency works relative to the number systems. I’ll work with the US dollar for examples, but most of the same considerations apply to every currency that I’m familiar with. Each currency system has a smallest unit—the penny in the case of US dollars. We may often see rates that use smaller units (gas prices come to mind); but, when we make the purchase, there are no partial cents in the final transaction. The currency doesn’t permit subdivision of a penny. We cannot use the US currency system to pay an amount that is between $5.27 and $5.28. Of the usual number systems we learn as we’re growing up, only \mathbb{N} and \mathbb{Z} have this property.

Moreover, currency doesn’t have negative units. If we buy something that costs $5.65, we cannot pay for it with a $5 bill, a $1 bill, and coins totaling -$0.35. If we pay with the five and one dollar bills, the cashier has to give us change. Since everything from \mathbb{Z} on up allows negative values (technically, additive inverses), we see currency behaving more like \mathbb{N} than any of the other number systems.

The challenge here is the notation we use for currency. Since we express currency values in terms of dollars, we tend to think of counting in dollars and parts of dollars, which sounds like something we’d do using rational or real numbers. In terms of the characteristics of the currency system, though, what we’re really counting in is pennies; that’s why the US currency system behaves like the natural numbers.
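To make the "counting in pennies" view concrete, here is a minimal sketch of the $5.65 purchase worked entirely in natural numbers. The helper name to_pennies is hypothetical, and it assumes amounts are written with a leading "$" and exactly two digits after the decimal point.

```python
def to_pennies(amount: str) -> int:
    """Convert dollar notation such as "$5.65" to a count of pennies.

    Hypothetical helper: assumes a leading "$" and exactly two digits
    after the decimal point. No floating-point value is ever created.
    """
    dollars, cents = amount.lstrip("$").split(".")
    if len(cents) != 2:
        raise ValueError("expected exactly two digits after the decimal point")
    return int(dollars) * 100 + int(cents)


# The purchase from the text: a $5.65 item paid for with a $5 and a $1 bill.
price = to_pennies("$5.65")
tendered = to_pennies("$5.00") + to_pennies("$1.00")

# The cashier owes us the difference, which is again a count of pennies.
change = tendered - price

print(price)   # 565
print(change)  # 35
```

Every quantity here is a non-negative whole number of pennies; the decimal point only appears in the notation we read in.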

So, now that we know how currency and number systems are related, let’s take a quick look at data types. All data on a computer is stored as a finite sequence of bits. We can think of the data type of some piece of data as a way to tell us what those bits mean. When it comes to numbers, many modern programming languages provide two primary data types: an ‘integer’ type and a floating-point type. (PHP, for example, provides integer and float types.) An integer data type can accurately represent any integer in some interval; provided you don’t give it too large a value, it will accurately represent any count (say, of pennies).

Floating-point numbers, on the other hand, are how we approximate real numbers on a computer. They can’t represent every number in the range they cover. In the binary formats that most languages use, every finite floating-point number actually has the form \frac{a}{b}, where a is an integer and b is a power of two. So, if you can’t write the number you want to represent as a fraction of this form, then the computer will approximate it. In particular, if we enter something like 2.11 and store it as a floating-point number, it will get approximated: 2.11 = \frac{211}{100}, and since 100 has a factor of 5, no equivalent fraction has a power of two as its denominator.
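The approximation of 2.11 can be checked directly. The following is a small sketch in Python (my choice of language for illustration, not one the discussion above depends on); it relies on the standard library's Fraction type, which converts a float to the exact fraction that the float actually stores.

```python
from fractions import Fraction

# Fraction(float) recovers the exact value the float stores, with no rounding.
stored = Fraction(2.11)

# The denominator of the stored value is a power of two
# (a power of two has exactly one bit set, so n & (n - 1) is zero).
denominator = stored.denominator
is_power_of_two = (denominator & (denominator - 1)) == 0

# The value we meant to store, as an exact fraction.
intended = Fraction(211, 100)

print(is_power_of_two)     # True
print(stored == intended)  # False
print(float(abs(stored - intended)))  # size of the approximation error
```

The error is tiny, but it is not zero, and that is the whole issue when the thing being represented is supposed to be an exact count.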

So why do programmers choose to use an approximate representation rather than an accurate one? I’d suggest that they think about currency very intuitively as “a number with a decimal point”. If you don’t think about the properties of currencies and data types, then the approximate representation as a floating-point number is the “obvious” choice. When you take those properties into account, however, storing a count of pennies as an integer certainly gives a better correspondence between the currency and how it gets represented.
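As a rough illustration of the difference, the sketch below adds ten cents and twenty cents twice: once as floating-point dollars and once as an integer count of pennies. The function format_pennies is a hypothetical helper introduced only for this example; it turns a count of pennies back into the familiar dollar notation for display.

```python
def format_pennies(pennies: int) -> str:
    """Render a non-negative count of pennies in dollar notation."""
    if pennies < 0:
        raise ValueError("a count of pennies cannot be negative")
    dollars, cents = divmod(pennies, 100)
    return f"${dollars}.{cents:02d}"


# Ten cents plus twenty cents, stored as floating-point dollars.
float_total = 0.10 + 0.20

# The same sum, stored as an integer count of pennies.
penny_total = 10 + 20

print(float_total == 0.30)          # False
print(penny_total == 30)            # True
print(format_pennies(penny_total))  # $0.30
print(format_pennies(527))          # $5.27
```

The decimal point only shows up at the very end, when the count is formatted for a person to read; the stored value is always an exact number of pennies.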
