Currency has been bothering me for a long time. Or, more precisely, how currency gets represented in computer programs and databases has. The reason it bothers me is that a lot of software is written so that it can’t correctly represent a penny. I’m not saying that computer hardware can’t deal accurately with currencies, nor even that software can’t be written to deal accurately with currencies. The problem is that the software is written in such a way that it has to approximate the value of a penny, and doesn’t get it quite right.
So, if computers and software can represent money accurately, why would programmers choose to use approximate representations? To answer this question, we need to consider how currencies, number systems, and data types relate to one another.
Let’s start with number systems. By the time they enter a university, most people are familiar with the natural numbers ℕ (including 0), the integers ℤ, the rational numbers ℚ, and the real numbers ℝ; some people are even familiar with the complex numbers ℂ. For the most part, we don’t worry too much about the formal construction of these number systems, and we learn that there’s a strict containment relationship among them: ℕ ⊂ ℤ ⊂ ℚ ⊂ ℝ ⊂ ℂ.
In many common contexts where we see numbers written down, we’re not told which of these number systems the number is taken from. Rather, we have to infer the number system from the form in which the number is written and the context in which we find it. I suspect that most people, when they see a number written with a decimal point (example: 2.32) are going to automatically think of this as a real number. It’s obviously not an integer, and it doesn’t look like the fraction notation that we’re taught when we first learn about rational numbers.
Let’s consider how currency works relative to the number systems. I’ll work with the US dollar for examples, but most of the same considerations apply to every currency that I’m familiar with. Each currency system has a smallest unit—the penny in the case of US dollars. We may often see rates that use smaller units (gas prices come to mind); but, when we make the purchase, there are no partial cents in the final transaction. The currency doesn’t permit subdivision of a penny. We cannot use the US currency system to pay an amount that is between $5.27 and $5.28. Of the usual number systems we learn as we’re growing up, only ℕ and ℤ have this property.
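To see how a fractional rate still produces a whole-cent transaction, here’s a small sketch in Python (the prices and quantities are made up for illustration): a pump advertises a rate in tenths of a cent, but the amount actually charged is rounded to a whole number of pennies.

```python
from decimal import Decimal, ROUND_HALF_UP

# A rate can use tenths of a cent...
price_per_gallon = Decimal("3.599")
gallons = Decimal("10.5")

raw_total = price_per_gallon * gallons  # 37.7895 dollars -- not payable as-is
# ...but the final transaction must be a whole number of cents.
total = raw_total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(total)  # 37.79
```

The rounding step is where the partial cents disappear: whatever the rate says, the sale itself lands on a value the currency can actually express.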
Moreover, currency doesn’t have negative units. If we buy something that costs $5.65, we cannot pay this with a $5 bill, a $1 bill, and coins totaling -$0.35. If we pay with the five and one dollar bills, the cashier has to give us change. Since everything from ℤ on up allows negative values (technically, additive inverses), we see currency behaving more like ℕ than any of the other number systems.
The challenge here is the notation we use for currency. Since we express currency values in terms of dollars, we tend to think of counting in dollars and parts of dollars, which sounds like something we’d do using rational or real numbers. In terms of the characteristics of the currency system, though, what we’re really counting in is pennies; that’s why the US currency system behaves like the natural numbers.
So, now that we know how currency and number systems are related, let’s take a quick look at data types. All data on a computer is stored as a finite sequence of bits. We can think of the data type of some piece of data as a way to tell us what those bits mean. When it comes to numbers, many modern programming languages provide two primary data types: an integer type and a floating-point type. (PHP, for example, provides integer and float types.) An integer data type can accurately represent any integer in some interval; provided you don’t give it too large a value, it will accurately represent any count (say, of pennies).
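Counting pennies with an integer type looks like this in Python (the amounts are invented for the example); every value and every sum is exact, and formatting in dollars is just division by 100:

```python
# A count of pennies stored as an integer: exact, no approximation.
price_cents = 527   # $5.27
tax_cents = 38      # $0.38
total_cents = price_cents + tax_cents

print(total_cents)  # 565
# Formatting back into dollars is presentation, not representation.
print(f"${total_cents // 100}.{total_cents % 100:02d}")  # $5.65
```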
Floating-point numbers, on the other hand, are how we approximate real numbers on a computer. They can’t, however, represent every number in the range they cover. Every floating-point number actually has the form n/d, where the denominator d is a power of two. So, if you can’t write the number you want to represent as a fraction of this form, then the computer will approximate it. In particular, if we enter something like 2.11 and store it as a floating-point number, it will get approximated.
So why do programmers choose to use an approximate representation rather than an accurate one? I’d suggest that they think about currency very intuitively as “a number with a decimal point”. If you don’t think about the properties of currencies and data types, then the approximate representation as a floating-point number is the “obvious” choice. When you take those properties into account, however, storing a count of pennies as an integer certainly gives a better correspondence between the currency and how it gets represented.
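The difference shows up as soon as you do arithmetic. A minimal comparison in Python (the charges are made up): summing ten $0.10 charges as floats drifts away from $1.00, while summing the same charges as integer pennies cannot.

```python
# Ten charges of ten cents each, two representations.
float_total = sum([0.10] * 10)  # floats: each 0.10 is an approximation
cents_total = sum([10] * 10)    # integer pennies: every value is exact

print(float_total == 1.0)   # False -- the approximations accumulate
print(cents_total == 100)   # True
```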
Copyright © 2008 Michael L. McCliment.