Cardinality

April 30, 2008

In defining multisets, we have already used the class \mathbf{Card} of cardinals. Cardinals arise in set theory as a measure of the ‘size’ of a set; every set is equinumerous with a unique cardinal. (The formal definition of the cardinals relies on quite a bit more than I’m presenting here; see, for example, Chapter 3 of Mendelson’s Introduction to Mathematical Logic, 4th ed. (1997) for a discussion of the formal definition of the ordinals, and how to use the axiom of choice to define the cardinals in terms of the ordinals.) For any set S, the cardinality of S, written \left|S\right|, is the unique \kappa \in \mathbf{Card} that is equinumerous with S.

In order to define the cardinality of a multiset, we’ll need a definition related to cardinal arithmetic.

Definition

Let I be a set, \left\{\kappa_i\right\}_{i\in I} be a family of cardinals, and \left\{A_i\right\}_{i\in I} be a family of pairwise disjoint sets such that \left|A_i\right| = \kappa_i for each i\in I. The sum of this family of cardinals is

\sum_{i\in I} \kappa_i := \left| \bigcup_{i\in I} A_i \right|.

The axiom of union ensures that \mathcal{A} = \bigcup_{i\in I} A_i exists as a set, so it is equinumerous with a unique cardinal. It can also be proven that \left|\mathcal{A}\right| does not depend on the choice of the sets A_i, so the sum is well defined.
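As a quick sanity check (my own toy example, not drawn from the references above), for finite cardinals the definition recovers ordinary addition. Taking \kappa_1 = 2, \kappa_2 = 3, and the disjoint sets A_1 = \left\{a, b\right\}, A_2 = \left\{c, d, e\right\}, we get

\sum_{i\in\left\{1,2\right\}} \kappa_i = \left| A_1 \cup A_2 \right| = \left|\left\{a, b, c, d, e\right\}\right| = 5 = 2 + 3.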

Definition

Let \mathcal{M} = \left(S, f\right) be a multiset. The cardinality of \mathcal{M} is

\left|\mathcal{M}\right| := \sum_{x\in S} f\left(x\right).

Based on their cardinality, sets can be classified as finite, infinite, denumerable, etc. This classification extends to multisets in the obvious way.

If \mathcal{M} = \left(S, f: x\mapsto 1\right), then \left|S\right| = \left|\mathcal{M}\right|. This is exactly what we would expect of the multisets that correspond to their underlying sets.
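For finite multisets, the definition can be computed directly. Here is a minimal Python sketch of my own (the dict-based representation and the name multiset_cardinality are purely illustrative, not standard):

# A finite multiset (S, f): the dict's keys form S, its values give f.
def multiset_cardinality(f):
    # |M| is the sum of the multiplicities f(x) over x in S.
    return sum(f.values())

m = {'a': 2, 'b': 3, 'c': 0}      # the multiset {a, a, b, b, b}
print(multiset_cardinality(m))    # 5

s = {'a': 1, 'b': 1, 'c': 1}      # a set viewed as a multiset: f(x) = 1
print(multiset_cardinality(s))    # 3, so |M| = |S| as claimed above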

Copyright © 2008 Michael L. McCliment.


Multiset membership

April 28, 2008

Our informal conception of a multiset is that it’s like a set, but allows an element to be in the set multiple times. Formally, we’re considering multisets to be a structure \mathcal{M} = \left(S, f\right) where f: S\to\mathbf{Card}. Since S is a set, we have a primitive notion of membership available in the underlying set theory. What we want to do now is formalize the notion of membership in a multiset. Before we do, let’s define one other notion:

Definition

Let \mathcal{M} = \left(S,f\right) be a multiset. The support of \mathcal{M} is

\mathrm{support}\left(\mathcal{M}\right) := \left\{x\in S | f(x) > 0\right\}.

The support of a multiset is a subset of its underlying set: \mathrm{support}\left(\mathcal{M}\right) \subseteq S. We’ll now define membership in a multiset as

Definition

x\in\mathcal{M} := x\in\mathrm{support}\left(\mathcal{M}\right)

It may be tempting to define multiset membership simply as x\in\mathcal{M} := x\in S. So why not define it that way? Consider a multiset where S = \left\{a, b, c\right\} and f(a) = 2, f(b) = 7, f(c) = 0. Here c\in S, so the simpler definition would make c a member of \mathcal{M} even though it occurs 0 times. Since \mathrm{support}\left(\mathcal{M}\right) = \left\{a, b\right\}, the definition above gives a, b\in\mathcal{M} and c \not\in\mathcal{M}—exactly the result we would expect when c occurs 0 times in \mathcal{M}.
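For finite multisets, both notions are easy to work with computationally. A minimal Python sketch of my own (representing f as a dict over S; the names support and member are purely illustrative):

def support(f):
    # Elements of the underlying set with nonzero multiplicity.
    return {x for x, count in f.items() if count > 0}

def member(x, f):
    # Multiset membership: x is in M iff x is in support(M).
    return x in support(f)

f = {'a': 2, 'b': 7, 'c': 0}   # the example above
print(support(f))              # {'a', 'b'}
print(member('a', f))          # True
print(member('c', f))          # False: c is in S but occurs 0 times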

Since multisets are intended to be a generalization of sets, we should be able to correlate the class of sets with some subclass of multisets. (This probably has a more elegant statement in terms of category theory. Unfortunately, I have only a passing familiarity with it at this point, and wouldn’t trust any formulation I might propose in category-theoretic terms.) The correlation is given by \phi: S \mapsto \left(S, f\right) with f: x\mapsto 1. That is, a set S is identified with the multiset \mathcal{M} = \phi\left(S\right) over S where every element of S has a multiplicity of 1. With this mapping, we have

\mathrm{support}\left(\mathcal{M}\right) = S and

x\in S \Leftrightarrow x\in\mathrm{support}\left(\mathcal{M}\right) \Leftrightarrow x\in \mathcal{M}.
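In the same finite, dict-based sketch as above, \phi just assigns multiplicity 1 to every element (the name phi below is my own), and the first equality can be checked directly:

def phi(s):
    # Identify a set S with the multiset (S, f) where f(x) = 1 for all x in S.
    return {x: 1 for x in s}

s = {'a', 'b', 'c'}
m = phi(s)
print({x for x, count in m.items() if count > 0} == s)   # True: support(M) = S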

We’ll revisit this correspondence as we define the various operations on multisets, showing how the operations on multisets extend the corresponding operations on sets.

Copyright © 2008 Michael L. McCliment.


Perspectives 1: Floating pennies

April 27, 2008

Currency has been bothering me for a long time. Or, more precisely, how currency gets represented in computer programs and databases has. The reason it bothers me is that a lot of software is written so that it can’t correctly represent a penny. I’m not saying that computer hardware can’t deal accurately with currencies, nor even that software can’t be written to deal accurately with currencies. The problem is that the software is written in such a way that it has to approximate the value of a penny, and doesn’t get it quite right.

So, if computers and software can represent money accurately, why would programmers choose to use approximate representations? To answer this question, we need to consider how currencies, number systems, and data types relate to one another.

Let’s start with number systems. By the time they enter a university, most people are familiar with the natural numbers \mathbb{N} (including 0), the integers \mathbb{Z}, the rational numbers \mathbb{Q}, and the real numbers \mathbb{R}; some people are even familiar with the complex numbers \mathbb{C}. For the most part, we don’t worry too much about the formal construction of these number systems, and we learn that there’s a strict containment relationship among them:

\mathbb{N} \subset \mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R} \subset \mathbb{C}.

In many common contexts where we see numbers written down, we’re not told which of these number systems the number is taken from. Rather, we have to infer the number system from the form in which the number is written and the context in which we find it. I suspect that most people, when they see a number written with a decimal point (example: 2.32) are going to automatically think of this as a real number. It’s obviously not an integer, and it doesn’t look like the fraction notation that we’re taught when we first learn about rational numbers.

Let’s consider how currency works relative to the number systems. I’ll work with the US dollar for examples, but most of the same considerations apply to every currency that I’m familiar with. Each currency system has a smallest unit—the penny in the case of US dollars. We may often see rates that use smaller units (gas prices come to mind); but, when we make the purchase, there are no partial cents in the final transaction. The currency doesn’t permit subdivision of a penny. We cannot use the US currency system to pay an amount that is between $5.27 and $5.28. Of the usual number systems we learn as we’re growing up, only \mathbb{N} and \mathbb{Z} have this property.

Moreover, currency doesn’t have negative units. If we buy something that costs $5.65, we cannot pay this with a $5 bill, a $1 bill, and coins totaling -$0.35. If we pay with the five and one dollar bills, the cashier has to give us change. Since everything from \mathbb{Z} on up allows negative values (technically, additive inverses), we see currency behaving more like \mathbb{N} than any of the other number systems.

The challenge here is the notation we use for currency. Since we express currency values in terms of dollars, we tend to think of counting in dollars and parts of dollars, which sounds like something we’d do using rational or real numbers. In terms of the characteristics of the currency system, though, what we’re really counting in is pennies; that’s why the US currency system behaves like the natural numbers.

So, now that we know how currency and number systems are related, let’s take a quick look at data types. All data on a computer is stored as a finite sequence of bits. We can think of the data type of some piece of data as a way to tell us what those bits mean. When it comes to numbers, many modern programming languages provide two primary data types: integers and floating-point numbers. (PHP, for example, provides integer and float types.) An integer data type can accurately represent any integer in some interval; provided you don’t give it too large a value, it will accurately represent any count (say, of pennies).

Floating-point numbers, on the other hand, are how we approximate real numbers on a computer. They cannot, however, represent every number in the range they cover. Every floating-point number actually has the form \frac{a}{b}, where a is an integer and b is a power of two. So, if the number you want to represent can’t be written as a fraction of this form, the computer will approximate it. In particular, if we enter something like 2.11 and store it as a floating-point number, it will get approximated.
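A quick way to see this for yourself (a small Python illustration of my own; the post mentions PHP, but any language using standard double-precision floating point behaves the same way):

from decimal import Decimal

# Decimal(2.11) shows the exact value actually stored for the literal 2.11:
# the nearest fraction a / 2^k, which is not 2.11 itself.
print(Decimal(2.11))

# The same issue shows up in arithmetic on currency-like values.
print(0.1 + 0.2 == 0.3)   # False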

So why do programmers choose to use an approximate representation rather than an accurate one? I’d suggest that they think about currency very intuitively as “a number with a decimal point”. If you don’t think about the properties of currencies and data types, then the approximate representation as a floating-point number is the “obvious” choice. When you take those properties into account, however, storing a count of pennies as an integer certainly gives a better correspondence between the currency and how it gets represented.
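A sketch of the integer alternative (my own illustration, with arbitrary names): keep every amount as a count of pennies, and only convert to dollars-and-cents notation at the point where the value is displayed.

price_cents = 565                      # $5.65
paid_cents = 600                       # a $5 bill and a $1 bill
change_cents = paid_cents - price_cents

# Format only for display; all the arithmetic stays exact integer arithmetic.
print(f"change: ${change_cents // 100}.{change_cents % 100:02d}")   # change: $0.35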

Copyright © 2008 Michael L. McCliment.


“Language is not mathematics”

April 25, 2008

It has become something of a platitude to claim that “language is not mathematics”. It shows up in several variants—a quick search in Google gives results for all combinations of the variants using “is not” / “isn’t” and “mathematics” / “maths” / “math”. The origin of this phrase seems to be Otto Jespersen’s The Philosophy of Grammar (1924). Jespersen’s actual argument is that “language is not mathematics, and … a linguistic negative cannot be compared with the sign − (minus) in mathematics; hence any reference to the mathematical rule about two minus’s is inconclusive” (p. 331).

The mathematical rule to which he is referring, of course, is an arithmetic result that holds when multiplying real numbers: for all a, b\in\mathbb{R}, if a, b < 0, then the product ab > 0. This result relies on the definitions of addition and multiplication, the existence of additive inverses, and the order properties of the real numbers that are compatible with the operations on these numbers. Any analogy between linguistic negatives and a result that relates to the product of additive inverses in an algebraic structure would appear to be hopelessly wrong. In this case, Jespersen’s comment amounts to a claim that the semantics of natural languages cannot be adequately modeled as an ordered field—not that anyone would ever claim such a thing in the first place.
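For completeness, here is a sketch of why the arithmetic rule holds (standard material, not part of Jespersen’s discussion): if a, b < 0, write a = -a' and b = -b' with a', b' > 0. Then

ab = \left(-a'\right)\left(-b'\right) = a'b' > 0,

using the identity \left(-x\right)\left(-y\right) = xy, which follows from distributivity and the uniqueness of additive inverses, together with the fact that the product of two positive reals is positive.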

The more important point of Jespersen’s observations is that linguistic negatives do not correlate with logical negation. In propositional logic, the statements \neg\neg A and A have identical meaning for every proposition A. Cumulative (double, triple, quadruple) linguistic negatives behave differently from logical negation in propositional logic. Consequently, language is not (propositional) logic.

There is another respect in which language differs from both mathematics and logic, which is more fundamental than what is suggested by the “is not” assertions. This difference has to do with what language, logic, and mathematics are rather than with what they aren’t. Let’s start by considering the primary sense of each word as defined in the OED:

language The system of spoken or written communication used by a particular country, people, community, etc., typically consisting of words used within a regular grammatical and syntactic structure. Also fig.

logic The branch of philosophy that treats of the forms of thinking in general, and more especially of inference and of scientific method. (Prof. J. Cook Wilson.) Also, since the work of Gottlob Frege (1848-1925), a formal system using symbolic techniques and mathematical methods to establish truth-values in the physical sciences, in language, and in philosophical argument.

mathematics Originally: (a collective term for) geometry, arithmetic, and certain physical sciences involving geometrical reasoning, such as astronomy and optics; spec. the disciplines of the quadrivium collectively. In later use: the science of space, number, quantity, and arrangement, whose methods involve logical reasoning and usually the use of symbolic notation, and which includes geometry, arithmetic, algebra, and analysis; mathematical operations or calculations. Colloq. abbreviated maths, (N. Amer.) math.

Mathematics and logic share the property of being, in effect, the study of a specific subject. Mathematics is the study of “space, number, quantity, and arrangement”; such study has people—mathematicians—who undertake this effort. Logic is the study of “the forms of thinking”; such study has people—logicians—who undertake this effort. We also say, especially as students, that we are “studying mathematics” or “studying logic”; typically, this means that we are actively learning the techniques used and results previously obtained within that science.

Language, however, is not the study of anything; we don’t have ‘languageians’ who undertake the effort of language in the way that mathematicians and logicians undertake their respective efforts. Language is, rather, a “system of spoken or written communication”. It correlates not with mathematics and logic, but rather with space, number, quantity, arrangement, and the forms of thought (and, in Frege’s sense, with formal systems to establish truth values). The study of language is what we call linguistics, and the people who undertake this study are called linguists.

What this suggests is that we have the following parallels:

Field of study    Practitioners     Objects studied
linguistics       linguists         language
logic             logicians         forms of thought
mathematics       mathematicians    space, number, quantity, and arrangement

The question of whether language “is” mathematics or logic is easily answered at this point. A more interesting question is to what extent language, forms of thought, and arrangement overlap.

Copyright © 2008 Michael L. McCliment.


Multisets

April 23, 2008

I’ve been thinking about multisets quite a bit recently. I first encountered them in an Introductory Combinatorics course a few years back, where they were presented mostly as a convenient language for discussing combinations of elements from a set S where an element may be selected more than once, and may have explicit bounds placed on how many times it can be selected. A typical problem went something along the following lines:

A charity organization is preparing fruit baskets to be delivered to homebound people. Each basket is to contain 10 pieces of fruit, selected from apples (A), bananas (B), oranges (O), and pears (P); other than the type of fruit, the pieces of fruit are indistinguishable. Moreover, each basket is to have at least two apples, at least one and no more than three bananas, and no more than two oranges. How many different baskets (different combinations of fruit) can be created that meet these conditions?

What we’re looking at here are certainly not subsets of S = \left\{A, B, O, P\right\} satisfying the conditions. We’re also not looking for sequences of elements selected from S, since the fruit isn’t ordered in the basket. Rather, we’re looking for something very like a set (since order doesn’t matter) but in which the elements can appear multiple times, so that \left\{A, A, B, B, B, O, O, P, P, P\right\} \neq \left\{A, B, O, P\right\}. The question about the number of baskets is just a counting problem to determine the number of 10-element multisets whose elements are selected from S.
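The count itself is easy to verify by brute force. Here is a short Python check of my own (not part of the original course problem), enumerating the possible multiplicities directly:

from itertools import product

# Multiplicities (a, b, o) for apples, bananas, oranges; pears fill the rest.
# Constraints: a >= 2, 1 <= b <= 3, 0 <= o <= 2, and a + b + o <= 10.
count = sum(
    1
    for a, b, o in product(range(2, 11), range(1, 4), range(0, 3))
    if a + b + o <= 10
)
print(count)   # 54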

More recently, I’ve been considering directed multigraphs (where the arcs are a multiset over V^2, where V is the vertex set) and minimalist generative grammars (where the input to the syntactic computation, called a numeration in the literature, is a multiset over a set of lexical items). These contexts led me to think about other operations on multisets (such as unions, intersections, differences, and the like). When we formalize the intuitive notion of a set in which elements can appear multiple times, and the number of times an element appears is important, we immediately run into a question of the bounds to be placed on “multiple”: is the number of appearances limited to the positive integers? the non-negative integers? the extended non-negative integers (including \infty)? This ambiguity is resolved in various ways in the literature; the resolution I’d like to pursue here is that multiple be formalized as any cardinal number of times.

To frame the discussion, we’ll adopt von Neumann-Bernays-Gödel set theory as a foundation. One advantage of this theory over the more familiar ZFC theory is that it admits proper classes, so a function can be defined whose codomain is a proper class. Let \mathbf{Card} be the (proper) class of cardinals.

Definition

A multiset \mathcal{M} is a pair \left(S,f\right) where S is a set and f: S\to\mathbf{Card} is a function. S is the underlying set of \mathcal{M}, and the value of f(x) is the multiplicity of x in \mathcal{M}.

This definition explicitly allows f(x) to be 0, which is disallowed by some authors. While this adds a little bit to the definition of membership in a multiset, it corresponds more naturally to some of the situations that we want to model using multisets. It also simplifies the definition of some of the operations we’ll be discussing.
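For finite multisets, one concrete way to realize this definition (my own sketch, restricted to finite multiplicities; the lexical items are invented for illustration) is a mapping from elements to counts. Allowing f(x) = 0 then just means allowing an element whose count is 0, as in the numeration-style example below.

# A finite multiset (S, f): the keys form S, the values give f(x).
# f(x) = 0 is allowed, matching the definition above.
numeration = {'the': 2, 'cat': 1, 'slept': 1, 'on': 0}

underlying_set = set(numeration)   # S
multiplicity = numeration.get      # f
print(underlying_set)              # S, with 'on' included even though f('on') = 0
print(multiplicity('the'))         # 2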

Copyright © 2008 Michael L. McCliment.


Prologue

April 21, 2008

Greetings!

I’ll start with a quick introduction before jumping into the real content of this blog. My name is Michael, and I’m currently a graduate student in English linguistics at Arizona State University. I’m writing my master’s thesis this summer, and simultaneously applying to graduate programs in mathematics. Yes, the two are actually related; most of my research over the past four years has focused on the mathematics of various linguistic theories.

Mathematics and theoretical linguistics are also the twin themes of this blog. My plan is to write about pure mathematics (I’m particularly interested in algebra, graph theory, and foundations), linguistic theories (primarily syntax and semantics), and the connections between the two. For the most part, I expect I’ll be trying to explain the things that I’m actively studying. The best way I’ve found to figure out what I don’t yet understand is to try to explain it. Hopefully, the journey will be fun!

Copyright © 2008 Michael L. McCliment.