## Multisets

I’ve been thinking about multisets quite a bit recently. I first encountered them in an Introductory Combinatorics course a few years back, where they were presented mostly as a convenient language for discussing combinations of elements from a set $S$ where an element may be selected more than once, and may have explicit bounds placed on how many times it can be selected. A typical problem went something along the following lines:

A charity organization is preparing fruit baskets to be delivered to homebound people. Each basket is to contain 10 pieces of fruit, selected from apples ($A$), bananas ($B$), oranges ($O$), and pears ($P$); other than the type of fruit, the pieces of fruit are indistinguishable. Moreover, each basket is to have at least two apples, at least one and no more than three bananas, and no more than two oranges. How many different baskets (different combinations of fruit) can be created that meet these conditions?

What we’re looking at here are certainly not subsets of $S = \left\{A, B, O, P\right\}$ satisfying the conditions. We’re also not looking for sequences of elements selected from $S$, since the fruit isn’t ordered in the basket. Rather, we’re looking for something very like a set (since order doesn’t matter) but in which the elements can appear multiple times, so that $\left\{A, A, B, B, B, O, O, P, P, P\right\} \neq \left\{A, B, O, P\right\}$. The question about the number of baskets is just a counting problem to determine the number of 10-element multisets whose elements are selected from $S$.

More recently, I’ve been considering directed multigraphs (where the arcs are a multiset over $V^2$, where $V$ is the vertex set) and minimalist generative grammars (where the input to the syntactic computation, called a numeration in the literature, is a multiset over a set of lexical items). These contexts led me to think about other operations on multisets (such as unions, intersections, differences, and the like). When we formalize the intuitive notion of a set in which elements can appear multiple times, and the number of times an element appears is important, we immediately run into a question of the bounds to be placed on “multiple”: is the number of appearances limited to the positive integers? the non-negative integers? the extended non-negative integers (including $\infty$)? This ambiguity is resolved in various ways in the literature; the resolution I’d like to pursue here is that multiple be formalized as any cardinal number of times.

To frame the discussion, we’ll adopt von Neumann-Bernays-Gödel set theory as a foundation. One advantage of this theory over the more familar ZFC theory is that it admits proper classes, so a function can be defined whose codomain is a proper class. Let $\mathbf{Card}$ be the (proper) class of cardinals.

Definition

A multiset $\mathcal{M}$ is a pair $\left(S,f\right)$ where $S$ is a set and $f: S\to\mathbf{Card}$ is a function. $S$ is the underlying set of $\mathcal{M}$, and the value of $f(x)$ is the multiplicity of $x$ in $\mathcal{M}$.

This definition explicitly allows $f(x)$ to be 0, which is disallowed by some authors. While this adds a little bit to the definition of membership in a multiset, it corresponds more naturally to some of the situations that we want to model using multisets. It also simplifies the definition of some of the operations we’ll be discussing.