## Multiset operations as binary operations (part 1)

June 9, 2008

Review: Associativity, commutativity, and idempotence

A binary operation $\cdot$ on a set $X$ is a function $\cdot: X\times X \to X$. It is usually written using an infix notation $x\cdot y$ rather than a functional notation such as $\cdot\left(x, y\right)$.

The operation $\cdot$ is associative if and only if $a\cdot\left(b\cdot c\right) = \left(a\cdot b\right)\cdot c$ for all $a,b,c\in X$. It is commutative if and only if $a\cdot b = b\cdot a$ for all $a,b\in X$. It is idempotent if and only if $a\cdot a = a$ for all $a\in X$.

We’ve defined four operations on families of multisets $\left\{\mathcal{M}_i\right\}_{i\in I}$ where $\mathcal{M}_i \in \mathbf{MSet}_X$ for all $i\in I$: sums, products, intersections, and unions. As I’ve already commented in a couple of places, we can restrict our attention to families where $\left|I\right| = 2$. This provides us with four binary operations on $\mathbf{MSet}_X$. Today’s post collects together the properties of these binary operations.

Proposition 1: The operations $\uplus$, $\:\cdot\!\!\!\!\cup$, $\cap$, and $\cup$ on $\mathbf{MSet}_X$ are all associative and commutative.

Proof:

Let $\mathcal{M}_1, \mathcal{M}_2, \mathcal{M}_3 \in \mathbf{MSet}_X$. The first four parts follow directly from the associativity and commutativity of $+$ and $\cdot$ as binary operations on $\mathbf{Card}$.

Associativity of $\uplus$.

$\begin{array}{r@{\:=\:}l} \mathcal{M}_1 \uplus \left(\mathcal{M}_2 \uplus \mathcal{M}_3\right) & \left(X, f_1 + \left(f_2 + f_3\right)\right) \\ & \left(X, \left(f_1 + f_2\right) + f_3\right) \\ & \left(\mathcal{M}_1 \uplus \mathcal{M}_2\right) \uplus \mathcal{M}_3\end{array}$

Commutativity of $\uplus$.

$\begin{array}{r@{\:=\:}l} \mathcal{M}_1 \uplus \mathcal{M}_2& \left(X, f_1 + f_2\right) \\ & \left(X, f_2 + f_1\right) \\ & \mathcal{M}_2 \uplus \mathcal{M}_1\end{array}$

Associativity of $\:\cdot\!\!\!\!\cup$.

$\begin{array}{r@{\:=\:}l} \mathcal{M}_1 \:\cdot\!\!\!\!\cup\: \left(\mathcal{M}_2 \:\cdot\!\!\!\!\cup\: \mathcal{M}_3\right) & \left(X, f_1 \cdot \left(f_2 \cdot f_3\right)\right) \\ & \left(X, \left(f_1 \cdot f_2\right) \cdot f_3\right) \\ & \left(\mathcal{M}_1 \:\cdot\!\!\!\!\cup\: \mathcal{M}_2\right) \:\cdot\!\!\!\!\cup\: \mathcal{M}_3\end{array}$

Commutativity of $\:\cdot\!\!\!\!\cup$.

$\begin{array}{r@{\:=\:}l} \mathcal{M}_1 \:\cdot\!\!\!\!\cup\: \mathcal{M}_2& \left(X, f_1 \cdot f_2\right) \\ & \left(X, f_2 \cdot f_1\right) \\ & \mathcal{M}_2 \:\cdot\!\!\!\!\cup\: \mathcal{M}_1\end{array}$

The remaining parts follow from the following facts:

1. Let $A, B$ be partially ordered sets. Then $\inf \left(A \cup \left\{\inf B\right\}\right) = \inf\left(A\cup B\right)$ and $\sup \left(A \cup \left\{\sup B\right\}\right) = \sup \left(A\cup B\right)$ whenever the suprema and infima exist.
2. The supremum and infimum of every set of cardinals exist.

Associativity of $\cap$.

$\begin{array}{r@{\:=\:}l} \mathcal{M}_1 \cap \left(\mathcal{M}_2 \cap \mathcal{M}_3\right) & \left(X, \inf \left\{f_1, \inf\left\{f_2, f_3\right\}\right\}\right) \\ & \left(X, \inf\left\{f_1, f_2, f_3\right\}\right) \\ & \left(X, \inf \left\{\inf\left\{f_1, f_2\right\}, f_3\right\}\right) \\ & \left(\mathcal{M}_1 \cap\mathcal{M}_2\right) \cap \mathcal{M}_3\end{array}$

Commutativity of $\cap$.

$\begin{array}{r@{\:=\:}l} \mathcal{M}_1 \cap \mathcal{M}_2& \left(X, \inf \left\{f_1, f_2\right\}\right) \\ & \mathcal{M}_2 \cap \mathcal{M}_1\end{array}$

Associativity of $\cup$.

$\begin{array}{r@{\:=\:}l} \mathcal{M}_1 \cup \left(\mathcal{M}_2 \cup \mathcal{M}_3\right) & \left(X, \sup \left\{f_1, \sup\left\{f_2, f_3\right\}\right\}\right) \\ & \left(X, \sup\left\{f_1, f_2, f_3\right\}\right) \\ & \left(X, \sup \left\{\sup\left\{f_1, f_2\right\}, f_3\right\}\right) \\ & \left(\mathcal{M}_1 \cup\mathcal{M}_2\right) \cup \mathcal{M}_3\end{array}$

Commutativity of $\cup$.

$\begin{array}{r@{\:=\:}l} \mathcal{M}_1 \cup \mathcal{M}_2& \left(X, \sup \left\{f_1, f_2\right\}\right) \\ & \mathcal{M}_2 \cup \mathcal{M}_1\end{array}$

Proposition 2: The operations $\cap$ and $\cup$ on $\mathbf{MSet}_X$ are idempotent.

Proof:

Let $\mathcal{M}\in \mathbf{MSet}_X$.

Idempotence of $\cap$.

$\begin{array}{r@{\:=\:}l} \mathcal{M} \cap \mathcal{M}& \left(X, \inf \left\{f, f\right\}\right) \\ & \left(X, f\right) \\ & \mathcal{M}\end{array}$

Idempotence of $\cup$.

$\begin{array}{r@{\:=\:}l} \mathcal{M} \cup \mathcal{M}& \left(X, \sup \left\{f, f\right\}\right) \\ & \left(X, f\right) \\ & \mathcal{M}\end{array}$

The other two operations—sum and product—are not idempotent. One counterexample is the multiset $\mathcal{M} = \left\{a^3\right\}$, for which we have

$\mathcal{M}\uplus\mathcal{M} = \left\{a^6\right\} \neq \mathcal{M}$

and

$\mathcal{M}\:\cdot\!\!\!\!\cup\:\mathcal{M} = \left\{a^9\right\} \neq \mathcal{M}$.

## Abstract corpora (part 1)

June 6, 2008

In The Logical Structure of Linguistic Theory (written around 1956, although not published until 1975), Chomsky outlined a theory of linguistic form, and suggested from the beginning that “we will try to show how an abstract theory of linguistic structure can be developed within a framework that admits of operational interpretation, and how such a theory can lead to a practical mechanical procedure by which, given a corpus of linguistic material, various proposed grammars can be compared and the best of them selected” (Chomsky 1975, p. 61). In order for such a mechanical procedure to be used, it would be necessary to present an actual collection of linguistic material—utterances recorded in some suitable form—on which it could operate. A grammar, in this context, is construed as a theory (Chomsky 1975, p. 63):

By “the grammar of a language $L$” we mean that theory of $L$ that attempts to deal with such problems as [projection, ambiguity, sentence type, etc.] wholly in terms of the formal properties of utterances. And by “the general theory of linguistic form” we mean the abstract theory in which the basic concepts of grammar are developed, and by means of which each proposed grammar can be evaluated.

The relationship between a language $L$ and a grammar of $L$, in early generative theory, is conceived of as follows (Chomsky 1957, p. 13):

From now on I will consider a language to be a set (finite or infinite) of sentences, each finite in length and constructed out of a finite set of elements. … The fundamental aim in the linguistic analysis of a language $L$ is to separate the grammatical sequences which are the sentences of $L$ from the ungrammatical sequences which are not sentences of $L$ and to study the structure of the grammatical sequences. The grammar of $L$ will thus be a device that generates all of the grammatical sequences of $L$ and none of the ungrammatical ones.

The general proposal here is, to some degree, analogous to filling out tax forms. A person’s actual financial situation is a collection of transactions, with money being received and dispensed at various points in time. In filling out a tax form, they need to deal with certain problems—net income, withholdings, and the like—which are the financial properties of the transactions, and ignore such things as whether the money was earned by clearing clogged plumbing or by managing a team of financial auditors. The financial situation is evaluated based on the tax laws, which define the basic concepts independent of any specific person’s financial situation. The “general theory of linguistic form” is roughly analogous to the pertinent tax laws, and the “grammar of a language $L$” plays a role similar to the information provided on a tax form. In the tax scenario, all of this description and analysis is performed relative to an actual set of financial transactions. The language $L$ is analogous to these transactions, in that it provides the material to be described and analyzed.

## Mathematical logic and biological foundations

June 5, 2008

Last week, we considered generative linguistics as a theory of the faculty of language, and identified four distinct scopes that can be encompassed by the term faculty of language. In order to be clear about these different meanings, I adopted the notations FLB and FLN which were proposed by Chomsky, Fitch, and Hauser in a pair of articles, and I introduced FLC and FLG to represent a similar division independently of the evolutionary history of the faculty of language. All of this presupposes a biolinguistic perspective, in which language is treated as a biologically-founded cognitive phenomenon rather than as a collection of observable sentences. This view is essentially synchronic, considering only the current state of generative theory. It is also instructive to look at the historical development of the theoretical framework in order to understand why there is a distinction between FLC and FLG within the theory.

The origins of generative linguistics are often traced to Chomsky’s Syntactic Structures (1957/2002) and The Logical Structure of Linguistic Theory (1975, written ca. 1956). The fundamental idea of generation, however, has a longer history in algebra and in symbolic logic, dating as far back as the end of the 19th century; Moore (1894), for example, defines a particular abstract group in terms of generators and generating relations; these relations generate all of the elements of the group from the generators . A more direct antecedent to Chomsky’s initial work on generative grammar was Emil Post’s work from ca. 1921, by way of Rosenbloom’s The Elements of Mathematical Logic (Chomsky 1975 p. 105 fn 1; Post 1943, p. 215 fn 18; Rosenbloom 1950, p. 206). Rosenbloom even proposed that “one might also expect that many concepts in linguistics which have resisted all attempts up to now at clear and general formulation may now be treated with the same lucidity and rigor which has made mathematics a model for other sciences. The wealth of detail and the manifold irregularities of natural languages have often obfuscated the simple general principles underlying linguistic phenomena” (1950, p. 163). Chomsky’s early works pursued precisely this direction.

Some recent claims notwithstanding, the original literature suggests that generative linguistics was not originally conceived as a theory of the faculty of language, but rather just as a theory of language as an abstract corpus of sentences. (I’ll have more to say on this point in a later post.) The initial steps towards a treatment of generative theory as a theory of the faculty of language were evidently taken within a decade of the publication of Syntactic Structures. By the mid-1960s, Chomsky was writing an appendix to Lenneberg’s The Biological Foundations of Language (1967), and had already formulated the separation between competence and performance. A clearer distinction was drawn between the notions of I-language and E-language by the mid-1980s, where E-language treats language “independently of the mind/brain” (Chomsky 1986, p. 20), and I-language “is some element of the mind of the person who knows the language, acquired by the learner, and used by the speaker-hearer” (Chomsky 1986, p. 22). Taking generative grammar then to be the study of this I-language, we have a clear claim that it is a theory of the faculty of language.

## Properties of multiset union

June 4, 2008

We defined the union of a family of multisets last week. Today we’re going to look at some basic properties of the union operation on $\mathbf{MSet}_X$.

Suppose $\left\{\mathcal{M}_i = \left(X, f_i\right)\right\}_{i\in I}$ is a family of multisets over $X$. Then the following relationships hold:

(i) $\mathrm{support}\left(\bigcup_{i\in I}{\mathcal{M}_i}\right) = \bigcup_{i\in I}{\mathrm{support}\left(\mathcal{M}_i\right)}$.

(ii) $\mathcal{M}_i \subseteq \bigcup_{i\in I}{\mathcal{M}_i}$ for all $i\in I$.

(iii) $\bigcup_{i\in I}{\mathcal{M}_i} \subseteq \biguplus_{i\in I}{\mathcal{M}_i}$.

(iv) $\bigcup_{i\in I}{\mathcal{M}_i} \subseteq \:\cdot\!\!\!\!\!\!\;\bigcup_{i\in I}{\mathcal{M}_i}$ provided that all of the multisets $\mathcal{M}_i$ have the same support.

(v) $\bigcap_{i\in I}{\mathcal{M}_i} \subseteq \bigcup_{i\in I}{\mathcal{M}_i}$.

The proof of part (i), like the similar result for multiset intersections, is a straightforward series of equivalencies:

$\begin{array}{r@{\:\Leftrightarrow\:}l} x\in \mathrm{support}\left(\bigcup_{i\in I}{\mathcal{M}_i}\right) & \sup f_i(x) \neq 0 \\ & \left(\exists i\in I\right)\: f_i\left(x\right)\neq 0 \\ & \left(\exists i\in I\right)\: x\in\mathrm{support}\left(\mathcal{M}_i\right) \\ & x\in\bigcup_{i\in I}{\mathrm{support}\left(\mathcal{M}_i\right)}. \end{array}$

Part (ii) follows directly from the definition of the supremum of a set, since $f_i\left(x\right) \leq \sup f_i\left(x\right)$ for all $x\in X$ and $i\in I$.

To prove part (iii), start by recalling that the sum of a family of cardinals $\left\{\kappa_i\right\}_{i\in I}$ is given by

$\sum_{i\in I}{\kappa_i} = \left|\bigcup_{i\in I}{A_i}\right|$

where the sets $A_i$ are disjoint and $\kappa_i = \left|A_i\right|$ for all $i\in I$. Therefore, letting $A_{i,x}$ be a family of disjoint sets such that $\left|A_{i,x}\right| = f_i\left(x\right)$, we have

$\kappa_x = \sum_{i\in I}{f_i\left(x\right)} = \left|\bigcup_{i\in I}{A_{i,x}}\right| \geq \left|A_{i,x}\right| = f_i\left(x\right)$

for all $i\in I$ and $x\in X$. This establishes that $\kappa_x$ is an upper bound for $f_i\left(x\right)$ in the linearly ordered set $\left(\mathbf{Card}, \leq\right)$, and so $\kappa_x \geq \sup f_i\left(x\right)$. The result in part (iii) follows immediately since this holds for all $x\in X$.

The proof of part (iv) is similar, but relies on the definition of the product of a family of cardinals. The product of a family of cardinals $\left\{\kappa_i\right\}_{i\in I}$ is given by

$\prod_{i\in I}{\kappa_i} = \left|\prod_{i\in I}{A_i}\right|$

where $\kappa_i = \left|A_i\right|$ for all $i\in I$. Letting $A_{i,x}$ be a family of sets such that $\left|A_{i,x}\right| = f_i\left(x\right)$, we have

$\kappa_x = \prod_{i\in I}{f_i\left(x\right)} = \left|\prod_{i\in I}{A_{i,x}}\right| \geq \left|A_{i,x}\right| = f_i\left(x\right)$

for all $i\in I$ and $x\in X$, provided that $A_{i,x} \neq \emptyset$ for all $i\in I$. This establishes that $\kappa_x$ is an upper bound for $f_i\left(x\right)$ in the linearly ordered set $\left(\mathbf{Card}, \leq\right)$ whenever $x\in\mathrm{support}\left(\bigcup_{i\in I}{\mathcal{M}_i}\right)$, which is the common support of all multisets in the family $\left\{\mathrm{M}_i\right\}_{i\in I}$. If $x$ is not in this support, then $\prod_{i\in I}{f_i\left(x\right)} = 0 = \sup f_i\left(x\right)$. In either case, $\kappa_x \geq \sup f_i\left(x\right)$, and part (iv) follows immediately since this holds for all $x\in X$.

The requirement that all multisets in the family have the same support is necessary; you can see this by considering the product and union of the multisets $\left\{a\right\}$ and $\left\{b\right\}$.

Finally, part (v) follows immediately from the fact that

$\inf f_i\left(x\right) \leq f_i\left(x\right) \leq \sup f_i\left(x\right)$

for all $i\in I$ and $x\in X$.

## Transitions: May → June

June 2, 2008

Wow… one month already. (Technically, slightly more than a month, since my first post was published on 21 April.) Pretty much everything on the site is new since that post, but a couple of things are worth mentioning.

I’ve added the first four links to the blogroll.

• God Plays Dice has been around since June 2007. Isabel Lugo’s posts range from the somewhat whimsical to self-contained discussions of mathematical topics. Always worth reading.
• The only linguistics site I’ve added so far is Language Log, which moved to a new server a couple months ago where the new material is posted. As a bonus, they’ve kept their older material available as well. And yes, this is plural they; several well-known linguists author posts there.
• John Armstrong started The Unapologetic Mathematician back in January 2007. Mathematical topics are explored in some depth over the course of several posts, much like I’m attempting to do. In fact, this is the site that showed me what can be done with a blog and inspired me to start writing here.
• Topological Musings is, for me, the most recent discovery out of these four additions. Vishal Lama started writing this in November 2007, and was recently joined by Todd Trimble. I really like the idea of their Problem of the Week series, and just wish my current schedule allowed me enough time to participate.

Bibliography and glossary.

I’ve added a bibliography (12 entries at the end of May) and a glossary (4 entries at the end of May) to the site. I’d really rather have these in a limited-authorship wiki associated with the blog, but I’m not set up to do that at the moment.

Site design.

As some early readers are aware, I’ve changed the theme to something more $\LaTeX$-friendly. With some of the themes available here, the $\LaTeX$ rendering process gives results that aren’t particularly legible. (Thanks for prompting me to take care of that, Vishal!) I’m still not exceptionally happy with the result (by default, the theme I’m using now puts a little bit of extra padding around each image, including rendered mathematics), so things may yet change again in the future when I have time to deal with the CSS.

One of the things I’d really like to do is distinguish visually between external links, links back to other posts here, and links to the bibliography and glossary. If I ever tackle the CSS project, that will be part of the design. In the meantime, I’ve started making sure that I put title text on every link. (In most browsers, you’ll be able to see it by letting your pointer hover over the link.) External links are now preceded by an arrow (→), and the text should make it obvious when a link leads to the bibliography or the glossary.

## Perspectives 6: Uniform surjective grading?

June 1, 2008

I’m somewhat disconcerted by this recent article that appeared in USA Today. I wouldn’t have noticed it, except for the fact that it was critiqued to some extent by Mark Chu-Carroll and various commenters over at Good Math, Bad Math. I might be somewhat happier if I hadn’t noticed it.

The proposal to impose a 50 point minimum grade for an assignment (on a 100 point scale) is problematic for a number of reasons. Some of these are practical, and can be seen clearly by considering some analogous scenarios. I’m fairly certain that it would be in some people’s interest to impose similar artificial bounds in other situations, but such a proposition would be a complete non-starter. Declaring an NHL player’s plus-minus rating to have a minimum value of 4 would clearly invalidate the descriptive value of the statistic. Declaring that employees can be given no worse than a “satisfactory” review would have interesting repercussions in the corporate environment, since it would effectively eliminate the ability to fire someone for non-performance of their job, or even incompetence.

Apparently, however, mathematical competence is not required in this arena. According to the article, the argument in favor of this artificial minimum is this:

Other letter grades — A, B, C and D — are broken down in increments of 10 from 60 to 100, but there is a 59-point spread between D and F, a gap that can often make it mathematically impossible for some failing students to ever catch up.

“It’s a classic mathematical dilemma: that the students have a six times greater chance of getting an F,” says Douglas Reeves, founder of The Leadership and Learning Center, a Colorado-based educational think tank who has written on the topic.

Last time I checked, the closed interval $S := \left[0,100\right]$ in the integers contains 101 elements, and a standard 90-80-70-60 scale allocates them in such a way that there is a 60 point spread between the minimum scores meriting an F and a D. Feel free to check my arithmetic, but even a hand calculator can usually get $60 - 0 = 60$ correct. Apparently correct arithmetic is not important in an argument supporting how grades should be computed.

I’m also astounded—perhaps I shouldn’t be, but I am—that the head of a think tank would propose that statement as a “classic mathematical dilemma”. It certainly appears to contain two classic mathematical errors, however: one being a fencepost error, and the other being an unsupported assumption that grades are uniformly distributed over $S$. If the grades obtained by students were distributed uniformly over $S$, then the expected value of student grades would be 50. So far as I am aware, neither high school graduation rates nor university retention and graduation rates support such a claim. Moreover, we should expect strictly more values corresponding to an A than to any of the grades B, C, or D on a given assignment or exam, since it contains one more element of $S$ than the others do. Would anyone care to provide the empirical evidence justifying such a result?

As unsettling as these errors are, I find the sidebar to be more disquieting. The examples it provides for calculating grades do not correspond to the general assumption in the article that a letter grade of F is recorded, and then considered to be 0 at some later point. Rather, it shows what happens when we have recorded the scores on a 100-point scale, and then compute the mean with the actual score or with a false minimum of 50. In these comparisons, the recorded failing grade is not in general a zero.

Let’s take a slightly closer look at what is happening here. There are two grading scales in use here: the closed interval $S=\left[0,100\right]$ of integer scores, and the set $L=\left\{\mathrm{A,B,C,D,F}\right\}$ of letter grades. The 90-80-70-60 convention establishes a surjective mapping $\varphi: S\to L$ that preserves the standard order on each of these sets. However, $\varphi$ is not injective, so we cannot define a consistent arithmetic on $L$ in terms of $\varphi^{-1}$. The argument provided in support of this 50-minimum grading scale is, in effect, an argument about how a representative of the preimage $\varphi^{-1}\left(x\right)$ should be selected for each $x\in L$.

Unfortunately, this argument does not go through if we already have scores recorded. We are not free to select any element of the preimage. At least in the sciences, there’s a term for such an operation: it’s called falsification of data. Viewed this way, the implications of the grading policies being adopted by some school districts are, at best, disturbing.

## FoundAround 2008-22

May 31, 2008

Some miscellanea encountered this week:

• Reductio ad grammarium. D. C. Simpson’s Ozy and Millie has been on my reading list for several years now. In Wednesday’s installment, Grammar Nazi, Avery uses the overworn substitution of an $X$ Nazi for someone who takes a dictatorial attitude towards $X$. Millie recasts this by taking the word Nazi in a literal, historical sense and transferring it to a world of grammar: Grammar Nazis can invade places like Grammar Czechoslovakia and Grammar Poland. So, when we dispose of prescriptivist attitudes and a priori theoretical commitments in linguistics, have we moved to Grammar Switzerland?
• Attitude. Over at Language Log, the post Evervate, disconnect, revolt by (the apparently pseudonymous) Melvyn Quince bemoans the injection of “business-school jargon” in inappropriate places. Quince’s annoyance is directed at the three-word admonition “• innovate • connect • achieve” inflicted on him as a slogan for a linguistics conference he attended. As things go, however, it’s relatively innocuous. I have a mug in my kitchen that says, simply:
Attitude
is everything

(Yes, with the maroon on black and a cursive typeface to visually convey “attitude”.) Also innocuous. Except, of course, when given as a “bonus” to a demoralized team of software developers on a 100-hour-per-week death-march project. The resounding response—perhaps not surprisingly?—was best expressed by one of the developers as “Yeah, attitude is everything… and mine sucks!

• Just “a little differently”. Internet Explorer has been the source of frustration for many web developers. It is absolutely notorious for its broken behavior when rendering web pages. (An IRS auditor would have a field day with a tax return that was as compliant with tax laws as IE is with web standards.) The language that’s usually used when talking candidly about how to get IE to behave itself… well, I wouldn’t repeat it around my 4-year-old. And then, in the wordpress.com forums, boblets presented me with one of the single best examples of understatement I have ever seen. It seems that “IE tend[s] to …erm…. translate code a little diffrently than the others.” Hee…

## “The” faculty of language

May 30, 2008

When we talked about the specialist’s view of linguistics, I mentioned that the scientific study of language can be approached from a variety of standpoints. Generative linguistics, in its contemporary form, assumes from the outset that there is a “species property, close to uniform across a broad range” (Chomsky 2004, p. 104) that is responsible for the human capacity for language. This faculty of language is “more or less on a par with the systems of mammalian vision, insect navigation, and others” (Chomsky 2005, p. 2). This point of view is often referred to as biolinguistics.

Broadly construed, the human faculty of language is a cognitive system, realized by the brain, that enables the production and consumption of language. Modern generative linguistics is generally conceived of as a theory of the faculty of language, or at least some portion thereof. A more precise characterization would be that generative linguistics is a family of theories of a portion of the faculty of language; theories in this family share some basic assumptions, have a variety of characteristics in common with one another, and partake of a common intellectual tradition.

The distinction between the faculty of language and what we can observe as spoken and written language is often expressed as a distinction between internal language (I-language) and external language (E-language). Intuitively, we might expect that a theory of internal language, being the cognitive component that enables language production and consumption, should provide the underpinnings of a theory of external language, which is the observable result of that cognitive function. However, there is a gap between the two.

The notion of internalized language is taken to be a “‘notion of structure’ in the mind of the speaker ‘which is definite enough to guide him in framing sentences of his own’” (Chomsky 1986, pp. 21-22, citing Otto Jespersen). The cognitive processes that lie between this “notion of structure” and the externally observable phenomena of language are not represented in the division between internal and external language. “The standard assumption in linguistics,” suggests Lyle Jenkins, “has always been that the theory of the language faculty must be embedded in a real-time theory of speech synthesis, perception, parsing, and the like in accordance with the modularity viewpoint” (2000, p. 71). The language faculty to which he refers here is already a relatively constrained conception, corresponding to the notion of I-language, and excluding a number of cognitive functions that must occur in the production and consumption of observable language.

This gap was part of the subject of discussion in a 2002 article by Hauser, Chomsky, and Fitch. In this article, they distinguish between broad and narrow senses of the term “faculty of language”. The broad sense of the faculty of language (FLB) “includes an internal computational system (FLN, below) combined with at least two other organism-internal systems, which we call ‘sensory-motor’ and ‘conceptual-intentional’” (pp. 1570-1571). Further, the narrow sense of the faculty of language (FLN) is “the abstract linguistic computational system alone, independent of the other systems with which it interacts and interfaces” (p. 1571). This would be a useful distinction, if it were not for later discussion claiming instead that “the contents of FLN are to be empirically determined, and could possibly be empty, if empirical findings showed that none of the mechanisms involved are uniquely human or unique to language, and that only the way they are integrated is specific to human language” (Fitch, Hauser, and Chomsky, 2005, p. 181).

When we look at any specific theory of generative grammar, we find that the gap between the internal and external views of language will continue to exist, independent of the status of any evolutionary arguments regarding homologues in other species or the evolutionary purpose of an adaptation. In deference to the 2005 clarifications, I will allow FLB and FLN denote the distinctions related to biological homologues and evolutionary purpose. I will further distinguish between the generative faculty of language (FLG), which is the constrained sense of “faculty of language” (I-language) referenced by Jenkins, and the cognitive faculty of language (FLC), consisting of all of the cognitive processes realized by the brain that enter into language production and consumption.

## Mode of inquiry, object of inquiry

May 29, 2008

My linguistics posts the past few weeks have dealt with linguistics in very general terms. The purpose of the Mathematics and linguistics posts has been to outline a specific mode of inquiry within theoretical linguistics: the examination of the mathematical properties of a proposed theory. This mode of inquiry is fairly agnostic about specific theoretical details, and is very much in line with Pierce’s contention that mathematics is “the judge over both [induction and hypothesis], and it is the arbiter to which each must refer its claims” (1881, p. 97). Before we can proceed, however, we need to look at some actual linguistic theory. As with any active branch of scientific inquiry, there are multiple theories that researchers are actively pursuing. At least for the time being, I’m going to focus on generative grammar.

Generative grammar is not a single theory, but rather a family of theories that share a number of common assumptions. Historically, there are three main periods in the development of generative grammar. The first of these saw the development of theories of transformational grammar, the second introduced the principles and parameters framework, and the most recent period focuses on minimalist grammars. The intellectual roots of generative grammar go back further, drawing on mathematical logic and adopting Post’s (1943) notion of productions. Since at least the mid-1970s, there has been a growing trend to consider generative grammar within a biolinguistic context. My goal for the next few linguistics posts is to look at this historical development in more detail, and identify some of the common assumptions that are made in generative theories of language.

## Multiset union

May 28, 2008

Over the past two weeks, we’ve introduced three operations on families of multisets: sums, products, and intersections. The other basic operation on families of multisets, unions, is today’s topic. As we did with the other operations, we’ll revisit the characteristic function and then adapt this to the context of multisets.

The union of two sets $A, B\in \wp\left(X\right)$ can, as we’ve already seen, be represented in terms of the characteristic function as

$C=A\cup B \:\Leftrightarrow\: \chi_C = \max\left(\chi_A, \chi_B\right)$

where the maximum is taken in the ordered field $\mathbb{F}_2$ (just as we took the minimum in this field when looking at the intersection). This maximum is, in fact, a supremum:

$\sup\left\{\chi_A\left(x\right), \chi_B\left(x\right)\right\}$ for each $x\in X$.

For any family of functions $f_i: X\to Y$, we let

$\begin{array}{lrl}\sup \left\{f_i\right\} := &g:& X\to Y \\ &&x\mapsto \sup \left\{f_i\left(x\right)\right\} \end{array}$

provided that the supremum exists for each $x\in X$. Since $\mathbb{F}_2$ is finite—which ensures the existence of the suprema—and the characteristic functions are taken over a common domain, the representation of union in terms of characteristic functions extends to any family of subsets of $X$. That is, for any family $\left\{A_i\right\}_{i\in I}$ where each $A_i \in \wp\left(X\right)$, we have

$A = \bigcup_{i\in I}{A_i} \:\Leftrightarrow\: \chi_A = \sup\left\{\chi_{A_i}\right\}$.

This is completely analogous to the situation we encountered with the intersection of a family of subsets of $X$, including the avoidance of the algebraic properties of $\mathbb{F}_2$.

With the intersection, we were able to construct a definition on $\mathbf{MSet}_X$ that was analogous to the definition on $\mathbf{SSet}_X$ because both codomains—$\mathbb{F}_2$ and $\mathbf{Card}$—were well-ordered, so the necessary infima were guaranteed to exist. In $\mathbb{F}_2$, we know that the suprema exist because the set is linearly ordered and finite. $\mathbf{Card}$ is linearly ordered, but not finite. This raises a question: does every set of cardinals have a supremum?

The answer to this question depends on the set theory in which one is working. In our case, the axiom of choice allows us to actually define cardinals in terms of ordinals; in particular, cardinals are defined to be the initial ordinals. An ordinal is an initial ordinal if it is not equinumerous with any smaller ordinal. Moreover, every set of ordinals has a supremum. Since cardinals are ordinals, any set $\mathcal{K}$ of cardinals has an ordinal supremum $\omega$. The cardinal $\kappa$ which is equinumerous with $\omega$ is the cardinal supremum of $\mathcal{K}$.

With this in hand, we’re ready to define the union of a family of multisets.

Definition

Let $\mathrm{M} = \left\{\mathcal{M}_i = \left(X, f_i\right)\right\}_{i\in I}$ be a family of multisets over a set $X$. The multiset union of $\mathrm{M}$ is the set

$\mathcal{M} = \bigcup_{i\in I}{\mathcal{M}_i} := \left(X, \sup\left\{f_i\right\}\right)$.

The multiplicity functions $f_i$ are defined on a common domain, and every set of cardinals has a supremum. This ensures that the multiset union is well-defined on $\mathbf{MSet}_X$. We’ll deal with the properties of this operation in my next post on multisets.