“The” faculty of language

May 30, 2008

When we talked about the specialist’s view of linguistics, I mentioned that the scientific study of language can be approached from a variety of standpoints. Generative linguistics, in its contemporary form, assumes from the outset that there is a “species property, close to uniform across a broad range” (Chomsky 2004, p. 104) that is responsible for the human capacity for language. This faculty of language is “more or less on a par with the systems of mammalian vision, insect navigation, and others” (Chomsky 2005, p. 2). This point of view is often referred to as biolinguistics.

Broadly construed, the human faculty of language is a cognitive system, realized by the brain, that enables the production and consumption of language. Modern generative linguistics is generally conceived of as a theory of the faculty of language, or at least some portion thereof. A more precise characterization would be that generative linguistics is a family of theories of a portion of the faculty of language; theories in this family share some basic assumptions, have a variety of characteristics in common with one another, and partake of a common intellectual tradition.

The distinction between the faculty of language and what we can observe as spoken and written language is often expressed as a distinction between internal language (I-language) and external language (E-language). Intuitively, we might expect that a theory of internal language (the cognitive component that enables language production and consumption) should provide the underpinnings of a theory of external language (the observable result of that cognitive function). However, there is a gap between the two.

The notion of internalized language is taken to be a “‘notion of structure’ in the mind of the speaker ‘which is definite enough to guide him in framing sentences of his own’” (Chomsky 1986, pp. 21-22, citing Otto Jespersen). The cognitive processes that lie between this “notion of structure” and the externally observable phenomena of language are not represented in the division between internal and external language. “The standard assumption in linguistics,” suggests Lyle Jenkins, “has always been that the theory of the language faculty must be embedded in a real-time theory of speech synthesis, perception, parsing, and the like in accordance with the modularity viewpoint” (2000, p. 71). The language faculty to which he refers here is already a relatively constrained conception, corresponding to the notion of I-language, and excluding a number of cognitive functions that must occur in the production and consumption of observable language.

This gap was part of the subject of discussion in a 2002 article by Hauser, Chomsky, and Fitch. In this article, they distinguish between broad and narrow senses of the term “faculty of language”. The broad sense of the faculty of language (FLB) “includes an internal computational system (FLN, below) combined with at least two other organism-internal systems, which we call ‘sensory-motor’ and ‘conceptual-intentional’” (pp. 1570-1571). Further, the narrow sense of the faculty of language (FLN) is “the abstract linguistic computational system alone, independent of the other systems with which it interacts and interfaces” (p. 1571). This would be a useful distinction, if it were not for later discussion claiming instead that “the contents of FLN are to be empirically determined, and could possibly be empty, if empirical findings showed that none of the mechanisms involved are uniquely human or unique to language, and that only the way they are integrated is specific to human language” (Fitch, Hauser, and Chomsky, 2005, p. 181).

When we look at any specific theory of generative grammar, we find that the gap between the internal and external views of language will continue to exist, independent of the status of any evolutionary arguments regarding homologues in other species or the evolutionary purpose of an adaptation. In deference to the 2005 clarifications, I will let FLB and FLN denote the distinctions related to biological homologues and evolutionary purpose. I will further distinguish between the generative faculty of language (FLG), which is the constrained sense of “faculty of language” (I-language) referenced by Jenkins, and the cognitive faculty of language (FLC), consisting of all of the cognitive processes realized by the brain that enter into language production and consumption.

Copyright © 2008 Michael L. McCliment.

Mode of inquiry, object of inquiry

May 29, 2008

My linguistics posts the past few weeks have dealt with linguistics in very general terms. The purpose of the Mathematics and linguistics posts has been to outline a specific mode of inquiry within theoretical linguistics: the examination of the mathematical properties of a proposed theory. This mode of inquiry is fairly agnostic about specific theoretical details, and is very much in line with Peirce’s contention that mathematics is “the judge over both [induction and hypothesis], and it is the arbiter to which each must refer its claims” (1881, p. 97). Before we can proceed, however, we need to look at some actual linguistic theory. As with any active branch of scientific inquiry, there are multiple theories that researchers are actively pursuing. At least for the time being, I’m going to focus on generative grammar.

Generative grammar is not a single theory, but rather a family of theories that share a number of common assumptions. Historically, there are three main periods in the development of generative grammar. The first of these saw the development of theories of transformational grammar, the second introduced the principles and parameters framework, and the most recent period focuses on minimalist grammars. The intellectual roots of generative grammar go back further, drawing on mathematical logic and adopting Post’s (1943) notion of productions. Since at least the mid-1970s, there has been a growing trend to consider generative grammar within a biolinguistic context. My goal for the next few linguistics posts is to look at this historical development in more detail, and identify some of the common assumptions that are made in generative theories of language.

Multiset union

May 28, 2008

Over the past two weeks, we’ve introduced three operations on families of multisets: sums, products, and intersections. The other basic operation on families of multisets, unions, is today’s topic. As we did with the other operations, we’ll revisit the characteristic function and then adapt this to the context of multisets.

The union of two sets A, B\in \wp\left(X\right) can, as we’ve already seen, be represented in terms of the characteristic function as

C=A\cup B \:\Leftrightarrow\: \chi_C = \max\left(\chi_A, \chi_B\right)

where the maximum is taken in the ordered field \mathbb{F}_2 (just as we took the minimum in this field when looking at the intersection). This maximum is, in fact, a supremum:

\sup\left\{\chi_A\left(x\right), \chi_B\left(x\right)\right\} for each x\in X.

For any family of functions f_i: X\to Y, we let

\begin{array}{lrl}\sup \left\{f_i\right\} := &g:& X\to Y \\ &&x\mapsto \sup \left\{f_i\left(x\right)\right\} \end{array}

provided that the supremum exists for each x\in X. Since \mathbb{F}_2 is finite—which ensures the existence of the suprema—and the characteristic functions are taken over a common domain, the representation of union in terms of characteristic functions extends to any family of subsets of X. That is, for any family \left\{A_i\right\}_{i\in I} where each A_i \in \wp\left(X\right), we have

A = \bigcup_{i\in I}{A_i} \:\Leftrightarrow\: \chi_A = \sup\left\{\chi_{A_i}\right\}.

This is completely analogous to the situation we encountered with the intersection of a family of subsets of X, including the avoidance of the algebraic properties of \mathbb{F}_2.
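For a finite X, this correspondence can be checked directly. What follows is only a minimal sketch in Python, assuming that X is a small finite set and that each characteristic function is stored as a dict from X to {0, 1}. The helper names `chi` and `pointwise_sup` are invented for the illustration and do not come from the earlier posts.

```python
def chi(subset, universe):
    """Characteristic function of `subset` within `universe`, as a dict x -> 0 or 1."""
    return {x: 1 if x in subset else 0 for x in universe}


def pointwise_sup(functions, universe):
    """Pointwise supremum of a finite, nonempty family of functions on `universe`."""
    return {x: max(f[x] for f in functions) for x in universe}


X = {1, 2, 3, 4, 5}
family = [{1, 2}, {2, 3}, {5}]

# The ordinary set union of the family...
union = set().union(*family)
# ...and the supremum of the characteristic functions.
sup_chi = pointwise_sup([chi(A, X) for A in family], X)

agrees = sup_chi == chi(union, X)
print(agrees)  # True
```

Only comparisons of the values 0 and 1 are used here; no addition or multiplication in \mathbb{F}_2 is needed.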

With the intersection, we were able to construct a definition on \mathbf{MSet}_X that was analogous to the definition on \mathbf{SSet}_X because both codomains—\mathbb{F}_2 and \mathbf{Card}—were well-ordered, so the necessary infima were guaranteed to exist. In \mathbb{F}_2, we know that the suprema exist because the set is linearly ordered and finite. \mathbf{Card} is linearly ordered, but not finite. This raises a question: does every set of cardinals have a supremum?

The answer to this question depends on the set theory in which one is working. In our case, the axiom of choice allows us to actually define cardinals in terms of ordinals; in particular, cardinals are defined to be the initial ordinals. An ordinal is an initial ordinal if it is not equinumerous with any smaller ordinal. Moreover, every set of ordinals has a supremum. Since cardinals are ordinals, any set \mathcal{K} of cardinals has an ordinal supremum \alpha. The cardinal \kappa which is equinumerous with \alpha is the cardinal supremum of \mathcal{K}. (In fact, the supremum of a set of initial ordinals is itself an initial ordinal, so \kappa = \alpha.)

With this in hand, we’re ready to define the union of a family of multisets.


Let \mathrm{M} = \left\{\mathcal{M}_i = \left(X, f_i\right)\right\}_{i\in I} be a family of multisets over a set X. The multiset union of \mathrm{M} is the multiset

\mathcal{M} = \bigcup_{i\in I}{\mathcal{M}_i} := \left(X, \sup\left\{f_i\right\}\right).

The multiplicity functions f_i are defined on a common domain, and every set of cardinals has a supremum. This ensures that the multiset union is well-defined on \mathbf{MSet}_X. We’ll deal with the properties of this operation in my next post on multisets.
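For finite multiplicities, the definition can be tried out with Python's `collections.Counter`, whose `|` operator takes the pointwise maximum of counts. This is a sketch under stated assumptions rather than a faithful model of \mathbf{MSet}_X: it handles only finite families and finite multiplicities, an element missing from a `Counter` is read as having multiplicity 0, and the function name `multiset_union` is my own.

```python
from collections import Counter
from functools import reduce
from operator import or_


def multiset_union(family):
    """Union of a finite family of finite multisets: pointwise maximum of multiplicities."""
    # Counter() is the empty multiset, so it is a safe starting value for the fold.
    return reduce(or_, family, Counter())


M1 = Counter({"a": 2, "b": 1})
M2 = Counter({"a": 1, "c": 3})
M3 = Counter({"b": 4})

U = multiset_union([M1, M2, M3])
print(U == Counter({"a": 2, "b": 4, "c": 3}))  # True
```

Each element ends up with the largest multiplicity it has in any member of the family, which is what the supremum reduces to when the family is finite.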

Properties of multiset intersection

May 27, 2008

Today, we’ll examine the properties of the multiset intersection that we defined yesterday.

Suppose \left\{\mathcal{M}_i = \left(X, f_i\right)\right\}_{i\in I} is a family of multisets over X. Then the following relationships hold:

(i) \mathrm{support}\left(\bigcap_{i\in I}{\mathcal{M}_i}\right) = \bigcap_{i\in I}{\mathrm{support}\left(\mathcal{M}_i\right)}.

(ii) \bigcap_{i\in I}{\mathcal{M}_i} \subseteq \mathcal{M}_i for all i\in I.

(iii) \bigcap_{i\in I}{\mathcal{M}_i} \subseteq \biguplus_{i\in I}{\mathcal{M}_i}.

(iv) \bigcap_{i\in I}{\mathcal{M}_i} \subseteq \:\cdot\!\!\!\!\!\!\;\bigcup_{i\in I}{\mathcal{M}_i}.

The proof of part (i) is a straightforward series of equivalencies:

\begin{array}{r@{\:\Leftrightarrow\:}l} x\in \mathrm{support}\left(\bigcap_{i\in I}{\mathcal{M}_i}\right) & \inf f_i(x) \neq 0 \\ & \left(\forall i\in I\right)\: f_i\left(x\right)\neq 0 \\ & \left(\forall i\in I\right)\: x\in\mathrm{support}\left(\mathcal{M}_i\right) \\ & x\in\bigcap_{i\in I}{\mathrm{support}\left(\mathcal{M}_i\right)}. \end{array}

Part (ii) follows directly from the definition of the infimum of a set, since \inf f_i\left(x\right) \leq f_i\left(x\right) for all x\in X and i\in I.

Recalling that the cardinal sum is a monotonic nondecreasing operation, we see that

\inf f_i\left(x\right) \leq f_i\left(x\right) \leq \sum_{i\in I}{f_i\left(x\right)}

for all x\in X, which proves part (iii).

Part (iv) requires only slightly more work. To begin with, consider a family \left\{\kappa_i\right\}_{i\in I} of cardinals. If there exists some i\in I such that \kappa_i = 0, then \prod_{i\in I}{\kappa_i} = 0. If, however, no such i\in I exists, then

\kappa_i\leq\prod_{i\in I}{\kappa_i} for all i\in I.

(This relies on the axiom of choice, which we have been assuming from the outset.) In other words, the product of any family of nonzero cardinals is monotonic nondecreasing.

Let x\in\bigcap_{i\in I}{\mathcal{M}_i}. If there exists some i\in I such that x\not\in\mathcal{M}_i, then the multiplicity of x in the intersection would be 0, contradicting the fact that x is a member of the intersection. Hence f_i\left(x\right)\neq 0 for all i\in I, and the inequality for families of nonzero cardinals gives

\inf f_i\left(x\right) \leq f_i\left(x\right) \leq \prod_{i\in I}{f_i\left(x\right)}.

For x\not\in\bigcap_{i\in I}{\mathcal{M}_i}, we have

\inf f_i\left(x\right) = 0 = \prod_{i\in I}{f_i\left(x\right)}.

In either case, \inf f_i\left(x\right) \leq \prod_{i\in I}{f_i\left(x\right)} for all x\in X, and (iv) holds.
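The four relationships can be spot-checked on a small example. The sketch below assumes a finite family with finite multiplicities over a four-element X, so it illustrates the statements without touching the infinite cases the proofs cover. The helper names `pointwise`, `support`, and `submultiset` are hypothetical.

```python
from collections import Counter
from math import prod

X = ["a", "b", "c", "d"]
family = [
    Counter({"a": 2, "b": 1, "c": 3}),
    Counter({"a": 1, "b": 2}),
    Counter({"a": 4, "b": 1, "d": 5}),
]


def pointwise(op, family, universe):
    """Apply `op` to the multiplicities of each element across the family."""
    return {x: op(f[x] for f in family) for x in universe}


def support(f):
    """Elements with nonzero multiplicity."""
    return {x for x, n in f.items() if n != 0}


def submultiset(f, g, universe):
    """True when f(x) <= g(x) for every x in the universe."""
    return all(f[x] <= g[x] for x in universe)


meet = pointwise(min, family, X)      # multiset intersection
total = pointwise(sum, family, X)     # multiset sum
product = pointwise(prod, family, X)  # multiset product

prop_i = support(meet) == set.intersection(*(support(f) for f in family))
prop_ii = all(submultiset(meet, f, X) for f in family)
prop_iii = submultiset(meet, total, X)
prop_iv = submultiset(meet, product, X)
print(prop_i, prop_ii, prop_iii, prop_iv)  # True True True True
```

The elements c and d exercise the second case in the proof of (iv): each has multiplicity 0 in some member of the family, so both the infimum and the product are 0.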

Multiset intersection

May 26, 2008

Last Monday, I mentioned that multiset products are not the best available way to extend the intersection of a family of sets to multisets. Now that we’ve talked about the concepts of infima and well-ordering, we’re ready to define multiset intersections.

A bit further back, we discussed how characteristic functions can be used to represent subsets of a set X and the operations on \wp\left(X\right). At the time, we noted that the intersection of two sets A, B\in \wp\left(X\right) is represented in terms of the characteristic function as

C=A\cap B \:\Leftrightarrow\: \chi_C = \min\left(\chi_A,\chi_B\right)

where the minimum is taken in the ordered field \mathbb{F}_2. For each x\in X, this minimum is just \inf\left\{\chi_A\left(x\right), \chi_B\left(x\right)\right\}.

Given a family of functions f_i: X\to Y, we let

\begin{array}{lrl}\inf \left\{f_i\right\} := &g:& X\to Y \\ &&x\mapsto \inf \left\{f_i\left(x\right)\right\} \end{array}.

Since \mathbb{F}_2 is well-ordered and the characteristic functions are taken over a common domain, our representation of intersection in terms of characteristic functions extends to any family of subsets of X. That is, for any family \left\{A_i\right\}_{i\in I} where each A_i \in \wp\left(X\right), we have

A = \bigcap_{i\in I}{A_i} \:\Leftrightarrow\: \chi_A = \inf\left\{\chi_{A_i}\right\}.

We also saw a representation of intersections as the product of the characteristic functions. When dealing with sets in \wp\left(X\right), the two representations correspond to the same objects and operations in the class \mathbf{SSet}_X. However, there is an important difference between the two: the representation in terms of products relies (exclusively) on the algebraic properties of the codomain, while the representation in terms of infima relies (exclusively) on the order properties of the codomain.
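A quick numerical check makes the difference concrete. This is a rough sketch in Python, assuming that ordinary integer arithmetic stands in for both codomains: on the values 0 and 1, integer multiplication coincides with multiplication in \mathbb{F}_2, and finite multiplicities stand in for cardinals.

```python
from itertools import product

# On {0, 1}, the product and the minimum agree for every pair of values.
agree_on_bits = all(a * b == min(a, b) for a, b in product((0, 1), repeat=2))

# Once multiplicities can exceed 1, the two representations come apart.
m, n = 2, 3
by_product = m * n      # 6
by_infimum = min(m, n)  # 2
print(agree_on_bits, by_product, by_infimum)  # True 6 2
```

An element that occurs twice in one multiset and three times in another should occur twice in their intersection, which is the value the infimum gives and the product does not.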

The algebraic properties of the field \mathbb{F}_2 and arithmetic on \mathbf{Card} are quite different. However, both of them are well-ordered classes, so their order properties are similar in many respects. This leads us to the following definition:


Let \mathrm{M} = \left\{\mathcal{M}_i = \left(X, f_i\right)\right\}_{i\in I} be a family of multisets over a set X. The multiset intersection of \mathrm{M} is the multiset

\mathcal{M} = \bigcap_{i\in I}{\mathcal{M}_i} := \left(X, \inf\left\{f_i\right\}\right).

The multiplicity functions f_i are defined on a common domain, and \mathbf{Card} is well-ordered by the usual relation \leq. Just as we found with the characteristic function, these facts are sufficient to ensure that the multiset intersection is always well-defined on \mathbf{MSet}_X.

As usual, when \mathrm{M} contains only two multisets, we will use the infix notation \mathcal{M}_1 \cap \mathcal{M}_2 for the intersection. In this case, \cap is an associative and commutative binary operation on \mathbf{MSet}_X. Next time, we’ll look at some of the properties of multiset intersections.
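For finite multiplicities, the definition and the two algebraic claims can be tried out with Python's `collections.Counter`, whose `&` operator takes the pointwise minimum of counts. The sketch assumes a finite, nonempty family and reads an element missing from a `Counter` as having multiplicity 0. The function name `multiset_intersection` is my own.

```python
from collections import Counter
from functools import reduce
from operator import and_


def multiset_intersection(family):
    """Intersection of a finite, nonempty family of finite multisets."""
    return reduce(and_, family)


M1 = Counter({"a": 2, "b": 1, "c": 3})
M2 = Counter({"a": 1, "b": 2})
M3 = Counter({"a": 4, "b": 1, "d": 5})

meet = multiset_intersection([M1, M2, M3])
print(meet == Counter({"a": 1, "b": 1}))  # True

# Spot-check associativity and commutativity of the binary operation.
associative = (M1 & M2) & M3 == M1 & (M2 & M3)
commutative = M1 & M2 == M2 & M1
print(associative, commutative)  # True True
```

The fold over the family is only legitimate because the binary operation is associative and commutative, so the order in which the members are combined does not matter.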

Perspectives 5: “Teaching to the test”

May 25, 2008
Someone, anyone, whoever…please explain to me why it is wrong to teach to a test.

—Jon-Paul @ American Age,
Teaching to the Test!

Whether it is “wrong” to teach to a test is, to a large degree, a value judgment; it depends fundamentally on what goals we are striving to achieve. I don’t agree with the assertion that it is “eminently sound pedagogy”, but again this may well depend on our objectives. I won’t attempt to explain why it is wrong to teach to a test, but I will explain why I judge it to be wrong.

Jon-Paul points out that

There is a distinct difference between teaching to the broad body of skills and knowledge that a test represents (good), and teaching to the exact items that will appear on the standardized test (indefensible and illegal).

The distinction he makes is important. I have observed a few instances where an instructor for a university course answered a student’s question with a response like “don’t worry about that right now, there’s no question on the test about it.” Jon-Paul’s conception of “good” teaching avoids this type of teaching to the test. Even with this conception, however, I still have two objections to such teaching. One objection is the use of the preposition to and what it implies about goals and achievement. The other concerns the relationship between a body of skills and knowledge on the one hand, and a test on the other.

The preposition to

The complete entry for to in the OED runs to about 8 pages, which I won’t reproduce here. The principal senses of to (A) are:

I. Expressing a spatial or local relation.
II. Expressing a relation in time.
III. Expressing the relation of purpose, destination, result, effect, resulting condition or status.
IV. Followed by a word or phrase expressing a limit in extent, amount, or degree.
V. Indicating addition, attachment, accompaniment, appurtenance, possession.
VI. Expressing relation to a standard or to a stated term or point.
VII. Expressing relations in which the sense of direction tends to blend with the dative.
VIII. Supplying the place of the dative in various other languages and in the earlier stages of English itself.

For those not familiar with it, the dative is essentially a way of indicating that the noun phrase is an indirect object. Across the board, to conveys a sense of limit. It suggests that just enough be taught to reach that limit.

Here’s a situation that is quite similar: preparing to give a presentation. When we present something (in a business meeting, at a conference, in a classroom, or wherever), we will be evaluated on the performance of a single act, just as students are evaluated on a single act of taking a test. When I prepare to give such a presentation, I don’t gather “just enough” information to put the presentation together. No good presenter does. Why not? There are several reasons to gather more information than we strictly need for a presentation, including being able to respond to questions that may be asked. But more importantly, this extra information actively helps us to master the material we will be presenting.

In science and engineering, we constantly look at the boundary conditions on any system we study, not just the interior behavior. We examine the boundary conditions because they help us to understand the purpose and behavior of the system. In software development, understanding the boundaries of a software system is often the key to understanding why certain pieces of it are the way they are. I’ve found the same to be true with any system of knowledge we may care to describe. I suspect that teaching “to” a test may well predispose us to teaching prescriptive rules rather than helping students to achieve understanding or mastery of a subject.

The relationship between tests and a body of knowledge

Jon-Paul’s comment suggested that good instructors will be “teaching to the broad body of skills and knowledge that a test represents”, and construes this to be equivalent to “teaching to the test”. Not only are the two not equivalent, but the skills and knowledge that a test represents do not cover the range of skills and knowledge that should be acquired by a student.

Tests have a very specific function: they serve as a measurement instrument to help teachers and other education professionals to assess a student’s level of mastery of a subject. This has a couple of implications. The first of these is a general principle that the measurement instrument should not influence what is being measured, because it compromises the measurements made; in this case, it undermines the purported value of testing and, in particular, of standardized tests. A second implication is that, as an assessment instrument, it has been designed to focus on certain aspects of some portion of what it seeks to measure. Moreover, tests are notoriously limited to those aspects of a subject that are “testable”. Both the design focus and the criterion of testability limit what can be represented by a test.

The reason that I find teaching to the test so particularly wrong is that, in my assessment, it nurtures a culture of underachievement, restricts the development of a student’s ability to think critically, fosters an inability for people to connect their knowledge to any real application of that knowledge, and disrupts the only defensible reason for inflicting tests on people in the first place. Tests need to reflect the subject matter that is being taught, but cannot mediate between the subject matter and the pedagogic choices about its presentation. As I said at the outset, however, it all depends on the goals one wants to achieve.

FoundAround 2008-21

May 24, 2008

Some miscellanea encountered this week:

  • Personal Digital Mathematics Assistant. John Armstrong posted an image of an “ACME integrating pistol” over at The Unapologetic Mathematician, calling it the “perfect math gadget”. But maybe we could improve on it, have it do more than just integration? Just imagine how much better results we would see on the standardized tests if students could come in with an ACME MathZapper™. It might even let us “fix” the “all children held behind” mess (to borrow a phrase from Keith Devlin). “Here you go, Johnny. It’s your first government-issued Personal Digital Mathematics Assistant, complete with a preinstalled symbolic math package.” Who needs another browser war when we can have a winner-take-all brawl among Magma, Maple, Mathcad, and Mathematica?
  • Blog brains. In a thread on the WordPress.com forums, I exchanged a series of posts with Jim Sizemore of Doodlemeister’s Weblog. After a brief foray into the world of HTML and CSS, he introduced the absolutely wonderful expression “blog brains” to refer to the technologies in question. I doubt I’ll ever look at tag soup in quite the same way again.
  • Floating pennies (redux). Last month, I wrote about the perils of floating-point representations of currencies. Now there’s a related bit of discussion of sums in Excel over at Walking Randomly, based on a post at Office Watch. The main example uses—yep, you guessed it—the same two-place decimal representations that make floating point numbers not work well for currencies. It gets more interesting when one of the commenters implies that people using Excel for financial purposes shouldn’t have to worry about rounding errors. This just isn’t true, since accountants multiply by non-integer rates and they split money among multiple parties all the time—both of which introduce rounding error when getting back to pennies. Just try to divide a dollar evenly among three people. I’m sympathetic to the general sentiment, but neither rational representations nor packed decimal representations solve the currency problems (and weak data typing really just exacerbates the issues). *Sniff* *Sniff* Do I smell another Perspectives post baking?
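The point about dividing a dollar can be made concrete with a few lines of Python. This is only an illustrative sketch: the `split_cents` helper and its rule of handing leftover cents to the first parties are assumptions of mine, not a recommendation from either of the posts mentioned above.

```python
from decimal import Decimal

# Binary floating point cannot represent most two-place decimals exactly.
float_sum_is_exact = (0.10 + 0.20 == 0.30)  # False

# A decimal type stores 0.10 exactly, but a three-way split of a dollar
# still does not fit in whole cents: rounding each share loses a penny.
share = (Decimal("1.00") / 3).quantize(Decimal("0.01"))  # Decimal('0.33')
shortfall = Decimal("1.00") - 3 * share                  # Decimal('0.01')


def split_cents(total_cents, parties):
    """Split an integer number of cents, handing leftover cents to the first parties."""
    base, extra = divmod(total_cents, parties)
    return [base + 1 if i < extra else base for i in range(parties)]


shares = split_cents(100, 3)
print(float_sum_is_exact, share, shortfall, shares)
```

Whatever representation is used, someone has to decide who gets the odd penny; the data type alone cannot settle that.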
