Abstract corpora (part 1)

June 6, 2008

In The Logical Structure of Linguistic Theory (written around 1956, although not published until 1975), Chomsky outlined a theory of linguistic form, and suggested from the beginning that “we will try to show how an abstract theory of linguistic structure can be developed within a framework that admits of operational interpretation, and how such a theory can lead to a practical mechanical procedure by which, given a corpus of linguistic material, various proposed grammars can be compared and the best of them selected” (Chomsky 1975, p. 61). In order for such a mechanical procedure to be used, it would be necessary to present an actual collection of linguistic material—utterances recorded in some suitable form—on which it could operate. A grammar, in this context, is construed as a theory (Chomsky 1975, p. 63):

By “the grammar of a language L” we mean that theory of L that attempts to deal with such problems as [projection, ambiguity, sentence type, etc.] wholly in terms of the formal properties of utterances. And by “the general theory of linguistic form” we mean the abstract theory in which the basic concepts of grammar are developed, and by means of which each proposed grammar can be evaluated.

The relationship between a language L and a grammar of L, in early generative theory, is conceived of as follows (Chomsky 1957, p. 13):

From now on I will consider a language to be a set (finite or infinite) of sentences, each finite in length and constructed out of a finite set of elements. … The fundamental aim in the linguistic analysis of a language L is to separate the grammatical sequences which are the sentences of L from the ungrammatical sequences which are not sentences of L and to study the structure of the grammatical sequences. The grammar of L will thus be a device that generates all of the grammatical sequences of L and none of the ungrammatical ones.

The general proposal here is, to some degree, analogous to filling out tax forms. A person’s actual financial situation is a collection of transactions, with money being received and dispensed at various points in time. In filling out a tax form, they need to deal with certain problems—net income, withholdings, and the like—which are the financial properties of the transactions, and ignore such things as whether the money was earned by clearing clogged plumbing or by managing a team of financial auditors. The financial situation is evaluated based on the tax laws, which define the basic concepts independent of any specific person’s financial situation. The “general theory of linguistic form” is roughly analogous to the pertinent tax laws, and the “grammar of a language L” plays a role similar to the information provided on a tax form. In the tax scenario, all of this description and analysis is performed relative to an actual set of financial transactions. The language L is analogous to these transactions, in that it provides the material to be described and analyzed.
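Returning to the characterization of a grammar as "a device that generates all of the grammatical sequences of L and none of the ungrammatical ones", a minimal sketch may help make the idea concrete. Everything in it is an assumption made for illustration: the rewrite rules, the five-word vocabulary, and the names `RULES`, `generate`, and `is_grammatical` are invented here and are not taken from Chomsky's text. The toy language is deliberately finite, so the whole of it can be enumerated.

```python
from itertools import product

# Hypothetical toy grammar: each symbol maps to its possible expansions.
# Any symbol that has no entry here is treated as a word of the language.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V"], ["V", "NP"]],
    "N": [["dog"], ["cat"]],
    "V": [["sleeps"], ["sees"]],
}

def generate(symbol="S"):
    """Yield every word sequence that can be derived from `symbol`."""
    if symbol not in RULES:
        # A word: it derives only itself.
        yield (symbol,)
        return
    for expansion in RULES[symbol]:
        # Derive each part of the expansion, then combine the alternatives.
        parts = [list(generate(part)) for part in expansion]
        for combination in product(*parts):
            yield tuple(word for part in combination for word in part)

# The language L, in the sense quoted above: the set of generated sequences.
LANGUAGE = set(generate())

def is_grammatical(sentence):
    """A sequence is grammatical exactly when the grammar generates it."""
    return tuple(sentence.split()) in LANGUAGE
```

In this sketch the grammar, rather than any list of recorded utterances, decides membership: `is_grammatical("the dog sees the cat")` holds, while `is_grammatical("dog the sees")` does not. A grammar with recursive rules would generate an infinite set, and this exhaustive enumeration would no longer terminate.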

Copyright © 2008 Michael L. McCliment.

Mathematical logic and biological foundations

June 5, 2008

Last week, we considered generative linguistics as a theory of the faculty of language, and identified four distinct scopes that can be encompassed by the term faculty of language. In order to be clear about these different meanings, I adopted the notations FLB and FLN which were proposed by Chomsky, Fitch, and Hauser in a pair of articles, and I introduced FLC and FLG to represent a similar division independently of the evolutionary history of the faculty of language. All of this presupposes a biolinguistic perspective, in which language is treated as a biologically-founded cognitive phenomenon rather than as a collection of observable sentences. This view is essentially synchronic, considering only the current state of generative theory. It is also instructive to look at the historical development of the theoretical framework in order to understand why there is a distinction between FLC and FLG within the theory.

The origins of generative linguistics are often traced to Chomsky’s Syntactic Structures (1957/2002) and The Logical Structure of Linguistic Theory (1975, written ca. 1956). The fundamental idea of generation, however, has a longer history in algebra and in symbolic logic, dating as far back as the end of the 19th century. Moore (1894), for example, defines a particular abstract group in terms of generators and generating relations; these relations generate all of the elements of the group from the generators. A more direct antecedent to Chomsky’s initial work on generative grammar was Emil Post’s work from ca. 1921, by way of Rosenbloom’s The Elements of Mathematical Logic (Chomsky 1975 p. 105 fn 1; Post 1943, p. 215 fn 18; Rosenbloom 1950, p. 206). Rosenbloom even proposed that “one might also expect that many concepts in linguistics which have resisted all attempts up to now at clear and general formulation may now be treated with the same lucidity and rigor which has made mathematics a model for other sciences. The wealth of detail and the manifold irregularities of natural languages have often obfuscated the simple general principles underlying linguistic phenomena” (1950, p. 163). Chomsky’s early works pursued precisely this direction.
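The algebraic sense of generation can be illustrated with a short sketch. It is not Moore's construction: he worked with an abstract group given by generators and relations, whereas the sketch assumes the generators are concrete permutations (written as tuples) and simply closes them under composition. The names `compose` and `generate_group` are invented for the example.

```python
def compose(p, q):
    """Return the permutation 'apply q, then p'; position i maps to p[q[i]]."""
    return tuple(p[i] for i in q)

def generate_group(generators):
    """Collect every permutation obtainable by composing the generators."""
    identity = tuple(range(len(generators[0])))
    elements = {identity}
    frontier = [identity]
    while frontier:
        current = frontier.pop()
        for generator in generators:
            candidate = compose(generator, current)
            if candidate not in elements:
                # A new element: record it and extend from it later.
                elements.add(candidate)
                frontier.append(candidate)
    return elements

# Two assumed generators acting on three positions.
swap = (1, 0, 2)    # exchange the first two positions
cycle = (1, 2, 0)   # rotate all three positions
```

Starting from `swap` and `cycle` alone, `generate_group([swap, cycle])` produces all six arrangements of three positions, while `generate_group([cycle])` produces only three. A small finite specification determining a larger set of objects is the same idea that a generative grammar applies to sentences.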

Some recent claims notwithstanding, the original literature suggests that generative linguistics was not originally conceived as a theory of the faculty of language, but rather just as a theory of language as an abstract corpus of sentences. (I’ll have more to say on this point in a later post.) The initial steps towards a treatment of generative theory as a theory of the faculty of language were evidently taken within a decade of the publication of Syntactic Structures. By the mid-1960s, Chomsky was writing an appendix to Lenneberg’s The Biological Foundations of Language (1967), and had already formulated the separation between competence and performance. A clearer distinction was drawn between the notions of I-language and E-language by the mid-1980s, where E-language treats language “independently of the mind/brain” (Chomsky 1986, p. 20), and I-language “is some element of the mind of the person who knows the language, acquired by the learner, and used by the speaker-hearer” (Chomsky 1986, p. 22). Taking generative grammar then to be the study of this I-language, we have a clear claim that it is a theory of the faculty of language.


“The” faculty of language

May 30, 2008

When we talked about the specialist’s view of linguistics, I mentioned that the scientific study of language can be approached from a variety of standpoints. Generative linguistics, in its contemporary form, assumes from the outset that there is a “species property, close to uniform across a broad range” (Chomsky 2004, p. 104) that is responsible for the human capacity for language. This faculty of language is “more or less on a par with the systems of mammalian vision, insect navigation, and others” (Chomsky 2005, p. 2). This point of view is often referred to as biolinguistics.

Broadly construed, the human faculty of language is a cognitive system, realized by the brain, that enables the production and consumption of language. Modern generative linguistics is generally conceived of as a theory of the faculty of language, or at least some portion thereof. A more precise characterization would be that generative linguistics is a family of theories of a portion of the faculty of language; theories in this family share some basic assumptions, have a variety of characteristics in common with one another, and partake of a common intellectual tradition.

The distinction between the faculty of language and what we can observe as spoken and written language is often expressed as a distinction between internal language (I-language) and external language (E-language). Intuitively, we might expect that a theory of internal language, being the cognitive component that enables language production and consumption, should provide the underpinnings of a theory of external language, which is the observable result of that cognitive function. However, there is a gap between the two.

The notion of internalized language is taken to be a “‘notion of structure’ in the mind of the speaker ‘which is definite enough to guide him in framing sentences of his own’” (Chomsky 1986, pp. 21-22, citing Otto Jespersen). The cognitive processes that lie between this “notion of structure” and the externally observable phenomena of language are not represented in the division between internal and external language. “The standard assumption in linguistics,” suggests Lyle Jenkins, “has always been that the theory of the language faculty must be embedded in a real-time theory of speech synthesis, perception, parsing, and the like in accordance with the modularity viewpoint” (2000, p. 71). The language faculty to which he refers here is already a relatively constrained conception, corresponding to the notion of I-language, and excluding a number of cognitive functions that must occur in the production and consumption of observable language.

This gap was part of the subject of discussion in a 2002 article by Hauser, Chomsky, and Fitch. In this article, they distinguish between broad and narrow senses of the term “faculty of language”. The broad sense of the faculty of language (FLB) “includes an internal computational system (FLN, below) combined with at least two other organism-internal systems, which we call ‘sensory-motor’ and ‘conceptual-intentional’” (pp. 1570-1571). Further, the narrow sense of the faculty of language (FLN) is “the abstract linguistic computational system alone, independent of the other systems with which it interacts and interfaces” (p. 1571). This would be a useful distinction, if it were not for later discussion claiming instead that “the contents of FLN are to be empirically determined, and could possibly be empty, if empirical findings showed that none of the mechanisms involved are uniquely human or unique to language, and that only the way they are integrated is specific to human language” (Fitch, Hauser, and Chomsky, 2005, p. 181).

When we look at any specific theory of generative grammar, we find that the gap between the internal and external views of language will continue to exist, independent of the status of any evolutionary arguments regarding homologues in other species or the evolutionary purpose of an adaptation. In deference to the 2005 clarifications, I will let FLB and FLN denote the distinctions related to biological homologues and evolutionary purpose. I will further distinguish between the generative faculty of language (FLG), which is the constrained sense of “faculty of language” (I-language) referenced by Jenkins, and the cognitive faculty of language (FLC), consisting of all of the cognitive processes realized by the brain that enter into language production and consumption.


Mode of inquiry, object of inquiry

May 29, 2008

My linguistics posts over the past few weeks have dealt with linguistics in very general terms. The purpose of the Mathematics and linguistics posts has been to outline a specific mode of inquiry within theoretical linguistics: the examination of the mathematical properties of a proposed theory. This mode of inquiry is fairly agnostic about specific theoretical details, and is very much in line with Peirce’s contention that mathematics is “the judge over both [induction and hypothesis], and it is the arbiter to which each must refer its claims” (1881, p. 97). Before we can proceed, however, we need to look at some actual linguistic theory. As with any active branch of scientific inquiry, there are multiple theories that researchers are actively pursuing. At least for the time being, I’m going to focus on generative grammar.

Generative grammar is not a single theory, but rather a family of theories that share a number of common assumptions. Historically, there are three main periods in the development of generative grammar. The first of these saw the development of theories of transformational grammar, the second introduced the principles and parameters framework, and the most recent period focuses on minimalist grammars. The intellectual roots of generative grammar go back further, drawing on mathematical logic and adopting Post’s (1943) notion of productions. Since at least the mid-1970s, there has been a growing trend to consider generative grammar within a biolinguistic context. My goal for the next few linguistics posts is to look at this historical development in more detail, and identify some of the common assumptions that are made in generative theories of language.


Mathematics and linguistics (part 4)

May 23, 2008

In part 1, I discussed the non-specialist’s experience with mathematics and with linguistics, and suggested that their experience is, in both cases, essentially prescriptivist in nature. Before discussing the relationship between these fields, we needed to move beyond the non-specialist’s perceptions and to understand more about the actual scope of each of these fields. I offered some observations about mathematics in part 2, and addressed linguistics in part 3. In this final part, we’ll look at the relationship between the two fields in light of the preceding discussion.

Now that we have a better understanding of what mathematics and linguistics are, let’s reconsider the question of how they are related to one another. Our initial view was that the two fields were predominantly parallel, and would interact where their respective objects of study happened to coincide (if anywhere). We represented this aspect schematically:

Fields of inquiry and their objects of study
Field of study   Practitioners    Objects studied
linguistics      linguists        language
mathematics      mathematicians   space, number, quantity, and arrangement

Linguistics, as we have now seen, isn’t just an arbitrary study of language. In many cases, philosophers may be said to study language. I’m not convinced that literary criticism can ever avoid studying language. But neither of these is inherently linguistic in nature. (They may be approached from a linguistic perspective; for example, one can apply pragmatics to the study of literature. But one can also study literature without using pragmatics, or any other part of linguistics.) Linguistics is the scientific study of language as a principal phenomenon.

Linguistics naturally studies the patterns that arise in languages. Even linguists who strongly reject any notion of an underlying rule-governed cognitive system propose that there are patterns in the languages that they study. Whether we adopt a rule-governed framework or not, the role of a linguistic theory is to propose that there are certain patterns—ranging from fully systematic “laws” through to weak “tendencies”—that arise in the use of language. The process of drawing conclusions from these basic propositions, and hence the entire act of moving from philosophical opinion to empirical science, is inherently an act of mathematics. Language is not mathematics. Linguistic theorization is not mathematics, but uses mathematics as its tool of reasoning. The evaluation of linguistic theories, however, is intrinsically mathematical. Has the theorist constructed a consistent theory? Do the theorist’s predictions follow from the patterns abstracted from their observations? Are there other predictions that follow from the proposed theory that also need to be evaluated empirically? These questions cannot be answered by the linguistic theory, which takes some aspect of language as its object of study, since these questions naturally take the linguistic theory itself as the object of study.

When physicists propose specific theories, these theories are evaluated not only for agreement with the empirical data and for compatibility with generally known physical properties, but also for their mathematical properties. If the theory proposes a set of relationships among the observed patterns that is inconsistent, or if there is no way to construct any object satisfying the properties proposed by the theory, then that theory is rejected. The situation in linguistics is analogous. Just as the study of physical theories is part of physics, the study of linguistic theories is just as much a part of linguistics as the study of languages is.

This relationship, in which mathematics is the instrument by which we analyze linguistic theories, is the more fundamental relationship that I hinted at from the outset. It comes with an immediate corollary: mathematical modeling is necessarily a valid research methodology in linguistics. It does not replace empirical studies or the myriad research methodologies associated with them; it complements such studies. Both aspects are important for the health of linguistics as a science. For the formal evaluation of linguistic theory, however, mathematical modeling may well be the only valid methodology.
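As a very small illustration of what formal evaluation can look like, the sketch below treats a "theory" as a set of constraints over yes/no properties and searches for any object that satisfies all of them. The property names and the constraints are invented for the example and make no claim about real languages; `find_model` is likewise a hypothetical helper, and exhaustive search is only practical for a handful of properties.

```python
from itertools import product

def find_model(variables, constraints):
    """Return one assignment satisfying every constraint, or None if none exists."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(constraint(assignment) for constraint in constraints):
            return assignment
    return None

# Hypothetical properties of a language, chosen only for illustration.
variables = ["verb_final", "has_prepositions"]

# A toy theory: a language has at least one of the properties, but not both.
consistent_theory = [
    lambda a: a["verb_final"] or a["has_prepositions"],
    lambda a: not (a["verb_final"] and a["has_prepositions"]),
]

# Adding the claim that the two properties always agree leaves no possible object.
inconsistent_theory = consistent_theory + [
    lambda a: a["verb_final"] == a["has_prepositions"],
]
```

Here `find_model(variables, consistent_theory)` returns an assignment, whereas `find_model(variables, inconsistent_theory)` returns `None`. No further observation of languages is needed to reject the second theory, because no object could satisfy it.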


Mathematics and linguistics (part 3)

May 16, 2008

In part 1, I discussed the non-specialist’s experience with both mathematics and linguistics, and suggested that their experience is, in both cases, essentially prescriptivist in nature. Before discussing the relationship between these fields, we must move beyond the non-specialist’s perceptions and understand more about the actual scope of each of these fields. In part 2, I offered some observations about mathematics. In this part, I’ll address the question of linguistics.

To start with, I’ll turn once again to my trusty OED. As a substantive, we find that linguistic is “the science of languages; philology”, and is almost always used in the plural form linguistics (linguistic, sense B.a, b). Turning to philology, we find that the pertinent sense of the word is

3. spec. (in mod. use) The study of the structure and development of language; the science of language; linguistics. Now usu. restricted to the study of the development of specific languages or language families, esp. research into phonological and morphological history based on written documents. (Really one branch of sense 1.)

This sense has never been current in the U. S. Linguistics is now the more usual term for the study of the structure of language, and, with qualifying adjective or adjective phrase, is replacing philology even in the restricted sense.

For the non-specialist, this doesn’t add too much beyond what we experienced in terms of grammar, and possibly pronunciation drills in a foreign language course (courtesy of something that looks, after all, like it’s related to the “Hooked on Phonics” products). And, of course, there’s that odd term, morphological.

Mark Liberman recently provided a wordbite that does a much better job at suggesting the range of phenomena studied by linguists:

The usual division is into six levels [of linguistic analysis], named as pragmatics (how language is used to communicate), semantics (the meaning of words and phrases), syntax (the structure of sentences), morphology (the structure of words), phonology (the inventory of sounds and their systematic arrangement into words), and phonetics (the physical facts of speech).

An important point here is that linguistics is the scientific study of these phenomena. A linguist will not provide a list like the following (from Richard Johnson-Sheehan’s Technical Communication Today, 2nd ed., p. 218):

Eight Guidelines for Plain Sentences

  • Guideline 1: The subject of the sentence should be what the sentence is about.
  • Guideline 2: The subject should be the “doer” in the sentence.
  • Guideline 3: The verb should state the action, or what the doer is doing.
  • Guideline 4: The subject of the sentence should come early in the sentence.
  • Guideline 5: Eliminate nominalizations.
  • Guideline 6: Avoid excessive prepositional phrases.
  • Guideline 7: Eliminate redundancy in sentences.
  • Guideline 8: Write sentences that are “breathing length.”

People who study and teach the art of communication may well provide well-reasoned arguments in favor of following some or all of these guidelines under certain circumstances. However, no such argument is part of linguistics, any more than a rule that says “Do not stop on tracks” is part of physics. It’s a perfectly good traffic guideline, and well-supported by an argument that a train hitting a car generally has negative consequences. The reason that it isn’t physics is that there’s nothing about the physical nature of cars and railroad tracks that prevents a car from stopping in such close proximity to the tracks. The reason that the eight guidelines aren’t linguistics is that they are not a property of language, but rather a recommendation on how we use a specific language.

Linguistics, unlike writing advisors or publication style manuals, is the scientific study of language. The linguist doesn’t ask “what should people do?” (that’s rhetoric), but rather “what do people do?” and “how do people do what they do?” These questions can be approached from several different standpoints. For example, we might look at a corpus (a collection of written or spoken language that has been gathered from various sources) and examine the phenomena that appear in that corpus. We can also treat corpora as being just a sample of the actual and/or possible range of language use, and study some aspects of the larger collection from which the sample is drawn.

The corpora-based approaches, whether drawn from real corpora or some form of virtual corpus, are often aggregates over some group of people. A fundamentally different approach is to study how a specific person uses language. One form that this type of research takes is to correlate specific linguistic tasks with imaging studies of the brain. Functional magnetic resonance imaging (fMRI) studies are one of the current tools in this vein. Evaluation and study of people with impaired language capabilities also lend themselves to this type of individual study.

The scientific tools that are deployed in linguistics are quite varied. Some of the biological investigations use highly technical instruments like the imaging systems that collect data for the fMRI studies. Other types of investigation, such as how language use correlates with sociological factors, may use surveys and interviews as their primary tools. But no matter what research tools are used, linguistic questions are always fundamentally concerned with what actually happens when people create and consume language.


Mathematics and linguistics (part 2)

May 9, 2008

In part 1, I discussed the non-specialist’s experience with both mathematics and linguistics, and suggested that their experience is, in both cases, essentially prescriptivist in nature. Before discussing the relationship between these fields, we must move beyond the non-specialist’s perceptions and understand more about the actual scope of each of these fields. In this part, I’ll address the question of mathematics.

The definition of mathematics offered by the OED (which I discussed here) proposes that modern mathematics is “the science of space, number, quantity, and arrangement, whose methods involve logical reasoning and usually the use of symbolic notation, and which includes geometry, arithmetic, algebra, and analysis; mathematical operations or calculations.” The Merriam-Webster dictionary proposes the following definition:

1: the science of numbers and their operations, interrelations, combinations, generalizations, and abstractions and of space configurations and their structure, measurement, transformations, and generalizations

2: a branch of, operation in, or use of mathematics

Definitions like these are common, but don’t really convey a sense of what mathematics is. Saunders Mac Lane opened the first chapter of Mathematics: Form and Function (1986) with the following statement (p. 6):

Mathematics, at the beginning, is sometimes described as the science of Number and Space—better, of Number, Time, Space, and Motion.

A somewhat different idea of the scope of mathematics had already emerged before the start of the 20th century. A relatively well-known quote is Benjamin Peirce’s comment that mathematics is “the science that draws necessary conclusions” (Google reports more than 2,500 hits for this exact phrase). This is the opening sentence of his Linear Associative Algebra, published posthumously in 1881. He then expands on this conception of mathematics:

This definition of mathematics is wider than that which is ordinarily given, and by which its range is limited to quantitative research. The ordinary definition, like those of other sciences, is objective; whereas this is subjective. Recent investigations, of which quaternions is the most noteworthy instance, make it manifest that the old definition is too restricted. The sphere of mathematics is here extended, in accordance with the derivation of its name, to all demonstrative research, so as to include all knowledge strictly capable of dogmatic teaching. Mathematics is not the discoverer of laws, for it is not induction; neither is it the framer of theories, for it is not hypothesis; but it is the judge over both, and it is the arbiter to which each must refer its claims; and neither law can rule nor theory explain without the sanction of mathematics. It deduces from a law all its consequences, and develops them into the suitable form for comparison with observation, and thereby measures the strength of the argument from observation in favor of a proposed law or of a proposed form of application of a law.

This conception of the scope of mathematics is much broader than one would expect from either the typical dictionary definitions or from the non-specialist’s experience of mathematics. For that matter, so is Mac Lane’s conception. In Mathematics: Form and Function, Mac Lane examines mathematics from several points of view, outlining and critiquing several schools of thought about its nature. He considers logicism, set theory, platonism, formalism, intuitionism, constructivism, finitism, and empiricism, all of which have been put forward as philosophical foundations for mathematics, and finds all of them wanting (p. 456):

Each of these philosophies illuminates a relevant aspect of Mathematics, but none of them is remotely adequate as a description or foundation of the actual extensive network of Mathematics. Instead, our study has revealed Mathematics as an array of forms, codifying ideas extracted from human activities and scientific problems and deployed in a network of formal rules, formal definitions, formal axiom systems, explicit theorems with their careful proof and the manifold interconnections of these forms. More briefly, Mathematics aims to understand, to manipulate, to develop, and to apply those aspects of the universe which are formal.

The manipulation of numbers and geometric figures, and the establishment of their properties, certainly falls within the scope of mathematics as conceived of by these authors. However, mathematics is not limited to such considerations. Graph theory, for example, is unconcerned with the nature of the vertices of a graph. Graphs are used in modeling any set of (binary) relationships—whether that be shipping routes, network connections, inheritance relations in an object-oriented software system, dependencies in a project plan, or interspecies predator-prey relationships. Graph theory focuses on the existence of relationships between elements of a set, and systematically develops our understanding of the consequences that follow just from the existence of those relationships.
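The point that graph theory is indifferent to what the vertices are can be shown in a few lines. The sketch below is an illustration under stated assumptions: the function name `reachable` and both small data sets are invented, and breadth-first search stands in for any of the many results that depend only on which relationships exist.

```python
from collections import deque

def reachable(edges, start):
    """Return every vertex reachable from `start` along directed edges."""
    successors = {}
    for source, target in edges:
        successors.setdefault(source, []).append(target)
    seen = {start}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for neighbor in successors.get(vertex, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

# Two hypothetical relations with very different subject matter.
shipping_routes = [("Lisbon", "Boston"), ("Boston", "Halifax")]
predator_prey = [("hawk", "snake"), ("snake", "mouse")]
```

The same function answers "which ports can cargo from Lisbon reach?" and "which species lie below the hawk in this food chain?", because it consults only the pairs themselves and never the nature of what is being paired.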

Eugene Wigner once wrote an article called The Unreasonable Effectiveness of Mathematics in the Natural Sciences. If we consider mathematics as a science of numbers, his assertion that mathematics is “unreasonably” effective appears to make sense. Once we discard this too-narrow idea of mathematics, the effectiveness of mathematics in scientific inquiry should no longer be a surprise. Mathematics is an effective tool in natural scientific inquiry precisely because the sciences are concerned with the systematic aspects of the phenomena they investigate. With this conception in hand, I suspect that the basis for judging mathematics to be unreasonably effective reduces to an a priori belief that the universe shouldn’t display any systematic properties.

Scientific inquiry into any set of phenomena presupposes that there is some degree of systematic behavior in what we observe. The scope of contemporary mathematics, as suggested by the perspectives offered above by Peirce and Mac Lane, can be considered as the systematic study of what it means to have a systematic behavior: in short, what Lynn Arthur Steen has called “the science of patterns”.
