CHOMSKY, RUSSELL AND PLATONIC PROPOSITIONS
Chomsky’s claim that humans are born with an innate mental lexicon which, as a matter of empirical fact, yields analytic connections seems to commit him to the existence of propositions. In this blog I will evaluate that claim. I will argue that though his linguistic theory does not commit him to the existence of Platonic propositions, he does nonetheless endorse the existence of a certain type of proposition which Quine would find objectionable.
I will begin by outlining the notion of a Platonic proposition which was endorsed at the turn of the twentieth century by Bertrand Russell. Russell first conceived of a proposition as the non-linguistic entity that a sentence is about. So, for example, if a person says in English that ‘The Twin Towers were bombed on Sept 11th 2001’, and another person says the same thing in French, they are both asserting the same true proposition. A further example will make the point clear: if on September the 10th 2001 Osama Bin Laden said ‘The twin towers will be bombed on September the 11th 2001’ and if I say today ‘The twin towers were bombed on September the 11th 2001’, we are taking different attitudes towards the same proposition. Propositions, like sentences, have constituents which are related to each other in certain ways. According to Russell circa 1903 the constituents of propositions do not designate the constituents of states of affairs; rather they encompass them. The proposition ‘Socrates is in the field’ contains ‘Socrates’, the relation ‘is in’ and ‘the field’. It is important to note that the objects that enter into a proposition are neither true nor false; it is only the proposition which is true or false. According to the early Russell, in order for a person to grasp a proposition they need to be acquainted with its constituents.
The two conditions which Russell put on something being a proposition are that (1) the structure of the sentence must mirror the structure of the proposition and (2) we must be acquainted with each of the constituents of the proposition. There are obvious difficulties with condition (2). Consider propositions such as ‘all men are mortal’ or ‘all prime numbers have a successor’: we can grasp the truth of these propositions; however, we are not acquainted with ‘all men’ or ‘all prime numbers’. To meet this difficulty Russell postulated entities called Denoting Concepts; we could be acquainted with these concepts, and they could denote entities which fell beyond our ken. The price which Russell had to pay for the postulation of Denoting Concepts was that it complicated his theory of propositions, which now consisted of entities we are acquainted with (objects) and Denoting Concepts, which we are also acquainted with and which refer to the entities we are not directly acquainted with, such as ‘all men’, ‘infinite prime number’ etc. As is well known, in his famous Gray’s Elegy argument Russell showed that his postulation of Denoting Concepts led to more trouble than it was worth. Having cleaned this mess up, Russell was able to show that we are not acquainted with Denoting Concepts; rather, a proposition is a complex of objects arranged in a certain relation. He showed that the grammatical form of a sentence misleads us and hides the logical form of the sentence. Consider the sentence ‘Socrates is Mortal’. From a grammatical point of view this sentence would seem to have two constituents, the subject ‘Socrates’ and the predicate ‘is Mortal’; however, on Russell’s analysis it consists of an existential quantifier and a function (name) which satisfies the unsaturated predicate. This method of analysis has the advantage that it helps us assign truth or falsehood to sentences with names that don’t refer, by analysing the names as disguised definite descriptions.
So for Russell the logical form of a sentence represents the unambiguous proposition which the sentence is ambiguously trying to assert. This picture gives philosophy the job of discovering which sentences are true propositions and which are not; prompting the early Wittgenstein to state ‘all philosophy is a critique of language’.
Now Quine famously claimed that one could benefit from Russell’s logic, and explain truth and falsehood (along Tarski’s lines), without admitting things such as propositions into our ontology. His indeterminacy of translation argument was explicitly designed to show that propositions are non-explanatory and have poor identity conditions, and hence should not be admitted into our ontology.
When analysing whether Chomsky is committed to the existence of propositions it is important to note that Chomsky’s primary concern is not with the epistemological issues which interested Russell. Furthermore, he is not interested in mapping word-world relations. On his view language can only fruitfully be understood if a certain amount of idealisation occurs and the domain of research is limited. For example, if a person is interested in studying the structure of the foot, they will not exhaustively try to map every piece of interaction between the foot and the world; rather they will treat the foot as a biological aspect of the person and proceed to investigate its internal structure. Now for Chomsky language is just a biological aspect of the human species like any other, and hence should be studied in the same manner as every other aspect of the human species. In order to study language we need to determine what the main features of language are, and then construct an abstract model to explain how those features are possible. The first central feature of language is that it is compositional: the meaning of a sentence is determined by the meaning of its constituents and the rules used to combine them. Second, a language involves the ability to express an infinite number of utterances using finite means. Third, humans born into any normal human environment will begin to speak at certain predictable times, and require very little explicit instruction to do so. The converse situation exists with other primates, who even with explicit teaching can acquire only extremely rudimentary linguistic abilities. A fourth fact about language which needs to be explained is that humans brought up in different environments speak different languages.
These are the basic facts of language that Chomsky sets out to explain in his theory of generative grammar, and Chomsky nowhere claims that Platonic Propositions are helpful in addressing these questions; in fact he has in numerous different places criticised people such as Katz and Postal for holding that language is an abstract object.
Before considering Chomsky’s position I will first consider a possible way that Platonic propositions could be used to explain the above four facts about language. The compositionality requirement could be explained by the fact that the grammatical form of sentences mirrors the logical form of propositions. So consider the sentence ‘Socrates is bald’. This sentence could mirror a proposition which contained the Particular Socrates and two Universals, the relation ‘is’ and the attribute ‘bald’. Or it could mirror a proposition consisting of the Particular, the subject Socrates, and the Universal, the predicate ‘is bald’. The two seemingly contradictory facts, that humans are able to speak without explicit instruction and yet speak different languages when born in different environments, can also be handled by postulating propositions. When an English speaker says ‘Socrates is bald’, and a Frenchman utters its equivalent, they are both expressing the same proposition but using different sounds to do so. And the Platonic doctrine that we are born with innate knowledge could be used to explain why we do not need explicit instruction to speak natural languages: humans, it could be claimed, are born with rational intuition and hence learn how to discover propositions in the world and represent them using arbitrary symbols. All that now needs to be explained is our ability to use finite means to express infinite ideas. This ability is the real sticking point. When Russell wanted to explain our ability to grasp propositions such as ‘all prime numbers have a successor’ he was led to postulate ‘Denoting Concepts’ which we were acquainted with, and which denoted the said numbers. So instead of being acquainted with an infinite number of propositions about prime numbers we are acquainted with the Denoting Concepts.
An obvious difficulty with the above explanation is the fact that if one accepts it, one can no longer accept the above view that grammatical sentences mirror the logical structure of propositions. One is left with logical form which is hidden by the grammatical form of a sentence. This being the case one no longer has an explanation of the compositionality of sentences.
We need to explain why sentences are structured the way that they are. If they don’t mirror the logical form of propositions, one could claim that the structures we adopt are arbitrary conventions, and that these conventions as a matter of contingent fact do not mirror the logical structure of propositions. However, if one accepts this conventionalist view one is committed to the view that language is learned, and one must then be able to sketch a plausible learning-theory account of language acquisition. The above discussion indicates that Platonic propositions cannot explain all of the important facts about language, though they do seem to offer some kind of explanation of at least some of them. Furthermore, as the later Wittgenstein recognised, not all linguistic usage involves expressing propositions; he used the metaphor of language being a tool rather than a picture of reality, and for the most part he was correct. The fact is that linguistic usage is inherently novel, and is not constrained to merely the picturing of facts. So much more than Platonic propositions is needed to explain linguistic ability.
Chomsky’s explanation of the four core features of language involves a shift towards internalistic explanation rather than the externalistic explanations which are common in philosophy. Firstly, he agrees with Plato that to explain our linguistic ability one needs to postulate that the person comes to the learning situation with innate apparatus. If people are all born with the same innate ability to speak language, it follows that a person born in France and a person born in England could both be saying the exact same thing while sounding different. This is not because they both grasp the same proposition which they use different sounds to express; rather they are both speaking the same sentence, which is structured differently because of parametric variations that are fixed depending on the type of data one is exposed to in one’s childhood environment. The fact that we can express infinite ideas using finite means can be explained by making the rules which underlie this recursion explicit. Likewise the fact of compositionality can be explained in terms of innate features of the lexicon.
Chomsky argues for a language faculty with his poverty of stimulus argument. I have criticised this argument in an earlier blog and will not repeat the material here. I will now proceed to show the nature of the language faculty Chomsky postulates, and demonstrate that it in no way relies on the notion of a Platonic proposition. Chomsky argues that this innate apparatus is genetically programmed and wired into our brains, and that it determines how we speak and understand language. Two of the most obvious features of language which all linguists recognise are, first, that we can understand and speak an infinite number of sentences despite the fact that we have only finite resources available in our brains, and second, that sentences have a fixed structure: words can only go together in a certain order and still be grammatical. Chomsky tried to capture these features in his pioneering work Syntactic Structures, where he made these facts explicit. His aim was to enumerate certain principles, to predict from them how certain sentences should go together, and to test sentences to see if these predictions are accurate. According to Chomsky people’s grammatical intuitions are the raw data we have to work with, and the principles and parameters approach is the explanatory theory we construct to explain the data. Chomsky’s first insight into language is that it is not just a string of words which we learn inductively; rather we automatically group words into what he calls phrases. He argues for this position by constructing mathematical models of how language could be learned inductively as a string of words, and then proving that such models cannot account for language. He called the model a Finite State Grammar. A Finite State Grammar consists of the following five features: (1) An initial state.
(2) A finite number of states. (3) A specification of transitions from one state to another. (4) A specification of a symbol that will be printed when a particular transition obtains. (5) A final state. (Lasnik: SSR p 12). In Syntactic Structures Chomsky constructed a finite state grammar consisting of five states. In the initial state the only option for the machine is to proceed to state (2), and when this is done the machine must print the symbol ‘The’. From state (2) the machine has two options: go to state (3) or to state (5). If it goes to state (3) ‘man’ is printed; if it goes to state (5) ‘men’ is printed. Once the machine has taken the option of either (3) or (5) its next move is determined: (3) must go to state (4), at which point ‘comes’ is printed, and (5) must go to state (4), at which point ‘come’ is printed. So there are two possible sentences that the finite state grammar will allow: (A) The man comes. (B) The men come. The question is, can such a machine be modified to make it capable of capturing infinity?
Now it is obviously easy to extend such a machine so that it can capture infinity. At state (2) a third option can be added. The machine will now have three options: it can go from (2) to (3), from (2) to (5), or it can loop back on itself from (2) to (2). The rule can then be added that every time (2) loops back on itself the symbol ‘old’ is printed. The finite state grammar now allows for infinite constructions. It can now print (A) The man comes, (C) The old man comes, or (D) The old old man comes, and so on to infinity. Now while such a device is obviously useful, and can capture an important feature of language (the ability to construct infinitely many sentences by finite means), it has an obvious weakness. It cannot capture certain obvious grammatical features of ordinary language, such as embedding and cross serial dependencies.
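The machine just described can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the names `TRANSITIONS`, `FINAL_STATE` and `sentences` are hypothetical, and the cap on the number of loops exists only so that the enumeration terminates.

```python
# Transition table for the five-state machine described in the text, plus the
# 'old' loop on state 2. Each entry maps a state to (next_state, printed_word).
TRANSITIONS = {
    1: [(2, "The")],
    2: [(2, "old"), (3, "man"), (5, "men")],
    3: [(4, "comes")],
    5: [(4, "come")],
}
FINAL_STATE = 4


def sentences(max_loops):
    """Yield every sentence that takes the 'old' loop at most max_loops times."""
    def walk(state, words, loops):
        if state == FINAL_STATE:
            yield " ".join(words)
            return
        for next_state, word in TRANSITIONS[state]:
            is_loop = next_state == state
            if is_loop and loops == max_loops:
                continue  # loop budget used up; this cap is an illustration device
            yield from walk(next_state, words + [word], loops + is_loop)

    yield from walk(1, [], 0)


for sentence in sentences(2):
    print(sentence)
```

With the cap set to 0 the sketch yields only the two original sentences; raising the cap adds one more ‘old’ each time, which is the sense in which the loop gives infinity from finite means.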
Consider the following phrase: ‘an anti missile missile’, which is a missile used to defend against missile attacks. Now presumably if I knew that my enemies had ‘an anti missile missile’, I would proceed to try and develop ‘an anti anti missile missile missile’. As Lasnik put it, ‘There are presumably technological limits on how far one can go in creating this kind of weapon, but there are no linguistic limits as far as the corresponding word formation process is concerned. Let’s notate the situation as follows: (45) antiⁿ missileⁿ⁺¹’ (Lasnik p15). The important point to note is that a finite state grammar cannot capture this basic grammatical feature of language. If one were to construct a grammar with a loop at the end of ‘anti’, and one at the end of ‘missile’, one would be able to create as many ‘anti’s’ or as many ‘missiles’ as one wanted; however, one would not be able to correlate the two in the way suggested by (45) without adding abstract phrase structure. In order to explain the features of language we must assume that words are grouped together in terms of abstract phrase structure rules which determine the way they can be combined.
As I have already said, finite state grammars have no way of keeping track of how many ‘anti’s’ and how many ‘missiles’ there are in the sentence. Context free Phrase Structure Grammars can perform this deed, and they do so by introducing the two words at the same time. Consider the grammar (a) Σ: S and (b) F: S→aSb. The initial designated symbol S is what is known as an abstract non-terminal symbol, which will not be part of the sentence, so it will have to be rewritten as a sequence of terminal symbols. As can be seen from (b), we are to rewrite S as aSb. Now from (a) and (b) we construct the following derivation. We will call our derivation Grammar X, and it has the following structure:
Grammar X:
Step one: S (following a)
Step two: aSb (following b)
Step three: aaSbb (by reapplying b to step two)
Step four: aaaSbbb (by reapplying b to step three)
Obviously we can carry on this sequence to infinity and using this process we can keep track of cross serial dependencies. The important point to note is that it is the abstract structure (cross serial dependencies) that makes it possible for the numbers of a’s and b’s to be correlated. This phrase structure grammar above is obviously very different from our natural language, so I will now use the same technique to construct a model which is a bit closer to natural language.
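As a rough sketch of why a rewrite rule keeps the counts correlated, the derivation of Grammar X can be reproduced mechanically. The function names are hypothetical, and the terminating rule S→missile used for the ‘anti missile’ case is my own assumption for illustration; the text states only the rule S→aSb.

```python
def derive(steps):
    """Apply the rule S -> aSb `steps` times, recording each derivation line."""
    lines = ["S"]
    for _ in range(steps):
        # Every application adds exactly one 'a' and one 'b' around S.
        lines.append(lines[-1].replace("S", "aSb"))
    return lines


def anti_missile(n):
    """Build anti^n missile^(n+1) with S -> anti S missile and S -> missile."""
    phrase = ["S"]
    for _ in range(n):
        i = phrase.index("S")
        phrase[i:i + 1] = ["anti", "S", "missile"]
    i = phrase.index("S")
    phrase[i:i + 1] = ["missile"]  # assumed terminating rule, not from the text
    return " ".join(phrase)


print(derive(3))
print(anti_missile(2))
```

Because each rule application introduces both words at once, there is no way for the number of ‘anti’s’ and the number of ‘missiles’ to drift apart, which is exactly what the two independent loops of a finite state grammar cannot guarantee.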
So let us begin the way we did above by stating what our initial symbol is and then outlining our rewrite rules. (1) Σ: S (2) S→NP VP
With rules of this kind one can construct a sentence by means of what is known as a derivation. In a derivation one basically tries to get rid of all of the non-terminal symbols. So our derivation goes as follows:
Step 1: S
Step 2: NP VP (S→NP VP)
Step 3: N VP (NP→N)
Step 4: Mary VP (N→Mary)
Step 5: Mary V (VP→V)
Step 6: Mary laughs (V→laughs)
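The six steps above can be reproduced by a small rewriting loop. This is a minimal sketch: the rule table is read off the annotations in the derivation, while the names `RULES` and `derivation` and the choice to always rewrite the leftmost non-terminal are my own assumptions.

```python
# Rewrite rules taken from the annotations in the derivation above.
RULES = {
    "S": ["NP", "VP"],
    "NP": ["N"],
    "VP": ["V"],
    "N": ["Mary"],
    "V": ["laughs"],
}


def derivation(start="S"):
    """Rewrite the leftmost non-terminal until only terminal words remain."""
    line = [start]
    steps = [" ".join(line)]
    while any(symbol in RULES for symbol in line):
        i = next(k for k, symbol in enumerate(line) if symbol in RULES)
        line[i:i + 1] = RULES[line[i]]
        steps.append(" ".join(line))
    return steps


for number, step in enumerate(derivation(), start=1):
    print(f"Step {number}: {step}")
```

The loop stops precisely when no non-terminal symbol is left, which is the sense in which a derivation ‘gets rid of’ the abstract symbols.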
As Cedric Boeckx (Bare Syntax 2008) correctly notes, the above grammar cannot capture infinity; however, if we want to capture infinity all we need to do is introduce a rule that reintroduces S, for example VP→V S. With this proviso in place we can capture sentences such as ‘Arsene thinks Alex laughs’, or ‘Alex cries Arsene laughs’. So our artificial model has moved beyond the limits of a finite state grammar and can capture things such as cross serial dependencies. However, there are some aspects of grammar which phrase structure grammars cannot capture, such as unbounded cross serial dependencies. In order to show what these unbounded cross serial dependencies are and how Chomsky accounts for them I will again give concrete examples from ordinary language.
In English there are three different types of Auxiliaries: (1) Modal Auxiliaries (can, must, may, will, etc), (2) HAVE (had etc), (3) BE (am, is, was, were). By examining how these auxiliaries can be combined with other words in sentences, and with each other, and still remain grammatical according to the intuitions of ordinary English speakers, Chomsky uncovered a series of facts about the behaviour of English Auxiliaries and then formulated general laws about these linguistic regularities. All English sentences have a main verb and a noun, such as sentence (1) Arsene laughed. Some English sentences, in addition to having main verbs, have what are known as Auxiliary verbs. Consider sentence (2) Arsene may laugh. Sentence (2), as well as a main verb, contains the modal auxiliary ‘may’. A sentence can also contain more than one Auxiliary: (3) Arsene may have laughed. So sentence (3) contains two Auxiliary verbs, a main verb and a noun. A sentence can even contain three Auxiliaries: (4) Arsene may have been laughing. What Chomsky discovered was that there are regularities which govern how these auxiliaries behave.
When the main verb is combined with certain auxiliaries its structure remains unchanged; however, when it is combined with certain other types of Auxiliaries its structure is altered. Consider the following constructions (taken from Lasnik p36):
(X) (a) Arsene may laugh.
(b) Arsene will laugh
(c) Arsene can laugh
(d) Arsene could laugh
(Y) (a) Arsene has laughed
(b) Arsene is laughing
(Z) (a) Arsene had laughed
(b) Arsene was singing
Obviously the Modal Auxiliaries (henceforth M) do not modify the main verb, whereas the other Auxiliaries do modify the main verb. So from the behaviour of Auxiliaries Chomsky formulated some generalisations, such as the following. Generalisation 1: When a sentence contains a modal auxiliary (M) it is always the first thing after the subject. Generalisation 2: When have and be occur, be immediately follows have.
Now we can take a further look at examples like the above ones and see if we can find further generalisations. Consider for example sentences with no auxiliary verbs. (Q)(a) John owns a house (present)
(b) John owned a house (past)
(c) *John own a house (bare)
(d) *John owning a house (progressive)
(e) *John owned a house (perfect)
From the above data we can derive Generalisation 3: If the main verb is the first ‘verb like’ thing in the sentence, then it can appear in the ‘present’ or ‘past’ form but not in the ‘bare’, ‘progressive’ or ‘perfect’ form (Lasnik p38). Chomsky discovered that generalisation 3 also works for all ‘verb like’ things, such as Modal auxiliaries, have and be. So from this fact he abstracted to Generalisation 4: Whatever ‘verb like’ thing is first in the sentence, it will appear in the present or past form. From here Chomsky examined whether he could find generalisations for the second ‘verb like’ thing in a sentence, and for the third ‘verb like’ thing in a sentence. It is the generalisation he discovered for the third ‘verb like’ thing which concerns us here. Generalisation 5: Whenever BE occurs, the next ‘verb like’ thing appears in the ‘progressive’ form. So when a sentence has BE in it, it has ing. As Lasnik put it, ‘Be and ing go together but they don’t go together’.
The above point can be illustrated using a concrete example. Take the following sentence: (1) Arsene has been laughing. As a direct result of the BE auxiliary being used, the ing gets affixed to the verb laugh, and this is a concrete example of an unbounded cross serial dependency. Now given that phrase structure grammars cannot capture these unbounded cross serial dependencies in a non-ad hoc manner, we need to use a different approach to capture this fact about language. The device which Chomsky appeals to in order to capture this feature of language is what he calls Transformations. It is at this point that Chomsky introduces his division between deep structure and surface structure. At the level of deep structure a sentence like (1) above will have the following structure: Arsene (has en) (be ing) laugh. It is through applying transformations that the sentence gets translated into the surface structure form ‘Arsene has been laughing’. It should be patently obvious that this deep/surface structure distinction has nothing in common with the distinction between logical form and grammatical form.
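To make the step from deep structure to surface structure concrete, here is a toy sketch in which each affix ‘hops’ onto the verb like element that follows it. This is not Chomsky’s formal statement of the transformation; the function names and the small spell-out table are assumptions made purely for illustration.

```python
AFFIXES = {"en", "ing"}
# Toy morphology table (an assumption): how verb+affix pairs are pronounced.
SPELL_OUT = {"be+en": "been", "laugh+ing": "laughing"}


def affix_hop(deep):
    """Attach each affix to the next verb like element in the string."""
    surface = []
    pending = None
    for token in deep:
        if token in AFFIXES:
            pending = token  # hold the affix until its host arrives
        elif pending is not None:
            surface.append(f"{token}+{pending}")
            pending = None
        else:
            surface.append(token)
    return surface


def spell_out(surface):
    """Replace verb+affix pairs by their surface forms."""
    return " ".join(SPELL_OUT.get(token, token) for token in surface)


deep_structure = ["Arsene", "has", "en", "be", "ing", "laugh"]
print(spell_out(affix_hop(deep_structure)))
```

In the deep structure ‘ing’ sits next to ‘be’, yet on the surface it is pronounced on ‘laugh’; the sketch shows how a single movement rule relates the two orders.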
The above grammatical rules and generalisations which Chomsky discovered led him to the conclusion that as we learn new words we automatically group them into phrases such as Noun Phrases, Verb Phrases, Prepositional Phrases, and Adjective Phrases. A Noun Phrase is any phrase that is about the noun, such as the phrase ‘the logical philosopher’. A Noun Phrase can be broken down into the following structure: NP→ (DET) A* N, which means that a Noun Phrase consists of an optional determiner, any number of adjectives, and a noun. Further definitions are VP→ V NP, and S→NP VP. So a sentence can be broken down into various phrases, which can be further broken down into words operating according to certain rules. A strange discovery was made by Chomsky when analysing these rules: the various phrases were all discovered to have the same structure.
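The Noun Phrase rule reads naturally as a pattern over word categories: an optional determiner, any number of adjectives, then a noun. The following minimal sketch checks a sequence of category labels against that pattern; the labels `DET`, `A` and `N` and the function name are my own choices, and the words are assumed to have been tagged by hand.

```python
import re

# NP -> (DET) A* N, written as a regular expression over space-separated tags.
NP_PATTERN = re.compile(r"(DET )?(A )*N")


def is_noun_phrase(tags):
    """Return True if the tag sequence fits the pattern (DET) A* N."""
    return NP_PATTERN.fullmatch(" ".join(tags)) is not None


# 'the logical philosopher' tagged by hand as DET A N
print(is_noun_phrase(["DET", "A", "N"]))
# an adjective before the determiner does not fit the rule
print(is_noun_phrase(["A", "DET", "N"]))
```

A flat pattern like this only covers the single rule quoted in the text; it says nothing about the hierarchical grouping inside the phrase, which is what the rules discussed next are about.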
Steven Pinker summarised the rules that all phrases share in his book The Language Instinct; he claimed that there are four such rules. The rules are as follows: (1) All phrases have a head. (2) All phrases have some role-players which are grouped with the head inside a subphrase. (3) All phrases have modifiers which appear outside the subphrase. (4) All phrases have a subject (The Language Instinct p110). I will explain these four rules in more detail before further discussing their significance.
The first rule, that all phrases have a head, is in a sense obvious; this simply means that the phrase is about the head. So the NP ‘The logical Philosopher’ is not about the determiner ‘The’, nor is it about the adjective ‘logical’; rather it is about the Noun ‘Philosopher’. The noun is the head of the phrase and the adjective is its complement. To understand the second rule consider the following sentence: ‘Dave drank a pint of Vodka’. The sentence consists of a subject (Dave) who plays the role of Agent, and an object (Vodka) which is a Patient. The Vodka plays the role of Patient because something is being done to it, while the subject is the Agent because it is doing something to the object. More complicated sentences can be broken down into the thematic roles of Agent, Patient and Recipient, e.g. ‘Van Persie passed the ball to Ramsey’: Van Persie (Agent), the ball (Patient) and Ramsey (Recipient). The sentence can be thought of in terms similar to Frege’s, as being broken down into function and argument. So the preceding sentence can be thought of as consisting of the three-place predicate ‘x passed y to z’; this incomplete predicate is a function which becomes complete when the arguments (appropriate names) are mapped on to the variables. The variables in this case are filled by the three nouns, which are the various role players: the Agent, the Patient, and the Recipient. The type of role the various names will play is determined by the lexical information that is stored in the brain about the predicate; for example ‘passed’ requires an Agent, a Recipient and a Patient, whereas ‘drank’ just requires an Agent and a Patient. These role players are grouped with the head inside a subphrase called an N-bar or V-bar.
The third rule is that all phrases have modifiers which exist outside of the subphrase. Consider the following PP: ‘from France’. This prepositional phrase is what is known as a modifier. The phrase ‘The captain of Arsenal from France’ can be schematised as follows: (NP (Det the) (N’ (N captain) (PP of Arsenal)) (PP from France)). The PP ‘of Arsenal’ is about the noun ‘captain’; to be a captain you have to be a captain of something, so they are intrinsically connected. However, the modifier ‘from France’ is not intrinsically connected to ‘the captain’, so it is not grouped in the same subphrase. So while the modifier is still part of the NP, it is grouped on another branch. The fourth rule is that subjects are given a special role in phrases; the subject is usually a causal agent, and is represented as Spec.
As Pinker noted, what is interesting about these rules is that all phrases share them, whether they are an NP, VP, AP or PP, and that the rules which govern all of these phrases must be an abstract set of principles. The principles are represented by an abstract schema called X bar theory, XP→ (SPEC) X’ YP, which states that ‘A phrase consists of an optional subject, followed by an X bar, followed by any number of modifiers’. It is these abstract principles which govern how words are grouped in sentences.
It may seem that the posit of underlying principles which govern our language use is contradicted by the fact that people brought up in different countries speak different languages. Chomsky accounts for this fact with his discussion of parameters. Adriana Belletti and Luigi Rizzi describe the nature of a parameter in the following manner: ‘The child interprets the incoming linguistic data through the analytic devices provided by Universal Grammar, and fixes the parameters of the system on the basis of the analysed data, his linguistic experience. Acquiring a language thus means selecting, among the options generated by the mind, those which match experience, and discarding other options’ (Language and Mind P17). A simplified example of a parameter is what is known as ‘The movement parameter’. Consider question formation: when forming questions human languages have two different options to choose from. The first option is to move the interrogative phrase (who etc) to the front, to a position in the left periphery of the clause; the second is to leave the interrogative phrase in the clause-internal argument position in which it is interpreted (ibid p17). English takes the first route (Who did you meet?), Chinese the second (You love who?), while French uses both options. No known language violates this parameter, so one can reasonably assume that it is an aspect of universal grammar. This in short is a non-technical explanation of how Chomsky believes that a child goes from his meagre input to the torrential output (optimal language expression).
The important point to note is that nowhere in the brief sketch above is the notion of a Platonic proposition needed. So Chomsky can be cleared of a charge of postulating Platonic propositions that Quine would find objectionable. However Quine is not just concerned with propositions as abstract entities as the following quote shows:
‘My objection to recognising propositions does not arise primarily from philosophical parsimony - from a desire to dream of no more things in heaven and earth than need be. Nor does it arise, more specifically, from particularism - from a disapproval of intangible or abstract entities. My objection is more urgent. If there were propositions, they would induce a relation of synonymy or equivalence between sentences themselves: those sentences would be equivalent which expressed the same proposition. Now my objection would be that the appropriate equivalence relation makes no objective sense at the level of sentences’ (Quine: The Philosophy of Logic p. 2).
Chomsky does accept the existence of propositions in terms of determinate synonymy relations in a way which contradicts Quine. I have shown in two earlier blogs, ‘Indeterminacy of Translation and Innate Concepts’ and ‘Chomsky and Quine on Analyticity’, that Chomsky’s views on these topics are not justified empirically. Here my main point was to show that Chomsky is not committed to the existence of Platonic Propositions.
 Modern mathematical research has refuted Chomsky’s claims that recursion cannot be learned statistically.
 Here I am following Lasnik’s Syntactic Structures Revisited.