Boeckx, Research Programmes and Reality

Boeckx: Linguistic Minimalism

 “As Chomsky (1959) convincingly argued, no ‘blank slate’ theory relying solely on external input can account for the creative aspect of language use. Native speakers of any language are able to effortlessly produce and understand sentences in the language that they have never heard or produced before. Chomsky’s rejection of any behaviourist account helped shape what came to be known as the ‘cognitive revolution’ – a mentalistic framework in which inborn (‘innate’) mechanisms played a central role in the acquisition and use of behaviour” (ibid p.17)

Cedric Boeckx begins his (2006) ‘Linguistic Minimalism’ in the standard way of virtually all books on Generative Grammar: with a reminder of the weakness of behaviourism. He cites Chomsky’s review of ‘Verbal Behaviour’ as the birth of cognitive science, on the grounds that it refuted virtually every aspect of Skinner’s project. In his (2010) ‘Language in Cognition’ Boeckx goes through all of Chomsky’s main arguments against Skinner and then wonders how anyone could ever have accepted such an absurd world view. Boeckx makes all the same misinterpretations of behaviourism that are standard in the Generative Grammar literature. I have argued at length against these misinterpretations in a series of blogs: (1) Poverty of Stimulus Arguments and Behaviourism, (2) Some Behavioural Techniques and The Idea of a Blank Slate, (3) Pecs, Verbal Behaviour, and Universal Grammar. I won’t repeat the material here; interested readers can read the blogs if they want. I just want to make one point. In ‘Language in Cognition’ Boeckx claims that Skinner’s ‘Verbal Behaviour’ was a collection of the best behavioural evidence on the nature of verbal behaviour. That statement is simply false. Unlike Skinner’s other books, ‘Verbal Behaviour’ is not chock-full of experiments; rather, it was meant as a programme to guide further research into verbal behaviour. Boeckx’s description of the book is a clear indication that he never bothered to read it (I wonder how many generative grammarians have). Those who haven’t read the book should at the very least read Kenneth MacCorquodale’s reply to Chomsky, which demonstrates its programmatic nature.

It is also a fact that Skinner’s proposed research programme into Verbal Behaviour has led to a lot of empirical research. Michael (1982, 1988), Hall and Sundberg (1987), Carroll and Hesse (1987), and Yamamoto and Mochizuki (1988) all used a behaviour chain procedure to teach mands to children with intellectual disabilities. Rogers-Warren and Warren (1980) studied manding without using the chain procedure; instead they got the children to play with preferred objects and asked the children to mand for the ones they wanted. Simic and Bucher (1980) used two different kinds of food and trained the subject to say ‘I want a …’ when the analyst entered the room with a tray of food. Savage-Rumbaugh (1984) and Sundberg (1985) trained non-human subjects to mand[1].

Sautter and LeBlanc’s (2007) paper showed that between 1992 and 2007 the majority of Verbal Behaviour research focused on two areas: (1) mands and (2) tacts. Furthermore, the majority of research in applied verbal behaviour has been with people with intellectual disabilities and/or autism. Dixon et al. therefore argue that more research needs to be done on people who are developmentally typical, and on more complex forms of language.

Their results showed that of the 99 articles they analysed, 77 percent were done with atypical members of the population. Of that number, 63 of the articles focused on children and 23 used adults. Only 27 percent of the articles investigated typically developing members of the population; 19 of those examined children and 10 were with just adults. Only four studies (4%) examined both atypical and typical populations (AP and TP) in one article (ibid p. 202).

They conclude that the vast majority of research in this area has been with children with developmental disabilities and/or autism. While these are important and welcome data, the scope of the research needs to be widened to include a much bigger section of typically developing people if Skinner’s ‘Verbal Behaviour’ is to be championed as an adequate theory of language acquisition. Furthermore, they correctly note that we need to go beyond mands and tacts and do experimental and empirical research into things like autoclitics. They note that with recording devices, and mountains of internet conversations taking place, we are swimming in data and have the technology to record it and analyse the functions of various speech patterns. So Verbal Behaviour research needs to recognise these limitations in the work done so far and overcome them if the research programme is to remain alive. But they are quite clear that if it turns out that autoclitics etc. cannot be acquired in the way Skinner specifies, this refutes Skinner’s Verbal Behaviour project.

Having worked with PECS, a behavioural technique inspired by Skinner’s Verbal Behaviour, I would predict that further empirical research into more complex forms of verbal behaviour will indeed refute the claims made by Skinner. While I am not impressed by Chomsky’s arguments against Skinner (which are nothing more than a caricature), I am not a behaviourist, and I don’t think we can do without intentional locutions in our explanations of language acquisition. But Dixon et al.’s empirical research into verbal behaviour, and their argument that future experiments will determine the success or failure of the programme, stand in stark contrast to Boeckx’s claim for the minimalist programme:

“We are still far from having a fully-fledged minimalist theory of language. This fact has important repercussions for what it means to do research in the minimalist programme. A program is open-ended, it may take a long time to mature, it allows researchers to make maximal use of their creativity as they try to move from minimalist guidelines to concrete principles, it makes room for multiple, not necessarily mutually consistent or compatible perspectives, and cannot be evaluated in terms of true or false, but in terms of fecund or sterile.” (Boeckx ‘Linguistic Minimalism’ p. 6)


He justifies this approach by appeal to the philosopher of science Lakatos. I agree that an entire research programme is not refuted when a single prediction is shown to be false, nor when a single theoretical claim is refuted experimentally. It is always open to the researchers to modify their programme in light of falsified predictions. However, as more and more counter-evidence mounts against a particular research programme, its adherents should suspect that the programme is false. Boeckx et al. have no problem listing a series of facts which, they say, show that the behaviourist research programme is a dead end littered with refuted claims. I think they should hold themselves to the same standard. It is frankly a bit bizarre to introduce a research programme with claims that Chomsky has effectively refuted a behaviourist research programme, and a few pages later to assert that Chomsky’s own research programme cannot be refuted.

Boeckx has no problem accepting that particular claims can be refuted in science; he just doesn’t seem to think that a mounting number of refutations should be taken as a sign that a research programme has been refuted. I am not so sure. I think that if a workable poverty of stimulus argument were constructed, this would refute rival programmes which rely on language being learned by domain-general procedures. Unfortunately, I have never seen a workable poverty of stimulus argument.

Interestingly, Boeckx does discuss a poverty of stimulus argument early in ‘Linguistic Minimalism’. One wonders why he does this. Is it because he thinks it refutes rival research programmes? If so, this attitude is wildly at odds with his attitude to the Minimalist Programme, which we are told cannot be falsified.

Boeckx discusses the famous structure dependence poverty of stimulus argument, the standard case used by nativists. To his credit, Boeckx does actually address the evidence that Sampson, Pullum and Scholz have put forth on the issue. He replies by making the correct point that the issue isn’t whether the child’s primary linguistic data (PLD) contain examples which are evidence for the structure dependence rule, but whether there is enough evidence in the PLD for the child to learn the correct rule. Boeckx is correct that the mere presence of such data in the PLD is not the primary issue. Nonetheless, Chomsky did make the following claim:

“The child could not generally determine by passive observation whether one or the other hypothesis is true, because cases of this kind rarely arise; you can easily live your whole life without ever producing a relevant example…you can go over a vast amount of data of experience without ever finding such a case…” (Chomsky 1980, 121)

This claim has uncontroversially been refuted by the data of Pullum and Sampson. Nonetheless, Boeckx is correct to note that the important issue is whether the child can learn from the data they experience. He appeals to Legate and Yang’s (2002) attempt to quantify how many examples need to be in the child’s PLD in order for the child to be able to learn the relevant construction.

Legate and Yang (2002) argued that if we compare the evidence a child has in their PLD for constructions that we know they learn with the evidence the child has for the structure dependence rule, we have a good yardstick for determining whether the child has enough evidence to learn that rule. Legate and Yang are to be applauded for trying to move the argument on and specify further ways of testing the issue. However, as Clark and Lappin (2011) correctly note, tests such as Legate and Yang’s really only make sense in terms of a specified learning theory, and Legate and Yang do not provide one.

Legate and Yang’s example of a rule that we know is learned concerns the use of null subjects in child language, where the child reaches adult level at about 3 years of age. Boeckx glosses their argument as follows:

“The core examples which inform children that all English (finite) sentences require phonologically overt subjects are sentences involving pleonastic subjects (e.g. there is a man here). Such sentences amount to 1.2 per cent of the potential PLD (all sentences). Legate and Yang suggest, quite reasonably, that the PLD relevant to fixing the Y/N question should be of roughly comparable proportion” (Boeckx 2006 p.25)

Boeckx then notes that a search of the CHILDES database reveals that sentences relevant to learning the auxiliary inversion rule make up only between 0.045 and 0.068 percent of the sentences the child experiences. He concludes that this is far too little data for the child to learn the rule from.
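To make the size of the gap concrete, here is a minimal sketch in Python of the arithmetic behind this comparison. The three percentages are the ones reported above; the function and variable names are my own illustrative choices, and the calculation takes no stand on whether 1.2 percent is the right benchmark.

```python
# Figures as reported in the text (percent of sentences in the child's input).
NULL_SUBJECT_EVIDENCE_PCT = 1.2               # benchmark: pleonastic-subject sentences
AUX_INVERSION_EVIDENCE_PCT = (0.045, 0.068)   # range reported for auxiliary inversion


def shortfall_factor(benchmark_pct: float, observed_pct: float) -> float:
    """Return how many times rarer the observed evidence is than the benchmark."""
    return benchmark_pct / observed_pct


# Illustrative calculation: one factor for each end of the reported range.
factors = [
    round(shortfall_factor(NULL_SUBJECT_EVIDENCE_PCT, pct), 1)
    for pct in AUX_INVERSION_EVIDENCE_PCT
]
print(factors)
```

On these figures the evidence for auxiliary inversion is roughly 18 to 27 times rarer than the evidence used as the benchmark. Whether that ratio shows the rule to be unlearnable depends entirely on the learning theory one assumes.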

Legate and Yang (2002) even go on to argue that out of the 67,000 sentences observed in the CHILDES none of the adult interactions use sentences like (9):

“Not only are those frequencies far below the magic figure of 1.2 percent required to learn the correct rule by the 36th month, it is also low enough to be considered negligible, that is, not reliably available for every human child. And interestingly, the canonical type of critical evidence, [aux [ NP … aux …] e …], appears not even once in all 66,871 adult sentences found in both the Nina and Adam corpora: the standard statements of the APS are not hyperbole as P&S charged. Hence the original APS stands unchallenged: the knowledge of structure dependence in syntax, as far as we can test quantitatively and comparatively, is available to children in the absence of experience. And the conclusion then seems to be Chomsky’s (1975: 33): ‘the child’s mind … contains the instruction: Construct a structure-dependent rule, ignoring all structure independent rules. The principle of structure-dependence is not learned, but forms part of the conditions for language learning.’” (Legate and Yang (2002) ‘Empirical Re-Assessment of Stimulus Poverty Arguments’ pp 158-159)

As I said above, Legate and Yang (and Boeckx) should be applauded for their attempt to deal with Pullum et al.’s data; however, I think their negative conclusion is unwarranted. In their (2011) ‘Linguistic Nativism and the Poverty of the Stimulus’ Clark and Lappin used learnability models to tackle the claims of Legate and Yang. They noted that assessing whether such constructions can be learned from experience requires mathematical models of how learning from so few examples is possible. Such computer programmes have already been developed. So, for example, Clark and Eyraud (2007), Perfors et al. (2006), and Reali have all developed programmes which can learn from less data than was discovered by Pullum, Scholz and Sampson. Clark and Lappin:

 “In subsequent sections we consider work in computational learning theory applied to grammar in order to clarify the question of what data is needed for learning the principles governing polar interrogatives and related syntactic phenomena. The first paper (Clark and Eyraud, 2007) which we discuss in more detail in Chapter 8, shows that a very simple grammar induction algorithm based on distributional patterns acquires rule from a small data set that does not include examples like 8b (Is the student who is in the garden hungry?). The second paper (Perfors et al., 2006) indicates that learners can infer hierarchical structure in a language on the basis of a simple domain-general learning prior.

Both of these papers adopt the same general perspective. They grant the absence in the PLD of examples that effectively distinguish between correct and spurious rules for polar question formation. They also reject a transformational account of the relation between declarative and interrogative forms, relying instead on a context free grammar. They show that the correct interrogative form can be learned without seeing any examples in what is purportedly the set indispensible data” (Clark and Lappin ibid p.40)

A few points need to be noted here. Firstly, Clark and Lappin correctly note that just because these programmes can learn from the relevant data doesn’t mean that the brain uses the same procedures to acquire a language. But it does mean that the Poverty of Stimulus argument as presented by Legate and Yang has been refuted. Secondly, Clark and Lappin, along with Legate and Yang, assume that the rule is actually a rule of language. This assumption may not be warranted. If the CHILDES corpus (a record of actual speech) contains no examples of the relevant construction, why assume that the rule exists? Geoffrey Sampson correctly notes that the rule is a rule of written language (which explains people’s intuitions of grammaticality). But since people don’t actually speak in ways that conform to the rule, why presume it is a rule governing how people form questions when they actually speak?

Boeckx thinks, incorrectly, that Legate and Yang have clinched the case for the poverty of stimulus argument. He then goes on to argue that, independent of the Legate and Yang evidence, opponents of Poverty of Stimulus arguments are on even weaker ground than they think. To support this he repeats two standard claims of Generative Grammarians. (1) People don’t try out false constructions like ‘Is Mary will believe that Frank is here?’* or ‘Is the man who tall will leave now?’*. He cites Crain and Nakayama’s (1987) experiment as evidence that people don’t try out sentences like these. (2) Even if children did try out ungrammatical constructions like the preceding ones, people are not typically corrected for ungrammatical constructions, and even when they are they don’t make use of the corrections, so there is no way children could learn the relevant rules from either positive or negative data.

A couple of points need to be made here. Firstly, Crain and Nakayama (1987) is widely cited in the literature as proof that people don’t try out the relevant false constructions. But Sampson (2005) has noted that there is a real problem with interpreting this experiment. Since the corpus data indicate that people don’t form spoken interrogatives according to the rules of written language (the rules postulated by Chomsky et al.), it is strange that the children in Crain and Nakayama’s experiment form questions in a way that never occurs in actual speech[2]. This indicates that there may be some element of accidental priming in the experiment. At the very least the experiment needs to be replicated before being used as conclusive proof on the issue.

Also, Chouinard and Clark (2002) present experimental evidence that children are corrected for making grammatical mistakes and do make use of these corrections. At this point we have a lot of conflicting experimental data on how much use children make of implicit corrections. So Boeckx is wrong to present things as though the poverty of stimulus argument is proven. The fact is that the argument relies on a series of unproven and disproven claims. Furthermore, other generative grammarians like John Collins have argued that the poverty of stimulus argument as presented by Pullum, Legate and Yang is not the real poverty of stimulus argument. I have dealt with Collins’s version of the argument (as well as a more recent APS constructed by Berwick, Chomsky et al. 2011) in my earlier blog ‘Poverty of Stimulus Arguments and Behaviourism’, and I won’t repeat the material here. My point is merely that, as far as I can see, the example of a poverty of stimulus argument Boeckx brings up does not work. However, if he can construct a workable poverty of stimulus argument he has refuted his opponent, and I think he should accept the converse: a research programme whose claims are repeatedly refuted can itself be refuted. Minimalism should adopt exactly the same approach.

When Boeckx notes the following:

“Minimalists endorse the belief (held by all major proponents of modern science, from Kepler to Einstein), that nature is a realisation of the simplest mathematical ideas, and a theory should be more highly valued if it gives us the sense that nothing could be changed…a sense of uniqueness,…a sense that when we understand the final answer, we will see that it could not have been any other way (Weinberg 2001).” (Boeckx p. 9)

He is pointing out that an approach to studying nature similar to that of the minimalists has been very successful for physicists. Presumably he is implying that a similar approach is likely to yield similar success in linguistics. Maybe so; however, I doubt whether the laws of physics will have close correlates when we study the structures of creatures built by the tinkering process of natural selection. Either way, the nature of the cognitive structures of humans will ultimately be decided empirically. Minimalists cannot avoid this truism: if reality consistently conflicts with their beliefs, then their beliefs must change.

[2] I should say that they never form questions in the manner which Chomsky et al. predict they should in the limited corpus analysis of speech interaction that has been done so far. Much more research needs to be done to say for sure.


