What is wrong with Reinforcement?

“What the results suggested was that the simple learning observed in Pavlovian conditioning paradigms arises from an information processing system sensitive to the contingencies between stimuli. If this implication is valid, then it changes our conception of the level of abstraction at which this basic learning process operates.” (Gallistel ‘Information Processing in Conditioning’ p. 1)

Gallistel (2002) discussed a 1968 experiment by Robert Rescorla which, Gallistel argued, contemporary theorists have not fully digested. According to Gallistel, what the result showed was that the simple associative learning of Pavlovian conditioning comes from an information processing system that is sensitive to contingencies between stimuli. Pavlov famously managed to associate an Unconditioned Stimulus[1] with a Conditioned Stimulus, and this has been one of the foundational pillars of behaviourism. A typical conditioning experiment is done in the following way. A pigeon sees a key light up and a while later it gets food. In the case of the rat, a tone comes on and soon after the rat is shocked. After a while the pigeon starts pecking when its key lights up and the rat starts defecating when it hears the tone. According to Gallistel this shows that both the rat and the pigeon are anticipating that the US is coming. This learning was considered by followers of Pavlov to be a paradigm of associative learning.

In order to prove that this pairing was the result of association forming, an experiment was done where two stimuli were repeatedly presented together, and a control experiment was done where the stimuli were widely separated in time. If a change in behaviour was seen in the experimental condition but not in the control condition, the experimenter would argue that the pairing had produced an association.

However, Rescorla (1967) found that temporal pairing and contingency could be dissociated from each other; prior to this, people believed that they were the same thing. Rescorla found that the controls used in previous experiments were not sufficient. Control conditions in which the CS and US are never paired do not eliminate CS-US contingency; rather, they replace one contingency with another. Rescorla pointed out that if we want to determine whether it is temporal pairing or contingency that is leading to conditioning, we need a truly random control.

In this condition the occurrence of the CS does not determine in any way the time at which the US may occur, so the US must sometimes occur with the CS. Rescorla (1968) ran his experiment as follows. He tested for conditioned fear in rats. In the first experiment hungry rats were trained to press a lever to obtain food. Once the rats were trained to press the lever regularly there were five sessions during which the lever was blocked (ibid p.3). In each of these sessions twelve tones came on at more or less random times. The rats also received short, mildly painful shocks to their feet. Rescorla manipulated the distribution of the shocks relative to the tones. For one group the shocks were completely contingent on the tone: the shocks only occurred when the tone was on. In another group the rats got 12 shocks when the tone was on and they also got shocks at equal frequency when the tone was not on. This protocol did not alter the number or frequency of tone-shock pairings, but it did destroy the contingency between tone and shock, and it did increase the number of shocks per session. To check the importance of this increase, Rescorla had a third group who were shocked at random 12 times without regard to the tone.

Before testing the extent to which the rats had learned to fear the tone, Rescorla eliminated their fear of the experimental chamber by eliminating tones and shocks for a couple of sessions. In the final sessions the rats’ conditioned fear of the tone was measured by seeing how the tone affected their willingness to continue pressing the lever. What he found was that although the rats in the first two conditions had the same tone-shock pairings, the rats in the contingent condition learned to fear the tone, whereas the rats in the truly random condition did not; neither did the rats in the third condition. So it is contingency, not temporal pairing, that drives simple Pavlovian conditioning.
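The difference between pairing and contingency can be made concrete with a little arithmetic. The following is a minimal sketch, not Rescorla's own analysis: it summarises contingency as P(US | CS) - P(US | no CS), a commonly used measure that I am assuming here for illustration, and the bin counts are invented (only the 12 tone-shock pairings come from the description above).

```python
# Illustrative only: the session is divided into hypothetical time bins.
# The bin counts and the 48 extra shocks are made-up numbers, not Rescorla's.

def delta_p(us_with_cs, cs_bins, us_without_cs, no_cs_bins):
    """Contingency as P(US | CS) - P(US | no CS)."""
    return us_with_cs / cs_bins - us_without_cs / no_cs_bins

CS_BINS = 120      # hypothetical bins during which the tone is on
NO_CS_BINS = 480   # hypothetical bins during which the tone is off

# Contingent group: 12 shocks, all while the tone is on.
contingent = delta_p(12, CS_BINS, 0, NO_CS_BINS)

# Truly random group: the same 12 tone-shock pairings, plus shocks at the
# same rate when the tone is off (48 shocks over 480 bins).
truly_random = delta_p(12, CS_BINS, 48, NO_CS_BINS)

print(f"contingent group:   {contingent:.2f}")
print(f"truly random group: {truly_random:.2f}")
```

Both groups experience exactly 12 pairings, yet only the first has a positive contingency; in the second the tone tells the rat nothing about when a shock will arrive.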

“In sum, the evidence that conditioning is driven by contingency unsettles us because it challenges what we have taken for granted about the level at which basic learning processes operate. Contingency, like number, arises at a more abstract level than the level of individual events. It can only be apprehended by a process that operates at the requisite level of abstraction–the level of information processing. A US is contingent on a CS to the extent that the CS provides information about the timing of the US.” (Ibid p.4)

Gallistel has done many other experiments which confirm Rescorla’s experimental findings. In various blog posts I have defended different types of behaviourism. I have noted that behaviourism is not a blank slate theory, that reinforcement does play a role in our language acquisition, and that Applied Behavioural Analysis (in particular the Picture Exchange Communication System) can be very useful in helping children with autism acquire language. I have also argued that, despite the hype by Chomsky and Pinker, behavioural accounts of language have not been refuted and more research should be done into them. These views of mine have typically been met with rhetoric and sometimes even personal attacks. Typically people don’t respond with reasons; they respond with anger. Not all responses have been so irrational. Linguist David Pereplyotchik has argued that Gallistel’s paper above shows that any appeal to reinforcement alone will only give a partial account of language acquisition.

Firstly, some terminology. Gallistel (2002, 2006, and 2012) presented evidence that classical conditioning relies on subjective computations of partial information. His evidence was largely directed against Pavlovian accounts of learning. Skinner, however, had a different conception of behaviourism: his conception focused on Operant Conditioning as opposed to Pavlov’s focus on Classical Conditioning. Classical conditioning involves placing a neutral stimulus before a reflex, whereas operant conditioning involves placing either punishment, negative reinforcement, or positive reinforcement after a behaviour. Classical conditioning focuses on taking an involuntary response (instinct) and pairing it with a neutral stimulus to form an association; operant conditioning instead focuses on voluntary behaviour and a consequence following that behaviour. In operant conditioning people are rewarded with incentives, while in classical conditioning no such incentives are offered. In therapeutic settings operant conditioning is typically focused on more than classical conditioning, but it is not uncommon for both types of conditioning to be used by an applied behavioural analyst. In tacting, for example, a child will come to associate a sound with a particular state of affairs in the world. This is a type of classical conditioning where the child’s instinctive imitative behaviour (imitating the sounds of his caregivers) becomes associated with particular states of affairs. When the child says ‘mama’ primarily in the presence of ‘mama’ he has become conditioned to associate a piece of instinctive imitation with a state of affairs, and this is classical conditioning. But it is also combined with operant conditioning, as the parents use various types of positive and negative reinforcement to help increase the likelihood that the child will mouth the sound in the right circumstances.

Now I have discussed at length in other blog posts the extent to which reinforcement is necessary to explain things like language acquisition. I have noted that it is one amongst many of the tools necessary for a person to acquire language; it is far from the complete story. Nonetheless, I think that it is important to note that since operant and classical conditioning are combined in many different studies, Rescorla’s findings need to be taken account of in any scientific theory which makes use of reinforcement.

The results indicate that classical conditioning takes place at a much more abstract level than theorists previously believed. It is fair to say that no behaviourists currently believe that all of our behaviour can be explained in terms of classical conditioning. In fact Rescorla (1988) explicitly argues that classical conditioning is just a small part of psychology and should not be viewed as the total story, though he thinks that, since it is relatively easy to gain experimental traction on, it is ideal for integration with neuroscientific research.

Rescorla (1988) explicates some of the same experiments as Gallistel does, and details some other experiments which he thinks are relevant to showing people that they misunderstand the details of Pavlovian conditioning. Rescorla emphasises that Pavlovian conditioning is no longer understood in terms of the simplistic reflex model, and that the experimental results suggest that Pavlovian conditioning involves a greater richness in the relations it represents and in the way the representations influence behaviour. He even goes as far as claiming that modern versions of classical conditioning involve representations which are closer to the British Empiricist tradition in philosophy than to the tradition favoured by reflex theorists. On Rescorla’s view, conditioning involves the learning of relations, and contiguity between US and CS is neither necessary nor sufficient for this process; the important aspect is the information available to the organism. Rescorla details experiments like the one that Gallistel outlined above and further experiments such as the ‘blocking effect’, first demonstrated by Leon Kamin in 1968. In the blocking effect two groups of animals receive a compound stimulus, but they differ in that for one group the prior training of the light makes the tone redundant (ibid p.155). Rescorla notes that the results in the experiment are driven not by contiguity but by the informational relation between the stimuli. These experiments have been repeated by Rescorla, Gallistel et al over the last 50 or so years.

It is important to note that Rescorla explicates his view in terms of the subjective amount of information available to the organism, not just the objective features of the environment. Furthermore, Rescorla talks about the organism using this information to represent its environment. As I understand behaviourism, this talk is not typical. This raises a difficult question: does claiming that classical conditioning (and hence indirectly operant conditioning) makes use of subjective information mean that one is no longer a behaviourist? People like Fodor don’t think such a view can still be called behaviourist:

“The heart of the matter is that association is supposed to be a contiguity-sensitive relation. Thus Hume held that ideas became associated as a function of the temporal contiguity of their tokenings. (Other determinants of associative strength were said to be ‘frequency’ and on some accounts ‘similarity’). Likewise, according to Skinnerian theory, responses become conditioned to stimuli as a function of their temporal contiguity to reinforcers. By contrast, Chomsky argued that the mind is sensitive to relations amongst mental or linguistic associations that may be arbitrarily far apart.” (Fodor ‘Language of Thought 2’ p. 103)

Now it could be argued that since the evidence from Gallistel and Rescorla shows that in classical conditioning contiguity is neither necessary nor sufficient for association, their evidence is closer to Chomsky’s views than Skinner’s, and any view that is closer to Chomsky’s than Skinner’s cannot be considered behaviourist. I am not so sure though. The data used by Rescorla is behavioural data; that the data shows Pavlov’s theory to be too simplistic should be viewed as an improvement on behavioural science, not a refutation of it. Debates like these are commonplace in all sciences. Thus some people in Darwinian theory argue that recent evidence from laws of form, epigenetics, and evo-devo research shows that the neo-Darwinian synthesis has been refuted and that we need a superior theory to capture this new data. It is hard to know what importance should be attached to these debates. Whether we call it ‘Neo-Darwinism’ or something else is irrelevant as long as it deals with the new evidence. Was Darwin refuted by the discovery of genetics and its explanation of heredity in terms of genes, or was his theory merely extended in light of new evidence? Similar questions arise with Rescorla’s discoveries; I prefer to think of them as improvements on behavioural science rather than refutations, just as I think the neo-Darwinian synthesis was an improvement on traditional evolutionary theory rather than a refutation. But the issues are complex and I cannot go into them in any more detail here.

Interestingly, there are aspects of Rescorla’s talk that remind me of Fodor’s Representational Theory of the Mind, except that Rescorla thinks that connectionist models can capture the facts revealed by Pavlovian conditioning. He cites the work of Rumelhart and McClelland (1986) in this connection, noting that:

“Connectionistic theories of this sort bear an obvious resemblance to theories of Pavlovian conditioning. Both view the organism as adjusting its representation to bring it into line with the world, striving to remove any discrepancies. Indeed, it is striking that often such complex models are built on elements that are tied closely to Pavlovian associations. For instance, one of the learning principles most frequently adopted within these models, the so called delta rule, is virtually identical to one popular theory of Pavlovian conditioning, the Rescorla-Wagner Model” (ibid p.158)
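To give a feel for what the delta rule amounts to, here is a minimal sketch of a Rescorla-Wagner style update applied to the blocking design described earlier. It is only an illustration under stated assumptions: the learning rate `alpha`, the asymptote `lam`, the number of trials and the function name are all my own choices, and the model's separate salience parameters are collapsed into a single learning rate.

```python
def rescorla_wagner(trials, alpha=0.3, lam=1.0, strengths=None):
    """Apply delta-rule updates over a sequence of trials.

    trials:    list of (cues_present, us_present) pairs.
    alpha:     single learning rate (an assumption; the full model uses
               separate salience parameters for CS and US).
    lam:       asymptote of associative strength supported by the US.
    strengths: optional starting associative strengths per cue.
    """
    V = dict(strengths or {})
    for cues, us_present in trials:
        # The prediction is the summed strength of all cues present.
        prediction = sum(V.get(cue, 0.0) for cue in cues)
        # Learning is driven by the discrepancy between outcome and prediction.
        error = (lam if us_present else 0.0) - prediction
        for cue in cues:
            V[cue] = V.get(cue, 0.0) + alpha * error
    return V

# Blocking group: the light alone is paired with the US first,
# then the light+tone compound is paired with the US.
pretrained = rescorla_wagner([(("light",), True)] * 30)
blocking = rescorla_wagner([(("light", "tone"), True)] * 30, strengths=pretrained)

# Control group: the light+tone compound is paired with the US from the start.
control = rescorla_wagner([(("light", "tone"), True)] * 30)

print(f"tone strength, blocking group: {blocking['tone']:.3f}")
print(f"tone strength, control group:  {control['tone']:.3f}")
```

In both groups the tone is paired with the US equally often, but in the blocking group the light already predicts the US, so there is almost no discrepancy left for the tone to absorb. That is one way of cashing out the claim that information, not contiguity, drives the learning.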

What is interesting here is that in the same year that the above paper was written, Fodor and Pylyshyn wrote their famous paper attacking the connectionist models of Rumelhart and McClelland. They argued that these connectionist models don’t work for modelling cognition and language because they cannot capture the compositionality of both our language and thought.

Fodor and Pylyshyn emphasise that connectionism is committed to representation and that we need to be careful to set out what our level of explanation is: is it the neural or the cognitive level? They note that since connectionists are dealing with representation they are giving explanations at the cognitive level. According to F and P, classical theories, unlike connectionist theories, are committed to the existence of (1) a language of thought. Classical theories also accept an ABC of assumptions. (A) There is a distinction between structurally atomic and structurally molecular representations. (B) Structurally molecular representations have syntactic constituents that are themselves either structurally molecular or structurally atomic. (C) The semantic content of a molecular representation is a function of the semantic content of its syntactic parts, together with its constituent structure. And classical theories are also committed to (2) the structure sensitivity of our thought processes.
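The classical commitments can be pictured with a toy data structure. The following is only a sketch of the idea, not anything F and P themselves propose: the class names, the predicate-style notation and the `swap` operation are all hypothetical, chosen to show what it means for content to be a function of constituents and for a process to be sensitive to structure.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    """A structurally atomic representation (assumption A)."""
    name: str


@dataclass(frozen=True)
class Molecule:
    """A structurally molecular representation whose constituents are
    themselves atomic or molecular (assumption B)."""
    relation: Atom
    agent: "Rep"
    patient: "Rep"


Rep = Union[Atom, Molecule]


def content(rep: Rep) -> str:
    """Assumption C: content is computed from the content of the parts
    together with the way the parts are arranged."""
    if isinstance(rep, Atom):
        return rep.name
    return f"{content(rep.relation)}({content(rep.agent)}, {content(rep.patient)})"


def swap(rep: Molecule) -> Molecule:
    """A structure-sensitive operation (commitment 2): it is defined over
    constituent roles, whatever happens to fill them."""
    return Molecule(rep.relation, rep.patient, rep.agent)


john, girl, loves = Atom("John"), Atom("the girl"), Atom("loves")
thought = Molecule(loves, john, girl)

print(content(thought))        # loves(John, the girl)
print(content(swap(thought)))  # loves(the girl, John)
```

Any system that can build the first thought out of these parts can automatically build the second, which is the kind of structural relationship F and P claim connectionist representations lack.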

The major difference that Fodor and Pylyshyn see between connectionist architecture and classical architecture is that classical architecture recognises constituent structure, and the semantics of a thought is determined by the semantics of its constituents, whereas connectionist models do not work in that way. Fodor and Pylyshyn argue that there are 4 reasons that researchers don’t always recognise that connectionist architecture cannot handle constituent structure: (1) Failure to understand what arrays of symbols do in classical architecture and what they do in connectionist architecture. (2) Confusion of the question of whether the nodes in connectionist architecture have constituent structure with the question of whether the nodes are neurologically distributed. (3) Failure to distinguish between a representation having semantic and syntactic constituents and a concept being encoded in terms of micro features. (4) The mistaken assumption that since representations in connectionist networks have graph structure they therefore have constituent structure.

“To summarize: Classical and Connectionist theories disagree about the nature of mental representation; for the former, but not for the latter, mental representations characteristically exhibit a combinatorial constituent structure and a combinatorial semantics. Classical and Connectionist theories also disagree about the nature of mental processes; for the former, but not for the latter, mental processes are characteristically sensitive to the combinatorial structure of the representations on which they operate” (ibid p. 21)

Part 3 of their argument centres on explicating why connectionist architectures cannot handle the productivity, systematicity, compositionality, and inferential coherence of thoughts.

  • Productivity of Thought: We can combine finite atoms of thought (concepts) in productive ways to produce a potentially infinite number of expressions. This capacity is beyond all connectionist architecture. (Some connectionist models get around this fact by denying that our thoughts are potentially infinite.) Note, however, that Elman has created connectionist recursive models.
  • Systematicity of cognitive representation: Even if you deny that cognitive capacities are productive you cannot deny that they are systematic. You won’t find a person who can think the thought that ‘John loves the girl’ but who cannot think the thought ‘The girl loves John.’ The reason the thoughts are connected is that there must be a structural connection between them: the two sentences are made of the same parts. Based on this they argue that mental representations have constituent structure, that we therefore have a language of thought, and that these mental representations cannot be captured by connectionist models.
  • Compositionality: They connect this with systematicity, and note that the way that sentences are systematic is not arbitrary from a semantic point of view. A lexical item must make approximately the same semantic contribution to each sentence in which it occurs (I am not sure this works for metaphor, idioms, etc).

“So, here’s the argument so far: you need to assume some degree of compositionality of English sentences to account for the fact that systematically related sentences are always semantically related; and to account for certain regular parallelisms between the syntactical structure of sentences and their entailments. So, beyond any serious doubt, the sentences of English must be compositional to some serious extent. But the principle of compositionality governs the semantic relations between words and the expressions of which they are constituents. So compositionality implies that (some) expressions have constituents. So compositionality argues for (specifically, presupposes) syntactic/semantic structure in sentences.” (ibid p. 30)

They argue that some connectionists actually try to deny compositionality to get out of this argument. Such connectionists seem to take idioms like ‘kick the bucket’ as their model for natural language.

  • Systematicity of Inference: The syntax of mental representations mediates between their semantic properties and their causal role in mental processes. Classical architecture can handle this fact but connectionist models cannot.

Summary of the Argument:

“What’s deeply wrong with connectionist architecture is this: Because it acknowledges neither syntactic nor semantic structure in mental representations, it perforce treats them not as a generated set but as a list. But lists, qua lists, have no structure; any collection of items is a possible list. And, correspondingly, on Connectionist principles, any collection of (causally connected) representational states is a possible mind. So, as far as Connectionist architecture is concerned, there is nothing to prevent minds that are arbitrarily unsystematic. But that result is preposterous. Cognitive capacities come in structurally related clusters; their systematicity is pervasive. All the evidence suggests that punctate minds can’t happen. This argument seemed conclusive against the Connectionism of Hebb, Osgood and Hull twenty or thirty years ago. So far as we can tell, nothing of any importance has happened to change the situation in the meantime.” (ibid p. 34)

David Chalmers, in his 1990 paper ‘Why Fodor and Pylyshyn Were Wrong: The Simplest Refutation’, argued that F and P underestimate the differences between localist and distributed representation. Chalmers notes that distributed representations can be used to support structure-sensitive operations in a very different manner from the classical approach. He also notes that the refutation of F and P can be stated in one sentence: if F and P’s argument is correct as it is presented, then it implies that no connectionist network can support a compositional semantics, not even a connectionist implementation of a Turing Machine or of a Language of Thought.

It is unclear that Chalmers’ reply to F and P really works. Philosopher Simon Mc Waters has argued persuasively that the model that Chalmers uses to show that connectionist models can capture systematicity and compositionality is not sufficient to refute F and P. He argues that Chalmers’ model relies on a prior structuring of the symbols used in the model, so the connectionist model doesn’t really show that it operates on the symbols in a direct and holistic manner as Chalmers claimed. Debates are ongoing on this issue, and despite the open-ended nature of the debate connectionist models have flourished in the 27 years since F and P wrote their critique. I am not sure if a model has been developed that can deal with all of F and P’s criticisms, but I plan to return to the issue in my next blog.

It is unclear whether Fodor and Pylyshyn (1988) would affect Rescorla’s views on Pavlovian conditioning, because he thinks that Pavlovian conditioning plays only a small role in cognition, and he doesn’t tell us what his views on things like operant conditioning are, or what he thinks is the nature of the computational procedures which make learning of all kinds possible. Furthermore, as Paul Churchland has argued in ‘Plato’s Camera’ (2012), Rumelhart and McClelland (1986) is ancient history and more modern connectionist models can deal with the difficulties posed by Fodor and Pylyshyn (I don’t know whether Churchland is correct on this point; I am currently researching the issue). I am not sure where Rescorla would stand on these issues; presumably he thinks it is his job to do the experiments and other theorists’ job to construct mathematical and artificial models which can accommodate the various experimental data as they come in. Personally, my hunch is that Bayesian models will be more successful than connectionist ones in helping us accommodate the findings of Rescorla et al. Overall, though, I don’t think that Rescorla’s studies refute reinforcement theory; they merely show that the story is richer than previously believed.

[1] Henceforth Unconditioned Stimulus is referred to as US and Conditioned Stimulus is referred to as CS.


One thought on “What is wrong with Reinforcement?”

  1. Bruce Caithness (@cthnss)

    Karl Popper anticipated such a rejection of association interpretations of training.

    “All this showed me the priority of the study of logic over the study of subjective thought processes. And it made me highly suspicious of many of the psychological theories accepted at the time. For example, I came to realize that the theory of conditioned reflex was mistaken. There is no such thing as a conditioned reflex. Pavlov’s dogs have to be interpreted as searching for invariants in the field of food acquisition (a field that is essentially “plastic”, or in other words open to exploration by trial and error) and as fabricating expectations, or anticipations, of impending events. One might call this “conditioning”; but it is not a reflex formed as a result of the learning process, it is a discovery (perhaps a mistaken one) of what to anticipate. Thus even the apparently empirical results of Pavlov, and the Reflexology of Bechterev, and most of the results of modern learning theory, turned out, in this light, to misinterpret their findings under the influence of Aristotle’s logic; for reflexology and the theory of conditioning were merely association psychology translated into neurological terms.” Pages 77-78 “Unended Quest” 1992 edition

    Firestone and McElroy in “Key Issues in the New Knowledge Management” (2003) interpret Popper’s views of knowledge. Knowledge, paraphrased, can be regarded as a subset of information that has survived trials, without ever being proven, and which has the propensity to assist living entities to adapt. This excludes neither genetic nor synaptic expectations, nor belief in minds, nor linguistic collections in libraries and computers, nor even knowledge implicit in our technology.

    Even Pavlov’s reflexes are not the result of passive reaction to external stimuli but are conditional not only on the type and intensity of the stimulus but also on the propensity (knowledge) of the organism to so respond. The organism must have the appropriate searchlight capacity.

