leolca's blog: 2013

Saturday, March 23, 2013

20 steps to unlock and install Cyanogenmod on Xperia Mini Pro

(http://wiki.cyanogenmod.org/w/Install_CM_for_mango)

1. Download the Android SDK:

http://developer.android.com/sdk/index.html

Extract it somewhere convenient, for example in /tmp/.

2. Open the Phone application on the Xperia Mini Pro and enter *#06# to obtain the device's IMEI. Save this for later use.

3. Put the device into fastboot mode:

a) Turn off your Sony device

b) Press and hold the Volume Up button while plugging in the micro USB cable, which should already be connected to the PC.

c) The blue LED should light up.

d) You are now in fastboot mode.

4. cd /tmp/adt-bundle-linux-x86_64-20130219/sdk/platform-tools

5. sudo ./fastboot -i 0x0fce getvar version

version: 0.3

finished. total time: 0.001s

6. Go to http://unlockbootloader.sonyericsson.com/instructions to request the bootloader unlock key:

a) Click the 'continue' button at the bottom of the page.

b) Agree to the 'Are You Sure' and 'Legal Terms' prompts to continue.

c) Enter the first 14 digits of your IMEI.

d) You will receive your bootloader unlock key by email.

7. In the PC's terminal, enter the following command:

sudo ./fastboot -i 0x0fce oem unlock 0xKEY

where KEY corresponds to the unlock code you were given.

$ sudo ./fastboot -i 0x0fce oem unlock 0xKEYKEYKEY

...

(bootloader) Unlock phone requested

(bootloader) Erasing block 0x00001300

(bootloader) Erasing block 0x00001400

(bootloader) Erasing block 0x00001500

(bootloader) Erasing block 0x00001600

(bootloader) Erasing block 0x00001700

(bootloader) Erasing block 0x00001800

(bootloader) Erasing block 0x00001900

(bootloader) Erasing block 0x00001a00

(bootloader) Erasing block 0x00001b00

(bootloader) Erasing block 0x00001c00

(bootloader) Erasing block 0x00001d00

(bootloader) Erasing block 0x00001e00

(bootloader) Erasing block 0x00001f00

OKAY [ 10.465s]

finished. total time: 10.465s

8. The Xperia Mini Pro's bootloader should now be unlocked.

9. Download Google Apps

http://goo.im/gapps

10. Place the CyanogenMod ROM .zip file in the root of the SD card, along with the supplemental Google Apps package.

11. Extract boot.img from the CyanogenMod zip; you will need this file for fastboot.

12. Put the phone into fastboot mode again.

13. Open a terminal and enter the following:

sudo ./fastboot -i 0x0fce flash boot boot.img

sudo ./fastboot -i 0x0fce reboot

While the device reboots, press the Volume rockers a few times to load recovery.

14. Select backup and restore to create a backup of the current installation on the Xperia Mini Pro.

15. Select the option to wipe data/factory reset.

16. Select Install zip from sdcard.

17. Select Choose zip from sdcard.

18. Select the CyanogenMod file you placed on the sdcard.

19. Install the Google Apps using the same method.

20. Once the installation has finished, return to the main menu and select the reboot system now option. The Xperia Mini Pro should now boot into CyanogenMod.
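
Two of the steps above lend themselves to a small script: step 6c (the unlock form wants only the first 14 digits of the IMEI) and step 11 (pulling boot.img out of the ROM zip). What follows is a minimal Python sketch, not part of the original procedure. It assumes the IMEI shown by *#06# is the usual 15-digit form whose last digit is a Luhn check digit, and that boot.img sits at the top level of the zip. The function names and the sample IMEI are illustrative only.

```python
import zipfile
from pathlib import Path


def luhn_valid(number):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def imei_for_unlock(imei):
    """Validate a 15-digit IMEI and return its first 14 digits (step 6c)."""
    digits = "".join(ch for ch in imei if ch.isdigit())
    if len(digits) != 15:
        raise ValueError("expected a 15-digit IMEI")
    if not luhn_valid(digits):
        raise ValueError("check digit does not match; re-read the IMEI")
    return digits[:14]


def extract_boot_img(rom_zip, dest_dir="."):
    """Extract boot.img from the ROM zip into dest_dir (step 11)."""
    with zipfile.ZipFile(rom_zip) as zf:
        # Assumption: boot.img is stored at the top level of the archive.
        if "boot.img" not in zf.namelist():
            raise FileNotFoundError("boot.img not found in " + str(rom_zip))
        return Path(zf.extract("boot.img", path=dest_dir))


# Sample IMEI commonly used in documentation, not a real device.
print(imei_for_unlock("49-015420-323751-8"))  # 49015420323751
```

Checking the IMEI before submitting it catches a mistyped digit early, since a wrong IMEI would presumably yield a key that does not match the phone.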

Tuesday, March 19, 2013

Percentage of hapax legomena in English

Computing the percentage of hapax legomena across texts in the Project Gutenberg collection.

Below is a Python script that obtains the number of hapax legomena and the total lexicon size for 1000 randomly chosen books from Project Gutenberg. The printed result is the fraction of the lexicon made up of hapax legomena in each text.


#!/usr/bin/env python3

import os
import random

# range of Project Gutenberg e-book numbers to sample from
numMinGuttenberg = 10001
numMaxGuttenberg = 42370
numRand = 1000

ftpurl = "ftp://ftp.ibiblio.org/pub/docs/books/gutenberg/"

for x in range(numRand):
   rndint = random.randint(numMinGuttenberg, numMaxGuttenberg)
   try:
      n = str(rndint)
      # book 12345 is stored under 1/2/3/4/12345/12345.txt
      txturl = ftpurl + '/'.join(n[:4]) + '/' + n + '/' + n + '.txt'
      os.system('wget -nv -q -U firefox -O /tmp/txt ' + txturl)
      os.system('./wordcount.sh /tmp/txt > /tmp/wcount')
      # a: words that occur exactly once; b: number of distinct words
      a = os.popen("grep -c ': 1$' /tmp/wcount").read()
      b = os.popen("sed -n '$=' /tmp/wcount").read()
      # float(b) raises ValueError when the download failed and /tmp/wcount is empty
      print(float(a) / float(b))
   except Exception as e:
      print(e)
      continue

The script above uses a bash script called wordcount.sh:

#!/bin/bash
tr 'A-Z' 'a-z' < "$1" | tr -sc 'A-Za-z' '\n' | sort | uniq -c | sort -n -r | sed 's/[[:space:]]*\([0-9]*\) \([a-z]*\)/\2 : \1/'
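
The same measure can also be computed without shell tools. The sketch below is an illustration rather than the script used for the results in this post: it roughly mirrors wordcount.sh by lowercasing the text and treating every run of the letters a-z as a word, then divides the number of words seen exactly once by the number of distinct words. The function name hapax_ratio is made up for this example.

```python
import re
from collections import Counter


def hapax_ratio(text):
    """Fraction of distinct words that occur exactly once in text."""
    # Same tokenization idea as wordcount.sh: lowercase, keep runs of a-z.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    if not counts:
        raise ValueError("no words found")
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / len(counts)


# "the" occurs twice; "cat", "and", "dog" occur once: 3 of 4 distinct words.
print(hapax_ratio("the cat and the dog"))  # 0.75
```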

Run the script above and save the output to a text file, remove the lines produced by errors while retrieving a text, and finally compute the average.


./hapaxlegomenon.py > hapaxlegomenon_results.txt

# remove error lines (e.g. "could not convert string to float")
sed -i '/could/d' hapaxlegomenon_results.txt

# compute average, min and max values
awk '{if(min==""){min=max=$1}; if($1>max) {max=$1}; if($1< min) {min=$1}; total+=$1; count+=1} END {print total/count, min, max}' hapaxlegomenon_results.txt

Results (from 788 texts):
min = 0.37550 max = 0.69534 avg = 0.54535 std = 0.045773
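
The awk one-liner above reports the average, minimum and maximum but not the standard deviation. A hedged Python sketch that produces all four from the same results file could look like the following. It skips any line that is not a number, and it uses the sample standard deviation, which is an assumption since the post does not say which estimator produced the std value above. The function name summarize is illustrative.

```python
import statistics


def summarize(lines):
    """Return (mean, min, max, sample std) of the numeric lines."""
    values = []
    for line in lines:
        try:
            values.append(float(line))
        except ValueError:
            continue  # e.g. an error message left in the results file
    return (
        statistics.mean(values),
        min(values),
        max(values),
        statistics.stdev(values),  # assumption: sample, not population, std
    )


# Usage with the file written by the script (path as used in this post):
# with open("hapaxlegomenon_results.txt") as fh:
#     print(summarize(fh))
```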

Intuitively, we expect a lower percentage of hapax legomena in the lexicon of less formal texts. To test this, we computed the percentage over 18828 Usenet newsgroup messages. The percentage of hapax legomena found in the lexicon was 0.49674. The code used follows below.

#!/bin/bash
wget http://qwone.com/~jason/20Newsgroups/20news-18828.tar.gz
tar -C /tmp/ -xvzf 20news-18828.tar.gz
for file in $(find /tmp/20news-18828/ -type f ); do cat $file >> /tmp/20news-18828.txt; done
./wordcount.sh /tmp/20news-18828.txt > /tmp/20news-18828count.txt
# number of hapax legomena
grep -c ': 1$' /tmp/20news-18828count.txt
# total number of lexical entries
sed -n '$=' /tmp/20news-18828count.txt
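
For a corpus spread over many files, the counts can also be accumulated directly while walking the directory tree, without first concatenating everything into one file. This is only a sketch under assumptions: files are decoded as latin-1 so that no byte sequence raises a decoding error, words are runs of a-z after lowercasing, and the function name is hypothetical.

```python
import os
import re
from collections import Counter


def corpus_hapax_counts(root):
    """Return (number of hapax legomena, lexicon size) for all files under root."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Assumption: latin-1 maps every byte to a character, so odd
            # bytes in old messages do not abort the run.
            with open(path, encoding="latin-1") as fh:
                for line in fh:
                    counts.update(re.findall(r"[a-z]+", line.lower()))
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes, len(counts)


# Usage, with the directory extracted by the script above:
# hapaxes, lexicon = corpus_hapax_counts("/tmp/20news-18828/")
# print(hapaxes / lexicon)
```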

A less formal data set, closer to natural spoken language, should yield an even lower percentage of hapax legomena in the lexicon. To test this we used IRC logs. Some are archived as a record of communications about major historical events; logs were made during the Gulf War and the Oklahoma City bombing, for example. These and other events are kept in the ibiblio archive. The script below was used, and the surprising result is that the percentage found was 0.45714, which is not the large drop one would expect.

wget -r http://www.ibiblio.org/pub/academic/communications/logs/
rm /tmp/irc.txt
for file in $( ./findbymime.sh /tmp/irc/ "application/octet-stream" ); do cat $file >> /tmp/irc.txt; done
for file in $( ./findbymime.sh /tmp/irc/ "text/plain" ); do cat $file >> /tmp/irc.txt; done
./wordcount.sh /tmp/irc.txt > /tmp/irccount.txt
grep -c ': 1$' /tmp/irccount.txt
sed -n '$=' /tmp/irccount.txt

Monday, March 4, 2013

Word Frequency and Context of Use in the Lexical Diffusion of Phonetically Conditioned Sound Change

Joan Bybee

Lexical diffusion refers to the way that a sound change affects the lexicon. If sound change is lexically abrupt, all the words of a language are affected by the sound change at the same rate. If a sound change is lexically gradual, individual words undergo the change at different rates or different times. (...) One early contribution to this debate by Schuchardt (1885) is the observation that high-frequency words are affected by sound change earlier and to a greater extent than low-frequency words. (...) phonetically conditioned changes that affect high-frequency words before low-frequency words are best accounted for in an exemplar model of phonological representation that allows for change to be both phonetically and lexically gradual. (...) a word’s contexts of use also affect the rate of change. Words that occur more often in the context for change, change more rapidly than those that occur less often in that context. (...) sound changes can also progress more rapidly in high-frequency morphemes. (...) the contexts of use determine the rate at which a word or morpheme undergoes a sound change.

1. Regular sound change or lexical diffusion?

The hypothesis that sound change is lexically regular seems well supported by the facts of change. When we observe that two languages or dialects exhibit a phonological difference, it is very likely that this difference is regular across all the words that have the appropriate phonetic environment. This observation is fundamental to the comparative method; the establishment of genetic relations and the reconstruction of protolanguages are based on the premise that sound change affects all words equally. Schuchardt (1885) was one of the detractors from this position. When he observed sound change in progress, he noted that all words did not change at the same rate and that the differences were not due to “dialect mixture,” as was often claimed by the Neogrammarians, who supported the regularity position.

Labov (1981, 1994) proposed two types of sound change: “regular sound change” is gradual, phonetically motivated, without lexical or grammatical conditioning, and not influenced by social awareness, whereas “lexical diffusion” change, such as the phenomena studied by Wang, is “the result of the abrupt substitution of one phoneme for another in words that contain that phoneme” (Labov 1994:542). According to Labov, this type of change occurs most often “in the late stages of internal change that has been differentiated by lexical and grammatical conditioning” (ibid.). Labov went so far as to propose that certain changes, such as the deletion of glides and schwa, would be regular changes, while the deletion of obstruents would show lexical diffusion.

(...) even gradual, phonetically conditioned change exhibits gradual lexical diffusion (...)

Hooper (1976) identified a lexical diffusion paradox. Reductive sound change tends to affect high-frequency words before low-frequency words, but analogical leveling or regularization tends to affect low-frequency words before high-frequency words.

2. Frequency effects on regular sound change

Sound changes that are complete can be identified as regular or not, depending upon whether they affected all lexical items existing at the time of the change. Ongoing changes cannot be designated as regular or not, since they are not complete. However, one can reference the typical characteristics of a change to project whether it will be regular or not. That is, a phonetically gradual change with clear phonetic conditioning falls into Labov’s first type, and thus we can project its regularity.

2.1. American English t/d deletion

Consider the deletion of final /t/ and /d/ in American English, which occurs most commonly in words ending in a consonant plus /t/ or /d/, such as just, perfect, child, or grand. This much-studied variable process has been shown to be affected by the preceding and following consonant, with more deletion in a consonant environment; by grammatical status, with less deletion if the /t/ or /d/ is the regular past tense; and by social and age factors, with more deletion among younger, lower socioeconomic class speakers (Labov 1972; Neu 1980).

(...) I found that deletion occurred more in high-frequency words. (...)

3. Changes that affect low-frequency words first

As previously mentioned, Hooper (1976) noted a lexical diffusion paradox: sound change seems to affect high-frequency words first, but analogical change affects low-frequency words first. The first tendency has already been documented. The second tendency is evident in the fact that low-frequency verbs, such as weep/wept, leap/leapt, creep/crept, are regularizing, while high-frequency verbs with the same pattern show no such tendency: that is, keep/kept, sleep/slept, leave/left show no evidence of regularizing. Hooper (1976) argued that changes affecting high-frequency words first have their source in the automation of production, whereas changes affecting low-frequency words first are due to imperfect learning. In the latter category are changes that affect words that do not conform to the general patterns of the language. Such exceptional words can be learned and maintained in their exceptional form if they are of high frequency in the input and in general use. However, if their frequency of use is low, they may not be sufficiently available in experience to be acquired and entrenched. Thus they may be subject to changes based on the general patterns of the language.

4. Modeling phonetic and lexical gradualness

The view of lexical diffusion espoused by both Wang and Labov assumes that a change that diffuses gradually through the lexicon must be phonetically abrupt. This is a necessary assumption if one is to accept a synchronic phonological theory that has phonemic underlying representations. Words can change one by one only if the change is a substitution of phonemes in such a theory. The discovery that sound change can be both phonetically gradual and lexically gradual forces a different view of the mental representation of the phonology of words (Hooper 1981; Bybee 2000b). If subphonemic detail or ranges of variation can be associated with particular words, an accurate model of phonological representation must allow phonetic detail in the cognitive representation of words. A recent proposal is that the cognitive representation of a word can be made up of the set of exemplars that have been experienced by the speaker/hearer. Thus all phonetic variants of a word are stored in memory and organized into a cluster: exemplars that are more similar are closer to one another than to ones that are dissimilar, and exemplars that occur frequently are stronger than less frequent ones (Johnson 1997; Bybee 2000a, 2001; Pierrehumbert 2001). These exemplar clusters, which represent autonomous words, change as experience with language changes. Repeated exemplars within the cluster grow stronger, and less frequently used ones may fade over time, as other memories do.

Changes in the phonetic range of the exemplar cluster may also take place as language is used and new tokens of words are experienced. Thus the range of phonetic variation of a word can gradually change over time, allowing a phonetically gradual sound change to affect different words at different rates. Given a tendency for reduction during production, the phonetic representation of a word will gradually accrue more exemplars that are reduced, and these exemplars will become more likely to be chosen for production, where they may undergo further reduction, gradually moving the words of the language in a consistent direction. The more frequent words will have more chances to undergo online reduction and thus will change more rapidly. The more predictable words (which are usually also the more frequent ones) will have a greater chance of having their reduced version chosen, given the context, and thus will advance the reductive change more rapidly.
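
The mechanism described in this paragraph can be illustrated with a toy simulation. It is a minimal sketch under strong assumptions and not Bybee's own model: each exemplar is reduced to a single number (its degree of reduction), every production copies a stored exemplar and adds a fixed reduction bias plus Gaussian noise, and each word remembers only its most recent exemplars, a crude stand-in for older memories fading. All names and parameter values are arbitrary.

```python
import random
from collections import deque


def simulate_reduction(frequencies, n_tokens=10000, bias=0.01,
                       noise=0.005, memory=20, seed=0):
    """Return the mean degree of reduction of each word's exemplar cloud."""
    rng = random.Random(seed)
    words = list(frequencies)
    weights = [frequencies[w] for w in words]
    # Every word starts with one unreduced exemplar (value 0.0).
    clouds = {w: deque([0.0], maxlen=memory) for w in words}
    for _ in range(n_tokens):
        # More frequent words are produced more often.
        word = rng.choices(words, weights=weights)[0]
        cloud = clouds[word]
        # Choose a stored exemplar as the production target ...
        target = cloud[rng.randrange(len(cloud))]
        # ... reduce it slightly during production, and store the result.
        cloud.append(target + bias + rng.gauss(0.0, noise))
    return {w: sum(c) / len(c) for w, c in clouds.items()}


result = simulate_reduction({"high": 0.9, "low": 0.1})
print(result)
```

In this setup the high-frequency word ends up with a more reduced cloud simply because it passes through the production loop more often, which is the frequency effect the text describes.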

The exemplar clusters are embedded in a network of associations among words that map relations of similarity at all levels. Distinct words with similar phonetic properties are associated, as are words with shared semantic features. I have shown in (Bybee 1985, 1988) that morphemes and morphological relations in such a network emerge from parallel phonetic and semantic associations and that schemas or abstractions over relations of similarity can be formulated to account for the regularities and patterns evident in language use.

(...)

An important property of the exemplar model is the emphasis on words as storage units. Various models have proposed that even multi-morphemic words have lexical listing. Vennemann (1974) argued that appropriate constraints on syllable structure can only be applied to whole words, not to morphemes. The common objection to this proposal made in the 1970s was that the human brain does not have sufficient storage capacity for all the words of a language, especially a language with large morphological paradigms. This argument has now been dismissed with the discovery of the huge amount of detail that the brain is capable of recording. Moreover, newer conceptions of the lexicon, not as a list but as a network with tight interconnections, provide the insight that listing two related words, such as start, started, does not take up as much cognitive space as listing two unrelated words, such as start, flower (Bybee 1985). Thus connectionist models (Rumelhart and McClelland 1986) and analogical models (Skousen 1989, 1992; Eddington 2000) have storage of whole words with morphological relations emergent from the categorization involved in storage. In addition, the lexical diffusion data provide evidence that multi-morphemic words can have lexical storage. As we saw in table 11.3, high-frequency regular past-tense English verbs are more likely to have their final /t/ or /d/ deleted than are low-frequency regular verbs. In order for a frequency effect to accrue to a word, that word must exist in memory storage. Since multi-morphemic words evince frequency effects, they must be stored in the lexicon. (...)

5. The effect of frequency of use in context

Given that the exemplar model tracks tokens of use and the exemplar cluster changes according to the phonetic shape of these tokens, it follows that if the context of use affects the phonetic shape of a word, the exemplar cluster will change accordingly. The effect of context can be best exemplified in changes that take place around word or morpheme boundaries, where the segment affected by the change is sometimes in the context for the change and sometimes not. Timberlake (1978) called this an alternating environment. Since the exemplar model registers phonetic tokens, the probabilities inherent in the alternating environment affect the shape of the exemplar cluster. (...) The exemplar cluster, then, appears to reorganize itself, with the stronger exemplars being more frequently chosen for use than the less frequent ones despite the context.

Thus along with the general measure of frequency of use, the relative frequency of the immediate linguistic context of use can also affect the lexical diffusion of a sound change. Even holding frequency constant, a word that occurs more often in the right context for a change will undergo the change more rapidly than a word that occurs less often in the conditioning context.
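
The context effect can be sketched in the same toy fashion. The assumptions here are mine and purely illustrative: two words are produced equally often, each token falls in the conditioning context with a word-specific probability, reduction is applied only in that context, and the reduced exemplars are stored in the same cloud that is later sampled regardless of context. Names and parameter values are arbitrary.

```python
import random
from collections import deque


def simulate_context_effect(context_rates, tokens_per_word=5000, bias=0.01,
                            noise=0.005, memory=20, seed=0):
    """Return the mean degree of reduction per word, frequency held constant."""
    rng = random.Random(seed)
    result = {}
    for word, p_context in context_rates.items():
        cloud = deque([0.0], maxlen=memory)
        for _ in range(tokens_per_word):
            # The target exemplar is chosen regardless of the current context.
            target = cloud[rng.randrange(len(cloud))]
            # Reduction applies only when the token is in the conditioning context.
            step = bias if rng.random() < p_context else 0.0
            cloud.append(target + step + rng.gauss(0.0, noise))
        result[word] = sum(cloud) / len(cloud)
    return result


print(simulate_context_effect({"often_in_context": 0.8, "rarely_in_context": 0.2}))
```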

(...)

8. Consequences for a usage-based theory

The study of the diffusion of sound change in the lexicon contributes to a better understanding of the nature and causes of sound change. Changes that affect high-frequency words first are a result of the automation of production, the normal overlap and reduction of articulatory gestures that comes with fluency (Browman and Goldstein 1992; Mowrey and Pagliuca 1995). The strong directionality of such changes indicates that they are not the result of random variation, but that they stem from reduction processes resulting from repetition and the normal automation of motor activity. If a sound change does not proceed from the most frequent to the least frequent words, then we should seek its explanation in some other mechanisms of change.

Moreover, I have proposed a model in which variation and change are not external to the lexicon and grammar but inherent to it (Pierrehumbert 1994). Sound change is not rule addition—something that happens at a superficial level without any effect on the deeper reaches of grammar. Rather, lexical representations are affected from the very beginnings of the change. Indeed, they supply an ongoing record of the change since they track the details of the phonetic tokens experienced. Further evidence for sound change having an immediate impact on representation is the fact that sound changes are never reversed or undone (Cole and Hualde 1998; Bybee 2001). The morphological structure of words also plays a role from the initial stages of a change, but less because morphemes have some special status with respect to change and more because of the contexts in which they appear. Alternating contexts retard change, while uniform ones allow change to hurry ahead.

Effects of frequency and context demonstrate that language use has an effect on mental representations. In this view, representations and the grammatical structure that emerges from them are based on experience with language. New linguistic experiences are categorized in terms of already-stored representations, adding to the exemplar clusters already present and, at times, changing them gradually. Various levels of abstraction emerge as exemplars are categorized by phonological and semantic similarity—morphemes, words, phrases, and constructions can all be seen as the result of the categorization of linguistic experiences.

Complex Adaptive Systems and the Origins of Adaptive Structure: What Experiments Can Tell Us

Language is a product of both biological and cultural evolution. Clues to the origins of key structural properties of language can be found in the process of cultural transmission between learners. Recent experiments have shown that iterated learning by human participants in the laboratory transforms an initially unstructured artificial language into one containing regularities that make the system more learnable and stable over time. Here, we explore the process of iterated learning in more detail by demonstrating exactly how one type of structure—compositionality—emerges over the course of these experiments. We introduce a method to precisely quantify the increasing ability of a language to systematically encode associations between individual components of meanings and signals over time and we examine how the system as a whole evolves to avoid ambiguity in these associations and generate adaptive structure.

(...)

the very fact that language persists through multiple repeated instances of usage can explain the origins of key structural properties that are universally present in language. Because of this, taking a complex adaptive systems perspective on language lifts the burden of explanation for these properties from a putative richly structured domain-specific substrate, of the sort assumed by much of generative linguistics (e.g., Chomsky, 1965).

Much of the work over the past 20 years or so in modeling the evolution of language has taken this complex adaptive systems perspective (see, e.g., Brighton, Smith, & Kirby, 2005; Kirby, 2002b; Steels, 2003, for review). One particular strand of work has focused on the adaptation of language through a repeated cycle of learning and use within and across generations, where adaptation is taken to mean a process of optimization or fitting of the structure of language to the mechanisms of transmission (Kirby, 1999).

(...) One of the ways in which a language can evolve to become more learnable is by becoming structured.

"alien" language... chain learning

First, by looking at the learning errors made between adjacent generations, it was shown that the languages in both conditions were being acquired significantly more faithfully toward the end of the chains than they were at the beginning. Second, this increase in learnability over time occurred as a result of the languages becoming more structured over time.

Summary

Kirby et al. (2008) found that the languages that emerge through a repeated cycle of learning and production in a laboratory setting show evidence of adaptation to the bottleneck placed on their transmission. Making even minor changes to the way in which language is culturally transmitted can produce radically different types of structures. Given only a bottleneck on transmission preventing a proportion of the language from being seen by the next generation, language can adapt in such a way that ensures that it is stably transmitted to future generations. However, this occurs at the expense of being able to uniquely refer to every meaning. When they introduced the additional pressure of having to use a unique signal for each meaning, the language once again adapted to cope with these new transmission constraints, this time by becoming compositional. Having a compositional system ensures that both signals and meanings survive the bottleneck.

Because the participants could not know which condition they were in, it is impossible that the resulting languages were intentionally designed as adaptive solutions to the transmission bottleneck. Rather, the best explanation for the result is that in these experiments, just as in the computational models, linguistic adaptation is an inevitable consequence of the transmission of linguistic variants under particular constraints on replication. The result is apparent design, but without an intentional designer.

(...)

It seems clear from all of this that, first, cultural transmission alone is capable of explaining the emergence of languages that exhibit that appearance of design and, second, experimental studies of the iterated learning of artificial languages are a potentially useful methodological tool for those interested in studying cultural evolution.

Conclusion

This article has extended previous work on iterated language learning experiments by showing, using data obtained from an earlier study, exactly how compositional structure emerges over time as a result of cultural transmission. Using a recently developed analytical technique that calculates the regularity of mapping between signal and meaning elements (Tamariz & Smith, 2008), we were able to precisely quantify changes in the language’s ability to systematically encode such associations between meaning and signal components. From this we were able to explain the amplification effect the bottleneck seems to have on systematicity in language, arguing that the sampling of smaller subsets of the language for training input to the next generation tends to make weaker patterns that are not visible at the level of the entire language appear stronger locally.

Evolution of Brain and Language

Thomas Schoenemann

The evolution of language and the evolution of the brain are tightly interlinked. Language evolution represents a special kind of adaptation, in part because language is a complex behavior (as opposed to a physical feature) but also because changes are adaptive only to the extent that they increase either one’s understanding of others, or one’s understanding to others. Evolutionary changes in the human brain that are thought to be relevant to language are reviewed. The extent to which these changes are a cause or consequence of language evolution is a good question, but it is argued that the process may best be viewed as a complex adaptive system, in which cultural learning interacts with biology iteratively over time to produce language.

A full accounting of the evolution of language requires an understanding of the brain changes that made it possible. Although our closest relatives, the apes, have the ability to learn at least some critical aspects of language (Parker & Gibson, 1990), they never learn language as completely or as effortlessly as do human children. This means that there must be some important differences between the brains of human and nonhuman apes. A fair amount is known about the ways in which human brains differ from the other apes, and we know much about specific functions of different parts of the brain. These two fields of study, combined with an understanding of general evolutionary processes, allow us to draw at least the broad outlines of the evolutionary history of brain and language.

There is a complex interplay between language evolution and brain evolution. The existence of language presupposes a brain that allows it. Languages must, by definition, be learnable by the brains of children in each generation. Thus, language change (a form of cultural evolution) is constrained by the existing abilities of brains in each generation. However, because language is critical to an individual’s adaptive fitness, language also likely had a fundamental influence on brain evolution. Humans are particularly socially interactive creatures, which makes communication central to our existence. Two interrelated evolutionary processes therefore occurred simultaneously: Language adapted to the human brain (cultural evolution), while the human brain adapted to better subserve language (biological evolution). This coevolutionary process resulted in language and brain evolving to suit each other (Christiansen, 1994; Christiansen & Chater, 2008; Deacon, 1992).

The coevolution of language and brain can be understood as the result of a complex adaptive system. Complex adaptive systems are characterized by interacting sets of agents (which can be individuals, neurons, etc.), where each agent behaves in an individually adaptive way to local conditions, often following very simple rules. The sum total of these interactions nevertheless leads to various kinds of emergent, systemwide orders. Biological evolution is a prime example of a complex adaptive system: Individuals within a species (a “system”) act as best they can in their environment to survive, leading through differential reproduction ultimately to genetic changes that increase the overall fitness of the species. In fact, “evolution” can be understood as the name we give to the emergent results of complex adaptive systems over time. One can also view the brain itself as a complex adaptive system. This is because brain circuits are not independent of each other. Processing in one area affects processing in connected areas; therefore, processing changes in one area—whether due to biological evolution or learning—influence (and select for over evolutionary time) changes in other areas.

A number of neural systems relevant specifically to language interact with and influence each other in important ways. Syntax depends fundamentally on the structure of semantics, because the function of syntax is to code higher level semantic information (e.g., who did what to whom). Semantics in turn depends on the structure of conceptual understanding, which—as will be reviewed later—is a function of brain structure. These structures are in turn the result of biological adaptation: Circuits that result in conceptual understanding that is relevant and useful to a given individual’s (ever-changing) environmental realities will be selected for and will spread over evolutionary time.

(...)

Therefore, language evolution itself will be strongly constrained by pre-existing cognitive abilities within each generation. Changes affecting the perception of linguistically relevant signals would have been favored only to the extent that they increase the individual’s ability to perceive and rapidly process the acoustic signals already used by others for language. Changes affecting the production of linguistically relevant signals would be favored only to the extent that they could be understood by the preexisting perceptual abilities of others. Signals too complicated or subtle for others to process would not be adopted and, hence, mutations influencing them would not likely spread.

(...)

Classical Language Areas

Broca’s and Wernicke’s areas were the first cortical regions to be associated with specific linguistic abilities. Broca’s aphasics display nonfluent, effortful, and agrammatical speech, whereas Wernicke’s aphasics display grammatical but meaningless speech in which the wrong words (or parts of words) are used (Bear, Connors, & Paradiso, 2007; Damasio et al., 1993). Broca’s area is located in the posterior-inferior frontal convexity of the neocortex, whereas Wernicke’s area is localized to the general area where parietal, occipital, and temporal lobes meet. For most people, these areas are functional for language primarily in the left hemisphere.

Additional areas, adjacent to, but outside these classic language areas, appear to be important for these aspects of language processing as well. Broca’s and Wernicke’s aphasias (i.e., the specific types of language deficits themselves) are not exclusively associated with damage to Broca’s and Wernicke’s cortical areas (Dronkers, 2000). Damage to the caudate nucleus, putamen, and internal capsule (structures of the cerebral hemispheres that are deep to the cortex) also appear to play a role in Broca’s aphasia, including aspects of syntactic processing (Lieberman, 2000).

The evolutionary histories of these areas are quite curious, as homologues to both Broca’s and Wernicke’s areas have been identified in nonhuman primate brains (Striedter, 2005). Exactly what function they play in other species is not currently known, but an evolutionary perspective would predict that they likely process information in ways that would be useful to language (Schoenemann, 2005), consistent with the view of language adapting to the human brain by taking advantage of circuits that already existed. The presence of these areas in nonlinguistic animals is a glaring anomaly for models that emphasize the evolution of completely new language-specific circuits in the human lineage (e.g., Bickerton, 1990; Pinker, 1995). In any case, although detailed quantitative data on these areas in nonhuman primates have not been reported, it does appear that they are significantly larger both in absolute and relative terms in humans as compared to macaque monkeys (Petrides & Pandya, 2002; Striedter, 2005).

Given that Broca’s and Wernicke’s areas mediate different but complementary aspects of language processing, they must be able to interact. A tract of nerve fibers known as the arcuate fasciculus directly connects these areas (Geschwind, 1974). The arcuate fasciculus in humans tends to be larger on the left side than on the right side, consistent with the lateralization of expressive language processing to the left hemisphere for most people (Nucifora, Verma, Melhem, Gur, & Gur, 2005).

The arcuate fasciculus appears to have been elaborated in human evolution. The homologue of Wernicke’s area in macaque monkeys does project to prefrontal regions that are close to their homologue of Broca’s area, but apparently not directly to it (Aboitiz & Garcia, 1997). Instead, projections directly to their homologue of Broca’s area originate from a region just adjacent to their homologue of Wernicke’s area (Aboitiz & Garcia, 1997). Thus, there appears to have been an elaboration and/or extension of projections to more directly connect Broca’s and Wernicke’s areas over the course of human (or ape) evolution. Recent work using diffusion tensor imaging (which delineates approximate white matter axonal connective tracts in vivo) suggests that both macaques and chimpanzees have tracts connecting areas in the vicinity of Wernicke’s area to regions in the vicinity of Broca’s area (Rilling et al., 2007). However, connections between Broca’s area and the middle temporal regions (important to semantic processing; see below) are only obvious in chimpanzees and humans and appear to be most extensive in humans (Rilling et al., 2007). Presumably these connections were elaborated during human evolution specifically for language (Rilling et al., 2007).

Prefrontal Cortex

Areas in the prefrontal cortex (in addition to Broca’s area) appear to be involved in a variety of linguistic tasks, including various semantic aspects of language (Gabrieli, Poldrack, & Desmond, 1998; Kerns, Cohen, Stenger, & Carter, 2004; Luke, Liu, Wai, Wan, & Tan, 2002; Maguire & Frith, 2004; Noppeney & Price, 2004; Thompson-Schill et al., 1998), syntax (Indefrey, Hellwig, Herzog, Seitz, & Hagoort, 2004; Novoa & Ardila, 1987), and higher level linguistic processing, such as understanding the reasoning underlying a conversation (Caplan & Dapretto, 2001).

(...)

Right Hemisphere

Although the cortical language areas discussed so far are localized to the left hemisphere in most people, there is substantial evidence that the right hemisphere also contributes importantly to language. The right hemisphere understands short words (Gazzaniga, 1970) and entertains alternative possible meanings for particular words (Beeman & Chiarello, 1998), suggesting that it is better able to interpret multiple intended meanings of a given linguistic communication. The right hemisphere also plays a greater role in a variety of types of spatial processing in most people (Tzeng & Wang, 1984; Vallar, 2007), thus presumably grounding the semantics of spatial terms. The right frontal lobe mediates aspects of prosody (Alexander, Benson, & Stuss, 1989; Novoa & Ardila, 1987), which is critically important to understanding intended meaning (consider sarcasm, in which the intended meaning is directly opposite the literal meaning).

(...)

Cerebellum

The primary function of the cerebellum was long thought to be monitoring and modulating motor signals from the cortex (Carpenter & Sutin, 1983). However, more recent work has implicated the cerebellum in a whole range of higher cognitive functions, including goal organization and planning, aspects of memory and learning, attention, visuo-spatial processing, modulating emotional responses, and language (Baillieux, De Smet, Paquier, De Deyn, & Marien, 2008). The cerebellum appears to play a role in speech production and perception, as well as both semantic and grammatical processing (Ackermann, Mathiak, & Riecker, 2007; Baillieux et al.; De Smet, Baillieux, De Deyn, Marien, & Paquier, 2007). The cerebellum also seems to play a role in timing mechanisms generally (Ivry & Spencer, 2004), which may explain its functional relevance to language (given the importance temporal information plays in language production and perception).

Conclusion

Many evolutionary changes in the brain appear to have relevance to language evolution. The increase in overall brain size paved the way for language both by encouraging localized cortical specialization and by making possible increasingly complicated social interactions. Increasing sociality provided the central usefulness for language in the first place and drove its evolution. Specific areas of the brain directly relevant to language appear to have been particularly elaborated, especially the prefrontal cortex (areas relevant to semantics and syntax) and the temporal lobe (particularly areas relevant to connecting words to meanings and concepts). Broca’s and Wernicke’s areas are not unique to human brains, but they do appear to have been elaborated, along with the arcuate fasciculus connecting these areas. Other areas of the brain that participate in language processing, such as the basal ganglia and cerebellum, are larger than predicted based on overall body weight, although they have not increased as much as a number of language-relevant areas of the cortex. Finally, little evidence suggests that significant elaboration of the auditory processing pathways up to the cortex has occurred, but direct pathways down to the tongue and respiratory muscles have been strengthened, with new direct pathways created to the larynx, presumably specifically for speech.

These findings are consistent with the view that language and brain adapted to each other. In each generation, language made use of (adapted to) abilities that already existed. This is consistent with the fact that the peripheral neural circuits directly responsible for perceptual and productive aspects of language have shown the least change. It makes sense that languages would evolve specifically to take advantage of sound contrasts that were already (prelinguistically) relatively easy to distinguish. This perspective is also consistent with the fact that Broca’s and Wernicke’s areas are not unique to humans. Differences in language circuits seem mostly to be quantitative elaborations, rather than completely new circuitry.

Three major factors seem to have conspired to drive the evolution of language: first, the general elaboration of—and increasing focus on—the importance of learned behavior; second, a significant increase in the complexity, subtlety, and range of conceptual understanding that was possible; and third, an increasingly complex, socially interactive existence. Each of these is reflected by a variety of changes in the brain during human evolution. Because language itself facilitates thinking and conceptual awareness, language evolution would have been a mutually reinforcing process: Increasingly complicated brains led to increasingly rich and varied thoughts, driving the evolution of increasingly complicated language, which itself facilitated even more complex conceptual worlds that these brains would then want to communicate (Savage-Rumbaugh & Rumbaugh, 1993; Schoenemann, 2009). The interplay between internal (conceptual) and external (social) aspects of human existence that drove this coevolutionary process highlights the usefulness of thinking about language evolution as a complex adaptive system. The extent to which increasing conceptual complexity itself might have driven language evolution represents an intriguing research question for the future.

A Usage-Based Approach to Recursion in Sentence Processing

Morten H. Christiansen - Cornell University
Maryellen C. MacDonald - University of Wisconsin-Madison

Most current approaches to linguistic structure suggest that language is recursive, that recursion is a fundamental property of grammar, and that independent performance constraints limit recursive abilities that would otherwise be infinite. (...) recursion is construed as an acquired skill and in which limitations on the processing of recursive constructions stem from interactions between linguistic experience and intrinsic constraints on learning and processing.

Introduction

Ever since Humboldt (1836/1999), researchers have hypothesized that language makes “infinite use of finite means.” Yet the study of language had to wait nearly a century before the technical devices for adequately expressing the unboundedness of language became available through the development of recursion theory in the foundations of mathematics (cf. Chomsky, 1965). Recursion has subsequently become a fundamental property of grammar, permitting a finite set of rules and principles to process and produce an infinite number of expressions.

(...)

This article presents an alternative, usage-based view of recursive sentence structure, suggesting that recursion is not an innate property of grammar or an a priori computational property of the neural systems subserving language. Instead, we suggest that the ability to process recursive structure is acquired gradually, in an item-based fashion given experience with specific recursive constructions. In contrast to generative approaches, constraints on recursive regularities do not follow from extrinsic limitations on memory or processing; rather they arise from interactions between linguistic experience and architectural constraints on learning and processing (see also Engelmann & Vasishth, 2009; MacDonald & Christiansen, 2002), intrinsic to the system in which the knowledge of grammatical regularities is embedded. Constraints specific to particular recursive constructions are acquired as part of the knowledge of the recursive regularities themselves and therefore form an integrated part of the representation of those regularities.

A Connectionist Model of Recursive Sentence Processing

Our usage-based approach to recursion builds on a previously developed Simple Recurrent Network (SRN; Elman, 1990) model of recursive sentence processing (Christiansen, 1994; Christiansen & Chater, 1994). The SRN, as illustrated in Figure 1, is essentially a standard feed-forward network equipped with an extra layer of so-called context units. The hidden unit activations from the previous time step are copied back to these context units and paired with the current input. This means that the current state of the hidden units can influence the processing of subsequent inputs, providing the SRN with an ability to deal with integrated sequences of input presented successively.
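The copy-back mechanism described above can be sketched in a few lines of Python. This is only an illustrative, untrained network and not the model from Christiansen (1994): the layer sizes, the random weight range, the logistic hidden units, the softmax output, and the omission of bias terms are all assumptions made for the example, and names such as SimpleRecurrentNetwork and one_hot are hypothetical.

```python
import math
import random


def random_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights (arbitrary scale)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


class SimpleRecurrentNetwork:
    """Untrained SRN sketch: a feed-forward network plus a layer of context units."""

    def __init__(self, n_input, n_hidden, n_output, seed=0):
        rng = random.Random(seed)
        self.w_input = random_matrix(n_hidden, n_input, rng)     # input -> hidden
        self.w_context = random_matrix(n_hidden, n_hidden, rng)  # context -> hidden
        self.w_output = random_matrix(n_output, n_hidden, rng)   # hidden -> output
        self.context = [0.0] * n_hidden

    def reset(self):
        """Clear the context units, e.g. at the start of a new sequence."""
        self.context = [0.0] * len(self.context)

    def step(self, inputs):
        """Process one input and return a probability distribution over outputs."""
        from_input = matvec(self.w_input, inputs)
        from_context = matvec(self.w_context, self.context)
        hidden = [logistic(a + b) for a, b in zip(from_input, from_context)]
        # Copy-back: the hidden activations become the context for the next step.
        self.context = list(hidden)
        scores = matvec(self.w_output, hidden)
        top = max(scores)  # subtract the maximum for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


net = SimpleRecurrentNetwork(n_input=3, n_hidden=4, n_output=3, seed=1)
first = net.step(one_hot(0, 3))   # context units are all zero here
second = net.step(one_hot(0, 3))  # same input, but the context now holds the prior state
print(first)
print(second)
```

Presenting the same input twice yields two different output distributions, because the second step also receives the hidden state left behind by the first. That dependence on prior state is what lets the network treat its input as a sequence.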

Usage-Based Constituents

A key question for connectionist models of language is whether they are able to acquire knowledge of grammatical regularities going beyond simple co-occurrence statistics from the training corpus. Indeed, Hadley (1994) suggested that connectionist models could not afford the kind of generalization abilities necessary to account for human language processing (see Marcus, 1998, for a similar critique). Christiansen and Chater (1994) addressed this challenge using the SRN from Christiansen (1994).

Deriving Novel Predictions

Simple Recurrent Networks have been employed successfully to model many aspects of psycholinguistic behavior, ranging from speech segmentation (e.g., Christiansen, Allen, & Seidenberg, 1998; Elman, 1990) and word learning (e.g., Sibley, Kello, Plaut, & Elman, 2008) to syntactic processing (e.g., Christiansen, Dale, & Reali, in press; Elman, 1993; Rohde, 2002; see also Ellis & Larsen-Freeman, this issue) and reading (e.g., Plaut, 1999). Moreover, SRNs have also been shown to provide good models of nonlinguistic sequence learning (e.g., Botvinick & Plaut, 2004, 2006; Servan-Schreiber, Cleeremans, & McClelland, 1991). The human-like performance of the SRN can be attributed to an interaction between intrinsic architectural constraints (Christiansen & Chater, 1999) and the statistical properties of its input experience (MacDonald & Christiansen, 2002). By analyzing the internal states of SRNs before and after training with right-branching and center-embedded materials, Christiansen and Chater found that this type of network has a basic architectural bias toward locally bounded dependencies similar to those typically found in iterative recursion. However, in order for the SRN to process multiple instances of iterative recursion, exposure to specific recursive constructions is required. Such exposure is even more crucial for the processing of center-embeddings because the network in this case also has to overcome its architectural bias toward local dependencies. Hence, the SRN does not have a built-in ability for recursion, but instead it develops its human-like processing of different recursive constructions through exposure to repeated instances of such constructions in the input.

Lexicalization of Sound Change and Alternating Environments

Bybee

1. Usage-based theory

Over the last twenty years a significant functionalist trend has developed in the study of morphosyntax with the aim of explaining the nature of grammar by studying how language is used in context. The basic premise of this work is that frequently used patterns become conventionalized or fossilized as grammatical patterns; that is, grammar is emergent from language use (Givon 1979; Hopper and Thompson 1980, 1984; DuBois 1985; Hopper 1987; and many more). Haiman (1994) has discussed the process by which repeated patterns become part of “grammar” in terms of ritualization, showing that the effects that repeated stimuli or repeated actions have on an organism -- automatization, habituation, and emancipation -- are also operative in the process of grammaticalization or the creation of new grammar (see also Boyland 1997 for a discussion of the psychological mechanisms involved).

Some comparable research in the phonological domain has begun to appear in recent years. For instance, several studies have shown that speakers’ judgments of the grammaticality of phonotactic patterns are based on the frequency of consonant and vowel combinations actually occurring in the language (Pierrehumbert 1994, Frisch 1996). In addition, speakers’ ability to access the lexicon may involve a complex interplay between the frequency of words and the number and frequency of words with similar phonological shape (Pisoni et al. 1985). Connectionism offers the possibility of formally modeling the effect of use on mental representations of language, and such models have been tested in the phonological and morphological domains (for example in Dell 1989 and Daugherty and Seidenberg 1994).

(...)

First, I present evidence that many, if not all, sound changes progress in lexical items as they are used, with more frequently used words undergoing change at a faster rate than less frequently used words. Then I examine “alternating environments” -- cases in which a sound in a particular word or morpheme is sometimes in the environment for the change to take place and sometimes not. In cases in which the targeted sound is at the edge of a word, the change can go through even where the sound is not in the appropriate phonetic environment and thus no alternation is produced. In such cases, we have evidence for the restructuring of the lexical representation of the word. However, when the alternating environment is inside of a word, the change can be retarded even in the appropriate environment, but eventually an alternation can be created, showing, again, restructuring of the lexical representation of the word. I argue that lexical representations are restructured gradually on the basis of actually occurring variants of a word and that postulating words and frequent phrases as the units of representation explains the development of word-level phonology. In addition, it will be argued that reference to the frequency with which words begin in consonants explains why a final word boundary often conditions changes as though it were a consonant.

2. The frequency effect on sound change

One of the aims of this chapter is to explore from a phonological perspective the size and nature of storage and processing units. I will present evidence that words and often longer units, such as frequent phrases, are the units of lexical storage. For the moment, however, I assume that words are the units of lexical storage. It is reasonable to assume that lexically stored words are in many ways like other mental records of a person’s experience. First, there is no reason to believe that these memorial records have details and predictable features abstracted away from them (Langacker 1987; Ohala and Ohala 1995), and second, it is reasonable to believe that new experiences are categorized, to the extent possible, in terms of the already-stored record of past experiences (see Klatzky 1980).

Each use of a word requires retrieval by the speaker and a matching of the incoming percept to stored images by the hearer (and the speaker, who is monitoring his or her own speech). My thesis in this chapter is that the act of using a word, in either production or perception, has an effect on the stored representation of the word. We already know this is true in terms of the degree of entrenchment of a word (or the resting level of activation): high-frequency words have stronger representations that make them easier to access, more resistant to change on the basis of other patterns, and more likely to serve as the basis for the creation of new forms (Bybee 1985).

In addition, certain levels of use affect the stored representation of words by actually changing their shapes. That is, along with the entrenchment effect of frequency, there is also an automation effect: words and phrases that are used a lot are reduced and compressed. This effect is very salient in grammaticizing phrases (such as 'going to' becoming 'gonna' and 'want to' becoming 'wanna') and more conventionalized contractions (such as 'won’t' and 'didn’t'), but it also occurs in a more subtle form across the lexicon when a sound change is taking place. Sound changes (phonetically motivated changes, which are usually the reduction of the magnitude of gestures or retiming of gestures; Browman and Goldstein 1992) tend to be phonetically gradual and also lexically gradual: high-frequency words undergo change at a faster rate than low-frequency words. The effects of frequency in the diffusion of a sound change through the lexicon have been shown for vowel reduction and deletion in English (Fidelholtz 1975; Hooper 1976b), for the raising of /a/ to /o/ before nasals in Old English (Phillips 1984), for various changes in Ethiopian languages (Leslau 1969), for the weakening of stops in American English and vowel change in the Cologne dialect of German (Johnson 1983), for ongoing vowel changes in San Francisco English (Moonwomon 1992), and for tensing of short a in Philadelphia (Labov 1994:506–507). In a recent paper, I have shown that there is also a frequency effect in the application of t/d deletion in American English (Bybee 2000). Deletion occurs more in high-frequency words, including of course monomorphemic nouns and adjectives, but also regular past tense verbs, a point to which I will return later.

My interpretation of the frequency effect in the diffusion of sound change (following Moonwomon 1992) is that sound change takes place in small increments in real time as words are used. The more a word is used, the more it is exposed to the reductive effect of articulatory automation. The effects that production pressures have on the word are registered in the stored representation, probably as an ever-adjusting range of variation. Thus words of higher frequency undergo more adjustments and register the effects of sound change more rapidly than low-frequency words.
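A toy calculation can make this incremental account concrete. The sketch below is a deliberate simplification and not a model from the text: it tracks a single number per word (the magnitude of an articulatory gesture, with 1.0 meaning unreduced) in place of a range of variation, and the per-use reduction step and the usage counts are hypothetical values chosen only for illustration.

```python
def simulate_reduction(uses_per_period, periods, step=0.001):
    """Return the stored gesture magnitude of each word after `periods` periods.

    uses_per_period maps a word label to how often it is used in one period.
    Each use shrinks the stored magnitude by the fraction `step` (an assumed value).
    """
    magnitudes = {}
    for word, uses in uses_per_period.items():
        magnitude = 1.0  # 1.0 stands for the full, unreduced gesture
        for _ in range(periods * uses):
            magnitude *= 1.0 - step
        magnitudes[word] = magnitude
    return magnitudes


# Hypothetical usage counts for two otherwise comparable words.
usage = {"high_frequency_word": 200, "low_frequency_word": 10}
result = simulate_reduction(usage, periods=5)
for word, magnitude in result.items():
    print(word, round(magnitude, 3))
```

Because every use registers a small reduction in the stored form, the word used more often ends the simulation with the smaller gesture magnitude, which is the sense in which change is both phonetically and lexically gradual.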

The frequency with which a word is subject to the ravages of articulation is not the only factor that encourages sound change. We also have to take into account the fact that certain speech styles allow more reduction and compression than others. In particular, casual speech among familiars typically shows more reduction. Thus words that are used in casual situations will also undergo change at a faster rate (see this volume, chapter 2, and D’Introno and Sosa 1986). Of course, these words are also likely to be those that are of higher frequency overall.

Another factor affecting reduction is the status of the word within the discourse. Fowler and Housum (1987) found that the first use of a word in a spoken text was longer than in subsequent uses. This means that speakers articulate more clearly in the first use of a word, where identification by the hearer might be more difficult, and then allow the reductive processes to apply later when identification by the listener is aided by the context and the fact that the word has already been activated. In fact, speakers may use reduction to indicate that a referent is not new but rather one that has already been accessed in the discourse. Words that are used more often within a text are produced in reduced form more often. If the produced form affects the stored form, then words that are repeated more often in a discourse will reduce at a faster rate than words that are repeated less often.

3. Exemplar-based representations

The account of phonetically gradual lexical diffusion of a sound change given in the preceding section requires a model of memory storage for linguistic units based on actual tokens of use. Each experience of a word is stored in memory with other examples of use of the same word. These memories of specific tokens are organized into clusters in which more frequently occurring exemplars, and tokens that share many properties with high-frequency exemplars, are treated as more central, while less common or more deviant tokens are treated as more marginal. Thus linguistic experiences are categorized in the same way as other types of perceptual experiences. Rather than conceiving of stored representations as abstractions from the phonetic tokens, representations are considered to be the result of the categorization of phonetic tokens. This proposal, which will be referred to as “the exemplar model,” adapts proposals made by Miller (1994) for phonetic segments and Johnson (1997) for larger units. Similar arguments for phonological representations have been made by Hooper (1981) and Cole and Hualde (1998). Note that this model does not distinguish between phonetic and phonemic features in lexical representation (see Steriade 2000). Further implications of this model for sound change will be discussed in the next sections.

4. Alternating environments

Given that produced tokens affect stored representations, what would happen when a word or morpheme occurs in different environments, such that it is subject to a change in one environment but not in another? In the case of such “alternating environments” (as Timberlake [1978] calls them) two or more different surface forms map onto a stored form. How are such alternate mappings resolved?

Here I approach this question by examining cases of sound change in progress. It is necessary to distinguish the phonetic variation that goes on while a sound change is in progress from the conventionalized alternations that can eventually arise from such sound change. By alternation I mean that a word or morpheme has two or more variants that are not phonetically continuous or variable but rather constitute discrete alternants conditioned by specific phonological, grammatical, or lexical contexts. An alternation, then, roughly corresponds to the level of variation generated by a classical phonological rule. By studying the conditions under which such alternations are conventionalized and the conditions under which they are not, we learn something about how variants of words and morphemes are organized in memory.

The study of alternating environments in cases of sound change in progress reveals that the outcome differs according to whether it is a word or a morpheme that is in the alternating environment. When the same morpheme is in an alternating environment in different words, a change is retarded even in the conditioning environment (the Timberlake Effect; see below), but an alternation can eventually arise. When the alternates are in two forms of the same word, alternations arise only under special conditions, but ordinarily only one alternate survives. I will argue that the differential behavior of morphemes and words with respect to sound change in progress provides strong evidence for the stored representation of words and frequent phrases.

(...)

Ordinarily, alternations do not develop where the conditioning is across a word boundary. This fact gives rise to the notion of “word-level phonology”—that is, the fact that most alternations occur within words. The explanation being investigated here is that ordinarily there is only one stored representation for each word. Where variation arises during change in progress, the variation is resolved in terms of one variant or the other. The exceptions to this arise only in the case of frequently used phrases, to which I will return shortly.

First let us consider how variation at the word level is represented and how cases of sound change in an alternating environment would eventually be resolved. In the exemplar model described earlier, the representation of a word is a cluster of actually occurring tokens, with more frequent tokens accumulating greater weight or strength. Thus each word has its own range of variation dependent upon its frequency and the contexts in which it is used. When little or no sound change is affecting a word, the range of variation in the tokens may be small and relatively stable. During change, however, the range of variation increases and the center of the cluster gradually shifts.
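One way to picture such a cluster is as a tally of tokens along a single phonetic dimension, with the center computed as a frequency-weighted mean. The sketch below is an assumption-laden illustration, not an implementation from the text: the class name ExemplarCluster, the one-dimensional phonetic scale, and all token values and counts are hypothetical.

```python
from collections import Counter


class ExemplarCluster:
    """Stored tokens of one word along a single (assumed) phonetic dimension."""

    def __init__(self):
        self.tokens = Counter()  # phonetic value -> number of stored tokens

    def add(self, value, count=1):
        """Store `count` tokens with the given phonetic value."""
        self.tokens[value] += count

    def center(self):
        """Frequency-weighted mean: frequent tokens pull the center harder."""
        total = sum(self.tokens.values())
        return sum(value * count for value, count in self.tokens.items()) / total

    def spread(self):
        """Range of variation currently represented in the cluster."""
        return max(self.tokens) - min(self.tokens)


cluster = ExemplarCluster()
cluster.add(1.0, count=10)      # stable period: all tokens alike
stable_center = cluster.center()
stable_spread = cluster.spread()

cluster.add(0.8, count=10)      # change in progress: reduced tokens accumulate
shifted_center = cluster.center()
shifted_spread = cluster.spread()
print(stable_center, stable_spread, shifted_center, shifted_spread)
```

Adding reduced tokens both widens the range of variation and moves the weighted center toward the new variant, mirroring the two effects of change in progress described above.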

When the same word occurs in both an environment that conditions a change and a non-conditioning environment, as in the Spanish s-aspiration case, the cluster for a word may divide into two (or more) subclusters, each one with a strong center of high-frequency tokens. In this case, each subcluster is associated with one environment -- the word-final [s] tokens with the environment before a vowel and the word-final [h] tokens with the environment before a C. It appears that such a situation is unstable when the environment is not also part of the representation, because it tends to be resolved in favor of one variant for all environments. That is, the most frequent variant, the weakened consonant, [h], wins out and tends to be chosen even in contexts before a vowel.

In contrast, when the environment is part of the stored unit, an alternation can be established, in the sense that the [s] can remain before a vowel. This happens in frequent phrases.
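The two outcomes just described can be contrasted in a small sketch. It assumes, purely for illustration, that resolution amounts to picking the most frequent variant among the tokens stored for a unit; the function name resolve_variant and all token counts are hypothetical and are not taken from any study cited here.

```python
from collections import Counter


def resolve_variant(stored_tokens):
    """Return the most frequent variant among the tokens stored for one unit."""
    variant, _count = Counter(stored_tokens).most_common(1)[0]
    return variant


# Word stored without its environment: tokens from all contexts are pooled.
# Hypothetical counts, with preconsonantal [h] tokens outnumbering prevocalic [s].
word_tokens = ["h"] * 70 + ["s"] * 30

# Frequent phrase stored as a unit: its tokens always include the following vowel,
# so the prevocalic [s] variant dominates that unit's own cluster.
phrase_tokens = ["s"] * 25 + ["h"] * 2

print(resolve_variant(word_tokens))
print(resolve_variant(phrase_tokens))
```

Pooling across environments lets the more frequent weakened variant generalize to every context, whereas a phrase stored with its environment keeps its own majority variant, so an alternation can survive there.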

(...)

Thus my hypothesis is that words and frequent phrases are storage units and that ordinarily there is only one representation per word, so that variations in the form of a word are normally reconciled to a single form and no alternation is created through sound change. Exceptions to this occur when a word is used in high-frequency phrases and/or phrases involving grammatical morphemes, such as pronouns and articles. This hypothesis makes strong predictions about the conditions under which sandhi phenomena will develop. It predicts that sandhi processes will only occur in phrases of high frequency and most commonly in those involving grammatical morphemes or other high-frequency words. This prediction is borne out by the most famous cases of sandhi, such as French liaison (Tranel 1981). The hypothesis also predicts that cases of reduction restricted to certain “syntactic” environments, such as English auxiliary contraction and the reduction of don’t, will occur only in the most frequent contexts in which the form appears. This prediction is also borne out where it has been tested (Krug 1999; Bybee and Scheibman 1998; see also Bybee 1998).

4.2. Sound change inside of words

Of course, alternations do develop inside of words. A morpheme inside a word may undergo phonological change, producing a new allomorph and thus an alternation. This fact follows from the hypotheses presented earlier: if sound change permanently affects stored units and words—even morphologically complex ones—are the units of storage, then the same morpheme in different words will take on different phonological shapes.

Further evidence for the hypotheses developed here is the fact that the effect of an alternating environment inside a word is very different from the effect across word boundaries. Inside a word, a variable process never applies outside of its phonetic environment (as, say, the aspiration of /s/ in Spanish occurs even ____##V). Instead the effect is the reverse: there is evidence that a change can be retarded even in its phonetic environment if it occurs in a morpheme that also has alternates that appear outside of the phonetic environment. Timberlake (1978) has made this point by presenting examples of changes that progress faster in uniform environments and are retarded in alternating ones.

(...)

These examples show that the implementation of a sound change in particular words (that is, its lexical diffusion) depends heavily on the contexts in which the sound is used. Since I have been arguing that the unit that serves as the context for a sound as it undergoes change is the word, we must now consider how to account for the fact that the environment of a morpheme in one word affects the rate of change for the same morpheme in a different word. To understand this issue, we must understand the nature of morphological relationships, a matter to which I now turn.

5. A network model

In various works (Bybee 1985, 1988, 1995) I have proposed that the lexicon is organized into a complex set of relations among words and phrases by connections drawn among phonologically and semantically similar items. Parallel phonological and semantic connections constitute morphological relations if they are repeated across multiple pairs of items. As Dell (2000) points out, morphological relatedness is the joint effect of the organization of words into phonological and semantic neighborhoods.

In this model, the relations between base and past-tense forms of English verbs are diagrammed as in figure 10.1, where semantic relations are not explicitly shown and where relations of similarity (rather than identity) between segments are shown with broken lines. Affixes are not explicitly listed in storage but emerge from sets of connections made among stored words and phrases. The very high type frequency of the regular English past tense strengthens its representation in memory and makes it highly productive. It can then apply to verbs whose past-tense forms are not accessible because they have never been encountered or are of such low frequency as to not be easily accessible.

Past tense constitutes a category, but not one that can be accessed independently of a particular verb, because it is a category to which verb forms may belong or not belong. How then do individual tokens of the past-tense suffix relate to one another? This is, of course, an empirical question, and here is the evidence we have so far.

First, instances of the suffix attached to a verb are affected by the token frequency of the whole verb form: the rate of deletion for a final [t] or [d] on a past-tense verb is affected by the frequency of the form, as shown in Bybee (2000). In that study, past-tense forms with a frequency in Francis and Kucera (1982) of 36 or greater are considered high frequency and those with a frequency of less than 36 as low frequency, following a suggestion by Stemberger and MacWhinney (1988), who establish that the mean frequency of inflected verbs in Francis and Kucera is 35. Using this cutoff point, I (Bybee 2000) find that there is a significant difference between high- and low-frequency verbs in the position for deletion. See table 10.4.
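The binning procedure just described can be sketched in Python. Only the cutoff of 36 comes from the text. The token list is invented purely to show the mechanics and does not reproduce the data in Bybee (2000) or table 10.4, and the function names are hypothetical.

```python
from collections import defaultdict

CUTOFF = 36  # from the text: 36 or greater counts as high frequency

def frequency_bin(corpus_frequency, cutoff=CUTOFF):
    """Label a form as high or low frequency by its corpus count."""
    return "high" if corpus_frequency >= cutoff else "low"

def deletion_rates(tokens, cutoff=CUTOFF):
    """tokens: iterable of (corpus_frequency, deleted) pairs, one per
    observed token. Returns the proportion of deleted tokens per bin."""
    counts = defaultdict(lambda: [0, 0])  # bin -> [deleted, total]
    for corpus_frequency, deleted in tokens:
        label = frequency_bin(corpus_frequency, cutoff)
        counts[label][0] += int(deleted)
        counts[label][1] += 1
    return {label: d / n for label, (d, n) in counts.items()}

# Invented tokens: (frequency of the word form, was final t/d deleted?)
toy_tokens = [
    (120, True), (120, False), (36, True), (403, True),
    (35, False), (4, False), (4, True), (10, False),
]

rates = deletion_rates(toy_tokens)
```

A real analysis would also need a significance test on the two proportions, which this sketch leaves out.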

Second, in the data I cited earlier, the overall trend for past-tense [t] and [d] is that they delete less often than [t] or [d] in monomorphemic words.

A related third point is that Losiewicz (1992) has shown that not only are monomorphemic [t] and [d] shorter than past-tense [t] and [d], but high-frequency past-tense [t] and [d] are shorter than low-frequency [t] and [d]. Losiewicz proposes to account for her data by a dual-access model in which high-frequency morphologically complex forms are stored and retrieved as wholes, while low-frequency forms are composed by adding the suffix to a base form using a schema. But this model would predict the same rate of deletion in high-frequency regular past tense forms as in monomorphemic forms of comparable frequency, and this prediction is not borne out by the data. Rather, in the data used in Bybee (2000), the rate of deletion for all words with frequencies of 36–403 was 54.4%, while the rate for past-tense forms of the same frequency was 39.6%. Thus we must posit that the past tense in low-frequency verbs, which is longer and phonetically fuller, can have some effect on the past-tense suffix on high-frequency verbs, which is shorter and more prone to deletion but not as short and prone to deletion as the [t] and [d] of monomorphemic forms. Thus the fuller form of the suffix on low-frequency verbs has some impact on the suffix on other verbs.

(...)

This account of the Timberlake Effect makes predictions about the circumstances under which the effect will be the strongest. A change will be retarded most noticeably in an alternating environment when the alternates that are not in the environment to undergo the change are the most frequent—either there are more conditions in the paradigm in which the change does not take place or the environments that do not condition a change occur in the unmarked or most frequent categories. Furthermore, it is less likely that the Timberlake Effect will be observed in high-frequency paradigms in which individual forms have a greater lexical strength (accessibility) and weaker connections with related words (see Bybee 1985).

Now compare again these word-internal cases to those where change is occurring at a word boundary. The word may for a time have multiple variants, suggesting either a range of variation in the changing segment or even multiple representations for a single word. However, in this case the tendency is to resolve the variation in favor of a single form for each word, except in the case of high-frequency or grammatical words. It appears that the cases in which distinct alternates become established are just those cases in which the conditioning environment is registered in storage with the alternating item. Thus in the case of progu, progi each variant can be registered because we are dealing with two different (although related) words, one of which consistently has the palatalizing environment and one that does not. Similarly in frequent phrases, such as muchos año(s), the [s] preceding the vowel may be preserved (as though it were word-internal) because the conditioning vowel occurs with it in storage and processing. Other instances of the same word may occur without the [s], as [mucoh] or [muco]. Such variation would not necessarily exist indefinitely. Unless one variant is in a highly entrenched phrase, the variation is likely to be eventually leveled out.

Thus by registering words in the lexicon and establishing connections among them we are able to account for the two different effects on sound change of alternating environments inside of words and across word boundaries.

6. Lexical phonology

Some of the effects of variable processes that I have discussed earlier have been addressed by Guy (1991a, 1991b) in the context of Lexical Phonology. This proposal is relevant here, even though I will argue that it does not work for all the cases at hand, because it incorporates the notion argued for here, that some words behave as if variable processes have applied to them more than once. Guy proposes that variable rules may apply cyclically and at all levels of a Lexical Phonology and offers an account for the variation in t/d deletion that is conditioned by the morphological structure of the word. The facts are as follows: on average, the highest rate of deletion of /t/ or /d/ takes place in monomorphemic words, the next highest rate in pasts with vowel changes (such as slept, left, told), and the lowest rate in regular past-tense forms.

The advantage of the Lexical Phonology approach is that it does recognize that fairly low-level variable phonology is deeply entwined with the lexicon and morphology. It also suggests that the greater progress of a sound change can be attributed to more applications of the “rule.” The problem with it is that it makes incorrect predictions in some cases and it cannot deal at all with frequency effects across lexical items.

Consider first another case of an alternating environment that Timberlake describes. Timberlake brings up this case to show that it is the alternating environment itself, and not the morpheme boundary, that causes the retardation of change.

(...)

This example shows that the Lexical Phonology approach is fundamentally the wrong approach, for it is a fact of usage, not structure, that is accelerating the change: since the /d/ in /ado/ is in the context for reduction and deletion no matter what verb it is added to and since many verbs with this suffix are of very high frequency, there is a frequency effect to accelerate the change and nothing to impede it.

Finally, Lexical Phonology, as a theory of structure and not a theory of usage, cannot account for the frequency effects demonstrable in the lexical diffusion of sound change. In Bybee (2000) I have shown that t/d deletion occurs more often in words of higher frequency. This is true of all of the 2,000 tokens studied; this relation also holds when nouns and adjectives, semi-weak past-tense verbs, and regular past-tense verbs are considered as well. Since the semi-weak past-tense verbs are all of high frequency, frequency of use alone can account for their higher rate of deletion over the regular past verbs. I conclude, then, that variable rates of phonological change are the product of usage, not of structure.

7. Conclusions

The evidence discussed in this essay bears on two issues regarding the nature of stored memory for linguistic forms. First, the minimal unit of independent storage is the word, which is also the minimal unit of production, since smaller units cannot be used in isolation. I hasten to add, however, that that does not mean that other much longer sequences are not stored and processed as wholes. Here we have seen evidence that frequently used phrases behave like single processing units (just as words typically do) in that they preserve segments that might otherwise be lost at word edges. In various papers I have argued for a highly redundant storage mechanism that includes specific instances of phrases and clauses as well as more generalized constructions as storage and processing units (Bybee and Scheibman 1999; Bybee 2000).

The view of sound change as affecting sounds in words according to their context of use allows us to understand why most phonological alternations occur at the word level: alternations can only be established in cases in which the conditioning environment is present in the storage and processing unit. Words or other units that occur in alternating environments that are not part of the stored unit will not have variants but rather will resolve any variation in favor of one form or the other. This proposal also allows us to make interesting predictions about the development of liaison or sandhi phenomena. Conventionalized alternations across traditional word boundaries indicate that at least one alternate is part of a larger stored unit. Thus such liaison alternations can be used to study the nature and size of storage units. Finally, the view of sound change as affecting sounds in words provides an account of the different effects of alternating environments inside of words and across word boundaries.

The second major aspect of the model presented in this essay is that sound change has an immediate and permanent effect on stored representations. This view contrasts with the generative and structural view that underlying representations remain fixed and sound change is “rule addition,” nothing more than a change in the phonological component. The evidence that sound change has an immediate effect on the lexicon is that words change gradually and at different rates according to their token frequency, even while a “rule” is still “variable.” The evidence that such change is permanent is the fact that old underlying forms never resurface, even when the “phonological rule” becomes unproductive (see Cole and Hualde 1998 for more evidence on this point). Instead the progress of a change is inexorably unidirectional in both a phonetic and a morpho-lexical sense. In the phonetic sense, we see the unidirectionality in chains of reduction and assimilation changes, such as those shown in (8), where one change builds on the other and continues its direction:

(8)

t → d → ð → Ø
s → h → Ø
k’ → kj → c

If stored items are changed gradually and the motivation for increased automation remains fairly constant, then the continuous nature and strict directionality of such changes is predicted. If sound change were “rule addition,” there would be no explanation for why, for example, after “adding the rule” d → ð / V_V, a language would go on to “add the rule” d → Ø / V_V.

Inexorable unidirectionality is also apparent in the morphologization and lexicalization of the results of sound change. While no one disputes that morphologization eventually takes place, I have shown here and elsewhere that involvement with the lexicon and grammar occurs very early (Hooper 1976a, 1981; Bybee 2000). Examples given earlier are the word frequency effect in lexical diffusion, the lower rate of deletion of morphemic /t/ and /d/ in American English, the appearance of aspiration for earlier /s/ before a vowel at the end of a word in Cuban Spanish, and the examples described as the Timberlake Effect. Once involved with the lexicon and morphology, alternations become more and more entrenched and can only be undone by the strong pattern pressures we know as analogical leveling.

The two hypotheses of words as storage units and the immediate and permanent effect of sound change on words explain why most phonological alternations occur at word level, that is, why word boundaries block phonological “rules” and morpheme boundaries do not. I have also shown here that the tendency of final word boundaries to act like consonants follows from these hypotheses and from the fact that the segment most frequently following a final word boundary is a consonant.

The larger theoretical message is that use impacts representation, a point often made in studies of the discourse origins of syntax and a point that is also being made by connectionist modelers of language. As I have argued here, many cases of what was earlier postulated to be structural turn out to be derivable from the way language is used. I also see many instances where a careful look at use brings to light new data that was ignored before. I suspect that a usage-based perspective will be very productive in generating new questions and new answers in phonology.