(Ignatius G. Mattingly)
More recently, however, the perception of speech has come to be regarded by many as an "active" process basically similar to speech production. The listener understands what is said through a process of "analysis by synthesis" [Stevens and Halle 1967]. Parallel proposals have accordingly been made for reading. Thus Hochberg and Brook suggest that once the reader can visually discriminate letters and letter groups and has mastered the phoneme-grapheme correspondences of his writing system, he uses the same hypothesis-testing procedure in reading as he does in listening (Goodman's view of reading as a "psycholinguistic guessing game" is a similar proposal). Though the model of linguistic processing is different from that of Bloomfield and Fries, the assumption of a simple parallel between reading and listening remains, and the only differences mentioned are those assignable to modality, for example, the use which the reader makes of peripheral vision, which has no analog in listening.
We know that all living languages are spoken languages, and that every normal child gains the ability to understand his native speech as part of a maturational process of language acquisition. In fact we must suppose that, as a prerequisite for language acquisition, the child has some kind of innate capability to perceive speech. In order to extract from the utterances of others the "primary linguistic data" that he needs for acquisition, he must have a "technique for representing input signals" [Chomsky 1965, p. 30].
In contrast, relatively few languages are written languages. In general, children must be deliberately taught to read and write, and despite this teaching, many of them fail to learn. Someone who has been unable to acquire language by listening -- a congenitally deaf child, for instance -- will hardly be able to acquire it through reading; on the contrary, as Liberman and Furth [Kavanagh 1968] point out, a child with a language deficit owing to deafness will have great difficulty learning to read properly.
The apparent naturalness of listening does not mean that it is in all respects a more efficient process. Though many people find reading difficult, there are a few readers who are very proficient: in fact, they read at rates well over 2000 words per minute with complete comprehension. Listening is always a slower process: even when speech is artificially speeded up in a way which preserves frequency relationships, 400 words per minute is about the maximum possible rate [Orr, Friedman et al. 1965]. It has often been suggested [e.g., Bever and Bower 1966; Bower 1970] that high-speed readers are somehow able to go directly to a deep level of language, omitting the intermediate stages of processing to which other readers and all listeners must presumably have recourse.
(...) The listener is processing a complex acoustic signal in which the speech cues that constitute significant linguistic data are buried. Before he can use these cues, the listener has to "demodulate" the signal: that is, he has to separate the cues from the irrelevant detail. The complexity of this task is indicated by the fact that no scheme for speech recognition by machine has yet been devised that can perform it properly. The demodulation is largely unconscious; as a rule, a listener is unable to perceive the actual acoustic form of the event which serves as a cue unless it is artificially excised from its speech context [Mattingly, Liberman et al. 1971]. The cues are not discrete events well separated in time or frequency; they blend into one another; we cannot, for instance, realistically identify a certain instant as the ending of a formant transition for an initial consonant and the beginning of the steady state of the following vowel.
The reader, on the other hand, is processing a series of symbols that are quite simply related to the physical medium that conveys them. The task of demodulation is straightforward: the marks in black ink are information; the white paper is background. The reader has no particular difficulty in seeing the letters as visual shapes if he wants to. In printed text, the symbols are discrete units. In cursive writing, of course, one can slur together the symbols to a surprising degree without loss of legibility. But though they are deformed, the cursive symbols remain essentially discrete. It makes sense to view cursive writing as a string of separate symbols connected together for practical convenience; it makes no sense at all to view the speech signal in this way.
Reading as a Language-Based Skill
Our view is that reading is a language-based skill like Pig Latin or versification and not a form of primary linguistic activity analogous to listening. From this viewpoint, let us try to give an account, necessarily much oversimplified, of the process of reading a sentence.
The reader first forms a preliminary, quasiphonological representation of the sentence based on his visual perception of the written text. The form in which this text presents itself is determined not by the actual linguistic information conveyed by the sentence but by the writer's linguistic awareness of the process of synthesizing the sentence, an awareness which the writer wishes to impart to the reader.
How can we explain the very high speeds at which some people read? To say that such readers go directly to a semantic representation, omitting most of the process of linguistic synthesis, is to hypothesize a special type of reader who differs from other readers in the nature of his primary linguistic activity, and differs in a way which we have no other grounds for supposing possible. As far as I know, no one has suggested that high-speed readers can listen, rapidly or slowly, in the way they are presumed to read. A more plausible explanation is that linguistic synthesis takes place much faster than has been supposed, and that the rapid reader has learned how to take advantage of this. The relevant experiments (summarized by Neisser) have measured the rate at which rapidly articulated or artificially speeded speech can be comprehended, and the rate at which a subject can count silently, that is, the rate of "inner speech". But since temporal relationships in speech can only withstand so much distortion, speeded speech experiments may merely reflect limitations on the rate of input. The counting experiment not only used unrealistic material but assumed that inner speech is an essential concomitant of linguistic synthesis.