Harris, Zellig S.;
Language and Information
Columbia University Press, Dec 1987, 120 pages
ISBN 0231066627 9780231066624
topics: | linguistics | syntax | information-theory
Bampton lectures in America no.28 [Columbia U, 1986]. Describes a formal theory of language structure where entities are defined by their frequency of occurrence, rather than by phonetic or semantic properties.
Sufficient regularities [in syntax] have not been found... In each language there are some relations that can be called grammatical, but a satisfactory general definition is lacking. Furthermore, grammatical relations are unique to natural language, and if we can describe language only in such terms we will be unable to compare language to anything else, not even with such close relatives as gesture on the one hand and mathematics on the other. [p.1-2]
Finally, the elements on which grammatical relations hold are not adequately defined. The one type of element that is precisely established is the set of phonemes, the characteristic sounds of language. In general, the investigation of a field, and the defining of its entities, is carried out in a metalanguage of this field, a language of broader informational capacity than the given field. This is clearly so in mathematics and logic, where the precision as to what is in the field enables us to recognize that statements said about the field are not in it... But NL has no external metalanguage. [p.2]
A. Phonology:
it is possible to determine the phonemic distinctions in a language
by a behavioral test that does not involve the specific meaning of
words ... test with S,H both speakers of the lg:
S : utters two words (e.g. sea, see; or hard, heart)
H : says if these were repeated words or different
phonemes = most economical collection of distinctions arising in different
contexts, e.g. the ph of pin vs the p of spin
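The pair test above yields only "repetition" or "different" judgments, which can then be grouped into classes of mutually equivalent utterances. The following is a minimal sketch of that grouping step, assuming the hearer's responses are already recorded as pairs; the function name `repetition_classes` and the sample responses are hypothetical and do not come from the book.

```python
def repetition_classes(words, judged_same):
    """Group words into classes of mutual repetitions using hearer judgments."""
    parent = {w: w for w in words}

    def find(w):
        # Follow parent links to the class representative (with path halving).
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    # Every "repetition" judgment merges the two words' classes.
    for a, b in judged_same:
        parent[find(a)] = find(b)

    classes = {}
    for w in words:
        classes.setdefault(find(w), set()).add(w)
    return sorted(sorted(c) for c in classes.values())

WORDS = ["sea", "see", "hard", "heart"]
JUDGED_SAME = [("sea", "see")]  # hypothetical hearer responses
CLASSES = repetition_classes(WORDS, JUDGED_SAME)
```

Words that end up in different classes are the ones a phonemic distinction has to keep apart; no appeal to what the words mean is needed.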
B. Word boundaries
check the n-grams - e.g. in sentence "if he comes call me",
the probability of the sequence "ifh", which spans a word boundary,
is lower than that of sequences falling within a word.
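The n-gram check can be illustrated with a small sketch. It uses within-word trigram counts from a toy word list as a crude stand-in for probabilities; the word list, the choice of letters rather than phonemes, and the names `ngram_counts` and `ngram_scores` are assumptions made for illustration, not material from the book.

```python
from collections import Counter

def ngram_counts(words, n=3):
    """Count letter n-grams that occur inside single words."""
    counts = Counter()
    for w in words:
        for i in range(len(w) - n + 1):
            counts[w[i:i + n]] += 1
    return counts

def ngram_scores(text, counts, n=3):
    """Pair each n-gram of an unsegmented text with its within-word count."""
    return [(text[i:i + n], counts[text[i:i + n]])
            for i in range(len(text) - n + 1)]

# Toy word list (an assumption for illustration only).
WORDS = ["if", "he", "comes", "call", "me", "come",
         "calls", "home", "life", "she", "them"]
COUNTS = ngram_counts(WORDS)

# "if he comes call me" with the spaces removed.
SCORES = ngram_scores("ifhecomescallme", COUNTS)
```

In this toy setting, trigrams such as "ifh" or "esc" that straddle a word boundary get a count of zero, while "com" or "cal" do not. A realistic version would need a large corpus and smoothed probabilities.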
C. Sentence boundaries
similar to above, but with words, and with a more complex stochastic
process. At some points the probability of the next word returns to
the situation at the start of the sentence.
the phoneme sequence property is less interesting - cannot be tied to how
meaning arises; but the word sequence relation is of
decisive importance for the structure and meaning of sentences.
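The idea that the next-word probabilities "return to the situation at the start of the sentence" can be sketched with a first-order approximation. This is a deliberate simplification of the more complex process the notes mention: it compares the distribution of words following each word with the distribution of utterance-initial words. The toy utterances and the names `distribution`, `overlap` and `successor_distributions` are assumptions for illustration.

```python
from collections import Counter, defaultdict

def distribution(counter):
    """Normalize raw counts into a probability distribution."""
    total = sum(counter.values())
    return {w: c / total for w, c in counter.items()}

def overlap(p, q):
    """Shared probability mass: 1.0 for identical, 0.0 for disjoint distributions."""
    return sum(min(p.get(w, 0.0), q.get(w, 0.0)) for w in set(p) | set(q))

def successor_distributions(stream):
    """Estimate P(next word | previous word) from a word stream."""
    following = defaultdict(Counter)
    for prev, cur in zip(stream, stream[1:]):
        following[prev][cur] += 1
    return {w: distribution(c) for w, c in following.items()}

# Toy utterances (an assumption for illustration only).
UTTERANCES = ["john came", "mary left", "john left", "mary came"]

# Distribution of utterance-initial words, then the unsegmented stream.
START = distribution(Counter(u.split()[0] for u in UTTERANCES))
STREAM = " ".join(UTTERANCES).split()

# For each word, how closely what follows it resembles a sentence start.
RESET = {w: overlap(d, START)
         for w, d in successor_distributions(STREAM).items()}
```

Here the words that end an utterance ("came", "left") score higher than the words that begin one ("john", "mary"), which is the signal a boundary detector would look for.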
Ellipsis:
Given two sentences connected by "and" - e.g.
I knocked and I entered
John came over and John introduced Mary
etc.
This suggests that the "least grammar" that accounts for John V and W goes
via John V and John W plus the zeroing of the element in the second
that appears in the same position as in the first.
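The zeroing step can be sketched as a string operation. This is a minimal illustration that handles only a repeated first word in two conjuncts joined by "and"; the function name `zero_repeated_initial` is hypothetical, and treating a single word as the repeated element is an assumption, not the book's formulation.

```python
def zero_repeated_initial(sentence):
    """Drop the first word of the second conjunct when it repeats
    the first word of the first conjunct."""
    left, sep, right = sentence.partition(" and ")
    if not sep:
        return sentence  # no conjunction, nothing to zero
    left_words, right_words = left.split(), right.split()
    if (left_words and len(right_words) > 1
            and left_words[0] == right_words[0]):
        right_words = right_words[1:]  # zero the repeated element
    return f"{left} and {' '.join(right_words)}"

REDUCED = [
    zero_repeated_initial("I knocked and I entered"),
    zero_repeated_initial("John came over and John introduced Mary"),
]
```

Read in the other direction, the same rule lets the shorter form John V and W be accounted for from John V and John W without a separate grammar for conjoined verb phrases.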