NetNews Usenet Archive 1992 #26

Internet Message Format | 1992-11-08 | 28.2 KB

Path: sparky!uunet!stanford.edu!bu.edu!wupost!darwin.sura.net!paladin.american.edu!auvm!CCB.BBN.COM!BNEVIN
From: bnevin@CCB.BBN.COM (Bruce E. Nevin)
Newsgroups: bit.listserv.csg-l
Subject: language issues
Message-ID: <CSG-L%92110613223540@VMD.CSO.UIUC.EDU>
Date: 6 Nov 92 19:11:10 GMT
Sender: "Control Systems Group Network (CSGnet)" <CSG-L@UIUCVMD.BITNET>
Lines: 576
Comments: Gated by NETNEWS@AUVM.AMERICAN.EDU

[From: Bruce Nevin (Mon 92112 07:38:16 through Fri 92116 14:05:06)]

Awfully hard getting any consecutive time to respond to all the interesting stuff going by.

(Avery Andrews 921026.0928) --

>I'd agree that this little story of mine would stand or fall on the
>basis of careful examination of this kind of case. But if this story
>falls, we're probably left with something like GB as the best bet for
>a theory of grammar!!.

Are the choices really so few and so constrained? This seems based upon an argument of the following form:

    Either A or B.
    If A, then something like GB
    If not (this little story of mine) then not B

Here, A is something like the claim that "learners don't have access to negative evidence," i.e. they are not provided with starred examples, and B is the claim that they are.

For non-linguists: in Generativist writings one puts an asterisk before examples that "you can't say" (at least according to the author). Hence, these are "starred" examples. (I used to call them "asterisky.") For example, the linguist might consider sets of roughly equivalent sentences like this:

    1a  John gave the Ramadan bake sale money to the fund.
     b  John gave to the fund the Ramadan bake sale money.
     c  John gave the fund the Ramadan bake sale money.

On the basis of this and other sets of sentences, the linguist might postulate a rule that says you can move a "dative" phrase--"to x" after a verb with some kind of "give" meaning--to a position immediately after the verb, and delete the preposition "to".
This rule predicts that the following sentences are also equivalent to each other:

    2a  John donated the Ramadan bake sale money to the fund.
     b  John donated to the fund the Ramadan bake sale money.
     c *John donated the fund the Ramadan bake sale money.

However, for many English speakers (including Avery), (2c) is not acceptable, hence the asterisk or star. Others, focussing on the meaning of (2c), may overlook the syntactic or stylistic anomaly; Bill is evidently among these, but it is possible that he represents a third set of others for whom (2c) is not anomalous at all.

Returning to the issue at hand: the claim that language learners do not have access to negative evidence seems to amount to a claim that they do not experience error when they produce syntactically anomalous utterances. The claim is usually that overt correction ("don't say X, say Y") is such a minuscule part of the language learner's experience, and covers so few of the possible errors, that it alone cannot account for the incredible amount of learning that takes place in such a short time (hence much of the structure of language cannot be learned at all and instead must be hard wired in the genome, etc., etc.).

Several assumptions underlie this claim.

Doesn't this assume that overt correction by other speakers is the only source of negative evidence? It should be obvious to students of PCT that there are many occasions for experiencing error aside from attempts of others at social control. For starters, consider the experience of not being understood and having to try again, or the experience of having the other paraphrase what one has said, either in confirmation or in replaying it to someone who did not understand clearly, both of which are *very* common experiences for children learning a language.

Generativists assume that the learner is faced with a range of alternative grammars, and that learning is a process of eliminating those that don't work.
On this view of how learning works, counterexamples have an enormously important and pervasive role. GB (Government-[and-]Binding Theory, now being supplanted by Chomsky's latest "minimalist" proposals) carries this to an extreme. UG (Universal Grammar), hard wired into the genome, provides for all possible languages. Learning one particular language involves setting parameters in UG so that one possibility is admitted for that parameter and the others are precluded. An example might be the choice of SOV (subject object verb) word order, as opposed to the alternatives. Having made that choice, certain other choices ripple out as entailed consequences (order of modifiers relative to modified words, etc.). Much ink has been spilled in a search for the minimal parameters whose settings cannot be predicted from the settings of other parameters, or which once set can be the basis for predicting a larger number of others, etc. (In this research climate, language differences and variation are not always so closely investigated as they might be, but that can be only a secondary side comment here.)

Generativists also commonly perceive that language structure is so very complex and arcane as to defy a child learning it at all absent biological inheritance of much of the complexity in UG. It is true that Generativist descriptions of language are very complex and arcane. It is also true that not all ways of describing language are so.

None of these assumptions is very convincing to me, but they do keep a number of people employed as linguists (and a number of others not employed as linguists). It also puts a great many properties of language in genetically determined UG and therefore outside the realm of perceptual control.

********************

Returning to the examples, sentence (2c) is somewhat peculiar for me, but not completely unacceptable, so I mark it with a question mark rather than an asterisk:

    2c ?John donated the fund the Ramadan bake sale money.
Contrast (2c) with an example that unequivocally merits an asterisk:

    3. *John fund the donated.

In operator grammar, the "to" may be zeroed with a verb after which it is so strongly expected as to be redundant. (In some UK dialects, it need not be immediately after the verb, e.g. "John gave it me." Similarly in Danish.) Thus, the reason given in Operator Grammar that (2c) is peculiar is that speakers of English do not have a sufficiently strong expectation of the preposition "to" occurring after "donate." One might suppose this is because "donate to" simply is not used as frequently as "give to."

It is true that the dative reduction is more or less peculiar for many other verbs in the "give" set. Call them the "donate" subset, including:

    ? John administered [to] the fund the money
    ? John allocated [to] the fund the money
    ? John apportioned [to] the fund the money
    ? John communicated [to] the fund the money
    ? John consigned [to] the fund the money
    ? John contributed [to] the fund the money
    ? John conveyed [to] the fund the money
    ? John dealt [to] the fund the money
    ? John delivered [to] the fund the money
    ? John dispensed [to] the fund the money
    ? John distributed [to] the fund the money
    ? John entrusted [to] the fund the money
    ? John purveyed [to] the fund the money
    ? John relinquished [to] the fund the money
    ? John rendered [to] the fund the money
    ? John surrendered [to] the fund the money
    ? John transferred [to] the fund the money
    ? John transmitted [to] the fund the money
    ? John vouchsafed [to] the fund the money

But a considerable number of other verbs in the "give" set are acceptable with the transposed argument order and zeroed "to".
Call them the "give" subset:

      John accorded [to] the fund the money
      John allotted [to] the fund the money
      John assigned [to] the fund the money
      John awarded [to] the fund the money
      John conceded [to] the fund the money
      John furnished [to] the fund the money
      John granted [to] the fund the money
      John handed [to] the fund the money
      John left [to] the fund the money
      John lent [to] the fund the money
      John offered [to] the fund the money
      John paid [to] the fund the money
      John presented [to] the fund the money
      John proffered [to] the fund the money
      John provided [to] the fund the money
      John sent [to] the fund the money
      John supplied [to] the fund the money
      John yielded [to] the fund the money

I don't think a "frequency of occurrence" argument holds up.

Now, according to Bill's intuitions about how language works, we use words because they are associated with perceptions that we are trying to get our audience to attend to (in the environment or in imagination). Examining these two lists, I don't see any obvious differences in some sort of "to" perception associated with one more strongly than the other. Instead, the difference appears to be a pretty arbitrary, learned, socially shared perception that "to" is redundant after verbs in the "give" list, but not after those of the "donate" list. Maybe someone else can pull out some generalization that I'm missing.

There is another factor, one that motivated my putting the modifier "Ramadan bake sale" in the examples of (1) and (2), and that is that by minimizing the distance between an operator and its arguments (distance in terms of intervening words that could also be candidates as arguments), we reduce the load on short-term memory for perception of operator-argument dependencies. Thus, when we have alternative orderings for the arguments of an operator we prefer to put the shortest one first (counting all its modifiers):

    4a  John handed the money over to the fund.
     b ?John handed over to the fund the money.
Sentence (4a) is definitely better than the transposed version (4b). However, watch what happens when you lengthen "the money" with some modifiers:

    4a  John handed the Ramadan bake sale money that Kathy had collected from the first half of her list of groups over to the fund.
     b  John handed over to the fund the Ramadan bake sale money that Kathy had collected from the first half of her list of groups.

Now consider the original two sets of sentences without the modifier:

    1a'   John gave the money to the fund.
     b'  ?John gave to the fund the money.
     c'   John gave the fund the money.

    2a'   John donated the money to the fund.
     b'  ?John donated to the fund the money.
     c' ??John donated the fund the money.

Notice that (1b') is a bit awkward but (1b) is more acceptable:

    1b' ?John gave to the fund the money.
    1b   John gave to the fund the Ramadan bake sale money.

This is seen in the "donate" sentences as well:

    2b' ?John donated to the fund the money.
    2b   John donated to the fund the Ramadan bake sale money.

The fronting of the second argument of "give" (with its argument-indicator "to") is motivated by its length with its modifiers in (2b), but there is no such motivation in (2b').

Control for minimizing the distance between operators and their arguments is presumably universal, but to say it was a property of Universal Grammar would be fatuous. It is surely a universal property of control systems that are controlling for sequences that may be interrupted by other controlled sequences. So likewise, I believe, many other claims for UG are really claims for the universality of control.

(Bill Powers (921024.0830) ) --

> Avery Andrews (921024.1519) --
> >So the non-occurrence of the second (a piece of `negative
> >evidence') becomes accessible to the learning system.
> Control theory handles the non-occurrence in terms of a reference
> signal that demands the occurrence.
> When there is a reference signal
> specifying that some perception occur, then as soon as the reference
> signal is set there is an error, which persists until the perception
> occurs. In this case it is not occurrence of the sentence that
> matters, but of the meaning.

Maybe you've sorted it out by now. You and Avery were operating at right angles to each other throughout much of this exchange. In this instance, "non-occurrence" meant for Avery a starred example--something marked as not acceptable or not occurring in acceptable usage, or perhaps (by the reification that is normal in Generativist literature) not occurring in the grammar. At issue above is the question who if anyone other than the linguist has a reference signal for such a thing occurring; and it is not the meaning of the sentence that matters, but whether or not you can express the given meaning in that particular way.

(Avery Andrews (921024.1519) ) --

>A further prediction is that the kind of optionality above will be
>an inherently unstable feature of languages: if two such forms are
>used with no discernable difference in meaning, the language-
>acquisition systems of the speakers will be constantly reorganizing
>without being able to find an error-free configuration, &
>presumably at some point one of the forms will either drop out, or
>they will acquire subtly different meanings

Something like this was advanced by Kurylowicz, re the classic observation that "doublets" (specifically an inherited but "irregular" form alongside the form used by an innovating group, not necessarily a younger generation, that used analogy to create a more "regular" form) tend to become differentiated semantically. I think this is part of a more general process whereby distinctions tend to be exploited as differences that make a difference, or else are no longer maintained. But this is a *social* process (i.e. ongoing negotiation of agreements about what a given distinction either constitutes or means).
Your formulation here does not differentiate between this social process and the processes of individual language learners, hence Bill's inference:

>The implication is that the mature speaker will come to express the
>same meanings in the same words all of the time, never paraphrasing or
>varying the wording. I'm not sure you would want to maintain that.

What prevents rapid closure, smoothing out all the bumps and ripples in the process of learning a language, is precisely the fact that it is not a process that any one person can carry out in isolation, but rather a social negotiation of agreements, and the prior commitments and motivations of the participants are not necessarily commensurate with each other. Pronouncing "bird" as "boyd" constitutes a token of membership in a community that talks that way, and presenting oneself to others by using an ensemble of such tokens means different things to different people, and on different occasions. A person who can and does pronounce "bird" both ways may in fact be motivated NOT to smooth out the difference. (Use one in Brooklyn, use the other in my job as a newscaster.)

As you say,

>if two such forms are
>used with no discernable difference in meaning

But this is to say that they are used without ever experiencing error, no matter which is used. You suggest that the availability of alternative ways A and B of expressing meaning M, where A and B are not distinguished from one another (no A-circumstance where it is an error to use B, and vice versa), is itself an occasion for error. But this is a sort of meta-error, of a different logical type from the first that you proposed, as I understand it:

    1. A person hears utterance x
    2. The person's "comprehension system" comes up with meaning m.
    3. The person's "production system" generates in imagination a set of
       utterances U = {w, y, z}, where each member of U expresses meaning m.
    4. The absence of utterance x from U is occasion of an error signal.
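
The four steps above can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the lookup tables MEANING_OF and PRODUCIBLE, the meaning labels, and the function hearing_error are hypothetical names invented here as stand-ins for the "comprehension system" and "production system". Nothing in the discussion specifies how either would actually work.

```python
# Toy stand-ins (hypothetical) for the two systems in steps 2 and 3.

# Comprehension: heard utterance -> meaning label.
MEANING_OF = {
    "John gave the money to the fund": "GIVE(John, money, fund)",
    "John gave the fund the money": "GIVE(John, money, fund)",
    "John donated the money to the fund": "DONATE(John, money, fund)",
    "John donated the fund the money": "DONATE(John, money, fund)",
}

# Production: meaning label -> the set U of utterances this listener would
# generate in imagination. Assumption: this listener does not produce (2c').
PRODUCIBLE = {
    "GIVE(John, money, fund)": {
        "John gave the money to the fund",
        "John gave the fund the money",
    },
    "DONATE(John, money, fund)": {
        "John donated the money to the fund",
    },
}


def hearing_error(x, comprehend=MEANING_OF, produce=PRODUCIBLE):
    """Steps 1-4: return (m, U, error) for a heard utterance x."""
    m = comprehend.get(x)        # step 2: guess at the meaning m
    U = produce.get(m, set())    # step 3: imagined alternatives for m
    error = x not in U           # step 4: x is absent from U
    return m, U, error
```

With these tables, hearing "John donated the fund the money" yields an error, while hearing either "give" sentence does not. The sketch only flags the mismatch; what the listener then does with it is the point at issue.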
Apparently, the person guesses at the meaning m. There must have been some basis for this guess--word choice, context, part of x is like part of w, part is like y, and so on. It seems to me that what one typically does with this sort of error is to perceive the other person's utterance x as an error of production or reception or both, perhaps as a repair or a change of syntactic horses in mid-sentence, etc. One's conversational response (or one's strategy as a listener or reader) includes heightened attention to those elements (words, word dependencies) about which one is less certain. One attends also to some alternative meaning m', and its alternative utterance set {p, q, r} also generated in imagination in parallel. One of these may suddenly pop up to most favored status. So-called "garden path sentences" provide simple examples of this process: "The horse raced past the barn fell down".

In the meta-error case, the person wishes to express meaning m and finds that there are alternative ways {w, y, z} for expressing it. If there is some motivation for choosing one rather than another, some difference that makes a difference to the person, then the choice is obvious and there is no error. Commonly, it seems to me, the unthinking preference is a repetition of words or of a syntactic construction recently used in the same discourse (conversation, speech, text, etc.). A more polished speaker may make a different choice with deliberate artifice, perhaps for emphasis or simply for variety. Again, so long as there is a basis for choice, I don't see any occasion for error.

Error may occur for example when choice w vs. choice y constitutes a token of social membership (like the different pronunciations of "tomato," "neither," and "economics"). This is not a difference of meaning in the sense that Bill intends, I think. (That's why I say it "constitutes" instead of "means" social membership.
It's performative, in the same sense that "I hereby declare this a disaster area", spoken by a duly empowered official, constitutes a change to the status that the phrase "disaster area" by itself means.)

You (Avery) had specifically in mind examples like (2a') and (2c'):

    2a'  John donated the money to the fund.
    2c' ?John donated the fund the money.

You propose that just because the learner's "production system" at time t produces both of these indistinguishably to express a given meaning, then every time the learner hears (2a') an error signal results. The language learner's response, you proposed, is to eliminate that error by reorganizing, specifically, by altering the production system so that it no longer produces (2c').

I think that a one-step elimination, analogous to marking (2c') with an asterisk (or question mark), will not do. Such a mechanism must surely function to eliminate production of (1a') every time the learner hears (1c') and to eliminate production of (1c') every time the learner hears (1a'):

    1a'  John gave the money to the fund.
    1c'  John gave the fund the money.

Instead, perhaps there is an incremental change in the expectability of one form or another. Since there may be three or four or more alternative forms in some cases, the best move it would seem to me is in the opposite direction: to *raise* the expectability of the form that actually occurs, rather than lowering that of the ones that do not occur. But no error signal is needed for this, and no reorganization.

In fact, I see no obvious reason to suppose that a construction such as (2c') could not hang around for the life of the language learner--one of many constructions predicted on analogy to forms that the learner has heard ("predicted by the rules of the grammar" in the customary hypostasis), but not itself actually encountered.
I think that it would be untenable to claim otherwise in the face of language novelty, language change, and our capacity for understanding the meaning of utterances across dialect variation, non-native usage, speaker error, difficulties of hearing, and so on. This interpretation accords with the fact that the acceptability of sentences is not a binary, yes/no property, but rather a graded property, and probably complexly so (that is, on more than one graded parameter in parallel, e.g. differentiated by subject-matter domain for starters).

So I think there are other ways of showing that language acquisition, like all control processes, is guided by perceptual error ("negative evidence"), and that the alternatives are not limited to your proposal and GB.

-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-

(Bill Powers (921025.1900) ) --

>                       desired
>                       meaning (nonverbal perception)
>                         |
>   perceived meaning-->COMP-------error --->
>           |                               |
>           |                               |
>    input function                  output function
>        | | | |                         | | | |
>       sentences                  sentence-variations
>           |                               |
>           <---Imagination or real words <-----

> Going up the left side, you perceive sentences, which then give rise
> through some sort of input function to perceived meanings. The
> perceived meanings (which are not in words, but in terms of
> perceptions of the things that the words mean) are compared directly
> with the desired meanings. The error acts through an output function
> to vary the construction of sentences. These are the sentences that
> are perceived, closing the loop.

This diagram makes the point about arriving at outputs by continuous control of perception rather than by top-down realization of some sort of archetype. The actual flow of control is probably quite different.

Penni asks (penni sibun 921026):

>where are the symbol structures coming from? why are any necessary?
>what does meaning have to do with anything?
>why can't speech actions
>be just like other actions, which presumably aren't mediated by
>``meaning''?

I would suggest that all perceptions "have" meaning, that is, any given perception is linked in associative memory to other perceptions. I'm not sure precisely what you (penni) mean by the phrase "mediated by meaning". Let me lay out a bit of what I think is going on.

Even before we consider meanings, language perceptions involve most of the proposed levels of the perceptual hierarchy, if not all. Phonology alone runs from intensity level up to at least the event level, possibly the program and even principle levels if some ways of doing phonology survive the eventual PCT shakeout. Morphemes and words are event perceptions in Bill's usual estimate of things. (Bill: Is this problematic if syllables or semisyllables turn out to be events? If so, word events would be constituted of syllable events.)

Nonverbal perceptions obviously involve all the proposed levels of the hierarchy, independently and in parallel to these language perceptions.

Event-level word perceptions are linked with nonverbal perceptions. Bill has suggested that the input of a category perception ECS can be satisfied by various nonverbal perceptions (glimpse wagging tail, barking sound ==> dog) or by a word ("dog") or both. I have suggested that the link is in associative memory, and that it is not a simple 1-1 correlation of words and categories.

    Nonverbal Perceptions          Word Events

    P1   P1'   P1"       <---->    W1
    P1'  P16   P17       <---->    W2
    P2   P3    P4    P5
    P6   P6'   P6"       <---->    W6
    P7   P8    P9    P10   P11   P12
    P13  P13'  P13"      <---->    W13
    P14  P15
    . . .

Here, I represent word W1 as evoking perceptions P1, P1', and P1", word W2 evoking perceptions P1', P16, and P17, and so on. This sort of partial overlapping and intersecting is typical of word meanings. ("Dog" links through one sort of perception back out to the word "hound" and through another sort of (sexist) perception back out to "broad."
Each of these words evokes other perceptions, which in turn have yet other words associated with them.) Conversely, perception P1' evokes both word W1 and word W2.

Now suppose I want to communicate perception P1' to someone. I can pick either word W1 or word W2 (in our simplified universe here). My choice might be determined in part by extraneous perceptions associated with one word but not the other.

As Penni and Avery have suggested, my choice is constrained by considerations that apply to the words qua words with respect to other words already spoken or planned to be spoken. Among these considerations are the operator status of the word, the other words it requires as arguments. Bill argues that these factors are determined by the perceptions (meanings) associated with the words: You can't have a "jump" perception without a perception of a thing jumping. Against this is the proposition that there are only certain kinds of perceptions that you can talk about, namely, those provided for in your language and culture. (You can talk about more inchoate perceptions, but only with difficulty, and with far less assurance of being understood.)

Also among these considerations are the reductions available when the word enters on its arguments. This is determined largely by convention. Some redundancy allowing for reduction is determined by universal criteria such as word repetition, but as we saw earlier with respect to the "give" and "donate" lists of verbs some of it is learned convention, differing from one language to another, with a determinable history, and with many regular correspondences among the different languages spoken by people whose ancestors once spoke the same language.

One learns these idiosyncrasies and controls one's perceptions of one's behavioral outputs for conformity to them, for the sake of constituting one's membership in one social group or another.
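
The many-to-many links between words and nonverbal perceptions described earlier, and the choice between W1 and W2 for communicating P1', can be made concrete with a small Python sketch. Everything in it is an illustrative assumption: the dictionary EVOKES just transcribes the schematic W/P links, and the functions words_for and choose_word, along with the idea of scoring candidates by counting unwanted extraneous perceptions, are hypothetical and not a claim about how associative memory actually works.

```python
# Hypothetical associative links, transcribed from the schematic W/P example.
EVOKES = {
    "W1": {"P1", "P1'", 'P1"'},
    "W2": {"P1'", "P16", "P17"},
    "W6": {"P6", "P6'", 'P6"'},
    "W13": {"P13", "P13'", 'P13"'},
}


def words_for(perception, evokes=EVOKES):
    """Return, in sorted order, every word linked to the given perception."""
    return sorted(w for w, ps in evokes.items() if perception in ps)


def choose_word(perception, unwanted=frozenset(), evokes=EVOKES):
    """Pick a word for `perception`, preferring the candidate that evokes
    the fewest perceptions the speaker wants to avoid (an assumed rule)."""
    candidates = words_for(perception, evokes)
    if not candidates:
        return None  # no word is linked to this perception
    return min(candidates, key=lambda w: len(evokes[w] & set(unwanted)))
```

For example, words_for("P1'") returns both W1 and W2, and choose_word("P1'", unwanted={"P16"}) settles on W1 because W2 would also evoke P16. Constraints from operator-argument dependencies and reductions are left out of this sketch entirely.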
>                       desired
>                       meaning (nonverbal perception)
>                         |
>   perceived meaning-->COMP-------error --->
>           |                               |
>           |                               |
>    input function                  output function
>        | | | |                         | | | |
>       sentences                  sentence-variations
>           |                               |
>           <---Imagination or real words <-----

In this loop, you have to understand that the meanings are not on a higher level of the perceptual control hierarchy, they are in parallel nonverbal parts of the hierarchy. The labels "input function" and "output function" conceal a great deal of complexity controlling for words as event perceptions, operator-argument dependencies, reductions, and "style" perceptions of many sorts (ranging from "that's a cliche" to "what kind of person talks this way"). And as Martin emphasizes, the "desired meaning" component probably equates to one's perception of the audience "getting it," in imagination or in the environment.

I'm sorry, I can't seem to do any better than that just now, and I see by my date stamp that it has taken me all week to get even this far. Maybe I can do better next week.

-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-

This is what I started with, but I'm putting it last as being of least interest to non-linguists.

(Avery Andrews 920211.1049) --

The terms S, NP, `noun phrase', and `clause', and the familiar tree structures (and rewrite rules, labelled bracketings, etc.) are elements of certain systems for describing language, based upon phrase-structure grammar. Insofar as they are not elements of other systems for describing language, and absent any demonstration that the phrase-structure systems are "correct" and the other systems are "incorrect" with respect to the status of these elements, it seems that these things are not elements of language.

In operator grammar, these seeming elements are seen to be byproducts of relations and processes involving simpler elements. Most perspectives agree that these simpler elements--e.g. words, morphemes, sequences of them--are elements of language.
The relations are relations of word dependency (and word classes based on these dependencies), and the processes involve assertion ("predication") plus reduction of the actually pronounced form of a word when the gain on control for its being pronounced is reduced. The two perspectives are in general agreement about much of this, though their ways of talking about it differ.

There are other ways of describing language that see S, NP, etc. as byproducts of the simpler elements and relations that they do countenance. However, these other systems are like the familiar phrase-structure-based systems in that they, too, postulate ancillary metagrammatical elements and relations as somehow basic. Harris's aim was a "least grammar", since any additional objects and relations can only obscure the objects and relations by which language "carries" information.

        Bruce
        bn@bbn.com