NetNews Usenet Archive 1992 #19

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Usenet Archive 1992 #19 / NN_1992_19.iso / spool / comp / compiler / 1444 < prev next >

Wrap

Internet Message Format | 1992-08-23 | 3.7 KB

Path: sparky!uunet!dtix!darwin.sura.net!mips!sdd.hp.com!uakari.primate.wisc.edu!usenet.coe.montana.edu!ogicse!das-news.harvard.edu!spdcc!iecc!compilers-sender From: jrickard@eoe.co.uk (John Rickard) Newsgroups: comp.compilers Subject: Re: Generating LALR(1) Grammar from an arbitrary CFG. Keywords: LALR, parse, theory Message-ID: <92-08-144@comp.compilers> Date: 23 Aug 92 16:24:47 GMT References: <92-08-118@comp.compilers> Sender: compilers-sender@iecc.cambridge.ma.us Reply-To: jrickard@eoe.co.uk (John Rickard) Organization: Compilers Central Lines: 58 Approved: compilers@iecc.cambridge.ma.us kumar@ra.csc.ti.com (Sundeep Kumar) writes: : Is there a utility that accepts an abitrary CFG and either gives an : equivalent LALR(1) grammar or decides that the CFG has no equivalent : LALR(1) grammar ? Is this problem in general, solvable ? No, it's not soluble. (Which doesn't exclude the possibility of such a utility that works in the cases one's interested in.) Sketch proof: I assume the lemma: There is no algorithm to decide whether a CFG generates the language consisting of every word of its alphabet. (This result can be found in books on languages and automata, and I give a very sketchy proof below.) Given any language L on alphabet A with a CFG, take a context-free language M on a disjoint alphabet B such that M has no LALR(1) grammar. Consider the language N on the alphabet A U B consisting of all words S such that either discarding all letters in B from S yields a word in L or discarding all letters in A from S yields a word in M. If L contains all words on A, then N contains all words on A U B and so N has a LALR(1) grammar; if L does not contain all words on A, then it is not hard to see that N does not have a LALR(1) grammar (since if it did then so would M). So if it were decidable whether N had a LALR(1) grammar, then it would be decidable whether L contained all words on A, which contradicts the lemma. Very sketchy proof of lemma: This uses the well-known fact that it is not decidable whether an arbitrary Post correspondence problem is soluble. A Post correspondence problem is defined by a finite set of ordered pairs of words over a given alphabet; a solution is a finite non-empty sequence of pairs of words, each from the given set, such that the concatenation of the first word from each pair equals the concatenation of the second word from each pair. (For example, is there an English "sentence" with a word-for-word French translation that consists of the same letters in the same order, with only the positioning of the spaces different?) Now given a Post correspondence problem, we add two characters "," and ":" to its alphabet and consider the language consisting of all words on the extended alphabet except those that are generated from a solution of the Post correspondence problem by concatenating (to use the terms of the example above): the English words, separated by commas; a colon; the reverse of the string formed by the French words, separated by commas. This language has a CFG, and it contains all words on its alphabet if and only if the Post correspondence problem is not soluble; this proves the lemma. (Why does the language have a CFG? Well, it's the union of these three languages, and it's not hard to find a CFG for each of them: (a) words not containing exactly one colon; (b) words containing exactly one colon that do not form palindromes if the commas are omitted; (c) words containing exactly one colon where the substrings between commas do not match up around the colon as English words and reversals of French translations.) -- John Rickard -- Send compilers articles to compilers@iecc.cambridge.ma.us or {ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.