NetNews Usenet Archive 1993 #3

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Usenet Archive 1993 #3 / NN_1993_3.iso / spool / comp / compiler / 2232 < prev next >

Wrap

Text File | 1993-01-23 | 6.3 KB | 226 lines

Newsgroups: comp.compilers Path: sparky!uunet!cs.utexas.edu!qt.cs.utexas.edu!yale.edu!yale!mintaka.lcs.mit.edu!spdcc!iecc!compilers-sender From: jan@klikspaan.si.hhs.nl Subject: Recursive descent parsing is better than LL(1) Reply-To: jan@klikspaan.si.hhs.nl Organization: Compilers Central Date: Fri, 22 Jan 1993 16:41:50 GMT Approved: compilers@iecc.cambridge.ma.us Message-ID: <93-01-160@comp.compilers> Keywords: parse, LL(1), theory References: <93-01-122@comp.compilers> <93-01-123@comp.compilers> Sender: compilers-sender@iecc.cambridge.ma.us Lines: 211 Barton Christopher Massey < bart@cs.uoregon.edu > writes: > I will personally choose to use YACC or something similar for parsing until > I can implement an LL(1) parser generator which will take a reasonably > ad-hoc grammar and either transform it to an LL(1) grammar or report > failure to do so, "most of the time". > In order to construct such a tool, I would like to prove or disprove the > following conjectures: > LL(1) Transformability: There exists an algorithm > which, given any grammar G which is unambiguous and > describes an LL(1) language L(G), produces an LL(1) > grammar G' for the language L(G). > LL(1) Decidability: There exists an algorithm which, > given any grammar G which is unambiguous and describes > a language L(G), decides whether L(G) is an LL(1) > language. > After some thought, it still seems most likely to me that LL1T is true, > but I'm skeptical about LL1D . I posted these conjectures to > comp.compilers previously, but didn't get too far with them then. In a > way, I'm surprised if such apparently important questions haven't been > carefully studied, but I haven't found any very useful references yet. Bill Purvis <W.Purvis@daresbury.ac.uk> writes: > In Computer Journal, Vol 11, number 1, May 1968, J. Foster describes SID, > a syntax improving program which accepts a fairly arbitrary BNF > description and applies the neccesary transformations to produce an LL(1) > grammar (assuming that the language is LL(1). I think that that fully > covers the first conjecture. > The program is pretty comprehensive and while it might not cover all > cases, I suspect that in practical terms it probably covers the second > conjecture. This discussion is a bit blurred by the use of the clauses `given any grammar G' and `reasonably ad-hoc grammar'. I agree with Bill Purvis that SID does a good job on some fuzzy set of `reasonably good ad-hoc grammars'. The general case `given any unambigous grammar G' cannot be solved by SID if it only uses the transformations substitution, elimination of left recursion and factorization. I present a simple contra-example: Consider the language defined by the next CFG: S -> A S -> B A -> `x' A `y' A -> `a' B -> `x' B `z' B -> `b' SID uses elimination of left recursion, factorisation and substitution to search for a better grammar. The problem here is that A and B have `x' in common in their FIRST-sets. So let's substitute A and B in the S-rules giving: S -> `x' A `y' S -> `a' S -> `x' B `z' S -> `b' A -> `x' A `y' A -> `a' B -> `x' B `z' B -> `b' Two S-rules now start both with `x' so we have to factorize them giving: S -> `x' S1 S -> `a' S -> `b' A -> `x' A `y' A -> `a' B -> `x' B `z' B -> `b' S1 -> A `y' S1 -> B `z' Now after solving the problem with the S-rules, we get the same problem with the S1-rules. We can repeat both substitution and factiorisation on S1 giving: S -> `x' S1 S -> `a' S -> `b' A -> `x' A `y' A -> `a' B -> `x' B `z' B -> `b' S1 -> `x' S2 S1 -> `a' S1 -> `b' S2 -> A `y' `y' S2 -> B `z' `z' Giving rise to the same kind of conflict, but now on S2-rules. Obviously the proces of substitution and factorization can be repeated indefinitedly. The loop suggests the conjecture of decidability doesn't hold. Foster expresses this problem with `SID may also report failure because its attempt to transform the grammar causes it to loop.' But let's return to the initial discussion: recursive descent versus YACC. Maybe we underestimate the power of recursive descent technique if only we could get rid of the idea each nonterminal should be mapped to one function recognizing the right sides of its productions. Let's try a different approach: Each function recognizes right sides of one or more nonterminals, returning as a result the kind(s) of nonterminals recognized. Let's return to the example grammar above: S -> A S -> B A -> `x' A `y' A -> `a' B -> `x' B `z' B -> `b' As we cannot discern A from B let's integrate them into one function AorB. This should answer the question `What did you see' with : neither (call this an error), A, B, or Both. AorB might be modelled with a statediagram like: +-----------+ +------A-->-+ `y' +---y---> result is A | +-----------+ +--------+ +---+--+ =-----------+ >----+ `x' +---x-->-+ AorB +---B----+ `z' +---z---> result is B +--------+ +-+-+--+ +-----------+ | | +-----------+---x---> result is A V +----Both---+ `y' or z' + neither +-----------+---y---> result is B | conflicting inputs ---+---------------------------------> result is neither As AorB has no Both exit, this branch could be pruned from the diagram. In C this would look like: #define neither 0 #define Both 1 #define A 2 #define B 3 char lookahead; int match(char c) { if c == lookahead) { lookahead = getchar(); return 1; } else return 0; } int AorB() { int r; if (match('x')) { switch (AorB()) { case neither: return neither; case A: if (match('y')) return A; else return neither; case B: if (match('z')) return B; else return neither; case Both: /* omitted */ } } else return neither; } Note: This approach is top-down and recursive descent, but not predictive parsing! I suppose others have figured this out before. I have no label to stick upon it. Maybe someone else on the net. Regards, Jan Schramp -- Jan Schramp, Haagse Hogeschool - Sector Informatica. Louis Couperusplein 19, 2514 HP Den Haag, Netherlands. E-mail: jan@si.hhs.nl -- Send compilers articles to compilers@iecc.cambridge.ma.us or {ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.