[GTALUG] Online Course for Lex/Yacc?
D. Hugh Redelmeier
hugh at mimosa.com
Sun Sep 16 13:47:30 EDT 2018
Ater advising against YACC, I thought I should promote it a bit.
YACC uses a formal declarative system for specifying a language
grammar (Backus-Naur Form). This has a number of nice features:
- BNF is very well described and extensively used in the literature
- it was invented to describe the programming language Algol 60.
That document is one of the classics of computer science
and is still a must-read. Here's a copy:
<https://www.masswerk.at/algol60/report.htm>
- many bastardizations of BNF have been used. The real thing is better
than most of its successors.
- a BNF grammar is a context-free grammar (Chomsky's term. Yes, that
Noam Chomsky)
- if a grammar is ambiguous, YACC will tell you. Not at runtime but
at table-building time. This is really really useful because it is
very easy to inadvertently create an ambiguous grammar -- generally
a Bad Thing. Informal recursive descent parsers never detect such
problems.
This feature is especially useful for those still learning about
language design.
- YACC has features to resolve ambiguities. They are short-cuts that
cloud the issues and I think that they are a Bad Thing.
- an LR(k) grammar (invented by Knuth before LALR) means that a
deterministic Left to Right single-pass parser (i.e. one without any
backtracking) can "recognize" the language with only a k-symbol
look-ahead. LALR(k) is a subset of LR(k) for which it is known how
to generate an efficient parser. In practical terms, k should be 1.
- when given a choice between a declarative and a procedural model,
always at least consider declarative. Declarative is much easier to
reason about as the system gets even a little complicated.
One learns a lot about language design by writing a BNF grammar and
debugging it through YACC.
lex is based on some theory (Chomsky Type 0 (Regular) languages) but
is more ad hoc.
More information about the talk
mailing list