yacc (yet another compiler compiler) lets you provide a set of grammar rules for a language, and it produces a program that can then parse input files written in that language. It also lets you tell yacc what to do with each language component it recognizes, so you can effectively create a syntax checker, interpreter, or compiler for any language you want (as long as you can come up with the grammar rules for the language and the C code telling yacc what to do with the various language statements).
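To make the idea of "a grammar rule plus C code saying what to do" concrete, here is a minimal hand-written C sketch. It is not yacc output: yacc generates a table-driven parser, whereas this sketch uses recursive descent because it is easier to read. The toy grammar (sum --> NUMBER and sum --> NUMBER '+' sum) and all function names are made up for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical toy grammar, for illustration only:
 *     sum --> NUMBER
 *     sum --> NUMBER '+' sum
 * The "action" attached to each rule computes the value of the sum.
 */

/* Skip blanks and tabs (roughly the job a lex scanner would do). */
static const char *skip_blanks(const char *p) {
    while (*p == ' ' || *p == '\t') p++;
    return p;
}

/* Recognize NUMBER. On success, store its value and return the
 * position just past it; on failure, return NULL. */
static const char *parse_number(const char *p, long *value) {
    long v = 0;
    p = skip_blanks(p);
    if (!isdigit((unsigned char)*p)) return NULL;
    while (isdigit((unsigned char)*p)) {
        v = v * 10 + (*p - '0');
        p++;
    }
    *value = v;
    return p;
}

/* Recognize sum, applying the action for whichever rule matched. */
static const char *parse_sum(const char *p, long *value) {
    long left, right;
    p = parse_number(p, &left);
    if (p == NULL) return NULL;
    p = skip_blanks(p);
    if (*p == '+') {
        p = parse_sum(p + 1, &right);
        if (p == NULL) return NULL;
        *value = left + right;      /* action for: sum --> NUMBER '+' sum */
    } else {
        *value = left;              /* action for: sum --> NUMBER */
    }
    return p;
}

/* Returns 1 if the whole input is a valid sum, 0 on a syntax error.
 * *result is meaningful only when 1 is returned. */
int eval_sum(const char *input, long *result) {
    const char *end = parse_sum(input, result);
    if (end == NULL) return 0;
    end = skip_blanks(end);
    return *end == '\0';
}
```

Calling eval_sum("1 + 2 + 39", &r) accepts the input and leaves the total in r, while eval_sum("1 +", &r) reports a syntax error. Replacing the two action lines with nothing at all would turn this interpreter into a pure syntax checker, which is the same choice you make when writing actions in a yacc file.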
Suppose I have some language, L, and I want to be able to create a syntax-checking program, checkL, that reads programs written in L and checks that their syntax is valid, producing appropriate error messages if not. The steps would be as follows:
The trick now is understanding how we have to express our token and grammar rules in a way that lex and yacc understand, and that also correctly describes our language L.
There are quite a few intros to lex/yacc available online; I'd recommend browsing
to find one written in a style that appeals to you.
Some specific grammar examples for the C language are given below.
The small examples below use lex/yacc to recognize various languages/features, with some comments in the .lex and .yacc files to explain what is being done.
Each directory contains a makefile to build an interpreter for the language (called interp in each case) and a subdirectory of test cases.
Run make to build lex.yy.c, y.tab.c, and interp,
then apply the interpreter using ./interp < filename
e.g. ./interp < testcases/t1
order --> items, payments
items --> item
items --> item, items
payments --> payment
payments --> payment, payments
item --> price, itype, quantity, etc
payment --> ptype, amount, etc

forest --> tree
forest --> tree, forest
tree --> node // i.e. the root node
node --> data, children
children --> node
children --> node, children
data --> datum
data --> datum, data
datum --> dtype, dname, dconstraints

erd --> entities, relationships
entities --> entity
entities --> entity, entities
entity --> ename, attributes
attributes --> attribute
attributes --> attribute, attributes
attribute --> type, attrname, attrconstraints
relationships --> relationship
relationships --> relationship, relationships
relationship --> entities, rname, rconstraints, entities // the "from" and "to" entities
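All of the rule sets above describe lists with the same pair of rules: one item, or one item followed by more of the same list. As a rough illustration of how that pattern recognizes input, here is a small hand-written C checker for entities --> entity and entities --> entity, entities. It rests on assumptions that the rules themselves do not state: the comma is treated as a literal separator in the input, and an entity is simplified to a single name made of letters. It uses recursive descent rather than the table-driven parser yacc would generate, and the function names are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* List pattern used throughout the rules above:
 *     entities --> entity
 *     entities --> entity, entities
 * Assumptions for this sketch only: ',' is a literal token and an
 * entity is a single name consisting of letters.
 */

/* Skip spaces between tokens. */
static const char *skip_spaces(const char *p) {
    while (*p == ' ') p++;
    return p;
}

/* entity: one or more letters. Returns the position just past the
 * name, or NULL if no entity starts here. */
static const char *match_entity(const char *p) {
    p = skip_spaces(p);
    if (!isalpha((unsigned char)*p)) return NULL;
    while (isalpha((unsigned char)*p)) p++;
    return p;
}

/* entities: try the first rule, then continue with the second rule
 * if a comma follows. Counts each entity matched along the way. */
static const char *match_entities(const char *p, int *count) {
    p = match_entity(p);
    if (p == NULL) return NULL;
    (*count)++;
    p = skip_spaces(p);
    if (*p == ',')
        return match_entities(p + 1, count);   /* entity, entities */
    return p;                                  /* entity */
}

/* Returns the number of entities in a valid list, or -1 if the
 * input does not match the rules. */
int count_entities(const char *input) {
    int count = 0;
    const char *end = match_entities(input, &count);
    if (end == NULL) return -1;
    end = skip_spaces(end);
    return (*end == '\0') ? count : -1;
}
```

With these assumptions, count_entities("student, course, instructor") accepts the list, while a trailing comma or a missing comma between names is rejected. The recursion in match_entities mirrors the rule that refers to itself, which is why a list of any length can be described with only two rules.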