Feel free to email me (David.Wessels@viu.ca) with anything you need clarified.
Essentially, the idea is that a string looks like:
(1) an opening "
zero or more of: (2) either a \ followed by anything OR anything except a \ or "
(3) a closing "
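To make the three parts concrete, here is a minimal sketch in plain C of a hand-written check for that shape. It is not the lex rule itself; the function name is_string_literal is made up for illustration, and it assumes the whole input should be exactly one string.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of any project code): returns 1 if s is
   exactly one string of the shape described above, 0 otherwise. */
int is_string_literal(const char *s)
{
    assert(s != NULL);
    size_t i = 0;

    /* (1) an opening " */
    if (s[i] != '"') return 0;
    i++;

    /* (2) zero or more of: a \ followed by anything,
           OR anything except a \ or " */
    while (s[i] != '\0' && s[i] != '"') {
        if (s[i] == '\\') {
            if (s[i + 1] == '\0') return 0; /* \ with nothing after it */
            i += 2;                         /* skip the \ and the escaped char */
        } else {
            i++;
        }
    }

    /* (3) a closing ", with nothing left over after it */
    return s[i] == '"' && s[i + 1] == '\0';
}
```

Note that the escaped-quote case falls out of part (2): a \ always consumes the character after it, so a " that follows a \ never gets treated as the closing quote.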
Of course, expressing (2) is the tricky part, since we need to precede \ with an
extra \ each time, so if our logical view of the regex for (2) is something like
The problem is, we cannot generally tell at compile time what data type any given variable holds, so we cannot really type check them.
There is some sample code in one of the lex/yacc examples, though obviously not targeted at this specific project.
All we can really check are expressions with statically-known data types (i.e. those that use only literal values or function calls with a restricted return type, like the three read functions).
Within those limitations, we could - in our parsing rules - build up
known information about an expression, e.g. its type. This could
possibly be classified like 0=unknown, 1=integer, 2=string, etc.,
and we could use something like the inum field to track an expression type, e.g.
%type<inum> expression
The default would be unknown; set it to something else only in those rare cases when you can identify the type, e.g. the expression is a literal, or it is an expression whose operand types are all known (and valid for the operator).
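As a rough sketch of how the type codes might be combined for one operator, here is a small C helper. Everything named here is hypothetical (arith_result_type, the TYPE_ names, the extra TYPE_ERROR code), and it assumes a binary arithmetic operator that is only valid on two integers; adjust to whatever your language actually allows.

```c
#include <assert.h>

/* Type codes following the 0=unknown, 1=integer, 2=string idea.
   TYPE_ERROR is an extra code invented for this sketch. */
enum { TYPE_ERROR = -1, TYPE_UNKNOWN = 0, TYPE_INTEGER = 1, TYPE_STRING = 2 };

/* Hypothetical helper: result type of "left op right" for an arithmetic
   operator, ASSUMING arithmetic is only valid on two integers. */
int arith_result_type(int left, int right)
{
    assert(left >= TYPE_UNKNOWN && left <= TYPE_STRING);
    assert(right >= TYPE_UNKNOWN && right <= TYPE_STRING);

    /* If either side is unknown we cannot check anything at compile time. */
    if (left == TYPE_UNKNOWN || right == TYPE_UNKNOWN) return TYPE_UNKNOWN;

    /* Both sides known: valid only if both are integers. */
    if (left == TYPE_INTEGER && right == TYPE_INTEGER) return TYPE_INTEGER;

    /* Both sides known but not valid for the operator. */
    return TYPE_ERROR;
}
```

In a parsing rule the action could then pass the operand types through a helper like this, e.g. something along the lines of $$ = arith_result_type($1, $3); and report an error when the result is TYPE_ERROR.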
/* in defs section of .lex AND .yacc files: info to track about variables */
typedef struct VI {
    char* name;
    int scopelevel;
} VarInfo;

/* in your union, you can use the defined type, e.g. */
%union {
    int inum;
    float rnum;
    char * str;
    VarInfo v;
}
%type<v> variable

Again, there is some sample code with a nodeinfo struct in the last lex/yacc examples.
One possibility is to think of the scope we're in as the nesting depth, and just track that as an integer value - put variables and their scope into the table when you see them defined, and remove them when their scope ends. When you see a variable used, it's the one in the table right now with the highest scope. (If there isn't one then the variable hasn't been declared in any accessible scope.)
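A minimal sketch of this first approach in C, under some assumptions: a small fixed-size table, names stored as pointers (so the caller must keep the strings alive, e.g. copies made in the lexer), and helper names (enter_scope, exit_scope, declare_var, lookup_var) that are made up for illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_VARS 256   /* arbitrary limit chosen for this sketch */

typedef struct VI { const char *name; int scopelevel; } VarInfo;

static VarInfo table[MAX_VARS];
static int numvars = 0;    /* entries currently in the table */
static int curscope = 0;   /* current nesting depth, 0 = outermost */

/* Entering a nested scope just bumps the depth. */
void enter_scope(void) { curscope++; }

/* Record a declaration in the current scope; -1 if the table is full. */
int declare_var(const char *name)
{
    assert(name != NULL);
    if (numvars >= MAX_VARS) return -1;
    table[numvars].name = name;
    table[numvars].scopelevel = curscope;
    numvars++;
    return 0;
}

/* Leaving a scope: drop every variable declared at the current depth.
   They are always the most recent entries, since scopes nest. */
void exit_scope(void)
{
    while (numvars > 0 && table[numvars - 1].scopelevel == curscope)
        numvars--;
    if (curscope > 0) curscope--;
}

/* Variable use: search newest-first, so the first match is the one with
   the highest scope. Returns its scope level, or -1 if not declared. */
int lookup_var(const char *name)
{
    assert(name != NULL);
    for (int i = numvars - 1; i >= 0; i--)
        if (strcmp(table[i].name, name) == 0)
            return table[i].scopelevel;
    return -1;
}
```

Because entries are added in declaration order and removed when their scope ends, the table behaves like a stack, which is what makes the newest-first search give the right answer.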
A second possibility is to keep variables in the table forever, but give each scope a unique identifier, and with each scope keep track of its "parent" scope (maybe a little scope table someplace). Then when you see a variable used, check the current scope first, then its parent scope, then that scope's parent, etc.
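The second approach might look something like the following C sketch. Again, the names (new_scope, declare_in, resolve), the fixed array sizes, and the use of -1 to mean "no parent" are all assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_SCOPES 128   /* arbitrary limits chosen for this sketch */
#define MAX_SVARS  256

/* The "little scope table": parent_of[id] is the enclosing scope's id,
   or -1 for the outermost scope. */
static int parent_of[MAX_SCOPES];
static int numscopes = 0;

typedef struct { const char *name; int scopeid; } ScopedVar;
static ScopedVar svars[MAX_SVARS];   /* variables are never removed */
static int numsvars = 0;

/* Create a scope with a unique id and remember its parent. */
int new_scope(int parent)
{
    assert(numscopes < MAX_SCOPES);
    parent_of[numscopes] = parent;
    return numscopes++;
}

/* Record a variable as declared in the given scope; -1 if table is full. */
int declare_in(const char *name, int scopeid)
{
    assert(name != NULL);
    if (numsvars >= MAX_SVARS) return -1;
    svars[numsvars].name = name;
    svars[numsvars].scopeid = scopeid;
    numsvars++;
    return 0;
}

/* Variable use: check the current scope, then its parent, and so on.
   Returns the id of the scope that declares it, or -1 if none does. */
int resolve(const char *name, int scope)
{
    assert(name != NULL);
    for (int s = scope; s != -1; s = parent_of[s])
        for (int i = 0; i < numsvars; i++)
            if (svars[i].scopeid == s && strcmp(svars[i].name, name) == 0)
                return s;
    return -1;
}
```

The trade-off compared with the nesting-depth approach is that nothing is ever thrown away, so the table is still complete after parsing, at the cost of walking the parent chain on every lookup.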