CSCI 439 Final Exam Practice Questions

These don't cover all the topics/areas from this year's course material, but, taken together with the term's quiz questions should provide a general feel for the style/difficulty/kinds of questions you may be asked.

Q1

In lab 6, to carry out your code generation (from Vurbossity to either Marie or C++) you needed to traverse (walk) the abstract syntax tree and reference the symbol/function tables provided as parts of the intermediate representation (IR).

Suppose the IR used a syntax graph instead of a tree, where multiple internal nodes could point to the same subtree.

Discuss why this would/would not have altered your overall approach to code generation from the IR.

Q2

In lab 5 we applied a value-number system to an expression for a specific grammar. Based on the same grammar,

Fill in the value-number table for the expression (val * x) - (y / (val * x))
Briefly explain how the table reveals where we can avoid re-computing an intermediate value.

Note: I should have specified this was to use addexpr as the starting non-terminal

In the actual exam, a copy of the grammar would be provided here, but for the practice exam it can be found on the lab5 page.

Similar questions could be asked for the other lab5 exercises (draw a syntax graph, rewrite a statement sequence using SSA, etc), though as exam questions they would need to be significantly shorter than the lab5 versions.

Q3

Many languages use dynamic typing: a variable can be assigned a value of any data type, e.g.

   x = 3;
   y = "foo"
   z = y
   y = x
etc

In such languages the data type is associated with the current value stored in the variable.

Suppose we have such a language with data types int, real, and string, and the language has the following type-handling functions built in:

typeof(value) returns the datatype of the value as one of three constants: INT, REAL, or STRING.
convert(value,type) returns a converted version of the value, where type is one of INT,REAL,STRING e.g. convert(3,STRING) might return "3"

The compiler is expected to insert additional code to detect and generate any needed type conversions type conversions as the program runs. For example, on encountering x + y it would need to generate runtime code that checks the types of x and y and applies any necessary conversion before performing the +.

Outline a way in which the compiler could generate and insert the necessary code for these runtime operations.

Q4

Suppose the programming community agreed upon a standard set of 3-address linear code instructions that could be used to effectively represent the intended behaviour of most common programming languages.

Discuss how such a standard would/could change the way we develop compilers, emphasizing any potential benefits and any potential problems you see arising from this.

Q5

Suppose you have a CFG whose rules currently support expressions using negation (e.g. -x) and subtraction (e.g. x-y), but you now need to add support for post-decrement (e.g. x--) and pre-decrement (e.g. --x).

Discuss how much more (or how much less) complicated your scanner and parser is likely to become (a) if whitespace is mandatory between operators, and (b) if whitespace is optional between operators.

Q6

Discuss how your design priorities and expectations would differ if you were developing a compiler for a general purpose language like C++ as compared to developing a compiler for a visual/graphical educational programming language like Snap.

Q7

Suppose you were working on a compiler for Vurbossity, and were basing (at least part of) the intermediate representation on three-address codes.

1. Make up a set of 10-15 such codes to try and cover core operations in the IR.

2. Discuss which (if any) Vurbossity features could not be easily represented using your given set of codes and why.

Q8

Given the following code segment (some short series of assignments of arithmetic expressions of variables and constants here), translate it into (i) one-address (stack) code, (ii) three-address code with naming in an SSA form

Q9

Describe, as accurately as you can, the relationship between the size of a source code program (in number of tokens) and the size of (i) its parse tree, (ii) its abstract syntax tree.

Q10

We considered the idea of generating a control flow graph as part of an IR based on blocks of linear code and jumps/branches between them.

Given the following intermediate representation of code in three-address form, (some chunk of code would go here):
(i) identify the basic blocks within the segment by the first and last statement in each block,
(ii) give a list of the directed edges in the control flow graph for the segment.

Q11

Languages that support 'goto' statements often only restrict the destination for the goto to labels within the same subroutine. If you were tasked with adding such a restriction to a general purpose language, would you attempt to make it part of the context free grammar, or part of the context sensitive checking? Why?