Overview
========

This directory contains a C tokenizer for the Simp language, which is described further below.

A test program is also provided, allowing you to run the tokenizer on a source code file containing (valid or invalid) Simp code. The program calls the tokenizer to produce a list of the tokens found in the file and displays the resulting list.

Repository contents
===================

This repository contains:
  this README file,
  an Examples directory containing (valid and invalid) programs written in the Simp language,
  tokens.h and tokens.c, containing the tokenizing code,
  tokenizer.c, which contains the main routine to invoke the tokenizing code,
  tokenizer: the executable built from the three C files, and
  a makefile, which can be used to compile tokenizer.

The data types, constants, and functions needed for tokenizing are specified in tokens.h.

The Simp language
=================

The tokens in Simp are as follows:
  Constants: begin with an uppercase alphabetic character, followed by zero or more digits.
  Variables: consist of one or more lowercase alphabetic characters.
  Integers: consist entirely of digits.
  Symbols: consist of any one of the following single characters: + - * / =
  The modulo operator is the only two-character operator: %%

The tokenizer does not need to know or check the validity of the programs it is given; the grammar for Simp is shown below for completeness only.

  Program    --> Statements
  Statements --> Statement
  Statements --> Statement Statements
  Statement  --> Variable = Expression
  Expression --> Expression Operator Value
  Expression --> Value
  Value      --> Variable
  Value      --> Constant
  Value      --> Number
  Operator   --> [+*/-]
  Operator   --> %%
  Constant   --> [A-Z] [0-9]*
  Variable   --> [a-z]+
  Number     --> Integer
  Number     --> FixNum
  Integer    --> [0-9]+
  FixNum     --> [0-9]+.[0-9]+

Examples of Simp
================

The test cases in the examples directory are:
  varsOnly    - not full statements, just variables and whitespace
  basics      - basic statements of form "variable = value"
  simpleExpr  - statements with expressions of form "variable = value op value"
  complexExpr - statements with compound expressions
  badToken    - includes an invalid symbol
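
To illustrate how the token rules in "The Simp language" section apply to inputs like the test cases above, here is a minimal sketch in C that recognizes one token at a time. It is not the code in tokens.c or the interface in tokens.h: the names SketchKind, SketchToken, and sketch_next_token are hypothetical, chosen only for this example. The sketch covers only the token classes listed in prose (constants, variables, integers, single-character symbols, and %%); it does not handle the FixNum form from the grammar, and it assumes the default "C" locale for character classification.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds for this sketch (not taken from tokens.h). */
typedef enum {
    SK_CONSTANT,  /* uppercase letter followed by zero or more digits */
    SK_VARIABLE,  /* one or more lowercase letters */
    SK_INTEGER,   /* one or more digits */
    SK_SYMBOL,    /* one of + - * / = */
    SK_MODULO,    /* the two-character operator %% */
    SK_INVALID,   /* any other single character */
    SK_END        /* end of input */
} SketchKind;

typedef struct {
    SketchKind kind;
    const char *start;  /* first character of the token in the input */
    size_t length;      /* number of characters in the token */
} SketchToken;

/* Skip leading whitespace, scan one token starting at *cursor,
 * and advance *cursor just past that token. */
SketchToken sketch_next_token(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);

    const char *p = *cursor;
    while (isspace((unsigned char)*p)) {
        p++;
    }

    SketchToken tok = { SK_END, p, 0 };
    if (*p == '\0') {
        *cursor = p;
        return tok;
    }

    const char *q = p + 1;  /* one past the last character consumed so far */
    if (isupper((unsigned char)*p)) {
        tok.kind = SK_CONSTANT;
        while (isdigit((unsigned char)*q)) q++;
    } else if (islower((unsigned char)*p)) {
        tok.kind = SK_VARIABLE;
        while (islower((unsigned char)*q)) q++;
    } else if (isdigit((unsigned char)*p)) {
        tok.kind = SK_INTEGER;
        while (isdigit((unsigned char)*q)) q++;
    } else if (*p == '%' && *q == '%') {
        tok.kind = SK_MODULO;
        q++;
    } else if (strchr("+-*/=", *p) != NULL) {
        tok.kind = SK_SYMBOL;
    } else {
        tok.kind = SK_INVALID;  /* includes a lone '%' */
    }

    tok.length = (size_t)(q - p);
    *cursor = q;
    return tok;
}
```

Calling sketch_next_token repeatedly on the input "x = A1 %% 42" yields a variable, a symbol, a constant, the modulo operator, an integer, and then the end marker. A character outside the listed classes, such as '#', comes back as SK_INVALID with length 1, so a caller can report it and continue or stop as it prefers.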