Subtopic Notes
16.2 Translation Software
16. System Software
Interpreter
- Reads line by line and checks for error
- Executes the program line by line and stops when error is found
- Can execute program without creating a translated version
- Analysis and code generation run for each code line
- Each line is executed as soon as the intermediate code is generated
Stages of Compilation
- Lexical analysis
- Converts source code into tokens (smallest meaningful units).
- Comments removed
- The symbol table stores identifiers and their attributes
- The keyword table stores reserved words and operators
- Syntax analysis/Parsing
- Checks tokens against the grammar of the language
- Generates a parse tree (or abstract syntax tree)
- Reports syntax errors if rules are violated
- Code generation
- Translates the parse tree/intermediate representation into object code or machine code
- Produces intermediate code (e.g., reverse polish notation) before final output
- Optimization
- Improves the generated code for efficiency
- Aims to reduce execution time and/or memory usage without changing program behavior
Expressing Grammar of language
Syntax Diagram
- Visual way to represent the grammar of a programming language
- Flow: Start → Sequence of symbols/constructs → End.
- Helps understand valid sequences of statements or expressions
- The two methods of representing syntax (Diagram and BNF) directly maps onto one another
Backus-Naur Form (BNF)
- Context-free grammar commonly used by developers of programming languages to specify the syntax rules of a language.
- Used to represent the rules of a grammar.
- Usefulness of BNF
- BNF is a general approach which can be used to describe any assembly of data
- BNF can be used as the basis for an algorithm.
- Uses a range of symbols and expressions to create production rules
- It consists of:
- A set of terminal symbols
- A set of non-terminal symbols (<> used to represent)
- A set of production rules
- Symbols
- <> Non terminal symbols
- | Denotes alternative
::=Equal to
Format:
LHS ::= RHS
Examples
<digit> ::= 0|1|2|3|4|5|6|7|8|9
<fullname> ::= <title><name><name>
<title> ::= Mr|Mrs|Ms|Miss|Dr
<addition> ::= <number>+<number>
<number> ::= <sign><integer>|<integer>
<integer> ::= <digit>|<digit><integer>
<sign> ::= +|-
Example of Recursion
<number> ::= <digit>|<digit><number>
<digit> ::= 0|1|2|3|4|5|6|7|8|9
Reverse Polish Notation (RPN)
- A method of representing an arithmetic or logical expression without brackets or special punctuation.
- Uses postfix notation, where an operator is placed after the variables it acts on. For example, A + B would be written as A B +
- Compilers use RPN because any expression can be processed from left to right without backtracking.
- Advantages of RPN
- Do not need brackets, and there is no need for the precedence of operators
- Simpler for a machine to evaluate
- No need for backtracking in evaluation as the operators appear in the order required for computation and can be evaluated from left to right
- Reading RPN
- RPN expression is read left to right.
- When you find a number or operands: Push onto the stack.
- When you find an operator: Pop the last two values, apply the operator, and push the result back
- Continue until the end of the expression
- The final value in the stack is the result
- Infix to Reverse Polish Notation
- Operands (numbers/variables) → write directly to the output.
- Operators → push onto a stack.
- Pop operators from the stack to output if they have higher or equal precedence than the current operator.
- Parentheses
- ‘(’ → push onto the stack.
- ‘)’ → pop operators from the stack to output until '(' is found, then discard ‘(’.
- At the end, pop any remaining operators from the stack to the output
- Examples
- ( a - b ) * ( a + c ) / 7 → a b - a c + * 7 /
- ( a / b ) * 4 - ( a + b ) → a b / 4 * a b + -
