Semantic Analysis
From Wiki**3
- Abstract syntax tree; nodes
- Declarations and types
Representing Types
- Symbols and types
- Symbol table
- Types in variable and function declarations
Any program in any language must manipulate objects to perform its various tasks. These objects are characterized by their structure, which is mapped to the programmer level as data types.
Data types may be explicit, as in the C/C++ family of languages, or they may be inferred from the primitive objects and the operations they are subject to (e.g. CAML).
The Symbol Table
A interface pública da tabela de símbolos é a seguinte (foram omitidas todas as partes não públicas, assim como os métodos de construção/destruição):
<cpp> namespace cdk {
 template<typename Symbol>
 class symbol_table {
 public:
   void push();
void pop();
bool insert(const std::string &name, std::shared_ptr<Symbol> symbol);
bool replace_local(const std::string &name, std::shared_ptr<Symbol> symbol);
bool replace(const std::string &name, std::shared_ptr<Symbol> symbol);
std::shared_ptr<Symbol> find_local(const std::string &name);
std::shared_ptr<Symbol> find(const std::string &name, size_t from = 0) const;
}; </cpp>
- push - create a new context and make it current.
- pop - destroy the current context: the previous context becomes the current one. If the first context is reached no operation is performed.
- insert - define a new identifier in the local (current) context: name is the symbol's name; symbol is the symbol. Returns true if this is a new identifier (may shadow another defined in an upper context). Returns false if the identifier already exists in the current context.
- replace_local - replace the data corresponding to a symbol in the current context: name is the symbol's name; symbol is the symbol. Returns true if the symbol exists; false if the symbol does not exist in any of the contexts.
- replace - replace the data corresponding to a symbol (look for the symbol in all available contexts, starting with the innermost one): name is the symbol's name; symbol is the symbol. Returns true if the symbol exists; false if the symbol does not exist in any of the contexts.
- find_local - search for a symbol in the local (current) context: name is the symbol's name; symbol is the symbol. Returns true if the symbol exists; false if the symbol does not exist in the current context.
- find - search for a symbol in the avaible contexts, starting with the first one and proceeding until reaching the outermost context. name is the symbol's name; from how many contexts up from the current one (zero). Returns nullptr if the symbol cannot be found in any of the contexts; or, the symbol and corresponding attributes.
Type Checking: Using Visitors
Type checking is the process of verifying whether the types used in the various language constructs are appropriate. It can be performed at compile time (static type checking) or at run time.
The type checking discussed here is the static approach, i.e., checking whether the types used for objects and the operations that manipulate them at compile time are consistent.
In the approach followed by CDK-based compilers, code generation is carried out by visitors that are responsible for traversing the abstract syntax tree and generate, evaluating each node. Node evaluation may depend on the specificities of the data types being manipulated, the simplest of which is the data type's size, important in all memory-related operations.
Examples
Type checking example: the Tiny language
The following example considers a simple grammar and performs the whole of the semantic analysis process and, finally, generates the corresponding C code. The semantic analysis process must account for variables (they must be declared before they can be used) and for their types (all types must be used correctly).
Type checking example: the Simple language
The following example considers an evolution of Compact, called Simple. Where Compact forces some verification via syntactic analysis (thus, presenting low flexibility), Simple has a richer grammar and, consequently, admits constructions that may not be correct in what concerns types of operators, functions, and their arguments. Type checking in this case is built-in, since, without it, it would be impossible to guarantee the correctness of any expression.
Exercises
- Exercise 01 - The "Let" language