Typed high-level intermediate representation (THIR)
The THIR is used as a transition phase between the HIR and the MIR. It progressively rewrites the IR into lower-level constructs until it can be readily lowered into MIR. Compared to HIR, the THIR is:
- Combined (there is only one
Model
which contains all of the items from all model files) - Fully typed (expressions always have full, correct types - we abort compilation if there were errors)
- Destructured - declaration items can only declare a single identifier
- Fully resolved (identifiers do not have their own names, they simply point to the item they are for)
- Desugared (since type information is now available, we can perform more desugarings)
- Expressions are boxed rather than arena allocated
The language constructs made available in THIR are an attempt to provide rewriting stages with a convenient representation which is not too high-level and complex to process, but not so low-level as to be cumbersome to use.
Structure
The root THIR node is the Model
which contains arenas for each kind of item. Since there is only one Model
, the
item indices now globally identify the items in the program. Each item contains storage for the expressions it owns in
its ExpressionAllocator
. Expressions contain their types, which are computed during construction. A builder API is
provided (ExpressionBuilder
) for creating expressions.
Lowering from HIR
Lowering involves adding the items from the HIR into the THIR model. This is done in topologically sorted order, so we only have to visit each item once. One exception is that functions with bodies are lowered in two stages: first, the function signature is added, then the body is added to it after all items have been processed. Otherwise identifiers in the function bodies could refer to items not yet added to the THIR.
Desugarings
The THIR involves several semantic desugarings from the HIR.
Destructuring variable declarations are rewritten using multiple declarations:
HIR syntax | Desugaring |
---|---|
|
|
Type aliases are removed as they are resolved:
HIR syntax | Desugaring |
---|---|
|
|
2D array literals are rewritten using array2d
:
MiniZinc syntax | Desugaring |
---|---|
|
|
|
|
Indexed array literals are rewritten using arrayNd
:
MiniZinc syntax | Desugaring |
---|---|
|
|
|
|
Array access is rewritten into a call to '[]'
:
HIR syntax | Desugaring |
---|---|
|
|
Slicing is rewritten using mzn_slice
function calls:
HIR syntax | Desugaring |
---|---|
|
|
|
|
Performing a tuple/record access on an array is rewritten to perform the field access on each element of the array using a comprehension:
HIR syntax | Desugaring |
---|---|
|
|
|
|
Case expressions are rewritten such that the destructuring is moved into the branch RHS, and pattern identifiers which
create new variables are replaced with the wildcard _
pattern:
HIR syntax | Desugaring |
---|---|
|
|
Pattern matching in comprehension generators is rewritten using a case expression in the generator's where
clause.
Destructuring is rewritten as assignment generators:
HIR syntax | Desugaring |
---|---|
|
|
Model transformations
At the THIR level, several model-to-model transformations occur which progressively remove language constructs which do not exist in the mid-level intermediate representation.
The order of these transforms is often important as certain information or language constructs may be removed in one transform, and so cannot be used in subsequent transforms.
The transforms which occur are:
- Output item removal
- Generating constraints for domains
- Top-down typing
- Type specialisation
- Removal of overloading
- Erasure of records
- Erasure of enums
- Desugaring of comprehensions
- Erasure of option types
- Desugaring of capturing
See Model transformations for more details.