*********************************************************************************
Basic Types
*********************************************************************************


Scalar Types
================================

Integer Types
----------------------------------------

There are signed and unsigned integers for various bitsizes

``Byte``
    8 bit unsigned integer

``Int32, UInt32``
    32 bit signed and unsigned integer

``Char``
    32 bit unicode code point

``Int64, UInt64``
    64 bit signed and unsigned integer

``Int, UInt``
    architecture dependent signed and unsigned integer


Semantics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The semantics of builtin unsigned and signed integers is defined via an
embedding into ℕ or ℤ. This embedding is defined by an embedding function and a
proof that it is an embedding (i.e. it is injective).

In the following we show the necessary definitions for ``UInt32``.

.. code-block::

    UInt32.toNatural:   UInt32 → ℕ
    UInt32.fromNatural: ℕ → UInt32        -- modulo 2^32

    UInt32.embedded: ∀ n: fromNatural (toNatural n) = n
    UInt32.embedded: ∀ n m: toNatural n = toNatural m → n = m

    UInt32.(≤) (n m: UInt32): Prop :=
        toNatural n ≤ toNatural m

    UInt32.(≤?) (n m: UInt32): Bool

    Unit32.bitSize: ℕ      -- bitsize is 'n + 1', cannot be zero

    UInt32.(+) (n m: UInt32): UInt32 :=
        fromNatural (toNatural n + toNatural m)

    UInt32.(-) (n m: UInt32): UInt32 :=
        fromNatural (toNatural n + 2^(succ bitsize)- toNatural m)


Compile to Javascript
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For the node platform and the browser, scalar values up to the bitsize of 32 can
be represented as javascript numbers. 64 bit scalars have no direct
representation in javascript. We have to generate an object with two 32 bit
sized numbers.

This workaround is necessary although javascript numbers are 64 bit floating
point values. However it is not possible to do 64 bit integer arithmetic in
javascript on 64 bit floating point values.

With the ``x|0`` annotation we can force javascript to do signed 32 bit integer
arithmetics on javascript numbers. The expression ``x >> 0`` converts 32 bit
integer as well. ``x >>> 0`` converts to an unsigned 32 bit integer (i.e. ``-1
>>> 0`` is converted to ``0xff_ff_ff_ff``).

Signed and unsigned integer arithmetic is the same. Only the javascript
comparison operators ``<=``, ``<``, ... give wrong results. Before doing the
comparisons, it is necessary to add the lowest negative number
``0x8000_0000`` which is :math:`-2^{31}`. This shifts the number zero to the
lowest negative number, i.e. all other numbers are greater or equal to this
number.


Compile to Machine Code
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


If compiling to machine code (e.g. via LLVM or Rust) the situation is different.

Scalar types can be allocated on the stack. This is possible to bitsizes up to
128 (or maybe in LLVM even more).

The code is fastest if all scalar objects are allocated on the stack and scalar
objects within other objects are completely within the surrounding object. I.e.
there are no pointers to scalar objects (they are *unboxed*). This creates two
possible problems:

Garbage collection:
    Pointer occupy a machine word and the machine number occupies a machine word
    as well. The runtime cannot distinguish between a machine number and a
    pointer into the heap.

    Ocaml resolved this problem by making the machine numbers of size
    :math:`2^{31}` or :math:`2^{63}` and representing the number :math:`i` by
    the number :math:`2i + 1`. Therefore in machine numbers the least
    significant bit has always the value 1. Since heap locations are always word
    aligned the corresponding pointers have a least significant bit of 0. The
    garbage collector can recognize pointer into the heap by looking at the
    least significant digit.

Polymorphic Functions:
    Generic functions on objects pointing into the heap need only one machine
    code representation for all its possible types.

The most efficient and closest to compilable mainstream languages would be to
represent all scalar types which can fit into a machine word by the
corresponding machine word and represent scalar types which cannot fit into a
machine word (e.g. ``Int64`` on 32 bit machines) by pointer to a boxed value on
the heap.

Polymorphic arrays are then always sequences of machine words. Character arrays
on 64 bit machines need 64 bits per character (however strings remain packed).

The garbage collector needs type information. It cannot know by just looking at
a word if it represents a scalar value or a pointer into the heap. It has to
know the layout of each stack frame and the layout of all objects on the heap.

.. note::
    More detailed analysis needed!


Floating Point
----------------------------------------