T81 Data Types Specification — Version 1.2 (Normative)

Status: Stable
Last Revised: 2026-03-01
Applies To: T81Lang, TISC, T81VM, Axion, Cognitive Tiers
Supersedes: v1.1
Purpose: Define deterministic, canonical, base-81 type semantics for the T81 ecosystem.

Freeze Exception — 2026-03-01
Scope: Additive corrections only — no existing type semantics changed.
Authorized by: @t81dev
Rationale: Prior versions documented only 4–5 of the 34 implemented type kinds. §11 adds the Extended Type Inventory; §9 is updated to remove types that are already implemented. No DCP surface content was changed.

Binary Host Execution Boundary

T81 is a ternary semantic architecture executed on binary hardware. This is an intentional design choice. The platform guarantees exact ternary correctness while leveraging 2-bit packed trits and SWAR (SIMD Within A Register) vectorization to map naturally and efficiently onto binary host CPUs.
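
As a rough illustration of the packing idea, the sketch below negates several packed trits with a handful of word-wide bit operations. The bit assignment (00 = 0, 01 = +1, 10 = −1), the 64-bit word size, and all helper names are assumptions made for the example; this section does not fix a host encoding.

```python
# Assumed 2-bit codes (hypothetical; the host encoding is not fixed here).
ENCODE = {0: 0b00, 1: 0b01, -1: 0b10}
DECODE = {code: trit for trit, code in ENCODE.items()}

LOW_BITS = 0x5555555555555555  # low bit of every 2-bit lane in a 64-bit word


def pack(trits):
    """Pack up to 32 trits into one integer, first trit in the lowest lane."""
    word = 0
    for lane, trit in enumerate(trits):
        word |= ENCODE[trit] << (2 * lane)
    return word


def unpack(word, count):
    """Unpack `count` trits from a packed word."""
    return [DECODE[(word >> (2 * lane)) & 0b11] for lane in range(count)]


def swar_negate(word):
    """Negate every trit at once by swapping the two bits of each lane."""
    low = word & LOW_BITS
    high = (word >> 1) & LOW_BITS
    return (low << 1) | high


trits = [1, 0, -1, -1, 1]
negated = unpack(swar_negate(pack(trits)), len(trits))
```

Under the assumed codes, swapping the two bits of a lane maps +1 to −1 and back while leaving 0 unchanged, so no per-trit branching is needed.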

0. Scope

This document defines:

primitive numeric types
composite structural types
canonical representation rules
deterministic arithmetic semantics
Axion visibility invariants
VM and ISA interoperability rules

It is normative for all runtime, compilation, and cognitive layers.

1. Design Goals (Normative)

All T81 data types MUST satisfy:

Deterministic Semantics
- no nondeterministic outcomes
- all arithmetic, comparisons, and structural operations MUST produce identical results across implementations
Canonical Representation
- each representable value MUST have exactly one canonical encoding
- no alternative or redundant forms permitted
Base-81 Numeric Foundation
- all numeric types MUST use balanced-ternary or base-81 semantics
- internal binary shortcuts are permitted but MUST NOT affect observable behavior
Zero Undefined Behavior
- every operation MUST define behavior for all inputs
- errors MUST resolve to deterministic fault states or Result[T, E] representations
Axion Visibility
- all canonical forms MUST be introspectable by the Axion kernel
- all normalization steps MUST emit metadata hooks

2. Primitive Numeric Types

Primitive types form the base of all T81 computation.

2.1 Trit

Definition

A trit is the fundamental balanced-ternary digit:

−1 → T̄  
 0 → T0  
+1 → T1

Representation

Canonical 2-bit balanced encoding. The bit layout is an implementation detail: implementations MAY use a different internal layout, provided observable behavior is unaffected.

Operations

unary negation
comparison
trit-wise logic (AND, OR, XOR, XNOR)

All MUST be deterministic.
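
A minimal sketch of the scalar operations, assuming the common min/max convention for trit-wise AND and OR. That convention and the names below are assumptions for illustration; XOR and XNOR are omitted because their ternary definitions vary between conventions and are not fixed by this section.

```python
from enum import IntEnum


class Trit(IntEnum):
    NEG = -1
    ZERO = 0
    POS = 1


def trit_neg(a):
    """Unary negation: swaps NEG and POS, leaves ZERO unchanged."""
    return Trit(-a)


def trit_cmp(a, b):
    """Three-way comparison, itself expressed as a trit."""
    return Trit((a > b) - (a < b))


def trit_and(a, b):
    """AND as the minimum of the two trits (assumed convention)."""
    return Trit(min(a, b))


def trit_or(a, b):
    """OR as the maximum of the two trits (assumed convention)."""
    return Trit(max(a, b))
```

Each function is a pure table lookup in effect, so the same inputs always yield the same output.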

2.2 T81BigInt

Definition

An arbitrary-precision base-81 integer with the following constraints:

digits are base-81 symbols: 0–80
sign is stored separately and canonically
no leading zero digits unless value is exactly zero

Canonical Form

A value MUST satisfy:

- Zero encoded as: [+] [0]
- Nonzero MUST NOT contain leading zeros
- Negative: sign bit set, magnitude canonical
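
The rules above can be sketched as a conversion plus a validity check. The in-memory shape used here (a "+" or "-" string and a most-significant-first digit list) is a hypothetical stand-in, not the normative layout.

```python
def to_canonical(value):
    """Return (sign, digits) for a Python int, digits in base 81, MSB first."""
    sign = "-" if value < 0 else "+"
    magnitude = abs(value)
    digits = []
    while magnitude:
        magnitude, digit = divmod(magnitude, 81)
        digits.append(digit)
    digits.reverse()
    # Zero has no digits after the loop, so it becomes ("+", [0]).
    return sign, digits or [0]


def is_canonical(sign, digits):
    """Check the canonical-form rules for a (sign, digits) pair."""
    if sign not in ("+", "-") or not digits:
        return False
    if any(not 0 <= d <= 80 for d in digits):
        return False
    if digits == [0]:
        return sign == "+"      # zero is always [+] [0]
    return digits[0] != 0       # nonzero values carry no leading zeros
```

For example, `to_canonical(81)` gives `("+", [1, 0])`, while `("-", [0])` and `("+", [0, 5])` are both rejected by `is_canonical`.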

Arithmetic

All operations MUST be:

deterministic
exact
overflow-free
producing canonical normalized output

Operations:

add
sub
mul
div (trunc / floor modes)
mod
pow
gcd
compare
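
The two division modes differ only when the operands have opposite signs. The sketch below uses Python integers as a stand-in for T81BigInt and assumes a nonzero divisor; the zero case belongs to the error model rather than to these helpers.

```python
def div_floor(a, b):
    """Quotient rounded toward negative infinity (b != 0 assumed)."""
    return a // b


def div_trunc(a, b):
    """Quotient rounded toward zero (b != 0 assumed)."""
    quotient = abs(a) // abs(b)
    return -quotient if (a < 0) != (b < 0) else quotient
```

With these definitions, dividing −7 by 2 yields −4 in floor mode and −3 in trunc mode, while 7 divided by 2 yields 3 in both.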

Implementation

VM MAY implement long arithmetic, Karatsuba, FFT-based, or hardware-accelerated multiplication as long as results remain identical. Implementations MAY also spill large digit arrays to deterministic backing storage (e.g., mmap’d scratch files with fixed naming and allocation rules) when operands exceed in-memory thresholds. Such spill logic, as pioneered in the legacy hvm-trit-util.cweb, MUST remain transparent to observable behavior (no timing-dependent faults, no nondeterministic resource selection).
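
One way to read the "results remain identical" requirement is that any multiplication strategy can be checked against a simple reference. The sketch below is a schoolbook multiply over little-endian base-81 digit lists; the digit order and function names are assumptions for illustration, and Python's built-in integers serve as the reference.

```python
BASE = 81


def to_digits(n):
    """Little-endian base-81 digits of a non-negative int."""
    digits = []
    while n:
        n, d = divmod(n, BASE)
        digits.append(d)
    return digits or [0]


def mul_schoolbook(a, b):
    """Multiply two little-endian digit lists, returning a normalized list."""
    out = [0] * (len(a) + len(b))
    for i, da in enumerate(a):
        carry = 0
        for j, db in enumerate(b):
            total = out[i + j] + da * db + carry
            carry, out[i + j] = divmod(total, BASE)
        out[i + len(b)] += carry
    # Drop high-order zero digits so the result has a single encoding.
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out
```

A faster algorithm would be acceptable under this section only if it produced exactly the digit list that this reference produces for every input.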

2.3 T81Float

Definition

A floating-point format with deterministic canonical storage but host-dependent arithmetic for division and transcendental operations.

base-81 mantissa
base-81 exponent
balanced rounding rules

Requirements (Normative)

Storage Determinism: The canonical representation (mantissa/exponent/sign) MUST be identical across platforms.
Arithmetic Dependency:
- Add, Sub, Mul: MUST be deterministic (software implementation).
- Div, Transcendentals (sin, cos, log, etc.): MAY rely on host double precision. Strict bit-exact determinism is NOT guaranteed for these operations in the current version.
No NaN, no infinities: Invalid states MUST map to a deterministic error code.
Round-trip encoding: encode(decode(x)) = x MUST hold for the canonical form.

Components

mantissa: T81BigInt
exponent: T81BigInt
sign: 1 trit

Note

T81Float operations are never silently lossy; any precision loss MUST be made explicit via a Result[T, E].

2.4 T81Fraction

Definition

A rational number represented as:

numerator:   T81BigInt  
denominator: T81BigInt (non-zero)

Canonicalization Rules

Fraction MUST always be in lowest terms.
Denominator MUST always be positive (+ sign).
Zero MUST be encoded as 0/1.
GCD MUST be computed deterministically.
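
A minimal sketch of these four rules, using Python integers in place of T81BigInt. The tagged-tuple result and the error name `ZeroDenominator` are hypothetical stand-ins for a Result[T, E] value.

```python
from math import gcd


def canonical_fraction(num, den):
    """Reduce num/den to canonical form, or report a zero denominator."""
    if den == 0:
        return ("Err", "ZeroDenominator")   # hypothetical error name
    if den < 0:
        num, den = -num, -den               # denominator is always positive
    divisor = gcd(num, den)                 # gcd(0, den) == den, so zero becomes 0/1
    return ("Ok", (num // divisor, den // divisor))
```

For example, 6/−4 canonicalizes to −3/2 and 0/5 canonicalizes to 0/1.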

Arithmetic

Exact and deterministic:

add
sub
mul
div
invert

No floating approximations allowed.

2.5 T81Prob

Definition

A native log-odds probability representation using:

log-odds: T81Int<N> (an internal C++ fixed-width template type, not the user-facing T81Lang T81BigInt; typically 27 trits wide)
stored in fixed-point base-φ (golden ratio) or natural log scale

Note: T81Int<N> is a C++ implementation detail in include/t81/types.hpp used to express fixed-width ternary integers at compile time. The user-facing language type for arbitrary-precision integers is T81BigInt. See §11 for the complete type inventory.

Canonicalization Rules

Value is stored as log(p / (1-p)) scaled to fixed-point integer.
T81Int representation MUST be canonical (no leading zeros).
Special values:
- 0 (zero) represents p=0.5 (log-odds 0).
- kMinValue represents p=0 (minus infinity log-odds).
- kMaxValue represents p=1 (plus infinity log-odds).

Arithmetic

Operations are performed in log-space:

+ (addition): component-wise addition of log-odds (Bayesian update).
softmax: implemented as log_softmax via deterministic ternary addition.
cmp: standard integer comparison on log-odds values.

All operations MUST be deterministic and overflow-checked.
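
The log-odds encoding and the additive update can be sketched as follows. The fixed-point scale, the saturation bounds derived from a 27-trit width, and the error name are assumptions for illustration. The encoder uses host floating-point `math.log` purely for readability, so it is not itself a bit-exact implementation.

```python
import math

TRITS = 27
K_MAX = (3**TRITS - 1) // 2   # largest balanced-ternary value in 27 trits
K_MIN = -K_MAX
SCALE = 3**9                  # hypothetical fixed-point scale factor


def encode(p):
    """Map a probability to scaled log-odds, saturating at p=0 and p=1."""
    if p <= 0.0:
        return K_MIN
    if p >= 1.0:
        return K_MAX
    return round(math.log(p / (1.0 - p)) * SCALE)


def bayes_update(prior, evidence):
    """Add two log-odds values, reporting overflow instead of wrapping."""
    total = prior + evidence
    if not K_MIN < total < K_MAX:
        return ("Err", "Overflow")   # hypothetical error name
    return ("Ok", total)
```

Because the update is plain integer addition, it is exact; only the conversion from a real-valued probability involves rounding.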

3. Composite Types

3.1 Arrays

Properties

Fixed size or dynamically sized
Deterministic iteration order
Memory representation MUST follow:

[header][length][canonical elements...]

Allowed element types

Any T81 type.

3.2 Vectors, Matrices, Tensors

Canonical rules

Shape MUST be immutable once constructed
Dimensions MUST be ≥ 1
All values MUST be normalized
Out-of-bounds access MUST raise a deterministic fault

Numeric Classification Note

Tensor implementations may distinguish between:

semantically exact tensors (ExactTrit, ExactInt)
non-exact float-domain tensors (HostFloat)

This classification is a semantic/result-class boundary, not a complete description of the arithmetic path used to compute the tensor.

In particular, a HostFloat tensor MAY still be produced through deterministic software-defined math in strict or deterministic execution modes. Current tensor contracts therefore treat:

exactness / strict-core promotion
canonical fixed storage availability
deterministic arithmetic provenance

as related but separate concerns.

Operations

reshape (dimensionally consistent only)
transpose
tensor contraction
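
The shape rules and the "dimensionally consistent" reshape condition can be sketched as below. Row-major indexing, the reading of "Dimensions MUST be ≥ 1" as a per-dimension size bound, and the error names are assumptions made for the example.

```python
from math import prod


def is_valid_shape(shape):
    """A shape is a non-empty sequence of integer sizes, each at least 1."""
    return len(shape) >= 1 and all(isinstance(d, int) and d >= 1 for d in shape)


def reshape(shape, new_shape):
    """Return the new shape if it holds the same number of elements."""
    if not (is_valid_shape(shape) and is_valid_shape(new_shape)):
        return ("Err", "MalformedShape")
    if prod(shape) != prod(new_shape):
        return ("Err", "MalformedShape")
    # A fresh tuple is returned; the original shape is never mutated.
    return ("Ok", tuple(new_shape))


def flat_index(shape, index):
    """Row-major offset of a multi-index, or a deterministic fault."""
    if len(index) != len(shape):
        return ("Err", "OutOfBounds")
    offset = 0
    for dim, i in zip(shape, index):
        if not 0 <= i < dim:
            return ("Err", "OutOfBounds")
        offset = offset * dim + i
    return ("Ok", offset)
```

Returning a new shape tuple rather than editing the old one keeps the "immutable once constructed" rule intact.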

3.2.1 Vector / Sequence Types

A Vector is a rank-1 Tensor. The canonical serialization of a Vector is identical to that of a rank-1 Tensor.

elementwise ops
norm
dot products

All MUST produce deterministic and canonical results.

3.3 Graphs

Structure

A deterministic graph consists of:

nodes: array of canonical nodes  
edges: array of (nodeA, nodeB, metadata)

Requirements

node ordering MUST be preserved
edge ordering MUST be preserved
adjacency queries MUST be deterministic
metadata MUST be Axion-visible
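
A small sketch of an order-preserving graph, assuming directed edges and tuple-based storage. The class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Graph:
    nodes: tuple   # canonical nodes, in construction order
    edges: tuple   # (node_a, node_b, metadata) triples, in construction order

    def successors(self, node):
        """Targets of edges leaving `node`, in stored edge order."""
        return [b for a, b, _meta in self.edges if a == node]


g = Graph(
    nodes=("a", "b", "c"),
    edges=(("a", "c", {"w": 1}), ("a", "b", {"w": 2}), ("b", "c", {"w": 3})),
)
```

Here `g.successors("a")` returns `["c", "b"]`: the query follows stored edge order rather than sorting, so repeated queries over the same graph always agree.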

4. Structural Types

4.1 Records (Structs)

Requirements

fields MUST have a fixed global ordering
no implicit reordering
field names MUST be unique
all fields MUST be canonical and Axion-visible

4.2 Enums

Requirements

variants MUST be globally distinct
variant ordering MUST be preserved
payloads MUST be canonicalized

4.3 Optional and Result Types

Option[T]

MUST be either Some(value) or None
MUST NOT allow null references

Result[T, E]

deterministic error propagation
MUST NOT support exceptions
MUST encode domain errors as canonical E
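
Exception-free error propagation might look like the sketch below. The `Ok` and `Err` classes, the `and_then` helper, and the error name are illustrative assumptions, not the T81Lang surface syntax.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Ok:
    value: Any


@dataclass(frozen=True)
class Err:
    error: str


def checked_div(a, b):
    """Integer division that encodes the domain error as a value."""
    if b == 0:
        return Err("DivisionByZero")   # hypothetical error name
    return Ok(a // b)


def and_then(result, fn):
    """Apply fn to an Ok value; pass an Err through unchanged."""
    return fn(result.value) if isinstance(result, Ok) else result


chained = and_then(checked_div(10, 0), lambda q: checked_div(q, 2))
```

The first failure short-circuits the chain, so `chained` is `Err("DivisionByZero")` and the second division never runs.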

5. Canonicalization Rules (Critical Normative Section)

This is the most important part of the entire spec.

Canonicalization MUST occur after:

creation
arithmetic operations
parsing
loading from VM memory
serialization
Axion inspection

5.1 Invariants for All Types

No redundant forms
- fractions reduce
- integers strip leading zeros
- floats normalize mantissa/exponent
Deterministic ordering
- arrays, structs, enums all follow strict order rules
Deterministic hashing
- MUST depend only on canonical form
Axion visibility
- Axion MUST be able to inspect normalized representation
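
To illustrate why hashing must depend only on the canonical form, the sketch below hashes a fraction after reduction. SHA-256 and the `frac:n/d` text payload are stand-ins chosen for the example; they are not the hash function or serialization defined by the T81 ecosystem. A nonzero denominator is assumed.

```python
import hashlib
from math import gcd


def canonical(num, den):
    """Lowest terms with a positive denominator (den != 0 assumed)."""
    if den < 0:
        num, den = -num, -den
    divisor = gcd(num, den)
    return num // divisor, den // divisor


def canonical_hash(num, den):
    """Hash the canonical form, never the form the caller supplied."""
    n, d = canonical(num, den)
    payload = f"frac:{n}/{d}".encode("ascii")   # hypothetical serialization
    return hashlib.sha256(payload).hexdigest()
```

Equal values written differently, such as 2/4, 1/2, and −3/−6, all produce the same digest.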

Conformance programs: spec/conformance/t81-data-types/canonical-encoding.t81 · canonical-ordering.t81

6. Interoperability Rules

6.1 With TISC (Instruction Set)

TISC immediates MUST encode canonical forms
registers MUST contain normalized values only
decoding MUST fail deterministically for non-canonical inputs

6.2 With T81VM

GC MUST preserve canonical representations
runtime MUST reject malformed structural types
VM serialization MUST preserve canonical form exactly

6.3 With T81Lang

static type checking MUST enforce canonical invariants
semantic analyzer MUST normalize literals
IR MUST mark all values with canonical metadata

6.4 With Axion

Axion MUST receive metadata on:
- normalization
- overflow attempts
- non-canonical construction
- drift in recursive structures

6.5 With Cognitive Tiers

Higher tiers depend on deterministic and canonical types for:

symbolic graphs
tensor recursion
cognitive state transitions

7. Error Model (Normative)

All errors MUST be represented via:

Result[T, E]

Errors include:

division by zero
non-canonical input
malformed tensor shape
overflow in intermediate BigInt
invalid Fractions (0 denominator)
invalid enumeration variants
recursion depth failures

No exceptions or traps allowed.

8. Serialization

All serialized forms MUST:

encode canonical representations only
be round-trip stable
be fully deterministic
be Axion-inspectable
never encode invalid states

Binary and textual variants are allowed; their semantics are identical.

9. Future Extensions (Non-Normative)

The following types are candidates for future normative specification. Types that were previously listed here but are now implemented are documented in §11.

Future type extensions MAY include:

holotensor types for high-tier cognitive operations (multi-dimensional symbolic arrays beyond the current Tensor[T] rank constraints)
probabilistic bounded distributions (richer than T81Prob; full distribution types with sampling semantics)
canonical semantic graphs (immutable content-addressed graph structures for Tier 4+ reasoning)

All MUST follow determinism and canonicalization invariants.

10. Status

This document is v1.2 of the T81 Data Types Standard (freeze exception applied 2026-03-01). The §2 primitive types and §3–§8 normative rules are the frozen DCP surface. §11 (Extended Type Inventory) is additive and non-DCP unless the types listed there have individually been promoted to Verified status in the Implementation Matrix (docs/status/IMPLEMENTATION_MATRIX.md).

11. Extended Type Inventory (Freeze Exception — Additive)

This section was added in the 2026-03-01 freeze exception. It documents the full set of type kinds currently implemented in the T81 semantic analyzer (lang/frontend/semantic_analyzer.cpp). The four tiers mirror the structure in spec/t81lang-spec.md §2.

Conformance program: spec/conformance/t81-data-types/type-kind-completeness.t81

11.1 Ternary Core (Tier 1 — Fully Deterministic)

These types are DCP-verified. Their semantics are normatively defined in §2.

Type	Kind	Description
`T81BigInt`	`BigInt`	Arbitrary-precision base-81 integer. User-facing language type. See §2.2.
`T81Float`	`Float`	Base-81 floating-point. See §2.3.
`T81Fraction`	`Fraction`	Exact rational `p/q`. See §2.4.
`Symbol`	`Symbol`	Interned immutable identifier. Created with `:name` literal syntax in T81Lang. Stored in the VM’s symbol pool; canonical text comparison.

Symbol literal syntax: In T81Lang, Symbol values are created using the colon-prefix literal: :my_symbol. This produces a value of type Symbol that is interned in the VM symbol pool for the lifetime of the program.

11.2 Text and Binary (Tier 1)

Type	Kind	Description
`T81String`	`String`	UTF-8 text. Immutable, canonical. Compared by code-point sequence.
`T81Bytes`	`Bytes`	Raw byte array. Immutable, canonical. No encoding assumed.

11.3 Extended Numeric (Tier 2 — DCP-Target)

These types are implemented but not yet DCP-promoted. They follow the determinism and canonicalization rules of §1 and §5.

Type	Kind	Description
`T81Fixed`	`Fixed`	Fixed-point decimal with explicit scale.
`T81Complex`	`Complex`	Complex number with `T81Float` real and imaginary parts.
`T81Quaternion`	`Quaternion`	Quaternion `(a + bi + cj + dk)` over `T81Float`.
`T81Prob`	`Prob`	Log-odds probability. See §2.5.
`T81Qutrit`	`Qutrit`	Single balanced ternary digit `{-1, 0, +1}`.
`T81Uint`	`Uint`	Unsigned base-81 integer (non-negative, no sign trit).

11.4 Binary Interop (Tier 1 — Interop Surface)

Fixed-width binary types for FFI and host-interop. These do NOT have balanced-ternary semantics; overflow semantics follow standard two’s-complement binary rules and MUST be explicitly annotated when used in T81Lang.

Type	Kind	Description
`i32`	`I32`	Signed 32-bit integer.
`i16`	`I16`	Signed 16-bit integer.
`i8`	`I8`	Signed 8-bit integer.
`i2`	`I2`	Signed 2-bit integer (values: -1, 0, +1; maps to a single trit).
`bool`	`Bool`	Boolean. Canonical values: `true`, `false`.

11.5 Collection Types (Tier 2)

Generic collection types with deterministic ordering and canonical iteration.

Type	Kind	Description
`Vector[T]`	`Vector`	Resizable ordered sequence.
`T81Vector[T, N]`	`T81Vector`	Fixed-size rank-1 tensor with base-81 dimension `N`.
`Matrix[T]`	`Matrix`	Rank-2 tensor (rows × cols). Shape immutable after construction.
`Tensor[T]`	`Tensor`	Rank-N tensor. Tier constraint: rank ≤ 9 (Tier 5 max).
`List[T]`	`List`	Singly-linked or array-backed deterministic list.
`Map[K, V]`	`Map`	Sorted key-value mapping. Key ordering MUST be canonical.
`Set[T]`	`Set`	Sorted set. Membership test MUST be canonical.
`Tree[T]`	`Tree`	Rooted tree with canonical child ordering.
`Graph`	`Graph`	Directed graph. See §3.3 for structure rules.

Generic type syntax uses square brackets: Vector[T81BigInt], Map[Symbol, i32].

11.6 Structural Types (Tier 1)

Type	Kind	Description
`Option[T]`	`Option`	`Some(value)` or `None`. No nulls. See §4.3.
`Result[T, E]`	`Result`	`Ok(value)` or `Err(error)`. No exceptions. See §4.3.

11.7 Meta Type

Type	Kind	Description
`void`	`Void`	Unit return type. Functions returning no value use `void`.

11.8 Numeric Widening Order

When types are mixed in arithmetic, the VM applies implicit widening in this rank order (lowest to highest):

T81Qutrit < i2 < i8 < i16 < i32 < T81Uint < T81BigInt
    < T81Fraction < T81Fixed < T81Float

Widening is always explicit at the TISC level (conversion opcodes I2F, I2FRAC, etc.). T81Lang performs widening implicitly within the same rank group but requires explicit casts across group boundaries.
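
The rank order can be read as a lookup table: the result type of a mixed operation is whichever operand type ranks higher. The sketch below shows only that lookup; the function name is hypothetical, and the rank-group boundaries that govern explicit casts are not modeled.

```python
# Type names in widening order, lowest rank first.
WIDENING_ORDER = [
    "T81Qutrit", "i2", "i8", "i16", "i32",
    "T81Uint", "T81BigInt", "T81Fraction", "T81Fixed", "T81Float",
]
RANK = {name: position for position, name in enumerate(WIDENING_ORDER)}


def widened_type(left, right):
    """Return the higher-ranked of two numeric type names."""
    return left if RANK[left] >= RANK[right] else right
```

For example, `widened_type("i8", "T81BigInt")` is `"T81BigInt"`, and `widened_type("T81Float", "T81Fraction")` is `"T81Float"`.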

Conformance programs: spec/conformance/t81-data-types/widening-order.t81 · widening-upper-chain.t81 · widening-binary-interop.t81

11.9 Model Weight Formats (RFC-0026 / RFC-0034)

Model weight files are opaque binary artifacts loaded through the WLOAD TISC opcode (RFC-0026 §5.15.3) after Axion policy gate approval. Two subtypes are defined; they are distinguished by a 6-byte magic prefix at offset 0 in the file.

T81WFQ — Float-Quantized Weight Format

Magic: T81WFQ (bytes 54 38 31 57 46 51).

Used by RFC-0026 WLOAD and QMATMUL. Weights are stored as quantized integer values with an associated T81Float dequantization scale. Full format definition is normatively owned by RFC-0026.

T81WTN — Ternary-Native Weight Format

Magic: T81WTN (bytes 54 38 31 57 54 4E).

Status: accepted — normative definition in RFC-0034; this entry reflects the committed format.

Used by RFC-0034 TWMATMUL, TWEMBED, and TATTN. Weights are stored as packed 2-bit balanced-trit values. No dequantization scale is present; the scale_absent flag in the header MUST be true.

Trit encoding:

00 →   0
01 →  +1
10 →  −1
11 →  (reserved; WLOAD MUST raise CanonFault)

Weights are packed 4 trits per byte, row-major. The header includes:

Field	Type	Description
`magic`	6 bytes	`T81WTN` — distinguishes from `T81WFQ`
`rank`	u8	tensor rank
`dims[rank]`	u32[]	dimension sizes, row-major
`canon_hash`	16 bytes	CanonHash81 of packed trit payload
`scale_absent`	u8	MUST be `0x01` for ternary-native files
`axion_policy_slot`	u32	optional Axion policy handle; `0` = ambient policy
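
Decoding a packed payload could be sketched as below. Only the reserved status of the 11 code and the 4-trits-per-byte packing come from this section; the mapping of the other three codes (00 = 0, 01 = +1, 10 = −1) and the choice of the lowest bit pair as the first trit are assumptions for illustration.

```python
# Assumed code assignment; only 0b11 being reserved is taken from the text.
DECODE = {0b00: 0, 0b01: 1, 0b10: -1}


def unpack_trits(payload, count):
    """Decode `count` trits from packed bytes, faulting on the reserved code."""
    trits = []
    for i in range(count):
        byte = payload[i // 4]                    # 4 trits per byte
        code = (byte >> (2 * (i % 4))) & 0b11     # lowest pair first (assumed)
        if code == 0b11:
            return ("Err", "CanonFault")
        trits.append(DECODE[code])
    return ("Ok", trits)
```

Under these assumptions the byte `0x24` decodes to the trits 0, +1, −1, 0, and any byte containing a `11` pair within the decoded range yields a `CanonFault`.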

Loaders that do not recognize T81WTN MUST reject the file with a FormatFault. Passing a T81WTN handle to an opcode expecting T81WFQ (e.g., QMATMUL) MUST raise a TypeFault.
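
Distinguishing the two formats by their magic prefix might look like the following sketch. The byte values are the ones listed above; the function name and the tagged-tuple return shape are hypothetical.

```python
MAGIC_WFQ = bytes.fromhex("543831574651")   # ASCII "T81WFQ"
MAGIC_WTN = bytes.fromhex("54383157544E")   # ASCII "T81WTN"


def classify_weight_file(data):
    """Identify the weight format from the 6-byte prefix at offset 0."""
    magic = data[:6]
    if magic == MAGIC_WFQ:
        return ("Ok", "T81WFQ")
    if magic == MAGIC_WTN:
        return ("Ok", "T81WTN")
    # Unknown or truncated prefixes are rejected rather than guessed at.
    return ("Err", "FormatFault")
```

A loader built this way never falls through to a default format, which matches the requirement that unrecognized files be rejected.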

The Axion ternary-weight-domain-check policy directive (RFC-0034 §3.3) controls whether the full trit payload is verified for reserved 11 encodings at load time (default: true).

Cross-References

Data Types

Primitive Types → t81-data-types.md
Composite Types → t81-data-types.md
Normalization Rules → t81-data-types.md

TISC (Ternary Instruction Set)

Machine Model → tisc-spec.md
Instruction Encoding → tisc-spec.md
Opcode Classes → tisc-spec.md

T81 Virtual Machine

Execution Modes → t81vm-spec.md
Deterministic Concurrency → t81vm-spec.md
Axion Interface Hooks → t81vm-spec.md

T81Lang

Current spec version: v1.9.0 (updated 2026-03-01).

Core Grammar → t81lang-spec.md
Type System → t81lang-spec.md
Purity and Effects → t81lang-spec.md

Axion Governance Kernel

Responsibilities → axion-kernel.md
Subsystems → axion-kernel.md
Recursion Controls → axion-kernel.md

Cognitive Tiers

Tier Structure → cognitive-tiers.md
Constraints → cognitive-tiers.md
