Engine Design Principles
========================

LedgerLoom is a teaching project, but it is also designed to be *real*:
what you learn here should transfer to production accounting systems.

This page explains the **software engineering principles** behind the engine,
and how they reduce errors in practical accounting workflows.

Principle 1 — Separate compute from I/O
---------------------------------------

The engine lives under ``ledgerloom/engine`` and aims to be **pure compute**:

- It accepts in-memory objects (Entries) and returns in-memory tables (DataFrames).
- It does not write files.
- It does not rely on global state.

Chapters still need I/O (CSV/JSON outputs). LedgerLoom keeps that I/O outside the engine
*but* centralizes the **deterministic writing contract** in :mod:`ledgerloom.artifacts`
(stable column order, LF newlines, sorted JSON keys, sha256 manifests).
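
A minimal sketch of such a writing contract, using only the Python standard library,
might look like the following. The function names (``render_csv``, ``render_json``,
``build_manifest``), the column list, and the sample rows are illustrative assumptions,
not the actual API of :mod:`ledgerloom.artifacts`.

.. code-block:: python

   import csv
   import hashlib
   import io
   import json

   # Hypothetical column order; the real contract lives in ledgerloom.artifacts.
   COLUMNS = ["entry_id", "posting_id", "account", "debit_cents", "credit_cents"]


   def render_csv(rows, columns=COLUMNS):
       """Render rows as CSV text with a fixed column order and LF newlines."""
       buf = io.StringIO()
       writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
       writer.writeheader()
       for row in rows:
           # Re-key each row so output order never depends on dict order.
           writer.writerow({col: row[col] for col in columns})
       return buf.getvalue()


   def render_json(obj):
       """Render JSON with sorted keys and a trailing LF."""
       return json.dumps(obj, sort_keys=True, indent=2) + "\n"


   def build_manifest(artifacts):
       """Map each artifact name to the sha256 hex digest of its UTF-8 bytes."""
       return {
           name: hashlib.sha256(text.encode("utf-8")).hexdigest()
           for name, text in sorted(artifacts.items())
       }


   rows = [
       {"account": "Assets:Cash", "entry_id": "e1", "posting_id": "e1:0",
        "debit_cents": 5000, "credit_cents": 0},
       {"account": "Revenue:Sales", "entry_id": "e1", "posting_id": "e1:1",
        "debit_cents": 0, "credit_cents": 5000},
   ]
   artifacts = {
       "postings.csv": render_csv(rows),
       "summary.json": render_json({"n_postings": len(rows)}),
   }
   manifest = build_manifest(artifacts)

Because every rendering step is a function of its input alone, running the sketch twice
on the same rows yields byte-identical artifacts and therefore identical digests.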


Why it matters:

- **Refactor-friendly:** you can change chapter artifact formats without breaking ledger math.
- **Testable:** you can unit-test accounting invariants without touching the filesystem.
- **Reusable:** the same engine can power CLIs, notebooks, web APIs, or batch jobs.
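
The split between compute and I/O can be illustrated with a small sketch. The ``Entry``
and ``Line`` dataclasses, the ``to_postings`` and ``write_postings`` functions, and the
``entry_id:index`` posting ID format are hypothetical stand-ins, not the engine's real
types; the real engine returns DataFrames, while this sketch uses plain lists of dicts
to stay dependency-free.

.. code-block:: python

   import io
   from dataclasses import dataclass


   @dataclass(frozen=True)
   class Line:
       account: str
       debit_cents: int = 0
       credit_cents: int = 0


   @dataclass(frozen=True)
   class Entry:
       entry_id: str
       lines: tuple


   def to_postings(entries):
       """Pure compute: entries in, posting rows out. No files, no globals."""
       rows = []
       for entry in entries:
           for index, line in enumerate(entry.lines):
               rows.append({
                   "entry_id": entry.entry_id,
                   "posting_id": f"{entry.entry_id}:{index}",
                   "account": line.account,
                   "debit_cents": line.debit_cents,
                   "credit_cents": line.credit_cents,
               })
       return rows


   def write_postings(rows, stream):
       """I/O boundary: the only function that touches a file-like object."""
       for row in rows:
           fields = [row["posting_id"], row["account"],
                     str(row["debit_cents"]), str(row["credit_cents"])]
           stream.write(",".join(fields) + "\n")


   sale = Entry("e1", (Line("Assets:Cash", debit_cents=5000),
                       Line("Revenue:Sales", credit_cents=5000)))
   postings = to_postings([sale])

   # The ledger math can be checked without creating any file.
   total_debits = sum(p["debit_cents"] for p in postings)
   total_credits = sum(p["credit_cents"] for p in postings)

   # I/O is exercised separately, here against an in-memory stream.
   out = io.StringIO()
   write_postings(postings, out)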

Principle 2 — Determinism by default
------------------------------------

Accounting systems must be auditable. That implies **reproducibility**:

- the same input entries should produce the same postings
- ordering should be stable
- rounding rules should be explicit

LedgerLoom encodes that as:

- integer-cent arithmetic (no floating-point drift)
- stable identifiers (``entry_id`` and ``posting_id``)
- stable sorts (``kind="mergesort"``)
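
The first and last items can be demonstrated with the standard library alone. This is a
sketch under stated assumptions: ``to_cents`` and its half-up rounding rule are
illustrative choices, not LedgerLoom's documented behavior. With pandas, the equivalent
of the stable sort shown here is ``DataFrame.sort_values(..., kind="mergesort")``.

.. code-block:: python

   from decimal import ROUND_HALF_UP, Decimal


   def to_cents(amount):
       """Parse a decimal string into integer cents with an explicit rounding rule."""
       quantized = Decimal(amount).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
       return int(quantized * 100)


   # Binary floats drift: 0.1 + 0.2 is not exactly 0.3.
   float_sum_exact = (0.1 + 0.2) == 0.3

   # Integer cents do not drift: 10 + 20 == 30.
   cents_sum_exact = to_cents("0.10") + to_cents("0.20") == to_cents("0.30")

   postings = [
       {"posting_id": "e2:0", "account": "Assets:Cash", "cents": 700},
       {"posting_id": "e1:0", "account": "Assets:Cash", "cents": 500},
       {"posting_id": "e1:1", "account": "Revenue:Sales", "cents": -500},
   ]

   # sorted() is guaranteed stable: rows with equal keys keep their input order,
   # so the two Assets:Cash rows stay in the order e2:0, e1:0.
   by_account = sorted(postings, key=lambda p: p["account"])

A stable sort matters whenever the sort key has ties: without it, two runs over the same
input could emit tied rows in different orders and produce different artifacts.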

By default, the engine is **strict**: entries must provide an ``entry_id`` in
``entry.meta``. For quick demos or exploratory notebooks, you can opt into
``entry_id_policy="generated"``, which synthesizes a stable hash-based ID. When this
policy is enabled, the engine invariants include a ``generated_entry_ids`` list, so the
run remains auditable.
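
One way a stable hash-based ID could be derived is sketched below. Only the policy value
``"generated"`` and the idea of recording generated IDs come from the description above;
the function names, the ``gen-`` prefix, the 16-character truncation, and the canonical
JSON form are assumptions made for illustration.

.. code-block:: python

   import hashlib
   import json


   def generated_entry_id(entry):
       """Derive an ID from entry content: same content in, same ID out."""
       # Canonical form: sorted keys and fixed separators, so dict order is irrelevant.
       canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
       digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
       return "gen-" + digest[:16]


   def resolve_entry_id(entry, meta, policy="strict", generated_ids=None):
       """Return the explicit ID if present, else apply the chosen policy."""
       if "entry_id" in meta:
           return meta["entry_id"]
       if policy == "generated":
           new_id = generated_entry_id(entry)
           if generated_ids is not None:
               generated_ids.append(new_id)  # keep an audit trail of synthesized IDs
           return new_id
       raise ValueError("entry_id is required in entry.meta under the strict policy")


   entry = {"date": "2024-01-31",
            "lines": [["Assets:Cash", 5000], ["Revenue:Sales", -5000]]}
   same_entry_reordered = {"lines": [["Assets:Cash", 5000], ["Revenue:Sales", -5000]],
                           "date": "2024-01-31"}

   audit_trail = []
   first = resolve_entry_id(entry, {}, policy="generated", generated_ids=audit_trail)
   second = resolve_entry_id(same_entry_reordered, {}, policy="generated")

Note that a purely content-based hash gives two genuinely identical entries the same ID,
so a real implementation would need some way to disambiguate them.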

Principle 3 — Make invariants first-class
-----------------------------------------

In accounting, *constraints are the product*:

- an entry that doesn’t balance is not “mostly correct” — it is **invalid**
- a ledger that doesn’t sum correctly is **untrustworthy**

In LedgerLoom, invariants are computed explicitly (see the data model reference):

- ``entry_double_entry_ok``: within every entry, total debits equal total credits
- ``ledger_raw_delta_zero``: the sum of (debit - credit) over all postings in the ledger is zero
- ``posting_id_unique``: no ``posting_id`` appears twice, because traceability depends on stable IDs
- ``unknown_roots``: schema hygiene, i.e. consistency with the chart of accounts (COA)

This is a deliberate software design move:

- invariants become unit tests
- unit tests become regression protection
- regression protection makes refactors safe
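
The four invariants named above could be computed along the following lines. This is a
sketch, not the engine's implementation: the invariant names come from the list above,
but ``compute_invariants``, the row layout, the set of known roots, and the assumption
that an account's root is the text before the first ``:`` are all hypothetical.

.. code-block:: python

   from collections import Counter, defaultdict

   # Hypothetical chart-of-accounts roots; a real COA may define others.
   KNOWN_ROOTS = {"Assets", "Liabilities", "Equity", "Revenue", "Expenses"}


   def compute_invariants(postings, known_roots=KNOWN_ROOTS):
       """Return a dict of invariant results for a list of posting rows."""
       delta_by_entry = defaultdict(int)
       for p in postings:
           delta_by_entry[p["entry_id"]] += p["debit_cents"] - p["credit_cents"]

       id_counts = Counter(p["posting_id"] for p in postings)
       roots = {p["account"].split(":", 1)[0] for p in postings}

       return {
           "entry_double_entry_ok": all(d == 0 for d in delta_by_entry.values()),
           "ledger_raw_delta_zero": sum(delta_by_entry.values()) == 0,
           "posting_id_unique": all(n == 1 for n in id_counts.values()),
           "unknown_roots": sorted(roots - set(known_roots)),
       }


   good = [
       {"entry_id": "e1", "posting_id": "e1:0", "account": "Assets:Cash",
        "debit_cents": 5000, "credit_cents": 0},
       {"entry_id": "e1", "posting_id": "e1:1", "account": "Revenue:Sales",
        "debit_cents": 0, "credit_cents": 5000},
   ]
   # One extra row that is unbalanced, reuses a posting ID, and has an unknown root.
   bad = good + [
       {"entry_id": "e2", "posting_id": "e1:1", "account": "Misc:Typo",
        "debit_cents": 100, "credit_cents": 0},
   ]

   good_result = compute_invariants(good)
   bad_result = compute_invariants(bad)

The per-entry and ledger-wide checks are kept separate on purpose: two unbalanced
entries whose errors offset each other would pass the ledger-wide check while failing
the per-entry one.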

Principle 4 — Prefer fact tables + views
----------------------------------------

Data professionals recognize this pattern immediately:

- **Fact table:** postings (one row per posting line)
- **Dimensions:** account, root, department, period
- **Views:** balances and reports computed from facts

This has two advantages:

- It matches how BI / analytics pipelines work (SQL, star schemas, OLAP).
- It prevents “report drift” because statements are derived from the same facts.
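
The pattern can be sketched with plain Python data. The row layout, the ``balances_by``
helper, and the sample postings are illustrative assumptions; the engine itself works
with DataFrames.

.. code-block:: python

   from collections import defaultdict

   # Fact table: one row per posting line, amounts in integer cents.
   POSTINGS = [
       {"posting_id": "e1:0", "account": "Assets:Cash", "period": "2024-01",
        "debit_cents": 5000, "credit_cents": 0},
       {"posting_id": "e1:1", "account": "Revenue:Sales", "period": "2024-01",
        "debit_cents": 0, "credit_cents": 5000},
       {"posting_id": "e2:0", "account": "Expenses:Rent", "period": "2024-01",
        "debit_cents": 2000, "credit_cents": 0},
       {"posting_id": "e2:1", "account": "Assets:Cash", "period": "2024-01",
        "debit_cents": 0, "credit_cents": 2000},
   ]


   def balances_by(postings, dimension):
       """View: net (debit - credit) cents grouped by a dimension function."""
       totals = defaultdict(int)
       for p in postings:
           totals[dimension(p)] += p["debit_cents"] - p["credit_cents"]
       return dict(sorted(totals.items()))


   # Two views over the same facts: by account and by account root.
   by_account = balances_by(POSTINGS, lambda p: p["account"])
   by_root = balances_by(POSTINGS, lambda p: p["account"].split(":", 1)[0])

   # Tracing a report number back to the posting lines behind it.
   cash_lines = [p["posting_id"] for p in POSTINGS if p["account"] == "Assets:Cash"]

Since both views are derived from ``POSTINGS``, they cannot disagree with each other:
each one nets to zero, and the cash balance can be traced back to exactly two posting
lines.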

Principle 5 — Minimal public surface area
-----------------------------------------

The engine intentionally keeps a small API (``LedgerEngine`` and a few helpers).

As the chapter count grows, transformations that are reused across chapters should live in
:mod:`ledgerloom.scenarios`, a small, stable layer that prevents chapters from importing
private helpers from earlier chapters.

A small API is easier to:

- document
- test
- keep stable while the project grows to many chapters

As LedgerLoom grows, we prefer adding *new* helpers instead of changing
existing behavior, so old chapters remain correct.

How this makes accounting better in practice
--------------------------------------------

Putting these principles together leads to practical wins:

- **Fewer errors:** invariants catch imbalances immediately.
- **Faster debugging:** stable IDs make it easy to trace a report number back
  to the exact entry and posting line.
- **Better collaboration:** accountants and developers can talk about the same
  data model (facts + views + constraints).
- **Safer change management:** determinism + tests allow incremental refactors
  without fear of silently changing financial meaning.