Appendix: Data model reference
This appendix describes the canonical tables produced by the LedgerLoom engine. It is written as a contract: if these structures are stable, chapter refactors and new report views remain safe.
Stable identifiers
LedgerLoom uses stable identifiers so every number in a statement can be traced back to an exact event.
Entry IDs (entry_id)
Each ledgerloom.core.Entry should have a stable ID stored in
entry.meta[cfg.entry_id_key] (default key: "entry_id").
Two policies exist (see ledgerloom.engine.config.LedgerEngineConfig):
Strict (default): missing
entry_idraises immediately. This mirrors real systems where matching (A/R open items, A/P matching, reconciliations) depends on durable keys.Generated (optional):
entry_id_policy="generated"synthesizes a stable ID (currentlyH<12 hex>) from the entry’s date, narration, and postings.
When the generated policy is enabled, ledgerloom.engine.ledger.LedgerEngine.invariants()
adds a generated_entry_ids list so the run explicitly records what was synthesized.
Posting IDs (posting_id)
Each posting line is uniquely identified as:
posting_id = "<entry_id>:<line_no>"
where line_no is formatted as two digits (01, 02, …). The table also includes
line_no as an integer column.
Postings fact table (postings.csv)
The engine compiles a list of entries into a single fact table: one row per posting line. This is the “append-only truth” you build reports from.
Columns
posting_id— stable line identifier (<entry_id>:<line_no two digits>)entry_id— stable entry identifierline_no— posting line number within the entry (1, 2, …)date— ISO date string (YYYY-MM-DD)department— optional segment (example dimension used in early chapters)narration— human description of the eventaccount— full account path (e.g.,Assets:Cash)root— first account segment (e.g.,Assets)debit— decimal string, two placescredit— decimal string, two placesraw_delta—debit - creditas a signed decimal stringsigned_delta— raw delta mapped into a “normal-balance” convention by root
Notes
The engine is intentionally not a database. It produces database-shaped tables that you can load into Pandas/SQL/BI tools.
Period is not stored in the fact table. Views (below) derive
periodby slicing thedatefield.
Derived views
The engine provides common “views” (materialized as DataFrames in chapters) derived from postings.
Balances by account (balances_by_account.csv)
Groups postings by (root, account) and sums signed_delta.
Balances by period (balances_by_period.csv)
Adds a derived period key (YYYY-MM) from date, then groups by
(period, root, account).
Balances by department (balances_by_department.csv)
If a department column exists, this view groups by (department, root, account).
(Other segment dimensions will appear in later chapters.)
As-of filtering
For point-in-time reporting (e.g., “balance as of 2026-01-31”), the engine provides
ledgerloom.engine.ledger.LedgerEngine.postings_as_of().
The contract is:
include postings where
date <= as_ofpreserve stable ordering
Invariants (invariants.json)
Invariants are explicit checks you can treat as unit tests. Core invariants include:
entry_double_entry_ok— every entry balances (debits == credits)ledger_raw_delta_zero— totalraw_deltasums to 0 across the full ledgerposting_id_unique— stable IDs are uniqueentry_id_presentandentry_id_unique— entry IDs exist and are uniqueposting_id_format_ok—posting_idmatches<entry_id>:<two digits>
When entry_id_policy="generated", additional diagnostics are included:
entry_id_policy— the active policy (currently"generated")generated_entry_ids— which entries required synthesized IDs and why
Why this appendix matters
If you are extending LedgerLoom, treat this appendix like a schema contract:
New chapters should prefer new derived views over changing the fact table.
Engine refactors should preserve these table shapes unless a version bump and doc update are intentional.