ktx by Kaelio

The Context Layer

What a context layer is, why agents need one, and the YAML and Markdown surfaces ktx writes to disk.

A context layer is the trusted knowledge surface that sits between your data stack and the agents that query it. It holds the things a database connection can't tell an agent on its own: which metrics are canonical, which joins are safe, what your team means by "active customer", and where every definition came from.

ktx builds that layer as plain files - YAML, Markdown, and JSON - that agents can search and humans can review. This page covers what's in it, why agents need it, and how it compares to other semantic tooling.

Database access isn't enough

Hand an agent a database connection and it can run SQL. It still has to guess the part that matters: which table is the source of truth, which join is the one analysts actually use, and what definition the business agreed on. Plausible SQL becomes wrong SQL fast.

| Schema-only access gives the agent | What it still doesn't know |
| --- | --- |
| Tables, columns, and types | Which table is canonical for revenue |
| Primary and foreign keys | Which join is safe and which fans out measures |
| Sample rows | Which rows are test accounts the team excludes |
| orders.amount exists | That amount includes refunds unless filtered |
| A customers.segment column | That legacy_segments is stale even though it exists |
| Column comments, sometimes | The board-approved definition of ARR |

Schema is a starting point, not a contract. The context layer is the contract.
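
Two rows of that table can be made concrete with a small sketch that uses Python's built-in sqlite3 module. The tables, rows, and the `order_items` table are invented for illustration and are not part of ktx or of any real schema. The sketch shows how queries that look reasonable against the schema alone can return different totals.

```python
import sqlite3

# Hypothetical data: three orders, one refunded, and a line-item table
# that has several rows per order.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, amount REAL);
    CREATE TABLE order_items (order_id INTEGER, sku TEXT);
    INSERT INTO orders VALUES (1, 'paid', 100), (2, 'paid', 50), (3, 'refunded', 30);
    INSERT INTO order_items VALUES (1, 'a'), (1, 'b'), (2, 'c'), (3, 'd');
""")

def scalar(sql: str) -> float:
    """Run a query that returns a single value."""
    return conn.execute(sql).fetchone()[0]

# Plausible from the schema alone, but it counts the refunded order.
naive = scalar("SELECT sum(amount) FROM orders")

# The definition the team agreed on: refunded orders are excluded.
agreed = scalar("SELECT sum(amount) FROM orders WHERE status != 'refunded'")

# Joining to a one-to-many table repeats order 1, so its amount is summed twice.
fanned_out = scalar("""
    SELECT sum(o.amount)
    FROM orders o JOIN order_items i ON i.order_id = o.id
    WHERE o.status != 'refunded'
""")

print(naive, agreed, fanned_out)
```

All three queries are valid SQL against the schema, and only one matches the agreed definition. Nothing in the column names or keys tells an agent which one that is.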

The two pillars

A ktx project has two committed surfaces, each tuned for a different question. Structured data lives where it can be compiled. Prose lives where it can be searched. Wiki pages cross-reference semantic sources by name, so every metric caveat stays anchored to the definition it explains.

Anatomy of a context layer

Two files, two jobs

YAML for what the warehouse can execute. Markdown for what the team needs to interpret it. Both are committed to git and reviewed like code.

Semantic sources (semantic-layer/**/*.yaml, committed to git)

Structured and executable. Tables, grain, joins, measures, dimensions, filters, and segments. The compiler turns these into dialect-correct SQL.

Answers: how do I query this safely?

Wiki pages (wiki/**/*.md, committed to git)

Free-form and searchable. Definitions, caveats, policies, and decisions. Frontmatter links each page back to the semantic sources it explains.

Answers: what does this mean to the business?

Behind the scenes. ktx also keeps scan snapshots and a per-run event log locally so every committed change is traceable to its evidence. You don't read or edit these files yourself - see Context as Code for how that audit trail flows into review.

Semantic sources

Semantic sources describe a table in terms an agent can reason about: row grain, typed columns, named measures, valid joins, filters, and segments. The planner compiles SQL from these definitions and from nothing else.

semantic-layer/warehouse/orders.yaml

```yaml
name: orders
table: public.orders
grain: [id]
columns:
  - name: id
    type: number
  - name: status
    type: string
  - name: amount
    type: number
measures:
  - name: total_revenue
    expr: sum(amount)
    filter: "status != 'refunded'"
joins:
  - to: customers
    # "on" is quoted so YAML 1.1 parsers do not read it as a boolean
    "on": customer_id = customers.id
    relationship: many_to_one
```

For how the compiler walks the join graph, handles fan-out, and transpiles dialects, read Semantic querying.
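
As a rough illustration of what compiling a named measure means, here is a minimal sketch that turns the `total_revenue` measure from the example above into a SQL string. It is not ktx's compiler: the `ORDERS` dict is hand-copied from the YAML, `compile_measure` is a hypothetical helper, and the sketch ignores joins, fan-out, and dialects entirely.

```python
# A dict mirroring part of the orders.yaml example (hand-copied, not loaded by ktx).
ORDERS = {
    "name": "orders",
    "table": "public.orders",
    "measures": [
        {"name": "total_revenue", "expr": "sum(amount)", "filter": "status != 'refunded'"},
    ],
}

def compile_measure(source: dict, measure_name: str) -> str:
    """Build a SELECT for one named measure; unknown names raise KeyError."""
    measures = {m["name"]: m for m in source["measures"]}
    measure = measures[measure_name]
    sql = f'SELECT {measure["expr"]} AS {measure["name"]} FROM {source["table"]}'
    if measure.get("filter"):
        # Fine for a single measure; several measures with different filters
        # would each need their own scoped condition instead of a shared WHERE.
        sql += f' WHERE {measure["filter"]}'
    return sql

print(compile_measure(ORDERS, "total_revenue"))
```

The useful property is that a caller can only ask for measures that exist by name. A request for an undefined measure fails loudly instead of producing a guessed expression.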

Wiki pages

Wiki pages hold the context that doesn't belong in a formula: business definitions, reporting policy, anomalies, and metric caveats. Each page links back to the semantic sources it explains through frontmatter.

wiki/global/revenue.md

```markdown
---
summary: Paid order value after refunds
tags: [finance, orders]
sl_refs: [warehouse.orders]
refs: [segment-classification]
usage_mode: auto
---

Revenue is paid order amount after refund adjustments.

Use `orders.total_revenue` for recognized order value and
`orders.order_count` for paid order volume.
```
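
To show how a tool might read the reference fields out of a page like this one, here is a minimal sketch in Python. `split_frontmatter` is a hypothetical helper, not a ktx function. It handles only flat `key: value` lines and inline `[a, b]` lists, which is all the example page uses, so it is not a substitute for a YAML parser.

```python
PAGE = """---
summary: Paid order value after refunds
tags: [finance, orders]
sl_refs: [warehouse.orders]
refs: [segment-classification]
usage_mode: auto
---

Revenue is paid order amount after refund adjustments.
"""

def split_frontmatter(text: str) -> tuple[dict, str]:
    """Split a page into (frontmatter fields, body)."""
    # The page starts with '---', so the first piece of the split is empty.
    _, header, body = text.split("---", 2)
    fields = {}
    for line in header.strip().splitlines():
        key, _, value = line.partition(":")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as [finance, orders]
            fields[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            fields[key.strip()] = value
    return fields, body.strip()

fields, body = split_frontmatter(PAGE)
print(fields["sl_refs"], fields["refs"])
```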

A navigable graph

Those two reference fields - sl_refs from a wiki page to a semantic source, and refs from a wiki page to other wiki pages - turn the context layer into a graph agents traverse. An agent that finds this page while searching for "revenue" follows sl_refs straight to orders.total_revenue for the executable definition, then walks refs to related policies without rerunning search.
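
The traversal described above can be sketched as a breadth-first walk over `refs` that collects `sl_refs` along the way. The `WIKI` index and the `walk` function are hypothetical, and the `warehouse.customers` reference on the second page is an assumption made for the example. ktx's own search and traversal may work differently.

```python
from collections import deque

# Hypothetical in-memory index: page name -> frontmatter reference fields.
WIKI = {
    "revenue": {"sl_refs": ["warehouse.orders"], "refs": ["segment-classification"]},
    # Assumed for illustration: this page cites the customers source.
    "segment-classification": {"sl_refs": ["warehouse.customers"], "refs": []},
}

def walk(start: str) -> tuple[list[str], list[str]]:
    """Breadth-first walk over refs, collecting pages and the sources they cite."""
    pages, sources = [], []
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        pages.append(page)
        for source in WIKI[page]["sl_refs"]:
            if source not in sources:
                sources.append(source)
        for ref in WIKI[page]["refs"]:
            # Skip unknown targets and pages already visited (guards against cycles).
            if ref in WIKI and ref not in seen:
                seen.add(ref)
                queue.append(ref)
    return pages, sources

print(walk("revenue"))
```

One search hit is enough to reach both the executable definition and the related policy page, which is the point of keeping the references in frontmatter.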

The graph only helps if the edges stay live. ktx validates references when wiki pages are written and prunes sl_refs during ingest when their target sources are deleted or their measures are renamed - so a stale page can never quietly route an agent to a definition that no longer exists.
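
A minimal sketch of the pruning idea, assuming page frontmatter is already loaded into dicts: any `sl_refs` entry whose target is no longer a live source is dropped and reported. `prune_sl_refs` is a hypothetical helper, and it checks source names only. It does not model renamed measures or the validation ktx performs when pages are written.

```python
def prune_sl_refs(pages: dict[str, dict], live_sources: set[str]) -> dict[str, list[str]]:
    """Drop sl_refs whose target is gone; return what was removed per page."""
    removed = {}
    for name, fields in pages.items():
        stale = [ref for ref in fields["sl_refs"] if ref not in live_sources]
        if stale:
            # Keep only references that still resolve; pages are updated in place.
            fields["sl_refs"] = [ref for ref in fields["sl_refs"] if ref in live_sources]
            removed[name] = stale
    return removed

# Hypothetical state after warehouse.legacy_segments was deleted.
pages = {
    "revenue": {"sl_refs": ["warehouse.orders", "warehouse.legacy_segments"]},
    "segment-classification": {"sl_refs": ["warehouse.customers"]},
}
live = {"warehouse.orders", "warehouse.customers"}

removed = prune_sl_refs(pages, live)
print(removed)
```

Returning the removed references, rather than dropping them silently, gives a reviewer something to check in the resulting diff.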

The split between the two pillars is sharp:

| Put it in YAML | Put it in Markdown |
| --- | --- |
| sum(amount) | "Net revenue excludes successful refunds." |
| many_to_one join metadata | "Use the contract segment for board reporting." |
| Row grain and column types | "February had a one-time refund anomaly." |
| Default time dimension | "Finance owns ARR definitions." |

If a fact changes how the SQL runs, it goes in YAML. If a human needs it to trust the answer, it goes in Markdown.

How ktx compares

Two adjacent product categories cover parts of this problem - but each leaves a different gap.

Company brains (Glean, Notion AI, the search-over-everything tools) index your wikis, docs, and chats so an agent can find context fast. They aren't built for data stacks: there's no join graph, no canonical metrics, and no way to compile a question into safe SQL. An agent reading them still has to guess how to query the warehouse.

Traditional semantic layers (MetricFlow, Cube, Malloy) solve that side. They give agents reviewable metric definitions and a compiler that produces correct SQL. The cost is maintenance - models, joins, and dimensions are hand-written, and the layer doesn't learn from the warehouse, BI tools, or query history that surround it. The business context that explains why a definition exists usually lives somewhere else.

ktx bundles both surfaces - wiki for business context, semantic layer for queryable definitions - and keeps them current by reading the data stack and reconciling new evidence with the reviewed files. You get the breadth of a knowledge tool and the SQL safety of a semantic layer, without rewriting models every time the warehouse changes.

| Capability | Company brain | Semantic layer | ktx |
| --- | --- | --- | --- |
| Surface | Indexed docs and chats | Modeling language or runtime | YAML and Markdown files |
| Data-stack awareness | None - treats data tools as text | High for declared metrics, none for the surrounding warehouse | Built in: scans schemas, dbt, BI tools, and query history |
| Maintenance | Manual page authoring | Manual modeling, model-per-change | Auto-maintained: reconciles evidence with accepted files |
| SQL safety | None - generates plausible text | Compiled, dialect-correct | Compiled with join-graph and fan-out handling |
| Agent edit loop | Text-only | Tied to the modeling workflow | First-class: patch files, validate, review diffs |

If you already use MetricFlow, LookML, dbt, or BI tools, ktx can ingest that context and turn it into agent-readable files. You don't need to replace your serving layer to give agents a better working surface.

A ktx project on disk

A ktx project is a directory of readable files. Semantic sources and wiki pages are committed to git; everything else ktx needs at runtime stays local and out of the repo.

```text
my-project/
├── ktx.yaml                              # project config and connections
├── semantic-layer/
│   └── warehouse/
│       ├── orders.yaml
│       └── customers.yaml
├── wiki/
│   └── global/
│       ├── revenue.md
│       └── segment-classification.md
└── .ktx/                                 # local runtime state, git-ignored
```

This keeps analytics context close to the code review workflow: branch context changes, review YAML and Markdown diffs, merge accepted definitions, and let agents read the updated source of truth.
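
Because the layer is plain files, ordinary tooling can enumerate it. The sketch below uses Python's pathlib with the two glob patterns named on this page to list semantic sources and wiki pages. `committed_surfaces` is a hypothetical helper, and the `.ktx/state.json` file name is invented to stand in for local runtime state. The sketch builds a throwaway copy of the example layout so it runs anywhere.

```python
from pathlib import Path
import tempfile

def committed_surfaces(project: Path) -> dict[str, list[str]]:
    """List semantic sources and wiki pages using the globs named on this page."""
    return {
        "semantic_sources": sorted(
            p.relative_to(project).as_posix()
            for p in project.glob("semantic-layer/**/*.yaml")
        ),
        "wiki_pages": sorted(
            p.relative_to(project).as_posix()
            for p in project.glob("wiki/**/*.md")
        ),
    }

# Recreate the example layout in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in [
        "ktx.yaml",
        "semantic-layer/warehouse/orders.yaml",
        "semantic-layer/warehouse/customers.yaml",
        "wiki/global/revenue.md",
        "wiki/global/segment-classification.md",
        ".ktx/state.json",  # hypothetical file name standing in for local state
    ]:
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    surfaces = committed_surfaces(root)

print(surfaces)
```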

Agent usage notes

Use this page when an agent needs to explain why ktx exists, why schema-only database access isn't enough, or how ktx differs from traditional semantic layers.

| Agent task | Relevant section | Next page |
| --- | --- | --- |
| Explain why a data agent wrote a plausible but wrong query | Database access isn't enough | Writing Context |
| Decide whether a fact belongs in YAML or Markdown | Semantic sources / Wiki pages | Writing Context |
| Compare ktx to another semantic layer | How ktx compares | Primary Sources |
| Explain reviewability and source of truth | A ktx project on disk | Context as Code |