Writing Context
Edit semantic sources and wiki pages so agents use your business logic.
Ingest creates the first draft. Edit source YAML and wiki Markdown when you need sharper metrics, joins, or business rules.
Editing workflow
Use this order for most context changes:
-
Discover existing context.
ktx sl --json ktx sl "revenue" --json ktx wiki "revenue recognition" --json --limit 10
-
Edit the smallest relevant files under
semantic-layer/<connection-id>/orwiki/. -
Validate semantic source changes.
ktx sl validate orders --connection-id warehouse
-
Compile a representative query before executing it.
ktx sl query \ --connection-id warehouse \ --measure orders.total_revenue \ --dimension orders.created_date \ --format sql
-
Search again using likely user wording to confirm the new context is discoverable.
Semantic sources
Semantic sources are YAML files for queryable tables or custom SQL. They define agent-facing measures, dimensions, segments, joins, and grain.
Semantic source files live at:
semantic-layer/<connection-id>/<source-name>.yamlMinimal source
name: orders descriptions: user: Customer orders with booked revenue. table: public.orders grain: - order_id columns: - name: order_id type: string descriptions: user: Unique order identifier. - name: order_date type: time role: time descriptions: user: Date the order was placed. - name: total_amount type: number descriptions: user: Booked order value in USD. measures: - name: total_revenue expr: SUM(total_amount) description: Sum of booked order value before refunds.
Full source shape
name: orders descriptions: user: Customer orders with line-item totals. table: public.orders grain: - order_id columns: - name: order_id type: string descriptions: user: Unique order identifier. - name: order_date type: time role: time descriptions: user: Date the order was placed. - name: status type: string visibility: public descriptions: user: Current order status. - name: _etl_loaded_at type: time visibility: hidden descriptions: user: Internal load timestamp. - name: total_amount type: number descriptions: user: Order total in USD. measures: - name: total_revenue expr: SUM(total_amount) description: Sum of all order values. - name: order_count expr: COUNT(DISTINCT order_id) description: Number of distinct orders. - name: avg_order_value expr: AVG(total_amount) description: Average booked order value. - name: high_value_revenue expr: SUM(total_amount) filter: total_amount > 100 description: Revenue from orders over $100. segments: - name: completed_orders expr: status = 'completed' description: Orders that completed fulfillment. joins: - to: customers on: orders.customer_id = customers.customer_id relationship: many_to_one - to: order_items on: orders.order_id = order_items.order_id relationship: one_to_many alias: items
Source fields
| Field | Required | Description |
|---|---|---|
name | Yes | Source identifier. Use lowercase words and underscores. |
descriptions | No | Description map keyed by source, such as user, dbt, or ai. |
table or sql | Yes | Database table or custom SQL expression. Use exactly one. |
grain | Yes | Columns that uniquely identify a row at the source grain. |
columns | Yes | Non-empty column definitions with type, role, visibility, and descriptions. |
measures | No | Aggregation expressions such as SUM, COUNT, and AVG. |
segments | No | Named predicates agents can reuse. |
joins | No | Relationships to other semantic sources. |
inherits_columns_from | No | Inherit column metadata from a manifest entry. |
Component fields
| Component | Field | Required | Description |
|---|---|---|---|
| Column | name | Yes | Column identifier used in SQL expressions. |
| Column | type | Yes | Agent-facing type: string, number, time, or boolean. |
| Column | role | No | Special role such as time for default time dimensions. |
| Column | visibility | No | public, internal, or hidden. |
| Column | descriptions | Strongly recommended | Description map keyed by source, such as user, dbt, or ai. |
| Measure | name | Yes | Queryable metric name. |
| Measure | expr | Yes | SQL aggregation expression at the source grain. |
| Measure | filter | No | SQL predicate applied only to this measure. |
| Measure | description | Strongly recommended | Definition agents can cite and compare. |
| Segment | name | Yes | Reusable filter name. |
| Segment | expr | Yes | SQL predicate for the segment. |
| Join | to | Yes | Target semantic source name. |
| Join | on | Yes | SQL join condition using source names or aliases. |
| Join | relationship | Yes | many_to_one, one_to_many, or one_to_one. |
| Join | alias | No | Query alias for repeated or clearer joins. |
Visibility
| Visibility | Agent behavior |
|---|---|
public | Included in listings and available for agent queries. |
internal | Available for joins and measures, but not highlighted to agents. |
hidden | Excluded from agent-facing context. Use for ETL fields and sensitive internals. |
Measures
Good measures have precise names, correct-grain SQL, and descriptions that name key inclusions and exclusions.
measures: - name: net_revenue expr: SUM(total_amount - refunded_amount) filter: status = 'completed' description: Completed order revenue after refunds, excluding cancelled orders.
Prefer one canonical measure plus wiki synonyms. Put competing definitions in a linked wiki page.
Joins and grain
grain and relationship prevent double-counted SQL. State the row grain even
when it seems obvious.
grain: - order_id joins: - to: customers on: orders.customer_id = customers.customer_id relationship: many_to_one
Use many_to_one for dimensions such as customer, account, product, or plan.
Use one_to_many only when the target can fan out rows.
Validate and query
Validation checks source YAML against the live database schema:
ktx sl validate orders --connection-id warehouse
It catches missing columns, invalid joins, and table-reference problems.
Compile a query to inspect generated SQL:
ktx sl query \ --connection-id warehouse \ --measure orders.total_revenue \ --dimension orders.order_date \ --filter "orders.status = 'completed'" \ --order-by orders.order_date:desc \ --limit 10 \ --format sql
Execute only when you need live rows:
ktx sl query \ --connection-id warehouse \ --measure orders.total_revenue \ --dimension orders.status \ --execute \ --max-rows 100
Wiki pages
Wiki pages hold context that does not belong in one semantic source: policies, caveats, vocabulary, freshness, known issues, and source-of-truth notes.
Wiki files live under:
wiki/
global/
user/<user-id>/Use global pages for shared rules and user-scoped pages for local notes.
Wiki page example
--- summary: Revenue recognition rules for finance reporting. tags: [revenue, finance, reporting] sl_refs: [orders] external_refs: - type: notion id: finance-revenue-policy --- ## Recognized Revenue Recognized revenue includes completed orders after refunds. It excludes cancelled orders, test orders, implementation fees, and tax. Finance reporting uses order completion date, not invoice creation date.
Useful frontmatter:
| Field | Required | Description |
|---|---|---|
summary | Yes | Short text shown in search results. |
tags | No | Business terms and synonyms that improve search. |
sl_refs | No | Semantic source names the page explains or constrains. |
external_refs | No | Source-of-truth system links or ids. |
Add searchable business context
-
Search first.
ktx wiki "active customer definition" --json --limit 10
-
If no page covers the rule, create or edit a Markdown file under
wiki/global/. -
Write a compact
summarywith the wording users are likely to ask. -
Add tags for synonyms and related business areas.
-
Add
sl_refsfor relevant semantic sources. -
Search again with a user-like phrase.
Review context changes
Before accepting agent-written context:
git diff -- semantic-layer wiki ktx sl validate orders --connection-id warehouse ktx sl "revenue" --json ktx wiki "revenue recognition" --json --limit 10
Check definitions, hidden columns, join relationships, and generated SQL.
Common errors
| Symptom | Likely cause | Recovery |
|---|---|---|
ktx sl validate reports a missing column | YAML references a column absent from the scanned table | Refresh database context or update the YAML |
| Query compilation double-counts a measure | grain or join relationship is missing or wrong | Add explicit grain and relationship values, then recompile |
| Agent cannot find a metric | Measure name and description do not match business terminology | Add a clearer measure description and a wiki page with synonyms |
| Wiki search misses a page | Summary, tags, or content do not match user wording | Rewrite the summary and add likely synonyms |
| Context diff is hard to review | One edit changed too many concepts | Split the change into focused source and wiki edits |