Building Context

Build and refresh ktx context from databases, context sources, query history, and text.

Build context after ktx setup creates ktx.yaml and at least one database or context-source connection. ktx writes local semantic sources and wiki pages for agents to use before writing SQL.

The build loop

Most projects use this loop:

  1. Check readiness with ktx status.
  2. Build one connection with ktx ingest <connectionId>, or build everything with ktx ingest --all.
  3. Search or inspect the generated files under semantic-layer/ and wiki/.
  4. Edit source YAML or Markdown when business logic needs refinement.
  5. Validate and query representative sources before handing the context to an agent.

ktx ingest --all runs databases first, then context-source connections, so external metadata can attach to known warehouse tables.
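
The first two steps of the loop can be wrapped in a small shell function. This is a minimal sketch, not part of ktx: build_context is a hypothetical helper name, and the sketch assumes ktx status and ktx ingest exit non-zero on failure, which this guide does not state.

```shell
# Hypothetical helper: check readiness, then build one connection or all of them.
# Assumption: ktx commands exit non-zero when they fail.
build_context() {
  ktx status || return 1
  if [ "$#" -eq 0 ]; then
    # No connection id given: build databases first, then context sources.
    ktx ingest --all || return 1
  else
    ktx ingest "$1" || return 1
  fi
  # Re-check readiness after the build.
  ktx status
}
```

Call build_context warehouse to build one connection, or build_context with no arguments to build everything.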

Database ingest

Database ingest always builds enriched context: tables, columns, types, constraints, and row counts, plus AI-generated descriptions, embeddings, and relationship evidence.

# Build one configured database connection
ktx ingest warehouse

# Build all configured connections
ktx ingest --all

Enriched ingest needs a configured model and embeddings. Run ktx setup first; connections without that configuration fail before any work starts.

Local-auth backends keep provider credentials out of ktx.yaml:

ktx setup --llm-backend claude-code --no-input
ktx setup --llm-backend codex --no-input

With claude-code, ktx agent loops can invoke only the ktx MCP tools for the current run. With codex, ktx restricts the temporary runtime MCP server to the current run's tool set, disables Codex web search, requests a read-only sandbox, and sets approval_policy=never. The public Codex SDK and CLI surface may still load user Codex config and expose built-in command execution or read-only file capabilities, so use claude-code when you need stricter runtime tool isolation.

Query history

PostgreSQL, BigQuery, and Snowflake can add query-history context: common joins, filters, redaction rules, high-usage templates, and service-account exclusions. When query history is enabled during setup, ktx reviews observed in-scope roles and can write exact filters.serviceAccounts patterns for operational traffic such as loader or refresh roles.

Enable it during setup, store it under connections.<id>.context.queryHistory, or request it for one run:

ktx ingest warehouse --query-history
# Set the lookback window for BigQuery or Snowflake query history
ktx ingest warehouse --query-history-window-days 30

Use --no-query-history when you want to skip a stored query-history setting for one run.
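
A per-run override can be scripted around these flags. The sketch below is illustrative only: ingest_history and the KTX_QUERY_HISTORY variable are hypothetical names invented for this example, not ktx settings. It uses only the --query-history and --no-query-history flags described above.

```shell
# Hypothetical wrapper: KTX_QUERY_HISTORY is an illustrative variable, not a ktx setting.
ingest_history() {
  conn="$1"
  case "${KTX_QUERY_HISTORY:-stored}" in
    on)  ktx ingest "$conn" --query-history ;;     # request query history for this run
    off) ktx ingest "$conn" --no-query-history ;;  # skip a stored setting for this run
    *)   ktx ingest "$conn" ;;                     # use whatever ktx.yaml stores
  esac
}
```

For example, KTX_QUERY_HISTORY=off ingest_history warehouse runs one build without query history and leaves the stored setting untouched.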

Relationship evidence

ktx scores relationship candidates during database ingest. The public CLI does not expose separate relationship review subcommands.

Context-source ingest

Context-source connections pull metadata from dbt, BI tools, Notion, and other configured systems. Pass one connection id or --all.

# Build one context-source connection
ktx ingest dbt_main

# Build every configured database and context-source connection
ktx ingest --all

Supported source types:

| Driver | Typical source | Output |
| --- | --- | --- |
| dbt | dbt project or Git repo | Semantic sources with model, column, test, tag, and description metadata |
| metricflow | MetricFlow project or Git repo | Metrics, dimensions, entities, and semantic joins |
| lookml | LookML files or Git repo | Views, explores, dimensions, measures, and joins |
| looker | Looker API | Explores, looks, dashboards, and model metadata |
| metabase | Metabase API | Questions, dashboards, table metadata, and mappings |
| notion | Notion API | Wiki pages and business knowledge |
| sigma | Sigma API | Data model specs, pages, element metadata, and workbook metadata |

Context-source ingest writes semantic source YAML and wiki Markdown, and reconciles them with local edits.

Text ingest

Use ktx ingest --text or ktx ingest --file for notes, Markdown, runbooks, Slack exports, or other searchable memory.

# Capture a Markdown file
ktx ingest --file docs/revenue-notes.md --connection-id warehouse

# Capture one stdin item
printf "Refunds are excluded from net revenue." | ktx ingest --file -

# Capture direct text
ktx ingest --text "ARR excludes one-time implementation fees."

Useful flags:

| Flag | Description |
| --- | --- |
| --text <content> | Capture inline text into memory; repeatable |
| --file <path> | Capture a text file (or - for stdin) into memory; repeatable |
| --connection-id <connectionId> | Attach the captured memory to a ktx connection |
| --user-id <id> | Attribute capture to a user scope, default local-cli |
| --json | Print structured output |
| --fail-fast | Stop after the first failed text/file item |

Use text ingest for small, high-signal documents. Prefer configured context-source ingest for Notion, dbt, Metabase, and similar systems.
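
Because --file is repeatable, a handful of notes can be captured in one run. The following is a rough sketch under that assumption: ingest_notes is a hypothetical helper, and the directory and connection id are whatever you pass in.

```shell
# Hypothetical helper: capture every Markdown file in a directory in one ktx run.
ingest_notes() {
  dir="$1"; conn="$2"
  set --                       # reuse the positional parameters as the argument list
  for f in "$dir"/*.md; do
    [ -f "$f" ] || continue    # skip the unexpanded glob when nothing matches
    set -- "$@" --file "$f"
  done
  if [ "$#" -eq 0 ]; then
    echo "no Markdown files in $dir" >&2
    return 1
  fi
  # --fail-fast stops after the first failed item instead of continuing.
  ktx ingest "$@" --connection-id "$conn" --fail-fast
}
```

For example, ingest_notes docs warehouse attaches every docs/*.md file to the warehouse connection. Drop --fail-fast if you would rather process the remaining files after a failure.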

Output and artifacts

Every ingest run prints a summary. Use --json for scripts and agents.

ktx ingest --all --json
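
In a script, it can help to keep the JSON summary on disk for later inspection. The sketch below assumes the JSON is written to standard output and that ktx exits non-zero on failure; neither is confirmed by this guide. save_ingest_summary and the default file name are hypothetical.

```shell
# Hypothetical helper: save the JSON summary of a full build to a file.
# Assumptions: JSON goes to stdout; ktx exits non-zero on failure.
save_ingest_summary() {
  out="${1:-ingest-summary.json}"
  if ktx ingest --all --json > "$out"; then
    echo "summary written to $out"
  else
    rc=$?   # exit status of the failed ktx command
    echo "ingest failed with exit code $rc; partial output in $out" >&2
    return "$rc"
  fi
}
```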

Typical generated files:

| Path | Created by | Purpose |
| --- | --- | --- |
| semantic-layer/<connection-id>/*.yaml | Database and context-source ingest | Queryable semantic source definitions |
| wiki/global/*.md | Context-source, text, and memory ingest | Shared business definitions and notes |
| wiki/user/<user-id>/*.md | Text and memory ingest | User-scoped context |
| .ktx/setup/context-build.json | Setup context build | Resume and readiness state for setup |

Ingest transcripts include tool calls, LLM responses, and write decisions.

Example: first full refresh

After interactive setup:

ktx status
ktx ingest --all
ktx status

Then inspect what changed:

git status --short
ktx sl --json
ktx wiki "revenue" --json --limit 10
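
To narrow the git view to generated context only, the status call can be limited to the two output directories. This is a small sketch using standard git options; changed_context is a hypothetical helper name, and it assumes the project is a Git repository with semantic-layer/ and wiki/ at its root.

```shell
# Hypothetical helper: list only changes under the generated context directories.
changed_context() {
  # --untracked-files=all lists new files individually instead of whole directories.
  git status --short --untracked-files=all -- semantic-layer wiki
}
```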

Common errors

| Symptom | Likely cause | Recovery |
| --- | --- | --- |
| Connection not configured | The connection id is missing from ktx.yaml | Add it with ktx setup |
| Enrichment is not configured | LLM or embeddings are not setup-ready | Run ktx setup to configure a model and embeddings |
| Query history is unsupported | The selected database driver does not expose query history | Run ingest without query-history flags |
| No connections configured | The project has no entries under connections | Run ktx setup and add a database or context-source connection |
| Context-source flags have no effect | Query-history flags were supplied for a context-source connector | Use query-history flags only for database connections |
| Text ingest stops early | --fail-fast stopped on the first failed item | Fix the item or rerun without --fail-fast |