Primary Sources
Connect ktx to PostgreSQL, Snowflake, BigQuery, MySQL, SQL Server, or SQLite.
ktx connects to your data warehouse or database to build schema context,
discover relationships, and execute semantic layer queries. Each connection is
defined in ktx.yaml under the connections key.
For analytics tools and knowledge systems such as dbt, MetricFlow, LookML, Metabase, Looker, and Notion, use Context Sources. For Claude Code, Codex, Cursor, OpenCode, and other agent clients, use Agent Clients.
All connectors share these conventions:
- Sensitive values support
env:VAR_NAME(read from environment) andfile:/path/to/secret(read from file) references - Connections are read-only; ktx never writes to your database
- Database ingest discovers tables, columns, types, and constraints automatically
Connection field reference
Agents should prefer environment or file references over literal secrets.
| Field | Required | Applies to | Description |
|---|---|---|---|
driver | Yes | all connections | Connector driver such as postgres, snowflake, bigquery, mysql, sqlserver, or sqlite |
url | One of the connection methods | URL-style connectors | Database URL, env:NAME, or file:/path/to/secret |
host, port, database, username, password | One of the connection methods | PostgreSQL, MySQL, SQL Server | Field-by-field connection values |
schema or schemas | No | schema-aware warehouses | Single schema or list of schemas to scan |
context.queryHistory | No | PostgreSQL, Snowflake, BigQuery | Enables query-history ingestion when the warehouse supports it |
path | Yes for path-style SQLite | SQLite | Local SQLite database path or env:NAME reference |
max_bytes_billed | No | BigQuery | Maximum bytes billed per query job |
job_timeout_ms | No | BigQuery | BigQuery query job timeout in milliseconds |
project_id | No | BigQuery | Optional local descriptor and mapping metadata; not used for BigQuery authentication |
PostgreSQL
The most full-featured connector. Supports schema introspection, foreign key detection, column statistics, and query history via pg_stat_statements.
Connection config
connections: my-postgres: driver: postgres url: env:DATABASE_URL schema: public
Or with individual fields:
connections: my-postgres: driver: postgres host: localhost port: 5432 database: analytics username: ktx_reader password: env:PG_PASSWORD schemas: - public - analytics ssl: true
Authentication
| Method | Config |
|---|---|
| Password | password: env:PG_PASSWORD or password: file:/path/to/secret |
| Connection URL | url: env:DATABASE_URL |
| SSL | ssl: true, optionally rejectUnauthorized: false for self-signed certs |
Features
| Feature | Supported | Notes |
|---|---|---|
| Tables & views | Yes | Via pg_catalog |
| Primary keys | Yes | Via information_schema.table_constraints |
| Foreign keys | Yes | Full constraint detection |
| Row count estimates | Yes | Via pg_class.reltuples |
| Column statistics | Yes | Requires pg_read_all_stats role |
| Query history | Yes | Via pg_stat_statements extension |
| Table sampling | Yes | TABLESAMPLE SYSTEM |
Query history
PostgreSQL query history mines real query patterns from pg_stat_statements.
This helps ktx understand how your team actually queries the data.
Requirements:
pg_stat_statementsextension enabledpg_read_all_statsrole granted to the ktx user
Config options:
context: queryHistory: enabled: true minExecutions: 5 filters: dropTrivialProbes: true
Dialect notes
- SQL compilation uses
LIMIT/OFFSETpagination - Named parameters converted to positional (
$1,$2, ...) - Supports
COUNT(*) FILTER (WHERE ...)for null analysis - Full support for PostgreSQL types:
uuid,jsonb,timestamptz,numeric,text[], etc.
Snowflake
Connects via the Snowflake SDK. Supports multi-schema scanning, RSA key authentication, and query-history configuration for Snowflake query history.
Connection config
connections: my-snowflake: driver: snowflake account: xy12345 warehouse: ANALYTICS_WH database: PROD schema_name: PUBLIC username: KTX_SERVICE password: env:SNOWFLAKE_PASSWORD role: ANALYST
For multiple schemas:
schema_names: - PUBLIC - ANALYTICS - STAGING
Authentication
| Method | Config |
|---|---|
| Password | password: env:SNOWFLAKE_PASSWORD |
| RSA key pair | authMethod: rsa, privateKey: file:~/.ssh/snowflake_key.pem, optional passphrase |
Features
| Feature | Supported | Notes |
|---|---|---|
| Tables & views | Yes | Via INFORMATION_SCHEMA.TABLES |
| Primary keys | Yes | Via table constraints |
| Foreign keys | No | Not available in Snowflake |
| Row count estimates | Yes | From INFORMATION_SCHEMA.TABLES.ROW_COUNT |
| Column statistics | No | - |
| Query history | Yes | Via SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY when enabled |
| Table sampling | Yes | - |
Query history
Snowflake query history reads aggregated query-history templates from
SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY and feeds the same unified staged
artifact shape as Postgres and BigQuery.
context: queryHistory: enabled: true windowDays: 90 minExecutions: 5 filters: dropTrivialProbes: true serviceAccounts: patterns: ['^svc_'] mode: exclude redactionPatterns: []
Dialect notes
- All identifiers are uppercase by default (case-insensitive matching)
- Connection context set per query (
USE ROLE,USE WAREHOUSE,USE DATABASE,USE SCHEMA) - Parameter binding uses positional
?placeholders - Date values normalized to ISO 8601 strings
BigQuery
Authenticates via GCP service account credentials. Supports multi-dataset scanning and query-history configuration for INFORMATION_SCHEMA.JOBS_BY_PROJECT.
Connection config
connections: my-bigquery: driver: bigquery credentials_json: file:~/.config/gcloud/bq-service-account.json dataset_id: analytics location: US
For multiple datasets:
dataset_ids: - analytics - marketing - finance
Authentication
| Method | Config |
|---|---|
| Service account JSON | credentials_json: file:/path/to/key.json |
| Environment variable | credentials_json: env:BIGQUERY_CREDENTIALS_JSON |
The project ID is extracted automatically from the service account JSON file.
If you set project_id in ktx.yaml, ktx treats it as local descriptor and
mapping metadata. The BigQuery connector still authenticates with the
project_id inside credentials_json.
Features
| Feature | Supported | Notes |
|---|---|---|
| Tables & views | Yes | Including materialized views and external tables |
| Primary keys | Yes | Via INFORMATION_SCHEMA table constraints when declared |
| Foreign keys | No | Not available in BigQuery |
| Row count estimates | Yes | From table metadata |
| Column statistics | No | - |
| Query history | Yes | Via region-scoped INFORMATION_SCHEMA.JOBS_BY_PROJECT when enabled |
| Table sampling | Yes | - |
Query history
BigQuery query history reads aggregated query-history templates from
region-scoped INFORMATION_SCHEMA.JOBS_BY_PROJECT and feeds the same unified
staged artifact shape as Postgres and Snowflake.
context: queryHistory: enabled: true windowDays: 90 minExecutions: 5 filters: dropTrivialProbes: true serviceAccounts: patterns: ['@bot\\.'] mode: exclude redactionPatterns: []
Dialect notes
- Parameter binding uses named
@paramsyntax - Arrays flattened to comma-separated strings in results
- Location specified at query execution time
- Supports
max_bytes_billedandjob_timeout_mslimits fromktx.yaml
MySQL
Standard MySQL/MariaDB connector with full foreign key support and schema introspection.
Connection config
connections: my-mysql: driver: mysql url: env:MYSQL_DATABASE_URL
Or with individual fields:
connections: my-mysql: driver: mysql host: mysql.internal port: 3306 database: analytics username: ktx_reader password: env:MYSQL_PASSWORD ssl: true
Authentication
| Method | Config |
|---|---|
| Password | password: env:MYSQL_PASSWORD or password: file:/path/to/secret |
| SSL | ssl: true or ssl: { rejectUnauthorized: false } |
| URL parameters | ?ssl=true or ?sslmode=required in connection URL |
Features
| Feature | Supported | Notes |
|---|---|---|
| Tables & views | Yes | Via INFORMATION_SCHEMA.TABLES |
| Primary keys | Yes | Via KEY_COLUMN_USAGE |
| Foreign keys | Yes | Via REFERENTIAL_CONSTRAINTS |
| Row count estimates | Yes | From TABLE_ROWS (InnoDB estimate) |
| Column statistics | No | - |
| Query history | No | - |
| Table sampling | Yes | Uses RAND() filter |
Dialect notes
- Parameter binding uses positional
?placeholders - Uses
LIMIT X OFFSET Yfor pagination - Single database per connection (no multi-schema)
- Supports 20+ MySQL types including
enum,json,datetime,decimal - Table comments extracted with InnoDB metadata prefix stripping
SQL Server
Connects to Microsoft SQL Server and Azure SQL. Supports multi-schema scanning with dbo as the default schema.
Connection config
connections: my-sqlserver: driver: sqlserver url: env:SQLSERVER_DATABASE_URL
Or with individual fields:
connections: my-sqlserver: driver: sqlserver host: sql.internal port: 1433 database: Analytics username: ktx_reader password: env:MSSQL_PASSWORD schema: dbo trustServerCertificate: true
For multiple schemas:
schemas: - dbo - analytics - staging
Authentication
| Method | Config |
|---|---|
| SQL Server auth | username + password |
| Encrypted connection | Always enabled, trustServerCertificate: true for self-signed |
Features
| Feature | Supported | Notes |
|---|---|---|
| Tables & views | Yes | Via INFORMATION_SCHEMA.TABLES |
| Primary keys | Yes | Via TABLE_CONSTRAINTS and KEY_COLUMN_USAGE |
| Foreign keys | Yes | Via REFERENTIAL_CONSTRAINTS |
| Row count estimates | Yes | Via sys.dm_db_partition_stats |
| Column statistics | No | - |
| Query history | No | - |
| Table sampling | Yes | - |
| Nested analysis | No | - |
Dialect notes
- Parameter binding uses
@paramNamesyntax - Row limiting uses
SELECT TOP N * FROM (query) AS ktx_query_result - Encryption is always required; certificate validation is optional
- Multi-schema support with per-schema isolation
SQLite
File-based connector using better-sqlite3. Ideal for local development, embedded analytics, or testing.
Connection config
connections: my-sqlite: driver: sqlite path: ./data/warehouse.sqlite
Path supports multiple formats:
# Relative path (resolved against project directory) path: ./warehouse.sqlite # Absolute path path: /var/data/analytics.db # Home directory expansion path: ~/data/warehouse.sqlite # Environment variable path: env:SQLITE_DB_PATH # URL format url: sqlite:///path/to/db.sqlite
Authentication
No authentication required - SQLite is file-based. The file must be readable by the process running ktx.
Features
| Feature | Supported | Notes |
|---|---|---|
| Tables & views | Yes | Via sqlite_master |
| Primary keys | Yes | Via PRAGMA table_info() |
| Foreign keys | Yes | Via PRAGMA foreign_key_list() (requires PRAGMA foreign_keys = ON) |
| Row count estimates | Yes | Exact count via SELECT COUNT(*) |
| Column statistics | No | - |
| Query history | No | - |
| Table sampling | Yes | - |
| Nested analysis | No | - |
Dialect notes
- Synchronous query execution (no connection pooling)
- Parameter binding uses
:paramNamesyntax - Uses
LIMIT X OFFSET Yfor pagination - SQLite type affinity system:
TEXT,NUMERIC,INTEGER,REAL,BLOB - Foreign key enforcement requires explicit
PRAGMA foreign_keys = ON - Database file must exist before
ktx connection testor ingest runs
Common errors
| Error or symptom | Likely cause | Recovery |
|---|---|---|
| Connection URL appears in git diff | A literal credential URL was written to ktx.yaml | Replace it with env:NAME or file:/path/to/secret and rotate exposed credentials |
| Database ingest returns no tables | Schema, database, or project filter is wrong, or the user lacks metadata permissions | Verify the schema list and grant metadata read permissions |
| Query history is empty | Query history extension or warehouse history view is unavailable | Enable the warehouse-specific history feature, then rerun ktx ingest <connectionId> --query-history or ktx setup |
| Column statistics are missing | Connector cannot access stats tables or the warehouse does not expose them | Grant stats permissions where supported; otherwise rely on fast schema context |
| Semantic query execution fails | Connection is missing, unreachable, or query execution is disabled | Run ktx connection test <id> and check the ktx sl query flags |