ktxby Kaelio
Docs
Integrations

Primary Sources

Connect ktx to PostgreSQL, Snowflake, BigQuery, MySQL, SQL Server, or SQLite.

ktx connects to your data warehouse or database to build schema context, discover relationships, and execute semantic layer queries. Each connection is defined in ktx.yaml under the connections key.

For analytics tools and knowledge systems such as dbt, MetricFlow, LookML, Metabase, Looker, and Notion, use Context Sources. For Claude Code, Codex, Cursor, OpenCode, and other agent clients, use Agent Clients.

All connectors share these conventions:

  • Sensitive values support env:VAR_NAME (read from environment) and file:/path/to/secret (read from file) references
  • Connections are read-only; ktx never writes to your database
  • Database ingest discovers tables, columns, types, and constraints automatically

Connection field reference

Agents should prefer environment or file references over literal secrets.

FieldRequiredApplies toDescription
driverYesall connectionsConnector driver such as postgres, snowflake, bigquery, mysql, sqlserver, or sqlite
urlOne of the connection methodsURL-style connectorsDatabase URL, env:NAME, or file:/path/to/secret
host, port, database, username, passwordOne of the connection methodsPostgreSQL, MySQL, SQL ServerField-by-field connection values
schema or schemasNoschema-aware warehousesSingle schema or list of schemas to scan
context.queryHistoryNoPostgreSQL, Snowflake, BigQueryEnables query-history ingestion when the warehouse supports it
pathYes for path-style SQLiteSQLiteLocal SQLite database path or env:NAME reference
max_bytes_billedNoBigQueryMaximum bytes billed per query job
job_timeout_msNoBigQueryBigQuery query job timeout in milliseconds
project_idNoBigQueryOptional local descriptor and mapping metadata; not used for BigQuery authentication

PostgreSQL

The most full-featured connector. Supports schema introspection, foreign key detection, column statistics, and query history via pg_stat_statements.

Connection config

yamlktx.yaml
connections:
  my-postgres:
    driver: postgres
    url: env:DATABASE_URL
    schema: public

Or with individual fields:

yamlktx.yaml
connections:
  my-postgres:
    driver: postgres
    host: localhost
    port: 5432
    database: analytics
    username: ktx_reader
    password: env:PG_PASSWORD
    schemas:
      - public
      - analytics
    ssl: true

Authentication

MethodConfig
Passwordpassword: env:PG_PASSWORD or password: file:/path/to/secret
Connection URLurl: env:DATABASE_URL
SSLssl: true, optionally rejectUnauthorized: false for self-signed certs

Features

FeatureSupportedNotes
Tables & viewsYesVia pg_catalog
Primary keysYesVia information_schema.table_constraints
Foreign keysYesFull constraint detection
Row count estimatesYesVia pg_class.reltuples
Column statisticsYesRequires pg_read_all_stats role
Query historyYesVia pg_stat_statements extension
Table samplingYesTABLESAMPLE SYSTEM

Query history

PostgreSQL query history mines real query patterns from pg_stat_statements. This helps ktx understand how your team actually queries the data.

Requirements:

  • pg_stat_statements extension enabled
  • pg_read_all_stats role granted to the ktx user

Config options:

    context:
      queryHistory:
        enabled: true
        minExecutions: 5
        filters:
          dropTrivialProbes: true

Dialect notes

  • SQL compilation uses LIMIT/OFFSET pagination
  • Named parameters converted to positional ($1, $2, ...)
  • Supports COUNT(*) FILTER (WHERE ...) for null analysis
  • Full support for PostgreSQL types: uuid, jsonb, timestamptz, numeric, text[], etc.

Snowflake

Connects via the Snowflake SDK. Supports multi-schema scanning, RSA key authentication, and query-history configuration for Snowflake query history.

Connection config

yamlktx.yaml
connections:
  my-snowflake:
    driver: snowflake
    account: xy12345
    warehouse: ANALYTICS_WH
    database: PROD
    schema_name: PUBLIC
    username: KTX_SERVICE
    password: env:SNOWFLAKE_PASSWORD
    role: ANALYST

For multiple schemas:

    schema_names:
      - PUBLIC
      - ANALYTICS
      - STAGING

Authentication

MethodConfig
Passwordpassword: env:SNOWFLAKE_PASSWORD
RSA key pairauthMethod: rsa, privateKey: file:~/.ssh/snowflake_key.pem, optional passphrase

Features

FeatureSupportedNotes
Tables & viewsYesVia INFORMATION_SCHEMA.TABLES
Primary keysYesVia table constraints
Foreign keysNoNot available in Snowflake
Row count estimatesYesFrom INFORMATION_SCHEMA.TABLES.ROW_COUNT
Column statisticsNo-
Query historyYesVia SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY when enabled
Table samplingYes-

Query history

Snowflake query history reads aggregated query-history templates from SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY and feeds the same unified staged artifact shape as Postgres and BigQuery.

    context:
      queryHistory:
        enabled: true
        windowDays: 90
        minExecutions: 5
        filters:
          dropTrivialProbes: true
          serviceAccounts:
            patterns: ['^svc_']
            mode: exclude
        redactionPatterns: []

Dialect notes

  • All identifiers are uppercase by default (case-insensitive matching)
  • Connection context set per query (USE ROLE, USE WAREHOUSE, USE DATABASE, USE SCHEMA)
  • Parameter binding uses positional ? placeholders
  • Date values normalized to ISO 8601 strings

BigQuery

Authenticates via GCP service account credentials. Supports multi-dataset scanning and query-history configuration for INFORMATION_SCHEMA.JOBS_BY_PROJECT.

Connection config

yamlktx.yaml
connections:
  my-bigquery:
    driver: bigquery
    credentials_json: file:~/.config/gcloud/bq-service-account.json
    dataset_id: analytics
    location: US

For multiple datasets:

    dataset_ids:
      - analytics
      - marketing
      - finance

Authentication

MethodConfig
Service account JSONcredentials_json: file:/path/to/key.json
Environment variablecredentials_json: env:BIGQUERY_CREDENTIALS_JSON

The project ID is extracted automatically from the service account JSON file. If you set project_id in ktx.yaml, ktx treats it as local descriptor and mapping metadata. The BigQuery connector still authenticates with the project_id inside credentials_json.

Features

FeatureSupportedNotes
Tables & viewsYesIncluding materialized views and external tables
Primary keysYesVia INFORMATION_SCHEMA table constraints when declared
Foreign keysNoNot available in BigQuery
Row count estimatesYesFrom table metadata
Column statisticsNo-
Query historyYesVia region-scoped INFORMATION_SCHEMA.JOBS_BY_PROJECT when enabled
Table samplingYes-

Query history

BigQuery query history reads aggregated query-history templates from region-scoped INFORMATION_SCHEMA.JOBS_BY_PROJECT and feeds the same unified staged artifact shape as Postgres and Snowflake.

    context:
      queryHistory:
        enabled: true
        windowDays: 90
        minExecutions: 5
        filters:
          dropTrivialProbes: true
          serviceAccounts:
            patterns: ['@bot\\.']
            mode: exclude
        redactionPatterns: []

Dialect notes

  • Parameter binding uses named @param syntax
  • Arrays flattened to comma-separated strings in results
  • Location specified at query execution time
  • Supports max_bytes_billed and job_timeout_ms limits from ktx.yaml

MySQL

Standard MySQL/MariaDB connector with full foreign key support and schema introspection.

Connection config

yamlktx.yaml
connections:
  my-mysql:
    driver: mysql
    url: env:MYSQL_DATABASE_URL

Or with individual fields:

yamlktx.yaml
connections:
  my-mysql:
    driver: mysql
    host: mysql.internal
    port: 3306
    database: analytics
    username: ktx_reader
    password: env:MYSQL_PASSWORD
    ssl: true

Authentication

MethodConfig
Passwordpassword: env:MYSQL_PASSWORD or password: file:/path/to/secret
SSLssl: true or ssl: { rejectUnauthorized: false }
URL parameters?ssl=true or ?sslmode=required in connection URL

Features

FeatureSupportedNotes
Tables & viewsYesVia INFORMATION_SCHEMA.TABLES
Primary keysYesVia KEY_COLUMN_USAGE
Foreign keysYesVia REFERENTIAL_CONSTRAINTS
Row count estimatesYesFrom TABLE_ROWS (InnoDB estimate)
Column statisticsNo-
Query historyNo-
Table samplingYesUses RAND() filter

Dialect notes

  • Parameter binding uses positional ? placeholders
  • Uses LIMIT X OFFSET Y for pagination
  • Single database per connection (no multi-schema)
  • Supports 20+ MySQL types including enum, json, datetime, decimal
  • Table comments extracted with InnoDB metadata prefix stripping

SQL Server

Connects to Microsoft SQL Server and Azure SQL. Supports multi-schema scanning with dbo as the default schema.

Connection config

yamlktx.yaml
connections:
  my-sqlserver:
    driver: sqlserver
    url: env:SQLSERVER_DATABASE_URL

Or with individual fields:

yamlktx.yaml
connections:
  my-sqlserver:
    driver: sqlserver
    host: sql.internal
    port: 1433
    database: Analytics
    username: ktx_reader
    password: env:MSSQL_PASSWORD
    schema: dbo
    trustServerCertificate: true

For multiple schemas:

    schemas:
      - dbo
      - analytics
      - staging

Authentication

MethodConfig
SQL Server authusername + password
Encrypted connectionAlways enabled, trustServerCertificate: true for self-signed

Features

FeatureSupportedNotes
Tables & viewsYesVia INFORMATION_SCHEMA.TABLES
Primary keysYesVia TABLE_CONSTRAINTS and KEY_COLUMN_USAGE
Foreign keysYesVia REFERENTIAL_CONSTRAINTS
Row count estimatesYesVia sys.dm_db_partition_stats
Column statisticsNo-
Query historyNo-
Table samplingYes-
Nested analysisNo-

Dialect notes

  • Parameter binding uses @paramName syntax
  • Row limiting uses SELECT TOP N * FROM (query) AS ktx_query_result
  • Encryption is always required; certificate validation is optional
  • Multi-schema support with per-schema isolation

SQLite

File-based connector using better-sqlite3. Ideal for local development, embedded analytics, or testing.

Connection config

yamlktx.yaml
connections:
  my-sqlite:
    driver: sqlite
    path: ./data/warehouse.sqlite

Path supports multiple formats:

# Relative path (resolved against project directory)
path: ./warehouse.sqlite

# Absolute path
path: /var/data/analytics.db

# Home directory expansion
path: ~/data/warehouse.sqlite

# Environment variable
path: env:SQLITE_DB_PATH

# URL format
url: sqlite:///path/to/db.sqlite

Authentication

No authentication required - SQLite is file-based. The file must be readable by the process running ktx.

Features

FeatureSupportedNotes
Tables & viewsYesVia sqlite_master
Primary keysYesVia PRAGMA table_info()
Foreign keysYesVia PRAGMA foreign_key_list() (requires PRAGMA foreign_keys = ON)
Row count estimatesYesExact count via SELECT COUNT(*)
Column statisticsNo-
Query historyNo-
Table samplingYes-
Nested analysisNo-

Dialect notes

  • Synchronous query execution (no connection pooling)
  • Parameter binding uses :paramName syntax
  • Uses LIMIT X OFFSET Y for pagination
  • SQLite type affinity system: TEXT, NUMERIC, INTEGER, REAL, BLOB
  • Foreign key enforcement requires explicit PRAGMA foreign_keys = ON
  • Database file must exist before ktx connection test or ingest runs

Common errors

Error or symptomLikely causeRecovery
Connection URL appears in git diffA literal credential URL was written to ktx.yamlReplace it with env:NAME or file:/path/to/secret and rotate exposed credentials
Database ingest returns no tablesSchema, database, or project filter is wrong, or the user lacks metadata permissionsVerify the schema list and grant metadata read permissions
Query history is emptyQuery history extension or warehouse history view is unavailableEnable the warehouse-specific history feature, then rerun ktx ingest <connectionId> --query-history or ktx setup
Column statistics are missingConnector cannot access stats tables or the warehouse does not expose themGrant stats permissions where supported; otherwise rely on fast schema context
Semantic query execution failsConnection is missing, unreachable, or query execution is disabledRun ktx connection test <id> and check the ktx sl query flags