Configuration
LocalData MCP supports configuration through YAML files, environment variables, or a combination of both. Values are resolved in layers: built-in defaults are loaded first, then YAML files are merged on top, and environment variables override everything.
Config file locations
The server searches for configuration files in the following order and merges them together (later layers override earlier ones):
Priority |
Path |
Platform |
Description |
|---|---|---|---|
0 (highest) |
|
All |
Explicit env var override |
1 |
|
All |
Project-local config |
2 |
|
Linux, macOS |
XDG user config |
3 |
|
macOS |
macOS user config |
2 |
|
Windows |
Windows user config |
4 |
|
Linux, macOS |
System-wide config |
5 (lowest) |
|
All |
Legacy (deprecated) |
On Linux, $XDG_CONFIG_HOME defaults to ~/.config when unset.
How merging works
When no explicit --config flag is given, the server loads files in two passes:
Global config — the highest-priority user or system file found (XDG, AppData,
/etc, or legacy).Project-local config —
./.localdata.yaml, if present, is deep-merged on top of the global config. Only the keys you set in the project file are overridden; everything else is inherited from the global config.
If you pass --config PATH (or set LOCALDATA_CONFIG), that single file is used and the discovery process is skipped entirely.
Environment variables are applied last and override any value set in YAML.
Using the legacy path (~/.localdata.yaml) emits a deprecation warning at startup. See the Migration guide section below for how to move to the recommended location.
Environment variables
Environment variables override any values set in YAML files. All variables use the LOCALDATA_ prefix.
General
Variable |
Type |
Description |
|---|---|---|
|
|
Path to a specific YAML config file |
|
|
Log level: |
|
|
Path to log output file |
|
|
Overall memory limit in MB (performance section) |
Database connections
Define databases with the pattern LOCALDATA_DB_<NAME>_<PROPERTY>:
Variable pattern |
Type |
Description |
|---|---|---|
|
|
Database type (e.g., |
|
|
Connection string or file path |
|
|
Sheet name for multi-sheet formats |
|
|
Enable or disable ( |
|
|
Connection pool size |
|
|
Connection timeout in seconds |
|
|
Query timeout in seconds |
<NAME> is case-insensitive and converted to lowercase internally. For example, LOCALDATA_DB_MYDB_TYPE=sqlite creates a database named mydb.
Staging
Variable |
Maps to |
Type |
Default |
|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Memory
Variable |
Maps to |
Type |
Default |
|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Query execution
Variable |
Maps to |
Type |
Default |
|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Connections
Variable |
Maps to |
Type |
Default |
|---|---|---|---|
|
|
|
|
|
|
|
|
Security
Variable |
Maps to |
Type |
Default |
|---|---|---|---|
|
|
|
|
|
|
|
|
Disk budget
Variable |
Maps to |
Type |
Default |
|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Aggressive memory
Variable |
Maps to |
Type |
Default |
|---|---|---|---|
|
|
|
|
|
|
|
|
Configuration reference
databases
Each key under databases defines a named data source. Required fields: type and connection_string.
Key |
Type |
Default |
Description |
|---|---|---|---|
|
|
(required) |
One of: |
|
|
(required) |
Connection URI or file path |
|
|
|
Sheet or dataset name (Excel, ODS, Numbers, HDF5) |
|
|
|
Whether this source is active |
|
|
|
Maximum pool size |
|
|
|
Connection timeout in seconds |
|
|
|
Query timeout in seconds |
|
|
|
Arbitrary tags for organizing sources |
|
|
|
Additional key-value metadata |
staging
Controls temporary staging databases used during file processing. When you connect a file-based data source (CSV, Excel, Parquet, etc.), the server creates a temporary SQLite staging database so queries can run against it with SQL. These staging databases are managed automatically.
Key |
Type |
Default |
Description |
|---|---|---|---|
|
|
|
Maximum concurrent staging databases (1-100) |
|
|
|
Maximum size per staging database in MB |
|
|
|
Total disk budget for all staging databases |
|
|
|
Idle timeout before automatic cleanup |
|
|
|
Eviction strategy: |
When the total staging size exceeds max_total_mb, the least-recently-used staging database is evicted automatically (or the oldest, depending on eviction_policy). Use include_staging: true in list_databases to see active staging databases.
disk_budget
Fine-grained disk budget controls for staging databases. These settings complement the staging section with per-row monitoring and system-level headroom checks.
Key |
Type |
Default |
Description |
|---|---|---|---|
|
|
|
Maximum size of a single staging database in MB |
|
|
|
Total disk budget across all staging databases |
|
|
|
Disk usage ratio (0-1) that triggers a warning |
|
|
|
Minimum free disk space to maintain in MB |
|
|
|
Check disk usage every N rows during file import |
memory
Controls memory budgeting for query operations.
Key |
Type |
Default |
Description |
|---|---|---|---|
|
|
|
Hard memory budget cap in MB |
|
|
|
Percentage of system RAM to use (1-100) |
|
|
|
Free memory threshold that triggers conservation mode |
|
|
|
RAM percentage used in aggressive conservation mode |
|
|
|
Hard cap in MB during aggressive conservation |
query
Controls query execution behavior.
Key |
Type |
Default |
Description |
|---|---|---|---|
|
|
|
Number of rows per streamed chunk |
|
|
|
How long buffered results are kept (seconds) |
|
|
|
BLOB column handling: |
|
|
|
Maximum BLOB size to inline when handling is |
|
|
|
Run query analysis before execution by default |
connections
Controls global connection limits.
Key |
Type |
Default |
Description |
|---|---|---|---|
|
|
|
Maximum concurrent database connections |
|
|
|
Default connection timeout in seconds |
security
Controls query and path security.
Key |
Type |
Default |
Description |
|---|---|---|---|
|
|
|
When enabled, file access is limited to |
|
|
|
Directories the server can access (only enforced when |
|
|
|
Block all write-oriented SQL operations (see Readonly mode section below) |
|
|
|
Maximum allowed SQL query length in characters |
|
|
|
SQL keywords to reject (e.g., |
logging
Controls log output.
Key |
Type |
Default |
Description |
|---|---|---|---|
|
|
|
Log level: |
|
|
|
Python log format string |
|
|
|
Path to write log files |
|
|
|
Maximum log file size in bytes (10 MB) |
|
|
|
Number of rotated log files to keep |
|
|
|
Print logs to stdout |
performance
Controls resource limits and tuning.
Key |
Type |
Default |
Description |
|---|---|---|---|
|
|
|
Overall memory limit in MB |
|
|
|
Buffer timeout in seconds |
|
|
|
Maximum concurrent connections |
|
|
|
Default chunk size for streaming |
|
|
|
Enable automatic query analysis |
|
|
|
Automatically clean up expired buffers |
|
|
|
Memory usage ratio that triggers warnings (0-1) |
Enterprise authentication
The connect_database tool accepts an optional auth parameter (JSON string) to configure authentication for enterprise database backends. The auth object must include a method field and any method-specific parameters.
Supported methods by database type
Database |
|
|
|
|
|
|
|---|---|---|---|---|---|---|
Oracle |
yes |
yes |
yes |
yes |
||
MSSQL |
yes |
yes |
yes |
yes |
yes |
|
PostgreSQL |
yes |
yes |
yes |
|||
MySQL |
yes |
yes |
yes |
|||
SQLite |
||||||
DuckDB |
Auth parameter examples
Password (default):
{"method": "password"}
No extra parameters needed; credentials are taken from the connection string.
Oracle Wallet:
{"method": "wallet", "wallet_path": "/opt/oracle/wallet"}
Kerberos:
{"method": "kerberos"}
Requires a valid Kerberos ticket in the environment. Oracle Kerberos also requires the Oracle Client to be installed.
Windows Trusted (MSSQL):
{"method": "trusted"}
Uses Windows Integrated Authentication (NTLM/SSPI).
Azure AD (MSSQL):
{"method": "azure_ad", "token": "<access-token>"}
Certificate:
{"method": "certificate", "cert_path": "/path/to/client.crt", "key_path": "/path/to/client.key"}
BLOB handling
By default, BLOB (binary large object) columns are excluded from query results to keep responses compact. You can control this per-query with the include_blobs parameter on execute_query, or set a server-wide default via the query.blob_handling config key.
Modes
Mode |
Behavior |
|---|---|
|
BLOB columns are omitted from results |
|
BLOBs are replaced with an informative placeholder: |
|
Small BLOBs (up to |
Per-query override
Pass include_blobs=true to execute_query to base64-encode BLOBs in that single query, regardless of the server-wide setting:
execute_query(name="mydb", query="SELECT * FROM documents", include_blobs=true)
Preflight query estimation
Before executing a potentially expensive query, you can request an estimate by passing preflight=true to execute_query. The server runs EXPLAIN (without executing the query) and returns a cost estimate.
Response format
{
"estimated_rows": 150000,
"estimated_size_bytes": 31457280,
"estimated_size_mb": 30.0,
"scan_type": "full_table_scan",
"confidence": 0.85,
"suggestion": "Consider adding a WHERE clause or LIMIT to reduce result size."
}
Field |
Type |
Description |
|---|---|---|
|
|
Estimated number of result rows |
|
|
Estimated total result size in bytes |
|
|
Same estimate in megabytes |
|
|
Query plan scan type (e.g., |
|
|
Confidence score for the estimate (0-1) |
|
|
Human-readable recommendation |
Set query.preflight_default: true in your config to enable preflight estimation for all queries by default.
YAML variable substitution
YAML config files support environment variable substitution with the ${VAR} syntax. You can provide a default value with ${VAR:default}:
databases:
production:
type: postgresql
connection_string: postgresql://${DB_USER:admin}:${DB_PASSWORD}@${DB_HOST:localhost}:5432/${DB_NAME:mydb}
If the variable is unset and no default is given, the literal ${VAR} string is preserved.
Example configurations
Minimal
A single SQLite database with all other settings left at defaults:
databases:
main:
type: sqlite
connection_string: ./data/app.db
Development
SQLite for local work and a PostgreSQL instance for shared data, with debug logging enabled:
databases:
local:
type: sqlite
connection_string: ./dev.db
shared:
type: postgresql
connection_string: postgresql://dev:dev@localhost:5432/devdb
logging:
level: debug
console_output: true
performance:
memory_limit_mb: 1024
chunk_size: 50
Oracle
Connect to an Oracle database. Requires the enterprise extra (pip install localdata-mcp[enterprise]).
databases:
# Basic password auth via connection string
finance:
type: oracle
connection_string: oracle+oracledb://${ORA_USER}:${ORA_PASS}@dbhost:1521/FINDB
# TNS alias (requires TNS_ADMIN env var)
finance_tns:
type: oracle
connection_string: oracle+oracledb://${ORA_USER}:${ORA_PASS}@FINDB_ALIAS
# Oracle Wallet (passwordless)
finance_wallet:
type: oracle
connection_string: oracle+oracledb:///@FINDB_ALIAS
metadata:
auth:
method: wallet
wallet_path: /opt/oracle/wallet
# Kerberos (requires Oracle Client)
finance_krb:
type: oracle
connection_string: oracle+oracledb:///@FINDB_ALIAS
metadata:
auth:
method: kerberos
# Mutual TLS
finance_cert:
type: oracle
connection_string: oracle+oracledb:///@FINDB_ALIAS
metadata:
auth:
method: certificate
cert_path: /etc/ssl/client.crt
key_path: /etc/ssl/client.key
Supported auth.method values for Oracle: password (default), wallet, kerberos, certificate.
MS SQL Server
Connect to a SQL Server instance. Requires the enterprise extra (pip install localdata-mcp[enterprise]).
databases:
# Password auth (pymssql driver)
erp:
type: mssql
connection_string: mssql+pymssql://${MSSQL_USER}:${MSSQL_PASS}@sqlserver.internal/erp_db
max_connections: 15
query_timeout: 120
# Windows Integrated Authentication
erp_trusted:
type: mssql
connection_string: mssql+pyodbc://sqlserver.internal/erp_db?driver=ODBC+Driver+18+for+SQL+Server
metadata:
auth:
method: trusted
# Kerberos
erp_kerberos:
type: mssql
connection_string: mssql+pyodbc://sqlserver.internal/erp_db
metadata:
auth:
method: kerberos
# Azure AD token
erp_azure:
type: mssql
connection_string: mssql+pyodbc://sqlserver.database.windows.net/erp_db
metadata:
auth:
method: azure_ad
token: ${AZURE_AD_TOKEN}
# Mutual TLS
erp_cert:
type: mssql
connection_string: mssql+pyodbc://sqlserver.internal/erp_db
metadata:
auth:
method: certificate
cert_path: /etc/ssl/client.crt
key_path: /etc/ssl/client.key
Supported auth.method values for MSSQL: password (default), trusted, azure_ad, kerberos, certificate.
Production
Multiple databases with tuned resource limits:
databases:
analytics:
type: postgresql
connection_string: postgresql://${PG_USER}:${PG_PASS}@db.internal:5432/analytics
max_connections: 20
query_timeout: 600
warehouse:
type: duckdb
connection_string: /data/warehouse.duckdb
reports:
type: excel
connection_string: /data/monthly_reports.xlsx
sheet_name: Summary
staging:
max_concurrent: 20
max_size_mb: 4096
max_total_mb: 20480
timeout_minutes: 60
eviction_policy: lru
memory:
max_budget_mb: 1024
budget_percent: 15
low_memory_threshold_gb: 2.0
query:
default_chunk_size: 500
buffer_timeout_seconds: 1200
blob_handling: placeholder
preflight_default: true
connections:
max_concurrent: 25
timeout_seconds: 60
security:
allowed_paths: ["/data", "/reports"]
max_query_length: 50000
blocked_keywords: ["DROP", "TRUNCATE", "ALTER"]
logging:
level: warning
file_path: /var/log/localdata-mcp/server.log
performance:
memory_limit_mb: 4096
memory_warning_threshold: 0.90
Readonly mode
When security.readonly is set to true, the server rejects any SQL statement that would write data as a side effect. This goes beyond the standard blocked_keywords list and catches patterns that embed writes inside otherwise valid SELECT statements:
Blocked pattern |
Example |
|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Standard DML (INSERT, UPDATE, DELETE, DROP, CREATE, ALTER, TRUNCATE) is already blocked by the query validator regardless of this setting.
Tools reference (v1.7.0)
The following tools were added in v1.7.0.
search_data
Search query results using a regular expression. Runs the given SQL query, then matches pattern against the specified columns (or all columns if none are specified).
Parameter |
Type |
Required |
Description |
|---|---|---|---|
|
|
yes |
Connected database name |
|
|
yes |
SQL query to execute |
|
|
yes |
Regular expression to match |
|
|
no |
Comma-separated column names to search (default: all) |
|
|
no |
Maximum number of matches to return |
transform_data
Apply a regex find-and-replace to a column in query results. Executes the SQL query, transforms matching values in the specified column, and returns the modified data.
Parameter |
Type |
Required |
Description |
|---|---|---|---|
|
|
yes |
Connected database name |
|
|
yes |
SQL query to execute |
|
|
yes |
Column to transform |
|
|
yes |
Regex pattern to find |
|
|
yes |
Replacement string (supports backreferences) |
|
|
no |
Maximum rows to process |
export_schema
Export database schema in a structured format. Useful for generating type definitions or documentation from a live database.
Parameter |
Type |
Required |
Description |
|---|---|---|---|
|
|
yes |
Connected database name |
|
|
no |
Comma-separated table names (default: all tables) |
|
|
no |
Output format: |
get_query_log
Retrieve the recent query execution history for auditing or debugging.
Parameter |
Type |
Required |
Description |
|---|---|---|---|
|
|
no |
Filter by database name |
|
|
no |
Maximum entries to return (default: 50) |
|
|
no |
Filter by status (e.g., |
|
|
no |
Only include entries from the last N minutes (default: 60) |
get_error_log
Retrieve the recent error history.
Parameter |
Type |
Required |
Description |
|---|---|---|---|
|
|
no |
Filter by database name |
|
|
no |
Maximum entries to return (default: 50) |
|
|
no |
Only include entries from the last N minutes (default: 60) |
Graph export formats
The export_graph tool supports a style parameter for markdown output. In addition to the default summary style, the following styles are available:
Style |
Description |
|---|---|
|
Default. Compact overview with node and edge counts, plus a sample of each. |
|
Indented tree layout rooted at source nodes. Suitable for DAGs and organizational charts. |
|
Compact adjacency-list format: each node followed by its outgoing neighbors. |
|
Full per-node listing with all attributes and edge details. Respects |
Migration guide
If you have a config file at the legacy location (~/.localdata.yaml), migrate it to the platform-appropriate path.
Automatic migration
Run:
localdata-mcp --migrate-config
This will:
Copy
~/.localdata.yamlto the recommended path (e.g.,~/.config/localdata/config.yamlon Linux)Create a backup at
~/.localdata.yaml.bakCreate any necessary parent directories
If a file already exists at the destination, the command exits with an error. Use --force to overwrite:
localdata-mcp --migrate-config --force
Manual migration
Find the recommended path for your OS:
Linux:
~/.config/localdata/config.yaml(or$XDG_CONFIG_HOME/localdata/config.yaml)macOS:
~/.config/localdata/config.yamlor~/Library/Application Support/localdata/config.yamlWindows:
%APPDATA%\localdata\config.yaml
Create the directory and copy:
mkdir -p ~/.config/localdata cp ~/.localdata.yaml ~/.config/localdata/config.yaml
Verify the server starts without deprecation warnings.
Optionally remove the old file once you have confirmed the new location works.
CLI flags
Flag |
Description |
|---|---|
|
Use a specific config file (takes highest precedence) |
|
Print the server version and exit |
|
Migrate |
|
Force overwrite if the destination file already exists during migration |
|
Load configuration, validate it, and exit with status 0 (valid) or 1 (errors) |
|
Print the resolved configuration as YAML (connection strings and tokens are redacted) and exit |
|
Create a default configuration file at the recommended platform path and exit |