Changelog

[2.1.0] - unreleased

Upgrading from 2.0.0

Two components were renamed and every skill moved. Both renames break callers by name, and nothing warns you — an old name simply resolves to nothing.

Rename graph-explore to graph-data-explore wherever you invoke it: /graph-explore becomes /graph-data-explore.
Rename graph-analyst to graph-data-analyst in any request or configuration that names the agent.
Update any pinned path of the form skills/<name>/SKILL.md. All eighteen skills now sit one level deeper, under a group directory: skills/<group>/<name>/SKILL.md. The eight that existed in 2.0.0 moved to skills/exploration/explore-data/, skills/statistical/ab-test/, skills/statistical/analyze-correlations/, skills/modeling/cluster-analysis/, skills/modeling/forecast/, skills/modeling/regression/, skills/graph-data/graph-data-explore/, and skills/workflow/data-pipeline/.
Reload the plugin so the moved directories are rescanned. Confirm with claude --plugin-dir <path> plugin details localdata-mcp, which should report 18 skills and 11 agents.

Nothing else in the tool surface changed: all 53 MCP tool names, parameters, and return shapes are as they were in 2.0.0. A client that uses only the MCP server, without the skills and agents, needs no action.

Removed

The skill name graph-explore and the agent name graph-analyst. Both are renames, not deletions of function — see Upgrading above for the replacements.
The flat skills/<name>/SKILL.md layout, replaced by skills/<group>/<name>/SKILL.md.

Added

Ten skills: data-quality, find-reference-data, anomaly-detection, dimensionality-reduction, geospatial, optimization, hypothesis-test, sampling-estimation, process-control, research-pipeline. The plugin now ships 18.
Five agents: data-researcher, data-scientist, geospatial-analyst, operations-analyst, research-analyst. The plugin now ships 11.
server.json describing the package for the official MCP registry, plus the mcp-name marker in README.
Docker Hub publish workflow and MCP metadata label on the image.
Kaggle demo: an RFM replication notebook over the UCI Online Retail dataset, with a Google Colab variant.
uv.lock, committed so development environments are reproducible.
docs/plugin.md, documenting every shipped skill and agent.

Changed

Licence migrated from MIT to Apache 2.0. LICENSE carries the Apache 2.0 text, NOTICE carries the attribution notice Apache 2.0 requires, and pyproject.toml declares license = "Apache-2.0" as a PEP 639 expression.
Breaking: skills are grouped into five domain directories (exploration, statistical, modeling, graph-data, workflow). All eight 2.0.0 skills moved, and graph-explore is now graph-data-explore. See Upgrading above.
Breaking: graph-analyst is now graph-data-analyst. See Upgrading above.
statistical-analyst covers sampling design and estimation alongside hypothesis testing.
PyPI metadata: added AI and data-science classifiers and cross-platform keywords.
Version raised to 2.1.0 in pyproject.toml, .claude-plugin/plugin.json, and server.json.

Fixed

Performance benchmarks workflow was invalid YAML and had never executed. Embedded Python scripts written at column 0 terminated the enclosing run: block scalar, so GitHub could not parse the file and every run completed with zero jobs.
Performance benchmarks workflow installed dependencies from requirements.txt and requirements-dev.txt, which no longer exist; it now installs from pyproject.toml. Deprecated actions were updated and the Python matrix aligned with CI.
Benchmark harness called token estimation with the wrong parameter name.
Retry-logic test counted sleeps raised on other threads and failed intermittently.
Plugin manifest and contributing guide still referenced the MIT licence after the migration to Apache 2.0. The License :: OSI Approved :: MIT License classifier was removed, since a PEP 639 license expression supersedes it.
README documented seven graph tools that do not exist (get_node_graph, set_node_graph, delete_node_graph, get_value_graph, set_value_graph, delete_key_graph, list_keys_graph). Graph node operations are served by the tree tools, which read path as a node ID for graph connections.
README named four environment variables the server never reads (LOCALDATA_MAX_CONNECTIONS, LOCALDATA_CHUNK_SIZE, LOCALDATA_BUFFER_TTL, LOCALDATA_WORKING_DIR) and gave a chunk-size default of 500 where the code uses 100.
README linked DATABASE_CONNECTIONS.md and FAQ.md, both deleted before 2.0.0.
CONTRIBUTING.md, TROUBLESHOOTING.md, and docs/getting-started.md told contributors to run uv sync --dev. The dev tools are a project extra, not a uv dependency group, so that command uninstalls pytest, black, and mypy instead of installing them. Corrected to uv sync --extra dev.
Tool counts and export-format lists across README and docs/tools-reference.md now match the registered tools; get_metrics is documented.

[2.0.0] - 2026-04-06

Added

Data science domain modules with sklearn-compatible transformers:
- Statistical analysis: hypothesis testing, ANOVA, effect sizes
- Regression modeling: linear, regularized, polynomial with diagnostics
- Pattern recognition: clustering, anomaly detection, dimensionality reduction
- Time series: decomposition, ARIMA/ETS forecasting, change detection
- Business intelligence: RFM segmentation, cohort analysis, A/B testing
- Optimization: linear programming, constrained optimization, network analysis
- Sampling & estimation: bootstrap, Bayesian estimation, Monte Carlo simulation
- Geospatial analysis: spatial joins, routing, clustering (optional geopandas)
Pipeline framework: sklearn-compatible data science pipeline with streaming support
12 new MCP tools for data science operations (52 total)
Integration shim layer for domain interoperability
Graph format support: DOT, GML, GraphML, Mermaid
RDF/SPARQL support: Turtle, N-Triples files and remote SPARQL endpoints
Memory-aware query execution with 3-path decision flow (in-memory, staging, refinement)
Graceful abort handler for streaming with partial result return
Claude Code plugin structure: 8 skills, 6 agents, plugin manifest
Comprehensive documentation: tools reference, data sources, 8 domain guides

Changed

Added scipy, scikit-learn, statsmodels, numpy as core dependencies
Added networkx, pydot, rdflib, SPARQLWrapper as core dependencies
Geospatial imports guarded with try/except for optional geopandas dependency
README rewritten for v2.0 (663 lines to 261 lines)
Replaced fixed 100-row query threshold with dynamic memory-based threshold
Refactored oversized modules into sub-packages for maintainability

[1.7.1] - 2026-04-02

Added

Enterprise-scale integration testing with NYC Taxi dataset (100K rows per database)
62 enterprise tests across 7 databases: PostgreSQL, MySQL, MSSQL, Oracle, SQLite, MongoDB, Elasticsearch
Oracle integration tests via oracledb thin mode (no Oracle Client required)
Automated integration test runner (scripts/run_integration_tests.py) with Docker lifecycle management
Public dataset download script (scripts/download_test_datasets.py) for NYC Taxi, Ames Housing, World Bank
Test fixture file generator (scripts/generate_test_files.py) for all supported formats
InfluxDB, Neo4j, CouchDB integration test suites and Docker services

Fixed

Pydantic v1 to v2 migration for configuration validators
Oracle SQLAlchemy connection string: use ?service_name= format for thin mode
MSSQL bulk loader: switch to executemany (2100 parameter limit prevents multi-row INSERT)
Oracle bulk loader: switch to executemany (multi-row INSERT VALUES not supported)
Enterprise parquet reader: stream via iter_batches to avoid loading 9.5M rows into memory
Enterprise fixture: read dataset once and pass to all database loaders
Various test assertion fixes for MCP response shapes, timeout manager mocks, security event fields

Changed

README updated: added Oracle, MSSQL, RDF/SPARQL to supported databases; documented 18 missing tools; removed 6 dead tool references; updated test stats and examples
Version references updated from v1.3.1 to v1.7.0 across API_REFERENCE.md, ARCHITECTURE.md, CONFIGURATION.md, ADVANCED_EXAMPLES.md
Test count: 1,621 unit tests + 234 integration tests + 62 enterprise tests

[1.7.0] - 2026-03-28

Added

Regex Search & Transform Tools

search_data MCP tool: search query results for regex patterns with column filtering, case sensitivity, and match position tracking
transform_data MCP tool: apply regex find/replace to query result columns with capture group support and change sampling
ReDoS (Regular Expression Denial of Service) prevention via sre_parse AST analysis detecting nested quantifiers and dangerous patterns
Pattern complexity scoring with configurable limits (200 char max, 10 groups max)
Timeout-protected execution (5 second default) via ThreadPoolExecutor

Enhanced Graph & Tree Export

Graph hierarchy export (style="hierarchy"): renders DAGs as indented trees with multi-parent annotations (also child of: X)
Graph detailed export (style="detailed"): full node property sections with incoming/outgoing edge tables
Graph adjacency list export (style="adjacency"): compact A -> B [label] format for token efficiency
Tree export with path breadcrumbs (include_path=True): shows root > parent > child navigation context
Cycle detection with automatic fallback to adjacency list for non-DAG graphs

Schema Export

export_schema MCP tool with four output formats:
- JSON Schema with type mapping, required fields, maxLength, x-primary-key, x-foreign-keys, x-indexes
- Python dataclass definitions with Optional fields for nullable columns
- TypeScript interfaces with optional property markers
- SQL DDL with CREATE TABLE, PRIMARY KEY, FOREIGN KEY, CREATE INDEX statements
SchemaIntrospector class using SQLAlchemy Inspector for cross-database schema extraction

Query Audit Log

get_query_log MCP tool: retrieve recent query execution history with database, status, and time filters
get_error_log MCP tool: filtered view of errors and timeouts with structured classification
In-memory ring buffer (configurable max 1000 entries) with thread-safe deque
Automatic audit recording integrated into execute_query with timing and error capture
Query hash generation for deduplication statistics

Readonly Mode

security.readonly: true configuration option that hard-blocks write-disguised-as-read SQL
Detects SELECT INTO, CREATE TABLE AS SELECT, COPY TO, INSERT ALL, MERGE INTO, OUTPUT INTO
Database-specific pattern coverage for Oracle, MS SQL, PostgreSQL, MySQL

[1.6.0] - 2026-03-28

Added

Enterprise Database Support

Oracle Database support via oracledb driver with password, Oracle Wallet, Kerberos, and certificate authentication
MS SQL Server support via pymssql/pyodbc with SQL Auth, Windows Integrated Auth, Azure AD, Kerberos, and certificate authentication
Enterprise authentication framework (auth parameter on connect_database) with support matrix per database type
Oracle and MSSQL error mappers with ORA-XXXXX and Msg/Severity code classification

Configuration System Upgrade

OS-aware configuration paths: XDG on Linux, ~/Library/Application Support on macOS, %APPDATA% on Windows
Legacy ~/.localdata.yaml path deprecated with migration support
New config sections: staging, memory, query, connections, security, disk_budget
Environment variable overrides for all new config sections (LOCALDATA_STAGING_*, LOCALDATA_MEMORY_*, etc.)
CLI flags: --config, --version, --migrate-config, --force, --validate-config, --show-config, --init-config
Default config file creation with create_default_config()

Structured Error Classification

StructuredErrorResponse with error_type, is_retryable, message, suggestion for LLM agents
Database-specific error mappers for SQLite, PostgreSQL, MySQL, DuckDB, Oracle, MS SQL
Error mapper registry with pluggable architecture for custom databases
Structured errors integrated into execute_query, connect_database, and streaming executor
Helper functions: classify_error(), is_error_retryable(), get_error_suggestion()

Query Execution Intelligence

Size estimation engine combining column type metadata with EXPLAIN row estimates
EXPLAIN parsers for SQLite, PostgreSQL, MySQL, Oracle (DBMS_XPLAN), MS SQL (SHOWPLAN_XML)
Pre-flight query estimation via preflight=True parameter on execute_query
Memory-aware execution with MemoryBudget auto-calculated from available RAM (10%, max 512MB)
Aggressive mode when RAM drops below 1GB (5%, max 128MB)
BLOB column detection with placeholder handling ([BLOB: size, mime_type]) and opt-in base64 encoding

Staging and Disk Management

Staging database manager with LRU eviction for large query results
Disk space monitoring during streaming with configurable thresholds
Graceful abort with partial results when disk limits are reached
include_staging parameter for list_databases to show staging databases
Cascade cleanup of staging databases on disconnect_database

Markdown Export

Markdown table export for query results with alignment and truncation
Tree/structured data export as heading hierarchy
Graph export with summary stats, node/edge tables, and Mermaid diagram embedding
Registered as markdown/md format in export_structured and export_graph

Documentation and Testing

Comprehensive configuration reference (docs/configuration.md)
Error classification reference (docs/error-classification.md)
Docker-based integration testing infrastructure (docker-compose.test.yml)
663 tests across 26 test files

[1.5.2] - 2026-03-26

Added

Validation warnings on graph import and edit operations

[1.5.1] - 2026-03-26

Added

Mermaid flowchart parser support for graph import
Mermaid export support with round-trip fidelity
Main README section documenting graph file, Mermaid, and SPARQL support

Fixed

Include node and edge metadata in graph exports
Six graph bugs found during code review
Mermaid parsing and export edge cases
File and function size limit compliance for graph modules

[1.5.0] - 2026-03-24

Added

Graph file format support: DOT, GML, GraphML
RDF format support: Turtle and N-Triples parsing
GraphStorageManager and RDFStorageManager for structured graph persistence
SPARQL query endpoint for RDF data
Graph-specific MCP tools for node, edge, path, and statistics operations
RDF export and describe_database support for graph connections
End-to-end integration tests for graph/RDF/SPARQL pipeline
move_node tool to relocate subtrees in tree storage

Fixed

Return dicts from tree tools to prevent double JSON serialization
Deserialize string values when explicit value_type is provided

[1.4.2] - 2026-03-23

Fixed

Use StaticPool for in-memory SQLite to fix cross-thread access errors
Clean up tree managers on disconnect to prevent resource leaks

[1.4.0] - 2026-03-23

Added

Tree storage model, manager, and hierarchical data persistence
Tree parsers for TOML, JSON, and YAML formats
MCP tool functions for tree navigation and mutation (get_node, set_node, get_children, delete_node)
Tree export to TOML, JSON, and YAML formats
Integration of tree storage with connect_database and tool registration
Data summary returned on connect_database
Comprehensive tests for tree tools, parsers, and export

Fixed

Migrate to fastmcp v3 API for tool registration
LogContext now accepts arbitrary keyword arguments
Row-normalise deeply nested TOML/YAML instead of column explosion

[1.3.1] - 2025-08-31

Added

SQL query parser and safety validator
QueryAnalyzer class with pre-query analysis system
ConfigManager class with pydantic-based configuration
StreamingQueryExecutor for memory-bounded streaming pipeline
QueryTimeoutManager for advanced timeout management
Streaming file processors integrated into file engine creation
TokenManager class with DataFrame-based token estimation
EnhancedConnectionManager with connection pooling and health monitoring
SecurityManager class with OWASP-aligned security patterns
Enhanced response metadata and LLM communication protocol with progressive data loading
Error handling system with custom exceptions, retry mechanisms, and circuit breaker pattern
Enhanced database manager with integrated error handling and recovery strategies
Structured logging system with structlog, Prometheus metrics, and JSON logging
CompatibilityManager for backward compatibility with v1.3.0
Performance benchmarking suite with CI integration
Comprehensive documentation: architecture, configuration, API reference, migration guide

Changed

Clean tool surface for production release (breaking: removed deprecated tool aliases)
Major cleanup and reorganization of project structure
README restructured with table of contents

Fixed

Environment variable names in migration script generation
Boolean column categorization and test data alignment
TokenManager integration robustness
DataFrame boolean ambiguity in return statements
Function abuse attack pattern regex
Leaked credential detection in documentation examples

[1.3.0] - 2025-08-30

Added

Modern database support: Redis, Elasticsearch, MongoDB, InfluxDB, Neo4j, CouchDB
DuckDB support for analytical SQL queries
HDF5 format support for scientific data
Apple Numbers (.numbers) document support
Docker containerization with complete docker-compose setup
Comprehensive test coverage improvements (68% coverage)

Changed

Updated documentation to reflect actual test coverage (68% vs previous claims)
Enhanced README with modern database features and Docker support
Improved error handling and graceful degradation for optional dependencies

Fixed

ODS import detection with proper library imports
JSON/TOML serialization for complex nested data structures
INI file parsing with special character handling
InfluxDB service configuration for proper startup

[1.2.0] - 2025-08-29

Added

Multi-sheet spreadsheet support (Excel .xlsx/.xls, LibreOffice .ods)
Enhanced file format support: XML, INI, TSV
Analytical formats: Parquet, Feather, Arrow
Large file handling with automatic SQLite conversion (100MB+ files)
Query buffering system for large result sets (100+ rows)
Advanced security features and input validation

Changed

Improved performance for large datasets
Enhanced error messages and validation
Better memory management and resource cleanup

Fixed

Path traversal security issues
SQL injection prevention
Connection limit enforcement

[1.1.0] - 2025-08-23

Added

Enhanced spreadsheet format support
Improved file type detection
Better error handling for malformed files

Changed

Performance optimizations for file processing
Updated documentation and examples

[1.0.3] - 2025-08-23

Fixed

Minor bug fixes and stability improvements
Documentation updates

[1.0.2] - 2025-08-23

Fixed

Package dependencies and compatibility issues
Installation process improvements

[1.0.1] - 2025-08-23

Fixed

Initial release bug fixes
Documentation corrections

[1.0.0] - 2025-08-23

Added

Initial release of LocalData MCP Server
Basic database connectivity (PostgreSQL, MySQL, SQLite)
File format support (CSV, JSON, YAML)
Core MCP protocol implementation
Security features and input validation