# LocalData MCP Tools Reference

Complete reference documentation for all 52 MCP tools. Organized by category for quick navigation.

## Table of Contents

1. Core Database (8 tools)
2. Streaming & Memory (9 tools)
3. Tree/Structured Data (10 tools)
4. Graph Operations (7 tools)
5. Search & Transform (2 tools)
6. Schema & Audit (3 tools)
7. System (1 tool)
8. Data Science (12 tools)

---

## Core Database (8 tools)

### connect_database

Open a connection to a database.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Unique connection identifier (e.g., "analytics_db", "user_data") |
| `db_type` | string | Yes | Database type: sqlite, postgresql, mysql, duckdb, csv, json, yaml, toml, excel, ods, numbers, xml, ini, tsv, parquet, feather, arrow, hdf5, dot, gml, graphml, mermaid, turtle, ntriples, sparql |
| `conn_string` | string | Yes | Connection string or file path |
| `sheet_name` | string | No | Sheet name for Excel/ODS/Numbers or dataset name for HDF5 |
| `auth` | string | No | JSON authentication config (e.g., `{"method": "wallet", "wallet_path": "/path"}`) |

**Returns:** Connection summary with metadata (JSON)

**Example:**
```python
# SQL database
connect_database("mydb", "postgresql", "postgresql://user:pass@localhost/dbname")

# CSV file
connect_database("data", "csv", "/path/to/file.csv")

# Graph file
connect_database("network", "graphml", "/path/to/network.graphml")
```

**Composition hints:** Use with `execute_query`, `describe_database`, or data manipulation tools.

---

### disconnect_database

Close a connection to a database. All connections close automatically on script termination.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name to close |

**Returns:** Success/error message with cleanup details (JSON)

**Example:**
```python
disconnect_database("mydb")
```

**Composition hints:** Call after completing work with a database to free resources.

---

### execute_query

Execute a SQL query and return results as JSON with memory-aware streaming.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Database connection name |
| `query` | string | Yes | SQL query to execute |
| `chunk_size` | integer | No | Results per chunk for pagination |
| `enable_analysis` | boolean | No | Perform pre-query analysis (default: true) |
| `include_blobs` | boolean | No | Base64-encode BLOBs in results (default: false) |
| `preflight` | boolean | No | Run EXPLAIN only (default: false) |

**Returns:** Query results as JSON or streaming metadata (JSON)

**Example:**
```python
execute_query("mydb", "SELECT * FROM users WHERE active = true", chunk_size=100)
```

**Composition hints:** Use with `next_chunk` for large result sets, `get_query_metadata` for analysis.

---

### analyze_query_preview

Analyze a query without executing it to preview resource requirements.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Database connection name |
| `query` | string | Yes | SQL query to analyze |

**Returns:** Analysis including estimated rows, memory, execution time, and risks (JSON)

**Example:**
```python
analyze_query_preview("mydb", "SELECT * FROM large_table JOIN other_table ON ...")
```

**Composition hints:** Use before `execute_query` on complex queries to assess feasibility.

---

### list_databases

List all available database connections with their SQL flavor information.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `include_staging` | boolean | No | Include active staging databases (default: false) |

**Returns:** Array of connection objects with names and types (JSON)

**Example:**
```python
list_databases()
```

**Composition hints:** Use to discover available connections for workflow orchestration.

---

### describe_database

Get detailed information about a database including its schema in JSON format.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Database connection name |

**Returns:** Database schema with tables, columns, types, and relationships (JSON)

**Example:**
```python
describe_database("mydb")
```

**Composition hints:** Use with `find_table` to locate specific tables, or `describe_table` for details.

---

### find_table

Find which database contains a specific table by name.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `table_name` | string | Yes | Name of table to locate |

**Returns:** Connection name containing the table, or error if not found (JSON)

**Example:**
```python
find_table("users")
```

**Composition hints:** Use in multi-database workflows to locate data without knowing connection.

---

### describe_table

Get detailed schema information for a specific table.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Database connection name |
| `table_name` | string | Yes | Name of table to describe |

**Returns:** Table schema with columns, types, constraints, and indexes (JSON)

**Example:**
```python
describe_table("mydb", "users")
```

**Composition hints:** Use before writing queries to understand table structure and column types.

---

## Streaming & Memory (9 tools)

### next_chunk

Retrieve the next chunk of rows from a buffered query result.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `query_id` | string | Yes | ID of buffered query result |
| `start_row` | integer | Yes | Starting row number (1-based) |
| `chunk_size` | string | Yes | Number of rows or "all" for remaining |

**Returns:** Chunk of rows with metadata (JSON)

**Example:**
```python
next_chunk("mydb_12345_abc1", 1, "100")
```

**Composition hints:** Chain multiple calls to paginate through large result sets.

---

### manage_memory_bounds

Monitor and manage memory usage across all streaming operations.

**Parameters:** None

**Returns:** Memory status, usage statistics, and cleanup actions taken (JSON)

**Example:**
```python
manage_memory_bounds()
```

**Composition hints:** Call when memory warnings appear or before large operations.

---

### get_streaming_status

Get detailed status of all active streaming operations and memory usage.

**Parameters:** None

**Returns:** Active buffers, memory usage, performance metrics (JSON)

**Example:**
```python
get_streaming_status()
```

**Composition hints:** Monitor streaming workloads and identify bottlenecks.

---

### clear_streaming_buffer

Clear a specific streaming result buffer to free memory immediately.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `query_id` | string | Yes | ID of buffer to clear |

**Returns:** Confirmation message (JSON)

**Example:**
```python
clear_streaming_buffer("mydb_12345_abc1")
```

**Composition hints:** Use after finishing with a large result set.

---

### get_query_metadata

Get comprehensive metadata for a query result including quality metrics and processing recommendations.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `query_id` | string | Yes | ID of query result |

**Returns:** LLM-friendly summary, quality metrics, complexity analysis (JSON)

**Example:**
```python
get_query_metadata("mydb_12345_abc1")
```

**Composition hints:** Use with LLM-based analysis workflows for decision making.

---

### request_data_chunk

Retrieve a specific chunk of data using the LLM communication protocol.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `query_id` | string | Yes | ID of query result |
| `chunk_id` | integer | Yes | Chunk ID to retrieve (0-based) |

**Returns:** Chunk data with metadata (JSON)

**Example:**
```python
request_data_chunk("mydb_12345_abc1", 0)
```

**Composition hints:** Use for targeted chunk retrieval in progressive loading workflows.

---

### request_multiple_chunks

Retrieve multiple chunks efficiently in a single call.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `query_id` | string | Yes | ID of query result |
| `chunk_ids` | string | Yes | Comma-separated chunk IDs (e.g., "0,1,2") |

**Returns:** Multiple chunks with metadata (JSON)

**Example:**
```python
request_multiple_chunks("mydb_12345_abc1", "0,1,2,3,4")
```

**Composition hints:** Use to load specific chunks in parallel.

---

### cancel_query_operation

Cancel an ongoing query operation and free resources.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `query_id` | string | Yes | ID of query to cancel |
| `reason` | string | No | Reason for cancellation (default: "User requested") |

**Returns:** Cancellation status message (JSON)

**Example:**
```python
cancel_query_operation("mydb_12345_abc1", "User interrupt")
```

**Composition hints:** Use on long-running queries to stop execution.

---

### get_data_quality_report

Get comprehensive data quality assessment for a query result.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `query_id` | string | Yes | ID of query to assess |

**Returns:** Detailed quality report including nulls, duplicates, outliers (JSON)

**Example:**
```python
get_data_quality_report("mydb_12345_abc1")
```

**Composition hints:** Use before analytics to understand data cleanliness.

---

## Tree/Structured Data (10 tools)

### get_node

Get node details or summary for tree, graph, or RDF connections.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |
| `path` | string | No | Node path (tree), node_id (graph), or subject URI (RDF) |

**Returns:** Node details or root summary (JSON)

**Example:**
```python
get_node("config", "server/host")
get_node("network", "node-123")
```

**Composition hints:** Use with `get_children` to traverse hierarchies.

---

### get_children

Get children of a node with pagination.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |
| `path` | string | No | Parent node path (null for root) |
| `offset` | integer | No | Pagination offset (default: 0) |
| `limit` | integer | No | Rows per page (default: 50) |

**Returns:** Array of child nodes with metadata (JSON)

**Example:**
```python
get_children("config", "database", offset=0, limit=20)
```

**Composition hints:** Chain calls with different offsets to paginate through large hierarchies.

---

### set_node

Create a node in tree or graph connection.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |
| `path` | string | Yes | Path for new node |
| `label` | string | No | Display label (graph only) |

**Returns:** Confirmation with node details (JSON)

**Example:**
```python
set_node("config", "app/features/new_feature", "New Feature")
```

**Composition hints:** Use with `set_value` to add properties to nodes.

---

### move_node

Move a node and its subtree under a new parent or to root.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |
| `path` | string | Yes | Node path to move |
| `new_parent` | string | No | New parent path (null for root) |

**Returns:** Confirmation with new location (JSON)

**Example:**
```python
move_node("config", "old/location/node", "new/location")
```

**Composition hints:** Use for reorganizing hierarchical structures.

---

### delete_node

Delete a node and all its descendants.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |
| `path` | string | Yes | Node path to delete |

**Returns:** Confirmation with deletion details (JSON)

**Example:**
```python
delete_node("config", "deprecated/feature")
```

**Composition hints:** Use carefully as deletion cascades to all children.

---

### list_keys

List key-value pairs at a node with pagination.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |
| `path` | string | Yes | Node path |
| `offset` | integer | No | Pagination offset (default: 0) |
| `limit` | integer | No | Results per page (default: 50) |

**Returns:** Array of key-value pairs (JSON)

**Example:**
```python
list_keys("config", "database")
```

**Composition hints:** Use with `get_value` to inspect node properties.

---

### get_value

Get a specific property value from a node.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |
| `path` | string | Yes | Node path |
| `key` | string | Yes | Property key to retrieve |

**Returns:** Property value and metadata (JSON)

**Example:**
```python
get_value("config", "database", "host")
```

**Composition hints:** Use to read individual node properties.

---

### set_value

Set a property on a node (auto-creates node if needed).

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |
| `path` | string | Yes | Node path |
| `key` | string | Yes | Property key |
| `value` | string | Yes | Property value |
| `value_type` | string | No | Data type hint |

**Returns:** Confirmation with updated node (JSON)

**Example:**
```python
set_value("config", "database", "host", "localhost", "string")
```

**Composition hints:** Use to modify node properties.

---

### delete_key

Delete a property from a node.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |
| `path` | string | Yes | Node path |
| `key` | string | Yes | Property key to delete |

**Returns:** Confirmation with remaining properties (JSON)

**Example:**
```python
delete_key("config", "database", "deprecated_setting")
```

**Composition hints:** Use to remove obsolete node properties.

---

### export_structured

Export tree or RDF data as TOML, JSON, YAML, Turtle, or N-Triples.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |
| `format` | string | Yes | Export format: json, yaml, toml, turtle, ntriples |
| `path` | string | No | Root path to export (null for all) |

**Returns:** Exported data in requested format (string)

**Example:**
```python
export_structured("config", "yaml", "server")
```

**Composition hints:** Use for data portability and backup.

---

## Graph Operations (7 tools)

### get_neighbors

Get neighbors of a graph node with edge information.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |
| `node_id` | string | Yes | Node ID |
| `direction` | string | No | Direction: "in", "out", or "both" (default: both) |
| `offset` | integer | No | Pagination offset (default: 0) |
| `limit` | integer | No | Results per page (default: 50) |

**Returns:** Array of neighbors with edge information (JSON)

**Example:**
```python
get_neighbors("network", "user-123", direction="out")
```

**Composition hints:** Use for network analysis and traversal.

---

### get_edges

List edges in a graph, optionally filtered by node.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |
| `node_id` | string | No | Filter by node (null for all edges) |
| `offset` | integer | No | Pagination offset (default: 0) |
| `limit` | integer | No | Results per page (default: 50) |

**Returns:** Array of edges with source, target, and properties (JSON)

**Example:**
```python
get_edges("network", node_id="user-123")
```

**Composition hints:** Use for graph structure analysis and export.

---

### add_edge

Add an edge to a graph (auto-creates nodes if needed).

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |
| `source` | string | Yes | Source node ID |
| `target` | string | Yes | Target node ID |
| `label` | string | No | Edge label or relationship type |
| `weight` | number | No | Edge weight (for weighted graphs) |

**Returns:** Confirmation with edge details (JSON)

**Example:**
```python
add_edge("network", "user-1", "user-2", "follows", weight=1.0)
```

**Composition hints:** Use to build or modify graphs programmatically.

---

### remove_edge

Remove an edge from a graph.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |
| `source` | string | Yes | Source node ID |
| `target` | string | Yes | Target node ID |
| `label` | string | No | Edge label (for filtering) |

**Returns:** Confirmation of removal (JSON)

**Example:**
```python
remove_edge("network", "user-1", "user-2", "follows")
```

**Composition hints:** Use to modify graph topology.

---

### find_path

Find path(s) between two graph nodes.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |
| `source` | string | Yes | Start node ID |
| `target` | string | Yes | End node ID |
| `algorithm` | string | No | Algorithm: "shortest" (default) or "all" |

**Returns:** Path(s) with nodes and edges (JSON)

**Example:**
```python
find_path("network", "user-1", "user-5", algorithm="shortest")
```

**Composition hints:** Use for network analysis and influence tracing.

---

### get_graph_stats

Get advanced graph statistics including centrality measures.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |

**Returns:** Graph metrics: node count, edge count, density, diameter, clustering (JSON)

**Example:**
```python
get_graph_stats("network")
```

**Composition hints:** Use to characterize network properties.

---

### export_graph

Export graph as DOT, GML, or GraphML format.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Connection name |
| `format` | string | Yes | Format: dot, gml, graphml |
| `node_id` | string | No | Export subgraph from node (null for all) |

**Returns:** Graph in requested format (string)

**Example:**
```python
export_graph("network", "graphml")
```

**Composition hints:** Use for graph visualization and portability.

---

## Search & Transform (2 tools)

### search_data

Search query results for regex pattern matches.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Database connection name |
| `query` | string | Yes | SQL query to search within |
| `pattern` | string | Yes | Regex pattern to find |
| `columns` | string | No | Comma-separated column names (null for all) |
| `case_sensitive` | boolean | No | Case-sensitive search (default: true) |
| `max_matches` | integer | No | Maximum matches to return (default: 100) |

**Returns:** Matching rows with metadata (JSON)

**Example:**
```python
search_data("mydb", "SELECT * FROM emails", ".*@company\\.com", columns="email", case_sensitive=False)
```

**Composition hints:** Use for content discovery and data validation.

---

### transform_data

Apply regex find/replace to a column in query results.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Database connection name |
| `query` | string | Yes | SQL query to transform |
| `column` | string | Yes | Column name to apply transformation |
| `find` | string | Yes | Regex pattern to find |
| `replace` | string | Yes | Replacement string (supports capture groups) |
| `max_rows` | integer | No | Maximum rows to process (default: 1000) |

**Returns:** Transformed data preview (JSON)

**Example:**
```python
transform_data("mydb", "SELECT * FROM logs", "message", "ERROR: (.*)", "CRITICAL: $1")
```

**Composition hints:** Use for data cleaning and standardization.

---

## Schema & Audit (3 tools)

### export_schema

Export database schema in various formats.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `name` | string | Yes | Database connection name |
| `format` | string | No | Format: json_schema, python, typescript, sql_ddl (default: json_schema) |
| `tables` | string | No | Comma-separated table names (null for all) |

**Returns:** Schema in requested format (string/JSON)

**Example:**
```python
export_schema("mydb", format="typescript", tables="users,posts")
```

**Composition hints:** Use for code generation and documentation.

---

### get_query_log

Get recent query execution history.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `database` | string | No | Filter by database (null for all) |
| `status` | string | No | Filter by status: success, error, timeout |
| `since_minutes` | integer | No | Look back this many minutes (default: 60) |
| `limit` | integer | No | Maximum entries to return (default: 50) |

**Returns:** Query history entries with execution details (JSON)

**Example:**
```python
get_query_log(database="mydb", status="error", since_minutes=30)
```

**Composition hints:** Use for debugging and performance analysis.

---

### get_error_log

Get recent error and timeout history.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `database` | string | No | Filter by database (null for all) |
| `since_minutes` | integer | No | Look back this many minutes (default: 60) |
| `limit` | integer | No | Maximum entries to return (default: 50) |

**Returns:** Error entries with context and suggestions (JSON)

**Example:**
```python
get_error_log(database="mydb", since_minutes=60)
```

**Composition hints:** Use for troubleshooting connection and query issues.

---

## System (1 tool)

### check_compatibility

Check backward compatibility status and get migration recommendations.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `generate_migration_script` | boolean | No | Generate migration script for legacy config (default: false) |

**Returns:** Compatibility report and migration guidance (JSON)

**Example:**
```python
check_compatibility(generate_migration_script=True)
```

**Composition hints:** Use during upgrades or configuration migrations.

---

## Data Science (12 tools)

### analyze_hypothesis_test

Run statistical hypothesis test on query results.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `connection_name` | string | Yes | Database connection name |
| `query` | string | Yes | SQL query returning data |
| `test_type` | string | Yes | Test type: t_test, chi_square, mann_whitney, wilcoxon, kruskal, fisher |
| `column` | string | Yes | Column to test |
| `group_column` | string | No | Column defining groups for comparison |
| `alpha` | number | No | Significance level (default: 0.05) |
| `alternative` | string | No | Two-sided, less, or greater (default: two-sided) |

**Returns:** Test statistics, p-value, and interpretation (JSON)

**Example:**
```python
analyze_hypothesis_test("mydb", "SELECT * FROM experiments", "t_test", "score", "treatment", alpha=0.05)
```

**Composition hints:** Use for A/B testing and experiment validation.

---

### analyze_anova

Analyze variance across multiple groups using ANOVA.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `connection_name` | string | Yes | Database connection name |
| `query` | string | Yes | SQL query with group and value columns |
| `value_column` | string | Yes | Column with numeric values |
| `group_column` | string | Yes | Column defining groups |

**Returns:** F-statistic, p-value, group means, effect size (JSON)

**Example:**
```python
analyze_anova("mydb", "SELECT * FROM sales", "revenue", "region")
```

**Composition hints:** Use for comparing means across multiple groups.

---

### analyze_effect_sizes

Calculate effect sizes for statistical tests.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `connection_name` | string | Yes | Database connection name |
| `query` | string | Yes | SQL query returning data |
| `effect_type` | string | Yes | Type: cohens_d, cramers_v, eta_squared |
| `column1` | string | Yes | First column/group |
| `column2` | string | No | Second column for comparison |

**Returns:** Effect size value and interpretation (JSON)

**Example:**
```python
analyze_effect_sizes("mydb", "SELECT * FROM experiments", "cohens_d", "control", "treatment")
```

**Composition hints:** Pair with hypothesis tests for practical significance.

---

### analyze_regression

Fit regression models to data.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `connection_name` | string | Yes | Database connection name |
| `query` | string | Yes | SQL query with feature and target columns |
| `target_column` | string | Yes | Column to predict |
| `feature_columns` | string | Yes | Comma-separated feature columns |
| `model_type` | string | No | linear, ridge, lasso, polynomial (default: linear) |

**Returns:** Model coefficients, R², predictions (JSON)

**Example:**
```python
analyze_regression("mydb", "SELECT * FROM properties", "price", "sqft,bedrooms,year_built", model_type="linear")
```

**Composition hints:** Use with `evaluate_model_performance` for validation.

---

### evaluate_model_performance

Evaluate trained model performance on test data.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `connection_name` | string | Yes | Database connection name |
| `query` | string | Yes | SQL query with actual and predicted columns |
| `actual_column` | string | Yes | Column with true values |
| `predicted_column` | string | Yes | Column with predictions |
| `metric_type` | string | No | r2, rmse, mae, accuracy (default: r2) |

**Returns:** Performance metrics and diagnostics (JSON)

**Example:**
```python
evaluate_model_performance("mydb", "SELECT * FROM test_results", "actual_price", "predicted_price", metric_type="rmse")
```

**Composition hints:** Use after regression or classification.

---

### analyze_clusters

Perform clustering analysis on query data.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `connection_name` | string | Yes | Database connection name |
| `query` | string | Yes | SQL query with feature columns |
| `feature_columns` | string | Yes | Comma-separated numeric columns |
| `n_clusters` | integer | No | Number of clusters (default: auto) |
| `algorithm` | string | No | kmeans, hierarchical, dbscan (default: kmeans) |

**Returns:** Cluster assignments, centroids, silhouette score (JSON)

**Example:**
```python
analyze_clusters("mydb", "SELECT * FROM customers", "spending,frequency,recency", n_clusters=5, algorithm="kmeans")
```

**Composition hints:** Use for customer segmentation and pattern discovery.

---

### detect_anomalies

Detect anomalies in query data using statistical methods.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `connection_name` | string | Yes | Database connection name |
| `query` | string | Yes | SQL query with numeric columns |
| `column` | string | Yes | Column to analyze for anomalies |
| `threshold` | number | No | Standard deviation threshold (default: 3.0) |
| `method` | string | No | zscore, iqr, isolation_forest (default: zscore) |

**Returns:** Anomaly flags, scores, flagged rows (JSON)

**Example:**
```python
detect_anomalies("mydb", "SELECT * FROM transactions", "amount", threshold=2.5, method="iqr")
```

**Composition hints:** Use for data quality and fraud detection.

---

### reduce_dimensions

Perform dimensionality reduction on high-dimensional data.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `connection_name` | string | Yes | Database connection name |
| `query` | string | Yes | SQL query with feature columns |
| `feature_columns` | string | Yes | Comma-separated numeric columns |
| `n_components` | integer | No | Target number of dimensions (default: 2) |
| `method` | string | No | pca, tsne, umap (default: pca) |

**Returns:** Reduced components, explained variance (JSON)

**Example:**
```python
reduce_dimensions("mydb", "SELECT * FROM gene_data", "gene_*", n_components=3, method="pca")
```

**Composition hints:** Use before clustering on high-dimensional data.

---

### analyze_time_series

Analyze time series data for trends and patterns.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `connection_name` | string | Yes | Database connection name |
| `query` | string | Yes | SQL query with timestamp and value columns |
| `time_column` | string | Yes | Timestamp column name |
| `value_column` | string | Yes | Numeric value column |
| `period` | string | No | Decomposition period: daily, weekly, monthly |

**Returns:** Trend, seasonal, residual components; stationarity test (JSON)

**Example:**
```python
analyze_time_series("mydb", "SELECT * FROM stock_prices", "date", "price", period="daily")
```

**Composition hints:** Use before `forecast_time_series`.

---

### forecast_time_series

Generate time series forecasts.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `connection_name` | string | Yes | Database connection name |
| `query` | string | Yes | SQL query with historical data |
| `time_column` | string | Yes | Timestamp column name |
| `value_column` | string | Yes | Numeric value column |
| `periods_ahead` | integer | No | Number of periods to forecast (default: 10) |
| `method` | string | No | arima, exponential_smoothing, prophet (default: arima) |

**Returns:** Forecast values, confidence intervals (JSON)

**Example:**
```python
forecast_time_series("mydb", "SELECT * FROM sales_history", "date", "sales", periods_ahead=30, method="arima")
```

**Composition hints:** Use for sales, demand, and resource planning.

---

### analyze_rfm

Perform RFM (Recency, Frequency, Monetary) customer analysis.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `connection_name` | string | Yes | Database connection name |
| `query` | string | Yes | SQL query with customer transactions |
| `customer_id` | string | Yes | Column with customer identifiers |
| `date_column` | string | Yes | Transaction date column |
| `amount_column` | string | Yes | Transaction amount column |

**Returns:** RFM scores, customer segments, value tiers (JSON)

**Example:**
```python
analyze_rfm("mydb", "SELECT * FROM transactions", "customer_id", "order_date", "order_value")
```

**Composition hints:** Use for customer segmentation and targeting.

---

### analyze_ab_test

Analyze results from A/B tests with statistical rigor.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `connection_name` | string | Yes | Database connection name |
| `query` | string | Yes | SQL query with control/test and outcome |
| `group_column` | string | Yes | Column denoting control/treatment groups |
| `outcome_column` | string | Yes | Column with binary or continuous outcome |
| `confidence_level` | number | No | Confidence level (default: 0.95) |

**Returns:** Test statistics, p-value, sample size, power analysis (JSON)

**Example:**
```python
analyze_ab_test("mydb", "SELECT * FROM experiment_results", "group", "converted", confidence_level=0.95)
```

**Composition hints:** Use for experimentation and decision-making.

---

## Quick Reference Summary

| Category | Count | Purpose |
|----------|-------|---------|
| Core Database | 8 | Connect, query, inspect databases |
| Streaming & Memory | 9 | Handle large results, memory management |
| Tree/Structured | 10 | Hierarchical JSON, YAML, TOML data |
| Graph | 7 | Network analysis and manipulation |
| Search & Transform | 2 | Pattern matching and text replacement |
| Schema & Audit | 3 | Introspection and query history |
| System | 1 | Compatibility and migration |
| Data Science | 12 | Statistical analysis and forecasting |
| **Total** | **52** | Complete LLM-native data platform |

## Parameter Type Reference

| Type | Format | Example |
|------|--------|---------|
| `string` | UTF-8 text | "mydb", "users", "/path/to/file" |
| `integer` | Whole numbers | 100, -1, 0 |
| `number` | Float/decimal | 0.05, 3.14, 2.5 |
| `boolean` | true/false | true, false |
| `optional` | [ ] indicates optional | Parameter with `[No]` in Required column |

## Return Format Convention

All tools return JSON-formatted responses with:

```json
{
  "status": "success|error|warning",
  "data": { },
  "metadata": { }
}
```

Successful queries return `"status": "success"`. Errors include context in `metadata` for debugging.

## Composition Patterns

### Sequential Loading
```
execute_query() -> get_query_metadata() -> request_multiple_chunks()
```

### Exploration Workflow
```
list_databases() -> describe_database() -> find_table() -> describe_table()
```

### Data Transformation
```
execute_query() -> search_data() -> transform_data() -> export_schema()
```

### Analysis Pipeline
```
execute_query() -> analyze_clusters() -> reduce_dimensions() -> get_graph_stats()
```

---

**For integration examples and advanced workflows, see the main documentation.**