Complete Code Explanation: sktime-mcp
Project Overview
sktime-mcp is a Model Context Protocol (MCP) server that exposes the sktime time series library to Large Language Models (LLMs). It allows LLMs to:
Discover time series estimators from sktime's registry
Reason about their capabilities using tags
Compose estimators into pipelines
Execute real forecasting workflows on datasets
What Problem Does It Solve?
LLMs can't directly interact with Python libraries. This MCP server acts as a semantic bridge, translating between:
LLM world: JSON-RPC requests with simple arguments
Python world: Complex object instantiation, method calls, and data manipulation
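To make the two sides of the bridge concrete, here is a minimal sketch of the kind of message an MCP client might send for one tool call. The envelope follows JSON-RPC 2.0 with the MCP `tools/call` method; the `id` value is arbitrary, and the tool name and arguments are taken from the instantiate_estimator tool described later in this document.

```python
import json

# Roughly what an MCP client sends for a single tool call (JSON-RPC 2.0).
request = {
    "jsonrpc": "2.0",
    "id": 1,  # arbitrary request id chosen for this example
    "method": "tools/call",
    "params": {
        "name": "instantiate_estimator",
        "arguments": {"estimator": "ARIMA", "params": {"order": [1, 1, 1]}},
    },
}

# LLM world: the request travels as a JSON string.
wire_message = json.dumps(request)

# Python world: the server decodes it back into plain dicts and lists,
# then turns those simple values into real objects and method calls.
decoded = json.loads(wire_message)
tool_name = decoded["params"]["name"]
tool_args = decoded["params"]["arguments"]
print(tool_name, tool_args)
```

Everything the LLM supplies is a string, number, list, or dict; the server is responsible for everything beyond that.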
Architecture
The codebase is organized into 5 main layers:
┌─────────────────────────────────────────┐
│         MCP Server (server.py)          │ ← Entry point, handles JSON-RPC
├─────────────────────────────────────────┤
│          Tools Layer (tools/)           │ ← MCP tool implementations
├─────────────────────────────────────────┤
│  Registry (registry/)  │  Composition   │ ← Discovery & Validation
│                        │ (composition/) │
├─────────────────────────────────────────┤
│           Runtime (runtime/)            │ ← Execution & Handle Management
├─────────────────────────────────────────┤
│             sktime Library              │ ← Actual ML library
└─────────────────────────────────────────┘
File-by-File Breakdown
Root Level Files
README.md
Purpose: Project documentation and quick start guide
Key Sections:
Installation instructions
Available MCP tools overview
Example LLM workflow
Project structure
pyproject.toml
Purpose: Python project configuration (PEP 518)
Key Contents:
Package metadata (name, version, description)
Dependencies: mcp, sktime, pandas, numpy, scikit-learn
Optional dependencies for dev and extended features
Entry point: sktime-mcp command → sktime_mcp.server:main
Tool configurations (ruff, pytest)
src/sktime_mcp/ - Core Source Code
server.py - MCP Server Entry Point
Purpose: Main MCP server that handles all tool calls
Key Components:
sanitize_for_json(obj): Converts Python objects to a JSON-serializable format
Handles numpy arrays, pandas objects, special types
@server.list_tools(): Registers all available MCP tools
Returns tool schemas (name, description, input schema)
Tools span Discovery, Instantiation, Execution, Data, Export, Persistence, Validation, and Job Management (e.g., list_estimators, instantiate_pipeline, fit_predict_async, load_data_source, save_model, check_job_status).
@server.call_tool(name, arguments): Routes tool calls to implementations
Validates arguments
Calls the appropriate tool function
Sanitizes and returns results
main(): Entry point that starts the MCP server
Uses stdio transport (reads from stdin, writes to stdout)
Compatible with Claude Desktop and other MCP clients
Flow:
LLM → JSON-RPC request → server.call_tool() → tool function → sanitize → JSON response → LLM
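The sanitize step in this flow can be sketched as a small recursive converter. This is an illustration under stated assumptions, not the project's implementation: it uses only the standard library and relies on duck typing (`to_dict()` and `tolist()`) so that pandas and numpy objects are handled without importing them. Mapping NaN and Infinity to null is one possible policy chosen for this sketch.

```python
import json
import math
from datetime import date, datetime


def sanitize_for_json(obj):
    """Recursively convert a value into JSON-serializable builtins."""
    if obj is None or isinstance(obj, (bool, int, str)):
        return obj
    if isinstance(obj, float):
        # NaN and +/-Infinity are not valid JSON, so this sketch maps them to null.
        return obj if math.isfinite(obj) else None
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    if isinstance(obj, dict):
        return {str(key): sanitize_for_json(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        return [sanitize_for_json(value) for value in obj]
    if hasattr(obj, "to_dict"):
        # Duck-typed: pandas Series and DataFrame expose to_dict().
        return sanitize_for_json(obj.to_dict())
    if hasattr(obj, "tolist"):
        # Duck-typed: numpy arrays and scalars expose tolist().
        return sanitize_for_json(obj.tolist())
    return str(obj)  # last resort: fall back to the string representation


result = {"mae": float("nan"), "horizon": (1, 2, 3), "when": date(2024, 1, 1)}
clean = sanitize_for_json(result)
payload = json.dumps(clean, allow_nan=False)  # raises if anything non-finite slipped through
print(payload)
```

Passing `allow_nan=False` to `json.dumps` is a cheap way to confirm that the output is strictly valid JSON.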
src/sktime_mcp/registry/ - Estimator Discovery
interface.py - Registry Interface
Purpose: Wraps sktime's all_estimators() function and provides structured access
Key Classes:
EstimatorNode (dataclass): Represents a single estimator with all its metadata
Fields:
name: Class name (e.g., "ARIMA")
task: Task type (e.g., "forecasting")
module: Python module path
class_ref: Actual Python class
tags: Capability tags (e.g., {"capability:pred_int": True})
hyperparameters: Constructor parameters with defaults
docstring: Class documentation
Methods:
to_dict(): JSON serialization
to_summary(): Minimal info for list operations
RegistryInterface (singleton)
Purpose: Lazy-loads and caches all sktime estimators
Key Methods:
get_all_estimators(task, tags): Filter estimators by task and tags
get_estimator_by_name(name): Look up a specific estimator
list_estimators(query=...): Text search in names/docstrings
get_available_tasks(): List all task types
get_available_tags(): List all capability tags
Internal Methods:
_load_registry(): Calls sktime's all_estimators() for each task
_create_node(): Extracts metadata from the estimator class
_get_tags(): Calls cls.get_class_tags()
_get_hyperparameters(): Inspects the __init__ signature
How It Works:
import inspect

# First call triggers lazy loading
registry = get_registry()
registry._load_registry()  # Calls sktime's all_estimators() once per task

# Creates an EstimatorNode for each (name, class) pair returned
for name, cls in estimators:
    node = EstimatorNode(
        name=name,
        task="forecasting",
        class_ref=cls,
        tags=cls.get_class_tags(),
        hyperparameters=inspect.signature(cls.__init__).parameters,
    )
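The hyperparameter extraction step can be illustrated on its own. The sketch below uses `inspect.signature` on a hypothetical stand-in class (`DemoForecaster` is not an sktime estimator) and shows one way to turn the raw signature into a JSON-friendly dict, skipping `self` and recording which parameters have no default. The output shape is an assumption made for this example.

```python
import inspect


class DemoForecaster:
    """Hypothetical stand-in for an estimator class; not part of sktime."""

    def __init__(self, sp, order=(1, 0, 0), trend=None):
        self.sp = sp
        self.order = order
        self.trend = trend


def get_hyperparameters(cls):
    """Map each constructor parameter to its default, or mark it as required."""
    hyperparameters = {}
    for name, param in inspect.signature(cls.__init__).parameters.items():
        if name == "self":
            continue
        if param.kind in (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD):
            continue  # *args / **kwargs carry no named hyperparameter
        required = param.default is inspect.Parameter.empty
        hyperparameters[name] = {
            "default": None if required else param.default,
            "required": required,
        }
    return hyperparameters


hyperparameters = get_hyperparameters(DemoForecaster)
print(hyperparameters)
```

Filtering out `self` matters: the raw `parameters` mapping of `cls.__init__` includes it, and it is not something an LLM should ever be asked to supply.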
tag_resolver.py - Tag Resolution
Purpose: Handles tag-based filtering and compatibility checking
Key Functions:
Resolves tag queries (e.g., {"capability:pred_int": True})
Checks if estimator tags match requirements
Used by registry filtering and composition validation
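Tag matching of this kind reduces to comparing a dict of required values against a dict of estimator tags. The following is a minimal sketch assuming exact-equality matching; the function names and the catalog entries (`ForecasterA` and so on) are hypothetical, and only the tag names come from this document.

```python
_MISSING = object()  # sentinel so that an absent tag never equals a required value


def tags_match(estimator_tags, required_tags):
    """Return True if every required tag is present with exactly the required value."""
    return all(
        estimator_tags.get(tag, _MISSING) == wanted
        for tag, wanted in required_tags.items()
    )


def filter_by_tags(catalog, required_tags):
    """Return the sorted names of all catalog entries that satisfy the query."""
    return sorted(
        name for name, tags in catalog.items() if tags_match(tags, required_tags)
    )


# Hypothetical catalog: names and tag values are invented for illustration.
catalog = {
    "ForecasterA": {"capability:pred_int": True, "handles-missing-data": False},
    "ForecasterB": {"capability:pred_int": False, "handles-missing-data": True},
    "ForecasterC": {"capability:pred_int": True, "handles-missing-data": True},
}

probabilistic = filter_by_tags(catalog, {"capability:pred_int": True})
robust = filter_by_tags(
    catalog, {"capability:pred_int": True, "handles-missing-data": True}
)
print(probabilistic, robust)
```

An empty query matches every estimator, which is a convenient default for an unfiltered listing.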
src/sktime_mcp/composition/ - Pipeline Validation
validator.py - Composition Validator
Purpose: Checks that estimator compositions are valid before instantiation
Key Classes:
CompositionType (Enum): Types of compositions: PIPELINE, TRANSFORMER_PIPELINE, FORECASTING_PIPELINE, MULTIPLEXER, ENSEMBLE, REDUCTION
CompositionRule (dataclass): Defines valid composition patterns
Example: Transformers can precede forecasters
ValidationResult (dataclass)
Fields: valid, errors, warnings, suggestions
Method: to_dict() for JSON serialization
CompositionValidator (singleton)
Key Methods:
validate_pipeline(components): Check if a pipeline is valid
_check_pair_compatibility(first, second): Validate that two estimators can be composed
_check_tag_compatibility(first, second): Check tag requirements
get_valid_compositions(estimator_name): What can precede/follow this estimator
suggest_pipeline(task, requirements): Suggest a valid pipeline
Validation Rules:
# Valid: Transformer → Forecaster
["Detrender", "ARIMA"] ✓
# Invalid: Forecaster → Forecaster
["ARIMA", "NaiveForecaster"] ✗
# Valid: Multiple Transformers → Forecaster
["ConditionalDeseasonalizer", "Detrender", "ARIMA"] ✓
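The three rules above can be reproduced with a very small checker. This is a simplified sketch, not the project's validator: the `ROLES` table is a hypothetical hard-coded stand-in for a registry lookup, and the only rule enforced is that a forecaster may appear only as the final step.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationResult:
    valid: bool
    errors: list = field(default_factory=list)


# Hypothetical role table; a real validator would derive roles from the registry.
ROLES = {
    "ConditionalDeseasonalizer": "transformer",
    "Detrender": "transformer",
    "ARIMA": "forecaster",
    "NaiveForecaster": "forecaster",
}


def validate_pipeline(components):
    """Check that forecasters appear only in the last position."""
    errors = []
    if not components:
        errors.append("Pipeline has no components")
    for position, name in enumerate(components):
        role = ROLES.get(name)
        is_last = position == len(components) - 1
        if role is None:
            errors.append(f"Unknown estimator: {name}")
        elif role == "forecaster" and not is_last:
            errors.append(f"{name} is a forecaster and may only be the final step")
    return ValidationResult(valid=not errors, errors=errors)


for candidate in (
    ["Detrender", "ARIMA"],
    ["ARIMA", "NaiveForecaster"],
    ["ConditionalDeseasonalizer", "Detrender", "ARIMA"],
):
    print(candidate, validate_pipeline(candidate))
```

Returning a list of error strings rather than raising lets the server hand the LLM an explanation it can act on.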
src/sktime_mcp/runtime/ - Execution Engine
handles.py - Handle Manager
Purpose: Manages references to instantiated estimator objects
Why Needed?
LLMs can't hold Python object references
Solution: Create string handles (e.g., "est_abc123") that map to objects
Key Classes:
HandleInfo (dataclass): Stores metadata about a handle
Fields: handle_id, estimator_name, instance, params, created_at, fitted, metadata
HandleManager (singleton)
Key Methods:
create_handle(estimator_name, instance, params): Create a new handle → returns "est_xyz"
get_instance(handle_id): Retrieve the actual Python object
get_info(handle_id): Get handle metadata
mark_fitted(handle_id): Mark estimator as fitted
is_fitted(handle_id): Check if fitted
release_handle(handle_id): Free memory
list_handles(): List all active handles
_cleanup_oldest(): Auto-cleanup when max_handles is reached
Flow:
# Instantiation
instance = ARIMA(order=[1,1,1])
handle = manager.create_handle("ARIMA", instance, {"order": [1,1,1]})
# Returns: "est_a1b2c3d4e5f6"
# Later retrieval
instance = manager.get_instance("est_a1b2c3d4e5f6")
instance.fit(y)
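A handle manager along these lines can be sketched in a few dozen lines. The version below is an assumption-laden illustration rather than the project's code: handle IDs are built from a truncated UUID, the default `max_handles` of 100 is invented, and eviction simply drops the oldest handle. The method names follow the list above.

```python
import uuid
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class HandleInfo:
    handle_id: str
    estimator_name: str
    instance: object
    params: dict
    fitted: bool = False


class HandleManager:
    def __init__(self, max_handles=100):
        self.max_handles = max_handles
        self._handles = OrderedDict()  # insertion order doubles as age order

    def create_handle(self, estimator_name, instance, params=None):
        if len(self._handles) >= self.max_handles:
            self._handles.popitem(last=False)  # evict the oldest handle
        handle_id = f"est_{uuid.uuid4().hex[:12]}"
        self._handles[handle_id] = HandleInfo(
            handle_id, estimator_name, instance, params or {}
        )
        return handle_id

    def get_instance(self, handle_id):
        return self._handles[handle_id].instance

    def mark_fitted(self, handle_id):
        self._handles[handle_id].fitted = True

    def is_fitted(self, handle_id):
        return self._handles[handle_id].fitted

    def release_handle(self, handle_id):
        return self._handles.pop(handle_id, None) is not None

    def list_handles(self):
        return list(self._handles)


manager = HandleManager(max_handles=2)
first = manager.create_handle("ARIMA", object(), {"order": [1, 1, 1]})
second = manager.create_handle("NaiveForecaster", object())
third = manager.create_handle("Detrender", object())  # evicts `first`
print(manager.list_handles())
```

The cap on live handles is what keeps a long conversation from accumulating fitted models in memory indefinitely.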
executor.py - Execution Runtime
Purpose: Orchestrates estimator instantiation, data loading, fitting, and prediction
Key Class: Executor (singleton)
Key Methods:
instantiate(estimator_name, params)
Looks up estimator in registry
Instantiates with parameters
Creates handle
Returns: {"success": True, "handle": "est_xyz", ...}
load_dataset(name)
Loads demo datasets (airline, sunspots, etc.)
Uses sktime's dataset loaders
Returns: pandas Series/DataFrame
fit(handle_id, y, X, fh)
Retrieves instance from handle
Calls instance.fit(y, X=X, fh=fh)
Marks handle as fitted
predict(handle_id, fh, X)
Retrieves fitted instance
Calls instance.predict(fh=fh, X=X)
Returns predictions
fit_predict(handle_id, dataset, horizon)
Convenience method: load → fit → predict
Returns: {"success": True, "predictions": {...}, "horizon": 12}
instantiate_pipeline(components, params_list) (Most Complex)
Purpose: Create complete pipelines from component names
Steps:
Validate pipeline composition
Instantiate each component
Build steps argument: [("name1", instance1), ("name2", instance2)]
Determine pipeline type (TransformedTargetForecaster, Pipeline, etc.)
Instantiate pipeline with steps
Create handle
Why Complex: Handles the "steps problem": LLMs can't pass Python objects, so we build them server-side
Example Flow:
# LLM sends:
{"components": ["Detrender", "ARIMA"], "params_list": [{}, {"order": [1,1,1]}]}
# Executor does:
detrender = Detrender()
arima = ARIMA(order=[1,1,1])
steps = [("transformer", detrender), ("forecaster", arima)]
pipeline = TransformedTargetForecaster(steps=steps)
handle = handle_manager.create_handle("Pipeline", pipeline)
# Returns to LLM:
{"success": True, "handle": "est_xyz", "pipeline": "Detrender → ARIMA"}
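The server-side construction of `steps` can be isolated into a small helper. The sketch below uses hypothetical stub classes (`StubDetrender`, `StubARIMA`) in place of real sktime estimators so that it runs without sktime installed; the step-naming scheme (lowercased name plus position) is also an assumption, chosen so that repeated components get distinct names.

```python
class StubDetrender:
    """Hypothetical stand-in for a transformer class."""

    def __init__(self, degree=1):
        self.degree = degree


class StubARIMA:
    """Hypothetical stand-in for a forecaster class."""

    def __init__(self, order=(1, 0, 0)):
        self.order = order


# Hypothetical name-to-class lookup; the real server would use its registry.
CLASS_LOOKUP = {"Detrender": StubDetrender, "ARIMA": StubARIMA}


def build_steps(components, params_list, class_lookup):
    """Turn JSON-level names and params into (step_name, instance) tuples."""
    if len(components) != len(params_list):
        raise ValueError("components and params_list must have the same length")
    steps = []
    for position, (name, params) in enumerate(zip(components, params_list)):
        instance = class_lookup[name](**params)
        steps.append((f"{name.lower()}_{position}", instance))
    return steps


# The same payload shape the LLM sends in the example flow.
request = {"components": ["Detrender", "ARIMA"], "params_list": [{}, {"order": [1, 1, 1]}]}
steps = build_steps(request["components"], request["params_list"], CLASS_LOOKUP)
print([(name, type(instance).__name__) for name, instance in steps])
```

With real classes in the lookup, the resulting list is what would be passed as `steps=` to the chosen pipeline class.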
src/sktime_mcp/tools/ - MCP Tool Implementations
Each file implements one or more MCP tools that LLMs can call.
list_estimators.py
Tools:
list_estimators_tool(task, tags, query, limit)
Calls registry.get_all_estimators(task, tags)
Returns: {"success": True, "estimators": [...], "total": 50}
get_available_tags()
Returns all capability tags
Example: ["capability:pred_int", "handles-missing-data", ...]
describe_estimator.py
Tool: describe_estimator_tool(estimator)
Looks up estimator in registry
Returns full EstimatorNode details
Includes: name, task, module, tags, hyperparameters, docstring
instantiate.py
Tools:
instantiate_estimator_tool(estimator, params)
Calls executor.instantiate(estimator, params)
Returns handle
instantiate_pipeline_tool(components, params_list)
Calls executor.instantiate_pipeline(components, params_list)
Solves the "steps problem"
Returns single handle for entire pipeline
release_handle_tool(handle)
Frees memory for a handle
list_handles_tool()
Lists all active handles
load_model_tool(path)
Loads a previously saved model via MLflow
fit_predict.py
Tools:
fit_predict_tool(estimator_handle, dataset, horizon)
Calls executor.fit_predict(handle, dataset, horizon)
Complete workflow in one call
fit_predict_async_tool(estimator_handle, dataset, horizon)
Dispatches a background job for fit and predict.
evaluate.py
Tool: evaluate_estimator_tool(estimator_handle, dataset, cv_folds)
Runs cross-validation using an expanding window splitter
Returns comparison metrics like MAE and RMSE
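Expanding-window cross-validation grows the training set fold by fold while the test window slides forward. The following standard-library sketch shows the idea with a naive last-value forecast and hand-written MAE and RMSE; the function names, the toy series, and the window sizes are all assumptions for illustration, and the real tool presumably delegates to sktime's own splitter and metrics.

```python
import math


def expanding_window_splits(n_samples, initial_window, horizon):
    """Yield (train_indices, test_indices) with a training set that grows each fold."""
    train_end = initial_window
    while train_end + horizon <= n_samples:
        yield list(range(train_end)), list(range(train_end, train_end + horizon))
        train_end += horizon


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Toy series invented for this example.
series = [10.0, 12.0, 13.0, 15.0, 14.0, 16.0, 18.0, 17.0]

fold_scores = []
for train_idx, test_idx in expanding_window_splits(len(series), initial_window=4, horizon=2):
    last_value = series[train_idx[-1]]  # naive forecast: repeat the last observation
    actual = [series[i] for i in test_idx]
    predicted = [last_value] * len(actual)
    fold_scores.append({"mae": mae(actual, predicted), "rmse": rmse(actual, predicted)})

print(fold_scores)
```

Because every fold trains only on data that precedes its test window, the scores are not contaminated by future observations.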
format_tools.py
Tools:
format_time_series_tool(...)
Auto-formats, infers frequency, drops duplicates, and fills missing values.
auto_format_on_load_tool(enabled)
Toggles whether new data sources get auto-formatted on load.
job_tools.py
Tools: check_job_status_tool, list_jobs_tool, cancel_job_tool, delete_job_tool, cleanup_old_jobs_tool
Interfaces with JobManager to control background training jobs.
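One plausible shape for such a job manager is a thin wrapper around a thread pool that hands out job IDs and reports status on request. The sketch below is a guess at the general pattern, not the project's `JobManager`: the method names (`submit`, `status`, `cancel`, `wait_for`), the ID format, and the status strings are all hypothetical.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor, wait


class JobManager:
    def __init__(self, max_workers=2):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}

    def submit(self, fn, *args, **kwargs):
        """Start fn in the background and return a string job id."""
        job_id = f"job_{uuid.uuid4().hex[:8]}"
        self._jobs[job_id] = self._pool.submit(fn, *args, **kwargs)
        return job_id

    def status(self, job_id):
        future = self._jobs[job_id]
        if future.cancelled():
            return {"status": "cancelled"}
        if not future.done():
            return {"status": "running" if future.running() else "pending"}
        error = future.exception()
        if error is not None:
            return {"status": "failed", "error": str(error)}
        return {"status": "completed", "result": future.result()}

    def cancel(self, job_id):
        # Only jobs that have not started yet can be cancelled.
        return self._jobs[job_id].cancel()

    def wait_for(self, job_id, timeout=None):
        wait([self._jobs[job_id]], timeout=timeout)


def slow_fit_predict(horizon):
    time.sleep(0.05)  # stands in for a long-running fit
    return {"horizon": horizon, "predictions": [0.0] * horizon}


jobs = JobManager()
job_id = jobs.submit(slow_fit_predict, 3)
jobs.wait_for(job_id, timeout=5)
print(jobs.status(job_id))
```

Returning a job ID immediately lets the LLM continue the conversation and poll for the result later instead of blocking on a long fit.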
save_model.py
Tool: save_model_tool(estimator_handle, path, mlflow_params)
Persists fitted estimators using MLflow.
list_available_data.py
Tool: list_available_data_tool(is_demo)
Returns available demo datasets and/or active user data handles
examples/ - Usage Examples
01_forecasting_workflow.py
Purpose: Demonstrates all MCP capabilities end-to-end
Steps:
List datasets
Discover forecasting estimators
Filter by tags (probabilistic forecasters)
Describe an estimator
Validate pipeline compositions
Instantiate estimator
Fit and predict
List active handles
Show available tags
Run: python examples/01_forecasting_workflow.py
02_llm_query_simulation.py
Purpose: Simulates how an LLM would interact with the MCP server
Scenario: User asks "Forecast airline passengers with a probabilistic model"
LLM Steps:
list_estimators(task="forecasting", tags={"capability:pred_int": True})
describe_estimator("ARIMA")
instantiate_estimator("ARIMA", {"order": [1,1,1]})
fit_predict(handle, "airline", 12)
03_pipeline_instantiation.py
Purpose: Demonstrates pipeline creation
Examples:
Simple 2-component pipeline
Complex 3-component pipeline (deseasonalize → detrend → forecast)
Pipeline with custom parameters
Invalid pipeline (shows validation errors)
04_mcp_pipeline_demo.py
Purpose: End-to-end pipeline workflow
Steps:
Validate pipeline
Instantiate pipeline → get handle
Fit and predict → get forecasts
Additional Examples
05_simple_deseasonalize_detrend_forecaster.py: Deseasonalize + detrend workflow
06_simple_naive_forecaster.py: Basic NaiveForecaster example
background_training_example.py: Demonstrates async background jobs
job_management_demo.py: Demonstrates checking and listing job status
pandas_example.py: Demonstrates loading from in-memory pandas objects
csv_example.py: Demonstrates loading from CSV/TSV files
sql_example.py: Demonstrates loading from SQL databases
docs/ - Documentation
architecture.md
Purpose: High-level block diagrams explaining the data flow and adapter registry.
data-sources.md
Purpose: Detailed guide on loading data from pandas, SQL, and various file formats.
user-guide.md
Purpose: Information for end-users on how to use the MCP tools.
dev-guide.md
Purpose: Guidelines for contributors on extending the server or adding new adapters.
tests/ - Test Suite
test_core.py
Purpose: Unit tests for core functionality
Test Classes:
TestRegistryInterface: Tests registry loading, filtering, lookup
TestHandleManager: Tests handle creation, retrieval, fitting, release
TestCompositionValidator: Tests pipeline validation logic
TestTools: Tests MCP tool functions
Run: pytest tests/
How It All Works Together
Example: LLM Forecasting Workflow
User Prompt: "Forecast airline passengers using ARIMA"
Step 1: Discovery
LLM → list_estimators(task="forecasting")
  → server.call_tool("list_estimators", {"task": "forecasting"})
  → list_estimators_tool(task="forecasting")
  → registry.get_all_estimators(task="forecasting")
  → Returns: [{"name": "ARIMA", ...}, {"name": "NaiveForecaster", ...}, ...]
Step 2: Description
LLM → describe_estimator("ARIMA")
  → describe_estimator_tool("ARIMA")
  → registry.get_estimator_by_name("ARIMA")
  → Returns: {"name": "ARIMA", "hyperparameters": {"order": ...}, ...}
Step 3: Instantiation
LLM → instantiate_estimator("ARIMA", {"order": [1,1,1]})
  → instantiate_estimator_tool("ARIMA", {"order": [1,1,1]})
  → executor.instantiate("ARIMA", {"order": [1,1,1]})
    → ARIMA_class = registry.get_estimator_by_name("ARIMA").class_ref
    → instance = ARIMA_class(order=[1,1,1])
    → handle = handle_manager.create_handle("ARIMA", instance)
  → Returns: {"success": True, "handle": "est_abc123"}
Step 4: Execution
LLM → fit_predict("est_abc123", "airline", 12)
  → fit_predict_tool("est_abc123", "airline", 12)
  → executor.fit_predict("est_abc123", "airline", 12)
    → y = executor.load_dataset("airline")
    → instance = handle_manager.get_instance("est_abc123")
    → instance.fit(y)
    → predictions = instance.predict(fh=[1,2,...,12])
  → Returns: {"success": True, "predictions": {...}, "horizon": 12}
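All four steps pass through the same routing layer: a tool name is looked up, the arguments are checked, and the matching function is called. Here is a minimal sketch of that dispatch, assuming a plain dict as the routing table. The two tool bodies are hypothetical stubs that return canned data, and the error-response shape is an assumption modeled on the `{"success": ...}` results shown above.

```python
import inspect


def list_estimators_tool(task=None, tags=None, query=None, limit=50):
    # Stub: a real implementation would query the registry.
    return {"success": True, "estimators": [{"name": "ARIMA"}], "total": 1}


def describe_estimator_tool(estimator):
    # Stub: a real implementation would return full estimator metadata.
    return {"success": True, "name": estimator}


# Routing table from MCP tool name to implementation.
TOOLS = {
    "list_estimators": list_estimators_tool,
    "describe_estimator": describe_estimator_tool,
}


def call_tool(name, arguments):
    """Route one tool call, rejecting unknown tools and malformed arguments."""
    tool = TOOLS.get(name)
    if tool is None:
        return {"success": False, "error": f"Unknown tool: {name}"}
    try:
        # bind() checks the arguments against the signature without calling the tool.
        inspect.signature(tool).bind(**arguments)
    except TypeError as exc:
        return {"success": False, "error": f"Invalid arguments for {name}: {exc}"}
    return tool(**arguments)


print(call_tool("list_estimators", {"task": "forecasting"}))
print(call_tool("describe_estimator", {"estimator": "ARIMA"}))
print(call_tool("describe_estimator", {"bogus": 1}))
```

Validating with `bind()` before the call keeps argument mistakes separate from errors raised inside the tool itself.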
Data Flow Diagram
┌─────────┐
│   LLM   │
└────┬────┘
     │ JSON-RPC request
     ▼
┌─────────────────┐
│   MCP Server    │ ← server.py
│    (stdio)      │
└────┬────────────┘
     │ Route to tool
     ▼
┌─────────────────┐
│  Tool Function  │ ← tools/*.py
└────┬────────────┘
     │ Call business logic
     ▼
┌──────────────────────────────────┐
│ Registry / Executor / Validator  │ ← registry/, runtime/, composition/
└────┬─────────────────────────────┘
     │ Interact with sktime
     ▼
┌─────────────────┐
│ sktime Library  │
└─────────────────┘
Key Concepts
1. Registry-First Design
Don't parse code or docs
Use sktime's all_estimators() as the source of truth
Extract metadata from classes directly
2. Handle-Based References
LLMs can't hold Python objects
Solution: String handles ("est_abc123") map to objects
Handle manager maintains the mapping
3. Lazy Loading
Registry loads on first access
Singleton pattern ensures one instance
Caches all estimators for fast lookups
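The combination of lazy loading and a singleton accessor can be sketched with `functools.lru_cache`. This is one possible way to get the behavior described, not necessarily how the project implements it; the `Registry` class, its `load_count` counter, and the two placeholder entries are invented for the example.

```python
from functools import lru_cache


class Registry:
    def __init__(self):
        self._estimators = None  # nothing is loaded at construction time
        self.load_count = 0      # counter added only to make the laziness visible

    def _load_registry(self):
        # Stand-in for the expensive scan of all estimators.
        self.load_count += 1
        self._estimators = {"ARIMA": "forecasting", "Detrender": "transformation"}

    def get_estimator_by_name(self, name):
        if self._estimators is None:  # first access triggers the load
            self._load_registry()
        return self._estimators.get(name)


@lru_cache(maxsize=None)
def get_registry():
    """Return the single shared Registry, creating it on first call."""
    return Registry()


registry = get_registry()
loads_before = registry.load_count
registry.get_estimator_by_name("ARIMA")
registry.get_estimator_by_name("Detrender")
loads_after = registry.load_count
print(loads_before, loads_after, get_registry() is registry)
```

The expensive scan runs once, on the first lookup, and every later tool call reuses the cached result through the same instance.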
4. Tag-Based Discovery
Estimators have capability tags
LLMs can filter by requirements
Example: {"capability:pred_int": True} finds probabilistic forecasters
5. Composition Validation
Check pipeline validity before instantiation
Catches invalid compositions before they cause runtime errors
Provides helpful error messages
6. The Steps Problem
Problem: Pipelines need steps=[("name", instance), ...]
Solution: LLM sends component names, server builds instances
Benefit: LLM uses simple JSON, server handles complexity
7. JSON Sanitization
Convert numpy/pandas to JSON-serializable types
Handle special values (NaN, Infinity)
Ensure all responses are valid JSON
8. Singleton Pattern
Registry, Executor, HandleManager, Validator are singletons
Ensures shared state across tool calls
Avoids loading the registry or duplicating handles more than once
Summary
sktime-mcp is a well-architected MCP server that:
Exposes sktime's 200+ estimators to LLMs
Validates compositions before execution
Manages object lifecycles via handles
Executes real ML workflows on real data
Translates between JSON (LLM) and Python (sktime)
Key Innovation: The instantiate_pipeline tool solves the "steps problem", enabling LLMs to create complex pipelines with a single JSON-RPC call.
Architecture Highlights:
Clean separation of concerns (registry, composition, runtime, tools)
Singleton pattern for shared state
Handle-based object management
Comprehensive validation before execution
JSON-first API design
This enables LLMs to perform sophisticated time series forecasting workflows without writing any Python code.