causal_agent.tools package

Causal Agent tools package.

This package contains LangChain tool wrappers for the causal_agent module, providing standardized interfaces for various causal inference components. Each tool wraps a core component to make it compatible with the LangChain agent framework.

Tools available:

- input_parser_tool: Parses and validates user inputs
- dataset_analyzer_tool: Analyzes dataset characteristics
- query_interpreter_tool: Interprets natural language queries
- method_selector_tool: Selects appropriate causal methods
- method_validator_tool: Validates method assumptions
- method_executor_tool: Executes causal inference methods
- explanation_generator_tool: Generates explanations
- output_formatter_tool: Formats final outputs

causal_agent.tools.input_parser_tool(input_text)

Parse the user’s initial input text to extract query, dataset path, and description.

This tool uses regex to find structured information within the input text and then leverages an LLM for more complex NLP tasks on the extracted query.

Parameters: input_text (str) – The combined initial input string from the user/system.
Returns: Dict containing parsed query information, path, description, and workflow state.
Return type: Dict[str, Any]
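
The regex stage described above could look roughly like the following sketch. The labelled "Query: / Dataset path: / Description:" layout and the function name parse_initial_input are assumptions made for illustration; the input format that input_parser_tool actually expects is not documented here, and the LLM step is left out.

```python
import re
from typing import Any, Dict, Optional

# Hypothetical field labels; the real tool's expected layout is not documented here.
_FIELD_PATTERNS = {
    "query": re.compile(r"^\s*query\s*:\s*(.+?)\s*$", re.IGNORECASE | re.MULTILINE),
    "dataset_path": re.compile(r"^\s*dataset\s+path\s*:\s*(\S+)\s*$", re.IGNORECASE | re.MULTILINE),
    "dataset_description": re.compile(r"^\s*description\s*:\s*(.+?)\s*$", re.IGNORECASE | re.MULTILINE),
}


def parse_initial_input(input_text: str) -> Dict[str, Any]:
    """Extract labelled fields from the combined input string (illustrative only)."""
    parsed: Dict[str, Optional[str]] = {}
    for name, pattern in _FIELD_PATTERNS.items():
        match = pattern.search(input_text)
        parsed[name] = match.group(1) if match else None
    # Fall back to treating the whole text as the query when no label is found.
    if parsed["query"] is None:
        parsed["query"] = input_text.strip()
    return parsed
```

In this sketch a missing field is reported as None rather than raising, so a later step can decide how to handle incomplete input.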

causal_agent.tools.dataset_analyzer_tool(dataset_path, dataset_description=None, original_query=None)

Analyze dataset to identify important characteristics for causal inference.

This tool loads the dataset, calculates summary statistics, checks for temporal structure, identifies potential treatments/outcomes/instruments, and assesses variable relationships relevant for selecting a causal method.

Parameters:

dataset_path (str) – Path to the dataset file.
dataset_description (str | None) – Optional description string from input.
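original_query (str | None) – Optional original user query string.
llm – Optional LLM client for enhanced analysis (listed in the docstring but absent from the signature above).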

Returns:

A Pydantic model containing the structured dataset analysis results and workflow state.

Return type:

DatasetAnalyzerOutput
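
The kind of analysis described above (summary statistics, a temporal-structure check, and candidate treatment columns) can be sketched with the standard library alone. Everything here is assumed for illustration: the function name summarize_csv, the name-based temporal heuristic, and the output keys do not come from causal_agent, and the real tool reads from dataset_path rather than from a string.

```python
import csv
import io
import statistics
from typing import Any, Dict, List, Optional


def _as_float(value: str) -> Optional[float]:
    """Return the value as a float, or None if it is not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def summarize_csv(csv_text: str) -> Dict[str, Any]:
    """Compute per-column summaries and simple causal-inference hints (illustrative)."""
    rows: List[Dict[str, str]] = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    summary: Dict[str, Any] = {
        "n_rows": len(rows),
        "numeric": {},
        "binary_columns": [],
        "temporal_columns": [],
    }
    for col in columns:
        values = [_as_float(row[col]) for row in rows]
        if all(v is not None for v in values):
            summary["numeric"][col] = {
                "mean": statistics.mean(values),
                "min": min(values),
                "max": max(values),
            }
            # A column holding only 0/1 is a candidate treatment indicator.
            if set(values) <= {0.0, 1.0}:
                summary["binary_columns"].append(col)
        # Assumed heuristic: flag columns whose names suggest time.
        if any(token in col.lower() for token in ("date", "year", "time", "period")):
            summary["temporal_columns"].append(col)
    return summary
```

A name-based check is only a rough proxy for temporal structure; parsing the values as dates would be a more reliable signal.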

causal_agent.tools.query_interpreter_tool(query_info, dataset_analysis, dataset_description, original_query=None)

Interpret a causal query in the context of a specific dataset.

Parameters:

query_info (QueryInfo) – Pydantic model with parsed query information.
dataset_analysis (DatasetAnalysis) – Pydantic model with dataset analysis results.
dataset_description (str) – String description of the dataset.
original_query (str | None) – The original user query string (optional).

Returns:

A Pydantic model containing identified variables (including is_rct), dataset analysis, description, and workflow state.

Return type:

QueryInterpreterOutput

causal_agent.tools.method_selector_tool(variables, dataset_analysis, dataset_description=None, original_query=None, excluded_methods=None)

Select the most appropriate causal inference method based on structured input.

Applies decision logic based on dataset analysis and identified variables (including is_rct).

Parameters:

variables (Variables) – Pydantic model containing identified variables (T, O, C, IV, RDD, is_rct, etc.).
dataset_analysis (DatasetAnalysis) – Pydantic model containing results of dataset analysis.
dataset_description (str | None) – Optional textual description of the dataset.
original_query (str | None) – Optional original user query string.
excluded_methods (List[str] | None) – Optional list of method names to exclude from selection.

Returns:

Dictionary with method selection details, context for next step, and workflow state.

Return type:

Dict[str, Any]
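
As a rough illustration of decision logic driven by the identified variables and excluded_methods, consider the sketch below. It is not the package's actual decision tree: the VariablesSketch fields, the method names, and the priority order are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VariablesSketch:
    """Hypothetical stand-in for the Variables model; field names are assumed."""
    treatment: str
    outcome: str
    covariates: List[str] = field(default_factory=list)
    instrument: Optional[str] = None
    running_variable: Optional[str] = None
    is_rct: bool = False


def select_method(
    variables: VariablesSketch,
    has_temporal_structure: bool,
    excluded_methods: Optional[List[str]] = None,
) -> Optional[str]:
    """Return the first applicable method that is not excluded, else None."""
    excluded = set(excluded_methods or [])
    candidates: List[str] = []
    # Assumed priority order: randomization first, then design-based methods.
    if variables.is_rct:
        candidates.append("difference_in_means")
    if variables.instrument:
        candidates.append("instrumental_variable")
    if variables.running_variable:
        candidates.append("regression_discontinuity")
    if has_temporal_structure:
        candidates.append("difference_in_differences")
    if variables.covariates:
        candidates.append("backdoor_adjustment")
    for method in candidates:
        if method not in excluded:
            return method
    return None
```

Passing a method name in excluded_methods makes the sketch fall through to the next applicable candidate, which is one plausible use for the parameter after a failed validation.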

causal_agent.tools.method_validator_tool(inputs)

Validate the assumptions of the selected causal method using structured input.

Parameters: inputs (MethodValidatorInput) – Pydantic model containing method_info, dataset_analysis, variables, and dataset_description.
Returns: Dictionary with validation results, context for next step, and workflow state.
Return type: Dict[str, Any]

causal_agent.tools.method_executor_tool(inputs, original_query=None)

Execute the selected causal inference method function using structured input.

Parameters:

inputs (MethodExecutorInput) – Pydantic model containing method, variables, dataset_path, dataset_analysis, and dataset_description.
original_query (str | None) – Optional original user query string.

Returns:

Dict with numerical results, context for next step, and workflow state.

Return type:

Dict[str, Any]

causal_agent.tools.explanation_generator_tool(method_info, variables, results, dataset_analysis, validation_info=None, dataset_description=None, original_query=None)

Generate a single comprehensive explanation string using structured Pydantic input.

Parameters:

method_info (MethodInfo) – Pydantic model with method details.
variables (Variables) – Pydantic model with identified variables.
results (Dict[str, Any]) – Dictionary containing numerical results from execution.
dataset_analysis (DatasetAnalysis) – Pydantic model with dataset analysis results.
validation_info (Dict[str, Any] | None) – Optional dictionary with validation results.
dataset_description (str | None) – Optional string description of the dataset.
original_query (str | None) – Optional original user query string.

Returns:

Dictionary with the final explanation text, context, and workflow state.

Return type:

Dict[str, Any]

causal_agent.tools.output_formatter_tool(query, method, results, explanation, dataset_analysis=None, dataset_description=None)

Format the final explanation and results using the output_formatter component, package them into a dictionary, and add the workflow state and a JSON representation.

Parameters:

query (str) – Original user query.
method (str) – The method used (string name).
results (Dict[str, Any]) – Numerical results dict from method_executor_tool.
explanation (Dict[str, Any]) – Structured explanation dict from explanation_generator_tool.
dataset_analysis (Dict[str, Any] | None) – Optional results from dataset_analyzer_tool.
dataset_description (str | None) – Optional initial description string.

Returns:

Dict containing the formatted output fields, workflow state, and a JSON string.

Return type:

Dict[str, Any]
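
The packaging step described above (a dictionary of output fields, a workflow state, and a JSON string) might be sketched as follows. The key names and the shape of workflow_state are hypothetical; the actual fields produced by the output_formatter component are not documented here.

```python
import json
from typing import Any, Dict, Optional


def format_output(
    query: str,
    method: str,
    results: Dict[str, Any],
    explanation: Dict[str, Any],
    dataset_description: Optional[str] = None,
) -> Dict[str, Any]:
    """Bundle the final fields with a workflow state and a JSON string (illustrative)."""
    output: Dict[str, Any] = {
        "query": query,
        "method": method,
        "results": results,
        "explanation": explanation,
        "dataset_description": dataset_description,
        # Hypothetical workflow-state shape.
        "workflow_state": {"current_step": "output_formatting", "complete": True},
    }
    # Serialize before attaching the string, so the JSON does not contain itself.
    # default=str keeps non-JSON-native values (e.g. dates) from raising.
    output["json_output"] = json.dumps(output, default=str, sort_keys=True)
    return output
```

Serializing before adding the json_output key keeps the JSON string a plain snapshot of the other fields.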

Submodules

causal_agent.tools.data_analyzer module

Data Analyzer class for causal inference pipelines.

This module provides the DataAnalyzer class for analyzing datasets and extracting relevant information for causal inference.

class causal_agent.tools.data_analyzer.DataAnalyzer(verbose=False)

Bases: object

Data analyzer for causal inference datasets.

This class provides methods for analyzing datasets to extract relevant information for causal inference, such as variables, relationships, and temporal structures.

__init__(verbose=False)

Initialize the data analyzer.

Parameters: verbose (bool) – Whether to print verbose information.

analyze_dataset(dataset_path)

Analyze a dataset and extract relevant information.

Parameters: dataset_path (str) – Path to the dataset file.
Returns: Dictionary with dataset analysis results.
Return type: Dict[str, Any]

causal_agent.tools.dataset_analyzer_tool module

Tool for analyzing datasets for causal inference.

This module provides a LangChain tool for analyzing datasets to detect characteristics relevant for causal inference, such as temporal structure, potential instrumental variables, and variable relationships.

causal_agent.tools.dataset_analyzer_tool.dataset_analyzer_tool(dataset_path, dataset_description=None, original_query=None)

Analyze dataset to identify important characteristics for causal inference.

This tool loads the dataset, calculates summary statistics, checks for temporal structure, identifies potential treatments/outcomes/instruments, and assesses variable relationships relevant for selecting a causal method.

Parameters:

dataset_path (str) – Path to the dataset file.
dataset_description (str | None) – Optional description string from input.
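original_query (str | None) – Optional original user query string.
llm – Optional LLM client for enhanced analysis (listed in the docstring but absent from the signature above).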

Returns:

A Pydantic model containing the structured dataset analysis results and workflow state.

Return type:

DatasetAnalyzerOutput

causal_agent.tools.explanation_generator_tool module

Explanation generator tool for causal inference methods.

This tool generates explanations for the selected causal inference method, including what the method does, its assumptions, and how it will be applied.

causal_agent.tools.explanation_generator_tool.explanation_generator_tool(method_info, variables, results, dataset_analysis, validation_info=None, dataset_description=None, original_query=None)

Generate a single comprehensive explanation string using structured Pydantic input.

Parameters:

method_info (MethodInfo) – Pydantic model with method details.
variables (Variables) – Pydantic model with identified variables.
results (Dict[str, Any]) – Dictionary containing numerical results from execution.
dataset_analysis (DatasetAnalysis) – Pydantic model with dataset analysis results.
validation_info (Dict[str, Any] | None) – Optional dictionary with validation results.
dataset_description (str | None) – Optional string description of the dataset.
original_query (str | None) – Optional original user query string.

Returns:

Dictionary with the final explanation text, context, and workflow state.

Return type:

Dict[str, Any]

causal_agent.tools.input_parser_tool module

Tool for parsing causal inference queries.

This module provides a LangChain tool for parsing causal inference queries, extracting key elements, and guiding the workflow to the next step.

causal_agent.tools.input_parser_tool.input_parser_tool(input_text)

Parse the user’s initial input text to extract query, dataset path, and description.

This tool uses regex to find structured information within the input text and then leverages an LLM for more complex NLP tasks on the extracted query.

Parameters: input_text (str) – The combined initial input string from the user/system.
Returns: Dict containing parsed query information, path, description, and workflow state.
Return type: Dict[str, Any]

causal_agent.tools.method_executor_tool module

Method Executor Tool for the causal inference agent.

Executes the selected causal inference method using its implementation function.

causal_agent.tools.method_executor_tool.method_executor_tool(inputs, original_query=None)

Execute the selected causal inference method function using structured input.

Parameters:

inputs (MethodExecutorInput) – Pydantic model containing method, variables, dataset_path, dataset_analysis, and dataset_description.
original_query (str | None) – Optional original user query string.

Returns:

Dict with numerical results, context for next step, and workflow state.

Return type:

Dict[str, Any]

causal_agent.tools.method_selector_tool module

Method Selector Tool for selecting causal inference methods.

This module provides a LangChain tool for selecting appropriate causal inference methods based on dataset characteristics and query details.

causal_agent.tools.method_selector_tool.method_selector_tool(variables, dataset_analysis, dataset_description=None, original_query=None, excluded_methods=None)

Select the most appropriate causal inference method based on structured input.

Applies decision logic based on dataset analysis and identified variables (including is_rct).

Parameters:

variables (Variables) – Pydantic model containing identified variables (T, O, C, IV, RDD, is_rct, etc.).
dataset_analysis (DatasetAnalysis) – Pydantic model containing results of dataset analysis.
dataset_description (str | None) – Optional textual description of the dataset.
original_query (str | None) – Optional original user query string.
excluded_methods (List[str] | None) – Optional list of method names to exclude from selection.

Returns:

Dictionary with method selection details, context for next step, and workflow state.

Return type:

Dict[str, Any]

causal_agent.tools.method_validator_tool module

Method validator tool for causal inference methods.

This tool validates the selected causal inference method against dataset characteristics and available variables.

causal_agent.tools.method_validator_tool.extract_properties_from_inputs(inputs)

Helper function to extract dataset properties from MethodValidatorInput for use with the decision tree.

causal_agent.tools.method_validator_tool.method_validator_tool(inputs)

Validate the assumptions of the selected causal method using structured input.

Parameters: inputs (MethodValidatorInput) – Pydantic model containing method_info, dataset_analysis, variables, and dataset_description.
Returns: Dictionary with validation results, context for next step, and workflow state.
Return type: Dict[str, Any]

causal_agent.tools.output_formatter_tool module

Output formatter tool for causal inference results.

This tool provides the LangChain interface for the output formatter component.

causal_agent.tools.output_formatter_tool.output_formatter_tool(query, method, results, explanation, dataset_analysis=None, dataset_description=None)

Format the final explanation and results using the output_formatter component, package them into a dictionary, and add the workflow state and a JSON representation.

Parameters:

query (str) – Original user query.
method (str) – The method used (string name).
results (Dict[str, Any]) – Numerical results dict from method_executor_tool.
explanation (Dict[str, Any]) – Structured explanation dict from explanation_generator_tool.
dataset_analysis (Dict[str, Any] | None) – Optional results from dataset_analyzer_tool.
dataset_description (str | None) – Optional initial description string.

Returns:

Dict containing the formatted output fields, workflow state, and a JSON string.

Return type:

Dict[str, Any]

causal_agent.tools.query_interpreter_tool module

Tool for interpreting causal queries in the context of a dataset.

This module provides a LangChain tool for matching query concepts to actual dataset variables, identifying treatment, outcome, and covariate variables.

causal_agent.tools.query_interpreter_tool.query_interpreter_tool(query_info, dataset_analysis, dataset_description, original_query=None)

Interpret a causal query in the context of a specific dataset.

Parameters:

query_info (QueryInfo) – Pydantic model with parsed query information.
dataset_analysis (DatasetAnalysis) – Pydantic model with dataset analysis results.
dataset_description (str) – String description of the dataset.
original_query (str | None) – The original user query string (optional).

Returns:

A Pydantic model containing identified variables (including is_rct), dataset analysis, description, and workflow state.

Return type:

QueryInterpreterOutput