causal_agent

Causal Agent - Automated Causal Inference with Large Language Models.

The causal_agent module provides an LLM-powered tool for generating data-driven answers to natural language causal queries. It automatically:

Parses natural language causal questions
Analyzes dataset characteristics and variables
Selects appropriate causal inference methods
Executes causal analysis with proper diagnostics
Interprets results in plain language

Example

>>> from causal_agent import run_causal_analysis
>>> result = run_causal_analysis(
...     query="What is the effect of education on income?",
...     dataset_path="data.csv",
...     dataset_description="Education and income dataset"
... )
>>> print(f"Effect: {result['results']['results']['effect_estimate']}")

The module supports various causal inference methods, including:

Randomized Controlled Trials (RCT)
Difference-in-Differences (DiD)
Instrumental Variables (IV)
Regression Discontinuity Design (RDD)
Propensity Score Matching/Weighting
Backdoor Adjustment
Linear Regression with controls
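To make one of the listed methods concrete, the following is a minimal sketch of the textbook 2x2 Difference-in-Differences estimate, computed by hand on made-up numbers. It is an illustration only and is not taken from causal_agent: the function name `did_estimate`, the `(treated, post, outcome)` row layout, and the toy data are all assumptions introduced here, and the module's own DiD implementation may differ (for example by using regression with controls).

```python
from statistics import mean


def did_estimate(rows):
    """Textbook 2x2 DiD from (treated, post, outcome) rows.

    Hypothetical helper for illustration; not part of causal_agent.
    """
    def cell_mean(treated, post):
        # Average outcome within one group/period cell.
        values = [y for t, p, y in rows if t == treated and p == post]
        if not values:
            raise ValueError(f"no rows for treated={treated}, post={post}")
        return mean(values)

    # Change over time in each group.
    treated_change = cell_mean(1, 1) - cell_mean(1, 0)
    control_change = cell_mean(0, 1) - cell_mean(0, 0)
    # The control group's change stands in for the trend the treated
    # group would have followed without treatment (parallel trends).
    return treated_change - control_change


# Made-up data: (treated, post, outcome)
toy_rows = [
    (1, 0, 10.0), (1, 0, 12.0),   # treated, before
    (1, 1, 18.0), (1, 1, 20.0),   # treated, after
    (0, 0, 9.0), (0, 0, 11.0),    # control, before
    (0, 1, 12.0), (0, 1, 14.0),   # control, after
]

did = did_estimate(toy_rows)
print(f"DiD estimate: {did}")
```

On this toy data the treated group changes by 8 and the control group by 3, so the estimate is 5. The estimate is only meaningful under the parallel-trends assumption noted in the comments.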

Functions

`analyze_dataset`(dataset_path[, llm_client, ...])	Analyze a dataset to identify important characteristics for causal inference.
`create_workflow_state_update`(current_step, ...)	Create a standardized workflow state update dictionary.
`format_output`(query, method, results, ...[, ...])	Format final results including numerical estimates and explanations.
`generate_explanation`(method_info, ...[, ...])	Generate a comprehensive explanation text for the causal analysis.
`interpret_query`(query_info, dataset_analysis)	Interpret the query using a hybrid heuristic/LLM approach to identify variables.
`parse_input`(query[, dataset_path_arg, ...])	Parse the user's causal query using an LLM and regex.
`run_causal_analysis`(query, dataset_path[, ...])	Run causal analysis on a dataset based on a user query.
`validate_method`(method_info, ...)	Validate the selected causal method against dataset characteristics.

causal_agent.run_causal_analysis(query, dataset_path, dataset_description=None, api_key=None)

Run causal analysis on a dataset based on a user query.

Parameters:

query (str) – User’s causal question
dataset_path (str) – Path to the dataset
dataset_description (str | None) – Optional textual description of the dataset
api_key (str | None) – Deprecated; formerly an optional OpenAI API key. Any value passed is ignored.

Returns:

Dictionary containing the final formatted analysis results from the agent’s last step.

Return type:

Dict[str, Any]
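Because the return value is a nested dictionary, chained indexing such as `result['results']['results']['effect_estimate']` raises `KeyError` if any level is missing. The sketch below shows one defensive way to read such a value. The helper `get_nested` is hypothetical and not part of causal_agent, and `mock_result` is a stand-in that only mirrors the key path used in the example at the top of this page; the full set of keys actually returned is not documented here.

```python
from collections.abc import Mapping


def get_nested(data, path, default=None):
    """Walk a nested mapping along `path`; return `default` if a key is missing.

    Hypothetical helper for illustration; not part of causal_agent.
    """
    current = data
    for key in path:
        if not isinstance(current, Mapping) or key not in current:
            return default
        current = current[key]
    return current


# Stand-in for the dictionary returned by run_causal_analysis.
# Only the key path from the page's example is reproduced; the value is made up.
mock_result = {"results": {"results": {"effect_estimate": 0.42}}}

effect = get_nested(mock_result, ["results", "results", "effect_estimate"])
missing = get_nested(mock_result, ["results", "no_such_key"], default="n/a")

print(f"Effect: {effect}")
print(f"Missing key falls back to: {missing}")
```

With a real result, the same call would take the dictionary returned by `run_causal_analysis` in place of `mock_result`.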