causal_agent.analyze_dataset

causal_agent.analyze_dataset(dataset_path, llm_client=None, dataset_description=None, original_query=None)

Analyze a dataset to identify important characteristics for causal inference.

Parameters:
  • dataset_path (str) – Path to the dataset file

  • llm_client (langchain_core.language_models.BaseChatModel | None) – Optional LLM client for enhanced analysis

  • dataset_description (str | None) – Optional description of the dataset for context

  • original_query (optional) – Accepted in the signature but not described here; presumably the user's original question, passed along as context for the analysis

Returns:

  • dataset_info: Basic information about the dataset

  • columns: List of column names

  • potential_treatments: List of potential treatment variables (possibly LLM augmented)

  • potential_outcomes: List of potential outcome variables (possibly LLM augmented)

  • temporal_structure_detected: Whether temporal structure was detected

  • panel_data_detected: Whether panel data structure was detected

  • potential_instruments_detected: Whether potential instruments were detected

  • discontinuities_detected: Whether discontinuities were detected

  • llm_augmentation: Status of LLM augmentation if used

Return type:

Dict containing the dataset analysis results, keyed by the names listed under Returns
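
As a rough illustration of how the returned dict might be consumed, the sketch below formats the documented keys into readable lines. The helper `summarize_analysis` and the contents of `example_result` are hypothetical and not part of `causal_agent`. The placeholder dict only mirrors the key names listed under Returns, and the detection flags are assumed to be truthy or falsy values.

```python
from typing import Any, Dict, List

# Detection keys taken from the Returns list; assumed to hold truthy/falsy values.
DETECTION_FLAGS = (
    "temporal_structure_detected",
    "panel_data_detected",
    "potential_instruments_detected",
    "discontinuities_detected",
)


def summarize_analysis(result: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: render the documented result keys as text lines."""
    treatments = ", ".join(result.get("potential_treatments", [])) or "none"
    outcomes = ", ".join(result.get("potential_outcomes", [])) or "none"
    lines = [
        f"columns: {len(result.get('columns', []))}",
        f"potential treatments: {treatments}",
        f"potential outcomes: {outcomes}",
    ]
    # Report each detection flag as yes/no; missing keys count as "no".
    for flag in DETECTION_FLAGS:
        lines.append(f"{flag}: {'yes' if result.get(flag) else 'no'}")
    return lines


# Placeholder values standing in for a real return value (illustrative only).
example_result = {
    "dataset_info": {},
    "columns": ["treated", "outcome", "age"],
    "potential_treatments": ["treated"],
    "potential_outcomes": ["outcome"],
    "temporal_structure_detected": False,
    "panel_data_detected": False,
    "potential_instruments_detected": False,
    "discontinuities_detected": False,
}

for line in summarize_analysis(example_result):
    print(line)

# With the package installed, the dict would come from the documented call,
# e.g. (the file path here is hypothetical):
#   import causal_agent
#   result = causal_agent.analyze_dataset("data/my_study.csv")
```

Reading keys with `.get()` keeps the helper tolerant of entries that may be absent, such as `llm_augmentation` when no LLM client is supplied; whether that key is omitted or set to a status value in that case is not stated here.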