causal_agent.utils package

Submodules

causal_agent.utils.agent module

causal_agent.utils.llm_helpers module

Utility functions for LLM interactions within the causal_agent module.

causal_agent.utils.llm_helpers.call_llm_with_json_output(llm, prompt)[source]

Calls the provided LLM with a prompt, expecting a JSON object in the response. It parses the JSON string (after attempting to remove markdown fences) and returns it as a Python dictionary.

Parameters:

llm (langchain.chat_models.base.BaseChatModel | None) – An instance of BaseChatModel (e.g., from LangChain). If None, the function logs a warning and returns None.
prompt (str) – The prompt string to send to the LLM.

Returns:

A dictionary parsed from the LLM’s JSON response, or None if any of the following holds:

llm is None.
The LLM call fails.
The LLM response content cannot be extracted as a string.
The response content is empty after stripping markdown.
The response is not valid JSON.
The parsed JSON is not a dictionary.

Return type:

dict | None
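The parsing behavior described above (strip markdown fences, parse JSON, accept only a dictionary) can be illustrated with a minimal sketch. The helper name parse_json_object and its fence-matching regular expression are assumptions made for illustration; this is not the module's actual implementation, and it covers only the text-handling step, not the LLM call itself.

```python
import json
import re
from typing import Any, Dict, Optional

# Hypothetical pattern: an opening fence such as ```json at the very start,
# or a closing ``` fence at the very end of the text.
_FENCE_RE = re.compile(r"^```[a-zA-Z0-9_-]*\s*\n?|\n?```\s*$")


def parse_json_object(text: Optional[str]) -> Optional[Dict[str, Any]]:
    """Return the JSON object contained in `text`, or None on any failure."""
    # The content could not be extracted as a string.
    if not isinstance(text, str):
        return None

    # Remove surrounding markdown fences, then surrounding whitespace.
    cleaned = _FENCE_RE.sub("", text.strip()).strip()
    if not cleaned:
        return None  # nothing left after stripping markdown

    try:
        parsed = json.loads(cleaned)
    except json.JSONDecodeError:
        return None  # not valid JSON

    # Only a JSON object (a Python dict) counts as a successful result.
    return parsed if isinstance(parsed, dict) else None
```

Because every failure path returns None, a caller only needs a single `is None` check before using the result.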

causal_agent.utils.llm_helpers.process_llm_response(response, method)[source]

causal_agent.utils.llm_helpers.get_columns_info(df)[source]

causal_agent.utils.llm_helpers.analyze_dataset_for_method(df, query, method)[source]

Uses an LLM to analyze the dataset and suggest parameters appropriate for the given causal method.

Parameters:

df (DataFrame) – Input DataFrame
query (str) – User’s causal query
method (str) – The causal method being considered

Returns:

Dictionary with suggested parameters and validation checks from LLM.

Return type:

Dict[str, Any]

causal_agent.utils.llm_helpers.llm_identify_temporal_and_unit_vars(column_names, column_dtypes, dataset_description, dataset_summary, heuristic_time_candidates=None, heuristic_id_candidates=None, query='No query provided.', llm=None)[source]

Uses an LLM to identify the primary time variable and the unit (ID) variable among the dataset’s columns.

Parameters:

column_names (List[str]) – List of all column names.
column_dtypes (Dict[str, str]) – Dictionary mapping column names to string representation of data types.
dataset_description (str) – Textual description of the dataset.
dataset_summary (str) – Summary of the dataset.
heuristic_time_candidates (List[str] | None) – Optional list of columns identified as time vars by heuristics (currently unused by prompt).
heuristic_id_candidates (List[str] | None) – Optional list of columns identified as unit ID vars by heuristics (currently unused by prompt).
query (str) – User’s causal query (default: 'No query provided.').
llm (langchain.chat_models.base.BaseChatModel | None) – The language model client instance.

Returns:

A dictionary with keys ‘time_variable’ and ‘unit_variable’, whose values are the identified column names or None.

Return type:

Dict[str, str | None]
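Since the returned values are meant to be column names or None, a caller may want to check them against the actual columns before using them. The sketch below shows one way to do that. The helper validate_temporal_unit_vars is hypothetical and not part of causal_agent; it assumes only the two documented keys.

```python
from typing import Any, Dict, List, Optional


def validate_temporal_unit_vars(
    result: Any, column_names: List[str]
) -> Dict[str, Optional[str]]:
    """Keep only values that name an existing column; otherwise use None."""
    validated: Dict[str, Optional[str]] = {
        "time_variable": None,
        "unit_variable": None,
    }
    # Anything other than a dict is treated as "nothing identified".
    if not isinstance(result, dict):
        return validated

    for key in validated:
        value = result.get(key)
        # Accept the value only if it is a string naming a real column.
        if isinstance(value, str) and value in column_names:
            validated[key] = value
    return validated


# Usage with made-up column names:
columns = ["year", "state", "gdp"]
checked = validate_temporal_unit_vars(
    {"time_variable": "year", "unit_variable": "region"}, columns
)
# "region" is not a column, so only the time variable survives the check.
```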