causal_agent.methods.generalized_propensity_score package

Generalized Propensity Score (GPS) method for continuous treatments.

Submodules

causal_agent.methods.generalized_propensity_score.diagnostics module

Diagnostic checks for the Generalized Propensity Score (GPS) method.

causal_agent.methods.generalized_propensity_score.diagnostics.assess_gps_balance(df_with_gps, treatment_var, covariate_vars, gps_col_name, **kwargs)[source]

Assesses the balance of covariates conditional on the estimated GPS.

This function is typically called after GPS estimation to validate the assumption that covariates are independent of treatment conditional on GPS.

Parameters:
  • df_with_gps (DataFrame) – DataFrame containing the original data plus the estimated GPS column.

  • treatment_var (str) – The name of the continuous treatment variable column.

  • covariate_vars (List[str]) – A list of covariate column names to check for balance.

  • gps_col_name (str) – The name of the column containing the estimated GPS values.

  • **kwargs (Any) – Additional arguments (e.g., number of strata for checking balance).

Returns:

A dictionary containing balance statistics and summaries. For example:

{
    "overall_balance_metric": 0.05,
    "covariate_balance": {
        "cov1": {"statistic": 0.03, "p_value": 0.5, "balanced": True},
        "cov2": {"statistic": 0.12, "p_value": 0.02, "balanced": False}
    },
    "summary": "Balance assessment complete."
}

Return type:

dict
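As a usage sketch, the documented return shape can be scanned for covariates that failed the balance check. The helper name `unbalanced_covariates` is hypothetical and not part of the package; the dictionary simply mirrors the example return value documented for assess_gps_balance.

```python
from typing import Any, Dict, List


def unbalanced_covariates(balance_result: Dict[str, Any]) -> List[str]:
    """Return covariate names flagged as not balanced (hypothetical helper)."""
    per_covariate = balance_result.get("covariate_balance", {})
    return [
        name
        for name, stats in per_covariate.items()
        if not stats.get("balanced", False)
    ]


# Example return value copied from the documentation of assess_gps_balance.
example_result = {
    "overall_balance_metric": 0.05,
    "covariate_balance": {
        "cov1": {"statistic": 0.03, "p_value": 0.5, "balanced": True},
        "cov2": {"statistic": 0.12, "p_value": 0.02, "balanced": False},
    },
    "summary": "Balance assessment complete.",
}

print(unbalanced_covariates(example_result))  # ['cov2']
```

A missing "balanced" key is treated as unbalanced here, which is a conservative choice for a diagnostic.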

causal_agent.methods.generalized_propensity_score.estimator module

Core estimation logic for the Generalized Propensity Score (GPS) method.

causal_agent.methods.generalized_propensity_score.estimator.estimate_effect_gps(df, treatment, outcome, covariates, **kwargs)[source]

Estimates the causal effect using the Generalized Propensity Score method for continuous treatments.

This function will be called by the method_executor_tool.

Parameters:
  • df (DataFrame) – The input DataFrame.

  • treatment (str) – The name of the continuous treatment variable column.

  • outcome (str) – The name of the outcome variable column.

  • covariates (List[str]) – A list of covariate column names.

  • **kwargs (Any) – Additional arguments for controlling the estimation, including:

      - gps_model_spec (dict): Specification for the GPS model (T ~ X).
      - outcome_model_spec (dict): Specification for the outcome model (Y ~ T, GPS).
      - t_values_range (list or dict): Specification of the treatment levels at which to evaluate the Average Dose-Response Function (ADRF).
      - n_bootstraps (int): Number of bootstrap replications used to compute standard errors.

Returns:

A dictionary containing the estimation results, including:

  • "effect_estimate": Typically the ADRF or a specific contrast.

  • "standard_error": Standard error for the primary effect estimate.

  • "confidence_interval": Confidence interval for the primary estimate.

  • "adrf_curve": Data representing the Average Dose-Response Function.

  • "specific_contrasts": Any calculated specific contrasts.

  • "diagnostics": Results from diagnostic checks (e.g., balance).

  • "method_details": Description of the method and models used.

  • "parameters_used": Dictionary of parameters used.

Return type:

dict
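To make the GPS model (T ~ X) concrete, here is a minimal sketch of one common choice: treat T given X as normal around a linear fit and take the GPS to be that density evaluated at each unit's observed treatment. This is an illustration under stated assumptions, not the package's implementation. The function name `normal_linear_gps`, the single-covariate restriction, and the normal-linear model itself are all assumptions made for the example.

```python
import math
from typing import List, Sequence


def normal_linear_gps(x: Sequence[float], t: Sequence[float]) -> List[float]:
    """GPS under the assumed model T | X ~ Normal(a + b*X, sigma^2).

    Hypothetical helper: fits a one-covariate least-squares line, then
    evaluates the normal density of each observed treatment.
    """
    n = len(x)
    if n != len(t) or n < 3:
        raise ValueError("need at least 3 paired observations")

    mean_x = sum(x) / n
    mean_t = sum(t) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("covariate has no variation")
    sxt = sum((xi - mean_x) * (ti - mean_t) for xi, ti in zip(x, t))

    slope = sxt / sxx
    intercept = mean_t - slope * mean_x

    # Residuals of the treatment model and their variance estimate.
    residuals = [ti - (intercept + slope * xi) for xi, ti in zip(x, t)]
    sigma2 = sum(r * r for r in residuals) / (n - 2)
    if sigma2 == 0:
        raise ValueError("perfect fit: residual variance is zero")

    # Normal density of each observed treatment given its covariate.
    norm_const = math.sqrt(2 * math.pi * sigma2)
    return [math.exp(-r * r / (2 * sigma2)) / norm_const for r in residuals]
```

Units whose observed treatment lies close to the fitted line receive a high GPS; units far from it receive a low one. An outcome model (Y ~ T, GPS) would then be fitted on these values.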

causal_agent.methods.generalized_propensity_score.llm_assist module

LLM-assisted components for the Generalized Propensity Score (GPS) method.

These functions use an LLM to suggest model specifications or parameters, supplying defaults for anything the user does not specify.

causal_agent.methods.generalized_propensity_score.llm_assist.suggest_treatment_model_spec(df, treatment_var, covariate_vars, query=None, llm_client=None)[source]

Suggests a model specification for the treatment mechanism (T ~ X) in GPS.

Parameters:
  • df (DataFrame) – The input DataFrame.

  • treatment_var (str) – The name of the continuous treatment variable.

  • covariate_vars (List[str]) – A list of covariate names.

  • query (str | None) – Optional user query for context.

  • llm_client (Any | None) – Optional LLM client for making a call.

Returns:

A dictionary representing the suggested model specification, e.g. {"type": "linear", "formula": "T ~ X1 + X2"} or {"type": "random_forest", "params": {…}}.

Return type:

Dict[str, Any]

causal_agent.methods.generalized_propensity_score.llm_assist.suggest_outcome_model_spec(df, outcome_var, treatment_var, gps_col_name, query=None, llm_client=None)[source]

Suggests a model specification for the outcome mechanism (Y ~ T, GPS) in GPS.

Parameters:
  • df (DataFrame) – The input DataFrame.

  • outcome_var (str) – The name of the outcome variable.

  • treatment_var (str) – The name of the continuous treatment variable.

  • gps_col_name (str) – The name of the GPS column.

  • query (str | None) – Optional user query for context.

  • llm_client (Any | None) – Optional LLM client for making a call.

Returns:

A dictionary representing the suggested model specification, e.g. {"type": "polynomial", "degree": 2, "interaction": True, "formula": "Y ~ T + T^2 + GPS + GPS^2 + T*GPS"}.

Return type:

Dict[str, Any]
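The polynomial specification in the example can be read as a set of regression columns. The sketch below builds those columns for degree 2 with an interaction term, reading the formula string as ordinary algebra; that reading is an assumption about how the package interprets it. The function name `outcome_design_rows` is hypothetical.

```python
from typing import List, Sequence


def outcome_design_rows(
    t: Sequence[float],
    gps: Sequence[float],
    degree: int = 2,
    interaction: bool = True,
) -> List[List[float]]:
    """Build rows [1, T, ..., T^d, GPS, ..., GPS^d, T*GPS] (hypothetical helper)."""
    rows = []
    for t_i, g_i in zip(t, gps):
        row = [1.0]  # intercept column
        row += [t_i ** p for p in range(1, degree + 1)]  # T, T^2, ...
        row += [g_i ** p for p in range(1, degree + 1)]  # GPS, GPS^2, ...
        if interaction:
            row.append(t_i * g_i)  # T*GPS
        rows.append(row)
    return rows


print(outcome_design_rows([2.0], [0.5]))
# [[1.0, 2.0, 4.0, 0.5, 0.25, 1.0]]
```

Fitting the outcome on these columns by least squares gives a response surface in (T, GPS) that can then be averaged over units to trace out a dose-response curve.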

causal_agent.methods.generalized_propensity_score.llm_assist.suggest_dose_response_t_values(df, treatment_var, num_points=20, llm_client=None)[source]

Suggests a relevant range and number of points for estimating the ADRF.

Parameters:
  • df (DataFrame) – The input DataFrame.

  • treatment_var (str) – The name of the continuous treatment variable.

  • num_points (int) – Desired number of points for the ADRF curve.

  • llm_client (Any | None) – Optional LLM client for making a call.

Returns:

A list of treatment values at which to evaluate the ADRF.

Return type:

List[float]
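One plausible way to pick such treatment values without an LLM is an evenly spaced grid between two percentiles of the observed treatment, which keeps the ADRF away from sparsely observed extremes. The sketch below is an assumption, not the package's logic: the names `percentile` and `treatment_grid` and the 5th/95th percentile bounds are hypothetical choices for illustration.

```python
from typing import List, Sequence


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Linearly interpolated quantile q in [0, 1] of pre-sorted values."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (
        sorted_values[upper] - sorted_values[lower]
    )


def treatment_grid(
    values: Sequence[float],
    num_points: int = 20,
    lower_q: float = 0.05,
    upper_q: float = 0.95,
) -> List[float]:
    """Evenly spaced treatment levels between two observed percentiles."""
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")

    low = percentile(ordered, lower_q)
    high = percentile(ordered, upper_q)
    step = (high - low) / (num_points - 1)
    return [low + i * step for i in range(num_points)]
```

For treatments observed at the integers 0 through 100, `treatment_grid(range(101), num_points=10)` spans roughly 5 to 95 in ten steps.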