API Reference

Core UQ-PhysiCell Module

The main module provides the core PhysiCell model interface and utilities.

UQ-PhysiCell: Uncertainty Quantification for PhysiCell Models

This package provides tools for uncertainty quantification, sensitivity analysis, and Bayesian optimization of PhysiCell models.

class uq_physicell.PhysiCell_Model(configFilePath: str, keyModel: str, verbose: bool = False)[source]

Bases: object

A class to manage PhysiCell model configurations and executions.

This class handles the setup of PhysiCell models, including reading configuration files, generating XML files, and running simulations with specified parameters.

Parameters:
  • configFilePath (str) – Path to the configuration file (INI format).

  • keyModel (str) – Key in the configuration file to identify the model.

  • verbose (bool) – If True, prints detailed information during execution.

info() None[source]

Print model configuration information.

RunModel(SampleID: int, ReplicateID: int, Parameters: ndarray | dict = {}, ParametersRules: ndarray | dict = {}, RemoveConfigFile: bool = True, SummaryFunction: None | str = None) None | DataFrame[source]

Run a single simulation with specified parameters.

Parameters:
  • SampleID (int) – Identifier for the parameter sample

  • ReplicateID (int) – Identifier for the simulation replicate

  • Parameters (np.ndarray or dict, optional) – Parameter values for XML configuration

  • ParametersRules (np.ndarray or dict, optional) – Parameter values for RULES configuration

  • RemoveConfigFile (bool, optional) – If True, removes the generated XML and RULES files after simulation

  • SummaryFunction (function, optional) – Function to summarize simulation output

run_simulation_subprocess(XMLFile, sample_id=None, replicate_id=None)[source]

Start the simulation as a subprocess and return the process handle.

Parameters:
  • XMLFile (str) – Path to the XML configuration file for the simulation

  • sample_id (int, optional) – Identifier for the parameter sample

  • replicate_id (int, optional) – Identifier for the simulation replicate

Returns:

Process handle for the running simulation

Return type:

subprocess.Popen

terminate_all_simulations()[source]

Terminate all active simulation processes.

Returns:

Dictionary of process IDs and their termination return codes

Return type:

dict

get_active_processes_info()[source]

Get information about all active processes.

Returns:

Dictionary containing information about all active processes

Return type:

dict

remove_io_folders()[source]

Model Analysis Module

Tools for sensitivity analysis, parameter sampling, and model analysis.

class uq_physicell.model_analysis.ma_context.ModelAnalysisContext(db_path: str, model_config: dict, sampler: str, params_info: dict, qois_info: dict, qoi_def: dict = {}, parallel_method: str = 'inter-process', num_workers: int = 1, summary_function=None, logger: Logger = None)[source]

Bases: object

Context manager for running PhysiCell model analysis simulations.

This class manages the configuration, database setup, and execution context for running sensitivity analysis and uncertainty quantification simulations on PhysiCell models.

Parameters:
  • db_path (str) – Path to the SQLite database file for storing results.

  • model_config (dict) – Dictionary containing PhysiCell model configuration. Must include ‘ini_path’ and ‘struc_name’ keys.

  • sampler (str) – Name of the sampling method to use (e.g., ‘LHS’, ‘Sobol’, ‘OAT’).

  • params_info (dict) – Dictionary containing parameter definitions with keys for each parameter name and values containing ‘ref_value’, ‘lower_bound’, ‘upper_bound’, and ‘perturbation’ information.

  • qois_info (dict) – Dictionary containing Quantities of Interest definitions.

  • qoi_def (dict) – Dictionary mapping names to first-class objects (for example, functions) that can be referenced from the qoi_functions lambda strings.

  • parallel_method (str, optional) – Parallelization method. Options are: ‘inter-process’ (single node), ‘inter-node’ (MPI), or ‘serial’. Defaults to ‘inter-process’.

  • num_workers (int, optional) – Number of parallel workers for inter-process execution. Defaults to 1.

  • summary_function (callable, optional) – Custom function for summarizing simulation output. Defaults to None.

Raises:
  • ImportError – If required parallelization libraries are not available.

  • ValueError – If invalid parallel_method is specified.
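The dictionary keys named in the parameter descriptions above can be illustrated with a minimal sketch. All concrete values (the path, the structure name, the parameter name, and the numbers) are hypothetical placeholders, and check_inputs is an illustrative helper, not part of uq_physicell. It treats all four parameter keys as required and checks that the reference value lies inside its bounds, which may be stricter than the library itself.

```python
# Hypothetical inputs shaped like the documented model_config and params_info.
model_config = {
    "ini_path": "config/model.ini",  # hypothetical path to the INI file
    "struc_name": "my_model",        # hypothetical model key in that file
}

params_info = {
    "cell_cycle_rate": {             # hypothetical parameter name
        "ref_value": 1.0e-3,
        "lower_bound": 5.0e-4,
        "upper_bound": 2.0e-3,
        "perturbation": [10.0, 20.0],
    },
}

REQUIRED_MODEL_KEYS = {"ini_path", "struc_name"}
REQUIRED_PARAM_KEYS = {"ref_value", "lower_bound", "upper_bound", "perturbation"}


def check_inputs(model_config: dict, params_info: dict) -> None:
    """Raise ValueError if a documented key is missing or bounds are inconsistent."""
    missing = REQUIRED_MODEL_KEYS - set(model_config)
    if missing:
        raise ValueError(f"model_config is missing keys: {sorted(missing)}")
    for name, info in params_info.items():
        missing = REQUIRED_PARAM_KEYS - set(info)
        if missing:
            raise ValueError(f"{name} is missing keys: {sorted(missing)}")
        # Assumption of this sketch: the reference value must lie within the bounds.
        if not info["lower_bound"] <= info["ref_value"] <= info["upper_bound"]:
            raise ValueError(f"{name}: ref_value lies outside its bounds")


check_inputs(model_config, params_info)  # passes silently for the inputs above
```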

generate_samples(N: int = None, M: int = 4, seed: int = 42)[source]

set_samples(samples)[source]

Set user-defined parameter combinations for the ‘User-defined’ sampler.

Parameters:

samples – dict mapping integer sample IDs to parameter dicts, e.g. {0: {‘param1’: 1.0, ‘param2’: 2.0}, 1: {‘param1’: 1.5, ‘param2’: 3.0}} or a list of parameter dicts (IDs are assigned automatically starting from 0), e.g. [{‘param1’: 1.0}, {‘param1’: 1.5}]
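The two accepted input forms can be reconciled as in the following sketch. normalize_samples is a hypothetical helper written for illustration; it only mirrors the behaviour described above (list entries receive IDs starting from 0) and is not the library's own code.

```python
def normalize_samples(samples):
    """Return {sample_id: param_dict}, accepting either documented input form."""
    if isinstance(samples, dict):
        return {int(sample_id): dict(params) for sample_id, params in samples.items()}
    # A list gets consecutive integer IDs starting from 0.
    return {sample_id: dict(params) for sample_id, params in enumerate(samples)}


from_dict = normalize_samples(
    {0: {"param1": 1.0, "param2": 2.0}, 1: {"param1": 1.5, "param2": 3.0}}
)
from_list = normalize_samples([{"param1": 1.0}, {"param1": 1.5}])
print(from_list)  # {0: {'param1': 1.0}, 1: {'param1': 1.5}}
```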

run()[source]

Run simulations. This is a convenience alias for run_simulations(context).

Allows the fluent pattern:

context.generate_samples(N=8)
context.run()

cancelled()[source]

Check if cancellation has been requested.

Returns:

True if cancellation was requested, False otherwise

Return type:

bool

request_cancellation()[source]

Request cancellation of all simulations.

This sets the internal cancellation flag to True, which will be checked by the simulation process at various points.

Returns:

Always returns True

Return type:

bool

uq_physicell.model_analysis.ma_context.run_simulations(context: ModelAnalysisContext)[source]

Run PhysiCell simulations based on the provided analysis context.

This function executes sensitivity analysis simulations using the specified parallelization method (serial, inter-process, or MPI). It manages database initialization, parameter sampling, simulation execution, and result storage.

Parameters:

context (ModelAnalysisContext) – The analysis context containing model configuration, sampling parameters, parallelization settings, and database information.

Raises:
  • ValueError – If there are issues with PhysiCell model initialization, database operations, or simulation execution.

  • ImportError – If required parallelization libraries are missing.

Note

This function handles three execution modes:
  • Serial: Single-threaded execution for small analyses

  • Inter-process: Multi-processing on a single node using concurrent.futures

  • Inter-node: Distributed execution across multiple nodes using MPI
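The choice between execution modes can be pictured as a small dispatch function. This is a sketch under stated assumptions, not the library's implementation: run_tasks and fake_simulation are hypothetical names, threads from concurrent.futures stand in for worker processes so that the snippet stays self-contained, and the MPI branch is left unimplemented.

```python
from concurrent.futures import ThreadPoolExecutor


def run_tasks(tasks, worker, parallel_method="inter-process", num_workers=1):
    """Apply worker to every task, choosing the execution mode by name."""
    if parallel_method == "serial":
        return [worker(task) for task in tasks]
    if parallel_method == "inter-process":
        # Threads stand in for worker processes here; map() keeps task order.
        with ThreadPoolExecutor(max_workers=num_workers) as pool:
            return list(pool.map(worker, tasks))
    if parallel_method == "inter-node":
        raise NotImplementedError("MPI execution is not part of this sketch")
    raise ValueError(f"Invalid parallel_method: {parallel_method!r}")


def fake_simulation(task):
    """Hypothetical stand-in for one (SampleID, ReplicateID) simulation."""
    sample_id, replicate_id = task
    return sample_id * 10 + replicate_id


tasks = [(s, r) for s in range(3) for r in range(2)]
serial = run_tasks(tasks, fake_simulation, parallel_method="serial")
pooled = run_tasks(tasks, fake_simulation, num_workers=2)
```

Both calls return results in task order, so the serial and pooled outputs can be compared directly.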

Sampling Methods

uq_physicell.model_analysis.samplers.run_global_sampler(params_dict: dict, sampler: str, N: int = None, M: int = 4, seed: int = 42) dict[source]

Generate parameter samples using global sampling methods.

This function creates parameter samples using various global sampling strategies implemented in SALib, suitable for global sensitivity analysis methods.

Parameters:
  • params_dict (dict) – Dictionary containing parameter definitions with ‘lower_bound’ and ‘upper_bound’ for each parameter.

  • sampler (str) – Sampling method to use. Supported methods include:
      - ‘Fast’: FAST sampling for Fourier Amplitude Sensitivity Test
      - ‘Fractional Factorial’: Fractional factorial design
      - ‘Finite Difference’: Finite difference sampling
      - ‘Latin hypercube sampling (LHS)’: Latin hypercube sampling
      - ‘Sobol’: Sobol sequence sampling

  • N (int, optional) – Number of samples to generate. If None, method-specific defaults are used. Defaults to None.

  • M (int, optional) – Number of harmonics for FAST sampler. Only used with ‘Fast’ method. Defaults to 4.

  • seed (int, optional) – Random seed for reproducible sampling. Defaults to 42.

Returns:

Dictionary with sample IDs as keys and parameter dictionaries as values.

Each sample dictionary contains parameter names as keys and sampled values as values.

Return type:

dict

Raises:

ValueError – If an unsupported sampling method is specified or if required parameters are missing.
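To make the input and output shapes concrete, here is a pure-Python Latin hypercube sketch. It is an illustration only: the library delegates sampling to SALib, and latin_hypercube is a hypothetical function that reproduces just the documented interface (bounds per parameter in, a dictionary keyed by sample ID out).

```python
import random


def latin_hypercube(params_dict, N, seed=42):
    """Return {sample_id: {param_name: value}} using one stratum per sample."""
    rng = random.Random(seed)
    samples = {sample_id: {} for sample_id in range(N)}
    for name, info in params_dict.items():
        lower, upper = info["lower_bound"], info["upper_bound"]
        # One uniform draw inside each of N equal-width strata of [0, 1).
        unit_points = [(stratum + rng.random()) / N for stratum in range(N)]
        rng.shuffle(unit_points)  # decouple the strata across parameters
        for sample_id, u in enumerate(unit_points):
            samples[sample_id][name] = lower + u * (upper - lower)
    return samples


params = {
    "a": {"lower_bound": 0.0, "upper_bound": 1.0},    # hypothetical parameters
    "b": {"lower_bound": 10.0, "upper_bound": 20.0},
}
lhs_samples = latin_hypercube(params, N=5, seed=42)
```

Each parameter range is split into N strata and every stratum is used exactly once, which is the property that distinguishes Latin hypercube sampling from plain random sampling.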

uq_physicell.model_analysis.samplers.run_local_sampler(params_dict: dict, sampler: str = 'OAT') dict[source]

Generate parameter samples using local sampling methods for sensitivity analysis.

This function creates parameter samples using local sampling strategies, particularly the One-At-a-Time (OAT) method, where parameters are perturbed individually around reference values.

Parameters:
  • params_dict (dict) – Dictionary containing parameter definitions. Each parameter should have:
      - ‘ref_value’: Reference value for the parameter
      - ‘perturbation’: Single value or list of perturbation percentages

  • sampler (str, optional) – Local sampling method to use. Currently only ‘OAT’ (One-At-a-Time) is supported. Defaults to ‘OAT’.

Returns:

Dictionary with sample IDs as keys and parameter dictionaries as values.

Sample 0 contains the reference values, and subsequent samples contain perturbations of individual parameters.

Return type:

dict

Note

For OAT sampling, the first sample (ID 0) contains all reference values. Subsequent samples perturb one parameter at a time while keeping others at their reference values. If multiple perturbations are specified for a parameter, multiple samples are generated for that parameter.
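The OAT layout described in this note can be sketched in a few lines. oat_samples is a hypothetical function, and one detail is an assumption rather than documented behaviour: each perturbation percentage is applied in both directions, as ref_value * (1 ± percent / 100).

```python
def oat_samples(params_dict):
    """Return {sample_id: {param_name: value}} with sample 0 as the reference."""
    reference = {name: info["ref_value"] for name, info in params_dict.items()}
    samples = {0: dict(reference)}
    sample_id = 1
    for name, info in params_dict.items():
        perturbations = info["perturbation"]
        if not isinstance(perturbations, (list, tuple)):
            perturbations = [perturbations]  # a single value is also allowed
        for percent in perturbations:
            # Assumption: each percentage is applied downward and upward.
            for sign in (-1.0, 1.0):
                perturbed = dict(reference)
                perturbed[name] = reference[name] * (1.0 + sign * percent / 100.0)
                samples[sample_id] = perturbed
                sample_id += 1
    return samples


oat = oat_samples({
    "a": {"ref_value": 2.0, "perturbation": 10.0},         # hypothetical parameters
    "b": {"ref_value": 4.0, "perturbation": [5.0, 10.0]},
})
```

Under that assumption, a parameter with k perturbation values contributes 2k samples, each differing from the reference in exactly one parameter.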

Sensitivity Analysis

uq_physicell.model_analysis.sensitivity_analysis.get_global_SA_parameters(db_file)[source]
uq_physicell.model_analysis.sensitivity_analysis.get_local_SA_parameters(db_file)[source]
uq_physicell.model_analysis.sensitivity_analysis.run_global_sa(params_dict: dict, qoi_names: list, df_qois: DataFrame, method: str, qoi_time_values: dict = None) tuple[source]

Run global sensitivity analysis using the specified method.

Parameters:
  • params_dict (dict) – Dictionary containing parameter names, properties, and sample values. Must include a ‘samples’ key with parameter sample dictionaries.

  • qoi_names (list) – List of QoI names to analyze.

  • df_qois (pd.DataFrame) – DataFrame containing QoI values with columns formatted as ‘{qoi_name}_{time_index}’ or ‘{qoi_name}’ and time as an index.

  • method (str) – Name of the sensitivity analysis method to use. Supported methods include ‘FAST - Fourier Amplitude Sensitivity Test’, ‘Sobol Sensitivity Analysis’, ‘PAWN Sensitivity Analysis’, etc.

  • qoi_time_values (dict, optional) – Dictionary mapping time labels to their values. If None, time labels will be extracted from df_qois. Defaults to None.

Returns:

A tuple containing:
  • sa_results_dict (dict): Nested dictionary with sensitivity analysis results. Structure: {qoi_name: {time_label: analysis_results}}

  • qoi_time_values (dict): Dictionary mapping time labels to their values, sorted by time.

Return type:

tuple

Raises:

ValueError – If there’s a mismatch between number of samples and QoI results, or if the specified method fails during analysis.

uq_physicell.model_analysis.sensitivity_analysis.OAT_analyze(dic_samples: dict, dic_qoi: dict, sample_ref: int = 0) dict[source]

Perform One-At-a-Time (OAT) analysis on the simulation results.

Parameters:
  • dic_samples (dict) – Dictionary of parameter sample dictionaries, where each key is a sample ID and each value is a dictionary of parameter names and values.

  • dic_qoi (dict) – Dictionary of Quantities of Interest (QoI) values, where each key is a sample ID and each value is the corresponding QoI result.

  • sample_ref (int, optional) – Sample ID to use as the reference for OAT analysis. Defaults to 0.

Returns:

Dictionary containing sensitivity indices for each parameter. Keys are parameter names and values are arrays of sensitivity indices for each perturbation.

Return type:

dict

Note

The specified sample is treated as the reference sample. All other samples are compared against this reference to compute sensitivity indices.
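The comparison against the reference sample can be sketched as follows. The exact index formula used by OAT_analyze is not stated here, so this example assumes one common normalized definition, the relative change in the QoI divided by the relative change in the perturbed parameter. oat_sensitivity is a hypothetical name and the numbers are made up.

```python
def oat_sensitivity(dic_samples, dic_qoi, sample_ref=0):
    """Return {param_name: [index, ...]} with one index per perturbed sample."""
    ref_params = dic_samples[sample_ref]
    ref_qoi = dic_qoi[sample_ref]
    indices = {name: [] for name in ref_params}
    for sample_id, params in dic_samples.items():
        if sample_id == sample_ref:
            continue
        # In an OAT design exactly one parameter differs from the reference.
        changed = [name for name in params if params[name] != ref_params[name]]
        if len(changed) != 1:
            raise ValueError(f"sample {sample_id} is not a one-at-a-time perturbation")
        name = changed[0]
        rel_param_change = (params[name] - ref_params[name]) / ref_params[name]
        rel_qoi_change = (dic_qoi[sample_id] - ref_qoi) / ref_qoi
        # Assumed index: relative QoI change per relative parameter change.
        indices[name].append(rel_qoi_change / rel_param_change)
    return indices


dic_samples = {0: {"a": 1.0, "b": 2.0}, 1: {"a": 1.1, "b": 2.0}, 2: {"a": 1.0, "b": 2.2}}
dic_qoi = {0: 100.0, 1: 120.0, 2: 100.0}
oat_indices = oat_sensitivity(dic_samples, dic_qoi)
```

In this made-up data the QoI responds to a but not to b, so the index for a is close to 2 and the index for b is 0.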

uq_physicell.model_analysis.sensitivity_analysis.run_local_sa(params_dict: dict, qoi_names: list, df_qois: DataFrame, method: str = 'OAT', sample_ref: int = 0) tuple[source]

Run local sensitivity analysis using the One-At-a-Time (OAT) method.

Parameters:
  • params_dict (dict) – Dictionary containing parameter names, properties, and sample values. Must include a ‘samples’ key with parameter sample dictionaries.

  • qoi_names (list) – List of QoI names to analyze.

  • df_qois (pd.DataFrame) – DataFrame containing QoI values with columns formatted as ‘{qoi_name}_{time_index}’ or ‘{qoi_name}’ and time as an index.

  • method (str, optional) – Local sensitivity analysis method. Currently only ‘OAT’ (One-At-a-Time) is supported. Defaults to “OAT”.

  • sample_ref (int, optional) – Sample ID to use as the reference for OAT analysis. Defaults to 0.

Returns:

A tuple containing:
  • sa_results_dict (dict): Nested dictionary with sensitivity analysis results. Structure: {qoi_name: {time_label: {param_name: sensitivity_index}}}

  • qoi_time_values (dict): Dictionary mapping time labels to their values, sorted by time.

Return type:

tuple

Note

The OAT method computes sensitivity indices by comparing parameter perturbations against a reference sample (sample_ref, which defaults to sample 0). Results are summed across all perturbations for each parameter.

uq_physicell.model_analysis.sensitivity_analysis.get_sa_results(dbfile: str, qoi_names: list, df_qois: DataFrame, method: str, sample_ref: int = 0, qoi_time_values: dict = None) tuple[source]

Get sensitivity analysis results for Global and Local methods.

Parameters:
  • dbfile (str) – Path to the database file containing simulation results.

  • qoi_names (list) – List of QoI names to analyze.

  • df_qois (pd.DataFrame) – DataFrame containing QoI values with columns formatted as ‘{qoi_name}_{time_index}’ or ‘{qoi_name}’ and time as an index.

  • method (str) – Name of the sensitivity analysis method to use. Supported methods include ‘FAST - Fourier Amplitude Sensitivity Test’, ‘Sobol Sensitivity Analysis’, ‘PAWN Sensitivity Analysis’, etc. For local sensitivity analysis, use “OAT”.

  • sample_ref (int, optional) – Sample ID to use as the reference for OAT analysis. Defaults to 0. Only used for local sensitivity analysis with method “OAT”.

  • qoi_time_values (dict, optional) – Dictionary mapping time labels to their values. If None, time labels will be extracted from df_qois. Defaults to None.

Returns:

A tuple containing:
  • sa_results_dict (dict): Nested dictionary with sensitivity analysis results. Structure: {qoi_name: {time_label: analysis_results}}

  • qoi_time_values (dict): Dictionary mapping time labels to their values, sorted by time.

Return type:

tuple

Utils

uq_physicell.model_analysis.utils.mcds_list_to_qoi_df_for_sa(recreated_qoi_funcs, all_sample_ids, chunk_size, db_file, verbose=False) DataFrame[source]

Convert a list of MCDS objects to a DataFrame of quantities of interest for sensitivity analysis.

This function processes a list of MCDS simulation results, extracting relevant quantities of interest (QoIs) at each time point and organizing them into a structured DataFrame suitable for sensitivity analysis.

Parameters:
  • recreated_qoi_funcs (dict) – Dictionary of QoI functions where keys are QoI names and values are callable functions.

  • all_sample_ids (list) – List of all sample IDs to process.

  • chunk_size (int) – Number of samples to process in each chunk to manage memory usage.

  • db_file (str) – Path to the database file containing simulation output.

Returns:

DataFrame with calculated QoI values indexed by SampleID and ReplicateID, with one column per combination of QoI and time point.

Return type:

pd.DataFrame

uq_physicell.model_analysis.utils.mcds_list_to_qoi_df_long(recreated_qoi_funcs, all_sample_ids, chunk_size, db_file, verbose=False) DataFrame[source]

Convert a list of MCDS objects to a DataFrame of quantities of interest in long format.

This function processes a list of MCDS simulation results, extracting relevant quantities of interest (QoIs) at each time point and organizing them into a long structured DataFrame.

Parameters:
  • recreated_qoi_funcs (dict) – Dictionary of QoI functions where keys are QoI names and values are callable functions.

  • all_sample_ids (list) – List of all sample IDs to process.

  • chunk_size (int) – Number of samples to process in each chunk to manage memory usage.

  • db_file (str) – Path to the database file containing simulation output.

Returns:

DataFrame with calculated QoI values indexed by SampleID and ReplicateID, with columns for each QoI - columns combined with time points.

Return type:

pd.DataFrame

uq_physicell.model_analysis.utils.mcds_list_to_qoi_df_for_calib(recreated_qoi_funcs, all_sample_ids, chunk_size, db_file, verbose=False) DataFrame[source]

Convert a list of MCDS objects to a DataFrame of quantities of interest for calibration.

This function processes a list of MCDS simulation results, extracting relevant quantities of interest (QoIs) and organizing them into a structured DataFrame suitable for calibration tasks.

Parameters:
  • recreated_qoi_funcs (dict) – Dictionary of QoI functions where keys are QoI names and values are callable functions.

  • all_sample_ids (list) – List of all sample IDs to process.

  • chunk_size (int) – Number of samples to process in each chunk to manage memory usage.

  • db_file (str) – Path to the database file containing simulation output.

Returns:

DataFrame with calculated QoI values indexed by SampleID and ReplicateID, with one column per QoI. Columns are not combined with time points.

Return type:

pd.DataFrame

uq_physicell.model_analysis.utils.get_qoi_from_db_file(db_file: str, qoi_names: list) DataFrame[source]

Extract quantities of interest (QoIs) from a database file containing simulation results.

Returns a long-format DataFrame with one row per (SampleID, time, ReplicateID), consistent with the raw-MCDS storage path. Old Mode-A databases (Data column contains precomputed QoI DataFrames) are handled transparently.

Parameters:
  • db_file (str) – Path to the SQLite database containing simulation results.

  • qoi_names (list) – List of QoI names to extract.

Returns:

Long-format DataFrame with columns [SampleID, time, ReplicateID, <qoi_names>], sorted by (SampleID, time, ReplicateID).

Return type:

pd.DataFrame

uq_physicell.model_analysis.utils.calculate_qoi_from_db_file(db_file: str, qoi_functions: dict, qoi_def: dict = {}, chunk_size: int = 10, mode='long', verbose=False) DataFrame[source]

Calculate quantities of interest from sensitivity analysis database results.

This function loads simulation results from a database in chunks and applies QoI functions to extract meaningful metrics from the time-series data. Processing in chunks helps avoid excessive memory usage for large databases.

Parameters:
  • db_file (str) – Path to the SQLite database containing simulation results.

  • qoi_functions (dict) – Dictionary of QoI functions where keys are QoI names and values are lambda functions or string representations.

  • qoi_def (dict) –

    Dictionary mapping names to first-class objects that can be used in the qoi_functions lambda strings. For example, given the function definition:

        def my_func():
            print('hello world!')
            return 0

    the qoi_def dict would look like this: {‘my_func’: my_func}

  • chunk_size (int, optional) – Number of samples to process at a time. Default is 10. Adjust based on available memory and data size.

  • mode (str, optional) – Form of the result DataFrame. Possible modes are ‘sa’, ‘calib’, and ‘long’. Defaults to ‘long’.

Returns:

DataFrame with calculated QoI values indexed by SampleID and ReplicateID, with columns for each QoI.

Return type:

pd.DataFrame
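How a QoI given as a string can see the names supplied in qoi_def is sketched below. The mechanism shown, evaluating the string with qoi_def as its global namespace, is an assumption about how such strings could be resolved and is not taken from the library. count_above and recreate_qoi_functions are hypothetical names.

```python
def count_above(values, threshold):
    """Hypothetical helper that a QoI string refers to by name."""
    return sum(1 for v in values if v > threshold)


qoi_def = {"count_above": count_above}
qoi_functions = {"large_cells": "lambda volumes: count_above(volumes, 2500.0)"}


def recreate_qoi_functions(qoi_functions, qoi_def):
    """Turn lambda strings into callables that can see the names in qoi_def."""
    recreated = {}
    for name, func in qoi_functions.items():
        if isinstance(func, str):
            # eval executes arbitrary code, so only pass trusted strings.
            func = eval(func, dict(qoi_def))
        recreated[name] = func
    return recreated


funcs = recreate_qoi_functions(qoi_functions, qoi_def)
print(funcs["large_cells"]([1000.0, 3000.0, 4000.0]))  # 2
```

Without the qoi_def entry, calling the recreated lambda would fail with a NameError because count_above would not be defined in its namespace.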

Example

>>> qoi_funcs = {
...     'live_cells': lambda df: len(df[df['dead'] == False]),
...     'dead_cells': lambda df: len(df[df['dead'] == True])
... }
>>> qoi_df = calculate_qoi_from_db_file('study.db', qoi_funcs, chunk_size=20)

uq_physicell.model_analysis.utils.get_mean_std_qois(df_qois: DataFrame, filter_columns: list = []) DataFrame[source]

Calculate the mean and standard deviation of quantities of interest (QoIs) across replicates.

This function computes the mean and standard deviation QoI values for each parameter sample across all replicates, providing a central tendency and variability measure for the QoI estimates.

Parameters:
  • df_qois (pd.DataFrame) – DataFrame containing QoI values with SampleID, ReplicateID, and QoI columns.

  • filter_columns (list, optional) – List of columns to include in the calculation. If None, all numeric columns are used.

Returns:

A tuple containing:
  • pd.DataFrame: DataFrame containing the mean QoI values for each SampleID, indexed by SampleID and ReplicateID.

  • pd.DataFrame: DataFrame containing the standard deviation of the QoI values for each SampleID, indexed by SampleID and ReplicateID.

Return type:

tuple

Note

The mean QoI values are calculated by averaging the QoI estimates across all replicates for each parameter sample. This provides a central tendency measure for the QoI estimates, which can be used for sensitivity analysis and comparison between different parameter samples.

uq_physicell.model_analysis.utils.get_relative_mcse_qois(df_mean: DataFrame, df_std: DataFrame, num_replicates: int, time_columns: list) DataFrame[source]

Calculate the relative Monte Carlo Standard Error (MCSE) for QoI estimates.

This function computes the relative MCSE for each QoI across replicates, providing insight into the uncertainty of the QoI estimates due to finite sampling in the simulations.

Parameters:
  • df_mean (pd.DataFrame) – DataFrame containing the mean QoI values for each SampleID, indexed by SampleID and ReplicateID.

  • df_std (pd.DataFrame) – DataFrame containing the standard deviation QoI values for each SampleID, indexed by SampleID and ReplicateID.

  • num_replicates (int) – The number of replicates used to calculate the mean and standard deviation of the QoIs.

  • time_columns (list) – List of column names corresponding to time points in the DataFrame, which should not be included in the MCSE calculation.

Returns:

DataFrame containing the relative MCSE values for each QoI, indexed by SampleID and ReplicateID.

Return type:

pd.DataFrame

Note

Relative Monte Carlo Standard Error (MCSE) is calculated as the standard deviation of the QoI across replicates divided by the square root of the number of replicates, and then normalized by the mean QoI value to express it as a percentage. This provides insight into the uncertainty of the QoI estimates due to finite sampling in the simulations.

  • < 1% (Excellent): This is the gold standard. If your relative MCSE is under 1%, your mean estimate is highly stable and precise. You can confidently use this metric for sensitivity analysis or publication.

  • 1% to 5% (Good / Acceptable): For stochastic biological simulations like PhysiCell, getting under 5% is generally considered reliable and practical.

  • 5% to 10% (Caution): You can use these metrics to observe broad trends, but small differences between parameters might just be noise. You likely need to run more replicates.

  • > 10% (Unreliable): The metric is too noisy. If you are running 50+ replicates and still have a relative MCSE > 10%, that specific QoI (Quantity of Interest) is likely a poor choice, or your biological system is fundamentally chaotic in that aspect.
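The calculation and the rating bands can be written out for a single QoI and a single sample. This is a minimal sketch using only the standard library. The replicate values are hypothetical, the function names are illustrative, and the use of the sample standard deviation (n - 1 in the denominator) and of the absolute value of the mean are assumptions.

```python
import math
import statistics


def relative_mcse_percent(replicate_values):
    """Relative MCSE in percent: (std / sqrt(n)) / |mean| * 100."""
    n = len(replicate_values)
    mean = statistics.fmean(replicate_values)
    std = statistics.stdev(replicate_values)  # sample std (n - 1), an assumption
    return std / math.sqrt(n) / abs(mean) * 100.0


def rate_mcse(percent):
    """Map a relative MCSE (in percent) to the bands listed above."""
    if percent < 1.0:
        return "Excellent"
    if percent <= 5.0:
        return "Good / Acceptable"
    if percent <= 10.0:
        return "Caution"
    return "Unreliable"


live_cells = [98.0, 100.0, 102.0]  # hypothetical QoI values from 3 replicates
value = relative_mcse_percent(live_cells)
print(f"{value:.2f}% -> {rate_mcse(value)}")  # 1.15% -> Good / Acceptable
```

Because the standard error shrinks with the square root of the number of replicates, halving the relative MCSE takes roughly four times as many replicates.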

uq_physicell.model_analysis.utils.get_summary_statistics_qois(df_qois: DataFrame) tuple[source]

Calculate summary statistics (mean, standard deviation, and relative MCSE) of quantities of interest.

Parameters:

df_qois (pd.DataFrame) – DataFrame containing QoI values with SampleID and ReplicateID columns.

Returns:

A tuple containing:
  • pd.DataFrame: DataFrame with statistical summaries (mean) of QoIs grouped by SampleID, with columns for each QoI statistic.

  • pd.DataFrame: DataFrame with standard deviation of QoIs grouped by SampleID, with columns for each QoI statistic.

  • pd.DataFrame: DataFrame with relative Monte Carlo Standard Error (MCSE) of QoIs grouped by SampleID, with columns for each QoI statistic.

Return type:

tuple

uq_physicell.model_analysis.utils.calculate_qoi_statistics(db_file_path: str, qoi_funcs: dict, df_qois_data: DataFrame = None, ignore_db_consistency: bool = False, qoi_def: dict = {}, chunk_size: int = 10) tuple[source]

Calculate statistical summaries (mean, standard deviation, and relative MCSE) of quantities of interest across replicates.

This function computes the mean, standard deviation, and relative Monte Carlo Standard Error (MCSE) of QoI values across simulation replicates for each parameter sample, enabling uncertainty quantification.

Parameters:
  • db_file_path (str) – Path to the database file for context.

  • qoi_funcs (dict) – Dictionary of QoI functions where keys are QoI names and values are lambda functions or None.

  • df_qois_data (pd.DataFrame) – DataFrame containing QoI values with SampleID, ReplicateID, and QoI columns. Default is None, in which case the function will attempt to load QoI data from the database.

  • ignore_db_consistency (bool) – If True, bypasses the database consistency check.

  • qoi_def (dict) –

    Dictionary mapping names to first-class objects that can be used in the qoi_funcs lambda strings. For example, given the function definition:

        def my_func():
            print('hello world!')
            return 0

    the qoi_def dict would look like this: {‘my_func’: my_func}

  • chunk_size (int, optional) – Number of samples to process at a time when loading from the database. Default is 10. Adjust based on available memory and data size.

Returns:

A tuple containing:
  • df_mean (pd.DataFrame): DataFrame with statistical summaries (mean) of QoIs grouped by SampleID, with columns for each QoI statistic.

  • df_std (pd.DataFrame): DataFrame with standard deviation of QoIs grouped by SampleID, with columns for each QoI statistic.

  • df_relative_mcse (pd.DataFrame): DataFrame with relative Monte Carlo Standard Error (MCSE) of QoIs grouped by SampleID, with columns for each QoI statistic.

Return type:

tuple

Note

Relative Monte Carlo Standard Error (MCSE) is calculated as the standard deviation of the QoI across replicates divided by the square root of the number of replicates, and then normalized by the mean QoI value to express it as a percentage. This provides insight into the uncertainty of the QoI estimates due to finite sampling in the simulations.

  • < 1% (Excellent): This is the gold standard. If your relative MCSE is under 1%, your mean estimate is highly stable and precise. You can confidently use this metric for sensitivity analysis or publication.

  • 1% to 5% (Good / Acceptable): For stochastic biological simulations like PhysiCell, getting under 5% is generally considered reliable and practical.

  • 5% to 10% (Caution): You can use these metrics to observe broad trends, but small differences between parameters might just be noise. You likely need to run more replicates.

  • > 10% (Unreliable): The metric is too noisy. If you are running 50+ replicates and still have a relative MCSE > 10%, that specific QoI (Quantity of Interest) is likely a poor choice, or your biological system is fundamentally chaotic in that aspect.

Raises:

ValueError – If no QoI functions are defined or data format is invalid.

Example

>>> qoi_funcs = {
...     'live_cells': lambda df: len(df[df['dead'] == False]),
...     'dead_cells': lambda df: len(df[df['dead'] == True])
... }
>>> df_mean, df_std, df_mcse = calculate_qoi_statistics('study.db', qoi_funcs, df_qois_data=qoi_data)

uq_physicell.model_analysis.utils.apply_pca_to_qois(df_mean: DataFrame, latent_dim: int = 3, seed: int = None)[source]

Apply PCA to the selected QoIs.

This function uses PCA (scikit-learn) as a linear dimensionality reduction technique on the QoI matrix (samples x features).

Parameters:
  • df_mean (pd.DataFrame) – DataFrame with QoI mean values indexed by SampleID.

  • latent_dim (int) – Dimension of the latent encoding.

  • seed (int) – Random seed for reproducibility (used if PCA requires it). If None, no seed is set.

Returns:

{
    ‘method’: ‘pca’,
    ‘encoder_output’: np.ndarray (n_samples x latent_dim),
    ‘reconstruction’: np.ndarray (n_samples x n_features),
    ‘model’: PCA object,
    ‘scaler’: StandardScaler object
}

Return type:

dict

uq_physicell.model_analysis.utils.apply_autoencoder_to_qois(df_mean: DataFrame, latent_dim: int = 2, epochs: int = 200, batch_size: int = 32, verbose: bool = False, seed: int = None)[source]

Apply an autoencoder to the selected QoIs.

This function attempts to use PyTorch to train a small autoencoder on the QoI matrix (samples x features). If a seed is provided, results will be reproducible.

Parameters:
  • df_mean (pd.DataFrame) – DataFrame with QoI mean values indexed by SampleID.

  • latent_dim (int) – Dimension of the latent encoding.

  • epochs (int) – Training epochs for the PyTorch autoencoder.

  • batch_size (int) – Batch size for training.

  • verbose (bool) – Verbosity flag.

  • seed (int) – Random seed for reproducibility. If None, no seed is set.

Returns:

{
    ‘method’: ‘torch’,
    ‘encoder_output’: np.ndarray (n_samples x latent_dim),
    ‘reconstruction’: np.ndarray (n_samples x n_features),
    ‘model’: trained model object,
    ‘scaler’: StandardScaler object
}

Return type:

dict

uq_physicell.model_analysis.utils.regression_accuracy_parameters(df_parameters: DataFrame, encoder_output: ndarray) DataFrame[source]

Calculate regression accuracy of parameters from the latent encoding.

This function trains a regression model (e.g., Random Forest) to predict each parameter from the latent encoding and evaluates the R² score for each parameter.

Parameters:
  • df_parameters (pd.DataFrame) – DataFrame containing parameter values indexed by SampleID.

  • encoder_output (np.ndarray) – Latent encoding of the QoI data (n_samples x latent_dim).

Returns:

DataFrame containing R² scores for each parameter, indexed by parameter name.
  • R² ≈ 1.0: Perfect fit. The model explains all the variance in the parameter values from the latent encoding.

  • R² < 0.5: Poor fit. The model does not capture the relationship between the latent encoding and the parameter values well.

  • R² ≈ 0: No fit. The model does not explain any of the variance in the parameter values from the latent encoding, indicating no relationship.

  • R² < 0: Worse than no fit. The model performs worse than simply predicting the mean parameter value for all samples, suggesting a very poor relationship between the latent encoding and the parameter values.

Return type:

pd.DataFrame
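The R² bands listed above follow directly from the standard definition of the coefficient of determination, R² = 1 - SS_res / SS_tot. The sketch below computes it by hand on made-up values. It illustrates the score only and does not involve the regression model the function trains.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y = [1.0, 2.0, 3.0, 4.0]                  # hypothetical parameter values
print(r2_score(y, y))                     # 1.0  (perfect prediction)
print(r2_score(y, [2.5, 2.5, 2.5, 2.5]))  # 0.0  (always predicting the mean)
print(r2_score(y, [4.0, 3.0, 2.0, 1.0]))  # -3.0 (worse than the mean)
```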

uq_physicell.model_analysis.utils.align_params_to_qois(df_params: DataFrame, df_qois: DataFrame) DataFrame[source]

Align parameter DataFrame to match the multi-index structure of QoI DataFrame.

If df_qois has a multi-index (SampleID, time) but df_params only has SampleID, this function replicates parameter rows for each time point so that both DataFrames can be joined on (SampleID, time).

String/categorical columns in df_params are automatically encoded to numeric values using LabelEncoder to ensure compatibility with scikit-learn regression models.

Parameters:
  • df_params (pd.DataFrame) – Parameter DataFrame indexed by SampleID.

  • df_qois (pd.DataFrame) – QoI DataFrame, possibly with multi-index (SampleID, time).

Returns:

Parameter DataFrame aligned to match df_qois index structure,

with categorical columns encoded as numeric.

Return type:

pd.DataFrame
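The alignment step can be pictured with plain dictionaries standing in for DataFrames. This is a sketch of the idea only: align_params is a hypothetical function, the parameter names and values are made up, and string labels are encoded as integer codes in sorted order, which is how scikit-learn's LabelEncoder orders its classes.

```python
def align_params(params_by_sample, qoi_index):
    """Repeat each sample's parameters for every (SampleID, time) in qoi_index."""
    # Collect the distinct labels of every string-valued parameter.
    labels_by_param = {}
    for params in params_by_sample.values():
        for name, value in params.items():
            if isinstance(value, str):
                labels_by_param.setdefault(name, set()).add(value)
    # Encode labels as integer codes in sorted label order.
    codes = {
        name: {label: code for code, label in enumerate(sorted(labels))}
        for name, labels in labels_by_param.items()
    }
    aligned = {}
    for sample_id, time in qoi_index:
        row = params_by_sample[sample_id]
        aligned[(sample_id, time)] = {
            name: codes[name][value] if name in codes else value
            for name, value in row.items()
        }
    return aligned


params_by_sample = {
    0: {"rate": 0.1, "drug": "none"},  # hypothetical parameters
    1: {"rate": 0.2, "drug": "high"},
}
qoi_index = [(0, 0.0), (0, 60.0), (1, 0.0), (1, 60.0)]
aligned = align_params(params_by_sample, qoi_index)
```

After alignment every (SampleID, time) key of the QoI data has a matching row of numeric parameters, so the two tables can be joined row by row.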

uq_physicell.model_analysis.utils.find_optimal_qoi_set(df_qois: DataFrame, df_params: DataFrame) list[source]

Use Recursive Feature Elimination with Cross-Validation (RFECV) to find the optimal set of QoIs that best predict the parameters.

This function trains a regression model (e.g., Random Forest) to predict parameters from the QoIs and uses RFECV to identify the most important QoIs for accurate parameter prediction. The optimal set of QoIs is determined based on the R² score obtained through cross-validation.

Parameters:
  • df_qois (pd.DataFrame) – DataFrame containing QoI mean values indexed by SampleID and time.

  • df_params (pd.DataFrame) – DataFrame containing parameter values indexed by SampleID and time.

Returns:

List of optimal QoI names that best predict the parameters based on RFECV analysis.

Return type:

list
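
For orientation, here is a minimal sketch of RFECV-based QoI selection using scikit-learn directly. The synthetic data, QoI names, and estimator settings are assumptions made for illustration; the estimator and cross-validation settings used inside find_optimal_qoi_set may differ.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV

rng = np.random.default_rng(0)
n_samples = 80

# Hypothetical parameter and QoIs: only "live_cells" depends on the parameter.
param = rng.uniform(0.0, 1.0, size=n_samples)
df_qois = pd.DataFrame(
    {
        "live_cells": 100.0 * param + rng.normal(0.0, 1.0, n_samples),
        "dead_cells": rng.normal(0.0, 1.0, n_samples),
        "mean_volume": rng.normal(0.0, 1.0, n_samples),
    }
)

# Recursively drop the least important QoI and score each subset by
# cross-validated R².
selector = RFECV(
    estimator=RandomForestRegressor(n_estimators=25, random_state=0),
    step=1,
    cv=3,
    scoring="r2",
)
selector.fit(df_qois.to_numpy(), param)

optimal_qois = list(df_qois.columns[selector.support_])
print(optimal_qois)
```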

uq_physicell.model_analysis.utils.recursive_feature_elimination(df_qois_mean, df_qois_mcse, df_params, autoencoder_params={}, mcse_threshold=0.1, correlation_threshold=0.95, verbose=False)[source]

Perform smart feature elimination using a balanced autoencoder-first approach.

Philosophy (Hybrid): Removes only pathological features and preserves useful noise:

  1. Remove EXTREME outliers (MCSE > 10% by default, configurable)

  2. Remove highly correlated redundancy (>95%, configurable)

  3. Apply Autoencoder on cleaned feature set

  4. Use RFECV on latent space to find optimal encoding

  5. Validate with Synthetic Recovery Test

This avoids treating all noise as non-informative while still removing truly problematic features.

Parameters:
  • df_qois_mean (pd.DataFrame) – DataFrame containing mean QoI values indexed by SampleID and time.

  • df_qois_mcse (pd.DataFrame) – DataFrame containing relative MCSE values for QoIs indexed by SampleID and time.

  • df_params (pd.DataFrame) – DataFrame containing parameter values indexed by SampleID.

  • autoencoder_params (dict) – Parameters for autoencoder (latent_dim, epochs, batch_size, seed).

  • mcse_threshold (float) – Threshold for removing EXTREME noise (default 0.10 = 10%). Only removes pathological features.

  • correlation_threshold (float) – Threshold for removing redundant correlations (default 0.95 = 95%). Only removes near-duplicates.

  • verbose (bool) – Print detailed pipeline stages.

Returns:

A dictionary with the following keys:
  • ‘removed_extreme_noise’: list of features removed as outliers

  • ‘removed_redundant’: list of highly correlated features removed

  • ‘cleaned_qois’: list of QoIs kept for autoencoder

  • ‘correlation_matrix’: full pairwise correlation matrix

  • ‘full_autoencoder_results’: autoencoder results on cleaned set

  • ‘full_regression_r2’: R² from full set

  • ‘final_qois’: QoI names selected by RFECV

  • ‘reduced_autoencoder_results’: autoencoder on final selected QoIs

  • ‘reduced_regression_r2’: R² from reduced set

  • ‘synthetic_recovery_test’: Full vs Reduced R² comparison

Return type:

dict
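
The two pre-filtering stages (extreme MCSE, then near-duplicate correlation) can be approximated as follows. This is a sketch under stated assumptions: the helper name prefilter_qois is hypothetical, the per-QoI maximum is assumed to be the MCSE summary that gets compared to the threshold, and the later column of each correlated pair is the one dropped. The library may apply different rules.

```python
import pandas as pd


def prefilter_qois(df_mean, df_mcse, mcse_threshold=0.1, correlation_threshold=0.95):
    """Hypothetical helper mirroring pipeline stages 1 and 2."""
    # Stage 1: drop QoIs whose relative MCSE exceeds the threshold.
    # Assumption: the maximum over all rows is the value being compared.
    noisy = [c for c in df_mean.columns if df_mcse[c].max() > mcse_threshold]
    kept = df_mean.drop(columns=noisy)

    # Stage 2: for each highly correlated pair, drop the later column.
    corr = kept.corr().abs()
    cols = list(kept.columns)
    redundant = []
    for i, first in enumerate(cols):
        if first in redundant:
            continue
        for second in cols[i + 1:]:
            if second not in redundant and corr.loc[first, second] > correlation_threshold:
                redundant.append(second)

    cleaned = [c for c in cols if c not in redundant]
    return noisy, redundant, cleaned


# Hypothetical QoI means and relative MCSE values.
df_mean = pd.DataFrame(
    {
        "live": [1.0, 2.0, 3.0, 4.0, 5.0],
        "live_copy": [2.0, 4.0, 6.0, 8.0, 10.1],  # near-duplicate of "live"
        "dead": [5.0, 1.0, 4.0, 2.0, 3.0],
        "noisy": [3.0, 1.0, 2.0, 5.0, 4.0],
    }
)
df_mcse = pd.DataFrame(
    {
        "live": [0.01] * 5,
        "live_copy": [0.01] * 5,
        "dead": [0.02] * 5,
        "noisy": [0.5] * 5,  # far above the 10% threshold
    }
)

removed_noise, removed_redundant, cleaned_qois = prefilter_qois(df_mean, df_mcse)
print(removed_noise, removed_redundant, cleaned_qois)
```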

Bayesian Optimization Module

Multi-objective Bayesian optimization for model calibration.

class uq_physicell.bo.bo_context.CalibrationContext(db_path: str, obsData: str | dict, obsData_columns: dict, model_config: dict, qoi_functions: dict, bo_options: dict, distance_functions: dict = None, search_space: dict = None, qoi_def: dict = {}, logger: Logger = None)[source]

Bases: object

Context for single-objective or multi-objective Bayesian Optimization calibration.

Parameters:
  • db_path (str) – Path to the database file for storing and retrieving samples.

  • obsData (str or dict) – Path or dict containing the observed data.

  • obsData_columns (dict) – Dictionary mapping QoI names to their corresponding columns in the observed data.

  • model_config (dict) – Configuration dictionary for the PhysiCell model, including paths and structure names.

  • qoi_functions (dict) – Dictionary of functions to compute quantities of interest (QoIs) from model outputs.

  • qoi_def (dict) – Dictionary mapping names to first-class objects (e.g., functions) that can be referenced inside the qoi_functions lambda strings.

  • distance_functions (dict) – Dictionary of functions to compute distances between model outputs and observed data.

  • search_space (dict) – Dictionary defining the search space for parameters, including bounds and types.

  • bo_options (dict) – Options for Bayesian Optimization, including sampling parameters:

      • ‘use_correlated_gp’ (bool): If True, use MultiTaskGP to model correlations between objectives. If False (default), use independent GPs per objective.

      • Other options: num_initial_samples, num_iterations, batch_size_per_iteration, etc.

  • logger (logging.Logger) – Logger instance for logging messages during the calibration process.

default_run_single_replicate(sample_id: int, replicate_id: int, params: dict) dict[source]

Run a single replicate of the PhysiCell model. This function is responsible for executing the model with the given parameters and returning the results.

Parameters:
  • sample_id (int) – Unique identifier for the sample being processed.

  • replicate_id (int) – Unique identifier for the replicate being processed.

  • params (dict) – Dictionary of parameters to be used in the model run.

Returns:

dictionary of model outputs.

Return type:

dict

default_aggregation_func(replicate_results: list, sample_id: int) tuple[source]

Aggregate results from multiple replicates. This function computes the mean and standard deviation for each key in the results.

Parameters:

replicate_results (list) – List of dictionaries containing the results from each replicate.

Returns:

A tuple containing the aggregated results, noise estimates, and a dictionary of all results.

Return type:

tuple
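
The aggregation described above (mean and standard deviation per key) might look like the following sketch. The helper name aggregate_replicates and the result keys are hypothetical, and the sample standard deviation is an assumption; the library's own function also receives a sample_id and may structure its outputs differently.

```python
from statistics import mean, stdev


def aggregate_replicates(replicate_results):
    """Hypothetical helper: per-key mean and sample standard deviation."""
    keys = list(replicate_results[0].keys())
    means = {k: mean(r[k] for r in replicate_results) for k in keys}
    if len(replicate_results) > 1:
        stds = {k: stdev(r[k] for r in replicate_results) for k in keys}
    else:
        # A single replicate carries no spread information.
        stds = {k: 0.0 for k in keys}
    all_results = {i: r for i, r in enumerate(replicate_results)}
    return means, stds, all_results


# Hypothetical outputs from three replicates of one sample.
replicates = [
    {"live_cells": 100.0, "dead_cells": 10.0},
    {"live_cells": 110.0, "dead_cells": 14.0},
    {"live_cells": 90.0, "dead_cells": 12.0},
]
means, stds, all_results = aggregate_replicates(replicates)
print(means, stds)
```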

evaluate_params(params, sample_index)[source]

Evaluate a single parameter set by running replicates in parallel, aggregating the outputs, and computing multi-objective metrics.

Parameters:
  • params (dict) – Dictionary of parameters to be evaluated.

  • sample_index (int) – Index of the sample being evaluated.

Returns:

A tuple (objectives, obj_noise, dic_results):
  • objectives (dict): The computed objectives for the given parameters.

  • obj_noise (dict): The standard deviation of objectives across replicates.

  • dic_results (dict): The results from all replicates.

Return type:

tuple

save_results_to_db(sample_index: int, objectives: dict, noise_std: dict, dic_results: dict)[source]

Save results to database.

Parameters:
  • sample_index (int) – Index of the sample being saved.

  • objectives (dict) – Dictionary containing the objective values.

  • noise_std (dict) – Dictionary containing the noise standard deviations.

  • dic_results (dict) – Dictionary containing the results to save.

generate_initial_samples_from_db(start_sample_id: int = 0, iteration_id: int = 0) tuple[source]

Generate initial samples from existing database for Bayesian optimization.

This function retrieves initial samples from the database, evaluates them with the QoI functions, and prepares the training tensors for the BO pipeline. Used for resume functionality.

Parameters:
  • start_sample_id (int, optional) – Starting sample ID. Defaults to 0, which is appropriate for new databases.

  • iteration_id (int, optional) – Current iteration ID for tracking.

Returns:

(train_x, train_obj, train_obj_std) - Tensors ready for BO pipeline.

Return type:

tuple

generate_and_evaluate_samples(start_sample_id: int = 0, iteration_id: int = 0) tuple[source]

Generate and evaluate samples for Bayesian optimization using Sobol sequences.

This function generates samples in the search space using Sobol sequences, evaluates them with the model, and saves results to the database. Used for both initial sampling and restart functionality.

Parameters:
  • start_sample_id (int, optional) – Starting sample ID. Defaults to 0, which is appropriate for new databases; when restarting, the caller should provide the appropriate starting ID.

  • iteration_id (int, optional) – Current iteration ID for tracking.

Returns:

(train_x, train_obj, train_obj_std) - Tensors ready for BO pipeline.

Return type:

tuple
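
To illustrate Sobol sampling over a bounded search space, the sketch below uses SciPy's quasi-Monte Carlo module. This is a swapped-in technique for illustration only: the parameter names and bounds are hypothetical, and the library may generate its Sobol points with a different backend (for example, one of the BO dependencies).

```python
from scipy.stats import qmc

# Hypothetical search space: parameter name -> (lower bound, upper bound).
search_space = {"uptake_rate": (0.0, 1.0), "cycle_rate": (1e-4, 1e-2)}

# Scrambled Sobol points in the unit hypercube; 2**3 = 8 points.
sampler = qmc.Sobol(d=len(search_space), scramble=True, seed=42)
unit_samples = sampler.random_base2(m=3)

# Rescale from [0, 1) to the parameter bounds.
lower = [bounds[0] for bounds in search_space.values()]
upper = [bounds[1] for bounds in search_space.values()]
scaled = qmc.scale(unit_samples, lower, upper)

# One dictionary of parameter values per sample.
samples = [dict(zip(search_space, row)) for row in scaled]
print(len(samples), samples[0])
```

Using a power-of-two number of points preserves the balance properties of the Sobol sequence, which is why random_base2 is used here.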

load_existing_data() tuple[source]

Load existing data from the database for resume functionality.

Returns:

A tuple containing training data tensors, latest iteration, and hypervolume.

Return type:

tuple

update_bo_iterations(additional_iterations: int)[source]

Update the number of BO iterations to include additional iterations for resume.

analyze_convergence(hvs_list: list, train_obj: Any, train_obj_std: Any, train_x: Any, iteration: int) dict[source]

Noise-aware convergence analysis that distinguishes between:

  1. True convergence (optimization found optimal solutions)

  2. Noise-limited convergence (converged given noise level)

  3. Stagnation with good coverage (likely converged to optimal region)

  4. Stagnation with poor coverage (stuck in suboptimal region)

  5. Still in progress

Parameters:
  • hvs_list (list) – History of hypervolume values

  • train_obj (torch.Tensor) – Objective values (used for Pareto analysis)

  • train_obj_std (torch.Tensor) – Standard deviation across replicates (noise estimate)

  • train_x (torch.Tensor) – Parameter values (normalized)

  • iteration (int) – Current iteration

Returns:

Convergence analysis results with recommendations

Return type:

dict

uq_physicell.bo.bo_context.single_objective_bayesian_optimization(calib_context, train_x, train_obj, train_obj_std, start_iteration)[source]

Single-objective Bayesian optimization loop.

uq_physicell.bo.bo_context.multi_objective_bayesian_optimization(calib_context, train_x, train_obj, train_obj_std, start_iteration, latest_hypervolume, resume_from_db)[source]

Multi-objective Bayesian optimization loop.

uq_physicell.bo.bo_context.run_bayesian_optimization(calib_context: CalibrationContext, additional_iterations: int | None = None)[source]

Execute the complete Bayesian optimization process.

Parameters:
  • calib_context (CalibrationContext) – The calibration context containing all configuration

  • additional_iterations (Optional[int]) – Additional iterations for resume functionality

Note

The Bayesian Optimization module requires additional dependencies (botorch, gpytorch, torch). Install them with: pip install botorch gpytorch torch

Plotting and Visualization

uq_physicell.bo.plots.plot_parameter_space(df_samples: DataFrame, df_param_space: DataFrame, params: dict = None, real_value: dict = None, axis=None)[source]

Plot the parameter space from the samples DataFrame.

Parameters:
  • df_samples – DataFrame containing the samples.

  • df_param_space – DataFrame defining the search space for each parameter.

  • params – Dictionary with parameter names as keys and their best values as values (optional).

  • real_value – Dictionary with real parameter values to plot (optional).

  • axis – Matplotlib axis to plot on (optional).

Returns:

Matplotlib figure and axis if axis is None, otherwise draw in the given axis.

uq_physicell.bo.plots.plot_parameter_space_db(db_file: str, params: dict = None, real_value: dict = None, axis=None)[source]

Plot the parameter space from the database file.

Parameters:
  • db_file – Path to the database file.

  • params – Dictionary with parameter names as keys and their best values as values (optional).

  • real_value – Dictionary with real parameter values to plot (optional).

  • axis – Matplotlib axis to plot on (optional).

Returns:

Matplotlib figure and axis if axis is None, otherwise draw in the given axis.

uq_physicell.bo.plots.plot_parameter_vs_fitness(df_samples: DataFrame, df_output: DataFrame, parameter_name: str, qoi_name: str, samples_id=None, axis=None)[source]

Plot the parameter values against the fitness values.

Parameters:
  • df_samples – DataFrame containing the samples.

  • df_output – DataFrame containing the output of the analysis.

  • parameter_name – Name of the parameter to plot.

  • qoi_name – Name of the QoI to plot against the parameter.

  • samples_id – List of sample IDs to highlight (optional).

  • axis – Matplotlib axis to plot on (optional).

Returns:

Matplotlib figure and axis if axis is None, otherwise draw in the given axis.

uq_physicell.bo.plots.plot_pareto_front(df_output: DataFrame, qoi_name1: str, qoi_name2: str, samples_id=None, axis=None, plot_std=False)[source]

Plot the Pareto front of the fitness values.

Parameters:
  • df_output – DataFrame containing the output of the analysis.

  • qoi_name1 – Name of the QoI to plot on the x-axis.

  • qoi_name2 – Name of the QoI to plot on the y-axis.

  • samples_id – List of sample IDs to highlight (optional).

  • axis – Matplotlib axis to plot on (optional).

Returns:

Matplotlib figure and axis if axis is None, otherwise draw in the given axis.
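
For readers unfamiliar with the concept, a Pareto front for two objectives can be extracted with a simple non-dominated filter. The sketch below assumes both objectives are minimized and uses made-up points; it is not the selection logic used by plot_pareto_front.

```python
def pareto_front(points):
    """Return the points not dominated by any other (both objectives minimized)."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return sorted(front)


# Hypothetical (objective 1, objective 2) values for five samples.
points = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 2.0)]
front = pareto_front(points)
print(front)
```

Here (3.0, 4.0) is dominated by (2.0, 3.0) and (5.0, 2.0) by (4.0, 1.0), so three points remain on the front.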

uq_physicell.bo.plots.plot_parameter_vs_fitness_db(db_file: str, parameter_name: str, qoi_name: str, axis=None)[source]

Plot the parameter space against the fitness values from the database file.

Parameters:
  • db_file – Path to the database file.

  • parameter_name – Name of the parameter to plot.

  • qoi_name – Name of the QoI to plot against the parameter.

  • axis – Matplotlib axis to plot on (optional).

Returns:

Matplotlib figure and axis if axis is None, otherwise draw in the given axis.

uq_physicell.bo.plots.plot_qoi_param(df_ObsData: DataFrame, df_output: DataFrame, samples_id: list, x_var: str, y_var: str, axis=None, swarmplot=False, plot_residuals=False)[source]

Plot the QoI parameter space from the observed and output DataFrames.

Parameters:
  • df_ObsData – Observed QoI DataFrame.

  • df_output – Output DataFrame.

  • samples_id – List of Sample IDs to plot.

  • x_var – Variable to plot on the x-axis.

  • y_var – Variable to plot on the y-axis.

  • axis – Matplotlib axis to plot on (optional).

  • swarmplot – Whether to use swarmplot instead of scatterplot.

  • plot_residuals – Whether to plot residuals.

Returns:

Matplotlib figure and axis if axis is None, otherwise draw in the given axis.

uq_physicell.bo.plots.plot_qoi_param_db(db_file: str, samples_id: list, x_var: str = None, y_var: str = None, axis=None)[source]

Plot the QoI parameter space from the database file.

Parameters:
  • db_file – Path to the database file.

  • samples_id – List of Sample IDs to plot.

  • x_var – Variable to plot on the x-axis (optional).

  • y_var – Variable to plot on the y-axis (optional).

  • axis – Matplotlib axis to plot on (optional).

Returns:

Matplotlib figure and axis if axis is None, otherwise draw in the given axis.

Approximate Bayesian Computation Module

class uq_physicell.abc.abc_context.CalibrationContext(db_path: str, obsData: str | dict, obsData_columns: dict, model_config: dict, qoi_functions: dict, distance_functions: dict, prior: pyabc.Distribution, abc_options: dict, qoi_def: dict = {}, logger: Logger | None = None)[source]

Bases: object

Context for Approximate Bayesian Computation (ABC) calibration using pyABC.

This class encapsulates all necessary parameters and configurations for model calibration using ABC-SMC with sophisticated handling of multiple models, parallel computation, and adaptive strategies.

Parameters:
  • db_path (str) – Path to the database file for storing and retrieving calibration results.

  • obsData (str or dict) – Path to observed data CSV file or dictionary containing observed data.

  • obsData_columns (dict) – Dictionary mapping QoI names to their corresponding columns in the observed data.

  • model_config (dict) – Configuration dictionary for the PhysiCell model, including paths and structure names.

  • qoi_functions (dict) – Dictionary of functions to compute quantities of interest (QoIs) from model outputs.

  • qoi_def (dict) – Dictionary mapping names to first-class objects (e.g., functions) that can be referenced inside the qoi_functions lambda strings.

  • distance_functions (dict) – Dictionary of distance functions with their weights for comparing model outputs to observed data.

  • prior (Distribution) – Distribution defining the prior distributions for parameters

  • abc_options (dict) – Options for ABC-SMC including population parameters, sampling strategies, and convergence criteria.

  • logger (logging.Logger) – Logger instance for logging messages during the calibration process.

setup_sampler(cluster_setup_func=None)[source]

Setup the pyABC sampler based on configuration.

setup_population_strategy()[source]

Setup population size strategy.

setup_distance_function(distances_dict)[source]

setup_transition_function()[source]

Setup transition function for ABC-SMC.

setup_epsilon_function()[source]

Setup epsilon (tolerance) function for ABC-SMC.

create_model_wrapper(fixed_params_dict, workers_inner=None)[source]

Create wrapper function for PhysiCell model evaluation.

setup_abc_smc(models_list, priors_list, distance_function, population_size, transitions_func, my_sampler, eps_function)[source]

Setup the ABC-SMC object.

load_or_create_database(abc_smc, abc_id=1)[source]

Load existing database or create new one.

run_calibration(abc_smc, resume_db=False, current_populations=0, current_simulations=0)[source]

Run the ABC-SMC calibration.

check_convergence(abc_smc)[source]

Check convergence criteria.

include_additional_metadata(**metadata)[source]

Include additional metadata in the database.

uq_physicell.abc.abc_context.run_abc_calibration(calib_context: CalibrationContext) pyabc.History[source]

Execute the complete ABC-SMC calibration process.

This function orchestrates the entire ABC-SMC workflow using the CalibrationContext, including sampler setup, distance function configuration, model wrapper creation, and calibration execution with convergence checking.

Parameters:

calib_context (CalibrationContext) – The calibration context containing all configuration

Returns:

The pyABC History object containing calibration results

Return type:

History
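
As conceptual background, the core accept/reject idea behind ABC can be shown with a plain rejection sampler. Note that this is deliberately simpler than ABC-SMC and does not use pyABC; the toy model, prior, tolerance, and function names are all hypothetical.

```python
import random


def abc_rejection(observed, simulate, prior_sample, distance, epsilon, n_accept, seed=0):
    """Plain ABC rejection: keep prior draws whose simulation lands near the data."""
    rng = random.Random(seed)
    accepted = []
    while len(accepted) < n_accept:
        theta = prior_sample(rng)
        simulated = simulate(theta, rng)
        if distance(simulated, observed) <= epsilon:
            accepted.append(theta)
    return accepted


# Toy model: output = 2 * theta + Gaussian noise; the observed value is 10.
posterior = abc_rejection(
    observed=10.0,
    simulate=lambda theta, rng: 2.0 * theta + rng.gauss(0.0, 0.1),
    prior_sample=lambda rng: rng.uniform(0.0, 10.0),
    distance=lambda sim, obs: abs(sim - obs),
    epsilon=0.5,
    n_accept=50,
)
print(sum(posterior) / len(posterior))
```

ABC-SMC refines this idea by shrinking the tolerance over successive populations and proposing new parameters from the previously accepted ones.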

Note

The Approximate Bayesian Computation module requires an additional dependency (pyabc). Install it with: pip install pyabc

Database

Model Analysis

uq_physicell.database.ma_db.create_structure(db_file: str)[source]

Create the SQLite database structure for storing simulation analysis results.

This function initializes a SQLite database with five tables designed to store all components of a sensitivity analysis or uncertainty quantification study.

Parameters:

db_file (str) – Path to the SQLite database file to be created.

Tables Created:
  • Metadata: Stores simulation metadata (sampler, config path, model structure, uq_physicell version, pcdl version)

  • ParameterSpace: Stores parameter definitions (name, bounds, reference values)

  • QoIs: Stores quantities of interest definitions (name, function)

  • Samples: Stores parameter samples (sample ID, parameter name, value)

  • Output: Stores simulation results (sample ID, replicate ID, serialized data)

Note

If the database file already exists, it will be removed and recreated to ensure a clean structure.

Example

>>> create_structure('sensitivity_analysis.db')
>>> # Database created with all required tables
uq_physicell.database.ma_db.insert_metadata(db_file: str, sampler: str, ini_file_path: str, strucName: str)[source]

Insert metadata information into the Metadata table.

Parameters:
  • db_file (str) – Path to the SQLite database file.

  • sampler (str) – Name of the sampling method used (e.g., ‘Sobol’, ‘LHS’, ‘Morris’).

  • ini_file_path (str) – Path to the PhysiCell configuration (.ini) file.

  • strucName (str) – Name or identifier of the model structure used.

Raises:

sqlite3.Error – If database connection or insertion fails.

Example

>>> insert_metadata('study.db', 'Sobol', 'config/params.ini', 'tumor_growth')
uq_physicell.database.ma_db.insert_param_space(db_file: str, params_dict: dict)[source]

Insert parameter space information into the ParameterSpace table.

Parameters:
  • db_file (str) – Path to the SQLite database file.

  • params_dict (dict) – Dictionary containing parameter names and their properties. Each parameter should have keys: ‘lower_bound’, ‘upper_bound’, ‘ref_value’, and ‘perturbation’.

Raises:

RuntimeError – If parameter insertion fails due to database errors.

Example

>>> params = {
...     'param1': {
...         'lower_bound': 0.0,
...         'upper_bound': 1.0,
...         'ref_value': 0.5,
...         'perturbation': 0.1
...     }
... }
>>> insert_param_space('study.db', params)
uq_physicell.database.ma_db.insert_qois(db_file: str, qois_dic: dict)[source]

Insert quantities of interest (QoIs) into the QoIs table.

Parameters:
  • db_file (str) – Path to the SQLite database file.

  • qois_dic (dict) – Dictionary of QoIs where keys are QoI names and values are string representations of functions.

Raises:

RuntimeError – If QoI insertion fails due to database errors.

Example

>>> qois = {
...     'total_cells': "lambda data: data['cell_count'].sum()",
...     'max_radius': "lambda data: data['radius'].max()"
... }
>>> insert_qois('study.db', qois)
uq_physicell.database.ma_db.insert_samples(db_file: str, dic_samples: dict)[source]

Insert sample parameters into the Samples table.

Parameters:
  • db_file (str) – Path to the SQLite database file.

  • dic_samples (dict) – Nested dictionary where outer keys are sample IDs and inner dictionaries contain parameter names and values.

Raises:

RuntimeError – If sample insertion fails due to database errors.

Example

>>> samples = {
...     0: {'param1': 0.5, 'param2': 1.2},
...     1: {'param1': 0.8, 'param2': 0.9}
... }
>>> insert_samples('study.db', samples)
uq_physicell.database.ma_db.insert_output(db_file: str, sample_id: int, replicate_id: int, result_data: bytes, compress: bool = True)[source]

Insert simulation results into the Output table.

Parameters:
  • db_file (str) – Path to the SQLite database file.

  • sample_id (int) – Unique identifier for the parameter sample.

  • replicate_id (int) – Identifier for the simulation replicate.

  • result_data (bytes) – Serialized simulation results data.

  • compress (bool) – If True, compress data with zstd before storing (default True).

Raises:

RuntimeError – If output insertion fails due to database errors.

Example

>>> import pickle
>>> import pandas as pd
>>> data = pd.DataFrame({'time': [0, 1, 2], 'cells': [1, 2, 3]})
>>> serialized_data = pickle.dumps(data)
>>> insert_output('study.db', 0, 1, serialized_data)
>>>
>>> # Request compression explicitly (this is already the default)
>>> insert_output('study.db', 0, 1, serialized_data, compress=True)
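
The serialize-then-compress round trip can be illustrated with the standard library alone. In this sketch zlib is a stand-in for zstd (which needs a third-party package), and the result dictionary is made up; it only shows the general pattern, not the storage format used by insert_output.

```python
import pickle
import zlib

# Hypothetical simulation result.
result = {"time": [0, 1, 2], "cells": [1, 2, 3]}

# Serialize to bytes, then compress before storing as a BLOB.
serialized = pickle.dumps(result)
compressed = zlib.compress(serialized)  # zlib used here in place of zstd

# Reading back reverses both steps.
restored = pickle.loads(zlib.decompress(compressed))
print(restored == result)
```
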
uq_physicell.database.ma_db.load_metadata(db_file: str) DataFrame[source]

Load metadata from the database.

Parameters:

db_file (str) – Path to the SQLite database file.

Returns:

DataFrame with columns [‘Sampler’, ‘Ini_File_Path’, ‘StructureName’].

Return type:

pd.DataFrame

Raises:

sqlite3.Error – If database connection or query fails.

Example

>>> df_metadata = load_metadata('study.db')
>>> print(df_metadata['Sampler'].values[0])
uq_physicell.database.ma_db.load_parameter_space(db_file: str) DataFrame[source]

Load parameter space from the database.

Parameters:

db_file (str) – Path to the SQLite database file.

Returns:

DataFrame with columns [‘ParamName’, ‘lower_bound’, ‘upper_bound’, ‘ref_value’, ‘perturbation’]. The perturbation column is converted from string to numpy array.

Return type:

pd.DataFrame

Raises:

sqlite3.Error – If database connection or query fails.

Example

>>> df_params = load_parameter_space('study.db')
>>> print(df_params[['ParamName', 'ReferenceValue']])
uq_physicell.database.ma_db.load_qois(db_file: str) DataFrame[source]

Load quantities of interest (QoIs) from the database.

Parameters:

db_file (str) – Path to the SQLite database file.

Returns:

DataFrame with columns [‘QOI_Name’, ‘QOI_Function’].

Returns DataFrame with None values if no QoIs are defined.

Return type:

pd.DataFrame

Raises:

sqlite3.Error – If database connection or query fails.

Example

>>> df_qois = load_qois('study.db')
>>> print(df_qois['QOI_Name'].to_list())
uq_physicell.database.ma_db.load_samples(db_file: str) dict[source]

Load parameter samples from the database.

Parameters:

db_file (str) – Path to the SQLite database file.

Returns:

Dictionary where keys are sample IDs and values are dictionaries of parameter names and values.

Return type:

dict

Raises:

sqlite3.Error – If database connection or query fails.

Example

>>> dic_samples = load_samples('study.db')
>>> print(f"Sample 0 parameters: {dic_samples[0]}")
uq_physicell.database.ma_db.load_output(db_file: str, sample_ids: list = None, replicate_ids: list = None, load_data: bool = True) DataFrame[source]

Load simulation output from the database with flexible filtering options.

This function allows selective loading of simulation results, which is useful for memory efficiency and performance when working with large databases.

Parameters:
  • db_file (str) – Path to the SQLite database file.

  • sample_ids (list, optional) – List of specific sample IDs to load. If None, loads all samples.

  • replicate_ids (list, optional) – List of specific replicate IDs to load. If None, loads all replicates.

  • load_data (bool, optional) – If True, deserializes the Data column. If False, only loads SampleID and ReplicateID. Default is True.

Returns:

DataFrame with columns [‘SampleID’, ‘ReplicateID’, ‘Data’] if load_data=True, or [‘SampleID’, ‘ReplicateID’] if load_data=False. When load_data=True, the Data column contains deserialized objects.

Return type:

pd.DataFrame

Raises:

Examples

>>> # Load only metadata (SampleID, ReplicateID)
>>> df = load_output('study.db', load_data=False)
>>>
>>> # Load specific sample
>>> df = load_output('study.db', sample_ids=[0, 1])
>>>
>>> # Load specific replicate across all samples
>>> df = load_output('study.db', replicate_ids=[0])
>>>
>>> # Load specific sample and replicate combination
>>> df = load_output('study.db', sample_ids=[5], replicate_ids=[0])
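
The kind of filtering described above can be expressed as a parameterized SQL query. The sketch below builds a toy Output table in memory, with column names taken from the Returns description; the helper select_output_ids is hypothetical and the query issued by load_output may be different.

```python
import pickle
import sqlite3


def select_output_ids(conn, sample_ids=None, replicate_ids=None):
    """Hypothetical helper: list (SampleID, ReplicateID) pairs matching the filters."""
    query = "SELECT SampleID, ReplicateID FROM Output"
    clauses, args = [], []
    for column, ids in (("SampleID", sample_ids), ("ReplicateID", replicate_ids)):
        if ids is not None:
            placeholders = ",".join("?" for _ in ids)
            clauses.append(f"{column} IN ({placeholders})")
            args.extend(ids)
    if clauses:
        query += " WHERE " + " AND ".join(clauses)
    query += " ORDER BY SampleID, ReplicateID"
    return conn.execute(query, args).fetchall()


# Toy database: 3 samples x 2 replicates with pickled payloads.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Output (SampleID INTEGER, ReplicateID INTEGER, Data BLOB)")
rows = [(s, r, pickle.dumps({"cells": s + r})) for s in range(3) for r in range(2)]
conn.executemany("INSERT INTO Output VALUES (?, ?, ?)", rows)

subset = select_output_ids(conn, sample_ids=[0, 1], replicate_ids=[0])
everything = select_output_ids(conn)
print(subset)
```

Selecting only the ID columns avoids reading and deserializing the Data blobs, which is the memory saving that load_data=False is meant to provide.
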
uq_physicell.database.ma_db.load_data_unserialized(db_file: str, sample_ids: list = None, replicate_ids: list = None) DataFrame[source]

Load and expand simulation output into time-series columns based on QoIs.

This function loads the output data and, if QoIs are defined, expands each QoI and time array into separate columns (e.g., ‘qoi_0’, ‘qoi_1’, ‘time_0’, ‘time_1’). If no QoIs are defined, returns the raw deserialized data.

Parameters:
  • db_file (str) – Path to the SQLite database file.

  • sample_ids (list, optional) – List of specific sample IDs to load. If None, loads all samples.

  • replicate_ids (list, optional) – List of specific replicate IDs to load. If None, loads all replicates.

Returns:

DataFrame with SampleID, ReplicateID, and either:
  • Expanded QoI time-series columns if QoIs are defined

  • Raw Data column if no QoIs are defined

Return type:

pd.DataFrame

Raises:

RuntimeError – If data loading or processing fails.

Examples

>>> # Load all data with QoI expansion
>>> df = load_data_unserialized('study.db')
>>> print(df.columns)  # ['SampleID', 'ReplicateID', 'total_cells_0', 'time_0', ...]
>>>
>>> # Load specific sample
>>> df = load_data_unserialized('study.db', sample_ids=[0])
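
The column expansion can be pictured with a small pandas sketch. The records, QoI name, and helper expand_qoi_columns are hypothetical; the resulting column order in the library may differ.

```python
import pandas as pd


def expand_qoi_columns(record: dict) -> dict:
    """Hypothetical helper: flatten {'qoi': [v0, v1]} into {'qoi_0': v0, 'qoi_1': v1}."""
    flat = {}
    for name, values in record.items():
        for i, value in enumerate(values):
            flat[f"{name}_{i}"] = value
    return flat


# Hypothetical deserialized output: one time series per replicate.
raw = [
    {"SampleID": 0, "ReplicateID": 0, "Data": {"time": [0.0, 60.0], "total_cells": [10, 25]}},
    {"SampleID": 0, "ReplicateID": 1, "Data": {"time": [0.0, 60.0], "total_cells": [10, 22]}},
]

df = pd.DataFrame(
    [
        {"SampleID": r["SampleID"], "ReplicateID": r["ReplicateID"], **expand_qoi_columns(r["Data"])}
        for r in raw
    ]
)
print(list(df.columns))
```
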
uq_physicell.database.ma_db.load_structure(db_file: str, load_result: bool = True) tuple[source]

Load the complete database structure and return all data components.

This is a convenience wrapper function that calls all modular load functions. For more control over what data is loaded, use the individual load functions:

  • load_metadata(db_file)

  • load_parameter_space(db_file)

  • load_qois(db_file)

  • load_samples(db_file)

  • load_output(db_file, sample_ids=None, replicate_ids=None, load_data=True)

  • load_data_unserialized(db_file, sample_ids=None, replicate_ids=None)

Parameters:
  • db_file (str) – Path to the SQLite database file.

  • load_result (bool, optional) – If True, loads and deserializes all output data with QoI expansion. If False, only loads SampleID and ReplicateID. Default is True.

Returns:

A 5-tuple containing:
  • df_metadata (pd.DataFrame): Metadata information (sampler, config, structure, uq_physicell version, pcdl version)

  • df_parameter_space (pd.DataFrame): Parameter space definitions

  • df_qois (pd.DataFrame): Quantities of interest definitions

  • dic_samples (dict): Dictionary of parameter samples by sample ID

  • df_results (pd.DataFrame): Simulation results (expanded or metadata only)

Return type:

tuple

Raises:

RuntimeError – If database loading fails.

Examples

>>> # Load everything with full data
>>> metadata, params, qois, samples, results = load_structure('study.db')
>>> print(f"Loaded {len(samples)} samples with {len(results)} results")
>>>
>>> # Load only metadata (no deserialization)
>>> metadata, params, qois, samples, ids = load_structure('study.db', load_result=False)
>>> print(f"Found {len(ids)} simulation results")
uq_physicell.database.ma_db.check_db_consistency(db_file)[source]

Check database consistency for Samples and Output.

This function verifies that all samples defined in the Samples table have corresponding entries in the Output table. It returns the missing entries.

Parameters:

db_file (str) – Path to the SQLite database file.
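
Conceptually, the check is a set difference between the runs that were expected and the runs that were stored. The sketch below assumes replicate IDs run from 0 to num_replicates - 1; the helper name and data are hypothetical, and the format of the value returned by check_db_consistency is not specified here.

```python
def missing_runs(sample_ids, num_replicates, completed):
    """Hypothetical helper: expected (sample, replicate) pairs with no stored output."""
    expected = {(s, r) for s in sample_ids for r in range(num_replicates)}
    return sorted(expected - set(completed))


# Three samples with two replicates each; two runs never finished.
completed = [(0, 0), (0, 1), (1, 0), (2, 1)]
missing = missing_runs([0, 1, 2], 2, completed)
print(missing)
```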

uq_physicell.database.ma_db.check_simulations_db(PhysiCellModel: PhysiCell_Model, sampler: str, param_dict: dict, dic_samples: dict, qois_dic: dict, db_file: str) tuple[source]

Check database existence and identify missing simulations.

This function verifies if a database exists and contains all required simulation results. It compares the expected samples against completed simulations to identify any missing runs.

Parameters:
  • PhysiCellModel (PhysiCell_Model) – The PhysiCell model instance for simulations.

  • sampler (str) – Name of the sampling method used (e.g., ‘Sobol’, ‘LHS’).

  • param_dict (dict) – Parameter space definition with bounds and reference values.

  • dic_samples (dict) – Dictionary of parameter samples to be simulated.

  • qois_dic (dict) – Dictionary of quantities of interest definitions.

  • db_file (str) – Path to the SQLite database file.

Returns:

A 2-tuple containing:
  • exist_db (bool): True if database exists and contains valid structure

  • parameters_missing (list): List of dictionaries for missing parameter sets

Return type:

tuple

Example

>>> # 'model_key' is a placeholder for the keyModel entry in your .ini file
>>> model = PhysiCell_Model('config.ini', 'model_key')
>>> exists, missing = check_simulations_db(
...     model, 'Sobol', params, samples, qois, 'study.db'
... )
>>> print(f"Database exists: {exists}, Missing: {len(missing)} simulations")
uq_physicell.database.ma_db.get_database_type(db_file: str) bool[source]

Determine the type of analysis database (Model Analysis or Bayesian Optimization).

This function examines the database structure to identify whether it contains Model Analysis (MA) or Bayesian Optimization (BO) data based on table schemas.

Parameters:

db_file (str) – Path to the SQLite database file to examine.

Returns:

Returns ‘MA’ for Model Analysis, ‘BO’ for Bayesian Optimization, or None if the database type cannot be determined.

Return type:

str or None

Example

>>> db_type = get_database_type('analysis.db')
>>> if db_type == 'MA':
...     print("This is a Model Analysis database")
... elif db_type == 'BO':
...     print("This is a Bayesian Optimization database")
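
One way such a check could work is by inspecting table names, since the BO schema documented in this reference includes a GP_Models table that the Model Analysis schema lacks. The following is an illustrative heuristic only, with a hypothetical function name and placeholder table columns; the actual detection logic of get_database_type is not shown in this reference.

```python
import sqlite3


def guess_database_type(conn):
    """Guess 'MA' or 'BO' from table names (illustrative heuristic only)."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    tables = {name for (name,) in rows}
    if "GP_Models" in tables:
        return "BO"
    if {"Samples", "Output"} <= tables:
        return "MA"
    return None


# Toy in-memory databases that contain only the table names.
ma_conn = sqlite3.connect(":memory:")
for table in ("Metadata", "ParameterSpace", "QoIs", "Samples", "Output"):
    ma_conn.execute(f"CREATE TABLE {table} (id INTEGER)")

bo_conn = sqlite3.connect(":memory:")
for table in ("Metadata", "ParameterSpace", "QoIs", "GP_Models", "Samples", "Output"):
    bo_conn.execute(f"CREATE TABLE {table} (id INTEGER)")

empty_conn = sqlite3.connect(":memory:")

db_types = [guess_database_type(c) for c in (ma_conn, bo_conn, empty_conn)]
print(db_types)
```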

Bayesian Optimization

uq_physicell.database.bo_db.create_structure(db_path: str)[source]

Create the SQLite database structure for storing Bayesian Optimization (BO) calibration data. This function initializes the database with the necessary tables to store metadata, parameter space definitions, quantities of interest (QoIs), Gaussian Process models, samples, and simulation output.

Parameters:

db_path (str) – Path to the SQLite database file.

Tables Created:
  • Metadata: Stores information about the calibration (method, observed data path, .ini config path, model structure name, uq_physicell_version, pcdl_version, botorch_version).

  • ParameterSpace: Stores the parameter space information (ParamName, Type, Lower_Bound, Upper_Bound, Regulates).

  • QoIs: Stores the quantities of interest (QoI_Name, QoI_Function, ObsData_Column, QoI_distanceFunction, QoI_distanceWeight).

  • GP_Models: Stores the Gaussian Process models (IterationID, GP_Model, Hypervolume).

  • Samples: Stores the samples (IterationID, SampleID, ParamName, ParamValue).

  • Output: Stores the output of the simulations (SampleID, Objective_Function, Noise_Std, and Data).

Example

>>> create_structure('calibration.db')
>>> # Database created with the necessary tables for BO calibration.
uq_physicell.database.bo_db.insert_metadata(db_path: str, metadata: dict)[source]

Insert BO metadata information into the Metadata table.

Parameters:
  • db_path (str) – Path to the database file.

  • metadata (dict) – Dictionary containing BO metadata information.

Example

>>> metadata = {
...     'BO_Method': 'Bayesian Optimization',
...     'ObsData_Path': 'observed_data.csv',
...     'Ini_File_Path': 'config.ini',
...     'StructureName': 'PhysiCell'
... }
>>> insert_metadata('calibration.db', metadata)
# Metadata inserted into the database.
uq_physicell.database.bo_db.insert_param_space(db_path: str, param_space: dict)[source]

Insert BO parameter space information into the ParameterSpace table.

Parameters:
  • db_path (str) – Path to the database file.

  • param_space (dict) – Dictionary containing parameter space information.

Example

>>> param_space = {
...     'param1': {'type': 'real', 'lower_bound': 0.0, 'upper_bound': 1.0},
...     'param2': {'type': 'real', 'lower_bound': 1.0, 'upper_bound': 5.0}
... }
>>> insert_param_space('calibration.db', param_space)
# Parameter space information inserted into the database.
uq_physicell.database.bo_db.insert_qois(db_path: str, qois: dict)[source]

Insert QoIs into the QoIs table.

Parameters:
  • db_path (str) – Path to the database file.

  • qois (dict) – Dictionary of QoIs (keys as names, values as lambda functions or strings).

Example

>>> qois = {
...     'total_cells': "lambda data: data['cell_count'].sum()",
...     'max_radius': "lambda data: data['radius'].max()"
... }
>>> insert_qois('calibration.db', qois)
# QoIs inserted into the database.
uq_physicell.database.bo_db.insert_gp_models(db_path: str, iteration_id: int, gp_model: Any, hypervolume: float)[source]

Insert Gaussian Process model into the GP_Models table.

Parameters:
  • db_path (str) – Path to the database file.

  • iteration_id (int) – The iteration ID for the GP model.

  • gp_model (ModelListGP) – The Gaussian Process model to be stored.

  • hypervolume (float) – The hypervolume value for this iteration.

Example

>>> from botorch.models import SingleTaskGP
>>> from botorch.models.model_list_gp_regression import ModelListGP
>>> import torch
>>> # Create a simple GP model for demonstration
>>> X = torch.rand(10, 1)
>>> Y = torch.sin(X * 2 * torch.pi) + 0.1 * torch.randn_like(X)
>>> gp = SingleTaskGP(X, Y)
>>> model_list = ModelListGP(gp)
>>> insert_gp_models('calibration.db', iteration_id=0, gp_model=model_list, hypervolume=0.5)
uq_physicell.database.bo_db.insert_samples(db_path: str, iteration_id: int, samples: dict)[source]

Insert samples into the Samples table.

Parameters:
  • db_path (str) – Path to the database file.

  • iteration_id (int) – The iteration ID for the samples.

  • samples (dict) – Dictionary of samples (keys as SampleID, values as dictionaries of ParamName and ParamValue).
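Because the Samples table is described with the columns IterationID, SampleID, ParamName, and ParamValue, a nested samples dictionary corresponds to one row per (sample, parameter) pair. The sketch below shows that flattening in plain Python; the helper samples_to_rows is hypothetical and is not part of uq_physicell.

```python
def samples_to_rows(iteration_id, samples):
    """Flatten {SampleID: {ParamName: ParamValue}} into long-format row tuples."""
    rows = []
    for sample_id, params in samples.items():
        for param_name, param_value in params.items():
            rows.append((iteration_id, sample_id, param_name, param_value))
    return rows


samples = {
    1: {"param1": 0.5, "param2": 1.0},
    2: {"param1": 0.3, "param2": 1.2},
}
rows = samples_to_rows(0, samples)
for row in rows:
    print(row)
```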

Example

>>> samples = {
...     1: {'param1': 0.5, 'param2': 1.0},
...     2: {'param1': 0.3, 'param2': 1.2}
... }
>>> insert_samples('calibration.db', iteration_id=0, samples=samples)
uq_physicell.database.bo_db.insert_output(db_path: str, sample_id: int, obj_func: bytes, noise_std: bytes, data: bytes)[source]

Insert simulation results into the Output table.

Parameters:
  • db_path (str) – Path to the database file.

  • sample_id (int) – The sample ID.

  • obj_func (bytes) – The objective function values (as binary).

  • noise_std (bytes) – The noise standard deviation values (as binary).

  • data (bytes) – The simulation results data (as binary).
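The three byte arguments expect values that have already been serialized. This reference does not state which serialization format the library uses, so the following is only a sketch that assumes Python's pickle module; the dictionary contents are hypothetical.

```python
import pickle

# Hypothetical objective values and noise estimates for one sample.
obj_values = {"total_cells": 12.5}
noise_values = {"total_cells": 0.8}

# Serialize to bytes before storing (assumption: pickle is an acceptable format).
obj_blob = pickle.dumps(obj_values)
noise_blob = pickle.dumps(noise_values)

# Round trip to confirm that the stored bytes can be restored.
restored = pickle.loads(obj_blob)
print(type(obj_blob).__name__, restored)
```

Only unpickle data from sources you trust, since unpickling can execute arbitrary code.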

Example

>>> insert_output('calibration.db', sample_id=1, obj_func=b'...', noise_std=b'...', data=b'...')
uq_physicell.database.bo_db.load_metadata(db_file: str) DataFrame[source]

Load metadata from the BO database.

Parameters:

db_file (str) – Path to the SQLite database file.

Returns:

DataFrame with metadata information.

Return type:

pd.DataFrame

Raises:

sqlite3.Error – If database connection or query fails.

Example

>>> df_metadata = load_metadata('calibration.db')
>>> print(df_metadata['BO_Method'].values[0])
uq_physicell.database.bo_db.load_parameter_space(db_file: str) DataFrame[source]

Load parameter space from the BO database.

Parameters:

db_file (str) – Path to the SQLite database file.

Returns:

DataFrame with columns [‘ParamName’, ‘type’, ‘lower_bound’, ‘upper_bound’, ‘regulates’].

Return type:

pd.DataFrame

Raises:

sqlite3.Error – If database connection or query fails.

Example

>>> df_params = load_parameter_space('calibration.db')
>>> print(df_params[['ParamName', 'lower_bound', 'upper_bound']])
uq_physicell.database.bo_db.load_qois(db_file: str) DataFrame[source]

Load quantities of interest (QoIs) from the BO database.

Parameters:

db_file (str) – Path to the SQLite database file.

Returns:

DataFrame with columns [‘QoI_Name’, ‘QoI_Type’, ‘ObsData_Column’, ‘QoI_distanceFunction’, ‘QoI_distanceWeight’].

Return type:

pd.DataFrame

Raises:

sqlite3.Error – If database connection or query fails.

Example

>>> df_qois = load_qois('calibration.db')
>>> print(df_qois['QoI_Name'].to_list())
uq_physicell.database.bo_db.load_gp_models(db_file: str) DataFrame[source]

Load Gaussian Process models from the BO database.

Parameters:

db_file (str) – Path to the SQLite database file.

Returns:

DataFrame with columns [‘IterationID’, ‘GP_Model’, ‘Hypervolume’].

GP_Model column contains deserialized torch model objects.

Return type:

pd.DataFrame

Raises:

sqlite3.Error – If database connection or query fails.

Example

>>> df_gp_models = load_gp_models('calibration.db')
>>> print(f"Loaded {len(df_gp_models)} GP models")
uq_physicell.database.bo_db.load_samples(db_file: str, iteration_ids: list = None) DataFrame[source]

Load parameter samples from the BO database.

Parameters:
  • db_file (str) – Path to the SQLite database file.

  • iteration_ids (list, optional) – List of specific iteration IDs to load. If None, loads all iterations.

Returns:

DataFrame with columns [‘IterationID’, ‘SampleID’, ‘ParamName’, ‘ParamValue’].

Return type:

pd.DataFrame

Raises:

sqlite3.Error – If database connection or query fails.
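The returned DataFrame is in long format, with one row per sample and parameter. A common follow-up step is to pivot it so that each sample becomes one row. The sketch below builds a small stand-in DataFrame with the documented columns instead of reading a real database; the values are made up.

```python
import pandas as pd

# Stand-in for the result of load_samples, using the documented column names.
df_samples = pd.DataFrame(
    {
        "IterationID": [0, 0, 0, 0],
        "SampleID": [1, 1, 2, 2],
        "ParamName": ["param1", "param2", "param1", "param2"],
        "ParamValue": [0.5, 1.0, 0.3, 1.2],
    }
)

# One row per SampleID, one column per parameter.
wide = df_samples.pivot(index="SampleID", columns="ParamName", values="ParamValue")
print(wide)
```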

Example

>>> df_samples = load_samples('calibration.db')
>>> # Load specific iterations
>>> df_samples = load_samples('calibration.db', iteration_ids=[0, 1, 2])
uq_physicell.database.bo_db.load_output(db_file: str, sample_ids: list = None, load_data: bool = True) DataFrame[source]

Load simulation output from the BO database.

Parameters:
  • db_file (str) – Path to the SQLite database file.

  • sample_ids (list, optional) – List of specific sample IDs to load. If None, loads all samples.

  • load_data (bool, optional) – If True, deserializes the ObjFunc, Noise_Std, and Data columns. If False, only loads SampleID metadata. Default is True.

Returns:

DataFrame with columns [‘SampleID’, ‘ObjFunc’, ‘Noise_Std’, ‘Data’] if load_data=True, or [‘SampleID’] if load_data=False.

Return type:

pd.DataFrame

Raises:

sqlite3.Error – If database connection or query fails.

Example

>>> # Load all output with deserialization
>>> df_output = load_output('calibration.db')
>>>
>>> # Load specific samples without deserialization
>>> df_output = load_output('calibration.db', sample_ids=[0, 1, 2], load_data=False)
uq_physicell.database.bo_db.load_structure(db_file: str, load_data: bool = True) tuple[source]

Load the complete BO database structure using modular load functions.

This is a convenience wrapper that loads all tables from the database. For more control over what data is loaded, use the individual load functions:

  • load_metadata(db_file)

  • load_parameter_space(db_file)

  • load_qois(db_file)

  • load_gp_models(db_file)

  • load_samples(db_file, iteration_ids=None)

  • load_output(db_file, sample_ids=None, load_data=True)

Parameters:
  • db_file (str) – Path to the SQLite database file.

  • load_data (bool, optional) – If True, deserializes GP models and output data. If False, only loads metadata without deserialization. Default is True.

Returns:

A 6-tuple containing:
  • df_metadata (pd.DataFrame): Metadata information

  • df_param_space (pd.DataFrame): Parameter space definitions

  • df_qois (pd.DataFrame): Quantities of interest definitions

  • df_gp_models (pd.DataFrame): Gaussian Process models

  • df_samples (pd.DataFrame): Parameter samples

  • df_output (pd.DataFrame): Simulation output

Return type:

tuple

Raises:

RuntimeError – If any database loading fails.

Example

>>> # Load everything with full data
>>> metadata, params, qois, gp_models, samples, output = load_structure('calibration.db')
>>>
>>> # Load only metadata (no deserialization)
>>> metadata, params, qois, gp_models, samples, output = load_structure('calibration.db', load_data=False)

Utils

Summary of simulation

uq_physicell.utils.sumstats.summ_func_FinalPopLiveDead(outputPath: str, summaryFile: str | None, dic_params: dict, SampleID: int, ReplicateID: int) DataFrame | None[source]

Summarize the final population of live and dead cells.

Parameters:
  • outputPath (str) – Path to the PhysiCell output directory.

  • summaryFile (str or None) – File to store the summary (optional).

  • dic_params (dict) – Dictionary of simulation parameters.

  • SampleID (int) – Unique identifier for the sample.

  • ReplicateID (int) – Unique identifier for the replicate.

Returns:

DataFrame with the computed QoIs, or None if the summary was saved to a file.

Return type:

pd.DataFrame or None

uq_physicell.utils.sumstats.summ_func_TimeSeriesPopLiveDead(outputPath: str, summaryFile: str | None, dic_params: dict, SampleID: int, ReplicateID: int) DataFrame | None[source]

Summarize the population of live and dead cells over time.

Parameters:
  • outputPath (str) – Path to the PhysiCell output directory.

  • summaryFile (str or None) – File to store the summary (optional).

  • dic_params (dict) – Dictionary of simulation parameters.

  • SampleID (int) – Unique identifier for the sample.

  • ReplicateID (int) – Unique identifier for the replicate.

Returns:

DataFrame with the computed QoIs, or None if the summary was saved to a file.

Return type:

pd.DataFrame or None

uq_physicell.utils.sumstats.safe_call_qoi_function(func: callable, mcds: pcdl.TimeStep | None = None, list_mcds: list | None = None)[source]

Safely call a QoI function with the appropriate dataframe based on parameter inspection.

Parameters:
  • func (callable) – The QoI function to call.

  • mcds (pcdl.TimeStep or None) – The mcds object for a single snapshot.

  • list_mcds (list of pcdl.TimeStep, or None) – The list of mcds objects for a time series of snapshots.

Returns:

Result of the QoI function.
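Dispatching "based on parameter inspection" can be pictured with inspect.signature. The sketch below is an assumption about the general technique, not the library's code: the helper call_with_matching_table and the rule "the first parameter name selects the table" are hypothetical, and plain lists of dicts stand in for the cell and substrate dataframes.

```python
import inspect


def call_with_matching_table(func, tables, default="df_cell"):
    """Call func with the table named by its first parameter (illustrative rule)."""
    first_param = next(iter(inspect.signature(func).parameters))
    # Unknown parameter names (for example 'df') fall back to the default table.
    table = tables.get(first_param, tables[default])
    return func(table)


# Plain lists of dicts stand in for the cell and substrate dataframes.
tables = {
    "df_cell": [{"dead": False}, {"dead": True}],
    "df_subs": [{"substrate": 1.0}, {"substrate": 3.0}],
}

n_cells = call_with_matching_table(lambda df_cell: len(df_cell), tables)
mean_subs = call_with_matching_table(
    lambda df_subs: sum(r["substrate"] for r in df_subs) / len(df_subs), tables
)
fallback = call_with_matching_table(lambda df: len(df), tables)
print(n_cells, mean_subs, fallback)
```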

uq_physicell.utils.sumstats.recreate_qoi_functions(qoi_functions: dict, qoi_def: dict = {}) dict[source]

Recreate QoI functions from their string representations.

Parameters:
  • qoi_functions (dict) –

    Dictionary of QoI functions (keys as names, values as strings). Pass a mapping like the following:

        qoi_functions = {
            "mean_substrate": "lambda df_subs: df_subs['substrate'].mean()",
            "live_cell_count": "lambda df_cell: len(df_cell[df_cell['dead'] == False])",
            "max_volume": "lambda df: df['total_volume'].max()",
            "mean_radial_distance": "lambda df: df[['position_x', 'position_y', 'position_z']].apply(lambda row: ((row['position_x']**2 + row['position_y']**2 + row['position_z']**2)**0.5), axis=1).mean()"
        }

  • qoi_def (dict) –

    Mapping from names to first-class objects (for example, functions) that can be referenced inside the qoi_functions lambda strings. For example, given the function definition:

        def my_func():
            print('hello world!')
            return 0

    the qoi_def dict would look like this:

        {'my_func': my_func}

Returns:

Dictionary of recreated QoI functions (keys as names, values as callables)
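One plausible way to turn lambda strings back into callables is eval with a namespace built from the extra definitions. The sketch below illustrates that general technique under stated assumptions; it is not taken from the library source, the helper rebuild_functions and the function double are hypothetical, and lists of dicts stand in for dataframes.

```python
def rebuild_functions(qoi_strings, extra_names=None):
    """Evaluate each lambda string in a namespace that exposes extra_names."""
    namespace = dict(extra_names or {})
    # eval runs arbitrary code, so only use it on strings from a trusted source.
    return {name: eval(source, namespace) for name, source in qoi_strings.items()}


def double(x):
    """Hypothetical helper that the lambda strings refer to by name."""
    return 2 * x


qoi_strings = {
    "live_count": "lambda rows: sum(1 for r in rows if not r['dead'])",
    "double_count": "lambda rows: double(len(rows))",
}
funcs = rebuild_functions(qoi_strings, {"double": double})

rows = [{"dead": False}, {"dead": True}, {"dead": False}]
print(funcs["live_count"](rows), funcs["double_count"](rows))
```

Storing QoIs as strings keeps them easy to write to a database or send to worker processes, at the cost of having to trust the stored text.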

uq_physicell.utils.sumstats.summary_function(outputPath: str, summaryFile: str | None, dic_params: dict, SampleID: int, ReplicateID: int, qoi_functions: dict, RemoveFolder: bool = True, drop_columns: list | None = None) DataFrame | None[source]

Generic summary function for creating custom QoIs (Quantities of Interest) based on df_cell elements.

Parameters:
  • outputPath (str) – Path to the PhysiCell output directory.

  • summaryFile (str or None) – File to store the summary (optional).

  • dic_params (dict) – Dictionary of simulation parameters.

  • SampleID (int) – Unique identifier for the sample.

  • ReplicateID (int) – Unique identifier for the replicate.

  • qoi_functions (dict) – Dictionary of QoI functions with keys as QoI names and values as functions/lambdas.

  • RemoveFolder (bool) – Whether to remove the output folder after processing.

  • drop_columns (list or None) – List of columns to drop from the DataFrame.

Returns:

DataFrame with the computed QoIs, or None if the summary was saved to a file.

Return type:

pd.DataFrame or None

uq_physicell.utils.sumstats.qoi_func_radial_density_summary(df, center=[0, 0, 0])[source]

Extract summary statistics from the radial distribution of cells.

Parameters:
  • df – DataFrame with columns [‘position_x’, ‘position_y’, ‘position_z’].

  • center – List or array-like of length 3 giving the center point for the radial distance calculation (default is [0, 0, 0]).
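The radial distance underlying this summary is the Euclidean distance of each cell position from the chosen center. The sketch below computes it with the standard library on made-up positions; which statistics the library actually reports is not stated here, so the mean and maximum are shown only as examples.

```python
import math
import statistics


def radial_distances(positions, center=(0.0, 0.0, 0.0)):
    """Euclidean distance of each (x, y, z) position from center."""
    return [math.dist(p, center) for p in positions]


# Hypothetical cell positions.
positions = [(3.0, 4.0, 0.0), (0.0, 0.0, 5.0), (6.0, 8.0, 0.0)]
r = radial_distances(positions)

# Example statistics only; the library's chosen summary may differ.
summary = {"mean": statistics.mean(r), "max": max(r)}
print(r, summary)
```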

uq_physicell.utils.sumstats.qoi_func_persistent_homology(df: DataFrame, Plot=False) tuple[source]

Compute persistent homology vectorization using muspan. (source: https://docs.muspan.co.uk/latest/_collections/topology/Topology%203%20-%20persistence%20vectorisation.html)

Parameters:
  • df (DataFrame) – DataFrame containing cell data with ‘position_x’, ‘position_y’, and ‘cell_type’ columns.

  • Plot (bool) – Whether to plot the persistence diagram (optional).

Returns:

tuple -> (pd.Series with vectorized persistent homology features, figure or None)

uq_physicell.utils.sumstats.qoi_func_relational_ph(df: DataFrame, landmark_type: str, witness_type: str, max_dim: int = 1, mode: str = 'distance', ax=None) tuple[source]

Relational persistent homology based on the Dowker/witness approach. The relation A→B (A as vertices, B as witnesses) describes how the B cells are arranged around the geometry of the A cells.

Parameters:
  • df (DataFrame) – DataFrame with columns [‘position_x’, ‘position_y’, ‘cell_type’].

  • landmark_type (str) – Cell type to use as landmarks (A).

  • witness_type (str) – Cell type to use as witnesses (B).

  • max_dim (int) – PH dimension (0 or 1).

  • mode (str) – ‘distance’ or ‘count’.

  • ax – Matplotlib axis for plotting (optional).

Returns:

tuple -> (pd.Series with vectorized persistent homology features (summary statistics), raw persistence diagram as a list of (dimension, (birth, death)) pairs)

Wrapper of output

uq_physicell.utils.model_wrapper.run_replicate(PhysiCellModel: PhysiCell_Model, sample_id: int, replicate_id: int, ParametersXML: dict, ParametersRules: dict, qoi_functions: dict | None = None, qoi_def: dict = {}, return_binary_output: bool = True, drop_columns: list | None = None, custom_summary_function: callable | None = None) tuple[source]

Run a single replicate of the PhysiCell simulation.

This function executes one simulation replicate with specified parameters and returns either processed QoI results or raw simulation data.

Parameters:
  • PhysiCellModel (PhysiCell_Model) – The PhysiCell model instance to run.

  • sample_id (int) – Unique identifier for the parameter sample.

  • replicate_id (int) – Identifier for the simulation replicate.

  • ParametersXML (dict) – Parameters to modify in the XML configuration.

  • ParametersRules (dict) – Parameters for custom rules modifications.

  • qoi_functions (dict) – Dictionary of QoI functions (keys as names, values as strings). If None, returns raw simulation data.

  • qoi_def (dict) – Mapping from names to first-class objects (for example, functions) that can be referenced inside the qoi_functions lambda strings. Defaults to {}.

  • return_binary_output (bool, optional) – Whether to return results as binary data. Defaults to True.

  • drop_columns (Union[list, None], optional) – List of columns to drop from DataFrame. Defaults to None.

  • custom_summary_function (callable, optional) – Custom summary function to use instead of the default generic QoI function. Defaults to None.

Returns:

A 3-tuple containing (sample_id, replicate_id, result_data) where:
  • If qoi_functions provided: result_data contains calculated QoI values

  • If qoi_functions is None: result_data contains list of MCDS objects

Return type:

tuple

Note

If custom_summary_function is provided, qoi_functions and drop_columns are not used.

uq_physicell.utils.model_wrapper.run_replicate_serializable(PhysiCellModel_conf: dict, sample_id: int, replicate_id: int, ParametersXML: dict, ParametersRules: dict, qoi_functions: dict | None = None, qoi_def: dict = {}, return_binary_output: bool = True, drop_columns: list | None = None, custom_summary_function: callable | None = None) tuple[source]

Run a single replicate of the PhysiCell model and return the results. This wrapper function initializes the PhysiCell model and then calls the run_replicate function to execute the simulation. It is designed to be serializable for use in parallel processing contexts.

Parameters:
  • PhysiCellModel_conf (dict) – Configuration dictionary for the PhysiCell model with ini_path and struc_name.

  • sample_id (int) – Sample ID.

  • replicate_id (int) – Replicate ID.

  • ParametersXML (dict) – Dictionary of XML parameters.

  • ParametersRules (dict) – Dictionary of rules parameters.

  • return_binary_output (bool, optional) – Whether to return results as binary data. Defaults to True.

  • qoi_functions (dict, optional) – Dictionary of QoIs (keys as names, values as strings). Defaults to None.

  • qoi_def (dict) – Mapping from names to first-class objects (for example, functions) that can be referenced inside the qoi_functions lambda strings. Defaults to {}.

  • drop_columns (list, optional) – List of columns to drop from the output. Defaults to None.

  • custom_summary_function (callable, optional) – Custom summary function to use instead of the default generic QoI function. Defaults to None.

Returns:

A 3-tuple containing (sample_id, replicate_id, result_data) where:
  • If qoi_functions provided: result_data contains calculated QoI values

  • If qoi_functions is None: result_data contains list of MCDS objects

Return type:

tuple

Note

If custom_summary_function is provided, qoi_functions and drop_columns are not used.
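Passing a plain configuration dictionary instead of a model object means every argument can be pickled and sent to a worker process. The sketch below only builds and pickles the argument tuples; it does not call uq_physicell. It assumes that ini_path and struc_name are dictionary keys, and all values shown are hypothetical.

```python
import itertools
import pickle

# Assumption: ini_path and struc_name are the keys of the configuration dict.
model_conf = {"ini_path": "config.ini", "struc_name": "my_model"}
samples = {0: {"param1": 0.5}, 1: {"param1": 0.3}}  # hypothetical XML parameters
num_replicates = 2

# One task per (sample, replicate), ordered like the documented signature:
# (PhysiCellModel_conf, sample_id, replicate_id, ParametersXML, ParametersRules)
tasks = [
    (model_conf, sample_id, replicate_id, params_xml, {})
    for (sample_id, params_xml), replicate_id in itertools.product(
        samples.items(), range(num_replicates)
    )
]

# Every task survives a pickle round trip, which is what process pools require.
payloads = [pickle.dumps(task) for task in tasks]
print(len(tasks), pickle.loads(payloads[0]) == tasks[0])
```

Each tuple could then be unpacked into run_replicate_serializable by a process pool or a similar parallel executor.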

Distances

uq_physicell.utils.distances.SumSquaredDifferences(dic_model_data: dict, dic_obs_data: dict) float[source]

Compute the sum of squared differences between simulation outputs and observational data.

Parameters:
  • dic_model_data (dict) – Dictionary containing model data with keys “time” and “value”.

  • dic_obs_data (dict) – Dictionary containing observational data with keys “time” and “value”.

Returns:

The sum of squared differences between the model data and observational data.

Return type:

float

uq_physicell.utils.distances.Manhattan(dic_model_data: dict, dic_obs_data: dict) float[source]

Compute the Manhattan distance (L1 norm) between simulation outputs and observational data.

Parameters:
  • dic_model_data (dict) – Dictionary containing model data with keys “time” and “value”.

  • dic_obs_data (dict) – Dictionary containing observational data with keys “time” and “value”.

Returns:

The Manhattan distance between the model data and observational data.

Return type:

float

uq_physicell.utils.distances.Chebyshev(dic_model_data: dict, dic_obs_data: dict) float[source]

Compute the Chebyshev distance (L∞ norm) between simulation outputs and observational data.

Parameters:
  • dic_model_data (dict) – Dictionary containing model data with keys “time” and “value”.

  • dic_obs_data (dict) – Dictionary containing observational data with keys “time” and “value”.

Returns:

The Chebyshev distance between the model data and observational data.

Return type:

float
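The three distances in this section differ only in how they aggregate the pointwise differences between model and observation. The sketch below compares them on made-up data. It assumes both dictionaries share the same time points; how the library aligns differing time points is not described here, and the helper paired_differences is hypothetical.

```python
def paired_differences(dic_model_data, dic_obs_data):
    """Pointwise model-minus-observation differences (time points must match)."""
    if list(dic_model_data["time"]) != list(dic_obs_data["time"]):
        raise ValueError("this sketch requires identical time points")
    return [m - o for m, o in zip(dic_model_data["value"], dic_obs_data["value"])]


# Hypothetical model output and observations.
model = {"time": [0, 1, 2], "value": [1.0, 2.0, 4.0]}
obs = {"time": [0, 1, 2], "value": [1.0, 3.0, 2.0]}
diffs = paired_differences(model, obs)

ssd = sum(d ** 2 for d in diffs)        # sum of squared differences
manhattan = sum(abs(d) for d in diffs)  # L1 norm
chebyshev = max(abs(d) for d in diffs)  # L-infinity norm
print(ssd, manhattan, chebyshev)
```

The sum of squared differences penalizes large deviations most heavily, the Manhattan distance weights all deviations linearly, and the Chebyshev distance reports only the single worst deviation.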