comet.multiverse

class comet.multiverse.Multiverse(name='multiverse', path=None)[source]

Bases: object

Multiverse analysis class.

Parameters:

name (str) – Name of the multiverse analysis. Default is “multiverse”.
path (str) – Path to a multiverse directory (only used by the GUI).

add_results(name, values)[source]

Add a derived measure to the saved multiverse results.

The measure is written into multiverse_results.pkl so it becomes a regular column in get_results() and can be used by multiverse_plot(), specification_curve(), and integrate(). Useful for quantities computed from existing measures (e.g. the difference between two measures). An existing measure of the same name is overwritten.

Parameters:

name (str) – Name of the measure.
values (array-like or pandas.Series) – One value per universe. A pandas.Series is aligned by its index to the universe numbers (1..N); any other array-like is matched positionally in universe order (universe_1 .. universe_N). Each entry may be a scalar or an array.

Returns:

The updated results, as returned by get_results(as_df=True).

Return type:

pandas.DataFrame
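The two alignment rules for values can be illustrated with plain pandas. This is only a sketch of the behaviour described above, not comet's own code; the three-universe setup and the numbers are made up for illustration.

```python
import pandas as pd

n_universes = 3  # hypothetical multiverse size
universe_numbers = range(1, n_universes + 1)

# Case 1: a Series carries its own index, so the order of its entries
# does not matter. Each value is matched to the universe with that number.
as_series = pd.Series([0.30, 0.10, 0.20], index=[3, 1, 2])
aligned = as_series.reindex(universe_numbers)

# Case 2: any other array-like has no labels, so entry k is simply
# taken as the value for universe k (universe_1 .. universe_N).
as_list = [0.10, 0.20, 0.30]
positional = pd.Series(as_list, index=universe_numbers)

print(aligned.tolist())
print(positional.tolist())
```

Both cases end up with the same per-universe values here, because the Series was labelled with the universe numbers while the list was already in universe order.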

compare_integration(measure, true_value=None, agg='mean', sigfigs=3)[source]

Compare all integration schemes for a measure and return a summary table.

Parameters:

measure (string) – Name of the measure to integrate.
true_value (float or None) – If given, absolute errors of each estimate against this value are added.
agg (string) – Per-universe aggregation for sequence-valued measures (see integrate).
sigfigs (int or None) – Round the numeric table columns to this many significant figures (default 3). Use None to keep full precision. Only affects the table; weights is unrounded.

Returns:

(table : pandas.DataFrame, weights : dict[str, np.ndarray]) The table has one row per scheme with the weighted median/mean and the Gini coefficient of the weights (0 = uniform, 1 = concentrated). If true_value is given, absolute errors of the median/mean are added (err_median, err_mean).

Return type:

tuple
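To make the table columns easier to read, here is a minimal sketch of a Gini coefficient over a weight vector and of rounding to significant figures. The formulas are the standard textbook ones and are an assumption; the documentation does not state which exact variant comet uses, and the helper names gini and round_sig are hypothetical.

```python
from math import floor, log10


def gini(weights):
    """Mean absolute difference form of the Gini coefficient.

    Returns 0.0 for uniform weights and (n - 1) / n when a single
    universe holds all of the weight, which approaches 1 as n grows.
    """
    n = len(weights)
    total = sum(weights)
    abs_diffs = sum(abs(a - b) for a in weights for b in weights)
    return abs_diffs / (2 * n * total)


def round_sig(x, sigfigs=3):
    """Round x to the given number of significant figures."""
    if x == 0:
        return 0.0
    return round(x, sigfigs - 1 - floor(log10(abs(x))))


uniform = [0.25, 0.25, 0.25, 0.25]
concentrated = [1.0, 0.0, 0.0, 0.0]

print(gini(uniform))
print(gini(concentrated))
print(round_sig(0.012345), round_sig(1234.5))
```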

create(analysis_template, forking_paths, config={}, verbose=False)[source]

Create the individual universe scripts.

Parameters:

analysis_template (function) – Function containing the analysis template
forking_paths (dict) – Dictionary containing the forking paths
config (dict) – Configuration dictionary with optional combination rules:
- order : list of lists specifying the order of decisions
- exclude : list of list[dict or str] (set listed keys to NaN if conditions match)
- remove : list of list[dict or str] (drop universes if conditions match)
- deduplicate : bool (collapse duplicates after exclude/remove; default True)
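Conceptually, the forking paths span a set of universes, one per combination of options, and remove-style rules then drop some of them. The sketch below shows that idea with the standard library only. It is not comet's implementation, the decision names and options are invented, and the violates function is a hypothetical stand-in that does not reproduce the actual rule syntax of config.

```python
from itertools import product

# Hypothetical forking paths: decision name -> list of options.
forking_paths = {
    "smoothing": [4, 8],
    "estimator": ["pearson", "spearman", "partial"],
}

# Fully crossed design: one universe per combination of options.
decisions = list(forking_paths)
universes = [
    dict(zip(decisions, combo))
    for combo in product(*forking_paths.values())
]


def violates(universe):
    """Stand-in for a remove-style rule: drop one specific combination."""
    return universe["smoothing"] == 8 and universe["estimator"] == "partial"


kept = [u for u in universes if not violates(u)]

print(len(universes), len(kept))
```

With two and three options the crossed design has six universes, and the single rule leaves five.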

get_results(universe=None, as_df=False, expand_dec=False)[source]

Get the results of the multiverse (or a specific universe).

Parameters:

universe (int | None) – If given, return results for that specific universe.
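as_df (bool) – False returns the raw dict (default). True returns a pandas DataFrame (only valid when universe is None).
expand_dec (bool) – If True, the decision columns are loaded as well (these are required by the decision-based schemes of integrate). Default is False.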

integrate(measure=None, method='uniform', type='mean', agg='mean')[source]

Integrate the multiverse results for a specific measure into a single weighted estimate.

Weighting schemes follow Cantone & Tomaselli (2024); the decision-based schemes (mli, mli_restricted) require the decision columns, which are loaded via expand_dec=True.

Parameters:

measure (string) – Name of the measure to integrate.
method (string) –
Weighting scheme. Options are:
- “uniform” (default): equal weights across all universes
- “mli”: maximum local influence (specifications whose neighbours – by Gower distance over the decisions – give similar estimates get more weight)
- “mli_restricted”: variant of MLI that ignores decisions not freely crossed across the multiverse, so the local neighbourhood is comparable across universes. Use this when the design includes remove rules that confine some decisions to a subset of levels.
type (string) – Type of (weighted) integration. Options are “mean” (default) or “median”.
agg (string) – Aggregation applied per universe when the measure holds sequences (arrays) rather than scalars. One of “mean” (default), “median”, “first”, “last”.

Returns:

(integrated_estimate : float, weights : np.ndarray)

Return type:

tuple
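The interplay of agg, type, and the weights can be sketched as follows. This is an illustration under stated assumptions rather than comet's code: the helper names are hypothetical, the weighted median is taken as the smallest value at which the cumulative weight reaches half of the total (other conventions exist), and the weights are supplied by hand instead of being derived by a scheme such as “mli”.

```python
from statistics import mean, median

# Per-universe aggregation for sequence-valued measures.
AGGREGATORS = {
    "mean": mean,
    "median": median,
    "first": lambda seq: seq[0],
    "last": lambda seq: seq[-1],
}


def aggregate(value, agg="mean"):
    """Collapse a sequence to one number; scalars pass through."""
    if isinstance(value, (list, tuple)):
        return float(AGGREGATORS[agg](value))
    return float(value)


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total."""
    half = sum(weights) / 2
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= half:
            return value


# Made-up results: two universes hold samples, two hold scalars.
per_universe = [[1.0, 3.0], 4.0, [5.0, 7.0], 10.0]
values = [aggregate(v, agg="mean") for v in per_universe]

uniform = [0.25] * 4
skewed = [0.1, 0.1, 0.1, 0.7]

print(weighted_mean(values, uniform), weighted_median(values, uniform))
print(weighted_mean(values, skewed), weighted_median(values, skewed))
```

Shifting weight towards the last universe pulls both the mean and the median towards its value, which is the effect the non-uniform schemes have on the integrated estimate.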

multiverse_plot(measure: str, n_bins: int = 20, sig_col: str | None = None, sig_threshold: float = 0.05, baseline: float | None = None, reference: float | None = None, reference_label: str | None = None, name_map: dict | None = None, figsize: tuple = (7, 9), fname: str = 'multiverse_plot', ftype: str = 'pdf', dpi: int = 300)[source]

Multiverse plot as introduced by Krähmer & Young (2026).

This plot visualises the distribution of multiverse outcomes together with heatmap strips showing how different analytic choices relate to the outcome. For each decision level, the average change in the outcome relative to the reference level is shown on the right.

The figure is saved to the results directory as “{fname}.{ftype}”.

References

Krähmer, D., & Young, C. (2026). Visualizing vastness: Graphical methods for multiverse analysis. PLOS One, 21(2). https://doi.org/10.1371/journal.pone.0339452

Parameters:

measure (str) – Name of the outcome/measure column in the multiverse results. Entries may be scalars or lists/arrays (in which case the mean is used).
n_bins (int, optional) – Number of bins used to discretise the outcome axis for the heatmap strips.
sig_col (str | None, optional) – Column indicating statistical significance. If provided:
- boolean values are interpreted directly (True = significant),
- numeric values are compared against sig_threshold.
If None, significance is computed from per-universe sample arrays with a one-sample t-test against baseline (identical to specification_curve), thresholded by sig_threshold. This requires measure to hold array samples per universe; scalar measures cannot be tested and no overlay is drawn.
sig_threshold (float, optional) – Threshold applied to a numeric sig_col or to the computed p-values when sig_col is None (default is 0.05).
baseline (float | None, optional) – Baseline value for the outcome. If provided, a vertical dashed reference line is drawn at this value (extending through the strips to the x-axis). When significance is computed, the baseline is also the null value of the one-sample t-test; if baseline is None, the test uses 0.
reference (float | None, optional) – A single reference outcome value to highlight on the density curve, e.g. the estimate of one specific universe (such as a previously published single-pipeline study). It is drawn as a marker sitting on the density at that x-position, showing where that single result falls within the full multiverse distribution.
reference_label (str | None, optional) – Legend label for reference (e.g. "Popp et al."). Defaults to "Reference".
name_map (dict | None, optional) – Optional mapping for display names. Keys may include the measure name, decision names, and option/level values. Values are the desired display labels. When the same option string is used by more than one decision, use a decision-qualified key "decision/option" (e.g. "family_splitting/all") to relabel it per decision; otherwise a bare option key (e.g. "spearman") is applied wherever it appears.
figsize (tuple, optional) – Figure size passed to Matplotlib (width, height) in inches.
ftype (str, optional) – File type used when saving the figure (e.g., "pdf", "png").
dpi (int, optional) – Resolution (dots per inch) used when saving the figure.

Returns:

The figure if run from a .py script; None if run in a .ipynb notebook (the figure is saved and displayed inline).

Return type:

Any
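The lookup rule for name_map can be sketched with a small helper. This is one reading of the parameter description, not comet's code: display_label is a hypothetical function, the decision-qualified key is assumed to take precedence over a bare option key, and the "subjects" decision and all labels are invented for illustration.

```python
def display_label(decision, option, name_map):
    """Resolve the display label for one option of one decision."""
    qualified = f"{decision}/{option}"
    if qualified in name_map:
        # A decision-qualified key relabels the option for this decision only.
        return name_map[qualified]
    # A bare key applies wherever the option string appears;
    # unmapped options keep their original name.
    return name_map.get(str(option), str(option))


name_map = {
    "family_splitting/all": "All families",  # qualified: one decision only
    "spearman": "Spearman correlation",      # bare: applied everywhere
}

print(display_label("family_splitting", "all", name_map))
print(display_label("subjects", "all", name_map))
print(display_label("estimator", "spearman", name_map))
```

Here the option "all" is relabelled for family_splitting but left unchanged for the other decision, which is the situation the qualified key is meant for.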

plot_integration(measure, weights=None, true_value=None, agg='mean', xlim=None, xlabel=None, figsize=(7, 3), title=None, fname='density', ftype='pdf', dpi=300)[source]

Plot weighted density distributions of a measure for each integration scheme.

Parameters:

measure (string) – Name of the measure to integrate.
weights (dict[str, np.ndarray] or None) – Per-scheme weights (e.g. from compare_integration). Computed automatically if None.
true_value (float or None) – If given, a vertical reference line is drawn at this value.
agg (string) – Per-universe aggregation for sequence-valued measures (see integrate).
xlim, xlabel, figsize, title, fname, ftype, dpi – Plot/save options. The figure is saved to the results directory as “{fname}.{ftype}”.

Returns:

The figure if in a .py script, None if in a notebook (shown inline).

Return type:

Any

remove_results(name)[source]

Remove a measure column from the saved multiverse results.

The measure is deleted from multiverse_results.pkl for every universe. Decision and internal columns (prefixed "__", e.g. __decisions) cannot be removed.

Parameters:

name (str) – Name of the measure to remove.

Returns:

The updated results, as returned by get_results(as_df=True).

Return type:

pandas.DataFrame

run(universe=None, parallel=1, combine_results=True)[source]

Run either an individual universe or the entire multiverse.

Parameters:

universe (None, int, list, range) – Number(s) of the universe(s) to run. Default is None, which runs all universes.
parallel (int) – Number of universes to run in parallel. Default is 1.

specification_curve(measure: str, baseline: float | None = None, p_value: float | str | bool | None = None, ci: int | str | bool | None = None, smooth_ci: bool = True, title: str | None = None, name_map: dict | None = None, cmap: str = 'Set3', linewidth: float = 2, figsize: tuple | None = None, height_ratio: tuple = (2, 1), fontsize: int = 10, dotsize: int = 50, line_pad: float = 0.3, fname: str = 'specification_curve', ftype: str = 'pdf', dpi: int = 300, p_threshold: float = 0.05, ci_level_default: int = 95)[source]

Create a specification curve plot from multiverse results.

The figure is saved to the results directory as “{fname}.{ftype}” (“specification_curve.pdf” by default).

Notes

If p_value is float or True, measure must contain list/array samples per universe.
If ci is int or True, measure must contain list/array samples per universe.
If p_value is a string, it is interpreted as a p-value column (numeric) or a significance flag (bool).
If ci is a string, it must contain per-universe (lower, upper) bounds.

Returns:

The figure if in a .py script. None if in a .ipynb notebook (the figure is saved and displayed inline).

Return type:

Any
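The notes on p_value distinguish arguments by type, which can be summarised with a small dispatch helper. This is a hypothetical sketch and not part of comet: the function name and the returned labels are invented, and treating None and False as "no significance marking" is an assumption based on the signature. The same pattern would apply to ci.

```python
def p_value_mode(p_value):
    """Classify how a p_value argument would be interpreted."""
    if p_value is None or p_value is False:
        # Assumption: nothing is computed or drawn.
        return "off"
    if isinstance(p_value, str):
        # A column holding numeric p-values or boolean significance flags.
        return "column"
    # True or a float: significance is computed from the measure itself,
    # so the measure must contain list/array samples per universe.
    return "samples"


for argument in (None, False, True, 0.01, "p_corrected"):
    print(repr(argument), "->", p_value_mode(argument))
```

The column name "p_corrected" is a made-up example.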

summary(universe=None, print_df=True, return_df=False)[source]

Print the multiverse summary to the terminal/notebook.

Parameters:

universe (int, range, or None) – The universe number(s) to display. Default is None (prints the head).

visualize(universe=None, figsize=(8, 5), node_size=1500, text_size=12, max_label_len=15, label_offset=0.04, cmap='Set2', exclude_single=False)[source]

Visualize the multiverse as a network.

Parameters:

universe (int or None) – The universe to highlight in the network. If None or if the provided universe number is higher than available universes, the entire multiverse is shown without highlighting. Default is None.
figsize (tuple) – Size of the figure. Default is (8,5).
node_size (int) – Size of the nodes. Default is 1500.
text_size (int) – Size of the text labels. Default is 12.
max_label_len (int) – Maximum length of decision labels before wrapping.
label_offset (float) – Offset multiplier for decision labels.
cmap (str) – Colormap to use for the nodes. Default is “Set2”.
exclude_single (bool) – Whether to exclude parameters with only one unique option.

comet.multiverse.load_multiverse(path=None)[source]

Load a previously created multiverse from disk.

Parameters:

path (str) – A full/relative path to an existing multiverse folder.