Multiverse integration

Comet includes multiverse integration methods based on the weighting schemes introduced in:

Cantone, G. G., & Tomaselli, V. (2025). Characterisation and calibration of multiversal methods.
Advances in Data Analysis and Classification, 19(4), 989-1021. https://doi.org/10.1007/s11634-024-00610-9

For illustration and validation purposes, we use the “unidentified multiverse” example from the paper, which was introduced in a previous paper by Del Giudice and Gangestad (2021). https://doi.org/10.6084/m9.figshare.12089736

[1]:
from comet.multiverse import Multiverse

forking_paths = {
    "X1": [0, 1],
    "X2": [0, 1],
    "X3": [0, 1],
    "X4": [0, 1],
    "ANTE": [0, 1],
    "MEDI": [0, 1],
    "COLL": [0, 1],
    "CONF": [0, 1],
}

def analysis_template():
    import comet
    import statsmodels.formula.api as smf

    data = comet.utils.load_example("cantone")

    # Build composite predictor
    data["pred"] = 0.0
    if {{X1}}:
        data["pred"] += data["BM1"]
    if {{X2}}:
        data["pred"] += data["BM2"]
    if {{X3}}:
        data["pred"] += data["BM3"]
    if {{X4}}:
        data["pred"] += data["BM4"]

    # Build covariates
    cov = ""
    if {{ANTE}}:
        cov += " + genotype"
    if {{MEDI}}:
        cov += " + pain"
    if {{COLL}}:
        cov += " + fatigue"
    if {{CONF}}:
        cov += " + age"

    # Regression model
    formula = "depression ~ pred" + cov
    fit_res = smf.ols(formula=formula, data=data).fit()

    beta_raw = float(fit_res.params["pred"])
    sd_y = float(data["depression"].std())
    sd_x = float(data["pred"].std())
    beta_std = beta_raw * (sd_x / sd_y)

    out = {
        "formula": formula,
        "beta": float(beta_std),
        "beta_raw": float(beta_raw),
        "p_value": float(fit_res.pvalues["pred"]),
    }

    comet.utils.save_universe_results(out)

config = {
    "remove": [
        [{"X1": 0}, {"X2": 0}, {"X3": 0}, {"X4": 0}],
    ]
}

mverse = Multiverse(name="example_mv_integration")
mverse.create(analysis_template, forking_paths, config)
mverse.summary()
mverse.run(parallel=16)
Removed 16 out of 256 universes:
Universe Decision 1 Value 1 Decision 2 Value 2 Decision 3 Value 3 Decision 4 Value 4 Decision 5 Value 5 Decision 6 Value 6 Decision 7 Value 7 Decision 8 Value 8
0 Universe_1 X1 0 X2 0 X3 0 X4 1 ANTE 0 MEDI 0 COLL 0 CONF 0
1 Universe_2 X1 0 X2 0 X3 0 X4 1 ANTE 0 MEDI 0 COLL 0 CONF 1
2 Universe_3 X1 0 X2 0 X3 0 X4 1 ANTE 0 MEDI 0 COLL 1 CONF 0
3 Universe_4 X1 0 X2 0 X3 0 X4 1 ANTE 0 MEDI 0 COLL 1 CONF 1
4 Universe_5 X1 0 X2 0 X3 0 X4 1 ANTE 0 MEDI 1 COLL 0 CONF 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
235 Universe_236 X1 1 X2 1 X3 1 X4 1 ANTE 1 MEDI 0 COLL 1 CONF 1
236 Universe_237 X1 1 X2 1 X3 1 X4 1 ANTE 1 MEDI 1 COLL 0 CONF 0
237 Universe_238 X1 1 X2 1 X3 1 X4 1 ANTE 1 MEDI 1 COLL 0 CONF 1
238 Universe_239 X1 1 X2 1 X3 1 X4 1 ANTE 1 MEDI 1 COLL 1 CONF 0
239 Universe_240 X1 1 X2 1 X3 1 X4 1 ANTE 1 MEDI 1 COLL 1 CONF 1

240 rows × 17 columns

Starting multiverse analysis for all universes...
The multiverse analysis completed without any errors.

We visualise the multiverse results:

[2]:
mverse.multiverse_plot(measure="beta", n_bins=20, sig_col="p_value", baseline=0, figsize=(7,9))
../../_images/sections_notebooks_example_mv_integration_3_0.png

Comet currently contains two methods for integrating over the multiverse:

uniform

  • All universes are weighted equally (the estimate is just the plain mean/median across universes). This corresponds to the density function as plotted in the multiverse plot above.

mli (Maximum Local Influence)

  • Each universe is compared with its nearest neighbours (specifications differing by a single decision). Universes whose estimate barely changes when one decision is flipped are weighted more, down-weighting results that hinge on a single arbitrary choice.

As an example, we can use the .integrate() method to simply get an integrated measure over all universes. For example, using uniform weighting and mean aggregation:

[3]:
estimate, weights = mverse.integrate(measure="beta", method="uniform", type="mean")
print(f"Mean beta: {estimate:.4f}")
Mean beta: -0.0066

We can also use the compare_integration and plot_integration convenience functions to get a summary for both integration methods.

[4]:
table, weights = mverse.compare_integration(measure="beta", true_value=0.2)
table
[4]:
Scheme median mean gini_w err_median err_mean
0 Uniform -0.0210 -0.0066 0.000 0.221 0.207
1 MLI -0.0328 -0.0144 0.243 0.233 0.214

The columns are, for each weighting scheme:

  • median / mean: the integrated estimate (weighted median/mean of beta)

  • gini_w: how concentrated the weights are (0 = all equal, → 1 = dominated by a few universes).

If true_value is passed, additional err_* columns give the absolute error of each estimate against it. Usually, this is not available in real multiverses, but we can include it here as we have the ground truth from the simulation.

[5]:
mverse.plot_integration(measure="beta", weights=weights, true_value=0.2)
../../_images/sections_notebooks_example_mv_integration_9_0.png