Hurricane multiverse (non-neuroimaging)

The Comet toolbox is developed primarily for fMRI brain network analyses. However, we here showcases how the versatile multiverse workflow can also be applied outside of neuroimaging.

This script is a re-implementation og the famous “Female hurricanes are deadlier than male hurricanes” multiverse analysis originally presented in Simonsohn et al. (2020). Their study revisited Jung et al. (2014), who reported that hurricanes with more feminine names caused more fatalities. Simonsohn and colleagues systematically explored how robust this conclusion is across a large range of reasonable analytic specifications.

The Comet toolbox provides a Python-native multiverse framework that allows the entire hurricane multiverse analysis to be expressed with two simple objects:

  • A dictionary of forking paths

  • A single analysis template function containing placeholders for all decisions

Existing examples (R implementations) of this example can be found here:

[1]:
from comet.multiverse import Multiverse

forking_paths = {
    "death_outliers":  [2, 1, 0],    # Drop n hurricanes with the most deaths
    "damage_outliers": [3, 2, 1, 0], # Drop n hurricanes with the most damage

    "femininity_rating": ["fem_likert", "fem_binary"], # Femininity rating (11-point likert or binary)
    "damage_functional": ["ln", "linear"],             # Damage functional (ln or no transformation)

    "effect_type":  ["femininity * damages",
                     "femininity * damages + femininity * zpressure",
                     "femininity * damages + femininity * zwind",
                     "femininity * damages + femininity * zcat",
                     "femininity * damages + femininity * z3",
                     "femininity + damages + z3"],                       # Interactions or main effect only)
    "year_interaction":  ["", " + post1979:damages", " + year:damages"], # Interactions with year
    "model":             ["log_linear", "neg_binomial"]                  # Model (negative binomial or log-linear)
}

def analysis_template():
    import comet
    import numpy as np
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    # Load hurricane data
    df = comet.utils.load_example("hurricane")

    # Create derived predictors
    df["post1979"] = (df["elapsed_years"] > 1979).astype(int)
    df["zcat"] = (df["category"] - df["category"].mean()) / df["category"].std()
    df["zpressure"] = -((df["pressure"] - df["pressure"].mean()) / df["pressure"].std())
    df["zwind"] = (df["wind"] - df["wind"].mean()) / df["wind"].std()
    df["z3"] = (df["zpressure"] + df["zcat"] + df["zwind"]) / 3.0

    # Decisions 1 & 2: Exclude outliers
    top_deaths = df["deaths"].nlargest({{death_outliers}}).unique()
    top_damage = df["damages"].nlargest({{damage_outliers}}).unique()
    df = df[~df["deaths"].isin(top_deaths) & ~df["damages"].isin(top_damage)]

    # Decision 2: Choose femininity rating
    df["femininity"] = df[{{femininity_rating}}]

    # Decision 3: Transform damage variable
    if {{damage_functional}} == "ln":
        df["damages"] = np.log(df["damages"])

    # Decision 4 & 5: Effect type & account for year
    formula = "deaths ~ " + {{effect_type}} + {{year_interaction}}

    # Decision 6: Model type and fitting
    if {{model}} == "log_linear":
        df["deaths"] = np.log(df["deaths"] + 1)
        fit = smf.ols(formula, data=df).fit()
    else:
        fit = smf.glm(formula, data=df, family=sm.families.NegativeBinomial(alpha=1)).fit()

    # Results: Set femininity for male vs female and predict deaths at the sample means
    if {{femininity_rating}} == "fem_likert":
        fem_male = df["femininity"][df["fem_binary"] == 0].mean()
        fem_fem = df["femininity"][df["fem_binary"] == 1].mean()
    else:
        fem_male = 0
        fem_fem = 1

    base = df.mean(numeric_only=True).to_frame().T

    base_male = base.copy()
    base_male["femininity"] = fem_male

    base_fem = base.copy()
    base_fem["femininity"] = fem_fem

    if {{model}} == "log_linear":
        # Predict and back-transform (slightly simplified)
        pred_m = np.exp(fit.predict(base_male)) - 1
        pred_f = np.exp(fit.predict(base_fem)) - 1
    else:
        pred_m = fit.predict(base_male)
        pred_f = fit.predict(base_fem)

    # Get the additional deaths as multiverse outcome measure
    extra_deaths = pred_f.values - pred_m.values
    # The p-values mix interaction with main effect only so its a bit hacky, but good enough for illustration
    p_val = fit.pvalues["femininity:damages"] if "femininity:damages" in fit.pvalues.index else fit.pvalues["femininity"]

    comet.utils.save_universe_results({"extra_deaths": extra_deaths, "p_val": p_val})
[2]:
mverse = Multiverse(name="example_mv_hurricane")
mverse.create(analysis_template, forking_paths)
mverse.summary()
Universe Decision 1 Value 1 Decision 2 Value 2 Decision 3 Value 3 Decision 4 Value 4 Decision 5 Value 5 Decision 6 Value 6 Decision 7 Value 7
0 Universe_1 death_outliers 2 damage_outliers 3 femininity_rating fem_likert damage_functional ln effect_type femininity * damages year_interaction NaN model log_linear
1 Universe_2 death_outliers 2 damage_outliers 3 femininity_rating fem_likert damage_functional ln effect_type femininity * damages year_interaction NaN model neg_binomial
2 Universe_3 death_outliers 2 damage_outliers 3 femininity_rating fem_likert damage_functional ln effect_type femininity * damages year_interaction + post1979:damages model log_linear
3 Universe_4 death_outliers 2 damage_outliers 3 femininity_rating fem_likert damage_functional ln effect_type femininity * damages year_interaction + post1979:damages model neg_binomial
4 Universe_5 death_outliers 2 damage_outliers 3 femininity_rating fem_likert damage_functional ln effect_type femininity * damages year_interaction + year:damages model log_linear
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1723 Universe_1724 death_outliers 0 damage_outliers 0 femininity_rating fem_binary damage_functional linear effect_type femininity + damages + z3 year_interaction NaN model neg_binomial
1724 Universe_1725 death_outliers 0 damage_outliers 0 femininity_rating fem_binary damage_functional linear effect_type femininity + damages + z3 year_interaction + post1979:damages model log_linear
1725 Universe_1726 death_outliers 0 damage_outliers 0 femininity_rating fem_binary damage_functional linear effect_type femininity + damages + z3 year_interaction + post1979:damages model neg_binomial
1726 Universe_1727 death_outliers 0 damage_outliers 0 femininity_rating fem_binary damage_functional linear effect_type femininity + damages + z3 year_interaction + year:damages model log_linear
1727 Universe_1728 death_outliers 0 damage_outliers 0 femininity_rating fem_binary damage_functional linear effect_type femininity + damages + z3 year_interaction + year:damages model neg_binomial

1728 rows × 15 columns

[3]:
mverse.run(parallel=16)
Starting multiverse analysis for all universes...
The multiverse analysis completed without any errors.

We can then plot the specification curve:

[4]:
mverse.specification_curve("extra_deaths", figsize=(12,9), height_ratio=(1,2), line_pad=0.05)
../../_images/sections_notebooks_example_mv_hurricane_5_0.png

However, for large multiverses this type of visualisation can become difficult to interpret. A better alternative is a multiverse plot, which visualises the results as a density function and groups the specifications into bins.

Here it becomes easily visible how most effects are close to 0 and not significant:

[5]:
mverse.multiverse_plot(measure="extra_deaths", n_bins=20, sig_col="p_val", baseline=0, figsize=(7,10))
../../_images/sections_notebooks_example_mv_hurricane_7_0.png