{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Dynamic functional connectivity and graph analysis (simulated data)\n", "\n", "Brief outline:\n", "\n", "* **Input:** Simulated time series data with two randomly changing connectivity patterns (7.2 second trial lengths)\n", "* **Output:** Connectivity patterns are predicted from graph measures that are calculated from dFC data on a trial-by-trial basis\n", "* **Multiverse options:** dFC methods (5), temporal lag (2), graph density (2), graph weights (2), graph measures (2), SVC kernel (2) --> 160 universes\n", "\n", "*Note: This analysis is computationally expensive. For quick testing purposes, you can simply remove some of the options (e.g. 2-3 dFC methods or one of the graph measures) to reduce the size of the multiverse. Further, the analysis template already includes parallelisation for the graph measure calculation (`Parallel(n_jobs=4)`), so this should be considered when specifying how many universes to run in parallel (`mverse.run(parallel=4)`)*" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from comet.multiverse import Multiverse\n", "\n", "forking_paths = {\n", " \"dfc_method\": [ # Dynamic functional connectivity measures\n", " {\n", " \"name\": \"TSW21\",\n", " \"func\": \"comet.connectivity.SlidingWindow(time_series=ts_hp, windowsize=21, shape='gaussian', std=7).estimate()\"\n", " },\n", " {\n", " \"name\": \"SD\",\n", " \"func\": \"comet.connectivity.SpatialDistance(time_series=ts_hp, dist='euclidean').estimate()\"\n", " },\n", " {\n", " \"name\": \"MTD7\",\n", " \"func\": \"comet.connectivity.TemporalDerivatives(time_series=ts_hp, windowsize=7).estimate()\"\n", " },\n", " {\n", " \"name\": \"PSc\",\n", " \"func\": \"comet.connectivity.PhaseSynchronization(time_series=ts_hp, method='crp').estimate()\"\n", " },\n", " {\n", " \"name\": \"FLS\",\n", " \"func\": \"comet.connectivity.FlexibleLeastSquares(time_series=ts_hp, standardizeData=True, mu=50, num_cores=8, progress_bar=False).estimate()\"\n", " }],\n", " \"tr\": [0.72], # TR in seconds (not a forking path, but useful as a global parameter)\n", " \"segment_length\": [10], # Length of each trial segment (in TRs, not a forking path, but useful as a global parameter)\n", " \"delay\": [6, 10], # Shift to account for hemodynamic delay (in TR) -> delay*0.72 = 3/5 seconds\n", " \"density\": [0.5, 0.25], # Graph density (keep top X% of edges)\n", " \"binarise\": [True, False], # Graph binarisation (otherwise weighted)\n", " \"graph_measure\": [ # Graph measures\n", " {\n", " \"name\": \"clustering\",\n", " \"func\": \"comet.graph.clustering_coef(G)\"\n", " },\n", " {\n", " \"name\": \"efficiency\",\n", " \"func\": \"comet.graph.efficiency(G, local=True)\"\n", " }],\n", " \"svc_kernel\": [\"linear\", \"rbf\"] # SVC kernel\n", "}\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With the forking paths defined, the analysis pipeline template can be created. Please not that the ```tr``` and ```segment_length``` parameters were also defined in the forking paths to allow for easy changes if necessary." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def analysis_template():\n", " import comet\n", " import numpy as np\n", " from sklearn.svm import SVC\n", " from sklearn.model_selection import StratifiedKFold\n", " from sklearn.metrics import accuracy_score\n", " from joblib import Parallel, delayed\n", " \n", " #####################################\n", " # 1. LOAD DATA AND EXTRACT PARAMETERS\n", " data_sim = comet.utils.load_example(\"simulation\")\n", " labels = data_sim[\"labels\"] # trial labels (connectivity state)\n", " ts_sim = data_sim[\"ts_sim\"] # time series data\n", " onsets = data_sim[\"onsets\"] # trial onsets in sec\n", "\n", " ###########################################\n", " # 2. DFC CALCULATION (DECISION: DFC METHOD)\n", " # Preprocessing. Band-pass filtering for phase-based methods, high-pass filtering for amplitude-based methods.\n", " ts_bp = comet.utils.clean(ts_sim, confounds=None, t_r={{tr}}, detrend=True, standardize=False, \\\n", " high_pass=0.03, low_pass=0.07) # band pass (for Hilbert transform)\n", " ts_hp = comet.utils.clean(ts_sim, confounds=None, t_r={{tr}}, detrend=True, standardize=False, \\\n", " high_pass=0.01) # high pass (for amplitude based methods)\n", "\n", " # Estimate dFC\n", " dfc_ts = {{dfc_method}}\n", "\n", " ###################################\n", " # 3. SEGMENT DATA (DECISION: DELAY)\n", " segments = []\n", " for i in onsets:\n", " segment = [i+j+{{delay}} for j in range(0, {{segment_length}})]\n", " segments.append(segment)\n", "\n", " segments = np.asarray(segments).astype(int)\n", " labels = np.asarray(labels).astype(int)\n", "\n", " # IMPORTANT! Handle the different lenghts of dfc time series as windowing methods will produce shorter dFC time series\n", " windowsize = ts_sim.shape[0] - dfc_ts.shape[2] + 1\n", " offset = windowsize // 2\n", " segments = np.asarray(segments) - offset\n", "\n", " index = []\n", " features = []\n", " behaviour = []\n", "\n", " # Get the trial segments (this only checks if we are outside the allowed bounds, otherwise we just keep all segments/labels)\n", " for segment, label in zip(segments, labels):\n", " if segment[0] > 0 and segment[-1] < dfc_ts.shape[2]: # make sure the trial is covered by the dfc data\n", " matrices = dfc_ts[:,:,segment]\n", " matrix = np.mean(matrices, axis=2) # average over the dFC estimates to reduce noise and get a single estimate for each trial\n", "\n", " features.append(matrix)\n", " behaviour.append(label)\n", " index.append(segment)\n", " else:\n", " raise ValueError(f\"Segment {segment} not covered by data, aborting calculations.\")\n", "\n", " index = np.asarray(index)\n", " dfc_features = np.asarray(features)\n", " behaviour = np.asarray(behaviour)\n", "\n", " ####################################################################\n", " # 4. CALCULATE GRAPH MEASURES (DECISIONS: DENSITY, BINARISATION)\n", " def compute_graph_measures(t, dfc_features, index, density, binarise):\n", " G = np.asarray(dfc_features[t, :, :]).copy()\n", " G = comet.graph.handle_negative_weights(G, type=\"absolute\")\n", " G = comet.graph.threshold(G, type=\"density\", density=density)\n", " G = comet.graph.postproc(G)\n", "\n", " graph_results = {{graph_measure}}\n", "\n", " return graph_results\n", "\n", " graph_results = Parallel(n_jobs=4)(delayed(compute_graph_measures)(t, dfc_features, index, {{density}}, {{binarise}}) for t in range(dfc_features.shape[0]))\n", "\n", " # Unpack the results\n", " graph_features = []\n", " for result in graph_results:\n", " graph_features.append(result)\n", "\n", " ##############################################\n", " # 5. CLASSIFICATION (DECISION: SVC KERNEL)\n", " graph_features = np.asarray(graph_features)\n", " labels = behaviour\n", "\n", " # Initialize the SVC\n", " svc = SVC(kernel={{svc_kernel}})\n", "\n", " # Perform 5-fold cross-validation\n", " accuracy = []\n", " skf = StratifiedKFold(n_splits=5)\n", "\n", " for train_index, test_index in skf.split(graph_features, labels):\n", " X_train, X_test = graph_features[train_index], graph_features[test_index]\n", " y_train, y_test = labels[train_index], labels[test_index]\n", "\n", " svc.fit(X_train, y_train)\n", " y_pred = svc.predict(X_test)\n", "\n", " accuracy.append(accuracy_score(y_test, y_pred))\n", "\n", " accuracy = np.asarray(accuracy)\n", "\n", " # Save the results (prediction accuracy and dFC estimates for ground truth comparison)\n", " results = {\"prediction\\naccuracy\": accuracy, \"dfc_estimates\": dfc_features}\n", " comet.utils.save_universe_results(results)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create (and if required run) the multiverse analysis:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Starting multiverse analysis for all universes...\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a90763bbd2ee4842b04baa5a122f8f8a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Performing multiverse analysis:: 0%| | 0/160 [00:00, ?it/s]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "The multiverse analysis completed without any errors.\n" ] } ], "source": [ "mverse = Multiverse(name=\"example_mv_fmri_sim\")\n", "mverse.create(analysis_template, forking_paths)\n", "mverse.run(parallel=4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As the results from the multiverse analysis are already provided, we can explore and visualize the multiverse:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| \n", " | Universe | \n", "Decision 1 | \n", "Value 1 | \n", "Decision 2 | \n", "Value 2 | \n", "Decision 3 | \n", "Value 3 | \n", "Decision 4 | \n", "Value 4 | \n", "Decision 5 | \n", "Value 5 | \n", "Decision 6 | \n", "Value 6 | \n", "Decision 7 | \n", "Value 7 | \n", "Decision 8 | \n", "Value 8 | \n", "
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | \n", "Universe_1 | \n", "dfc_method | \n", "TSW21 | \n", "tr | \n", "0.72 | \n", "segment_length | \n", "10 | \n", "delay | \n", "6 | \n", "density | \n", "0.50 | \n", "binarise | \n", "True | \n", "graph_measure | \n", "clustering | \n", "svc_kernel | \n", "linear | \n", "
| 1 | \n", "Universe_2 | \n", "dfc_method | \n", "TSW21 | \n", "tr | \n", "0.72 | \n", "segment_length | \n", "10 | \n", "delay | \n", "6 | \n", "density | \n", "0.50 | \n", "binarise | \n", "True | \n", "graph_measure | \n", "clustering | \n", "svc_kernel | \n", "rbf | \n", "
| 2 | \n", "Universe_3 | \n", "dfc_method | \n", "TSW21 | \n", "tr | \n", "0.72 | \n", "segment_length | \n", "10 | \n", "delay | \n", "6 | \n", "density | \n", "0.50 | \n", "binarise | \n", "True | \n", "graph_measure | \n", "efficiency | \n", "svc_kernel | \n", "linear | \n", "
| 3 | \n", "Universe_4 | \n", "dfc_method | \n", "TSW21 | \n", "tr | \n", "0.72 | \n", "segment_length | \n", "10 | \n", "delay | \n", "6 | \n", "density | \n", "0.50 | \n", "binarise | \n", "True | \n", "graph_measure | \n", "efficiency | \n", "svc_kernel | \n", "rbf | \n", "
| 4 | \n", "Universe_5 | \n", "dfc_method | \n", "TSW21 | \n", "tr | \n", "0.72 | \n", "segment_length | \n", "10 | \n", "delay | \n", "6 | \n", "density | \n", "0.50 | \n", "binarise | \n", "False | \n", "graph_measure | \n", "clustering | \n", "svc_kernel | \n", "linear | \n", "
| ... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
| 155 | \n", "Universe_156 | \n", "dfc_method | \n", "FLS | \n", "tr | \n", "0.72 | \n", "segment_length | \n", "10 | \n", "delay | \n", "10 | \n", "density | \n", "0.25 | \n", "binarise | \n", "True | \n", "graph_measure | \n", "efficiency | \n", "svc_kernel | \n", "rbf | \n", "
| 156 | \n", "Universe_157 | \n", "dfc_method | \n", "FLS | \n", "tr | \n", "0.72 | \n", "segment_length | \n", "10 | \n", "delay | \n", "10 | \n", "density | \n", "0.25 | \n", "binarise | \n", "False | \n", "graph_measure | \n", "clustering | \n", "svc_kernel | \n", "linear | \n", "
| 157 | \n", "Universe_158 | \n", "dfc_method | \n", "FLS | \n", "tr | \n", "0.72 | \n", "segment_length | \n", "10 | \n", "delay | \n", "10 | \n", "density | \n", "0.25 | \n", "binarise | \n", "False | \n", "graph_measure | \n", "clustering | \n", "svc_kernel | \n", "rbf | \n", "
| 158 | \n", "Universe_159 | \n", "dfc_method | \n", "FLS | \n", "tr | \n", "0.72 | \n", "segment_length | \n", "10 | \n", "delay | \n", "10 | \n", "density | \n", "0.25 | \n", "binarise | \n", "False | \n", "graph_measure | \n", "efficiency | \n", "svc_kernel | \n", "linear | \n", "
| 159 | \n", "Universe_160 | \n", "dfc_method | \n", "FLS | \n", "tr | \n", "0.72 | \n", "segment_length | \n", "10 | \n", "delay | \n", "10 | \n", "density | \n", "0.25 | \n", "binarise | \n", "False | \n", "graph_measure | \n", "efficiency | \n", "svc_kernel | \n", "rbf | \n", "
160 rows × 17 columns
\n", "