Companion Data Page

Neural mechanisms of parasite-induced summiting behavior in "zombie" Drosophila

Carolyn Elya1,*, Danylo Lavrentovich1, Emily Lee1,^, Cassandra Pasadyn1,&, Jasper Duval1,$, Maya Basak1,#, Valerie Saykina1,@, and Benjamin de Bivort1,*

1 Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA, 02138
^ Current address: New York Genome Center, New York, NY 10013
& Current address: The Ohio State University College of Medicine, Columbus, OH 43210
$ Current address: Northeastern University, Boston, MA 02115
# Current address: Emory University, Atlanta, GA 30322
@ Current address: University of Connecticut, Storrs, CT, 06269


* Email correspondence: cnelya@g.harvard.edu; debivort@oeb.harvard.edu


Summit behavior assay construction (Microsoft Office & Adobe Acrobat)

Laser-cutting schematics
Box All panels to be cut from 1/4" black acrylic.
Arenas SummitArena_tray.pdf to be cut from 1/8" clear acrylic and mounted with diffusion film.

SummitArena_base.pdf to be cut from 1/8" clear acrylic and manually sanded.

SummitArena_walls.pdf to be cut from 1/8" black acrylic.

SummitArena_lid.pdf to be cut from 1/8" clear acrylic.
Assembly and use
Component list.pdf Components needed to assemble the summiting box described in the manuscript. Also available as an .xlsx file.
Summit box guide.pdf Instructions for box assembly, computer configuration, and behavioral assay execution. Also available as a .docx file.

Data files (MATLAB 2018b, Python 3.6, Microsoft Excel)

Fig 1 & 1-S1
Cs_all_standard.mat "data" struct containing all Canton-S zombie and survivor traces from standard summiting assay (30°, food at y=0)

Fig 1D, E
Cs_all_standard_QUANT.mat "summit" struct containing all Canton-S zombie SM values from standard summiting assay (30°, food at y=0)

Fig 1I, J
Cs_30deg_food0.mat "data" struct containing Canton-S zombie and survivor traces from standard summiting experiment (30°, food at y=0)

Fig 1F, G, H
Cs_30deg_food0_QUANT.mat "summit" struct containing Canton-S zombie SM values from standard summiting experiment (30°, food at y=0)

Fig 1-S1K
Cs_0deg_food0.mat "data" struct containing Canton-S zombie and survivor traces from flat summiting experiment (0°, food at y=0)

Fig 1G
Cs_0deg_food0_QUANT.mat "summit" struct containing Canton-S zombie SM values from flat summiting experiment (0°, food at y=0)

Fig 1-S1K
Cs_30deg_food1.mat "data" struct containing Canton-S zombie and survivor traces from inverted summiting experiment (30°, food at y=1)

Fig 1H
Cs_starved_unexposed.mat "data" struct containing unexposed Canton-S traces from starvation experiment (30°, agar at y=0)

Fig 1-S1A,C
Cs_desiccated_unexposed.mat "data" struct containing unexposed Canton-S traces from desiccation experiment (30°, no media)

Fig 1-S1B,D
Cs_30deg_food1_72hr.mat "data" struct containing Canton-S zombie and survivor traces from inverted summiting experiment, tracked since Day 1 (30°, food at y=1)

Fig 1-S1E
Cs_0deg_food_choice.mat "data" struct containing Canton-S zombie and survivor traces from food choice experiment (0°, food at y=1; agar at y=0)

Fig 1-S1F
Cs_starved_exposed.mat "data" struct containing exposed Canton-S zombie and survivor traces from overnight starvation experiment (30°, agar at y=0)

Fig 1-S1A,C
Cs_tallwells_30deg.mat "data" struct containing exposed Canton-S zombie and survivor traces in tall arenas (30°, agar at y=0)

Fig 1-S1I
Cs_tallwells_0deg_QUANT.mat "summit" struct containing Canton-S zombie SM values from tall wells (0°, food at y=0)

Fig 1-S1J
Cs_tallwells_0deg.mat.mat "data" struct containing exposed Canton-S zombie and survivor traces in tall arenas (0°, agar at y=0)

Fig 1-S1I
Cs_tallwells_30deg_QUANT.mat "summit" struct containing Canton-S zombie SM values from tall wells (30°, food at y=0)

Fig 1-S1J
Cs_all_standard_F_QUANT.mat "summit" struct containing all female Canton-S zombie SM values from standard summiting assay (30°, food at y=0)

Fig 1-S1L,M
Cs_all_standard_M_QUANT.mat "summit" struct containing all male Canton-S zombie SM values from standard summiting assay (30°, food at y=0)

Fig 1-S1L,M
Fig 2 & 2-S1
Screen_SM_values.mat "summit" struct containing all SM values for all lines run in gene and neuron disruption screen

Fig 2B-G
Neurons (2B, 2D).xlsx Table of screened Gal4 lines, used to estimate summiting effect sizes

Fig 2B, D
Genes (2C, 2E).xlsx Table of screened mutants and RNAi lines, used to estimate summiting effect sizes

Fig 2C, E
Circadian (2F).xlsx Table of screened circadian mutants and RNAi lines with paired genetic controls, used to estimate summiting effect sizes

Fig 2F
PI and circadian Gal4 (2G).xlsx Table of screened PI and circadian Gal4 lines with paired genetic controls, used to estimate summiting effect sizes

Fig 2G
R19G10-syt-eGFP,DenMark_20x.czi Confocal micrograph z-stack of R19G10>syt-eGFP, DenMark brain

Fig 2H
Clk4_1_TrpA_all.mat "data" struct containing all Clk4.1>TrpA1 fly traces in thermogenetic experiment

Fig 2I
Clk4_1_TrpA_sib_ctrl_all.mat "data" struct containing Clk4.1 sibling control fly traces in thermogenetic experiment

Fig 2I
R19G10_TrpA_all.mat "data" struct containing R19G10>TrpA1 fly traces in thermogenetic experiment

Fig 2J
R19G10_TrpA_sib_ctrl_all.mat "data" struct containing R19G10 sibling control fly traces in thermogenetic experiment

Fig 2J
05-24-2022-12-08-17__centroid.bin MARGO binary file containing (number of timepoints x number of ROIs) array of centroid positions for R19G10>CsChrimson constant red stimulus optogenetics experiment

Fig 2K, 2-SG
R19G10_TNT-E.mat "data" struct containing R19G10>TNT-E zombie and survivor traces

Fig 2-S1A
R19G10_TNT-E_QUANT.mat "summit" struct containing SM values for R19G10>TNT-E zombies

Fig 2-S1A
R19G10_sib_ctrl.mat "data" struct containing R19G10>TNT-E sibling control zombie and survivor traces

Fig 2-S1A
R19G10_sib_ctrl_QUANT.mat "summit" struct containing SM values for R19G10>TNT-E sibling control zombies

Fig 2-S1A
Clk4_1_TNT-E.mat "data" struct containing Clk4.1>TNT-E zombie and survivor traces

Fig 2-S1B
Clk4_1_TNT-E_QUANT.mat "summit" struct containing SM values for Clk4.1>TNT-E zombies

Fig 2-S1B
Clk4_1_sib_ctrl.mat "data" struct containing Clk4.1>TNT-E sibling control zombie and survivor traces

Fig 2-S1B
Clk4_1_sib_ctrl_QUANT.mat "summit" struct containing SM values for Clk4.1>TNT-E sibling control zombies

Fig 2-S1B
R18H11_TNT-E.mat "data" struct containing R18H11>TNT-E zombie and survivor traces

Fig 2-S1C
R18H11_TNT-E_QUANT.mat.mat "summit" struct containing SM values for R18H11>TNT-E zombies

Fig 2-S1C
R18H11_sib_ctrl.mat "data" struct containing R18H11>TNT-E sibling control zombie and survivor traces

Fig 2-S1C
R18H11_sib_ctrl_QUANT.mat "summit" struct containing SM values for R18H11>TNT-E sibling control zombies

Fig 2-S1C
Circadian neurons (S1-2D).xlsx Table of circadian lines driving non-TNT-E effectors and paired genetic controls

Fig 2-S1D
Cavanaugh pathway (S1-2F).xlsx Table of screened lines from Cavanaugh et al., 2014 with paired genetic controls, used to estimate summiting effect sizes

Fig 2-S1F
Clk4_1_TrpA__F.mat "data" struct containing Clk4.1>TrpA1 fly traces in thermogenetic experiment, females only

Fig 2-S2A
Clk4_1_TrpA_sib_ctrl_F.mat "data" struct containing Clk4.1>TrpA1 sibling control fly traces in thermogenetic experiment, females only

Fig 2-S2A
Clk4_1_TrpA_M.mat "data" struct containing Clk4.1>TrpA1 fly traces in thermogenetic experiment, males only

Fig 2-S2B
Clk4_1_TrpA_sib_ctrl_M.mat "data" struct containing Clk4.1>TrpA1 sibling control fly traces in thermogenetic experiment, males only

Fig 2-S2B
R19G10_TrpA_F.mat "data" struct containing R19G10>TrpA1 fly traces in thermogenetic experiment, females only

Fig 2-S2C
R19G10_TrpA_sib_ctrl_F.mat "data" struct containing R19G10>TrpA1 fly sibling control traces in thermogenetic experiment, females only

Fig 2-S2C
R19G10_TrpA_M.mat "data" struct containing R19G10>TrpA1 fly traces in thermogenetic experiment, males only

Fig 2-S2D
R19G10_TrpA_sib_ctrl_M.mat "data" struct containing R19G10>TrpA1 fly sibling control traces in thermogenetic experiment, males only

Fig 2-S2D
09-30-2022-10-39-38__centroid.bin MARGO binary file containing (number of timepoints x number of ROIs) array of centroid positions for R19G10>CsChrimson pulsed red stimulus optogenetics experiment

Fig 2-S2E-F
Fig 3 & 3-S1, S2
Slide1-head1-63x_fullstack.czi Confocal z-stack micrograph of R19G10>syt-eGFP, DenMark retrocerebral complexes stained for phalloidin (AF568), chicken-GFP (AF488), guinea pig-JHAMT (AF647)

Fig 3B
CA ablation with DTI.mat "combinedSummit" struct with SM for tub-Gal80(ts), Aug21>DTI CA ablated flies and sibling controls

Fig 3C
CA ablation with DTI.xlsx Table of accompanying data for effect size analysis of DTI CA ablation

Fig 3C
10x_CAKO_4_300ms_1_1.tif GFP epifluorescence micrograph of CA ablated fly. Additional examples in folder.

Fig 3D, Fig 3-S2D
CA ablation with DTI.xlsx GFP epifluorescence micrograph of sibling control

Fig 3D
Precocene dilution_SM.mat "combinedSummit" struct with SM for precocene and acetone treated Canton-S flies

Fig 3E
Precocene dilution.xlsx Table of accompanying data for effect size analysis of precocene treatment

Fig 3E
CA ablation with NiPP1_SM.mat "combinedSummit" struct with SM for tub-Gal80(ts), Aug21>NiPP1 CA ablated flies and sibling controls

Fig 3-S1B
CA ablation with NiPP1.xlsx Table of accompanying data for effect size analysis of NiPP1 CA ablation

Fig 3-S1B
NiPP1 ablation examples Directory of confocal micrographs of tub-Gal80(ts), Aug21>NiPP1 CA flies

Fig 3-S1C
NiPP1 ablation control examples Directory of confocal micrographs of tub-Gal80(ts), Aug21>NiPP1 sibling control flies

Fig 3-S1C
Fluvastatin_72_SM.mat "combinedSummit" struct with SM for fluvastatin and vehicle control treated flies

Fig 3-S2B
Fluvastatin_72.xlsx Table of accompanying data for effect size analysis of fluvastatin at 72 hours

Fig 3-S2B
Fluvastatin_72.mat "data" struct containing traces for Canton-S flies fed fluvastatin at 72 hours

Fig 3-S2C
Fluvastatin_72_ctrl.mat "data" struct containing traces for Canton-S flies fed vehicle control at 72 hours

Fig 3-S2C
Fluvastatin_survival.xlsx Survival outcomes for fluvastatin treated flies at 24 and 72 hours

Fig 3-S2D
Fluvastatin_24.mat "data" struct containing traces for Canton-S flies fed fluvastatin at 24 hours. (All NI flies were treated as zombies in order to determine time of death.)

Fig 3-S2E
Fluvastatin_24_ctrl.mat "data" struct containing traces for Canton-S flies fed vehicle control at 24 hours

Fig 3-S2E
Precocene_survival.xlsx Survival outcomes for flies treated with precocene at 3 different concentrations

Fig 3-S2F
Precocene_0_1ug.mat "data" struct containing traces for Canton-S flies treated with 0.1 ug precocene

Fig 3-S2G
Precocene_0_1ug_ctrl.mat "data" struct containing traces for control flies for 0.1 ug precocene experiment

Fig 3-S2G
Precocene_2_5ug.mat "data" struct containing traces for Canton-S flies treated with 2.5 ug precocene

Fig 3-S2G
Precocene_2_5ug_ctrl.mat "data" struct containing traces for control flies for 2.5 ug precocene experiment

Fig 3-S2G
Precocene_5ug.mat "data" struct containing traces for Canton-S flies treated with 5 ug precocene

Fig 3-S2G
Precocene_5ug_ctrl.mat "data" struct containing traces for control flies for 5 ug precocene experiment

Fig 3-S2G
Methoprene_SM.mat "combinedSummit" struct with SM for flies treated with methoprene or vehicle control

Fig 3-S2H
Methoprene.xlsx Table of accompanying data for effect size analysis of methoprene treatment

Fig 3-S2H
Precocene JHA_SM.mat "combinedSummit" struct with SM for flies treated with precocene with or without added JHA

Fig 3-S2I
Precocene JHA.xlsx Table of accompanying data for effect size analysis of precocene and JHA treatment

Fig 3-S2I
Fig 4 & 4-S1
experiment-tag_margoConvert.mat For a particular MARGO tracking experiment, this file contains an expmt struct with centroid and speed (number of timepoints x number of ROIs) struct arrays. This tracking data is later read in to train/test the summiting classifier.

A total of 15 experiments were used for training/testing the classifier, each with an experiment-tag_margoConvert.mat file. The particular file linked at left belongs to the 03-22-2019-18-21-01__Circadian_CsWF-BoardC9_MF_Emuscae_1-128_Day3 experiment.
experiment-time-tag-survival_data.xlsx For a particular MARGO tracking experiment, this spreadsheet contains manually-scored records of whether each fly summited in the time of the experiment (along with the time at which summiting was deemed to begin and the time of last movement). This information was used to create ground-truth summiting labels for training/testing the summiting classifier.

A total of 15 experiments were used for training/testing the classifier, each with an experiment-time-tag-survival_data.xlsx file. The particular file linked at left belongs to the 03-22-2019-18-21-01__Circadian_CsWF-BoardC9_MF_Emuscae_1-128_Day3 experiment.
Fig 5 & 5-S1
HisRFP summiting brains Confocal micrographs from HisRFP summiting flies (.czi) and counts of nuclei across brain regions for these flies (.xlsx)

Fig 5A-C
brain1_nc82.mat Contains grayscale intensity of nc82-stained fly brain in 3 dimensions

Fig 5B, Fig 5-S1C-D
brain1_nuclei.csv Contains 3-d coordinates of fungal nuclei

Fig 5B
HisRFP 72 hours exposed brains Confocal micrographs from HisRFP flies collected 72 hours after infection with E. muscae (.czi)

Fig 5-S1A
PI-CA>mcd8GFP brains Parent directory for confocal micrographs (.czi) of brains from PI-CA>mcd8GFP flies that were unexposed, exposed and summiting, or exposed and recently killed by the fungus (Cadavers). Brains were stained for phalloidin (AF568), GFP (AF488) and Pdf (Cy5). Also includes a spreadsheet with E. muscae hole counts by neuropil (.xlsx)

Fig 5D-F
Non-summiting Aug21>GFP heads Example of Aug21>GFP heads and retrocerebral complexes from unexposed controls

Fig 5I, 5-S1F
Summiting Aug21>GFP heads Example of summiting Aug21>GFP heads and retrocerebral complexes

Fig 5I, 5-S1F
Fig 6 & 6-S1, S2
Classifier_metabolomics_peaks.cdResult Compound Discoverer file containing metabolomics results from classifier-called experiment

Fig 6B, 6-S2
Classifier_metabolomics_data.xlsx Tabulated data for classifier-based metabolomics experiment

Fig 6B, 6-S2A-C
Manual_metabolomics_peaks.cdResult Compound Discoverer file containing metabolomics results from manually staged experiment

Fig 6B, 6-S2
Manual_metabolomics_data.xlsx Tabulated data for manually staged metabolomics experiment

Fig 6B, 6-S2A-C
Data Fig 6-S2B.xlsx .xlsx file containing ratios of summiting/non-summiting and summiting/unexposed compound abundance for classifier-based metabolomics experiment (output of generateMetSuppPlotData.m)

Fig 6-S2B
Data Fig 6B.xlsx .xlsx file containing volcano plot data for classifier-based metabolomics experiment

Fig 6B
Data Fig 6-S2C.xlsx .xlsx file containing ratios of summiting/non-summiting and summiting/unexposed compound abundance for manually staged metabolomics experiment (output of generateMetSuppPlotData.m)

Fig 6-S2C
Overlap.xlsx .xlsx file containing intersection of manual and classifier-staged metabolomics experiments (based on output from compareExperiments.m)

Fig 6B, Fig 6-S2A-C
Transfusion data - healthy males Directory containing three folders, each with behavioral data (.mat) and metadata (.xlsx) file for a single hemolymph transfusion experiment with healthy male recipients

Fig 6C
Transfusion data - infected females Directory containing three folders, each with behavioral data (.mat) and metadata (.xlsx) file for a single hemolymph transfusion experiment with infected female recipients

Fig 6D

Experimental data analyses, visualization functions, and scripts (MATLAB 2018b)

Summit behavior analysis
File Input Computation Dependencies Output
blindTOD.m Raw behavior data (MARGO output) + experiment metadata (survival.xlsx)
  • Convert MARGO data to autotracker format (expmt struct)
  • Run denoising based on x position (optional)
  • Plot all xy data for all input files to assess tracking and check metadata entry (optional)
  • Randomize all cadavers across input data and prompt the user to select time of death based on behavioral traces (fly genotypes and ROIs are masked)
Append times of death to experimental metadata
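The blinding step in blindTOD.m can be illustrated with a minimal Python sketch. This is not the MATLAB implementation: the function name, the anonymous ID format, and the representation of each cadaver as a (genotype, ROI) tuple are all assumptions made for illustration.

```python
import random


def blind_and_shuffle(cadavers, seed=None):
    """Shuffle cadavers and hide their identity behind anonymous IDs.

    `cadavers` is a list of (genotype, roi) tuples (hypothetical layout).
    Returns the anonymous IDs in presentation order and the key that maps
    each ID back to the original (genotype, roi) pair.
    """
    rng = random.Random(seed)
    order = list(range(len(cadavers)))
    rng.shuffle(order)  # randomize presentation order
    key = {f"fly_{i:03d}": cadavers[j] for i, j in enumerate(order)}
    return list(key), key


# Illustrative input only; ROI numbers are made up.
cadavers = [("Canton-S", 4), ("Canton-S", 17),
            ("R19G10>TNT-E", 2), ("R19G10>TNT-E", 9)]
masked_ids, key = blind_and_shuffle(cadavers, seed=0)
```

The scorer would see only `masked_ids`; `key` is consulted after scoring to write each time of death back to the right fly.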
triggerOnDeath2019Fun.m Output directory, list of converted data files, analysis options
  • Determine number of unique genotypes (with option to split sexes) contained within data
  • Align data to common time axis
  • Compile genotype data split by survival outcome
  • Assign fictive time of death to non-zombie flies
  • Align behavioral trajectories based on time of death
  • Struct called “data” containing aligned and unaligned behavioral traces for each fly type, experiment metadata
  • Plots of average behavioral traces for zombies (Cadavers) and survivors (Alive) aligned to time of death
  • Histogram of times of death
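Aligning trajectories to time of death, as listed for triggerOnDeath2019Fun.m, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the MATLAB code: traces are plain lists, time of death is a frame index, and the function names and `window` parameter are hypothetical.

```python
def align_to_death(trace, tod_index, window):
    """Return the `window` samples ending at the time of death (inclusive).

    If the recording starts less than `window` samples before death, the
    result is left-padded with None so all aligned traces share one length.
    """
    start = tod_index - window + 1
    pad = [None] * max(0, -start)
    return pad + trace[max(0, start):tod_index + 1]


def mean_ignoring_missing(aligned):
    """Per-timepoint mean across flies, skipping None padding."""
    means = []
    for column in zip(*aligned):
        values = [v for v in column if v is not None]
        means.append(sum(values) / len(values) if values else None)
    return means


# Made-up relative y positions for two flies and their death frames.
traces = [[0.1, 0.2, 0.5, 0.9, 0.9], [0.0, 0.3, 0.8, 0.8, 0.8]]
tods = [3, 2]
aligned = [align_to_death(t, d, window=3) for t, d in zip(traces, tods)]
average = mean_ignoring_missing(aligned)
```

After alignment, the last sample of every trace is the frame of death, so averaging column by column gives a death-triggered mean.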
quantifySummitIndividualFiles.m Folder containing one or more “data” structs generated from triggerOnDeath2019Fun.m
  • Calculates summit metric for each fly within each data struct
  • Struct named “summit” for each input genotype; contains SM for individual flies for each genotype and analysis metadata
  • Speed plot for every fly analyzed (Fig 1I)
  • Mean speed plot for genotype
mergeIndividualQuantFiles.m Folder containing one or more “summit” structs generated from quantifySummitIndividualFiles.m
  • Compare analysis parameters; if identical, compile all input genotype data into master struct
  • Struct named "combinedSummit"
  • List of all genotypes contained within combinedSummit and numerical indices
estimateEffectSize.m
  • combinedSummit struct
  • Excel spreadsheet listing experimental and control genotypes (and optionally, pathway/neuropil annotations)
  • For each experimental and control pair, calculate effect size for bootstrapped data 1,000 times
  • Perform two-tailed t-test on raw (unbootstrapped) data
  • Violin plots of effect sizes (optionally ordered by mean effect)
  • Histograms of annotations per quantile
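The bootstrapping step listed for estimateEffectSize.m can be sketched in Python. The page does not define the effect size, so this sketch assumes a relative difference in mean SM, (experimental - control) / control; that definition, the function name, and the example values are all assumptions.

```python
import random
from statistics import mean


def bootstrap_effect_sizes(experimental, control, n_boot=1000, seed=None):
    """Resample both groups with replacement and return n_boot effect sizes.

    Effect size here is (mean_exp - mean_ctrl) / mean_ctrl, an assumed
    definition chosen for illustration.
    """
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_boot):
        exp_sample = rng.choices(experimental, k=len(experimental))
        ctl_sample = rng.choices(control, k=len(control))
        ctl_mean = mean(ctl_sample)
        sizes.append((mean(exp_sample) - ctl_mean) / ctl_mean)
    return sizes


# Made-up SM values for one experimental/control pair.
experimental_sm = [0.20, 0.25, 0.30, 0.22, 0.28]
control_sm = [0.50, 0.55, 0.60, 0.52, 0.58]
effect_sizes = bootstrap_effect_sizes(experimental_sm, control_sm, seed=1)
```

The resulting list of 1,000 values is the kind of distribution a violin plot would summarize for each genotype pair.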
Metabolomics analysis
File Input Dependencies Output
compareExperiments.m Metabolomic data from each of two experiments (xlsx spreadsheets) to generate overlap file
  • summary of overlapping metabolites between experiments
  • table containing data for plotting from each experiment (Fig 6-S1C,D)
generateMetSuppPlotData.m Overlap.xlsx (from compareExperiments.m) and data from desired metabolomics experiment
  • .xlsx file with information for overlapping compounds within selected experiment (Fig 6-S2B,C)
metabolomicsVolcano.m Volcano plot data.xlsx Volcano plot (Fig 6B)
metabolomicsSupp.m Supp data.xlsx (from either manual or classifier experiment) Scatter plot of log2(fold change summiting/non-summiting) versus log2(fold change summiting/unexposed), with points labeled with putative identity and colored according to occurrence across sample types (Fig 6-S2B,C)
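The two axes of the metabolomicsSupp.m scatter plot are log2 fold changes. A minimal Python sketch of that arithmetic is shown here; the compound names, abundance values, and dictionary layout are hypothetical and do not reflect the actual spreadsheet columns.

```python
import math


def log2_fold_change(numerator, denominator):
    """log2 of the abundance ratio; both abundances must be positive."""
    return math.log2(numerator / denominator)


# Hypothetical mean abundances per sample type.
abundance = {
    "compound_A": {"summiting": 8.0, "non_summiting": 2.0, "unexposed": 4.0},
    "compound_B": {"summiting": 1.0, "non_summiting": 4.0, "unexposed": 1.0},
}

# (x, y) = (summiting/non-summiting, summiting/unexposed), both in log2 units.
points = {
    name: (
        log2_fold_change(a["summiting"], a["non_summiting"]),
        log2_fold_change(a["summiting"], a["unexposed"]),
    )
    for name, a in abundance.items()
}
```

A compound enriched in summiting flies relative to both comparison groups lands in the upper-right quadrant of such a plot.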
Hemolymph transfusion analysis
File Input Computation Dependencies Output
prepareSensitizedHemoExp.m
  • Converts data from MARGO format
  • experimental metadata spreadsheet
  • Convert MARGO data to autotracker format (expmt struct)
  • Auto denoise
  • Prompts user to determine time of first movement (i.e., first frame when fly appears to have fully recovered from cold anesthesia)
  • Asks user to determine whether the fly was viable, based on behavioral trace (i.e., whether the fly recovered from anesthesia).
  • Stores time of first movement to experimental metadata
  • Flags unviable flies to be dropped in subsequent analysis
hemoAnalysisCombined.m Behavioral dataset (either healthy male recipients or exposed female recipients)
  • Sorts behavioral data based on identity of hemolymph donor
  • Aligns behavior traces to time of first movement (recovery from anesthesia)
  • Calculates mean speed, distance traveled over desired window
  • Average speed by treatment (i.e., received summiting or non-summiting hemolymph)
  • Two-tailed t-test values for every 5-minute interval across first 2 hours
  • Plot of distance traveled by treatment (Fig 6C,D)
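The per-fly quantities listed for hemoAnalysisCombined.m (mean speed and distance traveled over a window that starts at first movement) can be sketched in Python. This is an illustration under assumptions: speed is a list sampled at a constant interval `dt`, first movement is a frame index, and the function names are hypothetical.

```python
def windowed_segment(speeds, first_move, window):
    """Speed samples for `window` frames starting at first movement."""
    return speeds[first_move:first_move + window]


def distance_traveled(speeds, first_move, window, dt):
    """Approximate distance as the sum of speed * dt over the window."""
    return sum(s * dt for s in windowed_segment(speeds, first_move, window))


def mean_speed(speeds, first_move, window):
    """Mean speed over the same window."""
    segment = windowed_segment(speeds, first_move, window)
    return sum(segment) / len(segment)


# Made-up speed trace (mm/s) sampled every 0.5 s; fly first moves at frame 2.
speeds = [0.0, 0.0, 1.0, 2.0, 3.0, 1.0]
distance = distance_traveled(speeds, first_move=2, window=3, dt=0.5)
average = mean_speed(speeds, first_move=2, window=3)
```

Starting the window at first movement rather than at frame zero keeps differences in anesthesia recovery time from inflating or deflating the totals.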
Figure plots
File Input Dependencies Output
plotExampleROI.m Converted MARGO data Plot of raw y position versus time (Fig 1D)
plotTodHistogram.m Data struct Histogram of times of death (Fig 1E; Fig 1-S1C,D,H; Fig 3-S2C, E, G)
plotYandSpeed.m Data struct
  • Plot of mean relative y position (+/- standard error of mean) for zombies and survivors (Fig 1F-H; Fig 1-S1A,B,G,I,J)
  • As above but mean speed in mm/s (Fig 1F-H; Fig 1-S1E, F, J)
plotSumIntVHt.m Data struct
  • Plot of relative y position change versus SM (Fig 1J)
plotTrpA1Exp.m Data structs for experimental and control genotypes Mean speed (+/- standard error of mean) versus time with color bar along x-axis corresponding to observed temperatures across experiment (Fig 2I,J, Fig 2-S1E-H)
plotQuantScatter.m
  • combinedSummit struct
  • indices to plot (hard-coded)
Scatter plot of desired indices (Fig 1H; Fig 1-S1K-M)
picaCellBodies.m Numbers of cell bodies observed in R19G10>mcd8GFP flies that were unexposed, summiting, or recently killed by E. muscae (hard-coded in script)
  • Bumblebee plot of counts per treatment
  • p-values for two-tailed t-test for all pairwise comparisons (Fig 5E)
hisRFPNuclei.m Fungal nuclei counts across HisRFP summiting brains Scatter plot of fraction counts per brain region (Fig 5C)
picaNeuropilHoles.m Neuropil hole counts across R19G10>mcd8GFP summiting flies Scatter plot of fraction counts per brain region (Fig 5F)
bbb96hr.m Fraction of bright-eyed flies over unexposed, exposed, and infected conditions (hard-coded) Pie charts for each group (Fig 6A)
bbbOverTime.m Fraction of bright-eyed flies over unexposed, exposed, and infected conditions (hard-coded) Scatter plot of data vs. time observed with estimated standard error (Fig 6-S1)
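Several of the plotting functions above (for example plotYandSpeed.m and plotTrpA1Exp.m) show a mean with its standard error across flies. A minimal Python sketch of that summary is given here, assuming aligned traces of equal length stored as lists; the function name and data are illustrative, and the sample standard deviation is an assumed choice.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(traces):
    """Per-timepoint mean and standard error of the mean across flies.

    `traces` holds one equal-length list per fly; at least two flies are
    needed because the sample standard deviation is used.
    """
    means, sems = [], []
    for column in zip(*traces):
        means.append(mean(column))
        sems.append(stdev(column) / sqrt(len(column)))
    return means, sems


# Made-up relative y positions for three flies at three timepoints.
traces = [[0.0, 0.2, 0.4], [0.2, 0.4, 0.8], [0.4, 0.6, 0.6]]
means, sems = mean_and_sem(traces)
```

Plotting `means` with a band of `means[i] - sems[i]` to `means[i] + sems[i]` gives the shaded-error style of trace described above.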


Experimental data analyses, visualization functions, and scripts (Python 3.6)

R19G10>CsChrimson optogenetics analysis
File Input Dependencies Output
analyze_optogenetics_constant_red.ipynb MARGO tracking data Speed vs. time plot for ATR+, ATR- flies for 3 minutes before/after constant red light stimulus (Fig 2K)
analyze_optogenetics_pulsed_red.ipynb MARGO tracking data Kernel density estimates of speed distribution for ATR+, ATR- flies in the absence/presence of pulsed red light stimulus (Fig 2-S1 J-K)
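Both notebooks work with speed derived from MARGO centroid tracking. The Python sketch below shows one way speed could be computed from centroids and windowed around a stimulus onset. It is not taken from the notebooks: the (x, y) tuple layout, the constant frame interval `dt`, and both function names are assumptions.

```python
from math import hypot


def speeds_from_centroids(xy, dt):
    """Frame-to-frame speed from a list of (x, y) centroid positions."""
    return [
        hypot(x1 - x0, y1 - y0) / dt
        for (x0, y0), (x1, y1) in zip(xy, xy[1:])
    ]


def window_around(values, onset, half_width):
    """Samples from `half_width` frames before onset to `half_width` after."""
    return values[max(0, onset - half_width):onset + half_width]


# Made-up centroid track for one ROI, one frame per second.
track = [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0), (6.0, 8.0)]
speeds = speeds_from_centroids(track, dt=1.0)

# Generic windowing example on a toy series.
around_onset = window_around(list(range(10)), onset=5, half_width=2)
```

For a 3-minute window on either side of the stimulus, `half_width` would be 180 divided by the frame interval in seconds.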
3-d visualization of whole brain E. muscae invasion
File Input Output
plot_brain_nuclei.ipynb Anterior and posterior views of nuclei within fly brain (Fig 5B)


Random forest classifier of summiting behavior (Python 3.6)

All code is available on GitHub.
data processing, training & testing of classifier
training/data_processing_overview.ipynb Jupyter notebook that
  • introduces the tracking data (y position and speed through time)
  • shows examples of summiting/non-summiting trajectories
  • discusses how trajectories are distilled into feature vectors suitable for subsequently training a classifier
training/train_random_forest.ipynb Jupyter notebook that
  • splits fly trajectories into training/validation sets
  • uses the training data to train a random forest classifier
  • evaluates performance on the validation data
  • generates Fig 4C-D, Fig 4-S1A

Running this notebook to train a model saves a Python dictionary with relevant training parameters, as well as a scikit-learn RandomForestClassifier object representing the trained model, that can be accessed later to perform new classifications.

By default, running this notebook will read in the trained classifier model used in all experiments described in this paper. However, this notebook can also train a new model and save it to the models/ directory.
training/test_classification.ipynb Jupyter notebook that
  • reads in the outputs of training/train_random_forest.ipynb (a dictionary of parameters and a random forest model)
  • performs classification on a held-out testing dataset
  • compares different decision rules for converting class probabilities into alerts that notify experimenters of actively summiting flies
  • generates Fig 4B, E-G, Fig 4-S1B
trained random forest classifier model files ready to use for classification
models/params_example_classifier_model.p Python pickle file containing Python dictionary storing training parameters for model used in all classifier experiments in this paper, generated from training/train_random_forest.ipynb
models/clf_example_classifier_model.p Python pickle file containing sklearn RandomForestClassifier object representing the model used in all classifier experiments in this paper, generated from training/train_random_forest.ipynb
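Both model files are Python pickles, so they can be read with the standard library. The sketch below demonstrates the loading pattern on a stand-in file written to a temporary directory; the stand-in dictionary and its key are hypothetical and do not reflect the contents of the real parameter file.

```python
import os
import pickle
import tempfile


def load_pickle(path):
    """Read one pickled Python object from `path`."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Stand-in for a parameter dictionary; the key name is made up.
stand_in_params = {"example_setting": 1}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "params_stand_in.p")
    with open(path, "wb") as handle:
        pickle.dump(stand_in_params, handle)
    loaded = load_pickle(path)
```

The same `load_pickle` call would apply to the two files under models/. Unpickling the classifier file requires scikit-learn to be installed, ideally in a version compatible with the one used for training, and pickle files should only be loaded from trusted sources.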
input files for running classifier experiments
input_files/input_file_example.py Python file defining a dictionary used in run_classifier.py with relevant parameters including the desired name of the output directory, the path to the tracking data, classifier settings, and email preferences
running classifier experiments
run_classifier.py Main Python file that reads in settings from a desired input file and performs classification on a desired tracking dataset with desired classification frequency
auxiliary functions used in setting up, running, and analyzing classifier
utils/time_utils.py Python script containing utility functions for extracting/generating timestamps, for instance from tracking frames
utils/load_data_utils.py Python script containing utility functions, chiefly for reading in and processing centroid data output by MARGO
utils/process_data_utils.py Python script containing helper functions for generating feature vectors to train the classifier, and helper functions for writing classifier outputs to log files
utils/prediction_utils.py Python script containing utility functions for converting predicted class probabilities from the classifier into alerts (e.g. call a fly summiting when there are consecutive stretches in which P(summiting) > P(non-summiting))
utils/gmail_account_info.py Python script containing variables storing account information for the gmail address that sends classifier outputs
utils/email_utils.py Python script containing utility functions for connecting to the internet and emailing classifier outputs. Uses account information specified in utils/gmail_account_info.py
utils/classifier_plot_utils.py Python script containing utility functions for plotting feature vector attributes, classifier output probabilities, etc.
utils/plot_trajs_with_class_probs.py Python script that plots trajectories of relative y position and speed, along with classifier class probabilities, for all ROIs of a summiting experiment
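The decision rule described for utils/prediction_utils.py (call a fly summiting after consecutive classifications in which P(summiting) > P(non-summiting)) can be sketched in Python as below. This is an illustration, not the repository code: the function name and the `n_consecutive` parameter are assumptions, and a two-class model is assumed so that P(non-summiting) = 1 - P(summiting).

```python
def first_alert_index(p_summiting, n_consecutive):
    """Index at which a run of `n_consecutive` summiting calls completes.

    A call counts as "summiting" when P(summiting) exceeds
    P(non-summiting) = 1 - P(summiting). Returns None if no run of the
    required length occurs.
    """
    run = 0
    for i, p in enumerate(p_summiting):
        run = run + 1 if p > 1.0 - p else 0  # reset on any non-summiting call
        if run >= n_consecutive:
            return i
    return None


# Made-up class probabilities from successive classifications of one fly.
probabilities = [0.2, 0.6, 0.4, 0.7, 0.8, 0.9]
alert_after_three = first_alert_index(probabilities, n_consecutive=3)
alert_after_two = first_alert_index(probabilities, n_consecutive=2)
```

Requiring a longer run trades earlier alerts for fewer false alarms from isolated high-probability calls, which is the kind of trade-off a comparison of decision rules would explore.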