Companion Data Page

Wild flies hedge their thermal preference bets in response to seasonal fluctuations

Jamilla Akhund-Zade1, Denise Yoon1, Alyssa Bangerter2*, Nikolaos Polizos3*, Matthew Campbell2, Anna Soloshenko, Thomas Zhang3, Eric Wice4, Ashley R. Albright5, Aditi Narayanan6, Paul Schmidt7, Julia Saltz4, Julien Ayroles8, Mason Klein3, Alan O. Bergland2, Benjamin L. de Bivort1†

1 - Department of Organismal & Evolutionary Biology, Harvard University, Cambridge, MA, USA
2 - Department of Biology, University of Virginia, Charlottesville, VA, USA
3 - Department of Biology, University of Miami, Coral Gables, FL, USA
4 - Department of Biology, Rice University, Houston, TX, USA
5 - Department of Molecular & Cellular Biology, University of California Berkeley, Berkeley, CA, USA
6 - Department of Biology, California Institute of Technology, Pasadena, CA, USA
7 - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
8 - Department of Ecology & Evolutionary Biology, Princeton University, Princeton, NJ, USA

* equal contribution
† corresponding author

Manuscript PDF
Pre-print at bioRxiv
Download all code and data at Zenodo

Main Loading Scripts

These scripts process and filter positional data exported by MARGO, a MATLAB-based object tracking software. The beta version of MARGO, used exclusively for all the acquired experimental data, outputs a .mat file. The following scripts are MATLAB-exclusive.
Plotting note: some MATLAB scripts require the gramm package for data visualization. Installation here.
createTempPref.m The main loading function. It takes all of the raw experimental .mat files in a single folder and outputs an expmts struct and a tempPref cell array. The expmts struct holds the experimental metadata, raw centroid values, time stamps, and calculated temperature preference metrics for each experimental file. The tempPref cell array holds the temperature preference metrics and labels for all flies across the experiment files.

The tempPref column labels are as follows (rows correspond to individual flies):
1. occupancy of hot side (0-1)
2. distance traveled in px
3. vector of positions over time (used in plotting raw positional data)
4. x-position of top left corner of tunnel in captured camera image (px)
5. y-position of top left corner of tunnel in captured camera image (px)
6. length of tunnel in captured camera image (px)
7. width of tunnel in captured camera image (px)
8. x-position of tunnel center (px)
9. y-position of tunnel center (px)
twoChoicePref.m Function to calculate the occupancy score for the tempPref cell array. Called by createTempPref.m.
splitROI.m Function that splits tunnel, so that positional data can be matched with the cold and hot sides. Called by twoChoicePref.m.
activityThresholding.m Thresholds fly activity based on % of time spent moving. The thresh parameter should range from 0 to 1, where 0 accepts flies with no movement and 1 accepts only constantly moving flies.
removeInactiveBouts.m Removes bouts where a fly does not move more than dist px for longer than time and recalculates temperature preference. In addition, subsamples centroid data to subsampleRate to decrease file size. Timing parameters are in units of frames. Called by activityThresholding.m.
plotting4hrTracks.m Function to plot raw positional traces, along with color coding the positions on the hot side (orange) and cold side (blue).
tempPrefToDegrees.m Function to transform occupancy score (0-1) into a ºC temperature preference based on tunnel temperatures (limited to 20ºC vs 28ºC setpoints). Calls on .mat files in calibrated-tunnel-temps.
Folder with .mat files of measured tunnel temperatures across all assay set ups when temperature control is set to 20ºC vs 28ºC setpoints (the only setpoints used in this study). Three .mat files: bostonTunnelTemps.mat for the 3 Harvard assay set ups, virginiaTunnelTemps.mat for the UVA assay set up, and miamiTunnelTemps.mat for the UMiami assay set up.
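The loading steps described above (dropping inactive bouts, scoring hot-side occupancy, and converting occupancy to ºC) can be sketched roughly as follows. This is a Python illustration, not the MATLAB code from this page. The function names, the midpoint split of the tunnel, the definition of an inactive bout, and the occupancy-weighted average used for the ºC conversion are all assumptions made for the example.

```python
def drop_inactive_bouts(positions, dist, time):
    """Return positions with inactive bouts removed.

    Assumed definition: a bout is inactive when the fly stays within `dist` px
    of the bout's first position for more than `time` consecutive frames.
    """
    kept = []
    i = 0
    while i < len(positions):
        j = i + 1
        # Extend the run while the fly remains near the run's first position.
        while j < len(positions) and abs(positions[j] - positions[i]) <= dist:
            j += 1
        if j - i <= time:  # short pause: keep these frames
            kept.extend(positions[i:j])
        i = j
    return kept


def hot_side_occupancy(positions, tunnel_start, tunnel_length):
    """Fraction of frames (0-1) spent on the hot half of the tunnel.

    Assumes the tunnel is split at its midpoint and that the hot side is the
    half with the larger coordinate.
    """
    if not positions:
        return float("nan")
    midpoint = tunnel_start + tunnel_length / 2.0
    return sum(1 for p in positions if p > midpoint) / len(positions)


def occupancy_to_degrees(occupancy, cold_temp_c, hot_temp_c):
    """Occupancy-weighted average of the cold and hot side temperatures (assumed mapping)."""
    return occupancy * hot_temp_c + (1.0 - occupancy) * cold_temp_c
```

For example, a track of `[10, 30, 60, 60, 60, 60, 60, 80, 90, 20]` px filtered with `dist=1` and `time=3` loses the five stationary frames at 60 px, which changes the hot-side occupancy of a 100 px tunnel from 0.7 to 0.4.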

No Temperature Tracking

tempofftracking.m Analysis scripts for processing experimental data from flies navigating the tunnels without a temperature stimulus. Used to determine the coefficients of the power curve of the relationship between sampling error and distance traveled (for use in the variability experiment).
occupancyByDistTrav.m Function to calculate the sampling error vs. distance traveled. Called by tempofftracking.m.
Folder containing raw experimental data; tempOffData.mat - output of occupancyByDistTrav.m; powerFits.mat - power curve fits to sampling error vs. distance traveled; GOF_powerFits.mat - goodness-of-fit metrics for power curve fits.
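To illustrate what fitting a power curve to sampling error vs. distance traveled involves, here is a minimal Python sketch. It fits error = a * distance^b by ordinary least squares in log-log space. That is a simplification chosen for the example and may differ from the fitting routine used in tempofftracking.m. The function name is hypothetical.

```python
import math


def fit_power_curve(distance, error):
    """Fit error = a * distance**b by linear regression on log-transformed data.

    Both inputs must be strictly positive. Returns the tuple (a, b).
    """
    log_x = [math.log(v) for v in distance]
    log_y = [math.log(v) for v in error]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    b = sxy / sxx                       # exponent: slope in log-log space
    a = math.exp(mean_y - b * mean_x)   # coefficient: exp of the intercept
    return a, b
```

A negative exponent b would correspond to sampling error shrinking as flies travel farther.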

Null Distribution of Preference

nullDistributionAnalysis.m Analysis scripts to create null distribution of thermal preference based on bout resampling of experimental data.
nullDistribution.m Function to create simulated fly tracks based on experimental data. Called by nullDistributionAnalysis.m.
filterInactiveBouts_nullDist.m Function to filter simulated fly tracks for inactive bouts (same as experimental data). Called by nullDistribution.m.
createBoutArray.m Function to split experimental positional data into "bouts" on the cold and hot sides. Called by nullDistributionAnalysis.m.
data directory
MA_11_38_nullDist_BoutResampling.mat Struct with output of nullDistribution.m for an isofemale line and observed data/metrics on that line.
SomA_nullDist_sampling.mat Struct with output of nullDistribution.m for an isogenic line and observed data/metrics on that line.
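The idea of a null distribution built by bout resampling can be sketched as follows. This Python example is an illustration under stated assumptions, not a port of nullDistribution.m: it assumes bouts are summarized as positive durations in frames, that simulated tracks alternate between cold and hot bouts drawn with replacement, and that the starting side is random. All names are hypothetical.

```python
import random


def simulate_occupancy(cold_bouts, hot_bouts, total_frames, rng):
    """Simulate one track by alternating resampled cold and hot bout durations.

    Returns the fraction of frames spent on the hot side.
    """
    elapsed = 0
    hot_time = 0
    on_hot = rng.random() < 0.5  # random starting side (assumption)
    while elapsed < total_frames:
        pool = hot_bouts if on_hot else cold_bouts
        # Truncate the final bout so the track has exactly total_frames frames.
        duration = min(rng.choice(pool), total_frames - elapsed)
        if on_hot:
            hot_time += duration
        elapsed += duration
        on_hot = not on_hot
    return hot_time / total_frames


def null_distribution(cold_bouts, hot_bouts, total_frames, n_sim, seed=0):
    """Occupancy scores of n_sim simulated tracks drawn from pooled bouts."""
    rng = random.Random(seed)
    return [
        simulate_occupancy(cold_bouts, hot_bouts, total_frames, rng)
        for _ in range(n_sim)
    ]
```

The spread of the returned scores indicates how much variation in preference would be expected from bout sampling alone, which can then be compared with the observed spread across flies.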

Thermal Preference Persistence

persistenceAnalysis_SomA_trpA1quant.m Analysis scripts for looking at persistence of thermal preference of flies from SomA isogenic line.
modelExplainedBehaviorVariation.m (created by Matt Churgin, de Bivort Lab) Function to model behavioral variance explained as a function of behavioral persistence and true correlation between latent variable of interest (e.g. transcript expression) with behavior.
Folder with .mat files; SomA_trpA1_quant_boxlabel.mat - behavioral data for SomA (and some RAL535) flies.

Life History Curves

ThermPref vs Fitness Analysis.R R scripts for exploratory data analysis of life history traits of FL, MA, VA isofemale lines and relationship of life history traits to individual temperature preference.
ThermPref vs Fitness Modeling.R R scripts for modeling the relationship between rearing temperature, place of origin, and thermal preference with life history traits for FL, MA, VA isofemale lines.
Life History Curves.R R scripts for fitting a curve to the relationship between temperature and development time/lifespan for FL, MA, VA isofemale lines. Used in the updated life history model to calculate bet-hedging advantage.
CSV files with life history traits/behavioral data; processed data from CSVs in Fitness.RData and life history data used in Kain et al. model in Kain 2015 Life History Curves.RData.

Thermal Preference Plasticity

PlasticityDataAnalysis.m MATLAB script for processing behavioral data from thermal preference plasticity experiment; creates CSV file for input into Stan model.
PlasticityAnalysis.R R script to run Stan model (stanThermoWSampEst_degC.stan), plot posterior distributions, and do sampler diagnostics.
stanThermoWSampEst_degC.stan Stan model used by the Rstan package in PlasticityAnalysis.R to generate posterior distributions of thermal preference mean and variability estimates. For use with thermal preference metric in ºC.
.mat files with behavioral data and .csv file with processed behavioral data for input into the Stan model.
Output of the sampler run: Plasticity_FixedPhiPsi.RData - posterior distribution values; SimulatedPlasticityVals_FixedPhiPsi.RData - simulated data from posterior values for diagnostic checks.

Modeling Bet-Hedging Advantage

Main directory for all bet-hedging modeling in the project. Split into two sub-directories: 1) map of predicted bet-hedging advantage, and 2) modeling seasonal dynamics of mean preference.
main data directory
.mat files of weather data and calibrated stations from Kain et al (2015)
hedgeWeatherPack.mat Kain et al. (2015) 2007-2011 climate normals data
stationDataBatch.mat Kain et al. bet-hedging advantage for ~1500 random weather stations
seasonalAverages.mat Climate normals 1981-2010 for the seven sampling locations
scripts directory
workhorse functions for modeling bet-hedging advantage
hedgeAnalytic.m Workhorse function for simulating bet-hedging and adaptive tracking populations; implements the analytical model of an effectively infinite population reproducing over a breeding season, under alternate modes of behavioral heritability (0 - bet-hedging, 1 - adaptive tracking), and alternate simulated weather conditions.
hedgeBDCalibrate.m Function that performs a hill-climbing algorithm to determine the values of birth (b) and death (d) rates in the model that satisfy the two assumptions (constant population size at the beginning and end of the season, and constant mean phototactic preference at the start and end of the season). This can be used to automatically calibrate the model for arbitrary weather and seasonal conditions. All calibration is done under an adaptive-tracking strategy, and only accommodates daily mean temperature data (i.e. not daily deviations or cloud cover, thus the calibration is deterministic). Calls hedgeAnalytic.m.
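As a rough picture of what a hill-climbing calibration looks like, the Python sketch below minimizes a loss over a small parameter list by trying steps in each coordinate and shrinking the step when nothing improves. It is a generic illustration, not a port of hedgeBDCalibrate.m. In the real calibration the loss would presumably measure how far a simulated season departs from the two assumptions for given b and d; the `loss` callable, step sizes, and stopping rule here are assumptions.

```python
def hill_climb(loss, start, step=0.1, min_step=1e-6, max_iter=10000):
    """Coordinate-wise hill climbing that minimizes `loss` over a parameter list.

    Returns (best_params, best_loss).
    """
    params = list(start)
    best = loss(params)
    for _ in range(max_iter):
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                value = loss(trial)
                if value < best:  # accept only strict improvements
                    params, best, improved = trial, value, True
        if not improved:
            step /= 2.0  # refine the search once no neighbor is better
            if step < min_step:
                break
    return params, best
```

With a toy loss such as `lambda p: (p[0] - 0.3) ** 2 + (p[1] - 0.05) ** 2`, the search started from `[0.0, 0.0]` settles near `[0.3, 0.05]`.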
map of bet-hedging advantage
stationDataAll.mat - bet-hedging advantage calculated for 7501 stations using betHedgeAdvantageCalc.m and updated model
hedgeColorMaps.mat - colormaps for map visualization from Kain et al. (2015)
hedgeStateBoundaries.mat - state boundaries for map visualization from Kain et al. (2015)
allSamplingLocations_LatLong.mat - latitude/longitude for the seven sampling locations
North_america98_48&PR.png - updated state boundaries
geographicBetHedging.m - scripts to make color map of bet-hedging advantage with Gaussian convolution to imitate dispersal of fly populations; localized predictions for sampling locations.
hedgeMakeStationMap_new.m - workhorse function for making the color map; adapted from Kain et al.(2015)
betHedgeAdvantageCalc.m - wrapper function for batch submission to cluster of hedgeGeographyAll.m
hedgeGeographyAll.m - function that reads in weather data one location at a time and attempts to fit birth and death parameters for the seasonal weather at that location. If the parameters are fit, runs a bet-hedging scenario at this location and collects summary data of this location and the model performance here. Calls hedgeAnalytic.m and hedgeBDCalibrate.m (adapted from Kain et al. (2015)).
seasonal dynamics of mean preference
stationData2018.mat - 2018 daily temperature data for FL, MA, and VA sampling locations
seasonalTempPref_2018.mat - behavioral data for flies sampled in 2018 from FL, MA, and VA
CollectionWeeks.mat - mapping between day of the year and the collection week
BreedingSeasonSim2018.m - scripts to plot predicted preference dynamics vs. observed mean preference
calibratingBDrates.m - scripts to calibrate birth/death rates given climate normals for a particular location
LikelihoodAnalysis_SeasonalData.m - scripts to calculate log-likelihood ratio of bet-hedging vs. adaptive tracking given observed data. Calls calculateLogLikelihood.m and logLikBootstrap.m.
calculateLogLikelihood.m - function to calculate the log-likelihood ratio given predicted dynamics and observed data
logLikBootstrap.m - function to do bootstrap resampling of observed data in order to calculate the uncertainty in the log-likelihood ratio estimate.
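The log-likelihood ratio and its bootstrap uncertainty can be illustrated with the Python sketch below. It is not a port of calculateLogLikelihood.m or logLikBootstrap.m. It assumes a Gaussian likelihood with a fixed standard deviation around each model's predicted mean preference, resamples observations with replacement, and reports a 95% percentile interval. The function names and the likelihood form are assumptions.

```python
import math
import random


def gaussian_loglik(observed, predicted, sd):
    """Sum of Gaussian log-densities of observations around predicted means."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * sd ** 2) - (o - p) ** 2 / (2.0 * sd ** 2)
        for o, p in zip(observed, predicted)
    )


def log_lik_ratio(observed, pred_a, pred_b, sd):
    """Log-likelihood of model A minus model B; positive values favor A."""
    return gaussian_loglik(observed, pred_a, sd) - gaussian_loglik(observed, pred_b, sd)


def bootstrap_llr(observed, pred_a, pred_b, sd, n_boot=1000, seed=0):
    """95% percentile interval of the ratio under resampling of observations."""
    rng = random.Random(seed)
    n = len(observed)
    ratios = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample indices with replacement
        ratios.append(
            log_lik_ratio(
                [observed[i] for i in idx],
                [pred_a[i] for i in idx],
                [pred_b[i] for i in idx],
                sd,
            )
        )
    ratios.sort()
    return ratios[int(0.025 * n_boot)], ratios[int(0.975 * n_boot) - 1]
```

Here model A would stand for the bet-hedging prediction and model B for the adaptive-tracking prediction; an interval that excludes zero suggests the data distinguish the two.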

Thermal Preference Variability

data directory
.mat files with raw behavioral data for isofemale lines and processed behavioral data in .csv files ready for analysis with Stan
VariabilityXXXXHierModel_FixedPhiPsi.RData - values of posterior distributions for thermal preference mean and variability; 2018 - Nov/Dec 2018 batch, 2019 - Jan 2019 batch, Apr2019 - Apr 2019 batch, Total2019 - Jan + Apr 2019 batches combined.
SimulatedXXXXVals_FixedPhiPsi.RData - simulated data from posterior values for diagnostic checks. 2018 - Nov/Dec 2018 batch, 2019 - Jan 2019 batch, Apr2019 - Apr 2019 batch, Total2019 - Jan + Apr 2019 batches combined.
Site (Line) Averages Variability.RData - sampling site (isofemale line) average posterior estimates of variability + s.d. of posterior distribution.
scripts directory
VariabilityEstimation.R R script for processing thermal preference data, running Stan model, processing/plotting posterior distributions, and diagnostic checks of sampler output. Calls stanThermoHierarchicalWSampEst_degC.stan.
VariabilityAnalysis.m MATLAB script for loading and pre-processing of raw thermal preference behavioral data; generates input data as .csv for Stan model.
stanThermoHierarchicalWSampEst_degC.stan Stan hierarchical model for estimating line variability nested under sampling site variability.

Thermal Preference Heritability

parents_fltd.mat - processed behavioral data from all parental flies
midParentValues.mat - struct with the thermal preference of female and male parents, and the average preference of the parents; includes both the occupancy data (col1 of parents_fltd.mat) and ºC (col21 of parents_fltd.mat). Output of heritCrossesTempPref.m.
heritabilityCrosses.mat - label array of parental IDs and corresponding cross ID label.
F1_TempPref.mat - raw behavioral data of F1s from crosses.
F1_TempPref_fltd.mat - processed behavioral data of F1s.
HeritabilityAnalysis.m - scripts to process F1 data, pair behavioral data from parents and F1s, and plot/analyze mid-parent/offspring regression. Calls heritCrossesTempPref.m and summarizeF1TempPref.m.
heritCrossesTempPref.m - function to make midParentValues.mat.
summarizeF1TempPref.m - function to take in F1 data, get average preference of F1s grouped by cross, and filter out crosses that are not present in both datasets. Outputs a struct to use for regression analysis.
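A mid-parent/offspring regression of the kind described above can be sketched in a few lines of Python. The example is illustrative and is not the MATLAB analysis: it takes per-cross preferences of the two parents and the mean preference of their F1s, and returns the ordinary least-squares slope and intercept. In standard quantitative genetics the slope of offspring on mid-parent values is read as an estimate of narrow-sense heritability. The function name and inputs are hypothetical.

```python
def midparent_regression(mothers, fathers, offspring_means):
    """OLS regression of per-cross offspring means on mid-parent values.

    Returns (slope, intercept).
    """
    mid = [(m + f) / 2.0 for m, f in zip(mothers, fathers)]
    n = len(mid)
    mean_x = sum(mid) / n
    mean_y = sum(offspring_means) / n
    sxx = sum((x - mean_x) ** 2 for x in mid)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(mid, offspring_means))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```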

Genetic Diversity Analysis

Please see the variant calling pipeline GitHub repository for scripts relating to processing of sequencing data and generation of VCF files/PoPOOLation metrics. This directory contains only the script used in assessing the relationship between thermal preference and the final estimates of population genetic diversity.
data directory
Output of vcftools (.imiss) to calculate the fraction of missing genotypes for each sequenced individual.
Output of vcftools (.idepth) to calculate average depth per individual when making a genotype call.
Directory with output of custom CyVCF2/Python bootstrapping analysis to estimate Watterson's theta
herit_boot_all_sites - tables with counts of segregating sites for flies used in heritability analysis, sorted by sampling site (population). Output of bootstrapping analysis.
var_boot_all_sites - tables with counts of segregating sites for flies used in the heritability analysis, sorted by sampling site (population). Output of bootstrapping analysis.
var_line_count_all_sites - tables with counts of segregating sites for flies used in the heritability analysis, sorted by isofemale line. Raw counts, no bootstrapping.
Directory with output of PoPOOLation estimates of Watterson's theta
herit - heritability flies
var - variability flies
.theta files - contain the theta estimates across the whole genome in 50kb non-overlapping windows using the pileup file generated for each sampling site (population).
.theta.param files - shows the parameters used in the theta estimation
xx_bam_herit_merge.txt files - list of BAM files merged to make pileup file for theta estimation
Heritability Theta Estimates.RData Processed segregating sites/theta estimates from custom bootstrapping and PoPOOLation for heritability flies.
Variability Theta Estimates.RData Processed segregating sites/theta estimates from custom bootstrapping and PoPOOLation for variability flies.
scripts directory
VariantAnalysis.R Scripts to process theta estimates from both the bootstrapping and PoPOOLation analyses, plot relationship of theta estimates to thermal preference heritability/variability, and calculate correlations. Calls on Site (Line) Averages Variability.RData found in the directory for thermal preference variability.
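For readers unfamiliar with Watterson's theta, the estimator turns a count of segregating sites S from n sampled chromosomes into a diversity estimate, theta_W = S / a_n with a_n = 1 + 1/2 + ... + 1/(n-1). The Python sketch below shows only this textbook formula. It does not reproduce the bootstrapping or PoPOOLation pipelines listed above, and the function name and optional per-site normalization are choices made for the example.

```python
def wattersons_theta(segregating_sites, n_chromosomes, sequence_length=None):
    """Watterson's estimator theta_W = S / a_n, with a_n = sum_{i=1}^{n-1} 1/i.

    If sequence_length is given, the estimate is returned per site.
    """
    if n_chromosomes < 2:
        raise ValueError("need at least two sampled chromosomes")
    a_n = sum(1.0 / i for i in range(1, n_chromosomes))
    theta = segregating_sites / a_n
    if sequence_length is not None:
        theta /= sequence_length
    return theta
```

For instance, 11 segregating sites among 4 chromosomes give a_n = 11/6 and therefore theta_W = 6.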

Two-Choice Assay Design

Inventor files for custom PVC water blocks
PIDController.sch - EAGLE schematic of connections for the custom Arduino-based temperature PID controller
PIDController_FourIBT2Hbridges.ino - Arduino script to do PID temperature control
outer-box - PDF files with schematics for roof, 3 walls, door, and PID controller base of the behavioral box.
tray-design - PDF files with schematics for transparent coverslips, peltier base dividers, and main tray with tunnels (single PDF contains both layers).
Thermo Rig Parts List.xlsx The vendors, catalog nos. (or links), and quantities for components used to make a single behavioral box + trays. Requires: water chiller, water blocks as heatsinks for Peltiers (custom milled)
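For orientation, the control law behind a PID temperature controller like the one listed above can be written in a few lines. The Python sketch below is a generic discrete PID update, not a translation of PIDController_FourIBT2Hbridges.ino. The class name, gains, and the absence of output clamping or anti-windup are all assumptions made to keep the example short.

```python
class PID:
    """Minimal discrete PID controller (illustrative, no output limits)."""

    def __init__(self, kp, ki, kd, setpoint):
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = None

    def update(self, measurement, dt):
        """Return the control output for a new measurement taken dt seconds later."""
        error = self.setpoint - measurement
        self.integral += error * dt
        # No derivative term on the first call, since there is no previous error.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

In a rig of this kind the output would typically be scaled to a drive signal for the H-bridge powering a Peltier element, with the sign selecting heating or cooling.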