de Bivort Lab: Leg-tracking and behavioral classification in Drosophila

Companion Data Page
Leg-tracking and automated behavioral classification in Drosophila
Jamey Kain¹, Chris Stokes¹, Quentin Gaudry², Xiangzhi Song¹,³, James Foley¹, Rachel Wilson², Benjamin de Bivort¹,⁴,⁵
¹The Rowland Institute at Harvard, Cambridge, Massachusetts, USA.
²Department of Neurobiology, Harvard Medical School, Boston, Massachusetts, USA.
³College of Chemistry & Chemical Engineering, Central South University, Changsha, P. R. China.
⁴Center for Brain Science, Harvard University, Cambridge, Massachusetts, USA.
⁵Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA.
Manuscript PDF
Manuscript (hosted at Nature Communications)
arXiv pre-print
Data and analysis .mats (MATLAB 2011a) and movies
legTrackerAllRawData.mat	Raw instrument-generated data for all flies analyzed, except those from optomotor experiments. Flies are distinguished by their f numbers; for animals recorded on more than one day, the suffix _1 indicates the first recording, _2 the second, and so on. Big file (274 MB).
legTrackerOptomotorRawAndClassified.mat	Raw instrument-generated data for all flies in the optomotor experiments, as well as the results of automatic classification of those raw data sets. _d indicates dark phases, _o open-loop phases, and _c closed-loop phases. Biggish file (75MB).
legTrackerD2DFliesClassified.mat	Results of automatic classification of the flies in the day-to-day experiments. The prediction vector is the .pred sub-object.
legTrackerTrainingBundle.mat	Variable needed to run the automated classification script legTrackerClassifyRawData.m. Includes the JK and BD manual classifications.
legTrackerAnnotateColors.mat	Small variable used to color the behavioral labels in Figures 2 and 3.
f12 movie f13 movie f14 movie f19 movie f24 movie	Movies used to generate the JK and BD manual annotations, which in turn provide the KNN classifier training data. Annotation sequences begin with the first frame in which the laser illumination is fully on. Movies f12, f13, f14, f19 and f24 were annotated for 8000, 4001, 4001, 4001 and 4001 frames respectively.
Supplementary Movie 1 Supplementary Movie 2	Supplementary Movies 1 and 2, illustrating KNN classification compared with manual annotation (Movie S1) and KNN classification of novel instrument data (Movie S2). Movies are hosted on YouTube.
Instrument control .VIs (LabVIEW 8.6)
Download .zip archive of all .vi files
master VI and leg position detection
LT4.vi	Leg Tracker master front panel (v.4).
LT1- initialize camera.vi	Initializes the GigE Cameras, setting the gain and exposure times.
LT2- reflect array.vi	Reflects a 2D array to undo the optical reflection from the beam splitter cube.
LT2- background subtractor 2 -LT3.vi	Subtracts one IR dye channel from the other to undo bleedthrough effects in the optical filtering.
LT2- find object precalculations.vi	Precalculates several variables for peak detection, such as the conical convolution filter, and bundles several variables into a cluster that is passed between the other machine vision VIs.
LT2- find object convolution.vi	Uses 2D convolution and the pre-calculated convolution filter to blur the fluorescence images.
LT2- find objects detect all peaks.vi	Generates a list of local maxima coordinates in the convolved fluorescence image.
LT2- find objects brightest peaks.vi	Identifies the three (per dye set) brightest maxima in each fluorescence image.
LT1- annotate 3 legs.vi	Uses positional logic to assign identities to each leg peak (front, mid, hind).
LT2- leg polar coordinates.vi	Converts the cartesian position of the legs to polar coordinates using the inferred insect center. These values are part of the instrument raw output.
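The peak-finding steps listed above (find local maxima in the blurred image, keep the three brightest, convert positions to polar coordinates about the inferred center) can be sketched in Python. This is an illustrative sketch only, not a translation of the LabVIEW VIs: the function names, the 8-neighbour definition of a local maximum, the skipping of border pixels, and the angle convention are all assumptions.

```python
import math

def local_maxima(image):
    """Return (row, col, value) for pixels strictly brighter than all 8 neighbours.

    Assumption: border pixels are skipped and flat plateaus are ignored.
    """
    peaks = []
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            v = image[r][c]
            neighbours = [image[r + dr][c + dc]
                          for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                          if (dr, dc) != (0, 0)]
            if all(v > n for n in neighbours):
                peaks.append((r, c, v))
    return peaks

def brightest_peaks(peaks, n=3):
    """Keep the n brightest peaks, brightest first."""
    return sorted(peaks, key=lambda p: p[2], reverse=True)[:n]

def to_polar(point, center):
    """Convert a (row, col) position to (radius, angle in radians) about center."""
    dy = point[0] - center[0]
    dx = point[1] - center[1]
    return math.hypot(dx, dy), math.atan2(dy, dx)

# Toy 7x7 "fluorescence image" with five isolated bright pixels.
image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 5, 0, 0, 0, 9, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 2, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 7, 0, 0, 0, 3, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
peaks = local_maxima(image)
top3 = brightest_peaks(peaks)
center = (3, 3)  # hypothetical inferred insect center
polar = [to_polar((r, c), center) for r, c, _ in top3]
```

In the instrument the image is first blurred by convolution so that each leg produces a single smooth maximum; the toy image here skips that step.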
ball rotation measurement
LT1- initialize sensors.vi	Populates registers in the tracking sensors to get them up and running.
LT1- read sensors.vi	Decodes the bit encoding of the x- and y- displacement values read from the tracking sensors.
LT1- update sensor value histories.vi	Updates the 1x10 vector of tracking sensor values, discarding the oldest value and appending the newest.
rotational components.vi	Uses the raw x- and y- displacement values from the two tracking sensors, and trigonometry, to calculate the three rotational components of ball motion: pitch, yaw and roll. See Supplementary Note 2 in Kain et al.
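The sensor-history update described above is a fixed-length rolling buffer. A minimal Python sketch follows, assuming the buffer starts filled with zeros; the names make_history and update_history are hypothetical and do not correspond to anything in the LabVIEW code.

```python
from collections import deque

HISTORY_LENGTH = 10  # matches the 1x10 vector described above

def make_history(length=HISTORY_LENGTH, fill=0):
    """Create a fixed-length history (assumption: initialised with zeros)."""
    return deque([fill] * length, maxlen=length)

def update_history(history, newest):
    """Append the newest reading; a deque with maxlen drops the oldest one."""
    history.append(newest)
    return history

history = make_history()
for reading in range(1, 13):  # twelve hypothetical sensor readings
    update_history(history, reading)
```

After twelve updates the buffer holds only the ten most recent readings.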
other functions
LT1- adorn image.vi	Adds targeting reticle and ROIs for front panel visualization.
LT1- adorn legs.vi	Adds colored circles to detected leg positions for front panel visualization.
LT1- organize output data.vi	Collates the frame number, timestamp, leg coordinates and ball rotation vectors, and saves them to a text file.
LT4- optomotor double flow.vi	Sub-VI for optomotor stimulus presentation. Keep it running on a second monitor while LT4.vi runs. Used in Figures 3g and 3h.
Analysis .m functions (MATLAB 2011a)
Download .zip archive of all .m files
raw data processing and classification
legTrackerClassifyRawData.m	Master script that takes imported instrument data and classifies the behavior at each frame using KNN. Requires variable in legTrackerTrainingBundle.mat.
legTrackerCleanData.m	Applies a three-frame median filter to raw instrument data and interpolates it into evenly spaced 10 ms frames. Called by legTrackerClassifyRawData.m.
legTrackerErrorCheck.m	Takes the output of legTrackerCleanData.m and checks all leg-position data points for tracking errors. Errors are identified as frames in which the apparent frame-to-frame motion exceeds 5 times the standard deviation of the frame-to-frame motions across the data vector. Called by legTrackerClassifyRawData.m.
legTrackerPrepLocalSTDs.m	Higher-order feature calculator. Generates the first set of higher-order features for classification. Takes the output of legTrackerErrorCheck.m and returns an array of equal size in which each value has been replaced by the standard deviation of its vector in a sliding window of size +/- windowSize. Array is padded by zeros. windowSize=5 in our paper. Called by legTrackerClassifyRawData.m.
legTrackerPrepDerivatives.m	Higher-order feature calculator. Generates the second set of higher-order features for classification. Takes the output of legTrackerErrorCheck.m and returns an array of equal size in which each value has been replaced by the derivative of its vector at that frame. Array is padded by zeros. Called by legTrackerClassifyRawData.m.
legTrackerPredFilt.m	Applies a low-pass filter to the vector of classification labels produced by KNN classification. Frame-by-frame labels are replaced by the most abundant label in a sliding window of size +/- windowSize frames. windowSize=5 in our paper. Called by legTrackerClassifyRawData.m.
legTrackerMarkovMat.m	Calculates the percentage of frames of each behavioral type in a vector of KNN classifications, as well as transition Markov Matrices, with optional parameters specifying whether the diagonal of the Markov Matrix is set to 0 and whether it is row-normalized. See comments in code or type "help legTrackerMarkovMat" for details. Called by legTrackerClassifyRawData.m.
legTrackerPrepCorrs.m	Higher-order feature calculator. Takes instrument data and returns the pairwise correlation coefficients between data vectors within a sliding window. Padded by zeros. Evaluated in Figure 2d, but NOT USED for the final classification protocol.
legTrackerPrepMedFilt.m	Higher-order feature calculator. Takes instrument data and median filters it with a window of specifiable size, using the built-in MATLAB function medfilt1. Evaluated in Figure 2d, but NOT USED for the final classification protocol.
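Several of the processing steps described above (error flagging, local standard deviations, label smoothing, and the transition matrix) can be sketched in a few lines of Python. This is a hedged illustration of the descriptions on this page, not the MATLAB code itself. All function names are hypothetical, and the sketch assumes a sample (N-1) standard deviation, zeros wherever a full window does not fit, truncated windows at the ends of the label vector, and ties in the label filter going to the label seen first.

```python
import statistics
from collections import Counter

def flag_tracking_errors(trace, n_std=5.0):
    """Flag frames whose frame-to-frame jump exceeds n_std x the std of all jumps."""
    jumps = [b - a for a, b in zip(trace, trace[1:])]
    threshold = n_std * statistics.stdev(jumps)
    # Frame 0 has no predecessor, so it is never flagged.
    return [False] + [abs(j) > threshold for j in jumps]

def local_std(trace, window_size=5):
    """Std of trace in a sliding window of +/- window_size frames.

    Assumption: frames without a full window are left as zero padding.
    """
    n = len(trace)
    out = [0.0] * n
    for i in range(window_size, n - window_size):
        out[i] = statistics.stdev(trace[i - window_size:i + window_size + 1])
    return out

def mode_filter(labels, window_size=5):
    """Replace each label by the most abundant label within +/- window_size frames."""
    n = len(labels)
    out = []
    for i in range(n):
        window = labels[max(0, i - window_size):min(n, i + window_size + 1)]
        out.append(Counter(window).most_common(1)[0][0])
    return out

def transition_matrix(labels, states, zero_diagonal=False, row_normalize=False):
    """Count label-to-label transitions between consecutive frames."""
    index = {s: k for k, s in enumerate(states)}
    counts = [[0.0] * len(states) for _ in states]
    for a, b in zip(labels, labels[1:]):
        counts[index[a]][index[b]] += 1
    if zero_diagonal:
        for k in range(len(states)):
            counts[k][k] = 0.0
    if row_normalize:
        for row in counts:
            total = sum(row)
            if total > 0:
                row[:] = [v / total for v in row]
    return counts

# Toy data: an alternating trace with one large tracking glitch at frame 20.
trace = [float(i % 2) for i in range(200)]
trace[20] = 50.0
error_frames = [i for i, bad in enumerate(flag_tracking_errors(trace)) if bad]

ramp_std = local_std([float(i) for i in range(20)])
labels = ["walk"] * 10 + ["groom"] + ["walk"] * 10
smoothed = mode_filter(labels)  # the single-frame "groom" label is removed
markov = transition_matrix(list("aababb"), ["a", "b"],
                           zero_diagonal=True, row_normalize=True)
```

A single glitch produces two large jumps (into and out of the bad frame), so both the glitch frame and the frame after it are flagged in this sketch.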
statistical analysis
legTrackerPlausScore.m	Calculates the "plausible accuracy" metric in Figure 2d: the unbalanced accuracy of the 12x12 behavioral confusion matrices (e.g. Figs 2c,e), ignoring errors in "plausibly confused" behavioral comparisons, such as complex motion vs forward/backward running, or stasis vs postural adjustments. See text discussion in Kain et al.
legTrackerD2DBS.m	D2DBS = day-to-day bootstrapping. Resamples the identity of flies tested on more than one day, to generate a distribution of the mean intra-fly distance in a space passed in as input data, under the null hypothesis that intra-fly distances are identical to inter-fly distances. E.g. to get a distribution of intra-fly distances (under the null hypothesis) in the PC1/PC2 space of Figures 3c and 3e, pass an 18x2 array, where 18 is the number of day-to-day recordings and 2 corresponds to the PC1 and PC2 values of each recording. Performs 50,000 resamplings by default.
legTrackerAlignBy.m	Takes the output object of legTrackerClassifyRawData.m and a list of behavioral transitions, detects all instances of those transitions in the behavioral annotation sequence, and outputs windowSize*2 frames of the raw data and higher-order features flanking each instance. This allows you to examine what, for example, the x-component of a particular leg is doing during a behavioral transition of interest. Used in Fig 3f.
legTrackerInterPeakDist.m	Determines the distribution of intervals between local minima and local maxima in a time series vector. Used in Figs S2b and 3h (rows 2 and 4).
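Two of the statistical routines above can be sketched in Python: the identity-resampling test and the measurement of intervals between local extrema. This is an illustration under stated assumptions, not the authors' MATLAB code. The function names are hypothetical, the fly identities are passed as an explicit list (an assumption about the interface), resampling is done by shuffling identities without replacement (the page does not say which scheme is used), and an extremum is taken to be any frame where the slope changes sign.

```python
import math
import random
from itertools import combinations

def mean_intra_fly_distance(points, fly_ids):
    """Mean Euclidean distance between recordings that share a fly identity."""
    dists = [math.dist(points[i], points[j])
             for i, j in combinations(range(len(points)), 2)
             if fly_ids[i] == fly_ids[j]]
    return sum(dists) / len(dists)

def null_intra_fly_distances(points, fly_ids, n_resamples=50000, seed=0):
    """Shuffle fly identities across recordings to build a null distribution."""
    rng = random.Random(seed)
    shuffled = list(fly_ids)
    null = []
    for _ in range(n_resamples):
        rng.shuffle(shuffled)
        null.append(mean_intra_fly_distance(points, shuffled))
    return null

def extremum_intervals(series):
    """Frame intervals between successive local extrema (minima and maxima)."""
    extrema = [i for i in range(1, len(series) - 1)
               if (series[i] - series[i - 1]) * (series[i + 1] - series[i]) < 0]
    return [b - a for a, b in zip(extrema, extrema[1:])]

# Toy example: two flies, two recordings each, in a 2-D (PC1, PC2) space.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
fly_ids = ["f1", "f1", "f2", "f2"]
observed = mean_intra_fly_distance(points, fly_ids)
null = null_intra_fly_distances(points, fly_ids, n_resamples=1000)
# Fraction of resamplings with a mean intra-fly distance at least as small.
p_value = sum(v <= observed + 1e-12 for v in null) / len(null)

intervals = extremum_intervals([0, 1, 2, 1, 0, 1, 2, 1, 0])
```

With only four recordings the null distribution has just three distinct values, so the toy p-value is coarse; the 18 recordings and 50,000 resamplings described above give a far finer distribution.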
data visualization
legTrackerTracePlot.m	Creates a figure with the 15 instrument vectors plotted for a specified range of frames, with the traces nicely staggered and roughly colored. Used to make Figures 1f,g and 2f.
legTrackerScoreMovie.m	Generates an annotated movie of a fly running on the ball with manual behavioral classifications, KNN classifications, and visualizations of the raw data and higher order features used to calculate the KNN classifications. Movies S1 and S2 were generated with this script.
legTrackerEthogram.m	Generates an ethogram given a transition Markov Matrix. Used to make Figure 3d. Draws nodes, edge colors and edge thicknesses. Arrows and node colors were added afterwards in Adobe Illustrator.
