Companion Data Page

Genetic basis of offspring number and body weight in Drosophila melanogaster

Jamilla Akhund-Zade1, Shraddha Lall1*, Erika Gajda*, Denise Yoon1, Benjamin L. de Bivort1‡

1 - Department of Organismal & Evolutionary Biology, Harvard University, Cambridge, MA, USA
* equal contribution
‡ corresponding author


Manuscript PDF
Pre-print at bioRxiv
Download all code and data at Zenodo

Scripts

R scripts used to analyze data and create paper figures.
ModelingPhenotypes.R Script to process data from the DGRP phenotype screen, perform batch adjustment, model line effects, and perform PCA, as well as calculate phenotype heritability. Outputs Batch Adjusted Phenotypes.csv, Raw Line Phenotypes.csv, Line Effects Data.csv, and pca phenotype.csv.
PhenotypePlots.R Script to generate phenotype plots in Figure 1 from batch corrected phenotype data. Uses Batch Adjusted Phenotypes.csv.
DensityAnalysis.R Script to process offspring phenotype data from the parental density experiment. Generates Figure 2 and corresponding analysis. Uses Density Data Records.csv.
Validations.R Script to process candidate gene validation data to generate Figure 3. Uses Validation Exelixis Lines.zip.
GeneExpressionCorrelations.R Script and functions to analyze DGRP expression data to generate Figure 4 and supplementary figures showing the correlation in expression among genes and correlation between phenotype and gene expression. Uses DGRP_gene_expression.zip and Line Effects Data.csv.
TraitCorrelations.R Script and functions to analyze correlations between DGRP phenotypes in our screen and previous DGRP GWAS. Uses DGRP_GWAS_phenotypes.zip.
CrossGWASComparison.R Script and functions to analyze overlap between candidate genes in our screen and candidate genes in the Durham et al. (2014) DGRP fecundity GWAS (Durham_GWAS_Results.zip). Generates a bootstrap null distribution of overlap using the full list of genes hit by DGRP variants (dgrp.fb557.genes.txt, processed from the DGRP2 dataset).
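
The overlap analysis described for CrossGWASComparison.R can be illustrated with a small resampling sketch. This is not the authors' R code: it is written in Python, the function name `overlap_null` and the toy gene IDs are hypothetical, and it assumes the null is built by drawing random gene sets of the same size as the candidate list, without replacement, from the full gene list. The actual script may resample differently.

```python
import random


def overlap_null(universe, candidates, reference, n_draws=1000, seed=0):
    """Return (observed overlap, list of null overlaps, empirical p-value).

    universe   -- all genes that could have been hit (hypothetical stand-in
                  for a full gene list such as dgrp.fb557.genes.txt)
    candidates -- candidate genes from one screen
    reference  -- candidate genes from the other screen
    """
    rng = random.Random(seed)
    universe = sorted(set(universe))  # sorted so results are reproducible
    candidates = set(candidates)
    reference = set(reference)
    observed = len(candidates & reference)

    null = []
    for _ in range(n_draws):
        # Random gene set of the same size as the candidate list.
        draw = rng.sample(universe, len(candidates))
        null.append(len(reference.intersection(draw)))

    # Add-one correction keeps the empirical p-value above zero.
    p_value = (1 + sum(n >= observed for n in null)) / (n_draws + 1)
    return observed, null, p_value


# Toy data only: made-up gene IDs, not real FlyBase identifiers.
universe = [f"gene{i}" for i in range(1000)]
candidates = universe[:30]
reference = universe[20:60]

observed, null, p_value = overlap_null(universe, candidates, reference)
print(observed, p_value)
```

With real inputs, `universe` would be read from the full gene list and the two candidate lists from candidate_genes.txt and the Durham et al. (2014) results.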

Phenotype Data

DGRP Screen Data.csv Recorded data from DGRP screen - contains 1) date of record, 2) recorder ID, 3) DGRP line, 4) vial ID, 5) counts of eclosed F/M, 6) average weights of eclosed F/M, 7) batch ID. Used in ModelingPhenotypes.R.
Raw Line Phenotypes.csv Average phenotypes (number and average weight of F/M offspring) for DGRP lines uncorrected for batch effects. Output of ModelingPhenotypes.R.
Batch Adjusted Line Phenotypes.csv Average phenotypes (number and average weight of F/M offspring) for DGRP lines corrected for batch effects. Output of ModelingPhenotypes.R.
pca phenotype.csv DGRP line ID and offspring index (PC1 score) for use with DGRP2 webtool. Output of ModelingPhenotypes.R.
Line Effects Data.csv DGRP line ID, modeled line effects for each phenotype, and offspring index (PC1 score). Output of ModelingPhenotypes.R.
Density Data Records.csv CSV with recorded data from parental density experiment - 1) coded DGRP line ID, 2) density treatment code, 3) replicate ID, 4) date of record, 5) counts/weights of offspring. Used in DensityAnalysis.R.
Validation Exelixis Lines.zip Two CSV files containing records from validations of Exelixis mutants as compared with the background w1118 strain. Records contain 1) date of record, 2) Exelixis line, 3) replicate ID, 4) counts of eclosed F/M, 5) average weights of eclosed F/M, and 6) batch ID. Used in Validations.R.
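
As a rough illustration of how vial-level records relate to per-line averages (the relationship between DGRP Screen Data.csv and Raw Line Phenotypes.csv), the sketch below averages each phenotype column within a line. It is not taken from ModelingPhenotypes.R: the column names (`line`, `vial`, `count_F`, `count_M`), the line labels, and the helper `line_means` are all hypothetical, and no batch adjustment is applied.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Toy stand-in for vial-level records. Column names and values are
# hypothetical and do not reflect the real file's header.
toy_csv = """line,vial,count_F,count_M
line_A,1,20,18
line_A,2,24,22
line_B,1,10,12
"""


def line_means(handle, columns, line_column="line"):
    """Average each requested column within each line."""
    by_line = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(handle):
        for col in columns:
            by_line[row[line_column]][col].append(float(row[col]))
    return {
        line: {col: mean(values) for col, values in cols.items()}
        for line, cols in by_line.items()
    }


means = line_means(io.StringIO(toy_csv), ["count_F", "count_M"])
print(means["line_A"])
```

To run this on a real file, replace the `io.StringIO` handle with an opened CSV file and pass the actual column names from its header.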

GWAS Data

candidate_genes.txt FlyBaseID and gene name of candidate genes associated with top variants.
validation_genes.txt FlyBaseID and gene name of genes chosen for validation experiments.
gwas_top.csv CSV with chromosome, position, affected gene, p-value, effect size, and location of variant for top associated variants (Table 1 in the manuscript). For use with CrossGWASComparison.R.
gwas_results.zip Output from the DGRP2 webtool. gwas.top.csv contains top associated variants, allele information, p-values, and annotation. gwas.top.xlsx contains additional sheets with information on mutant lines used for validation. qqplots.zip contains the Q-Q plots used in the supplement. LDheat.eps is the heatmap of linkage disequilibrium among the variants. gwas.all.assoc contains p-values for all variants. pheno.adjust.txt contains ANOVA tests for the effects of inversions and Wolbachia status on phenotypes. raw.adjusted.pheno.txt has raw and adjusted phenotype data. snp_calls.csv contains genotypes for the variants in gwas.top.csv. Please see the link to the DGRP2 webtool for detailed information on the files.

DGRP Data

DGRP_gene_expression.zip Contains Average DGRP Expression Data.RData with average gene expression data per line, as well as the raw expression data from Huang et al., 2015, dgrp.array.exp.female.txt and dgrp.array.exp.male.txt. Used in GeneExpressionCorrelations.R.
DGRP_GWAS_phenotypes.zip Contains CSVs and other data files with line phenotypes from other DGRP GWAS. Used in TraitCorrelations.R.
Durham_GWAS_Results.zip Contains CSVs with candidate genes from a fecundity GWAS by Durham et al. (2014). Used in CrossGWASComparison.R.