- Initial phenotype and genetics data on the cohort, 2021 Nat Metab, Roy et al., PMID 34552269
- Molecular liver data on the cohort, 2022 Cell Systems, Williams et al., PMID 34666007
Petr Pecina, Marek Vrbacký, Jan Šilhavý, Tomáš Čajka, Tomáš Mráček, Alena Pecinová, Michal Pravenec*
Institute of Physiology, Czech Academy of Sciences, Prague, Czech Republic
*Email address of corresponding author: michal.pravenec@fgu.cas.cz
Physiological variability in mitochondrial rRNA predisposes to metabolic syndrome in the rat
Obesity and its associated comorbidities, particularly metabolic syndrome (MS), are a growing concern in developed societies. Due to its polygenic nature, the genetic component of MS is only slowly being elucidated. Common mitochondrial DNA (mtDNA) sequence variants have been associated with late-onset human diseases, including cardiovascular disease and type 2 diabetes, and may therefore be relevant players in the genetics of metabolic syndrome. In the present study, we investigate the effect of mitochondrial sequence variation on the metabolic phenotype in conplastic rat strains with identical nuclear but different mitochondrial genomes that differ in the sequence of oxidative phosphorylation structural proteins, tRNAs and rRNAs. Exposure to a high-fat diet led to the development of insulin resistance in the conplastic animals, which was associated with reduced oxidative capacity of heart, but not liver, mitochondria. Reduced fatty acid oxidation led to the accumulation of bioactive diacylglycerols and subsequent inhibition of insulin signalling. We propose that these metabolic perturbations stem from the 12S rRNA sequence variation, which affects mitoribosome assembly and mitochondrial protein translation. Our work demonstrates that common sequence variation in mitochondrial rRNA predisposes to progression of metabolic syndrome.
Ziyun Zhou1, Besma Boussoufa1, Arianna Lamanna1, Rashi Halder1, Julien Schleich2, Robert Williams3, Paul Wilmes1, Yibo Wu4, and Evan Williams1
1Luxembourg Centre for Systems Biomedicine, University of Luxembourg. 2Faculty of Science, Technology and Medicine, University of Luxembourg. 3Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center. 4Faculty of Science, ChemBioMS, University of Geneva. Evan.williams@uni.lu
Interactions in the BXD metagenome, metatranscriptome, and tissue gene expression across genotype, age, diet, and tissues
We have previously followed a population of 2157 mice from 89 BXD strains across their lifespans, with 662 individuals from 56 strains sacrificed at various ages (6–24 months of age) and diet (chow–CD or high fat–HF). We recently analyzed the cecal metagenome by WGS for 210 individuals, and the metatranscriptome for 79 of those. These microbiome data are merged with analyses of other tissues, particularly the cecal transcriptome (79 of the same mice), the white adipose proteome (192), and recently-published data on the transcriptome, proteome, and metabolome from the liver (347; the same individuals measured overlap in each dataset). These datasets provide a global overview of the molecular factors changing over time, diet, and strain in tandem with phenotypic divergence, such as in body weight, expected lifespan, and clinical serum biomarkers like alkaline phosphatase.
The influences of diet and strain vastly outweigh the effect of age in all molecular datasets, with substantial variance across data types. For instance, the proteome is far more influenced by strain than by diet or age, while the metagenome is affected by diet vastly more than by genotype. Significant divergence is seen across datasets based on the independent variables of diet, age, and strain; for example, HF affects the adipose proteome more than the liver proteome. Complementary datasets broadly align: the median correlation for 162 taxa measured at the metagenome and metatranscriptome levels is r = 0.68, and the correlation of the diet effect across 3772 liver genes measured at the transcript and protein levels is r = 0.42. This alignment allows hypotheses discovered in multiple datasets to be robustly determined. Measurements and associations which are distinct between complementary datasets indicate directional relationships due to the chosen independent variables, permitting us to identify causal mechanisms connecting molecular patterns to one another and, ultimately, to phenotypic outcomes.
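As an illustration of the cross-dataset comparison described above, the following is a minimal sketch of a median per-feature Pearson correlation between two data layers. The abstract does not describe the actual pipeline; the function names, the dictionary layout, and the toy values are assumptions made for illustration only.

```python
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def median_feature_correlation(layer_a, layer_b):
    """Median correlation over features (e.g. taxa) present in both layers.

    Each layer maps a feature name to values measured in the same
    individuals, listed in the same order (an assumption of this sketch).
    """
    shared = sorted(set(layer_a) & set(layer_b))
    return median(pearson(layer_a[f], layer_b[f]) for f in shared)


# Hypothetical toy data: three taxa measured in four animals.
metagenome = {
    "taxon_1": [1, 2, 3, 4],
    "taxon_2": [1, 2, 3, 4],
    "taxon_3": [1, 2, 3, 4],
}
metatranscriptome = {
    "taxon_1": [2, 4, 6, 8],
    "taxon_2": [4, 3, 2, 1],
    "taxon_3": [1, 3, 2, 4],
}
median_r = median_feature_correlation(metagenome, metatranscriptome)
```

The median is used here, as in the text, because it is less sensitive than the mean to a few taxa with strongly discordant abundance and expression.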
Context for the data & study:
Joy Afolabi
UNIVERSITY OF TENNESSEE HEALTH SCIENCE CENTER
MACHINE LEARNING FOR QUANTIFICATION OF BEHAVIOR IN RODENT MODELS OF AGING AND ALZHEIMER’S DISEASE.
Background: Aging is a major risk factor for many diseases, including Alzheimer’s disease (AD). The prevalence of AD continues to be positively correlated with longevity. AD is a complex disease that destroys neurons involved in memory and eventually affects reasoning and social behavior. Accurate quantification of behavior is vital for exploring the genetics of these different aspects of AD. Machine learning tools have recently come into use for quantifying these behaviors.
Research aim: Mouse models have traditionally been used for behavior assays to test phenotypes including memory, gait and frailty. However, these models have traditionally involved the manual analysis of a limited number of pre-defined behaviors. We aim to use ‘computerized’ behavior quantification to explain genetic predisposition to aspects of AD and to identify new phenotypes that may be associated with AD and aging in mouse models.
Methods: We will identify novel behavioral phenotypes associated with aging and Alzheimer’s disease within traditional assays, using AD-BXD model mice. We will then use QTL mapping to determine regions of the genome associated with variation in any phenotype measured. Next, we will search within each QTL for candidate genes that may influence our trait of interest. We have previously collected videos of mice exploring a snowflake maze. We will use DeepLabCut and a modified version of High-Resolution Net (HRNet) software (supervised learning) as well as Motion Sequencing (MoSeq) or DeepOf (unsupervised learning) in conjunction with Simple Behavioral Analysis (SimBA) software to test whether machine learning tools can accurately quantify these behavioral phenotypes of aging and Alzheimer’s disease.
Expected Findings: Alzheimer's disease and aging both lead to a range of linked phenotypes. We aim to identify phenotypes which may be unique to one or shared by both and identify genetic loci underlying these phenotypes.
Jacqueline Harris1, Alana Smith1, Ernestine Kubi-Amos Abanyie1, Robert Williams1 , Athena Starlard-Davenport1
1College of Medicine, Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN.
* astarlar@uthsc.edu
Identifying Genetic Modifiers of Sickle Cell Disease by Backcrossing Genetically Diverse BXD Inbred Strains with preclinical Townes Sickle Cell Mice
Sickle cell anemia (SCA) is an inherited red blood cell disorder that affects more than 100,000 people in the United States and 20 million people worldwide. SCA has an incidence of about 1 in every 365 African American births. SCA is caused by a substitution of a valine for glutamic acid at the seventh amino acid in the beta globin (HBB) chain, which produces abnormal sickle hemoglobin (HbS) and red blood cell sickling under hypoxic conditions. Disease severity and the clinical manifestations observed vary greatly in people with SCA, who experience more severe complications compared with other genotypes. Complications are often due to repetitive vaso-occlusive crises under hypoxic conditions that can lead to stroke, organ damage, and early death compared to the general population [1-3]. Therefore, identification of genetic modifiers associated with SCA severity is desirable. We hypothesize that mouse strains harboring the sickle cell mutation and sickle cell trait will have worse outcomes, such as organ damage and red blood cell sickling, compared to those with protective loci. In this study, we will build causal models of SCA severity and identify genetic modifiers by crossing preclinical Townes SCA mice with the BXD recombinant inbred (RI) strains. The Townes mouse model is an excellent model for studying SCA since these mice carry human hemoglobin knock-in genes in place of the endogenous mouse genes, the same genes found in patients with sickle cell anemia and/or the sickle cell trait. However, the variability in sickle cell disease phenotypes is better studied in a genetically diverse population. The BXD RI panel is one of the best model systems to study genetic effects of SCA [4].
Taking advantage of the BXD RI model system, we propose to: 1) generate a small breeding colony of mice consisting of Townes wild-type (non-sickling) mice and Townes sickling SCA mouse strains to select for mice homozygous for SCA and/or the SCA trait by genotyping at weaning; and 2) backcross female BXD mice with male SCA mice, acquire tissues and bodily fluids to identify deep phenomes associated with SCA severity (Figure 1). In conclusion, we anticipate that crossing the BXD recombinant inbred strain of mice with Townes SCA mouse strains will be a novel and effective approach for studying genetic heritability in pain severity, red blood cell sickling, and organ failure contributing to SCA disease severity.
Jason A. Bubier1, Robyn L. Ball1, Jane Adams2, Yanjiao Zhou3, George M. Weinstock4, Michelle A. Borkin2, Elissa J Chesler1, Dong-binh Tran4, Belinda Cornes4, Vivek M. Philip1
1Center for Systems Neurogenetics of Addiction, The Jackson Laboratory, 600 Main St. Bar Harbor, ME 04605. 2Northeastern University, 420 Renaissance Park, 1135 Tremont Street Boston, MA 02115. 3School of Medicine, University of Connecticut Health Center, Farmington, Connecticut, USA. 06030. 4The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA. 06030.
Conditional inference trees to visualize dense systems genetics data.
Substance use disorders (SUDs) remain a significant public health challenge with negative health, social, and economic consequences. SUD research has mostly focused on the neurologic and genetic components of usage behavior, but there is increasing interest in the gut microbiome's role in the pathogenesis of SUDs. With the ongoing collection of large volumes of SUD-related data, researchers aim to facilitate microbiome research by building large reference data sets to serve as a foundation for new technologies and data analysis tools. Diversity Outbred (DO) mice represent one such new foundational resource as a genetically diverse population readily amenable to genetic mapping. Utilizing a large cohort of DO mice, the Center for Systems Neurogenetics of Addiction at The Jackson Laboratory investigated the role of genomic variation in regulating addiction-predictive and addiction-like behaviors. Within this same population, we analyzed the composition of the gut microbiome in both the cecal feces and fecal pellets using 16S and WGS sequencing, respectively, to enumerate the communities of microbes within these matched DO mice. To link microbes to addiction-relevant behaviors, we identified microbes associated with behavioral phenotypes using elastic net regularized regression and Permutation-based Maximum Covariance Analysis (PMCA). To discover how microbes work together to influence behavior, microbes associated with the behavior were included as covariates in each behavior-specific conditional inference tree analysis, i.e., a tree-structured regression with unbiased covariate selection. Results from these analyses are being used to create a public-facing interactive visualization dashboard. This dashboard will enable the scientific community to interrogate the data, generate new hypotheses, and explore the microbe-microbe interactions that were discovered to influence addiction-related behaviors.
Mackenzie Fitzpatrick1, Alexandria Szalanczy1, Nataley Der1, Michael Grzybowski2, Aron Geurts2, Leah Solberg Woods1
1Wake Forest, Department of Molecular Medicine. 2Medical College of Wisconsin, Department of Physiology.
Transmembrane domain mutation in Adcy3 causes obesity in rats via altered food intake or energy expenditure depending on sex
Obesity is a growing epidemic that is associated with multiple comorbidities, including hyperinsulinemia. The gene adenylate cyclase 3 (Adcy3) has been linked to obesity in both humans and rodent models. Our lab identified a protein-coding variant in the transmembrane domain of Adcy3 that is associated with adiposity in rats. While protein-coding variants in Adcy3 have been associated with obesity in humans, existing rodent studies have only examined Adcy3 knockout (KO) models which have limited translational relevance to humans.
We developed an Adcy3 mutant rat model that has a protein-coding variant (Adcy3mut/mut) in the transmembrane domain that is similar to Adcy3 variants identified in human obesity. We placed wild-type (WT) and Adcy3mut/mut rats on a high-fat diet for 12 weeks and measured body composition, fasting insulin, food intake, and energy expenditure.
Adcy3mut/mut rats of both sexes weigh more than WT rats due to increased fat mass. However, only Adcy3mut/mut males have increased serum insulin. Furthermore, while male Adcy3mut/mut rats consume more food than WT males, they do not have altered energy expenditure. Interestingly, Adcy3mut/mut females expend significantly less energy than WT females without changes in food intake.
We have determined that the underlying cause of obesity in Adcy3mut/mut rats is sex-dependent: increased food intake in male rats and decreased energy expenditure in female rats lead to increased adiposity. We also observed sex differences in the development of hyperinsulinemia. Future studies will investigate potential mechanisms for the observed sex differences as well as the molecular mechanisms by which Adcy3mut/mut causes obesity.
Brock, William1, Cai, Yanwei2, Eaddy, J. Scott1, Fallon, John3, Kim, Yunjung2, McGee, T2, Mosedale, Merrie3, Qasem, Rani J3, Roth, Sharin E1, Smith, Phil C3, Valdar, W.1
1Otsuka Pharmaceutical Development and Commercialization, Inc. 2Department of Genetics, University of North Carolina at Chapel Hill. 3UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill.
Genetic Variation Drives Differences in the mRNA and Protein Levels of Key Hepatic Drug Metabolizing Enzymes and Transporters in Collaborative Cross Mice
Genetic variation in drug metabolism and disposition contributes to patient variability in drug response. The Collaborative Cross (CC) mouse genetic reference population is a promising preclinical model to investigate such gene-by-treatment interactions in humans. However, there has been only limited characterization of pharmacogenes in CC mice. The objective of this study was to investigate the contribution of genetic variation to drug metabolizing enzyme and transporter (DMET) protein levels in the CC. Quantitative targeted proteomics was used to measure the levels of 25 key DMET proteins in the livers of 4 male mice from each of 45 CC lines that were vehicle-treated controls in a previous study. Liver gene expression profiling data available from the same mice were also leveraged to investigate possible mechanisms of protein regulation. Strain-dependent differences were observed in the mRNA levels of 21 and the protein levels of 24 of the 25 DMET genes. Significant quantitative trait loci (QTL) were identified for 7 DMET genes, and merge analysis was used to identify the underlying gene variants. Mediation analysis was then used to identify the relationship between mRNA and DMET protein levels, and at least three different mechanisms of regulation were observed: 1) mediation of local protein QTL, 2) mediation of distal protein QTL, and 3) protein QTL without mediation. Taken together, these findings suggest that genetic variability contributes to differences in CC DMET protein levels, an effect that is mediated, in part, by mRNA. This study also provides a rich dataset to support the design and analysis of future gene-by-treatment studies in the CC.
Rémi Planel1, Victoire Baillet1, Vincent Guillemot1, Jean Jaubert2, Christian Vosshenrich3, Rachel Torchet1, Marion Rincel4, Pascal Campagne1, Xavier Montagutelli2
1Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, F-75015 Paris, France. 2Institut Pasteur, Université Paris Cité, Mouse Genetics Laboratory, F-75015 Paris, France. 3Institut Pasteur, Université Paris Cité, Innate Immunity Unit, F-75015 Paris, France. 4Institut Pasteur, Université Paris Cité, Microenvironment and Immunity Unit, F-75015 Paris, France.
CCQTL: facilitating QTL mapping in the Collaborative Cross
Quantitative Trait Locus (QTL) mapping in mapping populations and Genome-Wide Association Studies (GWAS) in natural populations are complementary approaches for dissecting the genetic architecture of complex traits. While GWAS are typically carried out by statistical genetics groups well-versed in quantitative environments and code management, experimental geneticists performing QTL mapping focus on labor-intensive phenotyping experiments thus often requiring further support, both for code and statistics, to benefit from best practices in the field.
We present CCQTL, a comprehensive platform for QTL mapping in the Collaborative Cross (CC), an increasingly used mouse mapping population. CCQTL features an intuitive graphical user interface (GUI) for seamless end-to-end QTL mapping analysis, from data transformation to candidate gene identification. It also includes a robust database structure ensuring secure, organized storage of phenotypic data, accompanied by an advanced permissions system.
CCQTL's analytical component leverages R/qtl2 tools integrated into preconfigured Galaxy workflows designed explicitly for the CC. This setup facilitates one-click, reproducible analyses. The platform (GUI, database, and analytics) is containerized using Docker, enabling straightforward deployment and scalability. While primarily designed to empower non-specialists to conduct their own data analyses, CCQTL's Galaxy-based reproducibility and sophisticated database permission system also render it valuable for experienced users seeking streamlined solutions.
Wesley L. Crouse1, Gregory R. Keele2, Madeleine S. Gastonguay2, Gary A. Churchill1, William Valdar1,3
1Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America. 2The Jackson Laboratory, Bar Harbor, Maine, United States of America. 3Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America.
A Bayesian model selection approach to mediation analysis
Genetic studies often seek to establish a causal chain of events originating from genetic variation through to molecular and clinical phenotypes. When multiple phenotypes share a common genetic association, one phenotype may act as an intermediate for the genetic effects on the other. Alternatively, the phenotypes may be causally unrelated but share genetic loci. Mediation analysis represents a class of causal inference approaches used to determine which of these scenarios is most plausible. We have developed a general approach to mediation analysis based on Bayesian model selection and have implemented it in an R package, bmediatR. Bayesian model selection provides a flexible framework that can be tailored to different analyses. Our approach can incorporate prior information about the likelihood of models and the strength of causal effects. It can also accommodate multiple genetic variants or multi-state haplotypes. Our approach reports posterior probabilities that can be useful in interpreting uncertainty among competing models. Using simulated data, we compared bmediatR with other popular methods, including the Sobel test, Mendelian randomization, and Bayesian network analysis. We found that bmediatR performed as well as or better than these alternatives in most scenarios. We applied bmediatR to proteome data from Diversity Outbred (DO) mice, a multi-parent population, and demonstrate the power of mediation with multi-state haplotypes. We also applied bmediatR to data from human cell lines to identify transcripts that are mediated through or are expressed independently from local chromatin accessibility. We demonstrate that Bayesian model selection provides a powerful and versatile approach to identify causal relationships in genetic studies using model organism or human data.
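For readers unfamiliar with the Sobel test, one of the comparison methods named above, the following minimal sketch computes the Sobel statistic from already-estimated path coefficients. It is not the bmediatR method and says nothing about its interface; the function name and the numeric inputs are hypothetical. Here a is the effect of the genetic variant on the mediator and b is the effect of the mediator on the outcome, adjusted for the variant, each with its standard error.

```python
from math import erfc, sqrt


def sobel_test(a, se_a, b, se_b):
    """Sobel z statistic and two-sided normal p-value for the indirect effect a*b.

    a, se_a: coefficient and standard error from regressing mediator on variant.
    b, se_b: coefficient and standard error of the mediator when regressing
             the outcome on both mediator and variant.
    """
    # First-order (delta method) standard error of the product a*b.
    se_ab = sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = (a * b) / se_ab
    # Two-sided tail probability under a standard normal distribution.
    p = erfc(abs(z) / sqrt(2))
    return z, p


# Hypothetical path estimates, for illustration only.
z_stat, p_value = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
```

The Sobel test assumes the product a*b is approximately normal, which is one reason model-selection approaches such as the one described in this abstract can behave differently in small samples.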
Jennifer R Smith1, Stanley JF Laulederkind1, G Thomas Hayman1, Shur-Jen Wang1, Monika Tutaj1, Mary L Kaldunski1, Mahima Vedi1, Wendy M Demos1, Marek A Tutaj1, Jyothi Thota1, Logan Lamers1, Adam C Gibson1, Akhilanand Kundurthi1, Kent C Brodie2, Stacy Zacher3, Jeffrey L De Pons1, Melinda R Dwinell1, Anne E Kwitek1
1Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI, 53226, USA. 2Clinical and Translational Science Institute, Medical College of Wisconsin, Milwaukee, WI, 53226, USA. 3Finance and Administration, Medical College of Wisconsin, Milwaukee, WI, 53226, USA.
Upgraded genome browsers at the Rat Genome Database support comparative and translational studies
The Rat Genome Database (RGD, https://rgd.mcw.edu), a cross-species knowledgebase and the premier online resource for rat genetic and physiologic data, has recently added JBrowse 2 to our suite of innovative analysis tools (https://rgd.mcw.edu/jbrowse2/). JBrowse 2 is an innovative genome browser with improved functionality for structural variant and comparative genomics visualization. While sharing many features with its predecessors GBrowse and JBrowse 1—the ability to select and view multiple data types in a single view, easy zooming and navigation across a chromosome, the ability to search for the name or ID of a genome feature such as a gene or QTL and go directly to that region of the genome—JBrowse 2 provides expanded functionality designed to facilitate comparative studies. For example, JBrowse 2's browsers are opened in sub-windows embedded in the larger browser window allowing the user to simultaneously view multiple regions, either within a single assembly or from different assemblies/species, in a single view. JBrowse 2 also supports multiple advanced "view types" including the linear synteny and breakpoint split views which use a stacked linear genome configuration to show syntenic alignments and connections between long split alignments or paired end reads across multiple chromosomes, respectively. A circular view gives an overview of chromosomal translocations and a dotplot view provides a comparison of whole genome alignments.
JBrowse 2 browsers have been set up for all ten RGD species, including browsers for multiple assemblies for each species where those are available. For rat in particular, a substantial set of tracks is available. In addition to gene, QTL and strain tracks, RGD's JBrowse 2 provides tracks for RNA-Seq BAM alignments, strain-specific variants, consolidated variants from the European Variant Archive, and most recently, ATAC-Seq and ChIP-Seq epigenetics data aligned onto mRatBN7.2.
In addition, for those interested in synteny across more than two species, RGD is developing the Virtual Comparative Map tool (VCMap, https://rgd.mcw.edu/vcmap/). VCMap, currently released as a beta version, provides a bird's eye view of synteny between two or more species/assemblies, whereas JBrowse 2's synteny viewer is limited to only two. The most recent version of VCMap has improved performance and navigation for comparisons of rat, mouse and human syntenic regions, and now includes heatmap views of genomic variant densities for the three species. RGD's JBrowse 2 and VCMap provide valuable functionality for researchers engaging in comparative genomics and translational medicine.
Shur-Jen Wang1, Wendy M. Demos1, G. Thomas Hayman1, Mary L. Kaldunski1, Stanley J Laulederkind1, Jennifer R. Smith1, Monika Tutaj1, Mahima Vedi1, Stacy Zacher3, Kent C. Brodie2, Jeffrey L. De Pons1, Akhilanand Kundurthi1, Logan Lamers1, Jyothi Thota1, Marek A. Tutaj1, Melinda R. Dwinell1, Anne E. Kwitek1
1The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA. 2Clinical and Translational Science Institute, Medical College of Wisconsin. 3Finance and Administration, Medical College of Wisconsin.
Rat Genome Database: An Integrated Rat Phenomics and Genomics Data Resource
Rattus norvegicus is a species of choice for complex disease models in biomedical research. These models, arising from spontaneous mutations, selective breeding, or genome manipulation, carry unique genome compositions and thus exhibit specific phenotypes, e.g. susceptibility or resistance to a targeted disease. The Rat Genome Database (RGD) stores the available genome variations in these models and annotates them with controlled vocabularies. These curated disease models have been integrated into 15 disease portals at RGD, and users can also find specific strains using a disease or phenotype term in the “Find Models” tool. The genomic variants of sequenced strains can be compared using the “Variant Visualizer” tool. Recently RGD started curating genome-edited mutations in mutant strains from publications and from data submitted by researchers. This is an ongoing effort to capture the rich data produced by the rat research community. The RGD-curated variants, from sequence analyses or manual curation, are available in the genomic tools housed at RGD for studying genome variation. To link quantitative phenotypes to genotypes, RGD has worked on modifying the quantitative phenotype tool “Phenominer” to curate and display individual animal data. All these quantitative data with available genomic data are integrated into RGD analysis tools and will be presented in the “project page” designated for a specific project or study from research laboratories. Linking phenotypes and genomic variations will provide a useful approach to understanding the complexity of physiological genomics.
by Pjotr Prins, Rob W. Williams & the growing GeneNetwork team
2023 Update on GeneNetwork.org
Our GeneNetwork web-service (GN – formerly WebQTL) was developed for the needs of the mouse community, and represents over 25 years of research, data and software. Today we are growing support for other species, including synteny and pangenome-driven genotyping. In this talk we will discuss reorganization of the informatics of biomedical research and how that can go a long way towards addressing two problematic areas common to both human and model organism analysis: access to massive and complex ‘omics’ data, and the need for robust but accessible systems for analysis of joint statistical/causal models that produce useful predictions. We greatly improved search facilities and we are adding flexible data end-points exposing data and metadata. Combining and integrating datasets from old and new research, or de-siloing, will lead to novel connections and ideas followed by insight. Rather than having researchers creating their own silos we promote interaction and sharing of both data and tools and turn discovery into health. That also means we actively invite new communities to contribute their data to GeneNetwork and we can organize workshops to introduce data submission and analysis options. Finally we’ll announce the GeneNetwork consortium that will allow us to write a paper on more than 25 years of GeneNetwork related efforts!
Mary L. Kaldunski, Kent C. Brodie, Jeff L. De Pons, Wendy M. Demos, G. Thomas Hayman, Akhilanand Kundurthi, Logan Lamers, Stanley J.F. Laulederkind, Lynn Malloy, Rebecca Schilling, Jennifer R. Smith, Akiko Takizawa, Jyothi Thota, Marek A. Tutaj, Monika Tutaj, Mahima Vedi, Shur-Jen Wang, Stacy M. Zacher, Melinda R. Dwinell, Anne E. Kwitek
Rat Genome Database, Medical College of Wisconsin, Milwaukee, WI 53226, USA.
Enhanced Hybrid Rat Diversity Panel resources at RGD
Genetic susceptibility to disease, sensitivity to environmental elements, and pharmacogenomics are critical components of precision medicine. Inbred rat strains control for genetic background and allow for reproducible molecular, cellular, and whole animal phenotyping. The Hybrid Rat Diversity Panel (HRDP), a panel of inbred rat strains, was carefully selected to maximize genetic and phenotypic diversity and thereby maximize the power to detect and fine-map genetic loci associated with complex traits. The HRDP includes 35 genetically diverse inbred strains plus two panels of recombinant inbred rat strains. The Hybrid Rat Diversity Program at the Medical College of Wisconsin is rederiving, sequencing, maintaining, and distributing the HRDP strains, including the available founder substrains of the Heterogeneous Stock (HS) to the scientific community. The Hybrid Rat Diversity Panel portal at the Rat Genome Database (http://rgd.mcw.edu) is a primary point of access for data related to these strains. The portal page contains information about the resource, lists the strains, and indicates those with sequenced genomes. The strains have been well characterized through studies focused on seizures, epilepsy, lymphoma and leukemia, blood pressure regulation, metabolic syndrome, alcohol consumption, and other conditions. RGD has recently completed a project to curate quantitative and qualitative phenotypic data from published literature for the HRDP/HS founder strains or related substrains. This data is housed in RGD’s PhenoMiner database and is a rich resource for studying HRDP diversity. RGD’s Variant Visualizer tool has been loaded with the variant data for the available sequenced strains, and vcf files are available for download. Transcriptomic and epigenomic data generated by the scientific community for these strains will also be incorporated into the portal through focused curation of publicly available data. 
The HRDP is a resource that combines the power of genetic stability within strains, power for genomic mapping strategies, and strength in animal models that mirror many human disease traits.
Kai Li, Elizabeth Hudson, Melissa Laird-Smith, Peter Doris, and Ted Kalbfleisch
Reference Genome Assemblies Built from PacBio HiFi long reads, Proximity Ligation, and Optical Mapping Technologies for Inbred Rat Strains Important as Models of Complex Disease
Here we describe our work generating reference quality genome assemblies for 6 rat strains that are important as models for complex disease. These strains include the stroke-prone spontaneously hypertensive SHRSP/BbbUtx, and stroke-resistant SHR/Utx, Wistar/Kyoto WKY/Bbb, Brown Norway BN/NHsdMcwi, Long Evans/Stm, and Fischer 344/Stm. During the two year course of this project we have used contemporary cutting edge technologies, and have modified workflow to incorporate newly emerging methods to produce highly contiguous and accurate assemblies for these strains. Here we present the workflows used to produce these genomes, and measures of contiguity, completeness, and accuracy of the assemblies. The genomes for the two spontaneously hypertensive strains, as well as the Wistar/Kyoto strain have been deposited at NCBI and have been annotated using the NCBI automated pipeline. The Brown Norway genome assembly has been reviewed by the NCBI Genome Reference Consortium and will be adopted as a new rat reference assembly (GRCr8) pending completion of annotation. Four additional strains (Dahl SS/Jr, Dahl SR/Jr, Lyon Hypertensive, Lyon Normotensive) are currently in process and will be completed within the next year.
Oksana Polesskaya1, Ely Boussarty2, Riyan Cheng1, Thiago Misfeldt Sanches1, Mika Okamoto1, Thomas Zhou2, Olivia LaMonte2, Abraham A. Palmer1,3, Rick A. Friedman2,4
1Department of Psychiatry, University of California San Diego, La Jolla, CA, 92093, USA. 2Department of Otolaryngology, University of California San Diego, La Jolla, CA, 92093, USA. 3Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, 92093, USA. 4Department of Surgery, University of California San Diego, La Jolla, CA, 92093, USA
Genome-wide association study finds multiple loci associated with age-related hearing loss in CFW mice
Background. Age-related hearing loss (ARHL) is influenced by environmental and genetic factors. ARHL is the most common cause of hearing loss and is one of the most prevalent conditions affecting the elderly globally. Investigating the genetic basis of ARHL in an outbred mouse model may lead to a better understanding of the molecular mechanisms of this condition. The mouse and human inner ears are functionally and genetically homologous. Panels of inbred mice have been successfully used to identify several candidate loci. The current study used Carworth Farms White (CFW) outbred mice, because this strain has variation in the onset and severity of ARHL. The goal of this study was to identify genetic loci involved in regulating ARHL. Hearing at a range of frequencies was measured using auditory brainstem response (ABR) thresholds in male and female CFW mice at 4, 6, and 14 months of age. Results. We obtained genotypes at ~2.5 million single nucleotide polymorphisms (SNPs) using low-coverage whole-genome sequencing (WGS) followed by imputation using STITCH. The reference panel was constructed from genotypes of 3,234 CFW mice that were sampled from the same population of commercially available CFW outbred mice, Crl:CFW(SW)-US_P08. To determine the accuracy of the genotyping, we sequenced 8 samples at >30x coverage and used them to estimate the discordance rate, which was 0.53%. We performed genetic analysis for the ABR thresholds in ~800 CFW mice at each frequency and age, the ABR difference between 6 and 10 months for each frequency, and the time of onset of deafness, defined as an ABR threshold > 100 dB, for each frequency. The heritability ranged from 0 to 27% for different traits. Genome-wide association analysis identified seven regions associated with ARHL. A locus on chromosome 1 contained the gene Arhgef4, which plays a role in signal transduction and cytoskeleton assembly.
Arhgef4 was not previously reported in relation to hearing loss, but the gene family member Arhgef6 is known to play a role in outer hair cell stereocilia survival during hearing loss. Other associated regions also contained potential candidate genes. This work is ongoing; our final target sample size is 2,000 CFW mice. Conclusion. We performed GWAS for ARHL in CFW outbred mice and identified several QTLs containing multiple candidate genes, notably Arhgef4. This work helps to identify genetic risk factors for ARHL and to define novel therapeutic targets for ARHL prevention.
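The discordance rate reported for the genotyping can be illustrated with a minimal sketch. It assumes genotypes coded as alternate-allele counts (0, 1, 2) and compares imputed calls against calls from the same animal sequenced at high coverage; the function name and the toy data are hypothetical and are not taken from the study.

```python
def discordance_rate(imputed, truth):
    """Fraction of genotype calls that differ between two call sets.

    Genotypes are coded as alternate-allele counts (0, 1, 2); None marks a
    missing call, and sites missing in either set are skipped.
    """
    compared = 0
    mismatched = 0
    for a, b in zip(imputed, truth):
        if a is None or b is None:
            continue
        compared += 1
        if a != b:
            mismatched += 1
    if compared == 0:
        raise ValueError("no overlapping genotype calls to compare")
    return mismatched / compared


# Hypothetical toy data for one sample: low-coverage imputed calls versus
# calls from the same animal sequenced at high coverage.
imputed_calls = [0, 1, 2, 2, None, 0, 1, 1, 0, 2]
deep_calls = [0, 1, 2, 1, 2, 0, 1, 1, 0, 2]
rate = discordance_rate(imputed_calls, deep_calls)
print(f"discordance: {rate:.3f}")
```

In practice the rate would be averaged over all deeply sequenced samples and all sites shared by the two call sets.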
Ling Li1, Deihui Kong1, Aijun Zhang1, Zhiping Wu2, Ariana Mancieri2, Laura Saba3, Hao Chen4, Michal Pravenec5, Junmin Peng2,6, Robert W. Williams1, Xusheng Wang1,7
Correspondence: xwang39@uthsc.edu
1Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38103 USA. 2Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN 38105 USA. 3Department of Pharmaceutical Sciences, University of Colorado Denver, Aurora, CO 80045 USA. 4Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center, Memphis, TN 38103 USA. 5Institute of Physiology, Czech Academy of Sciences, 14220 Prague, Czech Republic. 6Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, TN 38105 USA. 7Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, TN 38105 USA.
Genetic Regulation of Protein Expression in Rat Brain
Genetic variation in protein expression has been implicated in a broad spectrum of common diseases and complex traits. However, the fundamental genetic architecture and variation of protein expression have received far less attention than either mRNA or classical phenotypes. In this study we systematically quantified protein, mRNA, and metabolites in the brains of a large family of rats. Using tandem mass tag (TMT)-based quantitative mass-spectrometry (MS) technology, we identified and quantified 8,119 proteins in SHR/OlaIpcv and BN-Lx/Cub and in 29 of their fully inbred HXB/BXH progeny strains. Differential expression (DE) analysis identified 597 proteins with significant differences in protein expression between the parental strains (fold change >2 and FDR <0.01). We discovered 318 proteins linked to strong cis-acting protein quantitative trait loci (pQTLs, FDR < 0.05). We used phenome-wide association to test for downstream effects of pQTLs on the already extensive HXB family phenome. Collectively, this work demonstrates the value of large and systematic proteo-genetic datasets for understanding the control of protein expression in the brain and its functional links to complex CNS traits.
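The differential expression criterion (fold change >2 and FDR <0.01) can be sketched as follows. This is an illustration only: it assumes the FDR is a Benjamini-Hochberg adjusted p-value and that the fold-change cutoff applies in both directions, neither of which is stated in the abstract. The protein names and values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_de(names, fold_changes, pvalues, min_fc=2.0, max_fdr=0.01):
    """Keep entries passing both the fold-change and the FDR cutoff.

    Fold changes are ratios between the two parental strains; a ratio of
    0.4 is treated as a 2.5-fold change in the opposite direction.
    """
    fdr = benjamini_hochberg(pvalues)
    return [
        name
        for name, fc, q in zip(names, fold_changes, fdr)
        if max(fc, 1.0 / fc) > min_fc and q < max_fdr
    ]


# Hypothetical example values, not data from the study.
names = ["P1", "P2", "P3", "P4"]
fold_changes = [3.0, 0.4, 1.1, 2.5]
pvalues = [0.0001, 0.002, 0.0005, 0.2]
selected = select_de(names, fold_changes, pvalues)
print(selected)
```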
Joel D Leal-Gutiérrez1, Apurva S Chitre1, Thiago Sanches1, Oksana Polesskaya1, Katie Holl2, Jianjun Gao1, Riyan Cheng1, Hannah Bimschleger1, Angel Garcia Martinez3, Tony George4, Alexander F Gileta1,5, Wenyan Han3, Aidan Horvath6, Alesa Hughson6, Alexander Lamparelli7, Cassandra L. Versaggi7, Connor Martin4, Celine L St. Pierre8, Jordan A Tripi7, Tengfei Wang3, Hao Chen3, Shelly B Flagel9, Paul Meyer7, Jerry Richards4, Terry E Robinson10, Keita Ishiwari4, Leah C Solberg Woods11, Allegra Aron12, Robin Schmid12, Pieter Dorrestein12, Dimitri Krementsov13, Amelie Baud14, Abraham A Palmer1,15
1Department of Psychiatry, University of California San Diego, La Jolla, CA, 92093, USA. 2Human and Molecular Genetic Center, Medical College of Wisconsin, Milwaukee, Wisconsin, USA. 3Department of Pharmacology, University of Tennessee Health Science Center, Memphis, Tennessee, USA. 4Clinical and Research Institute on Addictions, University at Buffalo, Buffalo, New York, USA. 5Department of Human Genetics, University of Chicago, Chicago, Illinois, USA. 6Department of Psychiatry, University of Michigan, Ann Arbor, Michigan, USA. 7Department of Psychology, University at Buffalo, Buffalo, New York, USA. 8Department of Genetics, Washington University, St. Louis, Missouri, USA. 9Molecular and Behavioral Neuroscience Institute, University of Michigan, Ann Arbor, Michigan, USA. 10Department of Psychology, University of Michigan, Ann Arbor, Michigan, USA. 11Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA. 12Skaggs School of Pharmacy and Pharmaceutical Sciences. 13Department of Biomedical and Health Sciences, The University of Vermont, Burlington, Vermont, USA. 14Centre for Genomic Regulation, Barcelona, Spain. 15Institute for Genomic Medicine, University of California San Diego, La Jolla, California, USA
Genetic basis of cecum metabolome composition in heterogeneous stock rats identifies ABC and SLC transporters and enzymes including CES, CYP, and UGT as associated with metabolome abundance
Background: The qualitative and quantitative collection of low-molecular-weight molecules present in the cecum, the cecum metabolome, is influenced by genetic differences among individuals, and this variability can alter complex behavioral and physiological phenotypes. The cecum metabolome composition is also influenced by the production of metabolites by the gut microbiota [1]. Materials and Methods: The NIH HS colony was used. This colony was established using eight founder inbred strains: ACI/N, BN/SsN, BUF/N, F344/N, M520/N, MR/N, WKY/N, and WN/N [2]; 1,062 rats were used for cecum metabolome abundance quantification. Rats were from two research centers and were genotyped for 5.4 million markers. Variant mapping was performed using the Rnor_7.2 genome assembly. The apical region of the cecum was collected from each rat after euthanasia and used to generate fecal metabolite extracts. The metabolomic procedure was performed using a Vanquish core ultra-high performance liquid chromatography system attached to a QExactive quadrupole-orbitrap mass spectrometer. Positive mode MS data was obtained using a data-dependent acquisition method in which the five most abundant ions were identified and subsequently scanned for MS/MS fragmentation via collision-induced dissociation. Thermo proprietary MS files were converted into .mzML files. The open-source software MZmine3 was used for feature detection. Output from MZmine3 was used for metabolite annotation using the software Sirius. Metabolites present in at least 50% of samples were retained. A total of 4,861 metabolites were available and tested. Each observed phenotype was quantile normalized and adjusted for batch number, center, and sex using a linear model. Residuals were fitted in a GWAS analysis applying the GCTA-LOCO (leave-one-chromosome-out) approach. We used a significance threshold of -log10(p)=5.8, which was based on a permutation analysis.
Results: 76.8% of the tested metabolites were annotated, indicating the likely molecular identity; 777 metabolites showed significant associations and 63% of them involved terpenoids. We observed numerous loci that were associated with multiple metabolites, which we refer to as transbands. Some of the loci harboring the highest number of associated polymorphisms included chr14:13,718,747-29,565,628 (Figure 1) and chr20:4,097,028-18,158,649, including 149 and 52 associated metabolites, respectively. Identified genes inside the associated regions involved metabolite transporters such as ATP Binding Cassette Subfamily (ABC) and Solute Carrier Family (SLC), and genes encoding enzymes involved in metabolite synthesis, including Carboxylesterase (CES), Cytochrome P450 Family (CYP), and UDP-Glucuronosyltransferase Family (UGT). Conclusions: Multiple association transbands for metabolome abundance were identified, and genes involved in metabolome transport and synthesis were found.
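Two steps from the Methods above, quantile normalization of each metabolite and the -log10(p)=5.8 significance threshold, can be sketched as follows. The abstract does not say which quantile normalization was used, so this sketch assumes a rank-based inverse normal transform; the covariate adjustment and the GCTA-LOCO association step are not shown. Function names and input values are hypothetical.

```python
import math
from statistics import NormalDist


def inverse_normal_transform(values):
    """Map values to standard-normal quantiles by rank (ties share a rank)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # (rank - 0.5) / n keeps every quantile strictly inside (0, 1).
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]


def is_significant(p_value, neg_log10_threshold=5.8):
    """True when -log10(p) reaches the genome-wide threshold."""
    return -math.log10(p_value) >= neg_log10_threshold


# Hypothetical abundances for one metabolite across five rats.
abundances = [120.0, 15.0, 15.0, 980.0, 43.0]
z = inverse_normal_transform(abundances)
p_cutoff = 10 ** -5.8  # the threshold expressed as a p-value
print([round(v, 3) for v in z], f"{p_cutoff:.3g}")
```

Expressed as a p-value, the threshold corresponds to roughly 1.6e-6.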
Yingping Wang1, Corinne Hennessy1, Kristina Hatakka1, Stephen Humphries2, Evgenia Dobrinskikh2, David Clouthier3, Ivana V. Yang*, David A. Schwartz1
1Department of Medicine, University of Colorado School of Medicine, Aurora, Colorado, 80204, USA. 2Department of Radiology, National Jewish Health, Denver, Colorado, 80218, USA. 3Department of Craniofacial Biology, University of Colorado School of Medicine, Aurora, Colorado, 80204, USA
Genetic locus associated with bleomycin-induced lung fibrosis in mice
Idiopathic pulmonary fibrosis (IPF) is a complex genetic disease involving both rare and common genetic variants in its pathogenesis. One prominent common gain-of-function variant, rs35705950, is associated with IPF. However, the low penetrance of this variant remains a puzzle. Our previous study established a link between Muc5b expression and bleomycin-induced lung fibrosis that varies among the eight founder mouse strains of Diversity Outbred (DO) mice, all carrying the risk T allele at the orthologous site of rs35705950. Building upon these observations, we hypothesize that genetic factors influencing bleomycin-induced Muc5b expression have an impact on the risk of developing pulmonary fibrosis.
In this study, we computed 91 quantitative computed tomography (CT) indexes assessing bleomycin-induced pulmonary fibrosis in 546 DO mice and conducted Quantitative Trait Locus (QTL) mapping. These mice underwent intratracheal bleomycin administration at 12 weeks of age and were harvested and phenotyped 10 weeks later. Mice were genotyped using the GigaMUGA array, providing 143,259 informative markers for genetic mapping. To evaluate lung fibrosis, we employed a technique for automated lung segmentation from microCT scans and computed various radiomic features using the pyradiomics software library. Genetic loci were identified by fitting mixed-effects linear models at each genetic marker and regressing CT indexes on haplotype probabilities.
The CT indexes among the 546 mice exhibited a broad spectrum, including Joint-Entropy values ranging from 6.207 to 9.043 (Figure 1), reflecting variation in lung fibrosis levels influenced by their distinct genetic compositions. A QTL peak (Chr7: 131.9-133.2 Mbp) located 9 Mbp from the Muc5b gene was identified, influencing 29 CT indexes and exhibiting significant associations with 4 (Table 1, Figure 2A). Strain PWK/PhJ displayed the strongest allele effect within this genomic interval, with a notable positive impact on Joint-Entropy (Figure 2B) and a negative impact on Joint-Energy. Entropy and Energy are indices for quantifying CT images. Entropy measures intensity variability in the image neighborhood, while Energy quantifies the presence of homogeneous patterns. Previous research has linked Entropy negatively and Energy positively with radiation-induced lung injury [1]. This QTL support interval contains 9 genes and 38 top SNPs, all deriving from PWK/PhJ (Figure 2C). These findings strongly suggest that this locus, which includes several interesting genes, may counteract the effect of Muc5b or act independently, reducing bleomycin-induced lung injury and fibrosis. This conclusion aligns with our earlier work identifying PWK/PhJ as having low bleomycin-induced Muc5b expression and minimal lung fibrosis [2].
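The two texture indices discussed above can be made concrete with a small sketch. It assumes the usual gray-level co-occurrence definitions, where Joint-Entropy is the negative sum of p·log2(p) and Joint-Energy is the sum of p squared over co-occurrence probabilities p. It uses only horizontally adjacent pixels on a toy 2D image and is not the pyradiomics implementation used in the study; all names and images are hypothetical.

```python
import math
from collections import Counter


def cooccurrence_probabilities(image):
    """Normalized, symmetric counts of horizontally adjacent gray-level pairs."""
    counts = Counter()
    for row in image:
        for left, right in zip(row, row[1:]):
            counts[(left, right)] += 1
            counts[(right, left)] += 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}


def joint_entropy(probs):
    """Higher when gray-level pairs are spread over many combinations."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


def joint_energy(probs):
    """Higher when a few gray-level pairs dominate (homogeneous texture)."""
    return sum(p * p for p in probs.values())


# Hypothetical toy images with discretized gray levels.
uniform = [[1, 1, 1], [1, 1, 1]]
textured = [[1, 2, 3], [3, 1, 2]]
for name, img in (("uniform", uniform), ("textured", textured)):
    probs = cooccurrence_probabilities(img)
    print(name, joint_entropy(probs), joint_energy(probs))
```

The uniform image yields zero entropy and maximal energy, while the textured image yields higher entropy and lower energy, which matches the opposite directions of the PWK/PhJ allele effect on the two indices.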
Cara Green1,2, Michaela Murphy1,2, Isaac Grunow1,2, Yang Liu1,2, Reji Babygirija1,2, Mariah Calubag1,2, Shelly Sonsalla1,2, Astrid Martin1,2, Yang Yeh1,2, Dudley Lamming1,2
1Department of Medicine, University of Wisconsin-Madison, Madison, WI 53705, USA 2William S. Middleton Memorial Veterans Hospital, Madison, WI 53705, USA
Heterogeneity in the impact of dietary protein on metabolic health highlights the importance of precision dietetics
Low protein (LP) diets can improve metabolic health without caloric restriction and may be effective to promote healthy aging and combat diabetes and obesity. Many dietary recommendations exist at the population level, but individual information about the metabolic response to diet is lacking. In mice, an LP diet can promote weight loss, improve glycemic control, and increase lifespan; however, this is sex- and strain-dependent. It is unknown which key genes may be responsible for determining the individual response to dietary protein. To identify genetic markers that may determine how dietary protein impacts metabolism, we characterized 40 recombinant inbred strains of male and female BXD mice. We found huge variation across strains and sexes in the response to protein restriction (PR), including in weight loss, adiposity, and fasting blood glucose. PR promoted positive and negative responses depending on strain and sex; male mice could lose 4 g or gain 6 g after 8 weeks on PR depending on strain. One of the phenotypes almost universally improved by PR was fasting blood glucose. This was reflected in correlation analyses, where 11 strains of mice showed a strong positive correlation (R>0.4) between protein intake and fasting blood glucose, relative to only 5 strains with total calorie intake. Interestingly, there was very little correlation of protein intake with final lean mass, with 4 strains showing a positive (R>0.4) correlation; however, 21 BXD strains had a positive correlation between calorie intake and final lean mass, suggesting that calories, and not protein in the diet, are a modulator of fat-free mass in mice. Quantitative trait locus (QTL) analysis to discover links between these complex phenotypes and chromosome regions indicated there were no significant regions of interest conserved between males and females for any of the phenotypes we investigated.
QTL analyses found genomic regions significantly associated with changes in lean mass and glucose tolerance with PR in males and females. In females, fasting blood glucose and fat mass were significantly associated with different genomic regions. These data show that the metabolic health effects of dietary protein are highly individualized based on sex and genetic background. This demonstrates the importance of precision dietetics to maximize metabolic health, and the potential significance of personalized dietary strategies if a similar response exists in humans. In the future, this may help us to promote healthy aging and improve metabolic health on an individual basis.
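The per-strain correlation counts reported in this abstract (strains with R>0.4) can be sketched as below. The sketch assumes that R is a Pearson correlation computed within each strain across individual mice, which the abstract does not state; the strain labels and measurements are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def strains_above(data, threshold=0.4):
    """Names of strains whose within-strain correlation exceeds the threshold."""
    return sorted(
        strain
        for strain, (intake, trait) in data.items()
        if pearson_r(intake, trait) > threshold
    )


# Hypothetical per-mouse values: (protein intake, fasting blood glucose).
data = {
    "StrainA": ([1.0, 2.0, 3.0, 4.0], [90.0, 100.0, 115.0, 120.0]),
    "StrainB": ([1.0, 2.0, 3.0, 4.0], [110.0, 100.0, 112.0, 101.0]),
}
positive = strains_above(data)
print(positive, len(positive))
```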
David J. Samuelson1, Michelle T. Barati2, Kathy J. Krentz3, C. Dustin Rubenstien3
1Department of Biochemistry & Molecular Genetics, University of Louisville 2Department of Medicine, Division of Nephrology & Hypertension, University of Louisville 3Biotechnology Center, University of Wisconsin-Madison
Rat MIER family member 3 (Mier3) is involved in spermatogenesis.
MIER family member 3 (MIER3) is moderately and highly expressed in seminiferous ducts and Leydig cells of testis tissue, respectively; however, nothing is known about MIER3 function in spermatogenesis or testosterone production. CRISPR/Cas9 was used to target Mier3 in Sprague Dawley (Hsd:SD) rats (Envigo). Three founders (F0) with unique targeted mutations (tms) resulting in premature stop codons were backcrossed to Hsd:SD to develop SD-Mier3tm strains. Rat Mier3tm had a negative effect on male fertility. Male Mier3tm/tm rats were infertile or sub-fertile. One out of 11 (9%) homozygous Mier3tm/tm males sired litters compared to 26 out of 27 (96%) heterozygous Mier3tm/+ males (P<0.0001). Adult Mier3tm/tm rats had significantly lower testis mass, testis midsagittal area, and sperm concentration (P values < 0.05). Histological analysis of testis tissue revealed a likely defect in spermatogenesis. The negative effect of Mier3tm on fertility appeared to be sex-limited, as homozygous SD-Mier3tm/tm females were fertile (litter bearing) and lactated (raised pups to weaning age). These Mier3tm rat strains are a resource to study causes and mechanisms of male factor infertility.
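The fertility comparison (1 of 11 Mier3tm/tm males versus 26 of 27 Mier3tm/+ males siring litters) can be checked with a short sketch. The abstract does not name the statistical test, so this assumes a two-sided Fisher's exact test on the 2x2 table; the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(k):
        # Hypergeometric probability of k first-column counts in row 1,
        # with all margins held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    low = max(0, col1 - row2)
    high = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is at least as unlikely as the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        prob(k) for k in range(low, high + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


# Rows: Mier3tm/tm and Mier3tm/+ males; columns: sired litters, did not.
p_value = fisher_exact_two_sided(1, 10, 26, 1)
print(f"{p_value:.2e}")
```

Under these assumptions the p-value falls well below 0.0001, consistent with the reported significance level.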
Wendy M. Demos1*, Kent C. Brodie2, Jeff L. De Pons1, Adam C. Gibson1, G. Thomas Hayman1, Mary L. Kaldunski1, Akhilanand Kundurthi1, Logan Lamers1, Stanley J.F. Laulederkind1, Jennifer R. Smith1, Jyothi Thota1, Marek A. Tutaj1, Monika Tutaj1, Mahima Vedi1, Shur-Jen Wang1, Stacy M. Zacher1, Melinda R. Dwinell1, Anne E. Kwitek1
1Rat Genome Database, Dept. of Physiology 2Clinical and Translational Science Institute 3Finance and Administration; Medical College of Wisconsin, Milwaukee, WI 53226 USA
Expansion of Rat Expression Data at the Rat Genome Database
The Rat Genome Database (RGD) is expanding expression data content and incorporating the data into the larger ecosystem of the RGD database so users can seamlessly query for coherent gene information across data portals. Researchers will be able to access expression data that was submitted to public resources such as the Gene Expression Omnibus (GEO) data repository, with all data converted to the TPM (transcripts per million) data type. In Phase One of the project, an Expression Curation Tool was developed to aid in automated as well as comprehensive manual curation of public datasets. The Expression Curation Tool relies on a pipeline that imports data from the GEO Accession Display and utilizes Natural Language Processing to match ontology terms to GEO accession attributes. Curators can enter missing terms, confirm the predicted term, or provide a more specific term when appropriate. Fields for descriptors such as tissue type, vertebrate trait, clinical measurement, strain, cell type, and experimental condition are built into the user interface. A term is entered a single time for a GEO Accession and propagated across all applicable samples with the option to edit on a per-sample basis. When the metadata is correct and as complete as possible, it is loaded into the appropriate tables in RGD’s relational database and the user-submitted expression values are loaded for all genes in the corresponding files. To date, 101 studies have been uploaded through the new Expression Curation Tool. Currently, RGD has imported 1,858 GEO accessions related to rat expression studies. Of those, 800 have been reviewed and prioritized for curation. Data types submitted to the repository represent a wide range of analysis outputs (e.g., FPKM, counts, log2FC). The declared reference assemblies in the reviewed GEO accessions include rat assemblies from rn4 through mRatBN7.2 as well as custom and non-rat references, making it difficult to correlate expression values across studies.
In Phase Two, RGD is developing and evaluating a pipeline to standardize expression data across studies by remapping the data to the most current rat reference genome assembly. The pipeline downloads and converts fastq files from the Sequence Read Archive to generate TPM data aligned to the most recent and complete genome assembly. This pipeline integrates quality control measures, alignment with the STAR aligner [1], and abundance estimation with the RSEM software package [2]. In Phase Three, expression data will include the current tabular-based views as well as updated graphical data visualizations at the gene and transcript levels.
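The TPM data type that the pipeline standardizes on can be illustrated with a minimal sketch of the standard formula. This is not the RGD pipeline itself, which relies on STAR and RSEM; tools such as RSEM use effective transcript lengths, whereas this sketch assumes plain lengths in base pairs. Function names and values are hypothetical.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert read counts to transcripts per million (TPM).

    Each count is divided by its feature length to get a per-base rate,
    and the rates are rescaled so that they sum to one million.
    """
    rates = [count / length for count, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [rate / total_rate * 1e6 for rate in rates]


def fpkm_to_tpm(fpkm_values):
    """Rescale FPKM values from one sample so that they sum to one million."""
    total = sum(fpkm_values)
    return [value / total * 1e6 for value in fpkm_values]


# Hypothetical counts and lengths for three genes in one sample.
tpm = counts_to_tpm([100, 200, 300], [1000, 2000, 1000])
print(tpm)
```

Because TPM values sum to the same total in every sample, they are easier to compare across studies than raw counts or FPKM, although remapping is still needed when the original studies used different reference assemblies.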
Trivett, C.1, Graham, D.1, McBride, M.W.1
1School of Cardiovascular and Metabolic Health, University of Glasgow, UK.
Quantifying the Early Cardiac Transcriptome in Rat Models of Left Ventricular Hypertrophy with or without Predisposition to Hypertension.
Increased left ventricular mass index (LVMI), an independent risk factor for cardiovascular morbidity and mortality, develops non-uniformly in hypertensive populations. An F2 cross from SHRSP and WKY strains identified a QTL on chromosome 14 (chr14) for LVMI. Two contrasting chr14 congenic strains were generated on both backgrounds. Cardiac phenotypes of WKY.SPGla14a and SP.WKYGla14a strains diverge from background strains, where both SHRSP and WKY.SPGla14a develop increased LVMI and cardiac fibrosis, despite divergent blood pressure profiles. RNA-sequencing is an unbiased assessment of the transcriptome which is influenced by the analysis pipeline. We utilised 3 pipelines of RNA-seq analysis followed by differential gene expression (DESeq2) to explore the cardiac transcriptome during early development (GD18.5) in WKY, SHRSP and chr14 congenic strains (n=3 per group). The role of alternative splicing was investigated using differential transcript usage (DTUrtle) analysis following transcript-level quantification. HISAT2 alignments followed by (a) FeatureCounts or (b) StringTie de novo assembly were compared with (c) Kallisto pseudo-alignment to the mRatBN7.2 genome. FeatureCounts identified the fewest genes (24,057). StringTie assembled over 90,000 transcripts of which 54,930 were annotated in the reference genome (sensitivity = 99.1%, precision = 54.8%), resulting in 30,552 annotated genes. Kallisto identified the most genes with 30,562 annotated genes quantified after pseudo-alignment. Compared to WKY, SHRSP and WKY.SPGla14a showed significantly different expression of over 800 genes (939 and 813 genes respectively, FDR < 0.05, logFC +/- 1). Ingenuity Pathway Analysis (IPA) showed differentially expressed genes were enriched for biological processes decreasing oxidative phosphorylation and increasing mitochondrial dysfunction in both WKY.SPGla14a and SHRSP vs WKY. Fewer changes were detected in the SP.WKYGla14a vs SHRSP (226 genes, FDR <0.05, logFC +/-1).
There were 1177 genes across all three comparisons which showed differential transcript usage, of which only 170 were shared in DGE analyses. Genes showing DTU were associated with phenotypes including increased blood pressure and abnormal heart/ventricle morphology in comparisons vs WKY strain. The chr14 congenic region significantly alters gene expression and transcript usage in genetic models of hypertension and cardiac remodelling in adulthood. During cardiac development in utero, future dysfunction is potentially primed by continual dysregulation of key genes regulating cardiac energy metabolism. Before the development of adverse phenotypes, genes associated with cardiac fibrosis and cardiac hypertrophy are differentially expressed between genetic models, where SHRSP genome confers greater risk. Alternative splicing may implicate an alternate set of genes with important roles in cardiac morphology and blood pressure control.
Andrew R Milner1*, Ashley C Johnson1,2, Esinam M Attipoe1, Lavanya Challagundla2, Michael R Garrett2,3
1Department of Experimental Therapeutics and Pharmacology 3Department of Cell and Molecular Biology, University of Mississippi Medical Center, Jackson, MS, USA
From genes to proteins: exploring nephron deficiency in the HSRA rat model
Objective: The HSRA (Heterogeneous Stock-derived model of Unilateral Renal Agenesis) rat serves as a unique model of unilateral renal agenesis. Approximately 50-75% of offspring in this model are born with a single kidney, referred to as HSRA-S. In addition to the failure of one kidney to develop, the remaining kidney in HSRA-S rats exhibits 20% fewer nephrons when compared to a kidney from their two-kidney siblings, labeled HSRA-C [1]. Such a deficit has been implicated in the heightened susceptibility of HSRA-S compared to uninephrectomized (HSRA-UNX) littermates to develop conditions like hypertension and chronic kidney disease (CKD) [2]. Our study seeks to uncover the molecular bases of this nephron deficit by contrasting transcriptomic and proteomic profiles of the two cohorts at 4 weeks of age. Methods: We isolated kidney tissue from both HSRA phenotypes. For transcriptomic analysis, we employed single nuclei RNA-sequencing (snRNAseq) on three male rats from each phenotype. In parallel, discovery proteomics (LC-MS/MS) was undertaken on a larger cohort, comprising nine HSRA-S and twelve HSRA-C rats. Results: The snRNAseq approach led to the identification of distinct clusters representing mature kidney-specific cell types (Figure 1a). Within these clusters, we observed differentially expressed genes, many of which have prior associations with kidney developmental and functional pathways. On the proteomic side, our data revealed significant alterations in expression of 366 proteins based on an unadjusted p-value criterion (pval < 0.055). However, a stringent filter using |logFC| > 1 narrowed this list to 23 proteins showing pronounced variations (Figure 1b). An interesting observation was the limited concordance between our transcriptomic and proteomic datasets.
Conclusion: While our transcriptomic findings in HSRA-S rats provide crucial molecular insights into the underlying causes of nephron deficits, the discordance with our proteomic data may suggest a complex relationship. This may involve post-transcriptional modifications, differential protein turnover, or other regulatory mechanisms in the HSRA-S phenotype. Figure 1 (A). UMAP from 4wk HSRA snRNAseq. Clusters of mature kidney cell-types were identified along with DEGs related to kidney development and function. (B) Volcano plot of HSRA-C vs HSRA-S discovery proteomics differential analysis. Highlighted proteins are highly significant (p<0.055) with |logFC| > 1.
Robyn L Ball*, Hongping Liang, Molly A Bogue, Vivek M Philip, Elissa J Chesler
The Jackson Laboratory, Bar Harbor, Maine 04609, USA
*robyn.ball@jax.org
GenomeMUSter: A comprehensive mouse variation analytical resource for complex trait analysis
Over the past few decades, hundreds of mouse strains have been genotyped at varying densities. More recently, a subset of strains, including the Collaborative Cross, BXDs, and classical inbred strains have undergone whole genome sequencing, resulting in more dense variant data across the genome. Together, these data could allow for multiple advanced applications in complex trait analysis, such as GWAS meta-analysis, over new and archival sets of complex disease trait data across tremendous genetic diversity. However, these analyses are not readily possible due to the sparsity of the genotype data available and the state of existing data resources, which are not easily accessible or readily combined for use in analytical pipelines. To address these limitations, we created a web- and programmatically-accessible data service called GenomeMUSter (https://muster.jax.org) comprised of uniformly dense allelic state data for 657 inbred mouse strains at 106.8M segregating sites. We merged and harmonized 16 variant datasets and imputed missing genotypes using the Viterbi algorithm with a data-driven approach that incorporates local phylogenetic information. The median imputation accuracy on held-out test sets was 0.944 with interquartile range [0.923, 0.991]. We evaluated the utility of GenomeMUSter for genetic discovery of complex traits using GWAS meta-analysis on phenotypic endpoints related to Type 2 Diabetes and addiction. The results indicate that mouse multi-trait meta-analyses not only produce disease-relevant information but also facilitate the characterization of their role in disease. For example, our analysis of genes related to multi-substance use in humans reveals that the GWAS candidate gene Pde4b likely plays a role in withdrawal-mediated response to the drug, and it could represent a broad mechanism of substance use disorder across drug classes.
Acknowledgements: Funding provided by NIH DA037927, DA028420, P50 DA039841 and by The Jackson Laboratory, The Cube Initiative Program Fund. David G. Ashbrook, Lu Lu, Robert W. Williams and the BXD sequencing effort were supported by the UT Center for Integrative and Translational Genomics and funds from the UT-ORNL Governor’s Chair. Gary Peltz and Zhuoqing Fang were supported by NIH/NIDA (1 U01 DA044399-01), awarded to Dr. Peltz. We gratefully acknowledge Gary A Churchill, Zhuoqing Fang, Gary Peltz, Lu Lu, Robert W Williams, and David G Ashbrook for providing variant datasets and assisting with their incorporation into GenomeMUSter; Anna Lamoureux, John Bluis, Matthew Kim, Alexander K Berger, Sejal Desai, Beth A Sundberg, and David O Walton for the user interface design and development; Baha El Kassaby, Francisco Castellanos, Govind Kunde-Ramamoorthy, and Carol J Bult for variant accessioning and annotations; Anuj Srivastava, Matthew W Gerring, Hao He, Keith Sheppard, and Jake Emerson for data analysis and services; Alexander S Hatoum and Arpana Agrawal for their scientific expertise and data for the multi-substance use meta-analysis; Lisa Tarantino, J David Jentsch, Jane S Adams, and members of the Computational Sciences Service at The Jackson Laboratory supported by the JAX Cancer Center Support Grant (P30 CA034196) for expert assistance with the work described in this presentation.
Anna L Tyler1*, J Matthew Mahoney1*, Candice Baker1, Isabela Gerdes Gyuricza1, Mark Keller2, Alan D Attie2, Gary A Churchill1, Gregory W Carter1
1The Jackson Laboratory, Bar Harbor, ME 2University of Wisconsin, Madison, WI
* Equal contributions
gregory.carter@jax.org
High-dimensional mediation analysis identifies heritable transcriptomic signatures associated with metabolic traits in diversity outbred mice
Tissue-derived gene expression is increasingly viewed as a bridge between genotype and phenotype. Although genome-wide association studies (GWAS) identify specific genomic locations associated with disease risk, most GWAS hits are located in gene regulatory regions, and the molecular function of these variants is difficult to ascertain. Alignment of GWAS with expression quantitative trait loci (eQTL) studies has identified co-regulators of gene expression and disease pathology thereby improving annotation of the role of GWAS variants in disease. However, human studies of this nature are limited in multiple ways. First, GWAS and gene expression data are typically measured in different populations. Thus, the links between genotype, gene expression, and phenotype are inferred only indirectly, and population structure and variable environmental histories further limit power of such studies. In addition, high-quality tissue samples have limited availability in living human subjects. Outbred animal models, alternatively, provide an ideal system in which to investigate the path from genotype to phenotype through gene expression. Here we describe a multi-tissue analysis of metabolic traits in outbred laboratory mice. We performed comprehensive clinical phenotyping of metabolic traits in 371 diversity outbred mice maintained on a high-fat, high-sugar diet. From the same animals we collected transcriptomes from four tissues: white adipose, pancreatic islet, liver, and skeletal muscle. All mice were genotyped using the GigaMUGA. We used high-dimensional mediation analysis (HDMA) to link genetics, gene expression, and metabolic traits measured in this population. Through this systems-level analysis, we identified heritable transcriptomic signatures that mediated the effect of genotype on metabolic traits thereby linking genetic variation to molecular pathways underlying disease risk.
Boston W. Simmons1,2, Laura M. Sipe3, Sydney C. Joseph1,2, Casey J Bohl4, Samson Eugin Simon1,2, Sandesh J. Marathe1,2, David G. Ashbrook2,4, D. Neil Hayes1,2,4, Lu Lu4, Robert W. Williams2,4, Liza Makowski1,2
1Division of Hematology and Oncology, Department of Medicine, College of Medicine, University of Tennessee Health Science Center (UTHSC); Memphis, TN 38163, USA 2UTHSC Center for Cancer Research, College of Medicine, UTHSC; Memphis, TN 38163, USA 3Department of Biology, University of Mary Washington; Fredericksburg, Virginia 22401, USA 4Department of Genetics, Genomics, and Informatics, UTHSC; Memphis, TN 38163, USA
Novel pre-clinical recombinant inbred model to identify genetic modifiers of breast cancer
The lack of understanding of how genetic variants affect the molecular mechanisms that mediate breast cancer (BC) aggression poses a substantial obstacle to advancement in the clinic. Current genetically engineered mouse models (GEMMs) of BC lack genetic complexity because mice are on a single inbred background, which impairs rigorous investigation of how individual genetic variation affects tumor initiation, progression, or response to therapy. Because of this limitation, pre-clinical models typically fail to translate well to patient care. We have thus pioneered a transformative approach with the creation of a novel murine model with robust, reliable, and reproducible phenotypic and genomic variation. We systematically crossed the C3(1)-Tantigen (C3Tag) GEMM, which resembles human basal-like BC, into the “BXD” family - the largest and best characterized genetic reference population. We hypothesize that the interaction of modifier and causal genes governs the heterogeneity of BC phenotypes. We aim to study BXD-BC F1 isogenic hybrids, which display greatly differing severity of BC phenotypes, indicating that genetic modifiers indeed impact disease. The advantage of the BXD-BC F1 hybrids is that every genome is defined and reproducible. Using cutting edge systems genetics, the GeneNetwork database, and molecular candidate validation, we will identify candidates. Cross-species comparisons with publicly available human GWAS and genomic databases will identify conserved, biologically relevant, and targetable candidates to yield highly impactful and readily translatable findings. This novel model will contribute to significant advances in understanding risk and improving outcomes for breast cancer.
Manshi Zhou1*, Delyth Graham1, Martin W. McBride1†
1University of Glasgow, Glasgow, Scotland, G20 6HQ, UK
*m.zhou.2@research.gla.ac.uk
†martin.mcbride@glasgow.ac.uk
Renal multi-omics analysis of UMOD knockout mouse in response to salt
Introduction: Uromodulin (UMOD), also known as Tamm-Horsfall protein (THP), was previously identified in our human GWAS as a candidate gene for hypertension. We validated its involvement in blood pressure regulation by comparing UMOD knockout (KO) and wildtype control (WT) mice in response to normal (NS) and salt-loading (HS) drinking water for 6 weeks. Knockout mouse blood pressure was significantly lower at baseline and insensitive to salt-loading.
Method: To characterize differences in normal (WTNS and KONS) and salt-loading (WTHS and KOHS) blood pressure responses, a whole kidney of each mouse (n=3 per group) was ground under liquid nitrogen to extract total RNA and protein for bulk RNAseq and 16-plex TMT proteomics, respectively. RNAseq alignment, quality control and differential expression analysis were completed on the Galaxy server. Peptide alignment, protein interpretation, quality control and quantification were performed in Proteome Discoverer. We used Ingenuity Pathway Analysis (IPA) to identify biological pathways from RNAseq (p.adj < 0.05) and proteome (Log Fold-change (FC) = ±1.2) data, and RT-qPCR to validate results.
Result: 4355 genes were differentially expressed in the knockout salt-comparison (KOHS vs KONS), whereas 696 were differentially expressed in wildtype (WTHS vs WTNS). Unfolded protein response (UPR) is the most significantly over-represented and regulated canonical pathway in KOHS vs KONS [p-adj.= 1.45E-11, z-score= -3.8]; NRF2-oxidative stress response is also over-represented [p-adj.= 7.14E-6, z-score= -2.8]. Using TaqMan we assessed three significant genes in UPR: Dnajb1 [wildtype p=0.037, fold-change (FC)=2.00; knockout p=0.253, FC=8.61], Hspa1a [wildtype p=0.019, FC=10.63; knockout p=0.086, FC=172.02], and Hspa5 [wildtype p=0.242, FC=1.44; knockout p=0.027, FC=3.74]. Disease and function analysis predicts that salt-loading upregulates transport of molecules in WT but downregulates it in KO.
465 proteins were differentially expressed in the knockout salt-comparison, whereas 286 were differentially expressed in wildtype. Oxidative phosphorylation and mitochondrial dysfunction are among the most significantly over-represented canonical pathways in both salt-comparisons but are regulated in opposite directions [WTHS vs WTNS: p-adj.= 3.79E-18, z-score= -4.5 and p-adj.= 2.15E-19, z-score= 3.2, respectively; KOHS vs KONS: p-adj.= 5.16E-12, z-score= 1.4 and p-adj.= 2.84E-17, z-score= -1.9, respectively].
Conclusion: Consistent with the blood pressure phenotype data, RNAseq suggests a potential function for uromodulin in modulating ion transport, protein trafficking and stress response, whereas the proteome data point to differences in energy production, specifically oxidative phosphorylation and mitochondrial dysfunction. These results suggest that uromodulin is a multifunctional protein contributing to intracellular and intercellular transport and signaling.
Gary A. Churchill, Ph.D.
The Jackson Laboratory, Bar Harbor, ME
Genetic Integration of Multi-Omics Data: Realizing the Promise of Genetical Genomics
It is now more than 20 years since Jansen and Nap (2001) proposed ‘a merger of genomics and genetics’ as a new approach for ‘unraveling of metabolic, regulatory and development pathways.’ In the intervening time, advances in our ability to quantify the detailed molecular composition of biological samples combined with the development of powerful model organism genetic resources have changed ‘genetical genomics’ from a visionary aspiration to a routine practice. This talk will briefly review some of the key advances that made this possible, and present examples of multi-omics integration that leverage genetic variation to establish causal links across chromatin structure, gene expression, protein abundance, post-translational modifications, metabolites, and cell-based and whole organism phenotypes.
Karl W. Broman
Biostatistics & Medical Informatics, University of Wisconsin-Madison, USA
Data cleaning principles
Data cleaning is an important prerequisite to good data analysis, but the topic is seldom discussed or included in graduate training. Why don't we teach data cleaning? It has been said that it is difficult to generalize: that what we learn from cleaning Medicare data cannot be readily applied to the cleaning of RNA-seq data. To the contrary, I think there are important general principles for cleaning data, and there are more commonalities in the creative process of data cleaning than in other aspects of data analysis. I will seek to delineate and illustrate a set of data cleaning principles.
Montana Kay Lara1, Apurva Chitre1, Denghui Chen1, Khai-Minh Nguyen1, Katarina Cohen1, Shae Zeigler1, Angela Beeson2, Thiago Sanches1, Leah Solberg Woods2, Oksana Polesskaya1, Abraham A Palmer1,3, Suzanne H Mitchell4
1Department of Psychiatry, University of California San Diego, La Jolla, CA, 92093, USA 2Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA 3Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, 92093, USA 4Departments of Behavioral Neuroscience, Psychiatry, the Oregon Institute of Occupational Health Sciences, Oregon Health & Science University, Portland, OR, 97239 USA
Genome-wide association study of delay discounting in Heterogeneous Stock rats
Delay discounting (DD) refers to the behavioral tendency to devalue rewards as a function of their delay in receipt. Increased DD has been associated with substance use disorders (SUD), as well as multiple psychopathologies that often co-occur with SUDs. Genetic studies in humans and animal models have established that there is an underlying genetic component to DD, but few genes have been associated with increased DD to date. Here, we aimed to identify novel genetic loci associated with DD through a genome-wide association study (GWAS) using Heterogeneous Stock (HS) rats, an outbred population derived from eight inbred founder strains. DD was tested in 650 HS rats through an adjusting amount procedure that gave animals a choice between smaller immediate sucrose rewards and larger rewards at variable delays. Delay curves were plotted for each animal, and both exponential and hyperbolic functions were used to fit the curves. Area under the curve (AUC) and the discounting parameter k of both functions were used as DD measures. GWAS for AUC and exponential k identified significant loci on chromosome 20 that were in strong linkage disequilibrium with one another and mapped to the gene Slc35f1, which encodes a member of the solute carrier family of nucleotide sugar transporters. SLC35F1 gene mutations in humans are linked to pediatric neurodevelopmental and epileptic disorders, and the gene has been associated with educational attainment in a human GWAS. Overall, the neurodevelopmental implications and our GWAS identification suggest further exploration of Slc35f1 in the context of DD. In conclusion, by leveraging the genetic and phenotypic diversity of HS rats, we performed GWAS for DD and identified a novel gene associated with DD measures in HS rats.
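As an illustration of the DD measures named in this abstract, the sketch below uses the conventional textbook forms of the two discounting functions, hyperbolic V = A / (1 + kD) and exponential V = A·exp(-kD), where V is the subjective value of amount A at delay D. It fits k by a simple grid search and computes a normalized trapezoidal AUC. The delays, indifference points, grid, and function names are hypothetical and chosen only for illustration; they are not the study's data or analysis code, and the authors' fitting procedure may differ.

```python
import math

def hyperbolic(delay, k, amount=1.0):
    """Hyperbolic discounting: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)

def exponential(delay, k, amount=1.0):
    """Exponential discounting: V = A * exp(-k * D)."""
    return amount * math.exp(-k * delay)

def fit_k(delays, values, model, k_grid):
    """Return the k in k_grid that minimizes the sum of squared errors."""
    def sse(k):
        return sum((model(d, k) - v) ** 2 for d, v in zip(delays, values))
    return min(k_grid, key=sse)

def normalized_auc(delays, values):
    """Trapezoidal area under the delay curve.

    Delays are rescaled to [0, 1] by the longest delay; values are assumed
    to be expressed as a fraction of the delayed reward already.
    """
    d_max = max(delays)
    xs = [d / d_max for d in delays]
    area = 0.0
    for i in range(1, len(xs)):
        area += (xs[i] - xs[i - 1]) * (values[i] + values[i - 1]) / 2.0
    return area

# Hypothetical indifference points for one animal (fraction of the delayed
# reward), generated from a hyperbolic curve with k = 0.25.
delays = [0, 2, 4, 8, 16]
values = [hyperbolic(d, 0.25) for d in delays]

k_grid = [i / 1000 for i in range(1, 2001)]  # assumed search range 0.001..2.0
k_hyp = fit_k(delays, values, hyperbolic, k_grid)
k_exp = fit_k(delays, values, exponential, k_grid)
auc = normalized_auc(delays, values)

print(f"hyperbolic k = {k_hyp:.3f}, exponential k = {k_exp:.3f}, AUC = {auc:.3f}")
```

A larger k (or a smaller AUC) corresponds to steeper discounting. AUC has the practical advantage of not assuming either functional form.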
Suheeta Roy1*, David G. Ashbrook1, Khyobeni Mozhui2, Amandeep Bajwa3, Casey J. Chapman1, Melinda S. McCarty1, Arthur G. Centeno1, Amelia Lalou6, Michael R. MacArthur4, Sarah J. Mitchell4, Danny Arends7, Pjotr Prins1, Saunak Sen2, Collin Ewald5, Johan Auwerx6, Lu Lu1, Robert W. Williams1, Evan G. Williams8
1Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA 2Department of Preventive Medicine, University of Tennessee Health Science Center; Memphis, TN, USA 3Department of Surgery, University of Tennessee Health Science Center, Memphis, TN, USA 4Princeton University, Princeton, NJ, USA 5Department of Health Sciences and Technology, Institute of Translational Medicine Eidgenössische Technische Hochschule (ETH) Zürich, CH-8603 Zürich, Switzerland 6Laboratory of Integrative Systems Physiology, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne CH-1015, Switzerland 7Department of Applied Sciences, Northumbria University, Newcastle upon Tyne, UK 8Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
Charting the genetic loci linked to lifespan and variable response to high fat diet in the BXD family
We recently quantified variation in lifespan across two large cohorts of highly diverse BXD females on two diets (CD, 6% calories from fat versus HFD, 60% calories from fat) (Roy et al. 2021; Williams et al. 2022; Mozhui et al., 2022). On average, HFD shortens mean lifespan by ~80 days, equivalent roughly to a 7-year loss in humans. However, the anti-longevity effect of HFD is not universal, and there is remarkable diversity in response, with some strains gaining lifespan on the HFD. Our goal now is to chart the genetic loci and pathways that contribute to the variation in lifespan under the two dietary conditions using new high density WGS-based genotypes (Ashbrook et al., 2022). For the present work, we have expanded the dataset, which now includes up to 100 isogenic strains per diet with ~90 BXD strains and 11 BXDF1 hybrids.
Methods: We measured lifespans of ~10 replicates of each strain on both diets. Mean lifespan was mapped using linear mixed models that correct for family substructure, variance differences, and cofactors such as early body weight and diet (GeneNetwork, BXD_18441 and BXD_18435).
Results: We detect a complex of three partially dissociable loci on proximal Chr 1 extending from the centromere to ~45 Mb (Vita1a, Vita1b, and Vita1c), all of which contribute to a significant gain of 24 to 55 days for individuals that inherit the D haplotype, but with much larger effects on the HFD. Peaks are centered at ~5, 23, and 38 Mb and have peak linkage scores (–logP) of 1.6 (CD) and 4.5 (HFD) at Vita1a, of 1.8 (CD) and 5.2 (HFD) at Vita1b, and of 1.7 (CD) and 4.3 (HFD) at Vita1c. The strongest candidate gene within Vita1b is Sdhaf4, which encodes a protein involved in the assembly of mitochondrial complex II (CII) that protects against free radical damage in the mitochondrial matrix. CII function declines with age, making it a potential target for interventions against aging (Adlimoghaddam et al. 2022). Knockdown of the homolog of this gene in C. elegans increases lifespan significantly. We detect a second locus on distal Chr 10 between 121–126 Mb (Vita10b) on both diets, but again with a stronger effect on the HFD (linkage scores of 2.4 vs 4.6). On both diets the B allele contributes to a gain of 33 to 47 days. Finally, we mapped a locus on Chr 2 from 134 to 158 Mb (Vita2b) with a –logP of 3.0 to 4.0 on CD but under 2.9 on HFD. The B haplotype is associated with a 35-to-41-day gain of lifespan.
Conclusions: We detect GxDiet effects at all five Vita loci. In most cases, high fat diet strengthens the weak linkage detected on chow diet, suggesting that these loci confer some resilience to the negative effects of HFD. In contrast, the Vita2a locus is weakened slightly by the HFD. In no instance are effect polarities of Vita loci reversed as a function of diet. Of the current loci, we are restricting efforts to define causal genes and variants to the Vita1 cluster with a focus on Sdhaf4 as a candidate gene for Vita1b.
Ellen L. Risemberg1,2, Johanna M. Smeekens, PhD3, Marta C. Cruz Cisneros, BS2,4, Brea K. Hampton, PhD2,4, Pablo Hock, BS2, Colton L. Linnertz, MS2, Darla R. Miller, BS2, Kelly Orgel, MD, PhD3, Ginger D. Shaw, BS2,5, Fernando Pardo Manuel de Villena, PhD2,5, A. Wesley Burks, MD3, William Valdar, PhD2,5, Michael D. Kulis, PhD3, Martin T. Ferris, PhD2
1Curriculum in Bioinformatics and Computational Biology, UNC Chapel Hill 2Department of Genetics, UNC Chapel Hill 3Department of Pediatrics, Division of Allergy and Immunology, UNC Chapel Hill 4Curriculum in Genetics and Molecular Biology, UNC Chapel Hill 5Lineberger Comprehensive Cancer Center, UNC Chapel Hill
A mutation in Themis contributes to peanut-induced oral anaphylaxis in CC027 mice
Peanut allergy is a potentially life-threatening disease present in at least 1% of the US population. While significant progress has been made in food allergy research over the last decade, including the development of therapies and diagnostics, critical knowledge gaps exist surrounding the cause of peanut allergy. To dissect genetic causes of susceptibility to allergy, we recently identified an improved animal model of peanut allergy in the Collaborative Cross strain CC027/GeniUnc (CC027). CC027 is genetically susceptible to peanut allergy and improves on earlier models in that it can be sensitized in the absence of adjuvants such as cholera toxin and is prone to reaction following oral challenge. Earlier models, such as C3H/HeJ (C3H), require sensitization with cholera toxin and react only to challenge via intraperitoneal injection. We performed a backcross between CC027 and C3H to identify the genetic architecture underlying this more human physiology-relevant susceptibility to peanut allergy. Here we report results from quantitative trait loci (QTL) mapping on peanut allergy response phenotypes, including three major QTL associated with anaphylaxis severity following peanut exposure. We were able to narrow one of these loci down to a mutation private to CC027 in the T-cell developmental gene Themis. Follow-up experiments confirmed that CC027 exhibits several deficiencies in its T-cell compartment relative to other strains with the same haplotype at Themis. Altogether, our results point to defects in early T-cell development and maturation as a driver of food allergy development.
P. LEMEN, H. CHEN, J. HUANG
Adolescent Social Isolation Increases Vulnerability to Voluntary Opioid Consumption in Adulthood in Rats
Social stress during adolescence can cause behavioral changes lasting into adulthood and is a risk factor for substance use disorder, but the effect varies between individuals. This study characterizes how social isolation in adolescence affects opioid use and anxiety-like behavior in adulthood using the inbred rat strains WKY and DSS. We compare adulthood oxycodone intake in self-administration and behavior in an elevated plus maze (EPM) between rats either group housed (GH) or isolated for 6 weeks during adolescence. We also develop a method (PeerPub) for operant oral intake by two rats in the same chamber to better model the human social condition. Our data show that rats isolated during adolescence (n = 12/group) have higher vulnerability to oxycodone consumption in adulthood (WKY females P=0.006, WKY males P=0.01, DSS females P=0.02, DSS males P=0.05). We also found differences in anxiety-like behavior between experimental phases (baseline, post-drug, and withdrawal). Overall, our data indicate that rats isolated during adolescence show less anxiety-like behavior before oxycodone exposure and decreased sensitivity to the negative effects of oxycodone, yet they consume more drug. These data demonstrate a need for better understanding of the role social environments play in vulnerability to drug use. We plan to examine underlying molecular mechanisms associated with these phenotypes in future studies.
Liza Makowski1,2, Samson Eugin Simon1,2, Laura M. Sipe3, Boston W. Simmons1,2, Sydney C. Joseph1,2, Casey J Bohl4, Sandesh J. Marathe1,2, Jeremiah R. Holt2,4, Sidharth S. Mahajan1,2, D. Neil Hayes1,2,4, Lu Lu4, Robert W. Williams2,4, David G. Ashbrook2,4
1Division of Hematology and Oncology, Department of Medicine, College of Medicine, University of Tennessee Health Science Center (UTHSC); Memphis, TN 38163, USA 2UTHSC Center for Cancer Research, College of Medicine, UTHSC; Memphis, TN 38163, USA 3Department of Biology, University of Mary Washington; Fredericksburg, Virginia 22401, USA 4Department of Genetics, Genomics, and Informatics, UTHSC; Memphis, TN 38163, USA
Determining novel gene candidates in breast cancer using a unique pre-clinical model
Breast cancer (BC) is the most common cancer and the second leading cause of cancer death in US women. Human studies have identified risk factors for developing BC with both environmental and genetic approaches, yet these studies often fall short due to the inability to control variables or sample enough individuals. We posit that genetic variants may be discovered that impact molecular mechanisms driving BC, which could be targeted to overcome therapeutic limitations or to identify biomarkers of risk or response to therapy. To best examine the effects of genetic variants on BC traits, we have pioneered a transformative approach with the creation of a novel murine model with robust, reliable, and reproducible phenotypic and genomic variation. The FVB C3(1)-Tantigen (“C3Tag”) genetically engineered mouse model (GEMM) develops spontaneous triple negative BC (TNBC), well established to resemble human basal-like TNBC, an aggressive subtype with few clinical approaches. TNBC tumors lack estrogen receptor, progesterone receptor, and HER2, which are the 3 receptors typically targeted in the clinic, and this leads to poor survival in these patients. Thus, to model human heterogeneity in BC outcomes, we systematically crossed the C3Tag GEMM into the BXD family - the largest and best characterized genetic reference population. The new model is termed “BXD-BC” and these hybrids have genomes that are reproducible. We hypothesized that BC phenotypic variation will be due to interactions of modifier and causal genes that drive different responses to tumor onset, progression, or exposures such as therapy or diet. Early results demonstrate that BXD-BC F1 isogenic hybrids have greatly differing severity of BC phenotypes which are significantly heritable, indicating genetic modifiers that impact disease. Thus, BXD-TNBC strains modulate cancer susceptibility and progression of the parent C3Tag GEMM.
Using cutting edge systems genetics and molecular candidate validation, we have identified genetic modifiers of BC. Cross-species comparisons with publicly available human GWAS and genomic databases will identify conserved, biologically relevant, and targetable candidates to yield highly impactful and readily translatable findings. Ongoing work aims to test Gene x Environment (GxE) interactions, including response to therapy or obesogenic diets, which will identify genetic modifiers of resistance or susceptibility to exposures. The lack of targeted therapies for TNBC presents a great unmet patient need. The deliverables of this novel BXD-BC model will define susceptibility loci, candidate genes, and molecular networks that underlie variation of multiple BC phenotypes. Results generated will thus be transformative with high impact, leading to the identification of genes modifying heterogeneity and networks underlying individual differences in BC.
Camron D Bryant1, William B Lynch1,2, Sophia A Miracle1,2, Stanley I Goldstein1,3, Kelly K. Wingfield1,3, Ida Kazerani1, Gabriel A Saavedra1, Rhea Bhandari1, Ava Farnan1, Binh-Minh Nguyen1, Ethan T Gerhardt1, Britahny M. Baskin1, Jacob A Beierle1, Andrew Emili4
1Laboratory of Addiction Genetics, Department of Pharmaceutical Sciences, Center for Drug Discovery, Northeastern University, Boston, MA, 02115, USA 2Graduate Program for Neuroscience, Boston University, Boston, MA, 02118, USA 3Biomolecular Pharmacology PhD Program, Department of Pharmacology, Physiology & Biophysics, Boston University Chobanian & Avedisian School of Medicine, Boston, MA 02118 USA 4Knight Cancer Institute, Oregon Health & Science University, Portland, OR 97201 USA
Oxycodone addiction model behaviors following brain versus liver viral Zhx2 overexpression and following Zhx2 knockout in BALB/c substrains
Opioid Use Disorder maintains epidemic proportions in the U.S., with current pharmacological treatments limited to opioid substitution therapy. Sensitivity to the subjective and physiological responses to opioids has a genetic component that could influence addiction liability. We previously mapped Zhx2 as a candidate gene underlying increased oxycodone (OXY) metabolite brain concentration in BALB/cJ (J) vs. BALB/cByJ (By) females, which we hypothesize could enhance state-dependent learning and recall of OXY-induced conditioned place preference (CPP) in J vs. By females. We tested this hypothesis via selective Zhx2 AAV overexpression in the liver and brain of Zhx2-deficient J mice. Following liver Zhx2 overexpression, J females showed a decrease in acute OXY locomotor activity and an increase in state-dependent OXY reward learning. In contrast, liver Zhx2 overexpression in males induced a decrease in OXY locomotion only following repeated administration, with no significant effect on state-dependent reward learning. To assess the role of Zhx2 in brain oxycodone metabolism and to pinpoint the location and cell types containing Zhx2, we next overexpressed Zhx2 in the brain via ICV AAV injections in J females. Brain Zhx2 overexpression in J females resulted in a significant decrease in state-dependent OXY-CPP and OXY locomotion. We are currently validating and localizing brain viral Zhx2 expression and assessing OXY metabolite levels. Preliminary results indicate viral spread to reward-relevant regions such as the striatum and colocalization of Zhx2 with oligodendrocytes. We also performed the reciprocal experiment and knocked out Zhx2 in wild-type By mice. While KOs of both sexes display heightened basal locomotion, repeated OXY exposure only exacerbated this locomotor difference in females. Furthermore, only KO females displayed heightened state-dependent locomotion, while only KO males displayed heightened state-dependent environmental preference.
Whole brain proteomic analysis in Zhx2 knockouts revealed differential expression of proteins associated with fatty acid metabolism, structural organization, and immune functions. Notably, there was a robust change in expression of two aldehyde dehydrogenase proteins (ALDH1A1 and ALDH1L1) known to contribute to metabolism of alcohol and other substances. Additionally, there was a robust upregulation of discoidin domain containing receptor 2 (DDR2), a collagen receptor known to influence the extracellular matrix that is associated with nicotine withdrawal and alcohol consumption. Overall, targeted and constitutive manipulation of Zhx2 expression and function modulated OXY addiction model traits. We are currently assessing the cell types and molecular mechanisms that link Zhx2 with OXY metabolism and behavior.
Sandesh J. Marathe1,2, Boston W. Simmons1,2, Laura M. Sipe3, Sydney C. Joseph1,2, Casey J Bohl4, Samson Eugin Simon1,2, David G. Ashbrook2,4, D. Neil Hayes1,2,4, Lu Lu4, Robert W. Williams2,4, Liza Makowski1,2,4
1Division of Hematology and Oncology, Department of Medicine, College of Medicine, University of Tennessee Health Science Center (UTHSC); Memphis, TN 38163, USA; 2UTHSC Center for Cancer Research, College of Medicine, UTHSC; Memphis, TN 38163, USA; 3Department of Biology, University of Mary Washington; Fredericksburg, Virginia 22401, USA; 4Department of Genetics, Genomics, and Informatics, UTHSC; Memphis, TN 38163, USA
High fat diet induced obesity and identification of genetic modifiers of breast cancer using novel recombinant inbred strains
The discovery of genetic and environmental modifiers using pre-clinical models can be leveraged to further understand breast cancer risk and outcomes to maximally improve patient care. Utilizing the BXD murine genetic reference population, which comprises multiple recombinant inbred strains derived from crossing C57BL/6 “B” with DBA/2J “D”, we have begun to examine impacts on obesity and breast cancer risk, progression, and response to therapy. To this end, we have created a novel model of breast cancer (BC), generated by systematically crossing a spontaneous genetic model of BC into the BXD family, which we term BXD-BC. We hypothesize that we can build upon decades of BXD work to discover and validate genetic modifiers of tumor aggression and response to environmental exposures such as obesogenic diets. The BXD-BC hybrids demonstrate significant, heritable variation in tumor phenotypes on chow diet. Using systems genetics, we have begun to examine tumor phenotypes (latency, multiplicity, survival) in association with pre-existing reports on mediators that impact cancer outcomes. Using our previously published heritable variation in response to an obesogenic high fat diet resembling the Western or American diet compared to the Mediterranean diet, we aim to determine genetic mediators that impact cancer and obesity risk factors. Preliminary findings show that metabolites and microbes known to be impacted by obesity are associated with tumor phenotype and point to QTLs of interest. To our knowledge, this is the first study to explore modifier genes for BC phenotypes using a systems genetics approach in a genetically engineered mouse model (GEMM). In sum, the generation of this reliable, reproducible, and robust pre-clinical resource will enable the discovery of genetic and environmental modifiers, which will be leveraged to further understand BC to maximally improve patient outcomes.
Samuel J. Widmayer*, Lydia K. Wooldridge, Michael Saul, Laura Reinholdt, Beth L. Dumont, Daniel M. Gatti
The Jackson Laboratory, Bar Harbor, ME, 04609, USA
*samuel.widmayer@jax.org
Haplotype reconstruction using low-pass whole-genome sequencing in genetically diverse mouse populations
The Diversity Outbred mouse population (DO) is a premier resource for powerful quantitative trait locus mapping and systems genetics. However, access and cost remain a barrier to entry for experiments using DO animals. Traditionally, DO mice are genotyped using the Giga Mouse Universal Genotyping Array (GigaMUGA), which contains roughly 140,000 markers that discriminate the founder strains of the DO. The cost to genotype a mouse using this array is roughly $100. Additionally, between DO generations 21 and 36, approximately 12.2 new crossovers were observed each generation, which is roughly half the expected rate. This divergence suggests that the traditional approach to genotyping DO animals lacks the required resolution to capture all recombination events in the DO, reducing its utility for genetic mapping. Other groups have used either low-coverage or reduced representation sequencing to genotype mouse populations for genetic mapping1,2. We describe an accurate, cost-effective workflow for genotyping DO animals using whole-genome sequencing approaches. We prepared 96 samples using two different sequencing library preparations: double‐digest restriction site–associated DNA sequencing (ddRADseq) and low-coverage whole genome sequencing (lcWGS). These samples were split between two populations: 48 generation 41 DO animals and 48 animals derived from an advanced intercross between four wild-derived strains (4WC) of Mus musculus musculus and M. m. castaneus. We constructed genetic reference panels for each population and leveraged an existing genotype imputation software, QUILT3, to impute SNPs for each sample. We then derived chromosome-specific genotype and allele probabilities from a filtered subset of these imputed SNPs using R/qtl2. We implemented this workflow in an open-source Nextflow pipeline, quilt-nf. 
Low-pass sequencing-based founder allele probabilities were over 90% concordant with GigaMUGA-derived allele probabilities on average in both populations and using both library preparation methods. We obtained comparable concordance even after downsampling the alignments derived from lcWGS experiments to 0.05X coverage, reducing the cost of genotyping by at least 75% compared to the GigaMUGA. The observed number of crossovers also closely approximates the expected frequency, suggesting that our workflow may improve the accuracy of haplotype reconstruction in complex mouse crosses. Hundreds of small loci (less than 0.1 Mb) remain discordant from GigaMUGA, and ongoing work seeks to leverage recently published structural variant datasets and regions of identity-by-descent to characterize sources of noise in haplotype reconstructions. We aim to lower barriers to conducting DO mouse experiments by reducing genotyping costs and the computational expertise associated with obtaining accurate haplotype reconstructions.
Tasfia Chowdhury1, David Ashbrook1, Kristin Hamre1, Daniel Goldowitz2
1University of Tennessee Health Science Center, Memphis, TN, 38163, USA 2University of British Columbia, Vancouver, BC V6T 1Z4, Canada
Determining Significant Polymorphisms in the Choline Metabolic Pathway in the Liver of BXD Mice for Greater Efficacy in the Treatment of Fetal Alcohol Spectrum Disorders
Choline has been well documented as an effective treatment for at least some of the neurobehavioral deficits in Fetal Alcohol Spectrum Disorders (FASD) in both animal models and human populations. Previous work from our lab examined choline’s efficacy in ameliorating ethanol-induced cell death and demonstrated that different recombinant inbred mouse strains (BXD) varied in treatment efficacy, particularly at the higher dose of 250 mg/kg. The findings illustrated a need for understanding the genetics underlying variation in choline metabolism to identify specific genes that contribute to this differential response. We used a bioinformatics approach to develop a list of candidate genes responsible for the differing choline metabolism among the strains. First, we did a literature review and identified 22 genes involved in the choline metabolic pathway in the liver, the primary site of choline metabolism. Next, we examined whether each of the BXD strains had inherited the B6 or D2 haplotype at each of the 22 identified genes and found that the haplotype varied across strains at 12 of the genes. This ruled out 10 genes that carried the same haplotype across the strains, indicating that they were not the basis for the varied choline metabolism. Using GeneNetwork, we explored liver expression levels of the 12 remaining genes across the mouse strains and found that liver expression levels varied among 10 of the genes. Finally, we compared liver expression levels of the 10 remaining genes with the percent of cell death reduction in the forebrain and brainstem after choline supplementation and found a linear correlation for 4 of the genes: choline kinase alpha (Chka), choline/ethanolamine phosphotransferase 1 (Cept1), choline transporter gene solute carrier family 44 member 1 (Slc44a1), and choline transporter gene solute carrier family 44 member 3 (Slc44a3). All four genes were correlated with cell death in the forebrain and brainstem with rho values greater than 0.6.
SNP analyses were performed in the human form of these genes using genome-wide association studies, and SNPs in all 4 genes were found to have statistically significant phenotypic correlations with alcoholic liver disease and mental and behavioral disorders attributable to alcohol abuse. These findings indicate that polymorphisms within these choline metabolic pathway genes need to be further explored as being the basis for the varying optimal treatment doses of choline for FASD across animal and human models.
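The final filtering step described above (keeping genes whose liver expression correlates with the choline response at rho greater than 0.6) can be sketched as follows. This is a minimal illustration, not the authors' analysis: the gene names `GeneA` to `GeneC`, the expression values, and the response values are invented placeholders, and using the absolute value of rho is an assumption.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def select_genes(expression, response, threshold=0.6):
    """Keep genes whose |rho| with the response exceeds the threshold."""
    selected = {}
    for gene, values in expression.items():
        rho = spearman(values, response)
        if abs(rho) > threshold:
            selected[gene] = rho
    return selected


# Hypothetical per-strain values (one entry per strain), for illustration only.
response = [10, 20, 30, 40, 50]          # % reduction in cell death
expression = {
    "GeneA": [1.0, 2.1, 2.9, 4.2, 5.0],  # rises with the response
    "GeneB": [3.0, 1.0, 4.0, 2.0, 3.5],  # weakly related
    "GeneC": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls with the response
}
candidates = select_genes(expression, response)
```

With real data, `expression` would hold one liver expression value per BXD strain for each of the 10 remaining genes, and `response` the matching per-strain reduction in cell death.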
Acknowledgements:
Grant Support: UTHSC Medical Student Research Fellowship Program, R01AA023508
Mallory E. Udell1*,
Jun Huang1,
Hao Chen1
*mudell@uthsc.edu
1Department of Pharmacology, Addiction Science, & Toxicology, University of Tennessee Health Science Center, Memphis, TN 38103, USA
Enhanced alcohol self-administration in rat models of endogenous depression vulnerability versus resistance
Background Alcohol use disorder (AUD) and major depressive disorder (MDD) are both heritable and co-occur at rates that far surpass chance, suggesting that shared genetic factors may be driving these distinct pathologies. Together, these conditions are responsible for 50% of the global disease burden produced by all psychiatric and substance use disorders combined.
Methods To investigate the relationship between endogenous depression and alcohol consumption behaviors, we studied adolescent rats of two substrains derived from selective breeding within the Wistar Kyoto (WKY) rat model of depression. The WKY more-immobile (WMI) is genetically predisposed to depression-like behavior, whereas the WKY less-immobile (WLI) serves as a resistant control. For operant alcohol self-administration, we developed a novel device, HomeBrew, which allows rats to self-administer together in social pairs to model the context in which humans tend to drink. We hypothesized that alcohol self-administration behaviors would be enhanced in depression-prone (WMI) versus -resistant (WLI) male and female rats.
Results Our findings revealed significant main effects of sex and strain. Specifically, intake of 20% alcohol (g/kg) was enhanced in WMI versus WLI rats (p < 0.001) and in females relative to males (p < 0.001). The highest intake was found in WMI females (2.54 g/kg ± 0.21), who self-administered about 95 and 115 percent more alcohol (g/kg) than their WLI female and WMI male counterparts, respectively. We did not observe a significant interaction between sex and strain.
Conclusions Given the minimal genetic differences between the WMI and WLI inbred strains, these findings suggest that shared genetic pathways between AUD- and MDD-like phenotypes may influence the high rate at which they co-occur. Understanding the mechanisms that give rise to AUD+MDD comorbidity and sex-specific interactions between these pathologies is foundational to the development of novel therapeutic approaches that are safe and effective for dual diagnosis patients.
Flavia Villani1, Andrea Guarracino1,2, Rachel Ward3, Tomomi Green3, Madeleine Emms4, Michal Pravenec5, Pjotr Prins1, Erik Garrison1, Robert W. Williams1, Hao Chen3, Vincenza Colonna1,4
1Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center 2Genomics Research Centre, Human Technopole, Italy 3Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center 4Institute of Genetics and Biophysics, National Research Council, Italy 5Institute of Physiology, Czech Academy of Sciences
Unveiling genetic complexity in rats through pangenome graphs and Genome-Phenome analysis
The HXB/BXH (HXB) family of recombinant inbred strains of rats has an unrivaled phenome with quantitative molecular and organismal phenotypes that have been systematically accumulated for over 25 years. We generated a pangenome for the HXB family of rats and used this resource to evaluate the functional impact of both known and new genetic variants. We sequenced both parents and all 30 progeny strains using linked-read libraries at ~40x coverage. We used PGGB to construct the family pangenome graph and to compute the frequency of major classes of DNA variants. Validation of SNVs on Chr 12 by Sanger sequencing underscored pangenome's potential for uncovering novel genetic variants, especially those residing in complex regions (repetitive regions and p-arm regions). Functional annotation of these variants highlighted their potential roles in gene regulation and molecular mechanisms. We focused on 12 validated genetic markers exclusively identified via the pangenome graph. We conducted a Phenome-Wide Association Study (PheWAS) to explore the broader significance of these novel variants. Intriguingly, we identified associations between specific markers and phenotypic traits, including a link between a lncRNA-encoding region and insulin concentration. Another association was found between a marker at 18.8 Mb and the insulin/glucose ratio in line with previous research. Specifically, our results align with a previously reported QTL highlighting a genomic region significantly linked to insulin resistance at T 30 min at 19.6 Mb. The investigation of structural variants (SVs) in Chr12 further confirmed the pangenome's reliability in detecting variations. We identified 2,481 high-confidence SVs, underscoring their presence in the SHR/OlaIpcv sample, 27 of which are predicted to have high impact. We validated 13 novel SVs with long-read sequencing using Nanopore technology. 
Of the 13, two insertions and four deletions that were consistently observed in the SHR/OlaIpcv sample and spanned multiple individuals were successfully validated. One deletion, which is preceded by simple repeats, was consistently present in all samples and is situated within the Lmtk2 gene. Mutations in the Lmtk2 gene have been linked to multiple neurological disorders, including Alzheimer's disease, Parkinson's disease, and schizophrenia. These results demonstrate that pangenomes constructed from linked reads can provide valuable information about genetic variation, making them a useful tool for the study of complex traits.
Zifan Yu1, Gregory Farage1, Robert W. Williams1, Karl Broman3, Śaunak Sen1*
1Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, TN 38163, USA 2Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA 3Department of Biostatistics and Medical Informatics, University of Wisconsin - Madison, Madison, WI 53706, USA
*sen@uthsc.edu
Real-time Linear Mixed Model Implementation for Association Mapping on Large Numbers of Quantitative Traits
Linear mixed models (LMMs) are widely used in genome-wide association studies (GWAS) to account for population structure and genetic relatedness among study individuals. Advances in high-throughput genotyping technologies have enabled GWAS to be conducted on large numbers of traits and have created a demand for more efficient and scalable implementations of LMMs. We developed a new software package for fast LMM association scans, BulkLMM, designed for GWAS of large numbers of quantitative traits with modest sample sizes, which are common when analyzing animal model data. We applied BulkLMM to BXD Individual Liver Proteome data and, for genome scans on the scale of over 35k traits and 7k markers, obtained real-time performance (a few seconds) on high-end desktop hardware. BulkLMM provides additional features valuable for performing GWAS, such as permutation testing and the ability to incorporate prior knowledge of the residual variance. Our open-source implementation in the Julia programming language combines high efficiency with easy prototyping, enabling seamless downstream analysis and manipulation.
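The idea behind permutation testing in an association scan can be illustrated for a single marker and a single trait. The sketch below is a deliberate simplification and does not reflect BulkLMM's API or algorithm: it uses a plain correlation statistic with no kinship or random-effect term, the function names are hypothetical, and the genotype and trait values are invented.

```python
import random


def abs_corr(x, y):
    """Absolute Pearson correlation between genotype codes and trait values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy / (sxx * syy) ** 0.5)


def permutation_pvalue(genotype, trait, n_perm=1000, seed=0):
    """Empirical p-value: how often a shuffled trait matches or beats the
    observed association. Shuffling breaks any genotype-trait link."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    observed = abs_corr(genotype, trait)
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs_corr(genotype, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Invented example: ten strains, marker coded 0/1, trait separated by genotype.
genotype = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
trait = [1.0, 1.2, 0.9, 1.1, 1.0, 2.0, 2.2, 1.9, 2.1, 2.0]
p_value = permutation_pvalue(genotype, trait)
```

In a genome scan the same shuffling is typically applied across all markers at once, and the maximum statistic per permutation is recorded to obtain a genome-wide threshold. The code assumes neither input is constant, since a zero variance would make the correlation undefined.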
Mikhail Tiumentsev1*, Stephen E. Alway2,3,4, Richard Cushing1, Malik Hulette2, Jesse Ingels1, John T. Killmar1, Melinda McCarty1, Hector G. Paez2,3,4, David Ashbrook1
1Department of Genetics, Genomics and Informatics, UTHSC, Memphis, TN, 38163, USA 2Department of Physiology, College of Medicine, UTHSC, Memphis, TN, 38163, USA 3Integrated Biomedical Sciences Graduate Program, College of Graduate Health Sciences, UTHSC, Memphis, TN, 38163, USA 4Laboratory of Muscle Biology and Sarcopenia, Division of Regenerative and Rehabilitation Sciences, College of Health Professions, UTHSC, Memphis, TN, 38163, USA
Mitochondrial phenotypes in BXD models of aging and Alzheimer’s disease
The connection between mitochondrial dysfunction and aging is well-established, and mitochondrial dysfunction seems to contribute to the pathogenesis of Alzheimer’s disease (AD). However, the exact mechanistic nature of this connection is not fully understood. One of the main scientific gaps is that the interactions between mitochondrial function, multiple genetic variants, and health outcomes are not known. Many animal models use a single genome in a single environment to address this problem, but this approach does not accurately represent the highly genetically and environmentally diverse human population found in a clinical setting. These studies are further complicated by the many variables that define mitochondrial biology.
The primary aim of the current study is to identify links between genotype, environment, and mitochondrial function, gaining insight into their interaction. Here we present an experimental design and preliminary data for a project investigating mitochondria in the BXD and the AD-BXD isogenic strains. Our research design is intended to provide detailed, tissue-specific data on the activity of the mitochondrial respiratory chain, reactive oxygen species (ROS) production, and mitochondrial DNA (mtDNA) copy number in mice described by the combination of age, sex, genetic background, and 5XFAD transgene status. We will analyze the data in two ways: at the group level, to identify sex, age or transgene effects on mitochondrial function; and at the strain level, to look for associations with other phenotypes collected in the population, including behavior and neuroanatomy.
Our preliminary data show that mtDNA content is affected by BXD strain. This is a high throughput method for which data from many samples can be collected. High-resolution respirometry data collection, currently underway, is a low-throughput method, with the experiments usually restricted to a small number of experimental groups to process enough samples. In our experiment, we plan to analyze respirometry data from many experimental groups using an incomplete nested design and linear mixed models. Subsequent measurements of ROS production following the same design will be added to the respirometry data. This approach allows us to obtain important bioenergetic data and draw conclusions for each grouping factor, addressing a significant limitation within the field of bioenergetics, while capitalizing on the well-defined genetics of the BXD and AD-BXD strains.
In summary, our findings will elucidate the mechanisms behind the interactions of genetic makeup, mtDNA copy number, and mitochondrial function in aging and AD, thus paving the road to the development of translatable interventions.
Thiago Missfeldt Sanches1*, Oksana Polesskaya1, Denghui Chen1, Ben Johnson1, Abraham A. Palmer1
1 Department of Psychiatry, University of California San Diego, La Jolla, CA, 92093, USA.
Email address of the corresponding author: tsanches@health.ucsd.edu
After 100 generations of husbandry, the HS rat lineage maintains most of the genetic diversity of the founder strains while also having a linkage disequilibrium equivalent to many mouse strains
In 1984, the Heterogeneous Stock (HS) rat was developed by interbreeding 8 inbred rat strains. The HS rat population is about to reach the 100th generation. Over this period, they have been housed in numerous labs, subjected to different breeding programs, with different effective population sizes, and have inevitably experienced genetic drift. Here we evaluated how much the current population has changed compared to the founders and how much it has changed over the past decade, which is a period when we have access to the most extensive genotype and breeding records. We found that almost 20% of the biallelic Single Nucleotide Polymorphisms (SNPs) that characterized the original 8 founders have been lost in the modern population. However, HS rats have retained many qualities that make them an outstanding population for studying complex behaviors. They have retained a uniform distribution of the site frequency spectrum, a broad set of variants in coding regions of genes, including in genes with high pLI scores in humans. We observed a decrease in heterozygosity, analogous to a population with Ne of 78 between 2014 and 2016 and 190 after 2016, thus indicating that the current breeding scheme produces a Ne higher than the population size of 64. These factors suggest that the HS rats used for genetic studies over the past decade are broadly similar and can be analyzed together, provided that linear mixed models are used to control for relatedness and population structure. Given the current breeding program we expect less loss of heterozygosity over time, and continuing decreases in the size of linkage disequilibrium blocks, indicating that HS rats will continue to be a powerful tool for complex behavior studies for the next decades to come.
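The link between loss of heterozygosity and effective population size (Ne) mentioned above can be made concrete with the textbook expectation for an idealized randomly mating population, H_t = H_0 (1 - 1/(2 Ne))^t. The abstract does not state which estimator the authors used, so the sketch below is only an illustration of that standard relationship; the function names and the starting heterozygosity of 0.4 are assumptions.

```python
def expected_heterozygosity(h_start, ne, generations):
    """Expected heterozygosity after drift in an idealized population:
    H_t = H_0 * (1 - 1/(2*Ne))**t."""
    return h_start * (1.0 - 1.0 / (2.0 * ne)) ** generations


def effective_population_size(h_start, h_end, generations):
    """Invert the decay formula to get the Ne implied by an observed
    drop in heterozygosity over a number of generations."""
    if not 0.0 < h_end < h_start:
        raise ValueError("heterozygosity must decrease and stay positive")
    per_generation_retention = (h_end / h_start) ** (1.0 / generations)
    return 1.0 / (2.0 * (1.0 - per_generation_retention))


# Illustrative values only: start at H = 0.4 and drift for 10 generations
# at the two Ne values quoted in the abstract.
h_after_ne78 = expected_heterozygosity(0.4, 78, 10)
h_after_ne190 = expected_heterozygosity(0.4, 190, 10)
recovered_ne = effective_population_size(0.4, h_after_ne78, 10)
```

Under this model a larger Ne retains more heterozygosity per generation, which is why an Ne of 190 implies slower loss than an Ne of 78.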
Felipe M S Dias1,2*, Amelie Baud1,2
1Centre for Genomic Regulation (CRG), Barcelona, Spain 2Universitat Pompeu Fabra (UPF), Barcelona, Spain
*felipe.morillo@crg.eu
Assessment of different microbiome profiling approaches for host genetic effects analyses in outbred rats
Background: One of the main questions in microbiome studies is to what extent commensal microorganisms affect host phenotypes. Although strong associations have been described, demonstrating causality is still difficult due to the presence of many confounders. To solve this problem, genetic epidemiology has become a promising approach, as it uses host genetic variants as anchors to limit reverse causation and confounding in causal inference analyses. However, to date few variants have been robustly associated with microbiome composition. We hypothesize that this is because most studies have focused on taxonomy rather than function, even though microbial functions can be redundant among taxa, and because most studies have relied on 16S data to characterize the microbiome, which lacks species-level resolution. Thus, we evaluated shallow shotgun sequencing as an alternative to define taxonomic and functional profiles of cecal microbiomes from 796 outbred laboratory rats. For taxonomic assignment, the Genome Taxonomy Database (GTDB) was used as a reference for both shallow shotgun and 16S data, while EggNOG annotation was used for shallow shotgun-based functional profiling. Then, we quantified aggregate host genetic effects (i.e., heritability), performed GWAS on the mapped taxonomic and functional traits with linear mixed models, and compared the results with those obtained with 16S data for the same samples.
Results: Shallow shotgun not only revealed more taxa (from phylum to species) with significant heritability (FDR<10% and Bonferroni<5%) than 16S but also yielded higher heritability values for most of the common taxa mapped by both methods. Nevertheless, no significantly heritable functions were detected. At the same time, while different significant loci were identified by the three approaches (Padj<0.05), the same locus on chromosome 10 was associated with Paraprevotella in both the 16S and shallow shotgun-based taxonomic profiles, with shallow shotgun able to reveal the Paraprevotella species behind the association.
Conclusions: This study showed that shallow shotgun data have the potential to reveal more and stronger host genetic effects than 16S at the taxonomic level. However, the poor results for functional profiling might reflect a limitation of shallow shotgun sequencing in providing good coverage of individual bacterial genes, which would require function prediction based on reference genome catalogues. With the support of a rat gut microbiome catalogue, this can be achieved, complementing the discoveries obtained with taxonomic profiles and supporting the discovery of variants to be used in future causal inference analyses.
Hao Chen1‡, Tristan V de Jong, 1† Yanchao Pan, 2† Pasi Rastas, 3 Daniel Munro, 4, 5 Monika Tutaj, 6,7 Huda Akil, 8 Chris Benner, 9 Denghui Chen, 10 Apurva S Chitre, 18 William Chow, 11 Vincenza Colonna, 12 Clifton L Dalgard, 13 Wendy M Demos, 6, 7 Peter A Doris, 14 Erik Garrison, 12 Aron M Geurts, 6,7 Hakan M Gunturkun, 1 Victor Guryev, 16 Thibaut Hourlier, 18 Kerstin Howe, 18 Jun Huang, 1 Ted Kalbfleisch, 19 Panjun Kim, 12 Ling Li, 20 Spencer Mahaffey, 21 Fergal J Martin, 17 Pejman Mohammadi, 22, 23 Ayse Bilge Ozel, 2 Oksana Polesskaya, 4 Michal Pravenec, 24 Pjotr Prins, 12 Jonathan Sebat, 4 Jennifer R Smith, 6, 7 Leah C Solberg Woods, 25 Boris Tabakoff, 18 Alan Tracey, 18 Marcela Uliano-Silva, 11 Flavia Villani, 12 Hongyang Wang, 27 Burt M Sharp, 12 Francesca Telese, 10 Zhihua Jiang, 27 Laura Saba, 21 Xusheng Wang, 12 Terence D Murphy, 28 Abraham A Palmer, 4, 29 Anne E Kwitek, 6,7 Melinda R Dwinell, 6,7 Robert W Williams, 12 Jun Z Li, 2‡
1 Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center, 2 Department of Human Genetics, University of Michigan, 3 Institute of Biotechnology, University of Helsinki, 4 Department of Psychiatry, University of California San Diego, 5 Department of Integrative Structural and Computational Biology, Scripps Research, 6 Department of Physiology, Medical College of Wisconsin, 7 Rat Genome Database, Medical College of Wisconsin, 8 Michigan Neuroscience Institute, University of Michigan, 9 Department of Medicine, University of California San Diego, 10 Department of Psychiatry, University of California San Diego, 12 Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, 13 Department of Anatomy, Physiology & Genetics; The American Genome Center, Uniformed Services University of the Health Sciences, 14 The Brown Foundation Institute of Molecular Medicine, Center For Human Genetics, University of Texas Health Science Center, 16 Genome Structure and Ageing, University of Groningen, UMC Groningen, 17 European Molecular Biology Laboratory, European Bioinformatics Institute, 18 Tree of Life, Wellcome Sanger Institute, Cambridge, UK, 19 Gluck Equine Research Center, Department of Veterinary Science, University of Kentucky, 20 Department of Biology, University of North Dakota, 21 Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, 22 Center for Immunity and Immunotherapies, Seattle Children’s Research Institute, 23 Department of Pediatrics, University of Washington School of Medicine, 24 Institute of Physiology, Czech Academy of Sciences, 25 Department of Internal Medicine, Section on Molecular Medicine, Wake Forest University School of Medicine, 26 Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado, 27 Department of Animal Sciences, 
Washington State University, 28 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 29 Institute for Genomic Medicine, University of California San Diego, La Jolla, California, USA
A revamped rat reference genome improves the discovery of genetic diversity in laboratory rats
The seventh iteration of the reference genome assembly for Rattus norvegicus, mRatBN7.2, corrects numerous misplaced segments, reduces base-level errors approximately 9-fold, and increases contiguity 290-fold compared to its predecessor. Gene annotations are now more complete, significantly improving the mapping precision of genomic, transcriptomic, and proteomic data sets. We jointly analyzed 163 short-read whole genome sequencing datasets representing 120 laboratory rat strains and substrains using mRatBN7.2. We defined ~20.0 million sequence variations, of which 18.7 thousand are predicted to potentially impact the function of 6,677 genes. We also generated a new rat genetic map from 1,893 heterogeneous stock rats and annotated transcription start sites and alternative polyadenylation sites. The mRatBN7.2 assembly, along with the extensive analysis of genomic variations among rat strains, enhances our understanding of the rat genome, providing researchers with an expanded resource for studies involving rats.
Nick Rozinsky1, Andrea Guarracino1, Joep de Ligt2, Erik Garrison1 and Pjotr Prins1
1Department of Genetics, Genomics and Bioinformatics, University of Tennessee Health Science Center, Memphis, TN, USA 2ESR, NZ
nick.rozinsky@gmail.com
GBAM: a new high performance simple and extensible format for fast column-based processing of aligned sequence reads for interactive tools and pangenome graphs
The GBAM file format is a new lossless, random-access-optimized file format for accessing sequence alignment files in the family of BAM and CRAM file formats. GBAM leverages modern computer architecture and reduces memory copying through memory mapping. The overall goal for GBAM was to come up with a simple and extensible format that reduces the processing time of growing and long-read sequence data and addresses some of the weaknesses of linear aligned read file formats that were designed over ten years ago. In addition, we introduce a simple and straightforward format that can easily be adopted by bioinformatics tool writers. GBAM is an extensible format that enables fast processing of sequence data against graph-based pangenomes, such as those recently produced by the Human Pangenome Reference Consortium. In our paper we present a reference implementation written in the Rust programming language that can easily be ported to other languages.
A pangenome graph is a connected graph where vertices are nucleotides. By adding numerous individuals' genetic data to this graph, we can create a graph in which variation information is preserved and can be used later on. Depending on the level of variation, the complexity of the graph may rise quickly. GBAM can be used for linking the graph to existing sequence files, e.g. for RNA-seq. GBAM can potentially be used to reference nodes on disk. In any case, a file format is necessary to be able to fetch additional data for the graph quickly.
The GBAM software can be easily extended to accommodate various needs, such as adding index columns that pin aligned reads to pangenome graph coordinates, adding lightweight columns with filter predicate results, and combining multiple such columns on the fly. The code is simple and robust; we have tested it on files up to 1 TB in size. It is also possible to implement an efficient Bloom filter based on a columnar file format, which makes it possible to quickly determine whether a key is part of a set. Since the format is simple, it is possible to repurpose it for storing non-alignment data. In general, any biology workflow that uses only a small part of the data in a BAM file could benefit from GBAM in cases when sorted data is necessary.
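For readers unfamiliar with the Bloom filter mentioned above, the following is a minimal generic sketch of the data structure. It is not taken from the GBAM code base (which is written in Rust); the class name, the bit-array size, the number of hash functions, and the read names are all illustrative assumptions.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: may report false positives, never false negatives."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, key):
        # Derive n_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, key):
        """Set the bits for this key."""
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        """True if the key may be present; False means definitely absent."""
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


# Hypothetical use: summarize the read names stored in one column chunk.
chunk_filter = BloomFilter()
for read_name in ["read_001", "read_002", "read_003"]:
    chunk_filter.add(read_name)
```

A reader could consult such a filter before decompressing a column chunk: if `might_contain` returns False the chunk can be skipped, whereas a True result still requires checking the actual data.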
GBAM is free and open-source software, available under a permissive BSD license. GBAM’s source code can be downloaded from https://github.com/pangenome/gbam
Alexandra L Purdy1, Mary Kolell1, Kaitlyn Andresen1, Cheyret Wood4, Tyler Buddell1,2, Amirala Bakhshiannik2,3, Melinda R Dwinell PhD3, Laura M Saba PhD4, Caitlin O’Meara PhD2,3, Michaela Patterson PhD1,2
1Department of Cell Biology, Neurobiology and Anatomy, Medical College of Wisconsin, Milwaukee, WI 2Cardiovascular Center, Medical College of Wisconsin, Milwaukee, WI 3Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 4Department of Pharmaceutical Sciences, University of Colorado, Anschutz
Genetic mapping of cardiomyocyte ploidy phenotypes that influence basal cardiac physiology and outcomes after myocardial infarction
Identifying genes that contribute to outcomes after cardiac injury and heart failure has proven to be challenging due to high genetic diversity of patient populations, inability to control for non-genetic factors, and cellular-level complexities that contribute to outcomes. Here, we propose utilizing the Hybrid Rat Diversity Panel (HRDP) to assess high throughput surrogate phenotypes, which can be predictive of outcomes after injury when measured in the basal state. Our surrogate phenotype of interest is cardiomyocyte (CM) ploidy, where high frequency of diploid CMs has been shown to be predictive of regenerative ability and functional recovery after myocardial injury. Conversely, an increased proportion of hyperpolyploid (≥8N) CMs is associated with adverse ventricular remodeling and dilated cardiomyopathy.
Having analyzed 72 of the 106 HRDP strains, I observed that frequency of diploid CMs varied from 1.2-21.8% across strains, while frequency of the hyperpolyploid CMs varied from 0.8-20.7%. To assess whether CM ploidy is associated with baseline cardiac function and/or myocardial infarction outcomes in adult rats, 12 strains were selected for further studies. Both CM ploidy phenotypes correlated with baseline left ventricular ejection fraction and left ventricular area, while frequency of hyperpolyploid CMs alone correlated with LV mass. These baseline correlations could indicate that increased CM ploidy results in dilation of the LV with reduced contractility. Interestingly, correlations with MI outcomes in the same strains (excluding BUF/Mna) suggest that increased hyperpolyploidy before injury protects against adverse ventricular remodeling, while CM proliferation after injury correlates with frequency of diploid CMs, as predicted.
To pinpoint regions of the genome that may be contributing to these physiological and pathological traits, preliminary genetic mapping with 64 strains was performed and resulted in multiple loci, most of which are uniquely associated with just one CM ploidy phenotype. These loci are currently being interrogated by RNAi screening in both primary neonatal rat CMs and Drosophila, along with analysis of RNA expression data from the BXH/HXB RI panel. Further, genetically engineered mice will validate candidates in vivo. This study will elucidate mechanisms controlling CM proliferation, polyploidization, and cardiac remodeling, which may be investigated for future use in precision medicine-based therapies.
Leandro M Velez1*, Casey Johnson1, Isoo Yoon1, Kaleb Aberra1, Alistair Senior2, Marin Nelson2, David James2, David Ashbrook3, Robert Williams3, Marcus M Seldin1
1Department of Biological Chemistry, Center for Epigenetics and Metabolism, University of California, Irvine, CA, USA. 2School of Medical Sciences, University of Sydney, Sydney, NSW, Australia. 3Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA.
*lmvelez@uci.edu (Leandro Velez)
Gene-by-PCOS interactions in cardio-metabolic traits on a panel of genetically diverse strains of mice
Polycystic ovary syndrome (PCOS) is the most common endocrinopathy in women, with a prevalence of ~4-20% in women of reproductive age. The syndrome is generally diagnosed when the patient consults for fertility issues, and the diagnosis is based only on reproductive criteria: (1) hyperandrogenism, (2) oligo-anovulation and (3) polycystic ovary morphology. However, the overlap of PCOS with cardiometabolic diseases is significant. To put this in numbers: up to 75% of women with PCOS present some degree of insulin insensitivity, 38-88% present with obesity or overweight, 20-50% develop type 2 diabetes by age 40, and women with PCOS are at increased risk of cardiovascular disease. Despite these facts, shared reproductive/metabolic mechanisms are largely underexplored. Moreover, studies addressing the genetic architecture of PCOS are missing. Here, we induced a PCOS-like condition in 25 recombinant and classical inbred female strains and matched placebo controls over 6 weeks. Comprehensive in vivo and terminal reproductive/metabolic analyses were performed, as well as ovary and adipose RNA-Seq. These strains varied in PCOS response in a number of key metabolic and reproductive traits, including circulating hormone levels, glucose metabolism and cardiac function. We applied a linear mixed-effects model to estimate genetic (heritability, h2), PCOS, and gene-by-PCOS interaction effects. High h2 was observed for lean and fat mass, glucose and AUC, whereas PCOS effects were high for BW change, testosterone and AUC. Substantial gene-by-PCOS interactions were found for reproductive hormones. Undirected network construction and centrality estimates showed that the reproductive hormone LH and the LH/FSH ratio were the strongest central traits connecting metabolic phenotypes. We also showed that select strains represent subtypes of human PCOS-metabolism interaction with varied susceptibilities to disease in a PCOS setting.
Analysis of PCOS DEGs from ovarian RNA-seq showed strong enrichment for human disease settings such as hyperandrogenism, inflammation, and pregnancy hypertension. Similar analyses of GWAT RNA-seq showed top enrichments in weight gain, liposarcoma, inflammation and reproductive diseases, with adipose genes connecting these diseases and potentially involved in PCOS. In conclusion, we established a PCOS model to study relevant mechanisms intersecting reproduction with metabolism in the context of genetic variation.
Peter A Doris1, Isha Dhande1, Melissa L. Smith2, Kai Li3, Ted Kalbfleisch3, Yaming Zhu1
1Brown Foundation Institute of Molecular Medicine, McGovern Medical School, University of 2Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY 3Gluck Equine Genetics Center, University of Kentucky, Lexington, KY
* peter.a.doris@uth.tmc.edu
Stroke susceptibility in SHRSP is determined by B cell genes
The spontaneously hypertensive rat (SHR) is a long-established inbred model of genetic hypertension produced by selective breeding on the trait of high blood pressure. After fixation of the trait, inbreeding continued in several isolated lines. At F23, in one such line, SHR-A3 (aka stroke-prone SHR), evidence of cerebrovascular disease emerged in some individuals and this trait was quickly fixed by breeding only progeny of animals that had disease. The existence of closely related SHR lines sharing genetic basis for hypertension, but diverging in stroke susceptibility confirms that end organ disease in the presence of hypertension can be due to non-hypertension alleles and provides an opportunity to identify the genetic basis of susceptibility. We used a panel of SNP markers (~10,000) to investigate genetic relatedness and the distribution of identity by descent in inbred members of the SHR-A3 and stroke-resistant SHR-B2 lines. The lines are 87% IBD with non-allelic regions of the genome distributed in ~150 discrete haplotype blocks, as expected from genealogy. Whole genome sequencing (Illumina) and recent long read genome assembly has been used to define genetic variation across the strains. The immunoglobulin heavy chain locus was the most divergent region of the genome. The region of the IGH locus containing the IG Variable genes is ~3Mb in SHR-A3 and over 6Mb in SHR-B2. There are heritable differences in serum immunoglobulin levels across the two strains that map in cis to the IGH locus. The conserved Constant region exons of IG are also highly divergent and are affected by segmental duplication in SHR-A3. A congenic strain was made in which the SHR-B2 IGH locus was transferred to the SHR-A3 genetic background. This was associated with reduced stroke. Analysis of SHR-A3 variation identified a premature stop codon in the lymphocyte calcium signaling gene Stim1. 
This gene transmits signals from the IG-encoded B cell receptor to the nucleus that are involved in proliferation, cytokine release and antibody affinity maturation. A role in stroke was also confirmed in a congenic line. De novo assembly indicated that the promoter of JunD, a regulator of lymphocyte proliferation, contains a SINE insertion in SHR-B2 that was absent in SHR-A3. This is associated in cis with a major effect on JunD expression. A JunD congenic was also found to have reduced cerebrovascular disease. B lymphocytes are the only cell type that expresses all three of these genes and these findings point to immunoglobulin as a mediator of stroke susceptibility. We have tested this hypothesis by the creation of an SHR-A3 line in which IGH was targeted. These animals lack B cells, lack immunoglobulin and have reduced stroke.
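The marker comparison described above can be illustrated with a minimal sketch. It assumes two fully inbred lines genotyped at the same ordered markers on one chromosome, and it treats matching alleles (identity by state) as a proxy for IBD. The function name and the toy genotypes are hypothetical and are not taken from the study.

```python
def ibs_summary(geno_a, geno_b):
    """Return (fraction of matching markers, number of contiguous mismatching blocks).

    geno_a and geno_b are equal-length sequences of alleles for two inbred
    strains, ordered by position along a single chromosome.
    """
    if len(geno_a) != len(geno_b) or len(geno_a) == 0:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    matches = 0
    blocks = 0
    in_block = False  # True while we are inside a run of mismatching markers
    for a, b in zip(geno_a, geno_b):
        if a == b:
            matches += 1
            in_block = False
        else:
            if not in_block:
                blocks += 1  # a new divergent block starts here
            in_block = True
    return matches / len(geno_a), blocks


# Toy example (made-up alleles): 12 markers, two divergent blocks.
strain_a3 = list("AACCGGTTAACC")
strain_b2 = list("AACCGATTAATT")
fraction_shared, divergent_blocks = ibs_summary(strain_a3, strain_b2)
print(fraction_shared, divergent_blocks)
```

A real analysis would process each chromosome separately and tolerate isolated genotyping errors before calling a block boundary.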
Peter A. Doris8*, Melissa L. Smith1, Kai Li2, Jo Wood3, Kerstin Howe3, Kelli J. Kochan4, J. Chris Blazier4, Hao Chen5, Monika Tutaj6, Rebecca Schilling6, Anne Kwitek6, Melinda Dwinell6, Terence Murphy7, Françoise Thibaud-Nissen7, Shashikant Pujar7, Ted Kalbfleisch2
1Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY 2Gluck Equine Genetics Center, University of Kentucky, Lexington, KY 3Wellcome Sanger Institute, Hinxton, Cambridge, UK 4Texas A&M University Institute for Genome Sciences and Society, College Station, TX 5University of Tennessee HSC, Memphis, TN 6Dept. of Physiology, Medical College of Wisconsin 7National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 8McGovern Medical School, University of Texas Houston
*peter.a.doris@uth.tmc.edu
A reference rat assembly from HiFi reads and long range scaffolding
The first application of long read methods to Rattus norvegicus resulted in the mRatBN7.2 assembly, which combined PacBio continuous long reads (CLR) with long range scaffolding, producing a significant improvement in assembly correctness. However, rapid advances in sequencing technology have overcome the high error rate of CLR, and HiFi reads are now standard for complex genome assembly. These accurate long reads allow extended contigs to be assembled and, when coupled with complementary scaffolding methods that provide long range genomic data, such as optical mapping and chromosomal conformation analysis, make it possible to generate chromosome level assemblies. We have generated such an assembly for the reference strain, Brown Norway (BN/NHsdMcwi). An essential element of genome assembly is analysis of the assembly to assess its completeness, correctness, contiguity and accuracy and to allow comparison with previous related assemblies. Here we report the analysis of a new rat reference assembly (to be named GRCr8) using several bioinformatic tools and methods of evaluation. These tools include BUSCO and Compleasm for assembly completeness. We have aligned the assembly with the mRatBN7.2 assembly using SyRI software to map synteny and recombination at a genome-wide scale and to generate a summary of single base and structural differences between the assemblies. We have performed k-mer analysis using Merqury software to assess quality score (QV), to determine completeness and accuracy of the assembly and to determine whether the presumed fully inbred state is reflected in the k-mer spectrum of the assembly. A preliminary automated annotation was performed at NCBI and revealed substantially fewer problem alignments with the annotation data set than are present in the mRatBN7.2 assembly. These problem alignments may represent sequence errors in the RefSeq data used for annotation.
We are investigating these coding sequence problems by alignment of PacBio IsoSeq RNA data to the assembly. The mRatBN7.2 assembly contained variant bases that diverged in a large number of other inbred strains and were considered to be possible base-level errors in the assembly. The new assembly reduces these from nearly 140,000 variants to 550. Overall, the new assembly is 200 Mb larger than mRatBN7.2. The additional assembled regions are not uniformly dispersed, but comprise regions that were not previously incorporated in mRatBN7.2. These regions include rDNA arrays, peri-centromeric regions, a much-extended Y chromosome and segmentally duplicated autosomal regions which, in several notable examples, contain testis-expressed genes that are likely involved in sex chromosome competition.
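For readers unfamiliar with k-mer based quality scores, the sketch below shows how a consensus QV can be derived from k-mer counts, following the k-mer survival formula published for Merqury as we understand it. The function name, the default k of 21, and the counts in the example are illustrative assumptions, not values from this assembly.

```python
import math


def kmer_consensus_qv(asm_only_kmers, total_asm_kmers, k=21):
    """Estimate a Phred-scaled consensus quality (QV) from k-mer counts.

    asm_only_kmers:  k-mers found in the assembly but absent from the reads
    total_asm_kmers: all k-mers found in the assembly
    """
    if total_asm_kmers <= 0 or not 0 <= asm_only_kmers <= total_asm_kmers:
        raise ValueError("counts must satisfy 0 <= asm_only <= total and total > 0")
    if asm_only_kmers == 0:
        return math.inf  # no unsupported k-mers observed
    # Probability that a single base is correct, given that a k-mer
    # is read-supported only if all k of its bases are correct.
    p_base_correct = (1 - asm_only_kmers / total_asm_kmers) ** (1 / k)
    error_rate = 1 - p_base_correct
    return -10 * math.log10(error_rate)


# Hypothetical counts: 1,000 unsupported k-mers out of one billion.
print(round(kmer_consensus_qv(1_000, 10**9), 1))
```

Because the estimate depends on read-set completeness as well as assembly accuracy, QV values are best compared between assemblies evaluated against the same reads.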
This work was supported in part by the National Center for Biotechnology Information of the National Library of Medicine (NLM), National Institutes of Health and by grants from NHGRI (R01HG011252) and NIH-OD (R24OD024617).
Hao Chen1, Shuangying Leng, Caroline Jones, Robert W. Williams, Burt Sharp
University of Tennessee Health Science Center, Memphis, TN 38103
Oxycodone oral self-administration in inbred rats identifies different patterns of vulnerability
Although the misuse of most opioids requires injection or inhalation to produce rapid subjective effects, the exceptionally strong abuse liability of oxycodone is evident even when consumed orally. We designed an intermittent operant oral self-administration procedure in rats to model the pattern of oxycodone consumption in humans. This model starts with limited initial drug intake, followed by increasing drug concentrations during extended 4-h sessions on alternating days, a progressive ratio test, extended access in 16-h sessions, extinction, and cue-induced reinstatement. We studied 25 inbred strains using this protocol (149 females and 108 males). Mean oxycodone intake (mg/kg) during 4-h sessions (0.1 mg/ml oxycodone, 60 μl per reward using a fixed-ratio 5 schedule) ranged from 0.09 ± 0.02 (HXB23) to 3.57 ± 0.45 (LE/Stm) in females and 0.07 ± 0.01 (M520) to 2.07 ± 0.32 (WMI) in males. Although females consumed more oxycodone than males across strains, intake in 4-h (r = 0.54, p = 0.02) and 16-h (r = 0.70, p = 0.003) sessions was strongly correlated between sexes. Different patterns of escalation emerged when rats switched from 4-h to 16-h sessions: many strains maintained similar intake despite increased drug availability, while others drastically escalated (e.g. 7.7-fold in females and 4.9-fold in males of the FXLE15 strain). Together, these data demonstrate strong genetic modulation of drug intake. While we are still increasing the number of phenotyped strains, we anticipate that genetic mapping using whole genome sequence-based markers will identify loci and candidate genes contributing to heritable differences in patterns of oxycodone-driven behavior.
Richard Mott2, Leilei Cui1,2,3,4†, Bin Yang1†, Shijun Xiao1, Jun Gao1, Amelie Baud5, Delyth Graham6, Martin McBride6, Anna Dominiczak6, Sebastian Schafer7, Regina Lopez Aumatell8, Carme Mont9, Albert Fernandez Teruel10, Norbert Hübner11,12,13, Jonathan Flint14, Lusheng Huang1
1National Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China. 2UCL Genetics Institute, University College London, London, UK. 3Human Aging Research Institute and School of Life Science, Nanchang University, and Jiangxi Key Laboratory of Human Aging, Jiangxi, China.4School of Life Sciences, Nanchang University, Nanchang, China. 5Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain. 6BHF Glasgow Cardiovascular Research Centre, University of Glasgow, Glasgow G12 8TA UK. 7Cardiovascular and Metabolic Disorders Program, Duke-National University of Singapore Medical School, Singapore, 8Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain. 9Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford UK. 10 Departamento de Psiquiatría y Medicina Legal, Universitat Autonoma de Barcelona, Spain. 11Genetics and Genomics of Cardiovascular Diseases Research Group, Max Delbrück Center (MDC) for Molecular Medicine in the Helmholtz Association, Berlin, Germany. 12DZHK (German Center for Cardiovascular Research) partner site Berlin, Berlin, Germany. 13Charité Universitätsmedizin Berlin, Berlin, Germany. 14Department of Psychiatry and Behavioral Sciences, Brain Research Institute, University of California, Los Angeles, Los Angeles, CA, USA.
Presenter: Richard Mott: r.mott@ucl.ac.uk
UCL Genetics Institute, University College London, London, WC1E 6BT, UK
Dominance is common in mammals and is associated with trans-acting gene expression and alternative splicing
Background: Dominance and other non-additive genetic effects arise from the interaction between alleles, and these phenomena have historically played a major role in quantitative genetics. However, most genome-wide association studies (GWAS) assume alleles act additively.
Results: We systematically investigate both dominance (here representing any non-additive within-locus interaction) and additivity across 574 physiological and gene expression traits in three mammalian stocks: F2 intercross pigs, rat heterogeneous stock and mouse heterogeneous stock. Dominance accounts for about one quarter of heritable variance across all physiological traits in all species. Hematological and immunological traits exhibit the highest dominance variance, possibly reflecting balancing selection in response to pathogens. Although most quantitative trait loci (QTLs) are detectable as additive QTLs, we identify 154, 64 and 62 novel dominance QTLs in pigs, rats and mice, respectively, that are undetectable as additive QTLs. Similarly, even though most cis-acting expression QTLs (eQTLs) are additive, gene expression exhibits a large fraction of dominance variance, and trans-acting eQTLs are enriched for dominance. Genes causal for dominance physiological QTLs are less likely to be physically linked to their QTLs but instead act via trans-acting dominance eQTLs. In addition, thousands of eQTLs are associated with alternatively spliced isoforms with complex additive and dominant architectures in heterogeneous stock rats, suggesting a possible mechanism for dominance.
Conclusions: Although heritability is predominantly additive, many mammalian genetic effects are dominant and likely arise through distinct mechanisms. It is therefore advantageous to consider both additive and dominance effects in GWAS to improve power and uncover causality.
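To make the additive versus dominance distinction concrete, the sketch below applies the textbook single-locus decomposition to genotype class means. It is an illustration only and is not the mixed-model analysis used in the study; the function name, the 0/1/2 genotype coding and the toy phenotypes are assumptions.

```python
from statistics import mean


def additive_dominance_effects(genotypes, phenotypes):
    """Return (a, d) for one biallelic locus.

    genotypes: 0, 1 or 2 copies of the alternate allele per individual
    a: additive effect, half the difference between the two homozygote means
    d: dominance deviation, heterozygote mean minus the homozygote midpoint
    """
    groups = {0: [], 1: [], 2: []}
    for g, y in zip(genotypes, phenotypes):
        groups[g].append(y)
    if any(len(values) == 0 for values in groups.values()):
        raise ValueError("all three genotype classes must be observed")
    m0, m1, m2 = mean(groups[0]), mean(groups[1]), mean(groups[2])
    midpoint = (m0 + m2) / 2
    a = (m2 - m0) / 2
    d = m1 - midpoint
    return a, d


# Made-up data in which heterozygotes resemble one homozygote class,
# so the locus shows substantial dominance (d close to a).
geno = [0, 0, 1, 1, 2, 2]
pheno = [1.0, 1.2, 2.9, 3.1, 3.0, 3.2]
a, d = additive_dominance_effects(geno, pheno)
print(a, d)
```

A purely additive scan fits only the slope across 0, 1 and 2 copies, so a locus where a is near zero but d is large (over- or under-dominance) would be missed, which is the situation the dominance QTLs above describe.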
Britahny M Baskin1,2,3, Emma J Sandago1,2, Hong S Choi3, Carissa J Stots3, Megan L Quinn1,2, Reilly N Thompson1,2, Olivia F Barclay3, Alexandra G Panepinto3, Daniel Schmidlin3,4, Sophia A Miracle1,2, Kathleen M Kantak3, Camron D Bryant1,2
1Laboratory of Addiction Genetics, Department of Pharmaceutical Sciences, Center for Drug Discovery, Northeastern University 2Department of Pharmacology, Physiology & Biophysics, Boston University Chobanian & Avedisian School of Medicine, Boston, MA 02118 3Department of Psychological and Brain Sciences, Boston University, Boston, MA, 02215 4Undergraduate Program in Neuroscience, Boston University, Boston, MA 02215
Corresponding author: c.bryant@northeastern.edu
Spontaneously Hypertensive Rat substrains and the offspring of reciprocal F2 crosses exhibit differences in model risk traits for addiction and cocaine sensitivity
Psychostimulant use disorders are heritable (40-50%) [1], yet their etiology is largely unknown. Quantitative trait locus (QTL) mapping in nearly identical rodent substrains can greatly facilitate identification of quantitative trait genes/variants underlying behavior, capitalizing on their near-isogenic nature. We previously observed differences in cocaine stimulant sensitivity and operant intravenous self-administration between spontaneously hypertensive rat (SHR) substrains from Harlan-Envigo Laboratories (SHR/NHsd) and Charles River Laboratories (SHR/NCrl) [2]. Following in-house breeding of parental substrains and reciprocal F2 crosses, female and male adult rats were assessed for locomotor activity following saline and two doses of cocaine (20 and then 5 mg/kg, i.p.) over two weeks. Rats were then tested on a sucrose preference task to assess sensitivity to an alternate “natural” reward, and rats from parental substrains were additionally tested on a Differential Reinforcement of Low-Rate Responding (DRL) operant task, which required delayed lever responding for chocolate pellets, to assess impulsivity. Following assessment of these three addiction risk model traits, parental rats underwent intravenous operant cocaine (0.25 mg/kg) self-administration sessions under an FR1 reinforcement schedule. SHR/NCrl rats exhibited greater locomotor activity when first injected with saline (novelty response) and greater conditioned hyperactivity on saline days compared to SHR/NHsd following repeated (3) injections of cocaine (20 mg/kg, i.p.). Female rats from both parental substrains exhibited a greater novelty response and greater cocaine-induced locomotion than males. SHR/NHsd exhibited a stronger sucrose preference than SHR/NCrl, suggesting parental substrain differences depend on the type of reward.
On the DRL task, however, consistent with a higher pro-addiction phenotype, SHR/NCrl rats exhibited lower response efficiency and a sex-dependent (females>males) increase in burst responding, indicating higher levels of impulsivity. Females from both parental substrains self-administered more cocaine than males under an FR1 schedule. We also assessed parent of origin effects in our reciprocally crossed F2 offspring (one-half had an SHR/NCrl granddam, one-half had an SHR/NHsd granddam). Regardless of maternal lineage of origin, female F2 rats exhibited higher locomotor activity in response to cocaine than male F2 rats and also sensitized to repeated doses of cocaine, whereas the males did not. However, for sucrose preference, the F2s with SHR/NCrl granddams exhibited higher sucrose preference scores than F2 rats with SHR/NHsd granddams, indicating a parent of origin effect. These results demonstrate the capability of comprehensive behavioral batteries to identify subtle differences in pro-addiction phenotypes amongst near-isogenic substrains and their reciprocal F2 offspring.
Denghui Chen1*, Khai-Minh H. Nguyen2, Daniel Munro2,3, James Guevara2, Katarina Cohen2, Tridi Jena2, Jonathan L. Sebat2,4,5, Abraham A. Palmer2,4
1Bioinformatics and System Biology Program, University of California San Diego 2Department of Psychiatry, University of California San Diego, La Jolla, CA, USA 3Department of Genome Sciences, University of Washington, Seattle, WA, USA 4Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, USA 5Department of Cellular and Molecular Medicine and Pediatrics, University of California San Diego, La Jolla, CA, USA
* dec037@ucsd.edu
Structural variant calling in an extended rat pedigree using PacBio HiFi sequencing
Genome-wide association studies (GWAS) in mammalian model organisms, such as rats, complement human GWAS efforts and offer a unique advantage by enabling the validation of associations through direct experimental manipulations, shedding light on new biological mechanisms. However, pinpointing the specific genes responsible within an associated region poses challenges for a variety of reasons. The causal alleles may not be single nucleotide polymorphisms (SNPs), and SNPs may not always effectively tag the underlying causal variations, such as structural variants (SVs) and short tandem repeats (STRs). To overcome this limitation, we are leveraging Pacific Biosciences (PacBio) high fidelity (HiFi) sequencing to identify SVs, which we then use to perform GWAS for gene expression and complex behavioral traits in rats. We are currently identifying SVs within the genomes of the eight inbred founders of the outbred heterogeneous stock (HS) population, which will later serve as the basis for SV imputation in the outbred population. Through imputation, we will be able to uncover connections between SVs and a wide range of traits in over 10,000 outbred HS rats that have undergone extensive genotyping and phenotyping. Our ultimate objective is to unveil new genetic mechanisms underlying conditions such as drug abuse and other human diseases.
Samson Eugin Simon1,2, Laura M. Sipe3, Jeremiah R. Holt2,4, Boston W. Simmons1,2, Sydney C. Joseph1,2, Casey J Bohl4, Sandesh J. Marathe1,2, Arvind V. Ramesh1,2, Logan G. McGrath1,2, Zeid T. Mustafa1,2, Sidharth S. Mahajan1,2, D. Neil Hayes1,2,4, Lu Lu4, Robert W. Williams2,4, David G. Ashbrook2,4, Liza Makowski1,2,4
1Division of Hematology and Oncology, Department of Medicine, College of Medicine, University of Tennessee Health Science Center (UTHSC); Memphis, TN 38163, USA 2UTHSC Center for Cancer Research, College of Medicine, UTHSC; Memphis, TN 38163, USA 3Department of Biology, University of Mary Washington; Fredericksburg, Virginia 22401, USA 4Department of Genetics, Genomics, and Informatics, UTHSC; Memphis, TN 38163, USA
Exploring the Impact of Methylation Genetics on Tumor Suppressors in BXD Preclinical Mouse Models
Breast cancer (BC) is a significant global health concern, involving a heterogeneous group of malignancies that vary in their molecular characteristics, clinical behavior, and treatment responses. Triple-negative breast cancer (TNBC) represents around 15-20% of all BCs and is associated with aggressive clinical behavior, such as early metastasis, and limited treatment options when compared to other BC subtypes. Understanding the contribution of epigenetic modifications, particularly DNA methylation, in driving TNBC aggression will aid in identifying novel molecular targets. Therefore, we have taken advantage of two well-established genetic models to test our hypothesis that methylation of DNA is modified by genetic variants that lead to aggressive BC. The BXD mouse family is created by crossing two strains, C57BL/6J ("B") and DBA/2J ("D"), producing recombinant inbred lines (RILs) that have a consistent genetic background that can be reliably reproduced. BXD mice were crossed with a model of TNBC, the C3(1)-SV40 T-large antigen genetically engineered mouse model (GEMM), resulting in BXD-BC F1 progeny in which females develop breast tumors. These F1 mice are isogenic hybrids, displaying significantly heritable variations in their presentation of TNBC characteristics such as tumor latency, multiplicity, and survival. Through an unbiased systematic quantification of breast cancer severity across BXD-BC hybrids, we identified several significant quantitative trait loci (QTL) and genetic variants of interest for certain tumor traits. Two tumor suppressor genes, Rassf3 and Rassf6, were identified, which are commonly downregulated in various cancers through DNA methylation-induced gene silencing. Bulk RNA sequencing across BXD-BC strains revealed variability in methylation biomarkers and Rassf family mRNA expression levels, significantly correlating with tumor phenotypes.
Future goals include determining epigenetic modifications at the genomic level using next-generation sequencing (NGS) to parse out gene expression dynamics. This novel murine model allows a unique approach to understanding how DNA methylation regulates expression of genetic modifiers related to TNBC and its treatment response. Findings from this study have the potential to significantly improve treatment outcomes for individuals facing therapy-resistant breast cancer.
Pattee, J.1*, Purdy, A.2, Flinn, M.3, Patterson, M.2, O’Meara, C.3, & Saba, L.M.4
1Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, USA 2Department of Cell Biology, Neurobiology and Anatomy, Medical College of Wisconsin, Milwaukee, WI, USA 3Department of Physiology, Medical College of Wisconsin, Milwaukee, WI, USA 4Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
*jack.pattee@cuanschutz.edu
Cell-type specific co-expression network analysis in single nuclei RNA-Seq heart tissue data from inbred HRDP strains
RNA co-expression networks often provide insight into the etiology of biological processes. Using this technique in the context of single-cell and single nuclei RNA-Seq data could provide a detailed understanding of cell-type specific networks and ubiquitous networks active in all cell types by quantifying RNA expression within cell types. Two challenges in using scRNA-Seq for network analysis are the sparsity of expression data (i.e., RNA-Seq reads) within any given cell and the transition from co-expression across samples (usually too few for robust network analysis) to co-expression across individual cells within a sample. Recently developed high dimensional weighted gene co-expression network analysis (hdWGCNA) addresses the first challenge by first forming “metacells”, which aggregate reads across 2 to 10 cells with similar RNA expression. For this analysis, we used hdWGCNA to tease apart cell type-specific functionality from common pathways in single nuclei RNA-Seq data from heart tissue of 3 inbred strains from the Hybrid Rat Diversity Panel (HRDP). We use UMAP to cluster cells and assign cell types to clusters by examining the differentially expressed genes associated with each cluster. We then estimate cell-type specific networks using hdWGCNA and assess cluster preservation across cell types using network preservation statistics. Modules that are preserved across multiple cell types are weakly or not enriched for genes associated with that tissue type, whereas robust modules that are not preserved are strongly enriched for genes associated with that tissue type. Supported by NHLBI R21 HL156022, a pilot grant from the Medical College of Wisconsin’s Cardiovascular Center, and the ALSAM Foundation.
G. Allan Johnson1, Ph.D., David Ashbrook2, Ph.D., Gary Cofer1, M.S., James C Cook1, B.S., Kathryn Hornburg1, Ph.D., Harrison Mansour1, M.S., Yi Qi1, M.D., Yuqi Tian1, B.S., Robert W. Williams2, Ph.D., Leonard E. White3, Ph.D.
1Duke Center for In Vivo Microscopy, Duke University, Durham, NC 27710 2Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN 38103 3Department of Neurology, Duke University, Durham, NC 27710
The connectome: a complex genetic trait
Diffusion tensor imaging (DTI) has provided an entirely new method for evaluating the brain with MRI. A series of high-resolution MR images is acquired with different directional magnetic field gradients encoding the diffusion of water in the brain. In tissues where the diffusion is constrained (e.g. axons), the magnitude and direction of diffusion provide unique insight into the cytoarchitecture, more specifically the strength of connectivity between regions of the brain. By parcellating the brain into functional regions and mapping the connection strength between these regions, one can generate a global assessment of the connectivity: the connectome, the wiring diagram of the brain. We have extended the spatial resolution of DTI methods used for clinical DTI by more than 2 million times, making it possible to examine mouse models of disease in exquisite detail [1]. In early work we demonstrated that DTI scalar traits and connectomes are highly heritable [2]. More recently we have merged the method with light sheet microscopy to allow complementary measures of cell types and distributions [1]. In ongoing work we have demonstrated that DTI can follow strain-dependent changes in connectivity with aging. This talk will present how we are using microscopic structural diffusion tensor imaging to study the relationships among complex traits, connectomes, genetics, and age in the BXD mouse models.
Support: NIH/NIA R01 AG070913
Yuqi Tian, David Ashbrook, Ph.D., James C. Cook B.S., Yi Qi, M.D., G Allan Johnson, Ph.D., Robert W. Williams, Ph.D., Leonard E. White, Ph.D.
A Rapid Workflow for Cell Quantification in Combined Light Sheet Microscopy and Magnetic Resonance Histology
Information on regional variation in cell numbers and densities in the CNS provides critical insight into structure, function, and the progression of CNS diseases. However, variability can be real or a consequence of methods that do not account for technical biases, including morphologic deformations, errors in the application of cell type labels and boundaries of regions, errors of counting rules and sampling sites. We address these issues in a mouse model by introducing a workflow that consists of the following steps: 1. Magnetic resonance histology (MRH) to establish the size, shape, and regional morphology of the mouse brain in situ. 2. Light-sheet microscopy (LSM) to selectively label neurons or other cells in the entire brain without sectioning artifacts. 3. Registration of LSM volumes to MRH volumes to correct for dissection errors and both global and regional deformations. 4. Stereological protocols for automated sampling and counting of cells in 3D LSM volumes. This workflow can analyze the cell densities of one brain region in less than 1 min and is highly replicable in cortical and subcortical gray matter regions and structures throughout the brain. The method does not require an extensive amount of training data, achieving an F1 score of approximately 0.9 with just 20 training nuclei. We report deformation-corrected neuron (NeuN) counts and neuronal density in 13 representative regions in 5 C57BL/6J cases and 2 BXD strains. The data represent the variability among specimens for the same brain region and across regions within the specimen. Neuronal densities estimated with our workflow are within the range of values in previous classical stereological studies. We demonstrate the application of our workflow to a mouse model of aging.
This workflow improves the accuracy of neuron counting and the assessment of neuronal density on a region-by-region basis, with broad applications for studies of how genetics, environment, and development across the lifespan impact cell numbers in the CNS.
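As a reminder of how the two quantities reported above are defined, the sketch below computes an F1 score from detection counts and a density from a cell count and region volume. The function names and all numbers are hypothetical and do not come from the study.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall for detected cells."""
    denominator = 2 * true_pos + false_pos + false_neg
    if denominator == 0:
        raise ValueError("at least one detection or ground-truth cell is required")
    return 2 * true_pos / denominator


def cell_density(cell_count, region_volume_mm3):
    """Cells per cubic millimetre in a deformation-corrected region volume."""
    if region_volume_mm3 <= 0:
        raise ValueError("region volume must be positive")
    return cell_count / region_volume_mm3


# Hypothetical validation set: 90 correctly detected nuclei,
# 10 spurious detections and 10 missed nuclei.
print(f1_score(90, 10, 10))

# Hypothetical region: 120,000 counted neurons in 1.5 mm^3.
print(cell_density(120_000, 1.5))
```

Because density divides by volume, correcting the LSM volume for shrinkage or expansion through registration to MRH changes the density estimate even when the raw count is unchanged.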
J. Matthew Mahoney1*
1Computational Sciences, The Jackson Laboratory, Bar Harbor, ME, 04609, USA
*matt.mahoney@jax.org
Learning heritable endophenotypes from high-dimensional data using a causal mediation heuristic
High-dimensional data, including ‘omics, imaging, or cytological data, are now routinely gathered alongside clinical phenotypes in genetic mapping studies under the assumption that such intermediate data are more closely connected to the genetic mechanisms and, therefore, more heritable. In this view, the genotype-phenotype map can be factored into two components: the influence of genetics on a set of endophenotypes, and the influence of the endophenotypes on the clinical phenotypes. In other words, the endophenotype mediates the causal influence of genotype on clinical phenotype. An ideal mediator has two desirable properties: 1) it is highly heritable, and 2) it explains a large fraction of phenotypic variance. However, in the high-dimensional setting, the optimal mediator endophenotype may be a complex function of the measured data, and itself a genetically complex trait. Thus, we need new tools to synthesize mediator endophenotypes from high-dimensional data using simultaneous genetic, mediator, and clinical phenotype data. In this work, we developed a novel heuristic approach to learning mediator traits from high-dimensional data as a form of high-dimensional mediation analysis (HDMA). Theoretically, we derive HDMA by considering the likelihood function for the perfect mediation graphical model, from which we obtain an objective function for reducing the dimensions of the genotype, mediator, and clinical phenotype data down to single scores that are concordant with the perfect-mediation model. Analysis of this objective function shows that, in the large-p-small-n setting typical of high-throughput biological data, there is a large set of trivial solutions, requiring careful regularization. We show that this objective function is closely related to a classic, but only recently solved, problem in multivariate analysis called generalized canonical correlation analysis (GCCA) [1].
Through this connection, we find that high-dimensional mediation analysis is a natural extension of multivariate genotype-phenotype mapping, proposed by Mitteroecker et al. [2], wherein the mediator trait is simultaneously optimized for heritability and correlation to the clinical phenotypes. Finally, we derive a kernelized version of HDMA that allows for learning polygenic endophenotypes with only knowledge of the kinship matrix among individuals, and we present a rapidly convergent iterative algorithm for kernelized HDMA.
Arvind V. Ramesh1,2, Laura M. Sipe3, Margaret S. Bohm1,2, Sydney C. Joseph1,2, Brianne Hibl4, Liza Makowski1,2
1Division of Hematology and Oncology, Department of Medicine, College of Medicine, University of Tennessee Health Science Center (UTHSC); Memphis, TN 38163, USA; 2UTHSC Center for Cancer Research, College of Medicine, UTHSC; Memphis, TN 38163, USA; 3Department of Biology, University of Mary Washington; Fredericksburg, Virginia 22401, USA; 4Department of Comparative Medicine, UTHSC; Memphis, TN 38163, USA
Striking Survival Outcomes after Vertical Sleeve Gastrectomy Bariatric Surgery in Different Strains of Mice
Obesity, defined as having a BMI at or above 30 kg/m2, is a global pandemic that has affected the lives of over 25% of adults in 94 different countries. Because of the multitude of complications that can develop as a result of overweight or obesity, such as cardiovascular diseases, type 2 diabetes, and cancer, improved therapeutics targeting obesity’s disease progression are needed. Bariatric surgery is an increasingly prevalent approach and has been shown to provide the greatest amount of weight loss compared to dietary or lifestyle interventions. Importantly, bariatric surgery also results in sustained weight loss for most patients. Vertical sleeve gastrectomy (VSG) is the most common type of bariatric surgery in patients. To begin to understand the mechanisms underlying improvements in disease outcomes, such as reduced metabolic syndrome and cancer risk and improved cancer therapeutic outcomes, we aimed to develop a pre-clinical approach in mice. Using both C57BL/6J (“B6”) and FVB/NJ (“FVB”) strains, we established best practices and examined survival and weight loss efficacy of VSG compared to sham controls. We and others have shown that the B6 strain has the greatest susceptibility to diet-induced obesity (DIO), while the FVB/N mouse model has varied susceptibility, with overall less weight gain. To provide a reproducible VSG, we have developed a reliable procedure that includes pre- and post-operative steps to reduce stress in DIO mice. Results demonstrated that while we could achieve 100% survival in B6 mice (N=19), survival was poor in FVB mice. Fifty percent of the FVB mice died 24-48 hours after surgery, with the other 50% dying within about 2 weeks. The FVB mice that lived past the acute recovery stage developed large bezoars, which appeared to inhibit digestion. Bezoars were rare in B6 mice. In conclusion, it appears that background strain is highly relevant to survival after bariatric surgery.
Harper Kolehmainen, Gregory Farage, Zifan Yu, Khyobeni Mozhui, Saunak Sen
Real-Time Interactive Application for Large-Scale Genetic Data Analysis
We present a real-time interactive platform for rapid association scans on BXD mouse genetic data. Built as a customizable interactive Pluto notebook written in Julia, our application offers a user-friendly interface for both QTL (quantitative trait loci) and eQTL (expression quantitative trait loci) analyses. Our application is intended for genome-wide association studies (GWAS) involving large numbers of quantitative traits and the moderate sample sizes common for model systems. Our application uses the BulkLMM package as a computational backend for fast genome scans and permutation tests and BigRiverQTLPlots for plotting. Users with programming experience can easily modify the notebook interface to suit their needs, and even learn Julia. Thus, the notebook is simultaneously a data analysis tool, a teaching tool, and a prototyping tool for reactive interfaces.
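The idea of a genome scan with a permutation threshold can be sketched as follows. This is a deliberately simplified illustration in Python using ordinary single-marker linear regression; it is not the BulkLMM API and it omits the linear mixed model and kinship correction that the application relies on. All function names and the toy data are hypothetical.

```python
import math
import random
from statistics import mean


def lod_score(marker, trait):
    """LOD for a simple regression of trait on one marker: -(n/2) * log10(1 - r^2)."""
    n = len(trait)
    mx, my = mean(marker), mean(trait)
    sxy = sum((x - mx) * (y - my) for x, y in zip(marker, trait))
    sxx = sum((x - mx) ** 2 for x in marker)
    syy = sum((y - my) ** 2 for y in trait)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic marker or constant trait carries no signal
    r2 = min(sxy * sxy / (sxx * syy), 1 - 1e-12)  # guard against log10(0)
    return -(n / 2) * math.log10(1 - r2)


def genome_scan(markers, trait):
    """Return one LOD score per marker."""
    return [lod_score(marker, trait) for marker in markers]


def permutation_threshold(markers, trait, n_perm=200, alpha=0.05, seed=0):
    """Genome-wide threshold: the (1 - alpha) quantile of the maximum LOD under shuffling."""
    rng = random.Random(seed)
    shuffled = list(trait)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype-trait link
        max_lods.append(max(genome_scan(markers, shuffled)))
    max_lods.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return max_lods[index]


# Toy data: 3 markers coded 0/1 in 8 strains; the trait tracks the first marker.
markers = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
]
trait = [1.0, 1.1, 0.9, 1.2, 2.0, 2.1, 1.9, 2.2]
lods = genome_scan(markers, trait)
print(lods, permutation_threshold(markers, trait, n_perm=100))
```

With thousands of expression traits, the same scan is repeated per trait, which is why a backend optimized for bulk scans matters for interactive use.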
Vamsi K. Kodali, Francoise Thibaud-Nissen, Terence D. Murphy, and the NCBI Eukaryote Annotation team
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
Unlocking Rat Genomics through the NIH Comparative Genomics Resource (CGR) at NCBI
The National Center for Biotechnology Information (NCBI) is developing the NIH Comparative Genomics Resource (CGR) to maximize the impact of eukaryotic research organisms and their genomic data on biomedical research. CGR facilitates reliable comparative genomics analyses for all eukaryotic organisms through community collaboration and an NCBI genomics toolkit to promote high data quality and easy data access and reuse. Recent developments include improvements to genome contaminant identification, better automated genome annotation, a clustered BLAST database, a graphical Comparative Genome Viewer (CGV), and new web and programmatic interfaces for streamlined access to genomic data. This talk will explore the new tools and datasets with a focus on Rattus norvegicus. NCBI RefSeq’s latest annotation for the mRatBN7.2 assembly includes curation of over 9000 protein-coding genes (40% of total) to maximize quality, with extensive integration with the Rat Genome Database (RGD) through NCBI’s Gene resource. The mRatBN7.2 assembly can be explored in NCBI’s Genome Data Viewer (GDV) relative to gene expression and variation data. CGV enables graphical browsing of whole genome alignments to assemblies from human, other rodents, and additional rat strains, with preliminary RefSeq annotations available for several additional strains. These resources and more are available for a surging number of organisms across the tree of life to enable the next wave of biomedical advances.
This work was supported by the National Center for Biotechnology Information of the National Library of Medicine (NLM), National Institutes of Health.
Gregory Farage1, Zifan Yu1, Karl Broman2, Śaunak Sen1*
1Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, TN 38163, USA 2Department of Biostatistics and Medical Informatics, University of Wisconsin - Madison, Madison, WI 53706, USA
*sen@uthsc.edu
Comparison of linear mixed model genome scans using individual level data vs strain mean data
When performing genome scans in recombinant inbred strain panels such as the BXD family of strains, a common question is: should I use the trait values measured on individuals, or can I use the strain means as the trait? The former has greater storage and computational requirements than the latter. The latter, being a summary, could be subject to information loss. We compared four distinct strategies for genetic mapping using Linear Mixed Models (LMMs): individual-level data, strain means, strain means weighted by sample size, and weighted strain means augmented by within-strain variance. We used real and simulated data derived from 1,096 BXD mice and over 20,000 genetic markers downloaded from GeneNetwork. We employed the Julia package BulkLMM and a "Leave-One-Chromosome-Out" (LOCO) approach for the scans. We considered tradeoffs between computational resources and statistical efficiency. Our results indicate that while individual and summary data may give different results, we can approximate the individual-level analysis using summary data by adjusting our LMM.
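The relationship between individual-level data and weighted strain means can be illustrated with a small fixed-effects sketch. It is not the LMM analysis or the BulkLMM code used in the abstract: the panel, the single marker, the effect size, and the helper `fit` are all hypothetical, and kinship and random effects are left out. Under those assumptions, least squares on strain means weighted by sample size reproduces the individual-level marker estimate, whereas unweighted strain means generally do not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy panel: 30 strains, 2 to 12 animals each, one biallelic marker.
n_strains = 30
sizes = rng.integers(2, 13, size=n_strains)
geno = (np.arange(n_strains) % 2).astype(float)   # 0 = B allele, 1 = D allele

# Individual-level trait: intercept + marker effect + residual noise (all assumed).
strain_id = np.repeat(np.arange(n_strains), sizes)
y = 1.0 + 0.5 * geno[strain_id] + rng.normal(0.0, 1.0, size=strain_id.size)


def fit(X, y, w=None):
    """Ordinary or weighted least squares, via rescaling rows by sqrt(weight)."""
    if w is None:
        w = np.ones(len(y))
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta


# Strategy 1: individual-level data.
X_ind = np.column_stack([np.ones(strain_id.size), geno[strain_id]])
beta_ind = fit(X_ind, y)

# Strategy 2: unweighted strain means.
means = np.bincount(strain_id, weights=y) / sizes
X_str = np.column_stack([np.ones(n_strains), geno])
beta_mean = fit(X_str, means)

# Strategy 3: strain means weighted by sample size.
beta_wmean = fit(X_str, means, w=sizes)

print("individual-level :", beta_ind)
print("strain means     :", beta_mean)
print("weighted means   :", beta_wmean)
```

The exact agreement holds here because the marker genotype is constant within a strain. Once a strain-level random effect is added, the appropriate weights also depend on the variance components, which is one way to read the abstract's remark about adjusting the LMM.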
Kauthar M. Omar1,2, Hélène Tonnelé2, Felipe Morillo2, Felix Lisso1,2, Pjotr Prins1,3, Lenny Mwagandi1, Amelie Baud2
1Pwani University, Kilifi, Kenya. 2Centre for Genomic Regulation, Barcelona, Spain. 3University of Tennessee Health Science Center, Memphis, TN, USA.
QUANTIFYING ALLO-COPROPHAGY IN LABORATORY RATS FROM DEEP SHOTGUN SEQUENCING DATA OF THE GUT
The transmission of commensal bacteria can occur through physical contact and cohabitation, as well as through the consumption of water contaminated with fecal microorganisms. In both scenarios, the transfer of commensal microbes may have significant implications for health. An improved understanding of microbial transfers will provide valuable insights for the development of probiotics and the use of fecal microbiota transfers in disease treatment. Allo-coprophagy in rats is a good model for investigating the transmission dynamics of commensal microbes. Allo-coprophagy is a naturally occurring process observed in rodents in which an individual consumes the feces of another individual of the same species, leading to the transfer of microbes from one animal to another. The objective of this study is to detect and quantify allo-coprophagy in outbred laboratory rats, in order to understand the transmission of microorganisms that it facilitates. This work is based on the concept that the DNA found in a rat’s gut derives both from the rat’s own cells and from gut epithelial cells shed by its cage mate, which are consumed through allo-coprophagy and can be identified via sequencing. In this study, we extracted DNA from the gut of 32 pair-housed "heterogeneous stock" (HS) rats, performed deep shotgun sequencing, and used only the reads that aligned to the rat reference genome. Pre-existing statistical methodologies[1] were employed to detect and quantify instances of DNA mixtures. An additional statistical analysis was conducted to ascertain whether the observed mixture was a result of contamination and/or allo-coprophagy. One rat was identified as having a substantial amount of cage mate DNA in its gut. However, when considering the entire dataset of 32 experimental rats, the data supporting allo-coprophagy were not statistically significant.
This study shows the possibility of employing deep shotgun sequencing of the gut as a means to quantitatively assess a process that plays a crucial role in the transmission of microbes across individuals.
Śaunak Sen1*, Gregory Farage1
1Division of Biostatistics, Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, TN 38163, USA
Julia software for genetic analysis and omics data
Julia is an attractive programming language well-suited for large-scale biomedical data science due to its speed, user-friendly syntax, and rapidly growing package ecosystem. Our group chose Julia because of ease of development, fast runtimes without using a low-level language such as C/C++, and good support for parallelization. We present our software packages for genetic mapping and omics data analysis: FlxQTL, BulkLMM, and BigRiverQTLPlots for genetic mapping; MatrixLM, MatrixLMnet, and WolfRiverPlots for matrix-valued data analysis; Helium, GeneNetworkAPI, and MetabolomicsWorkbenchAPI for data utilities. If you are curious or enthusiastic about using Julia for your research, we invite you to engage with us for further discussion. More information about our packages can be found at: https://senresearch.github.io/.
Jeremiah R. Holt1,2,4, Sidharth S. Mahajan1,2, Samson Eugin Simon1,2, Laura M. Sipe3, Boston W. Simmons1,2, Sydney C. Joseph1,2, Casey J Bohl4, Sandesh J. Marathe1,2, D. Neil Hayes1,2,4, Lu Lu4, Robert W. Williams2,4, David G. Ashbrook2,4, Liza Makowski1,2,4
1Division of Hematology and Oncology, Department of Medicine, College of Medicine, University of Tennessee Health Science Center (UTHSC); Memphis, TN 38163, USA 2UTHSC Center for Cancer Research, College of Medicine, UTHSC; Memphis, TN 38163, USA 3Department of Biology, University of Mary Washington; Fredericksburg, Virginia 22401, USA 4Department of Genetics, Genomics, and Informatics, UTHSC; Memphis, TN 38163, USA
Unsupervised clustering of tumors from preclinical models of triple-negative breast cancer reveals distinct patterns of gene expression and pathway enrichment that associate with BXD strain and immunity
Despite advances in early detection and treatment, breast cancer (BC) remains a significant global health concern, due to its nature as a heterogeneous group of malignancies that vary in their molecular characteristics, clinical presentation, and outcomes. Triple-negative breast cancer (TNBC), which represents around 15-20% of all breast cancers and is associated with aggressive clinical characteristics, contributes disproportionately to BC mortality because of a general lack of targeted treatments and its increased prevalence in underrepresented patient populations. Understanding the specific molecular mechanisms that drive TNBC tumor formation and aggressiveness, including patterns of gene expression and their underlying genetic modifiers, will improve our understanding and treatment of the disease. To decipher the molecular elements influencing TNBC tumorigenesis, and to test our hypothesis that genetic variants influence the formation of aggressive BCs with distinct molecular and clinical characteristics, we have taken advantage of two well-established murine models. The BXD mouse family was created by crossing two strains, C57BL/6J ("B") and DBA/2J ("D"), producing recombinant inbred lines (RILs) that have a consistent genetic background that can be reliably reproduced. BXD mice were then crossed with a model of TNBC, the C3(1)-SV40 T-Large antigen genetically engineered mouse model (GEMM), resulting in BXD-BC F1 progeny in which females develop breast tumors that lack detectable levels of hormone receptors and resemble human TNBC. These F1 mice are isogenic hybrids, displaying significantly heritable variations in their presentation of TNBC characteristics such as tumor latency, multiplicity, and survival. Hierarchical clustering of gene expression data generated via bulk RNA sequencing of primary tumors from 95 of these mice (representing 27 individual strains) identified four statistically significant gene expression subtypes.
Interestingly, these distinct classes of murine tumors exhibit patterns of gene regulation and subsequent enrichment of molecular pathways that mirror established TNBC subtypes in humans, including those pathways involving epithelial differentiation, growth factor signaling, epithelial-mesenchymal transition (EMT), proliferation, and immune invasion. Additionally, twelve of the BXD strains showed a strong correlation with two of these gene expression clusters, suggesting there are heritable genetic modifiers contributing to the formation of tumors with certain molecular features. Overall, by studying the complex relationship of genotype and phenotype in a murine model of TNBC, we may improve our ability to treat individuals suffering from this disease in the clinical setting.
Apurva S. Chitre1, Hannah Bimschleger2, Katarina Cohen2, Riyan Cheng2, Denghui Chen1, JianJun Gao2, Katie Holl3, Alesa Hughson4, Aidan Horvath4, Benjamin B. Johnson2, Thiago Missfeldt Sanches2, Celine L. St. Pierre5, Daniel Munro2,6, Khai-Minh Nguyen2, Tengfei Wang7, Nazzareno Cannella8, David Dietz9, Hao Chen7, Roberto Ciccocioppo8, Shelly B. Flagel4,11, Gary Hardiman17, Keita Ishiwari9,10, Thomas Jhou12, Peter W. Kalivas12, Brittany N. Kuhn12, Suzanne H. Mitchell13, Oksana Polesskaya2, Jerry B. Richards9,10, Terry E. Robinson14, Leah C. Solberg Woods15, Abraham A. Palmer2,16*
1 Bioinformatics and System Biology Program, University of California San Diego. 2 Department of Psychiatry, University of California San Diego. 3 Department of Physiology, Medical College of Wisconsin. 4 Department of Psychiatry, University of Michigan. 5 Department of Genetics, Washington University in St Louis. 6 Department of Integrative Structural and Computational Biology, Scripps Research. 7 Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center. 8 School of Pharmacy, University of Camerino. 9 Department of Pharmacology and Toxicology, State University of New York at Buffalo. 10 Clinical and Research Institute on Addictions, State University of New York at Buffalo. 11 Michigan Neuroscience Institute, University of Michigan. 12 Department of Neuroscience, Medical University of South Carolina. 13 Departments of Behavioral Neuroscience and Psychiatry, Oregon Institute of Occupational Health Sciences, Oregon Health & Science University. 14 Department of Psychology, University of Michigan. 15 Department of Internal Medicine, Wake Forest School of Medicine. 16 Institute for Genomic Medicine, University of California San Diego. 17 School of Biological Sciences, Queen's University Belfast.
*Corresponding author (aap@ucsd.edu)
Genetic analysis of multiple measures of locomotor activity in 7,895 outbred Heterogeneous Stock rats
Background Locomotor activity has been equated with a constellation of related personality dimensions including extraversion, externalizing behaviors and sensation seeking. Locomotor behavior is also of significant biological interest because it is correlated with measures of anxiety, substance abuse and other behaviors, and because it can sometimes confound other more complicated behavioral measures. Data collected from many different collaborating projects, as outlined at www.ratgenes.org, have assessed behavioral traits relevant to drug abuse in thousands of heterogeneous stock (HS) rats. Leveraging this dataset, we conducted a comprehensive genome-wide association study (GWAS) on locomotor behavior and examined genetic correlations across multiple cohorts.
Materials and methods The cohorts differed in age, and procedural details like size of arena and test length. All subjects were N/NIH HS rats, which were derived from an intercross among 8 inbred strains and have been maintained as an outbred population for almost 100 generations. Locomotor traits were measured in 9 phenotyping centers presented in Table 1. We estimated the SNP heritability (h2) using GCTA-GREML. We used the GCTA Bivariate GREML analysis to estimate the genetic correlation (rg). We performed GWAS using the linear mixed model approach using GCTA MLMA-LOCO. We also integrated eQTL data from different tissues to help identify putatively causal genes.
Results SNP heritability estimates for the different locomotor measures ranged from 0.14 to 0.40. We observed a high genetic correlation across these measures, with rg values ranging from 0.17 ± 0.20 to 0.77 ± 0.14. Given this observation, we opted for a meta-analysis, leveraging the large sample size. This approach led to the identification of 8 independent Quantitative Trait Loci (QTLs), with interval sizes ranging from 0.43 Mb to 4.29 Mb. Notably, for the meta-analyzed data, the QTL at chr10 26.09Mb had eQTLs and coding polymorphisms for several GABA receptor genes such as Gabra1, Gabrb2, and Gabra6. Gabrb2-knockout mice, as highlighted in prior studies, manifest an array of symptoms such as prepulse inhibition (PPI) deficits, heightened locomotor activity, and sociability impairments, among others [Yeung et al. 2018]. We observed another QTL at chr14 70.52Mb, which had eQTLs for Drd5 and Slc2a9. Past research on Drd5 highlighted its influence on specific dopamine-related behaviors, especially exploratory locomotion [Holmes et al., 2001].
Conclusion Our research, one of the largest rodent GWAS to date, begins pinpointing the specific genetic loci responsible for inter-individual variability in this essential behavioral metric.
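The abstract does not state which meta-analysis method was used to pool the cohorts. As a hedged illustration only, the sketch below shows fixed-effect inverse-variance weighting, a common way to combine per-cohort SNP effect estimates. The function name `ivw_meta` and the example effect sizes and standard errors are hypothetical, not values from the study.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis of one SNP.

    betas: per-cohort effect estimates; ses: their standard errors.
    Returns the pooled effect, its standard error, z-score, and two-sided p-value.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal tail probability
    return beta, se, z, p


# Hypothetical per-cohort estimates for a single marker.
beta, se, z, p = ivw_meta(betas=[0.10, 0.14, 0.08], ses=[0.05, 0.04, 0.06])
print(f"pooled beta={beta:.3f}, se={se:.3f}, z={z:.2f}, p={p:.2e}")
```

Cohorts with smaller standard errors receive more weight, so the pooled standard error is smaller than that of any single cohort. This is the sense in which a meta-analysis takes advantage of the combined sample size.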
Abraham A Palmer1, Khai-Minh Nguyen1, Clara Ortez1, Denghui Chen1, Apurva Chitre1, Leah Solberg Woods2, Thiago Sanches1, Riyan Cheng1, Oksana Polesskaya1
1Department of Psychiatry, University of California San Diego, La Jolla, CA 2Wake Forest School of Medicine, Winston-Salem, NC.
RATTACA: a new paradigm for examining genetic correlations in outbred rats
Genetic correlations between traits are frequently studied in humans and model systems as a first step towards identifying causal pathways and mechanisms. In model systems, genetic correlations have been studied using several approaches. One of the simplest, but also one of the most prone to misinterpretation, is to examine correlations in pairs of inbred strains. Using this approach, a pair of strains that is divergent for one trait might be examined to determine if a second, putatively causal factor also differs between the two strains. Such an approach can be misleading because numerous traits may differ between a pair of strains without having any causal relationship. Another approach would be to examine two factors in an outbred population; however, any observed correlations could be due to genetic or environmental causes. Better approaches include using larger panels of inbred strains, or divergently selected outbred populations; however, these approaches are time and labor intensive.
Here we introduce a novel experimental paradigm that we are calling RATTACA, in which phenotypes are predicted in naïve rats using extant rat GWAS data. Prediction is based on standard polygenic methods (e.g. BLUP) that are already widely used in agricultural and human genetics. Performance improves with the heritability of the trait and with the sample size of the GWAS training data. RATTACA allows us to produce cohorts of rats that are predicted to be divergent for a trait. These divergent cohorts can be examined for a second, putatively correlated trait to see if the second trait is genetically correlated with the first trait that was used for prediction. One critical advantage of this approach is that by using prediction rather than directly measuring the first trait, the second trait can be measured in naïve rats. For example, we have examined slice electrophysiological traits in rats that were predicted to be divergent for cocaine self-administration. Because none of the rats were exposed to cocaine, electrophysiological differences are not secondary to differential cocaine exposure. This paradigm is especially attractive for lower throughput traits since they cannot be easily measured in large cohorts. We can also produce rats that are predicted to be divergent for multiple traits. Another possible application is to examine rats that are predicted to be divergent for the expression of one or more genes. Finally, for genes with LoF mutations, cohorts could be produced that resemble wild type, heterozygous and knock out populations, but with the benefit of an outbred genetic background, thus improving robustness. We are currently providing cohorts for free or at deeply subsidized prices to qualified investigators.
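The prediction-then-selection idea can be sketched with simulated data. This is a minimal illustration, not the RATTACA pipeline: it uses the ridge-regression form of BLUP (often called RR-BLUP), and the sample sizes, marker count, heritability, and the 20% cohort cutoff are all assumed for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

n_train, n_new, n_snps = 400, 100, 200   # hypothetical sizes
h2 = 0.4                                 # assumed heritability of the first trait

# Simulated genotypes coded 0/1/2 and small additive effects at every SNP.
G_train = rng.binomial(2, 0.5, size=(n_train, n_snps)).astype(float)
G_new = rng.binomial(2, 0.5, size=(n_new, n_snps)).astype(float)
marker_var = h2 / (n_snps * 0.5)         # genotype variance is 0.5 per SNP here
effects = rng.normal(0.0, np.sqrt(marker_var), size=n_snps)

g_new = G_new @ effects                  # true genetic values (unknown in practice)
y_train = G_train @ effects + rng.normal(0.0, np.sqrt(1.0 - h2), size=n_train)

# Ridge regression on centered genotypes; the penalty is the ratio of residual
# variance to per-marker effect variance, as in RR-BLUP.
col_means = G_train.mean(axis=0)
Z_train = G_train - col_means
Z_new = G_new - col_means
lam = (1.0 - h2) / marker_var
u_hat = np.linalg.solve(
    Z_train.T @ Z_train + lam * np.eye(n_snps),
    Z_train.T @ (y_train - y_train.mean()),
)

# Predict the first trait in naive animals from genotype alone.
pred_new = y_train.mean() + Z_new @ u_hat

# Form divergent cohorts: lowest and highest 20% of predicted values.
order = np.argsort(pred_new)
k = n_new // 5
low_cohort, high_cohort = order[:k], order[-k:]

accuracy = np.corrcoef(pred_new, g_new)[0, 1]
print(f"correlation of prediction with true genetic value: {accuracy:.2f}")
```

In this sketch the second trait would then be measured only in `low_cohort` and `high_cohort`, none of which were ever phenotyped for the first trait. Increasing `h2` or `n_train` in the simulation raises `accuracy`, in line with the abstract's statement about heritability and training sample size.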
Montana Kay Lara1, Trevor Hamilton Wolf1, Emily E. Dean1, Khalil Samir AbedRabbo1, Benjamin Wilson2, Anna L. Tyler2, Amanda E. Hernan1,3,4, Rod C. Scott3,4, J. Matthew Mahoney1,2*
1Department of Neurological Sciences, University of Vermont, Burlington, VT, United States 2The Jackson Laboratory, Bar Harbor, ME, United States 3Division of Neuroscience, Nemours Children’s Health, Wilmington, DE, United States 4Department of Psychological and Brain Sciences, University of Delaware, Newark, DE, United States
*matt.mahoney@jax.org
Genetic Modifiers Cause Chronic Epilepsy in a Haploinsufficiency Mouse Model of Tuberous Sclerosis Complex
Tuberous Sclerosis Complex (TSC) is an autosomal dominant, multi-system disorder caused by loss-of-function (LoF) mutations in the TSC1 or TSC2 genes, and is a leading genetic cause of epilepsy, which affects most patients. To date, no mouse model recapitulates epilepsy in adulthood due to Tsc1 or Tsc2 heterozygous LoF, despite multiple backgrounds having been tested for TSC-associated outcomes. This is in stark contrast to human patients, whose causal mutation is always heterozygous LoF. We hypothesized that varying genetic background with a heterozygous Tsc1 LoF mutation using the neurologically diverse BXD strains could recapitulate TSC-associated epilepsy. In a proof-of-concept study, we generated a new congenic strain on a pure C57BL/6J (B6) background to create B6-Tsc1+/- mice, which we bred to make F1 hybrids with the seizure-susceptible DBA/2J (D2) strain (D2-Tsc1+/-) and the BXD87/RwwJ (BXD87) strain (BXD87-Tsc1+/-). Using five-day video-electroencephalography (video-EEG) monitoring, we screened all strains for seizures in adulthood. Neither the B6-Tsc1+/- strain nor the D2-Tsc1+/- F1 hybrid had seizures in adulthood, but half of the female BXD87-Tsc1+/- mice did. The BXD87-Tsc1+/- model is the first construct-valid and face-valid model of TSC-associated epilepsy and demonstrates the importance of genetic modifiers to TSC-associated epilepsy in mouse models. Genetically, the transgressive segregation of epilepsy in BXD87-Tsc1+/- mice implies a complex genetic architecture, while the sex specificity implies a potential X-linked modifier. These data establish that genetic modifiers in the BXD population can cause chronic epilepsy in a mouse model and support future genetic mapping studies to detect these modifiers.
Khyobeni Mozhui
The University of Tennessee Health Science Center College of Medicine, Preventive Medicine, Memphis, TN USA
Genetics of epigenetics, entropy increase, and aging
DNA methylation (DNAm) is influenced by genetic and non-genetic factors. We used a “pan-mammalian” DNAm microarray to profile methylation at highly conserved CpGs and performed quantitative trait locus (QTL) mapping in liver tissue from mouse strains belonging to the BXD Family. We find that epigenetic entropy increases with age, and the progress from order to disorder can be related to genotype-dependent life expectancy. A regulatory hotspot on chromosome 5 had trans-acting influence on numerous distant CpGs (trans-meQTL). We refer to this meQTL hotspot as meQTL.5a. The trans-modulated CpGs showed age-dependent changes, and were enriched in developmental genes, including several members of the MODY pathway (maturity onset diabetes of the young). The joint modulation by genotype and aging resulted in a more “aged methylome” for BXD strains that inherited the DBA/2J parental allele at meQTL.5a. Further, several gene expression traits, body weight, and lipid levels mapped to meQTL.5a, and there was a modest linkage with lifespan. DNA binding motif and protein-protein interaction enrichment analysis identified the hepatic nuclear factor, Hnf1a (MODY3 gene in humans), as a strong candidate. The pleiotropic effects of meQTL.5a could contribute to variation in body size and metabolic traits, and influence CpG methylation and epigenetic aging that could have an impact on lifespan.
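The abstract does not define how epigenetic entropy was computed. One commonly used definition, shown here purely as an assumption, is the mean binary Shannon entropy of the methylation beta-values: a CpG that is fully methylated or fully unmethylated contributes 0, and a CpG at 50% methylation contributes 1. The function name and the example beta-values are hypothetical.

```python
import numpy as np


def methylation_entropy(betas, eps=1e-12):
    """Mean binary Shannon entropy (in bits) across CpG beta-values in [0, 1]."""
    b = np.clip(np.asarray(betas, dtype=float), eps, 1.0 - eps)  # avoid log2(0)
    per_site = -(b * np.log2(b) + (1.0 - b) * np.log2(1.0 - b))
    return float(per_site.mean())


# Hypothetical profiles: sites drift from near 0 or 1 towards intermediate values.
young = [0.02, 0.05, 0.95, 0.98]
old = [0.15, 0.20, 0.80, 0.85]
print(methylation_entropy(young), methylation_entropy(old))
```

Under this definition, a drift of methylation levels away from the extremes raises the entropy, which matches the abstract's description of a progression from order to disorder with age.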
Ted Kalbfleisch1, Kai Li, Elizabeth Hudson, Eric Kline, Melissa Laird-Smith, and Peter Doris
Quality and Completeness Assessments of Reference Genome Assemblies for Inbred Rat Strains Important as Models of Complex Disease
Here we describe reference quality genome assemblies for 6 rat strains that are important as models for complex disease. These strains include the stroke-prone spontaneously hypertensive SHRSP/BbbUtx, and stroke-resistant SHR/Utx, Wistar/Kyoto WKY/Bbb, Brown Norway BN/NHsdMcwi, Long Evans/Stm, and Fischer 344/Stm. In addition to the new technologies and bioinformatic pipelines that have emerged to build the genomes, new methods are now available to assess the completeness of the genomes we are creating. There are many regions of genomes that remain intractable, even with the very accurate, very contiguous sequence data we are able to generate. Nevertheless, it is possible to assess how complete an assembly is for the regions we can characterize. Here, we present the results of tools such as BUSCO (minimum completeness 99.7%), Merqury, and others. Although these genomes are neither complete nor correct in an absolute sense, the analyses run to date suggest that nearly all regions that are tractable with either short or long read sequence data are represented accurately and comprehensively in these new assemblies.