Dr. Michael Boehnke (University of Michigan), who pioneered large-scale studies identifying genetic risk in diabetes and bipolar disorder, shared with us some insights about recent advances in exome and genome sequencing and their application to understanding the disease biology and etiology of psychiatric disorders, the role of statistical data analysis in overcoming the burden of multiple testing, and how the genome can influence the epigenome to modulate gene expression and risk of type 2 diabetes.
Dr. Boehnke is a true pioneer in precision medicine, having conducted a number of the first large-scale studies to identify genetic risk in diabetes and bipolar disorder. As a biostatistician, Dr. Boehnke’s research focuses on the genetic dissection of complex traits. In his 35-year career he has developed methods for analysis of human pedigrees, examined the history of breast cancer in genetically at-risk individuals, and contributed important discoveries on the genetics of type 2 diabetes and related traits, such as obesity and blood lipid levels. He has served on the University of Michigan faculty since 1984, focusing on problems of study design and statistical analysis of human genetic data, with a particular emphasis on the development and application of statistical methods for human gene mapping. His current focus is on disease and trait association studies based on genome sequence and genotype-array data. Read his full bio.
PMWC 2018 Michigan takes place June 6-7, 2018.
Q&A with Michael Boehnke
Q: Little is known about the molecular basis of mood and psychotic disorders such as bipolar and schizophrenia. How do genetic studies using whole genome or exome analysis provide us an insight for the development of novel drugs, therapies, and preventive strategies?
A: Genome-wide association studies (GWAS) based on genotype arrays or sequencing identify genetic variants associated with any disease (or trait), including these psychiatric disorders. The availability of low-cost arrays assaying millions of sites in the genome, together with clever statistical and computational tools, now allows us to assay all but the rarest or most complex genetic variation in hundreds of thousands or even millions of individuals. Exome and genome sequencing allow near-complete assay even of very rare genetic variation, but sequencing costs need to fall even further to allow the sample sizes we need to identify disease-associated variants with high statistical confidence. Each disease-associated region we identify provides a potential entry point to understand disease biology and etiology, to suggest targets for new drugs, or to better target existing drugs to people for whom they will be helpful and not harmful. Taking the step from associated variant to causal mechanism to a drug is challenging, and is a major focus for both academic and pharmaceutical researchers. The good news is that drug targets suggested by genetic studies have a substantially higher rate of progressing through the drug development pipeline than those without support from genetic studies.
Q: What are the challenges and some of the solutions you developed for analyzing genome or exome sequence data from 10,000s of individuals?
A: Analysis of sequence data challenges us computationally and statistically. A BAM file which includes the complete information for a single human genome requires 25 × 10^9 bytes (25 gigabytes) of computer storage, so our NHLBI-funded TOPMed project, which has to date sequenced >120,000 genomes, requires storage of 3 × 10^15 bytes (3 petabytes) of data. It was only a few years ago I learned what the prefix peta meant! Dealing with that much data requires careful consideration of issues such as minimizing data transfers and avoiding multiple data copies. While the cost of sequence data generation has dropped by many orders of magnitude, the cost of computer storage has dropped more slowly, making careful data management critical.
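Editor's note: the storage figure above follows from simple multiplication, and a short sketch makes the scale concrete. The per-genome size and genome count are the numbers quoted in the answer; the function name and the idea of counting extra copies are illustrative assumptions, not a description of how TOPMed actually manages its data.

```python
# Back-of-the-envelope storage estimate using the figures quoted in the interview.
# The helper name and the "copies" parameter are hypothetical, for illustration only.

BYTES_PER_GENOME_BAM = 25 * 10**9   # 25 gigabytes per genome (quoted figure)
N_GENOMES = 120_000                 # lower bound on genomes sequenced (quoted figure)
BYTES_PER_PETABYTE = 10**15


def total_storage_bytes(n_genomes, bytes_per_genome=BYTES_PER_GENOME_BAM, copies=1):
    """Return the bytes needed to hold `copies` full copies of every BAM file."""
    return n_genomes * bytes_per_genome * copies


one_copy = total_storage_bytes(N_GENOMES)
two_copies = total_storage_bytes(N_GENOMES, copies=2)

print(f"One copy:   {one_copy / BYTES_PER_PETABYTE:.1f} PB")
print(f"Two copies: {two_copies / BYTES_PER_PETABYTE:.1f} PB")
```

Every additional full copy adds another 3 petabytes, which is why avoiding duplicate copies matters at this scale.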
Statistical analysis of such large data files also is challenging and has required us to develop analysis software that is computationally very efficient. To test for association with many millions of sites in the genome requires extreme levels of statistical significance to overcome the burden of multiple testing, so that many of our standard statistical tests are no longer well behaved and have to be modified. For example, we use modified disease association tests when the number of cases with disease is much smaller than the number of controls without disease. Carrying out so many tests on such large samples also requires very careful quality control to avoid even a low rate of false positives that would swamp true association signals with spurious ones. For example, we developed methods to identify and discard DNA samples that are contaminated by DNA from another person.
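Editor's note: to see why testing millions of sites demands extreme significance levels, consider a minimal sketch using the Bonferroni correction. This is a standard textbook adjustment chosen here for illustration; the interview does not say which correction Dr. Boehnke's group uses, and the one-million-test count is an assumed round number, not a figure from the interview.

```python
# Illustration of the multiple-testing burden with a Bonferroni correction.
# ASSUMPTIONS: Bonferroni is a generic choice, not the method named in the
# interview; the number of tests is a hypothetical round figure.
import math

FAMILY_WISE_ALPHA = 0.05       # conventional overall false-positive rate
N_TESTS = 1_000_000            # hypothetical number of independent tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def expected_false_positives(per_test_alpha, n_tests):
    """Expected number of false positives if every null hypothesis is true."""
    return per_test_alpha * n_tests


threshold = bonferroni_threshold(FAMILY_WISE_ALPHA, N_TESTS)
uncorrected = expected_false_positives(FAMILY_WISE_ALPHA, N_TESTS)
corrected = expected_false_positives(threshold, N_TESTS)

print(f"Per-test threshold: {threshold:.1e} (-log10 = {-math.log10(threshold):.2f})")
print(f"Expected false positives without correction: {uncorrected:,.0f}")
print(f"Expected false positives with correction:    {corrected:.2f}")
```

Under these assumptions, an uncorrected 0.05 cutoff would be expected to produce tens of thousands of spurious hits, which is the "swamping" of true signals described above.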
Q: What are the challenges we face and the opportunities that exist in resolving the complex processes underlying common diseases such as breast cancer and obesity?
A: We geneticists always need to keep in mind that genetics is just a small part of the overall picture, and that environmental and behavioral factors also are critical to health and disease. Still, genetic information has the advantages that it is simple (a 4-letter alphabet), finite (3 billion base pairs is a lot, but is finite), and does not change (so we can measure it once and use it forever), whereas behaviors and the environment change all the time, and measuring and summarizing them is truly challenging. GWAS identify genetic regions associated with disease, providing valuable entry points to understand human biology and disease. We seek to move from genomic regions to specific causal variants, genes, and pathways, which in turn can illuminate the complex causal processes underlying these and other diseases.
Q: Last year you published a paper on genetic regulatory signatures that are associated with increased risk for type 2 diabetes. What is the significance of this discovery, and could it help lead to more personalized treatments for diabetes?
A: My research group and our collaborators are working to understand the genetic basis of type 2 diabetes. Our work has identified hundreds of regions in the human genome that impact risk of type 2 diabetes and variability in diabetes-related traits like glucose and insulin levels. An important next step is to identify the specific genes and genetic variants involved, and their mechanisms of action. In Varshney et al., we presented an integrated analysis of human pancreatic islet molecular profiling data. We found that genetic variants associated with type 2 diabetes are more frequently present in regions of the genome where the transcription factor Regulatory Factor X (RFX) is predicted to bind in an islet-specific manner, and that genetic variants that increase type 2 diabetes risk are predicted to disrupt RFX binding. Our findings provide a molecular mechanism by which the genome can influence the epigenome, and so modulate gene expression and risk of type 2 diabetes. It is our hope that these sorts of mechanistic insights, which result from combining molecular data on open chromatin and gene expression with other sources of genomic annotation, can help pinpoint the functional mechanisms underlying type 2 diabetes and lead to better understanding of type 2 diabetes etiology and treatment.
The Precision Medicine World Conference (PMWC), in its 17th installment, will take place in the Santa Clara Convention Center (Silicon Valley) on January 21-24, 2020. The program will traverse innovative technologies, thriving initiatives, and clinical case studies that enable the translation of precision medicine into direct improvements in health care. Conference attendees will have an opportunity to learn first-hand about the latest developments and advancements in precision medicine and cutting-edge new strategies and solutions that are changing how patients are treated.
See 2019 Agenda highlights:
- Five tracks will showcase sessions on the latest advancements in precision medicine which include, but are not limited to:
- AI & Data Science Showcase
- Clinical & Research Tools Showcase
- Clinical Dx Showcase
- Creating Clinical Value with Liquid Biopsy ctDNA, etc.
- Digital Health/Health and Wellness
- Digital Phenotyping
- Diversity in Precision Medicine
- Drug Development (PPPs)
- Early Days of Life Sequencing
- Emerging Technologies in PM
- Emerging Therapeutic Showcase
- FDA Efforts to Accelerate PM
- Gene Editing
- Genomic Profiling Showcase
- Immunotherapy Sessions & Showcase
- Implementation into Health Care Delivery
- Large Scale Bio-data Resources to Support Drug Development (PPPs)
- Microbial Profiling Showcase
- Microbiome
- Neoantigens
- Next-Gen. Workforce of PM
- Non-Clinical Services Showcase
- Pharmacogenomics
- Point-of-Care Dx Platform
- Precision Public Health
- Rare Disease Diagnosis
- Resilience
- Robust Clinical Decision Support Tools
- Wellness and Aging Showcase
- A lineup of 450+ highly regarded speakers featuring pioneering researchers and authorities across the healthcare and biotechnology sectors
- Luminary and Pioneer Awards, honoring individuals who contributed, and continue to contribute, to the field of Precision Medicine
- 2000+ multidisciplinary attendees, from across the entire spectrum of healthcare, representing different types of companies, technologies, and medical centers with leadership roles in precision medicine