Introduction

In this final chapter, we consider the future of Mendelian randomization within the wider context of genetic epidemiology. We divide the chapter into two sections. First, we discuss methodological developments in instrumental variable techniques that enable more sophisticated Mendelian randomization analyses. Second, we discuss applied developments, such as advances in genotyping and other high-throughput cell biology techniques, that widen the scope for future Mendelian randomization analyses.

Conclusions

In conclusion, there are still areas of ongoing methodological research in Mendelian randomization, and work is needed to translate existing and future methodological developments into the context of Mendelian randomization for applied researchers. This research is fueled to a large extent by increasing data availability: new exposure variables, increasing detail of genetic measurements, and publicly-available data resources. These data are likely to provide further insights into causal mechanisms, and further scope for methodological and applied developments in the future.

Relevant papers to chapter:

Section 11.1.2 (Non-linear exposure–outcome relationships). S. Burgess, N.M. Davies, S.G. Thompson. Instrumental variable analysis with a nonlinear exposure–outcome relationship. Epidemiology 2014; 25(6):877-885.

Section 11.1.3 (Untangling the causal effects of related exposures). S. Burgess, S.G. Thompson. Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. Am. J. Epidemiol. 2014.

Section 11.1.3 (Untangling the causal effects of related exposures). S. Burgess, D.F. Freitag, H. Khan, D.N. Gorman, S.G. Thompson. Using multivariable Mendelian randomization to disentangle the causal effects of lipid fractions. PLoS One 2014; 9(10):e108891.

Sections 11.1.4 (Elucidating the direction of causation) and 11.1.5 (Investigating indirect and direct effects). S. Burgess, R.M. Daniel, A.S. Butterworth, S.G. Thompson, EPIC-InterAct. Network Mendelian randomization: using genetic variants as instrumental variables to investigate mediation in causal pathways. Int. J. Epidemiol. 2014.

Section 11.2.4 (Published data and two-sample Mendelian randomization). S. Burgess, R.A. Scott, N.J. Timpson, G. Davey Smith, S.G. Thompson. Using published data in Mendelian randomization: a blueprint for efficient identification of causal risk factors. Eur. J. Epidemiol. 2015.

Introduction

Much of this book has been motivated and illustrated by data collected by the CRP CHD Genetics Collaboration (CCGC). In this chapter, we analyse the entirety of the CCGC data to estimate the causal effect of C-reactive protein (CRP) on coronary heart disease (CHD) risk. The analysis illustrates both the Mendelian randomization approach and several of the methodological issues highlighted in this book.

Key points from chapter

  • The analyses presented in this chapter exemplify the assessment of the assumptions required to perform a Mendelian randomization analysis and the estimation of an overall causal effect.
  • The integrated analyses presented based on the totality of available data give a precise enough causal estimate to rule out even a moderately-sized causal effect of C-reactive protein on coronary heart disease risk.

Relevant papers to chapter:

CRP CHD Genetics Collaboration. Association between C reactive protein and coronary disease: Mendelian randomisation analysis based on data from individual participants. BMJ 2011; 342:d548.

S. Burgess, S.G. Thompson, CRP CHD Genetics Collaboration. Methods for meta-analysis of individual participant data from Mendelian randomization studies with binary outcomes. Stat. Meth. Med. Res. 2012.

Introduction

In the next two chapters, we consider extensions to IV methods to efficiently analyse data typically available in Mendelian randomization investigations. The first extension is the inclusion of multiple instrumental variables in a single analysis model, together with the statistical issues that arise. We consider the impact on statistical power, and discuss the practical issue of missing data, which can limit power gains.

Key points from chapter

  • Use of multiple instrumental variables in Mendelian randomization leads to more precise estimates of causal effects.
  • Sporadically missing genetic data may offset this gain, but missing data methods can recover much of the loss.
  • Parsimonious models of genetic association, and in particular allele scores, can alleviate the problems of weak instruments which may arise when using large numbers of instrumental variables.
  • The procedure for constructing an allele score to be used in an analysis should be made clear, and in particular how variants and weights for the score are chosen, as this has a considerable impact on bias.
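
The construction of an allele score can be sketched in a few lines. This is a minimal illustration, not the procedure used in the book: the function name `allele_score`, the toy genotypes and the weights are all hypothetical, and the weights are assumed to come from a source independent of the data under analysis.

```python
def allele_score(genotypes, weights=None):
    """Return one score per individual.

    genotypes: list of rows, one per individual, each holding the number of
               risk alleles (0, 1 or 2) at every variant in the score.
    weights:   one weight per variant; None gives an unweighted score,
               i.e. a simple count of risk alleles.
    """
    n_variants = len(genotypes[0])
    if weights is None:
        weights = [1.0] * n_variants
    if len(weights) != n_variants:
        raise ValueError("need exactly one weight per variant")
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


# Hypothetical data: three individuals, three variants.
genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1]]
# Hypothetical per-allele weights, assumed to be taken from an external study.
external_weights = [0.10, 0.25, 0.05]

unweighted = allele_score(genotypes)
weighted = allele_score(genotypes, external_weights)
```

The score is then used as a single instrumental variable in place of the individual variants. Reporting which of the two constructions was used, and where any weights came from, is the kind of transparency the last key point asks for.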

Relevant papers to chapter:

Section 8.2 (Allele scores). S. Burgess, S.G. Thompson. Use of allele scores as instrumental variables for Mendelian randomization. Int. J. Epidemiol. 2013; 42(4):1134-1144.

Section 8.2 (Allele scores). N.M. Davies, S.v.H.K. Scholder, H. Farbmacher, S. Burgess, F. Windmeijer, G. Davey Smith. The many weak instruments problem and Mendelian randomization. Statist. Med. 2014.

Section 8.3 (Power of IV estimates). S. Burgess. Sample size and power calculations in Mendelian randomization with a single instrumental variable and a binary outcome. Int. J. Epidemiol. 2014; 43(3):922-929.

Section 8.4 (Multiple variants and missing data). S. Burgess, S. Seaman, D. Lawlor, J.P. Casas, S.G. Thompson. Missing data methods in Mendelian randomization studies with multiple instruments. Am. J. Epidemiol. 2011; 174(9):1069-1076.

Section 8.5.2 (Subsample Mendelian randomization). B.L. Pierce, S. Burgess. Efficient design for Mendelian randomization studies: subsample and two-sample instrumental variable estimators. Am. J. Epidemiol. 2013; 178(7):1177-1184.

Introduction

In this chapter, we consider extensions to a simple Mendelian randomization analysis to include data from multiple studies. We provide methods for combining the information provided by each study in an efficient way to produce a single causal estimate. Also, we consider how to combine summarized data on genetic associations from multiple variants in a single study.

Key points from chapter

  • A pooled causal effect estimate can be obtained by combining study-level, summary-level or individual-level data.
  • A single causal effect can be estimated from published data on genetic associations with the exposure and with the outcome, either taken from a single study or from separate sources.
  • If the same genetic variants have been measured in several studies, the parameters of genetic association can be pooled in a hierarchical model across studies.
  • Studies with common genetic variants can contribute to a pooled causal effect estimate even if either the exposure or the outcome has not been measured in them.
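
As an illustration of the second key point, a causal estimate can be computed from summarized genetic associations with an inverse-variance weighted formula. The sketch below is one common form of that estimator, written under stated assumptions: the variants are uncorrelated, the uncertainty in the variant-exposure associations is ignored, and the function name `ivw_estimate` and all numbers are hypothetical.

```python
import math


def ivw_estimate(beta_x, beta_y, se_y):
    """Inverse-variance weighted causal estimate from summarized data.

    beta_x: per-allele associations of each variant with the exposure
    beta_y: per-allele associations of each variant with the outcome
    se_y:   standard errors of the variant-outcome associations
    Returns (estimate, standard error).
    """
    numerator = sum(bx * by / s ** 2 for bx, by, s in zip(beta_x, beta_y, se_y))
    denominator = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y))
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical summarized associations for three uncorrelated variants.
beta_x = [0.1, 0.2, 0.3]
beta_y = [0.02, 0.04, 0.06]
se_y = [0.01, 0.01, 0.01]

estimate, std_error = ivw_estimate(beta_x, beta_y, se_y)
```

Because the inputs are only regression coefficients and standard errors, the exposure and outcome associations may come from the same study or from separate published sources.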

Relevant papers to chapter:

S. Burgess, S.G. Thompson, CRP CHD Genetics Collaboration. Methods for meta-analysis of individual participant data from Mendelian randomization studies with binary outcomes. Stat. Meth. Med. Res. 2012.

S. Burgess, S.G. Thompson, CRP CHD Genetics Collaboration. Bayesian methods for meta-analysis of causal relationships estimated using genetic instrumental variables. Statist. Med. 2010; 29(12):1298-1311.

Section 9.4 (Summary-level meta-analysis). S. Burgess, A. Butterworth, S.G. Thompson. Mendelian randomization analysis with multiple genetic variants using summarized data. Genet. Epidemiol. 2013; 37(7):658-665.

Section 9.4 (Summary-level meta-analysis) and Section 9.8.2 (Two-sample Mendelian randomization). S. Burgess, R.A. Scott, N.J. Timpson, G. Davey Smith, S.G. Thompson. Using published data in Mendelian randomization: a blueprint for efficient identification of causal risk factors. Eur. J. Epidemiol. 2015.

Section 9.8.2 (Two-sample Mendelian randomization). B.L. Pierce, S. Burgess. Efficient design for Mendelian randomization studies: subsample and two-sample instrumental variable estimators. Am. J. Epidemiol. 2013; 178(7):1177-1184.

Introduction

In this chapter, we consider the effect of weak instruments on instrumental variable (IV) analyses. Weak instruments are those that do not explain a large proportion of the variation in the exposure, and so the statistical association between the IV and the exposure is not strong. This is of particular relevance in Mendelian randomization studies since the associations of genetic variants with exposures of interest are often weak. This chapter focuses on the impact of weak instruments on the bias and coverage of IV estimates.

Key points from chapter

  • Bias from weak instruments can result in seriously misleading estimates of causal effects. Studies with instruments having large expected F statistics are less biased on average. However, if a study by chance has a larger observed F statistic than expected, then the causal estimate will be more biased.
  • With weak instruments, confidence intervals from methods which rely on assumptions of asymptotic normality can have poor coverage.
  • Data-driven choice of instruments or analysis can exacerbate bias. In particular, any threshold guideline such as ensuring that an observed F statistic is greater than 10 is misleading. Methods, instruments, and data to be used should be specified prior to data analysis. Meta-analyses based on study-specific estimates of causal effect are susceptible to bias.
  • Bias can be alleviated by use of measured covariates and parsimonious modelling of the genetic association (such as a per allele additive SNP model rather than one coefficient per genotype). This should be accompanied by sensitivity analyses to assess potential bias, for example from model misspecification.
  • Bias can be reduced substantially by using LIML, Bayesian and allele score (see next chapter) methods rather than 2SLS, and bias in practice with a single IV should be minimal. Nominal coverage levels can be maintained by the use of Fieller’s theorem with a single IV, and confidence intervals from the Anderson–Rubin test statistic or Bayesian MCMC methods with multiple IVs.
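
The F statistic mentioned in these key points comes from the regression of the exposure on the instruments. The sketch below uses the standard relation between F and the proportion of variance explained (R²) in a linear regression; the function name `f_statistic` and the numbers are hypothetical. It shows the F value implied by a given R², and is not intended as a rule for selecting instruments.

```python
def f_statistic(r_squared, n, k):
    """F statistic for a regression of the exposure on k instruments.

    r_squared: proportion of variance in the exposure explained by the instruments
    n:         sample size
    k:         number of instruments (regression parameters excluding the intercept)
    """
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    return (r_squared / (1.0 - r_squared)) * (n - k - 1) / k


# Hypothetical scenario: instruments explaining 1% of the variance, 1000 participants.
f_single = f_statistic(0.01, n=1000, k=1)   # one instrument, e.g. an allele score
f_many = f_statistic(0.01, n=1000, k=10)    # ten separate instruments
```

With the same R², spreading the explained variance over ten instruments gives an F statistic roughly one tenth the size, which is one way to see why parsimonious models of the genetic association help with weak instruments.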

Relevant papers to chapter:

S. Burgess, S.G. Thompson. Bias in causal estimates from Mendelian randomization studies with weak instruments. Statist. Med. 2011; 30(11):1312-1323.

S. Burgess, S.G. Thompson, CRP CHD Genetics Collaboration. Avoiding bias from weak instruments in Mendelian randomization studies. Int. J. Epidemiol. 2011; 40(3):755-764.

N.M. Davies, S.v.H.K. Scholder, H. Farbmacher, S. Burgess, F. Windmeijer, G. Davey Smith. The many weak instruments problem and Mendelian randomization. Statist. Med. 2014.