In statistics, a meta-analysis combines the results of several studies that address a set of related research hypotheses. This is normally done by identifying a common measure of effect size, which is modelled using a form of meta-regression. The resulting overall averages, when controlling for study characteristics, can be considered meta-effect sizes, which are more powerful estimates of the true effect size than those derived in a single study under a given single set of assumptions and conditions.
The first meta-analysis was performed by Karl Pearson in 1904, in an attempt to overcome the problem of reduced statistical power in studies with small sample sizes; analyzing the results from a group of studies can allow more accurate data analysis.^{[1]}^{[2]} However, the first meta-analysis of all conceptually identical experiments concerning a particular research issue, conducted by independent researchers, has been identified as the 1940 book-length publication Extrasensory Perception After Sixty Years, authored by Duke University psychologists J. G. Pratt, J. B. Rhine, and associates.^{[3]} This encompassed a review of 145 reports on ESP experiments published from 1882 to 1939, and included an estimate of the influence of unpublished papers on the overall effect (the file-drawer problem). Although meta-analysis is widely used in epidemiology and evidence-based medicine today, a meta-analysis of a medical treatment was not published until 1955. In the 1970s, more sophisticated analytical techniques were introduced in educational research, starting with the work of Gene V. Glass, Frank L. Schmidt and John E. Hunter. The online Oxford English Dictionary lists the first usage of the term in the statistical sense as 1976 by Glass.^{[4]} The statistical theory surrounding meta-analysis was greatly advanced by the work of Nambury S. Raju, Larry V. Hedges, Harris Cooper, Ingram Olkin, John E. Hunter, Jacob Cohen, Thomas C. Chalmers, and Frank L. Schmidt.
Advantages of meta-analysis (e.g. over classical literature reviews or simple overall means of effect sizes) include:
1. Search of literature
2. Selection of studies (‘incorporation criteria’)
3. Decide which dependent variables or summary measures are allowed. For instance, a standardized mean difference
δ = (μ_{t} − μ_{c}) / σ
in which μ_{t} is the treatment mean, μ_{c} is the control mean, and σ^{2} the pooled variance.
4. Model selection (see next paragraph)
For reporting guidelines, see the QUOROM statement.^{[5]}^{[6]}
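The standardized mean difference from step 3 can be sketched in Python; the summary statistics below are made-up illustrative values:

```python
import math

def standardized_mean_difference(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference: (treatment mean - control mean)
    divided by the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)

# Equal group standard deviations of 4.0 give a pooled SD of 4.0,
# so a mean difference of 2.0 yields d = 0.5.
d = standardized_mean_difference(10.0, 8.0, 4.0, 4.0, 50, 50)
```

Because the difference is scaled by the pooled standard deviation, effects measured on different instruments become comparable across studies.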
Generally, three types of models can be distinguished in the literature on meta-analysis: simple regression, fixed-effects meta-regression and random-effects meta-regression.
The model can be specified as

y_{j} = β_{0} + β_{1}x_{1j} + … + β_{p}x_{pj} + ε_{j}

where y_{j} is the effect size in study j and β_{0} (intercept) the estimated overall effect size. The covariates x_{1j}, …, x_{pj} specify different study characteristics, and ε_{j} specifies the between-study variation. Note that this model does not allow specification of within-study variation.
Fixed-effects meta-regression assumes that the true effect size θ is normally distributed with N(θ, σ_{θ}^{2}), where σ_{θ}^{2} is the within-study variance of the effect size. A fixed-effects meta-regression model thus allows for within-study variability, but no between-study variability, because all studies have the same expected fixed effect size θ, i.e. ε_{j} = 0:

y_{j} = β_{0} + β_{1}x_{1j} + … + β_{p}x_{pj} + η_{j}, with η_{j} ~ N(0, σ_{j}^{2})

where σ_{j}^{2} is the variance of the effect size in study j. Fixed-effects meta-regression ignores between-study variation. As a result, parameter estimates are biased if between-study variation cannot be ignored. Furthermore, generalizations to the population are not possible.
Random-effects meta-regression rests on the assumption that θ in N(θ, σ_{j}^{2}) is a random variable following a (hyper-)distribution N(θ, τ^{2}):

y_{j} = β_{0} + β_{1}x_{1j} + … + β_{p}x_{pj} + η_{j} + ε_{j}, with η_{j} ~ N(0, σ_{j}^{2}) and ε_{j} ~ N(0, τ^{2})

where again σ_{j}^{2} is the variance of the effect size in study j. The between-study variance τ^{2} is estimated using common estimation procedures for random-effects models, such as restricted maximum likelihood (REML) estimators.
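The between-study variance estimation step can be sketched with the moment-based DerSimonian–Laird estimator, a simpler alternative to the REML procedure named above (this is an illustrative sketch, not the REML estimator itself, and the input values are invented):

```python
def dersimonian_laird_tau2(effects, variances):
    """Moment-based (DerSimonian-Laird) estimate of the between-study
    variance tau^2 from per-study effects and within-study variances."""
    w = [1.0 / v for v in variances]          # inverse-variance weights
    sw = sum(w)
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    k = len(effects)
    return max(0.0, (q - (k - 1)) / c)        # truncated at zero

# Three hypothetical studies with equal within-study variance 0.04:
tau2 = dersimonian_laird_tau2([0.1, 0.5, 0.9], [0.04, 0.04, 0.04])
```

When the observed spread of effects is no larger than sampling error alone would predict, the estimator is truncated at zero and the model reduces to the fixed-effects case.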
Modern meta-analysis does more than just combine the effect sizes of a set of studies. It can test whether the studies' outcomes show more variation than would be expected from sampling different research participants alone. If that is the case, study characteristics such as the measurement instrument used, the population sampled, or aspects of the studies' design are coded. These characteristics are then used as predictor variables to analyze the excess variation in the effect sizes. Some methodological weaknesses in studies can be corrected statistically. For example, it is possible to correct effect sizes or correlations for the downward bias due to measurement error or restriction of score ranges.
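The correction for measurement error mentioned above can be illustrated with Spearman's classic disattenuation formula; the reliability values used here are hypothetical:

```python
import math

def disattenuate(r_observed, reliability_x, reliability_y):
    """Correct an observed correlation for attenuation due to
    measurement error, using the reliabilities of both measures."""
    return r_observed / math.sqrt(reliability_x * reliability_y)

# An observed r of .30 with reliabilities .80 and .90 corrects upward:
r_true = disattenuate(0.30, 0.80, 0.90)
```

With perfectly reliable measures (reliability 1.0 on both sides) the correction leaves the correlation unchanged.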
Meta-analysis leads to a shift of emphasis from single studies to multiple studies. It emphasizes the practical importance of the effect size instead of the statistical significance of individual studies. This shift in thinking has been termed meta-analytic thinking. The results of a meta-analysis are often shown in a forest plot.
Results from studies are combined using different approaches. One approach frequently used in meta-analysis in health care research is termed the 'inverse variance method'. The average effect size across all studies is computed as a weighted mean, whereby the weights are equal to the inverse variance of each study's effect estimator. Larger studies and studies with less random variation are given greater weight than smaller studies. Other common approaches include the Mantel–Haenszel method^{[7]} and the Peto method.
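A minimal sketch of the inverse variance method, assuming each study supplies an effect estimate and its variance (the numbers below are illustrative):

```python
def inverse_variance_mean(effects, variances):
    """Pooled effect as a weighted mean with weights 1/variance;
    also returns the standard error of the pooled estimate."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    pooled_se = (1.0 / total) ** 0.5
    return pooled, pooled_se

# The precise study (variance 0.01) dominates the imprecise one (0.04):
pooled, se = inverse_variance_mean([0.2, 0.6], [0.01, 0.04])
```

The pooled estimate lands much closer to 0.2 than to 0.6, which is exactly the weighting behaviour described above: the larger, less variable study counts for more.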
A recent approach to studying the influence that weighting schemes can have on results has been proposed through the construct of gravity, which is a special case of combinatorial meta-analysis.
Signed differential mapping is a statistical technique for meta-analyzing studies on differences in brain activity or structure that use neuroimaging techniques such as fMRI, VBM or PET.
A weakness of the method is that sources of bias are not controlled by it: a good meta-analysis of badly designed studies will still result in bad statistics. Robert Slavin has argued that only methodologically sound studies should be included in a meta-analysis, a practice he calls 'best evidence meta-analysis'. Other meta-analysts would include weaker studies, and add a study-level predictor variable that reflects the methodological quality of the studies, to examine the effect of study quality on the effect size.
Another weakness of the method is its heavy reliance on published studies, which may inflate the apparent effect, as it is very hard to publish studies that show no significant results. This publication bias or "file-drawer effect" (where non-significant studies end up in the desk drawer instead of in the public domain) should be seriously considered when interpreting the outcomes of a meta-analysis. Because of the risk of publication bias, many meta-analyses now include a "fail-safe N" statistic that calculates the number of studies with null results that would need to be added to the meta-analysis in order for an effect to no longer be reliable.
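One common fail-safe N computation is Rosenthal's formula, based on summed z scores; the sketch below assumes each study contributes a one-tailed z value, and the inputs are invented:

```python
def failsafe_n(z_scores, z_alpha=1.645):
    """Rosenthal's fail-safe N: how many unretrieved null (z = 0) studies
    would have to exist for the combined one-tailed result to rise
    above the alpha threshold (z_alpha defaults to one-tailed .05)."""
    k = len(z_scores)
    z_sum = sum(z_scores)
    return max(0.0, (z_sum / z_alpha) ** 2 - k)

# Three hypothetical studies, each with z = 2.0:
n_fs = failsafe_n([2.0, 2.0, 2.0])
```

A small fail-safe N relative to the number of included studies suggests the pooled effect could easily be undone by studies sitting in file drawers.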
Other weaknesses are Simpson's paradox (two smaller studies may point in one direction while the combined study points in the opposite direction); the coding of an effect is subjective; the decision to include or reject a particular study is subjective; there are two different ways to measure effect (correlation or standardized mean difference); the interpretation of effect size is purely arbitrary; it has not been determined whether the statistically most accurate method for combining results is the fixed-effects model or the random-effects model; and, for medicine, the underlying risk in each studied group is of significant importance, and there is no universally agreed-upon way to weight the risk.
The Rind et al. controversy illustrates an application of meta-analysis in which many components of the analysis were subsequently criticized.
The file-drawer problem describes the often-observed fact that only results with significant parameters are published in academic journals. As a result, the distribution of effect sizes is biased, skewed or completely cut off. This can be visualized with a funnel plot, which is a scatter plot of sample size against effect size. Once identified, there are several procedures available to correct for the file-drawer problem, such as simulating the cut-off part of the distribution of study effects.
A funnel plot expected without the file-drawer problem 
A funnel plot expected with the file-drawer problem 
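The distortion behind these funnel plots can be demonstrated with a small simulation: when the true effect is zero but only 'significant' estimates get published, the mean absolute published effect is far from zero. Everything below is an invented illustration, not real study data:

```python
import random

def simulate_file_drawer(n_studies=2000, se=0.1, z_crit=1.96, seed=42):
    """Draw study estimates around a true effect of zero, then keep only
    the 'significant' ones, mimicking selective publication."""
    rng = random.Random(seed)
    all_effects = [rng.gauss(0.0, se) for _ in range(n_studies)]
    published = [e for e in all_effects if abs(e) / se > z_crit]
    mean_all = sum(all_effects) / len(all_effects)
    mean_published_abs = sum(abs(e) for e in published) / len(published)
    return mean_all, mean_published_abs

mean_all, mean_published_abs = simulate_file_drawer()
```

The mean over all simulated studies hovers near the true value of zero, while the published subset, by construction, consists entirely of estimates beyond the significance cut-off.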


Meta-analysis is a systematic technique for reviewing, analysing, and summarising quantitative research studies on specific topics or questions. The purpose of this page is to gather information and resources about how to conduct a meta-analysis. Thus, the target audience includes, for example, postgraduate students conducting a meta-analysis or beginning researchers interested in conducting one. This page could also be useful for students involved in research methods coursework which includes a section on understanding the use and application of meta-analysis.
Some dedicated meta-analysis software includes:
Name | URL | License | Cost ($) | Trial or demo? | Version | Notes
CMA | http://www.metaanalysis.com | Proprietary | ~1000 | Yes | 2 | 
RevMan | http://www.ccims.net/RevMan | ? | Free for non-commercial use | Yes | 5 | For organising reviews; for MA, see [1]
Metawin | http://www.metawinsoft.com | Proprietary | 150 | | | 
MIX | http://www.mixformetaanalysis.info | ? | 0 | | | 
Non-dedicated, generic statistics software that can be used for conducting meta-analysis includes:
Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.
