Are studies included in most meta-analyses based on a narrow range of participants?


This question is mostly quoted from A New Psychology of Women (2017) by Lips, page 63:

Finally, and perhaps most importantly, the studies included in most meta-analyses are based on a narrow range of participants (usually young, North American students) and are not reflective of the populations at large--although this limitation tends to be obscured by the combination of many studies into a single analysis (Halpern, 1995).

I mean, this claim is more than 20 years old. Is it true today? Is there a study that examines the demographics of meta-studies at large or something related? If it is true, is this the case for most psychology research, male-female differences or all studies in the social sciences?

Hunter and Schmidt offer a set of procedures to correct for range restriction; these are included in many packages based on their meta-analytic methodology. Here are some relevant references:



Meta-analysis is a tool that is applied across a wide range of disciplines, especially those involving human subjects. In general, meta-analyses aim to sample all relevant studies that have been conducted.

In some fields, the norm is to use undergraduate students (especially psychology students). Many psychology programs have a research participation pool, so for a lot of basic lab-based research (e.g., cognitive psychology, social psychology), this represents a convenient sample. Note, of course, that other disciplines use other samples. Organisational psychology researchers often use samples of employees from organisations. Developmental psychologists study children. Social science researchers often rely on large-scale panel surveys.

More recently, online samples have become popular. This includes online convenience samples as well as paid samples. Mechanical Turk is particularly popular in the United States, but other platforms such as Prolific exist.

In general, the choices that researchers make about sample are driven by many factors:

  • convenience/resources
  • discipline expectations
  • interests in generalisation (some researchers are mainly interested in generalising to their own culture)
  • where the researchers have traditionally come from (e.g., most researchers in psychology come from the U.S., followed by UK/Europe/Australasia)
  • relevance of sample to research question (e.g., sometimes a convenience sample is fine; other times you really need a representative sample); in basic science, it's often fine, but if you're doing organisational or clinical research, you'll generally want a relevant org or clinical sample.

So in short, this phenomenon is going to vary a lot across research questions. And in some cases it's probably not going to matter much; in other cases it will.

More generally, there is often a trade-off between convenience and rigour. For example, most researchers who measure job performance would love to have a rigorous, objective measure of performance, but it's much easier to get self-report. So you'll often find that meta-analyses summarise studies that have opted for one point on the convenience/rigour trade-off.

Meta-analysts can and do perform moderator analyses if they think a particular factor might be relevant to the meta-analytic finding (e.g., whether the sample are undergraduate students or not).
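As a rough sketch of what such a moderator check can look like in its simplest form, the Python below compares two subgroup summary effects (say, student versus non-student samples) with a z-test on their difference. The function name and all numbers are hypothetical, and a real moderator analysis would usually use meta-regression or a between-groups heterogeneity test instead.

```python
import math

def subgroup_difference_z(est_a, se_a, est_b, se_b):
    """z-test for the difference between two independent subgroup estimates.

    est_a, est_b: pooled effect in each subgroup (hypothetical inputs)
    se_a, se_b:   standard errors of those pooled effects
    Returns (z, two_sided_p).
    """
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical pooled effects: student samples vs. non-student samples.
z, p = subgroup_difference_z(0.5, 0.1, 0.2, 0.1)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value here would only suggest that sample type is worth a closer look; it does not by itself show that the sample type causes the difference.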

It can be tempting to jump prematurely into a statistical analysis when undertaking a systematic review. The production of a diamond at the bottom of a plot is an exciting moment for many authors, but results of meta-analyses can be very misleading if suitable attention has not been given to formulating the review question; specifying eligibility criteria; identifying and selecting studies; collecting appropriate data; considering risk of bias; planning intervention comparisons; and deciding what data would be meaningful to analyse. Review authors should consult the chapters that precede this one before a meta-analysis is undertaken.

An important step in a systematic review is the thoughtful consideration of whether it is appropriate to combine the numerical results of all, or perhaps some, of the studies. Such a meta-analysis yields an overall statistic (together with its confidence interval) that summarizes the effectiveness of an experimental intervention compared with a comparator intervention. Potential advantages of meta-analyses include the following:

  1. To improve precision. Many studies are too small to provide convincing evidence about intervention effects in isolation. Estimation is usually improved when it is based on more information.
  2. To answer questions not posed by the individual studies. Primary studies often involve a specific type of participant and explicitly defined interventions. A selection of studies in which these characteristics differ can allow investigation of the consistency of effect across a wider range of populations and interventions. It may also, if relevant, allow reasons for differences in effect estimates to be investigated.
  3. To settle controversies arising from apparently conflicting studies or to generate new hypotheses. Statistical synthesis of findings allows the degree of conflict to be formally assessed, and reasons for different results to be explored and quantified.

Of course, the use of statistical synthesis methods does not guarantee that the results of a review are valid, any more than it does for a primary study. Moreover, like any tool, statistical methods can be misused.

This chapter describes the principles and methods used to carry out a meta-analysis for a comparison of two interventions for the main types of data encountered. The use of network meta-analysis to compare more than two interventions is addressed in Chapter 11. Formulae for most of the methods described are provided in a supplementary document ‘Statistical algorithms in Review Manager’ (available via the Handbook web pages), and a longer discussion of many of the issues is available (Deeks et al 2001).

10.2.1 Principles of meta-analysis

The commonly used methods for meta-analysis follow the following basic principles:

  1. Meta-analysis is typically a two-stage process. In the first stage, a summary statistic is calculated for each study, to describe the observed intervention effect in the same way for every study. For example, the summary statistic may be a risk ratio if the data are dichotomous, or a difference between means if the data are continuous (see Chapter 6).
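  2. In the second stage, a summary (combined) intervention effect estimate is calculated as a weighted average of the intervention effects estimated in the individual studies. A weighted average is defined as $\text{weighted average} = \frac{\sum_i Y_i W_i}{\sum_i W_i}$, where $Y_i$ is the intervention effect estimated in the $i$th study, $W_i$ is the weight given to the $i$th study, and the summation is across all studies.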
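To illustrate the second stage, here is a minimal Python sketch of a weighted average of study effects. It assumes inverse-variance weights (each weight is 1/SE²), which is one common choice for a fixed-effect analysis; the function name and the example numbers are hypothetical and are not taken from the Handbook.

```python
import math

def pooled_estimate(effects, std_errors):
    """Inverse-variance weighted average of per-study effect estimates.

    effects:    one summary statistic per study (e.g. a mean difference)
    std_errors: the standard error of each study's estimate
    Returns (pooled effect, its standard error, approximate 95% CI).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, pooled_se, ci

# Hypothetical data: three studies; the most precise one dominates the result.
effects = [0.2, 0.5, 0.8]
std_errors = [0.1, 0.2, 0.4]
pooled, pooled_se, ci = pooled_estimate(effects, std_errors)
print(f"pooled = {pooled:.3f}, SE = {pooled_se:.3f}, "
      f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Because the first hypothetical study has the smallest standard error, the pooled value sits much closer to 0.2 than to the simple mean of the three effects.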

Meta-analyses are usually illustrated using a forest plot. An example appears in Figure 10.2.a. A forest plot displays effect estimates and confidence intervals for both individual studies and meta-analyses (Lewis and Clarke 2001). Each study is represented by a block at the point estimate of intervention effect with a horizontal line extending either side of the block. The area of the block indicates the weight assigned to that study in the meta-analysis while the horizontal line depicts the confidence interval (usually with a 95% level of confidence). The area of the block and the confidence interval convey similar information, but both make different contributions to the graphic. The confidence interval depicts the range of intervention effects compatible with the study’s result. The size of the block draws the eye towards the studies with larger weight (usually those with narrower confidence intervals), which dominate the calculation of the summary result, presented as a diamond at the bottom.

Figure 10.2.a Example of a forest plot from a review of interventions to promote ownership of smoke alarms (DiGuiseppi and Higgins 2001). Reproduced with permission of John Wiley & Sons

Massive Study Reveals Few Differences Between Men and Women’s Brains

Summary: A large-scale meta-analysis reveals men and women’s brains do have slight differences. However, the differences are due to brain size and not sex or gender. Researchers say brain differences between men and women are small and inconsistent once an individual’s head size is accounted for.

Source: Rosalind Franklin University

How different are men and women’s brains? The question has been explored for decades, but a new study led by Rosalind Franklin University neuroscientist Lise Eliot is the first to coalesce this wide-ranging research into a single mega-synthesis. And the answer is: hardly at all.

“Men and women’s brains do differ slightly, but the key finding is that these distinctions are due to brain size, not sex or gender,” Dr. Eliot said. “Sex differences in the brain are tiny and inconsistent, once individuals’ head size is accounted for.”

The unusually large study of studies, “Dump the ‘dimorphism’: Comprehensive synthesis of human brain studies reveals few male-female differences beyond size,” published in Neuroscience and Biobehavioral Reviews, finds that size is the only clear-cut difference between male and female brains.

Women’s brains are about 11% smaller than men’s, in proportion to their body size. Smaller brains allow certain features, such as a slightly higher ratio of gray matter to white matter, and a higher ratio of connections between, versus within, cerebral hemispheres.

“This means that the brain differences between large- and small-headed men are as great as the brain differences between the average man and woman,” Dr. Eliot said. “And importantly, none of these size-related differences can account for familiar behavioral differences between men and women, such as empathy or spatial skills.”

This is not the story typically publicized about sex differences in the human brain.

“Since the dawn of MRI, studies finding statistically significant sex differences have received outsized attention by scientists and the media,” said Dr. Eliot, whose books include “Pink Brain, Blue Brain: How Small Differences Grow Into Troublesome Gaps.”

“Researchers have been quietly accumulating massive amounts of data comparing male and female brains, but it’s only the differences that get hyped,” Dr. Eliot continued. “Unlike other areas of health research, women have been equally included in brain imaging from the outset.”

Dr. Eliot and her collaborators — fourth-year Chicago Medical School students Adnan Ahmed, Hiba Khan and Julie Patel — conducted a meta-synthesis of three decades of research, assimilating hundreds of the largest and most highly-cited brain imaging studies addressing 13 distinct measures of alleged sex difference.

For nearly every measure, they found almost no differences that were widely reproduced across studies, even those involving thousands of participants. For example, the volume or thickness of specific regions in the cerebral cortex is often reported to differ between men and women. However, the meta-synthesis shows that the regions identified differ enormously between studies.

Male-female brain differences are also poorly replicated between diverse populations, such as Chinese versus American, meaning there is no universal marker that distinguishes men and women’s brains across the human species.

“The handful of features that do differ most reliably are quite small in magnitude,” Dr. Eliot said. “The volume of the amygdala, an olive-sized part of the temporal lobe that is important for social-emotional behaviors, is a mere 1% larger in men across studies.”

The study also rebuts a longstanding view that men’s brains are more lateralized, meaning each hemisphere acts independently, whereas women’s two hemispheres are said to be better connected and to operate more in sync with each other. Such a difference could make males more vulnerable to disability following brain injury such as stroke.

Here again, the consensus of many studies shows that the difference is extremely small, accounting for even less than 1% of the range of left-right connectivity across the population.

This finding does agree with large datasets that have found no gender difference in aphasia, or the loss of language, following a stroke in the left hemisphere, contrary to long belief.

A last focus of the new study is on functional MRI. This method allows neuroscientists to see areas that “light up” during particular mental tasks and has been widely used to look for male-female differences during language, spatial and emotional tasks.


Across hundreds of such studies, Dr. Eliot’s team found extremely poor reliability in sex difference findings — nearly all specific brain areas that differed in activity between men and women were not repeated across studies. Such poor reproducibility agrees with recent research out of Stanford University demonstrating “false discovery,” or the over-publication of false-positive findings in the scientific literature on functional MRI sex difference.

“Sex comparisons are super easy for researchers to conduct after an experiment is already done. If they find something, it gets another publication. If not, it gets ignored,” Dr. Eliot said. Publication bias is common in sex-difference research, she added, because the topic garners high interest.

“Sex differences are sexy, but this false impression that there is such a thing as a ‘male brain’ and a ‘female brain’ has had wide impact on how we treat boys and girls, men and women,” Dr. Eliot said.

Systematic Review and Meta-Analysis of Prevalence Studies in Transsexualism

Over the last 50 years, several studies have provided estimates of the prevalence of transsexualism. The variation in reported prevalence is considerable and may be explained by factors such as the methodology and diagnostic classification used and the year and country in which the studies took place. Taking these into consideration, this study aimed to critically and systematically review the available literature measuring the prevalence of transsexualism as well as performing a meta-analysis using the available data.

Databases were systematically searched and 1473 possible studies were identified. After initial scrutiny of the article titles and removal of those not relevant, 250 studies were selected for further appraisal. Of these, 211 were excluded after reading the abstracts and a further 18 after reading the full article. This resulted in 21 studies on which to perform a systematic review, with only 12 having sufficient data for meta-analysis. The primary data of the epidemiological studies were extracted as raw numbers. An aggregate effect size, weighted by sample size, was computed to provide an overall effect size across the studies. Risk ratios and 95% confidence intervals (CIs) were calculated. The relative weighted contribution of each study was also assessed.
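A minimal Python sketch of what "weighted by sample size" can mean for prevalence data follows. It assumes the simplest scheme, in which each study's prevalence is weighted by its share of the total population studied; the abstract does not spell out the exact method, and the function name and counts below are hypothetical.

```python
def pooled_prevalence_per_100k(studies):
    """Sample-size-weighted prevalence per 100,000 individuals.

    studies: list of (cases, population) tuples, one per study.
    Each study's prevalence is weighted by population / total population.
    """
    total_population = sum(population for _, population in studies)
    pooled = 0.0
    for cases, population in studies:
        prevalence = cases / population
        weight = population / total_population
        pooled += weight * prevalence
    return pooled * 100_000

# Hypothetical studies, not the data from the review.
studies = [(50, 1_000_000), (30, 500_000), (10, 500_000)]
print(round(pooled_prevalence_per_100k(studies), 2))
```

With this weighting, the pooled figure equals total cases divided by total population, so large studies count for more than small ones.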

The overall meta-analytical prevalence for transsexualism was 4.6 in 100,000 individuals: 6.8 for trans women and 2.6 for trans men. Time analysis found an increase in reported prevalence over the last 50 years.

The overall prevalence of transsexualism reported in the literature is increasing. However, it is still very low and is mainly based on individuals attending clinical services and so does not provide an overall picture of prevalence in the general population. However, this study should be considered as a starting point and the field would benefit from more rigorous epidemiological studies acknowledging current changes in the classification system and including different locations worldwide.


The authors concluded that behavioural couples therapy reported better outcomes than individual-based treatment for married or cohabiting individuals who sought help for alcohol dependence or drug dependence problems. The authors' conclusions appeared to reflect the evidence, but multiple potential biases in the review process made reliability of the conclusions unclear.

To evaluate the effectiveness of behavioural couples therapy (BCT) for alcohol and drug use disorders.

PsycINFO, MEDLINE and Cochrane Central Register of Controlled Trials (CENTRAL) were searched to May 2007 for English-language articles. Search terms were reported. Reference lists of reviews were searched.

Randomised controlled trials (RCTs) that evaluated behavioural couples therapy compared to an active or inactive control group for treatment of alcohol and substance use disorders were eligible for inclusion. Studies that included people other than just the intimate partner were excluded. Outcomes of interest were reduced frequency of use, reduced consequences of use and higher relationship satisfaction.

Most of the included studies compared behavioural couples therapy with cognitive-behavioural therapy (CBT). Other control groups included psychoeducation attention control treatment, individual-based treatments, alcohol-focused spouse intervention and treatment as usual. The number of sessions varied from six to 56. Most studies targeted alcohol use; opiates and polysubstance use disorders were the targets of the other studies. It appeared that for most studies, behavioural couples therapy was an adjunct to treatment and for some studies supplemental medication was provided. Studies included couples where only one member had a current substance use disorder.

The authors did not report how many reviewers selected studies for inclusion.

The authors did not state that they assessed validity.

Data on frequency of substance use, consequences of substance use (such as job loss, hospitalisation, overdose, work missed) and relationship satisfaction were extracted. Data were used to calculate between-group effect sizes (ES), standard error (SE) and 95% confidence intervals (CI) for individual studies that used Hedges' g and Cohen's d statistics. Studies that reported multiple outcomes were categorised and combined within each domain using methods reported by Borenstein et al. Where data were missing, effect sizes were estimated using conversion equations for significance tests. Effect sizes were corrected for small sample sizes based on methods by Hedges and Olkin. Where necessary, authors of studies were contacted for further information.
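For readers unfamiliar with these statistics, here is a small Python sketch of Cohen's d and of the small-sample correction that turns it into Hedges' g. It uses the common approximation J = 1 - 3/(4·df - 1) with df = n1 + n2 - 2; the review does not state which exact formula was applied, and the function names and example values are hypothetical.

```python
import math

def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    """Standardised mean difference using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

def hedges_g(d, n1, n2):
    """Apply the approximate small-sample correction factor J to Cohen's d."""
    df = n1 + n2 - 2
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)
    return correction * d

# Hypothetical two-group study with 20 participants per group.
d = cohens_d(10.0, 8.0, 4.0, 4.0, 20, 20)
g = hedges_g(d, 20, 20)
print(f"d = {d:.3f}, g = {g:.3f}")
```

The correction shrinks the estimate slightly, and the shrinkage becomes negligible as the groups get larger.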

The authors did not state how many reviewers conducted data extraction.

Data for each outcome were pooled using both fixed-effect and random-effects models to calculate mean effect size, standard error and 95% CIs for each outcome. Outcomes included overall effect size (all outcome variables and different time points), frequency and consequences of use and relationship satisfaction. Effect sizes were categorised as small (Cohen's d=0.20), medium (Cohen's d=0.50) and large (Cohen's d=0.80). Additional analyses were conducted for post-treatment and follow-up (where data were available) and for studies where the comparison group was CBT only. Meta-regression was used to explore effects of sample sizes, dose response and year of publication. Publication bias was assessed using Rosenthal's fail-safe N.
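The difference between the two pooling models can be sketched in Python as follows. This assumes the DerSimonian-Laird estimator for the between-study variance, a widely used default; the review does not say which random-effects estimator was used, and the function name and numbers are hypothetical.

```python
import math

def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    effects:   per-study effect sizes
    variances: per-study sampling variances (SE squared)
    Returns (pooled effect, its standard error, tau squared).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    # Fixed-effect estimate, needed for the heterogeneity statistic Q.
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau2 to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    return pooled, math.sqrt(1.0 / sum_w_star), tau2

# Hypothetical heterogeneous studies with equal precision.
pooled, se, tau2 = random_effects_pool([0.1, 0.5, 0.9], [0.01, 0.01, 0.01])
print(f"pooled = {pooled:.2f}, SE = {se:.3f}, tau^2 = {tau2:.2f}")
```

When the studies agree closely, tau² is estimated as zero and the result matches the fixed-effect estimate; when they disagree, the random-effects standard error widens to reflect the extra between-study variation.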

Twelve RCTs (754 couples, range 16 to 138) were included in the review.

Behavioural couples therapy was superior to control groups for all outcomes overall (Cohen's d=0.54, 95% CI 0.37 to 0.71), overall frequency of use (Cohen's d=0.36, 95% CI 0.19 to 0.53) and overall consequences of use (Cohen's d=0.52, 95% CI 0.20 to 0.83). Behavioural couples therapy resulted in higher relationship satisfaction compared to control (Cohen's d=0.58, 95% CI 0.37 to 0.79). Results using Hedges' g statistic were reported.

At post-treatment, behavioural couples therapy was superior to control for relationship satisfaction (Hedges' g=0.63, SE 0.11, 95% CI 0.42 to 0.84), but not for consequences of use. Behavioural couples therapy was superior to control at follow-up for frequency of use (Hedges' g=0.44, SE 0.19, 95% CI 0.24 to 0.64), consequences of use (Hedges' g=0.49, SE 0.22, 95% CI 0.06 to 0.91) and relationship satisfaction (Hedges' g=0.52, SE 0.12, 95% CI 0.28 to 0.75). Results were significantly better for behavioural couples therapy compared to control at three months post-treatment and for at least the next year (p≤0.000 for both analyses).

Meta-regression showed no significant relationship between effect size and year of publication or number of treatment sessions. Higher effect sizes were associated with larger sample sizes. There was no evidence of significant publication bias.

Overall, behavioural couples therapy reported better outcomes than more typical individual-based treatment for married or cohabiting individuals who sought help for alcohol dependence or drug dependence problems.

The review question was clear with appropriate inclusion criteria; inclusion criteria were broadly defined for participants. Several relevant sources were searched. The limitation to studies in English meant there was potential for language bias. Publication bias was formally assessed and no evidence was found. Methods used in the review process were not reported, so any steps taken to reduce reviewer error and bias in study assessment and data extraction were unknown. Study validity was not assessed and so results from this review and any synthesis may not have been reliable. Studies were combined in a meta-analysis and heterogeneity was assessed using meta-regression; however, it was unclear how many studies contributed to each meta-analysis. Limited information was reported for individual studies, including participant characteristics, which made it difficult to assess the generalisability of the results. Sample sizes were small (most studies included fewer than 80 couples).

The authors' conclusions appeared to reflect the evidence, but potential for language bias, lack of both validity assessment and reporting of review methods and small sample sizes made the reliability of the conclusions unclear.

Practice: The authors did not state any implications for practice.

Research: The authors stated a need for further robust research to identify the mechanisms responsible for superior outcomes with behavioural couples therapy, to investigate the effectiveness of behavioural couples therapy in studies that included participants with low severity alcohol problems and to establish the extent of baseline relationship distress as a moderator of outcomes.



Athanassoulis, N. & Wilson, J. When is deception in research ethical?. Clin. Ethics 4, 44–49. (2009).

Blundell, J. et al. Appetite control: methodological aspects of the evaluation of foods. Obes. Rev. 11, 251–270. (2010).

Krop, E. M., Hetherington, M. M., Holmes, M., Miquel, S. & Sarkar, A. On relating rheology and oral tribology to sensory properties in hydrogels. Food Hydrocoll. 88, 101–113. (2019).

Johnson, J. & Vickers, Z. Effect of flavor and macronutrient composition of food servings on liking, hunger and subsequent intake. Appetite 21, 25–39. (1993).

Stubbs, R. J., Prentice, A. M. & James, W. P. T. Carbohydrates and energy balance. Ann. N. Y. Acad. Sci. 819, 44–69. (1997).

Rolls, B. J. et al. Time course of effects of preloads high in fat or carbohydrate on food intake and hunger ratings in humans. Am. Physiol. Soc. 260, R756–R763. (1991).

Hetherington, M. M., Foster, R., Newman, T., Anderson, A. S. & Norton, G. Understanding variety: tasting different foods delays satiation. Physiol. Behav. 87, 263–271. (2006).

Yusuf, S., Held, P., Teo, K. & Toretsky, E. R. Selection of patients for randomized controlled trials: implications of wide or narrow eligibility criteria. Stat. Med. 9, 73–86 (1990).



Background

Whether marine omega‐3 supplementation is associated with reduction in risk of cardiovascular disease (CVD) remains controversial.

Methods and Results

This meta‐analysis included study‐level data from 13 trials. The outcomes of interest included myocardial infarction, coronary heart disease (CHD) death, total CHD, total stroke, CVD death, total CVD, and major vascular events. The unadjusted rate ratios were calculated using a fixed‐effect meta‐analysis. A meta‐regression was conducted to estimate the dose–response relationship between marine omega‐3 dosage and risk of each prespecified outcome. During a mean treatment duration of 5.0 years, 3838 myocardial infarctions, 3008 CHD deaths, 8435 total CHD events, 2683 strokes, 5017 CVD deaths, 15 759 total CVD events, and 16 478 major vascular events were documented. In the analysis excluding REDUCE‐IT (Reduction of Cardiovascular Events with Icosapent Ethyl‐Intervention Trial), marine omega‐3 supplementation was associated with significantly lower risk of myocardial infarction (rate ratio [RR] [95% CI]: 0.92 [0.86, 0.99]; P=0.020), CHD death (RR [95% CI]: 0.92 [0.86, 0.98]; P=0.014), total CHD (RR [95% CI]: 0.95 [0.91, 0.99]; P=0.008), CVD death (RR [95% CI]: 0.93 [0.88, 0.99]; P=0.013), and total CVD (RR [95% CI]: 0.97 [0.94, 0.99]; P=0.015). Inverse associations for all outcomes were strengthened after including REDUCE‐IT while introducing statistically significant heterogeneity. Statistically significant linear dose–response relationships were found for total CVD and major vascular events in the analyses with and without including REDUCE‐IT.


Conclusions

Marine omega‐3 supplementation lowers risk for myocardial infarction, CHD death, total CHD, CVD death, and total CVD, even after exclusion of REDUCE‐IT. Risk reductions appeared to be linearly related to marine omega‐3 dose.

Clinical Perspective

What Is New?

We updated previous meta‐analyses by adding 3 recent large randomized controlled clinical trials, increasing sample size by 64%.

Marine omega‐3 supplementation significantly lowered risk for most cardiovascular end points, even after excluding a trial testing very high‐dose supplementation.

Risk reductions were linearly associated with dose of marine omega‐3 supplementation.

What Are the Clinical Implications?

Daily marine omega‐3 supplementation is effective in lowering risk for coronary and most other cardiovascular end points, including myocardial infarction, coronary heart disease death, total coronary heart disease, cardiovascular disease death, and total cardiovascular disease; no benefits, however, were found for stroke.

Greater cardiovascular benefits may be achieved at higher doses of marine omega‐3 supplementation.


Whether marine or long‐chain omega‐3 fatty acid supplementation has significant benefits in reducing risk of cardiovascular disease (CVD) is the subject of intense debate. Despite consistent findings from observational studies showing inverse associations between fish consumption and risk of heart disease, 1, 2 recent randomized controlled trials (RCTs) testing marine omega‐3 supplementation, usually a moderate‐dose combination of eicosapentaenoic acid (EPA) and docosahexaenoic acid compared with placebo, have had largely null results. 3, 4 Although the American Heart Association continues to recommend marine omega‐3 supplementation for patients with prevalent coronary heart disease to reduce mortality, it found insufficient evidence for use in prevention among patients at high CVD risk but without CVD. 5 A recent meta‐analysis synthesizing study‐level data from 10 midsize‐to‐large RCTs with at least 1 year of follow‐up reported no significant favorable effects of marine omega‐3 fatty acid supplementation on fatal or nonfatal coronary heart disease (CHD) or any major vascular events. 6 Another expanded meta‐analysis including smaller trials and dietary intervention trials reached the same conclusions. 7 Inconsistent findings between observational studies and RCTs cast doubt on a causal relationship between fish oil supplements and CVD prevention. 8

Against this backdrop, results from 3 recently published large RCTs have further fueled the debate. In ASCEND (A Study of Cardiovascular Events in Diabetes), which included 15 480 diabetes mellitus patients without existing CVD at baseline, marine omega‐3 supplementation did not reduce the primary end point of serious vascular events. 9 The VITAL (Vitamin D and Omega‐3 Trial), which included 25 871 participants at “usual” risk of CVD from the general population, also did not find a statistically significant reduction in the primary end point of major CVD events. 10 However, both ASCEND and VITAL found reductions in at least 1 prespecified secondary end point (vascular deaths in ASCEND and myocardial infarction [MI] in VITAL). The REDUCE‐IT (Reduction of Cardiovascular Events with Icosapent Ethyl‐Intervention Trial), in contrast, observed significant protective effects of icosapent ethyl, a highly purified and stable EPA ethyl ester, against occurrence of all fatal or nonfatal cardiovascular events among patients with established CVD or risk factors. 11 Incorporating new data from these 3 recent large RCTs, with and without inclusion of REDUCE‐IT, is important to provide the most up‐to‐date evidence. Also, we explored dose–response relationships between marine omega‐3 supplementation and CVD risks, an important subject that has not been addressed by previous meta‐analyses.


Methods

The authors declare that all supporting data are available within the article and its online supplementary files.

Search Strategy

We performed an updated meta‐analysis of RCTs based on the published data of a previous study‐level meta‐analysis 6 by incorporating data from ASCEND, VITAL, and REDUCE‐IT. All 3 additional studies met the inclusion criteria: RCTs of marine omega‐3 fatty acid supplementation versus placebo or open‐label control, with a sample size of at least 500 participants and a follow‐up duration ≥1 year. Study‐level data from these 3 studies were extracted from published results.

The end points of interest included MI (fatal and/or nonfatal MI), death from CHD, total CHD (MI, death from CHD, or coronary revascularization), total stroke (fatal and/or nonfatal stroke), death from CVD, total CVD (nonfatal MI, nonfatal stroke, death from CVD, or hospitalization because of a cardiovascular cause), and major vascular events (nonfatal MI, nonfatal stroke, any revascularization, or death from CVD). All the participants included in the current analysis provided written informed consent.

Statistical Analyses

To calculate the pooled rate ratio (RR) and 95% CIs, we constructed 2×2 contingency tables for each trial. The pooled RR, 95% CI, and P value for heterogeneity were calculated with a fixed‐effect model using the Mantel‐Haenszel method. Because REDUCE‐IT used a substantially higher dose (4000 mg/d) of marine omega‐3 supplements than all other trials, which might introduce substantial heterogeneity, we performed separate analyses with and without this study. To assess trials having an adequate dose, treatment duration, and sample size, we conducted a sensitivity analysis restricted to studies that used at least 840 mg/d total marine omega‐3 supplementation, had at least 1000 participants, and lasted at least 2 years. Four studies (DOIT, 12 SU.FOL.OM3, 13 Alpha.Omega, 14 and OMEGA 15 ) were excluded according to these stricter criteria in this subset analysis. We also conducted a sensitivity analysis that excluded 2 open‐label trials (GISSI‐P 16 and JELIS 17 ) to eliminate potential bias introduced by the unblinded design. Finally, to assess the joint impact of both open‐label and smaller trials, we excluded the aforementioned 6 studies.
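As a rough illustration of the pooling step described above, the sketch below computes a Mantel-Haenszel fixed-effect pooled ratio from per-trial 2×2 tables. It is a minimal sketch and not the authors' Stata code: the function name `mantel_haenszel_rr`, the use of the Greenland-Robins variance formula for the confidence interval, and the event counts are all assumptions made for illustration. The counts are hypothetical and are not data from any included trial.

```python
import math


def mantel_haenszel_rr(trials, z_crit=1.96):
    """Pool risk ratios across trials with the Mantel-Haenszel method.

    trials: iterable of (events_treated, n_treated, events_control, n_control).
    Returns (pooled_rr, ci_lower, ci_upper).
    """
    num = 0.0      # sum of a_i * n0_i / N_i
    den = 0.0      # sum of c_i * n1_i / N_i
    var_top = 0.0  # numerator of the Greenland-Robins variance of ln(RR)
    for a, n1, c, n0 in trials:
        n = n1 + n0
        num += a * n0 / n
        den += c * n1 / n
        var_top += (n1 * n0 * (a + c) - a * c * n) / n ** 2
    rr = num / den
    se_log_rr = math.sqrt(var_top / (num * den))
    lower = math.exp(math.log(rr) - z_crit * se_log_rr)
    upper = math.exp(math.log(rr) + z_crit * se_log_rr)
    return rr, lower, upper


# Hypothetical 2x2 tables (illustrative counts only).
example_trials = [
    (90, 1000, 100, 1000),
    (180, 2000, 200, 2000),
]
pooled_rr, ci_lower, ci_upper = mantel_haenszel_rr(example_trials)
print(f"Pooled RR {pooled_rr:.2f} (95% CI {ci_lower:.2f}, {ci_upper:.2f})")
```

With a single trial, the pooled estimate and its variance reduce to the usual single-study risk ratio and its log-scale variance, which is a convenient sanity check.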

In the exploratory dose–response analysis, we used the total marine omega‐3 dose from EPA and docosahexaenoic acid combined. A meta‐regression was used to assess linear dose–response relationships between marine omega‐3 supplement dose, measured in mg/d, and risk for each outcome of interest. Nonlinear relationships were not explored because of the limited number of included trials. We also conducted separate analyses with and without REDUCE‐IT to assess whether any significant dose–response relationship was driven by its extremely high dose. In a sensitivity analysis, we additionally adjusted for the median follow‐up duration. Statistical analyses were performed using STATA version 15.0 (StataCorp LP, College Station, TX); the R (version 3.3.2, R Foundation) package metareg was used for the dose–response analysis. 18
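The linear dose–response step can be pictured as a weighted regression of each trial's log RR on its dose. The following is a minimal sketch under stated assumptions, not a reproduction of the software the authors used: it assumes a fixed-effect model with inverse-variance weights and known within-trial variances, the function name `fixed_effect_meta_regression` is hypothetical, and the doses, RRs, and standard errors are invented for illustration.

```python
import math


def fixed_effect_meta_regression(doses, log_rrs, ses):
    """Inverse-variance weighted linear regression of log RR on dose.

    Returns (intercept, slope, se_slope). Within-trial variances are
    treated as known, so Var(slope) = 1 / sum(w * (x - x_bar)^2).
    """
    weights = [1.0 / se ** 2 for se in ses]
    sw = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, doses)) / sw
    y_bar = sum(w * y for w, y in zip(weights, log_rrs)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, doses))
    sxy = sum(
        w * (x - x_bar) * (y - y_bar)
        for w, x, y in zip(weights, doses, log_rrs)
    )
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1.0 / sxx)
    return intercept, slope, se_slope


# Hypothetical trial-level inputs (dose in g/d, so the slope is per 1000 mg/d).
doses_g = [0.4, 0.84, 1.0, 1.8, 4.0]
log_rrs = [math.log(r) for r in (0.99, 0.97, 0.98, 0.90, 0.78)]
ses = [0.05, 0.04, 0.06, 0.05, 0.05]

intercept, slope, se_slope = fixed_effect_meta_regression(doses_g, log_rrs, ses)
rr_per_1000mg = math.exp(slope)
print(f"RR per 1000 mg/d increment: {rr_per_1000mg:.2f}")
```

Because the model is linear on the log scale, the implied RR for a dose difference of d g/d is `exp(slope * d)`. A single trial far from the others in dose contributes most of the weighted spread in dose and can therefore dominate the slope.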


Results

The 13 included trials 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 21, 22 included a total of 127 477 participants, of whom 59.7% were male (Table). On average, the participants were 64.3 years of age at baseline, had a body mass index of 28 kg/m², and were treated for 5 years. The addition of ASCEND, VITAL, and REDUCE‐IT increased the sample size by 63.6% and extended the mean follow‐up duration by 0.6 year compared with the previous meta‐analysis. 6 Overall, 39.7% of participants had prevalent diabetes mellitus and 72.6% used cholesterol‐lowering medication at enrollment. The range of marine omega‐3 supplementation dose was 376 to 4000 mg/d, although the relative proportion of EPA and docosahexaenoic acid varied among different trials. The JELIS and REDUCE‐IT trials tested EPA alone.

Table 1. Baseline Characteristics of RCTs Investigating Effects of Marine Omega‐3 Supplementation and CVDs

BMI indicates body mass index; CVDs, cardiovascular diseases; NA, not applicable; RCTs, randomized controlled trials.

b Thirty‐four participants with missing data.

The pooled associations between marine omega‐3 supplementation and risk of CHD subtypes are presented in Figure 1. In the analysis excluding REDUCE‐IT, the pooled RRs (95% CIs; P values) were 0.92 (0.86, 0.99; P=0.020) for MI, 0.92 (0.86, 0.98; P=0.010) for CHD death, and 0.95 (0.91, 0.99; P=0.008) for total CHD, where no significant heterogeneity was found. A linear dose–response relationship was not found between marine omega‐3 supplementation dose and these CHD outcomes (Figure S1). Including REDUCE‐IT substantially strengthened the inverse associations for MI and total CHD while introducing statistically significant heterogeneity for the pooled estimates. The pooled RRs (95% CIs; I², P for heterogeneity) became 0.88 (0.83, 0.94; I²=51.2%, P for heterogeneity 0.017) for MI (P<0.001) and 0.93 (0.89, 0.96; I²=54.7%, P for heterogeneity 0.009) for total CHD (P<0.001). With REDUCE‐IT included, a significant dose–response relationship was also observed, without significant heterogeneity. Every 1000 mg/d marine omega‐3 supplementation corresponded to 9% (95% CI: 2%, 15%; P=0.012; P for heterogeneity 0.218) and 7% (95% CI: 0%, 13%; P=0.041; P for heterogeneity 0.068) lower risk of MI and total CHD, respectively (Figure S1A and S1C).
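For readers unfamiliar with the I² statistic quoted above, the sketch below shows one common way to compute Cochran's Q and I² from trial-level log RRs. It is a sketch under stated assumptions: it uses inverse-variance weights rather than the Mantel-Haenszel weights used in the paper, the function name `heterogeneity` is hypothetical, and the inputs are invented rather than taken from the included trials.

```python
def heterogeneity(log_rrs, ses):
    """Return (Q, degrees_of_freedom, I2_percent) for a set of trial estimates.

    Q is the weighted sum of squared deviations from the inverse-variance
    pooled estimate; I2 = max(0, (Q - df) / Q) * 100.
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, log_rrs)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_rrs))
    df = len(log_rrs) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


# Hypothetical log RRs and standard errors for four trials.
q, df, i2 = heterogeneity([-0.05, -0.10, 0.02, -0.30], [0.05, 0.06, 0.04, 0.07])
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.1f}%")
```

I² describes the share of the observed variation across trials that exceeds what sampling error alone would produce. The accompanying P for heterogeneity comes from comparing Q with a chi-square distribution on `df` degrees of freedom, which is omitted here to keep the sketch dependency-free.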

Figure 1. Pooled associations between marine omega‐3 supplementation and risk of subtypes of CHD. A, Marine omega‐3 supplementation and risk of MI, which includes fatal and/or nonfatal MI. B, Marine omega‐3 supplementation and risk of CHD death. C, Marine omega‐3 supplementation and risk of total CHD, which includes MI, death from CHD, or coronary revascularization. CHD indicates coronary heart disease; MI, myocardial infarction; RR, rate ratio.

The pooled RRs (95% CIs) for the associations of marine omega‐3 supplementation with risk of stroke and other CVD event subtypes are shown in Figures 2 and 3. In the analysis excluding REDUCE‐IT, no significant association was found for stroke (1.05 [0.98, 1.14]; P=0.183), but significant inverse associations (RRs [95% CIs]; P values) were found for CVD death (0.93 [0.88, 0.99]; P=0.013) and total CVD (0.97 [0.94, 0.99]; P=0.015). For major vascular events, the RR was 0.97 (0.94, 1.00; P=0.058). Each 1000 mg/d marine omega‐3 supplementation lowered risk of total CVD by 17% (95% CI: 4%, 29%) and risk of major vascular events by 17% (95% CI: 3%, 28%) without evidence of heterogeneity (Figure S2B and S2C). Including REDUCE‐IT only slightly strengthened the pooled inverse associations (RR [95% CI]; P value) for CVD death (0.92 [0.88, 0.97]; P=0.003) and total CVD (0.95 [0.92, 0.98]; P<0.001), and the inverse association for major vascular events became statistically significant (0.95 [0.93, 0.98]; P<0.001). The P values for heterogeneity became statistically significant for total CVD (P=0.002) and major vascular events (P=0.003) but not for total stroke (P=0.093) or CVD death (P=0.388). The linear dose–response relationships were statistically significant for total stroke (RR [95% CI] per 1000 mg/d increment: 0.89 [0.82, 0.98]) (Figure S3), total CVD (RR [95% CI] per 1000 mg/d increment: 0.91 [0.88, 0.95]), and major vascular events (RR [95% CI] per 1000 mg/d increment: 0.92 [0.89, 0.95]) without evidence of heterogeneity after including REDUCE‐IT (Figure S2B and S2C). Additional adjustment for follow‐up duration did not materially change the regression slopes for the CVD end points.
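A reader can roughly cross-check a reported P value against its RR and 95% CI by working on the log scale. The sketch below is one way to do so, assuming a normal approximation for the log RR; the function name `p_from_ci` is hypothetical. Because the published CIs are rounded to two decimals, the recovered P values are only approximate and will not match the reported ones exactly.

```python
import math


def p_from_ci(rr, lower, upper, z_crit=1.959964):
    """Recover (se_log_rr, z, two_sided_p) from a ratio and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(rr) / se
    # Two-sided P under a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values quoted in the text for the analysis excluding REDUCE-IT.
for label, rr, lo, hi in [
    ("CVD death", 0.93, 0.88, 0.99),
    ("Total stroke", 1.05, 0.98, 1.14),
]:
    se, z, p = p_from_ci(rr, lo, hi)
    print(f"{label}: z = {z:.2f}, approximate P = {p:.3f}")
```

The same helper gives the standard errors needed as inputs for an inverse-variance pooled estimate or a meta-regression when only RRs and CIs are published.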

Figure 2. Pooled associations between marine omega‐3 supplementation and risk of total stroke. Total stroke includes fatal and/or nonfatal stroke.

Figure 3. Pooled associations between marine omega‐3 supplementation and risks of other subtypes of CVD. A, Marine omega‐3 supplementation and risk of CVD death. B, Marine omega‐3 supplementation and risk of total CVD, which includes nonfatal MI, nonfatal stroke, death from CVD, or hospitalization because of a cardiovascular cause (except for JELIS and ALPHA OMEGA, which include revascularization). Removing JELIS and ALPHA OMEGA resulted in pooled RR 0.97 (0.94, 1.00), P=0.046 without including REDUCE‐IT, and 0.95 (0.93, 0.98), P=0.001 with REDUCE‐IT. C, Marine omega‐3 supplementation and risk of major vascular events, which include nonfatal MI, nonfatal stroke, death from CVD, or revascularization. CVD indicates cardiovascular disease; MI, myocardial infarction; RR, rate ratio.

In the sensitivity analysis that excluded DOIT, SU.FOL.OM3, Alpha.Omega, and OMEGA (because of considerably lower dose, duration, or size), inverse associations for most CVD end points were strengthened (Figure S4). In the analysis that excluded 2 open‐label trials, GISSI‐P and JELIS, the point estimates remained unchanged for most CVD end points except CHD death, whose RR (95% CI) was attenuated to 0.94 (0.87, 1.01), and the 95% CIs for most end points became wider (Figure S5). Jointly excluding both open‐label and smaller trials also produced similar RRs with wider 95% CIs across CVD end points, with the largest attenuation for CHD death (Table S1).


Discussion

In this updated meta‐analysis, we found that marine omega‐3 supplementation significantly lowered the risk of MI, total CHD, total CVD, and death from CHD or CVD, even after excluding REDUCE‐IT. Including REDUCE‐IT resulted in stronger inverse associations for these outcomes while introducing significant heterogeneity. Linear dose–response relationships were persistent only for total CVD and major vascular events in the analyses with and without including REDUCE‐IT.

The current updated meta‐analysis builds upon a previous one including 10 large RCTs and provides an up‐to‐date assessment of the effects of marine omega‐3 supplementation on risks of multiple subtypes of CVD end points. The inclusion of 3 additional studies, increasing sample size by 64% and contributing 11% to 45% of the total weight of the CVD end points in the current analysis, has a substantial influence on the available evidence. In contrast with recent meta‐analyses, our study suggests that MI, total CHD, CHD death, total CVD, and CVD death are reduced by marine omega‐3 supplementation (even after excluding REDUCE‐IT) and that higher doses of marine omega‐3 supplementation are significantly associated with reduced risk of total CVD and major vascular events. Despite the modest effect sizes for some of the CVD outcomes, the use of marine omega‐3 supplementation may still help prevent large absolute numbers of CVD events, given the high incidence rates of CVD worldwide. Finally, our results were generally consistent with previous findings that indicated that marine omega‐3 supplementation was not associated with risk of stroke.

The differential associations frequently observed between composite CVD end points and individual components of composite outcomes imply that the potential beneficial effects of marine omega‐3 may not be uniform across all types of CVD. Findings from the current study are in line with previous meta‐analyses suggesting that marine omega‐3 supplementation may be particularly effective in reducing CHD events and mortality from CVD causes, but not in reducing stroke. 3, 23 Both ASCEND and REDUCE‐IT observed a lower incidence of vascular death with marine omega‐3 supplementation than with placebo, and VITAL also found a lower risk of MI and fatal MI in the treatment group. In contrast, the effects of marine omega‐3 supplementation on risk of stroke were mostly null, which was confirmed in the current meta‐analysis. However, given the substantial risk reduction of total stroke in REDUCE‐IT, it remains unclear whether higher doses of omega‐3 supplementation are required to attain these benefits. In addition, because most marine omega‐3 trials recruited participants with existing CVD or prevalent chronic conditions, the frequent use of statins, beta‐blockers, aspirin, anticoagulants, and hypoglycemic medications may impair the ability to detect additional CVD benefits from the marine omega‐3 supplementation. However, the generally similar results among those using and not using these medications in VITAL, the only trial conducted in a usual‐risk population, and in previous meta‐analyses argue against this explanation.

In the current study, the linear dose–response relationship observed between marine omega‐3 supplementation and several CVD end points is both clinically and biologically plausible. Because most included trials comprise patients at high risk of CVD and with advanced atherosclerosis, a high dose of marine omega‐3 supplementation may be needed to achieve potential benefits in this setting. A dose–response analysis based on 58 placebo‐controlled trials estimated that each 1 g/d increase of marine omega‐3 reduced triglyceride levels by 5.9 mg/dL and that this linear association did not plateau even at 7 g/d. 24 Nevertheless, our dose–response analysis was highly exploratory and should be interpreted cautiously. Because most included trials had a dose around 850 mg/d, the slope of the regression line was essentially determined by a few distinct doses within a narrow range, which may not be sufficient to delineate the underlying dose–response relationship. Although including REDUCE‐IT generated significant linear dose–response relationships between marine omega‐3 supplementation and most CVD outcomes, the substantially changed slopes suggested that the marine omega‐3 dose of 4000 mg/d was an influential outlier (most trials tested doses ≤1000 mg/d and the second largest dose was 1800 mg/d). Nevertheless, the general inverse trend in the dose–response analysis without including REDUCE‐IT suggested that the protective effects of marine omega‐3 may be evident even at moderate‐to‐high doses. Eventually, incorporating data from the ongoing trial STRENGTH (Statin Residual Risk Reduction With Epanova in High Cardiovascular Risk Patients with Hypertriglyceridemia), which is testing a high dose of marine omega‐3 supplementation, may help to further clarify the dose–response relationship between marine omega‐3 supplementation and CVD risk.
Furthermore, results from our sensitivity analysis that excluded 4 smaller RCTs (Figure S4) suggested that adequate sample size, moderate‐to‐high marine omega‐3 dose, and longer treatment duration are required to ensure a rigorous and reliable assessment of the effect of marine omega‐3 supplementation on CVD end points. Finally, it is noteworthy that removing 2 open‐label trials attenuated the estimates for CHD death only (Table S1). This is largely because of the exclusion of GISSI‐P, the trial with the largest study weight (19.48%) among all included trials and showing a statistically significant risk reduction in this end point of nearly 20%. However, despite widening of CIs because of sample size reduction, the RR point estimates were virtually unchanged for MI, total CVD, and other vascular end points.

Our study has some limitations. First, we were unable to perform subgroup analyses that included the 3 additional trials because the necessary study‐level data were not available for these trials. However, because the associations did not differ across most subgroups (such as age, sex, and prior statin use) in these 3 additional trials, it is unlikely that any significant effect modification would emerge, in view of the absence of interactions across these subgroups in previous meta‐analyses. 6 Although VITAL suggested that marine omega‐3 supplementation may particularly benefit blacks and those with low fish consumption, we could not investigate such effect modifications in the current meta‐analysis because previous trials had predominantly white participants, and few studies assessed baseline fish intake. Second, because of the lack of published study‐level data, we were unable to include some end points such as subtypes of stroke and revascularization. Third, potential nonlinear relationships between marine omega‐3 supplementation and CVD end points could not be determined because of an insufficient number of trials. Finally, our study did not include some small trials or trials using dietary advice as the intervention. However, a previous study including these additional trials 7 produced results identical to an earlier meta‐analysis involving 10 large trials only, 6 suggesting that the results were unlikely to be influenced by inclusion of those studies.


Conclusions

The current updated meta‐analysis incorporating data from 13 RCTs, including 3 recent large trials, suggests that marine omega‐3 supplementation is associated with lower risk of MI, total CHD, total CVD, and death from CHD or CVD causes. Such inverse associations may be particularly evident at higher doses of marine omega‐3 supplementation. Additional large trials testing high doses of marine omega‐3 supplementation are warranted to confirm and extend these findings.

Source of Funding

VITAL was supported by grants U01 CA138962 and R01 CA138962 from the National Institutes of Health. Pharmavite LLC of Northridge, California (vitamin D) and Pronova BioPharma of Norway and BASF (Omacor fish oil) donated the study agents and matching placebos.


Disclosures

Dr Frank B Hu reported being supported by grants HL60712, HL118264, and DK112940 from the National Institutes of Health and reported receiving research support from the California Walnut Commission and honoraria for lectures from Metagenics and Standard Process and honoraria from Diet Quality Photo Navigation, outside the submitted work. The remaining authors have no disclosures to report.



Psychology’s Meta-Analysis Problem

Psychology has a meta-analysis problem. And that’s contributing to its reproducibility problem. Meta-analyses are wallpapering over many research weaknesses, instead of being used to systematically pinpoint them.

It’s a bit of a case of a prophet not being recognized in its hometown. Meta-analysis was born in the late 1970s in educational psychology. It’s the process of combining and analyzing data from more than one study at a time (more here).
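To make “combining data from more than one study” concrete, here is a minimal sketch of one common approach, inverse-variance (fixed-effect) pooling, in which more precise studies get more weight. The effect sizes and standard errors are made-up illustration values, and the function name `pool_fixed_effect` is my own; real meta-analyses often use random-effects models and dedicated software instead.

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Return the inverse-variance pooled effect and its standard error."""
    # Each study is weighted by 1 / SE^2, so precise studies count for more.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical effect sizes and standard errors from three studies.
effects = [0.30, 0.10, 0.25]
std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se = pool_fixed_effect(effects, std_errors)
print(f"pooled effect = {pooled:.3f}, SE = {pooled_se:.3f}")
```

Note that the pooled standard error is smaller than that of any single study. That is the appeal of meta-analysis, and also why biased inputs can produce a confidently wrong answer.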

Pretty soon, medical researchers realized it could be “a key element in improving individual research efforts and their reporting”. They forked off down the path of embedding meta-analysis in systematic reviewing – so much so, that many people mistake “meta-analysis” and “systematic review” as synonyms.

But they’re not. Systematic reviewing doesn’t just gather and crunch numbers from multiple studies. The process should include systematic critical assessment of the quality of study design and risk of bias in the data as well. Unreliable study results can fatally tilt results in a meta-analysis in misleading directions.

And psychology research can have a very high risk of bias. For example, many areas of psychology research are particularly prone to bias because of their populations: often volunteers rather than a sample selected to reduce bias, college students, or drawn from the internet or Mechanical Turk. They tend to be, as Joseph Henrich and colleagues pointed out [PDF], “the weirdest people in the world”: from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) societies. And their responses may not be reliable enough, either.

Here’s an example of what this means for a body of studies. I looked at this meta-analysis on changing implicit bias when I was researching for a recent blog post. It sounds like a solid base from which to draw conclusions: 427 studies with 63,478 participants.

But 84% of the participants are college students, 66% female, and 55% from the USA. They’re nearly all very short-term studies (95%), 85% had no measures of behavior in them, and 87% had no pre-test results.

James Coyne recently walked through a range of other critical biases and “questionable research practices” (QRPs) affecting the literature in psychology, including selective reporting of analyses, studies that are too small and lack scientific rigor, over-reliance on statistical significance testing, and investigator allegiance to a therapy or idea. John Ioannidis and colleagues (2014) also document extensive over-reporting of positive results and other reporting biases across the cognitive sciences.

Coyne and colleagues found that these problems were bleeding through into a group of meta-analyses they studied, too [PDF]. Christopher Ferguson and Michael Brannick suspected a skew from unpublished studies in 20-40% of the psychology meta-analyses they studied [PDF]. Michiel van Elk and his colleagues found similar problems, pointing out

[I]f there is a true effect, then a meta-analysis will pick it up. But a meta-analysis will also indicate the existence of an effect if there is no true effect but only experimenter bias and QRPs.
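A rough simulation can illustrate the second half of that point. The sketch below is my own illustration, not the method of van Elk and colleagues. It models only one mechanism (selective publication of significant positive results) as a stand-in for the broader set of QRPs, and it assumes a true effect of zero, unit-variance outcomes treated as known, and hypothetical sample sizes of 10, 20 or 40 per group.

```python
import math
import random


def pool_fixed_effect(effects, std_errors):
    """Inverse-variance (fixed-effect) pooled estimate."""
    weights = [1.0 / se ** 2 for se in std_errors]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)


def simulate_study(rng, true_effect=0.0):
    """Simulate a two-group study; return (mean difference, standard error)."""
    n = rng.choice([10, 20, 40])  # participants per group (hypothetical sizes)
    treated = [rng.gauss(true_effect, 1.0) for _ in range(n)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    diff = sum(treated) / n - sum(control) / n
    se = math.sqrt(2.0 / n)  # simplification: variance treated as known (= 1)
    return diff, se


rng = random.Random(42)
all_studies = [simulate_study(rng) for _ in range(2000)]

# Selective publication: only "significant" positive results (z > 1.96) appear.
published = [(d, se) for d, se in all_studies if d / se > 1.96]

pooled_all = pool_fixed_effect(*zip(*all_studies))
pooled_published = pool_fixed_effect(*zip(*published))

print(f"studies run: {len(all_studies)}, published: {len(published)}")
print(f"pooled effect, all studies:    {pooled_all:+.3f}")
print(f"pooled effect, published only: {pooled_published:+.3f}")
```

Under these assumptions, pooling every study recovers an effect close to zero, while pooling only the “published” subset yields a clearly positive effect even though nothing real is there.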

Guy Cafri, Jeffrey Kromrey and Michael Brannick add to this list a range of problems of statistical rigor in meta-analyses published across a decade in a major psychology journal – including multiple testing (an explainer here).

It’s not that these problems don’t exist in meta-analysis in other fields. But the problem seems far more severe in psychology, largely, it seems to me, because of the way meta-analysis has evolved in this field. In clinical research, systematic reviewing started to make these problems clear in the 1970s and 1980s – and the community started chipping away at them.

Formal instruments for assessing the robustness and risk of bias of different study types were developed and evaluated, and those assessments were incorporated into systematic reviews. We called what drove this process methodology or methods research. John Ioannidis and his team at METRICS call it meta-research:

Meta-research is an evolving scientific discipline that aims to evaluate and improve research practices.

It’s a long way from being a perfect science itself. It’s not that easy to reliably measure the reliability of data! And it’s not implemented as fully as it should be in clinical systematic reviews either. But it is the expected standard. (See for example PRISMA and AMSTAR for systematic reviews, and Julian Higgins and colleagues for meta-analyses [PDF].) Two of the key features of systematic reviews are appraising the quality of the evidence and minimizing bias within the review process itself.

Yet, in the equivalent for psychology – Meta-Analysis Reporting Standards (MARS) – assessment of data quality is optional. Gulp.

Fortunately, in psychological therapies, many adhere to clinical research standards for systematic reviews. To test my impressions of how non-therapeutic, non-imaging meta-analyses in psychology stack up, I assessed 10 published in the psychological literature in June on some key criteria. (Details of how I got to this 10 are at the foot of this post. And for comparison with biomedical systematic reviews, I used the recent study by Matthew Page and colleagues.)

It’s a small group of psychology meta-analyses, and I’m the only person who did the assessment, which are 2 big sources of bias. I tried to be very generous. It still isn’t a pretty picture – only 1 out of 10 meta-analyses came out with more thumbs up than down (“J”). And this was only a subset of quality criteria for a systematic review.

Selected bias minimization & quality criteria: assessment of 10 meta-analyses in psychology

None reported a protocol or ‘a priori’ design for the meta-analysis. That’s explicitly reported in about 40% of systematic reviews in biomedical research. The rate of comprehensive search for studies, with detail reported, was 60% (compared with around 90% for biomedicine). And the key issues of quality assessment? Half as often for psychology: only 30 and 40%, compared with 60 and 80% in biomedicine.

The risk of bias in the studies meta-analyzed was mostly unquestioned. Perhaps more of the authors share the misassumption of authors of “A”:

48.9% of the studies in this dataset (i.e. 68 out of 139 published articles) come from the six leading SCM journals mentioned above, and 77.7% of the papers (i.e. 108 out of 139) are published in journals included in the master list of the Social Sciences Citation Index. Therefore, the quality of the studies that provide the correlations for this meta-analysis is considered high. A list of the studies is available on request.

Oh boy. Perhaps people would have less faith in the published literature in their discipline if they really looked under the hood.

When I was looking for research tools to evaluate risk of bias in psychology studies, I found an article with the promising title, “Evaluating psychological reports: dimensions, reliability, and correlates of quality judgments”. But this 1978 article turned out to be instructive for other reasons. It was based on what editors and consultants for 9 major psychology journals thought was worth publishing. It rather meticulously describes the foundation on which psychology’s replication crisis was built:

Component VI seems primarily to reflect scientific advancement, while Component VII seems to merit the label “data grinders” or “brute empiricism,” with emphasis on description rather than explanation. The eighth component might be labeled “routine” or “ho-hum” research and brings to mind the remark one respondent penciled in the questionnaire margin: “How about an item like, ‘Oh God, there goes old so-and-so again!’ ?”

The derided components VII and VIII include bedrocks of good science, like “The author uses precisely the same procedures as everyone else”, and “The results are not overwhelming in their implications, but they constitute a needed component of knowledge”. You can see what this publication culture led to in a great example Neuroskeptic explained: romantic priming. Or another case – including problematic meta-analysis – Daniel Engber wrote about: self-control and ego depletion.

Meta-research is well on its way in psychology and neuroscience, and pointing to lots of ways forward. Kate Button and colleagues show the importance of increasing sample sizes, and Chris Chambers and more than 80 colleagues call for study pre-registration.

Changing the expectations of meta-analysis in psychology needs to be part of the new direction, too. Meta-analysis began in psychology. But long experience is no guarantee of being on the right track. Oscar Wilde summed that up in 1890 in The Picture of Dorian Gray:

He began to wonder whether we could ever make psychology so absolute a science that each little spring of life would be revealed to us. As it was, we always misunderstood ourselves and rarely understood others. Experience was of no ethical value. It was merely the name men gave to their mistakes… All that it really demonstrated was that our future would be the same as our past.

Disclosures: My ongoing PhD research is meta-research in systematic reviewing in health care, and assessing the quality of systematic reviews is part of my role at PubMed Health. I was one of the founding members of the Cochrane Collaboration and Germany’s Institute for Quality and Efficiency in Health Care (IQWiG), participated in a standard-setting group for reporting abstracts of systematic reviews, and was a member of the GRADE Working Group from 2008 to 2011.

The cartoons (including the mock reproducibility meta-analysis at the top of the post) are my own (CC-NC-ND-SA license). (More cartoons at Statistically Funny and on Tumblr.)

These were the first 10 which were not systematic reviews, therapy, gene association, survey, or imaging meta-analyses, from a PsycINFO search on 10 June 2016: meta-analysis in title. I also had to have immediate access to the full text. The full list of 37 citations it took to yield these 10 is here.

Rough comparison of outcomes between this set and Page et al’s large set of systematic reviews in biomedicine:


1.1 Social comparison

Festinger (1954) conceived SCT to explain how individuals make self-evaluations regarding opinions and abilities, seeking similar others as upward comparison targets (superior others), to maintain a stable self-view (Corcoran, Crusius, & Mussweiler, 2011). Earlier research also suggested that individuals engage in comparisons during stressful situations to assess how well they are coping (Schachter, 1959). Wills (1981) elaborated on Festinger's work by acknowledging that individuals also make downward comparisons (inferior others), often in reaction to a decrease in subjective wellbeing, with the goal of improving their affect. An example of comparison direction influencing affect could be a depressed patient comparing with a recovered patient (upward target) and experiencing an improvement or deterioration in affect due to the respective assimilation or contrast with the comparison target. The patient may also compare with a downward target (someone they see as more depressed), which again may lead to improvement or deterioration in affect dependent on whether they assimilate or contrast with the target. Thus, SCs are a cognitive process that can contribute to changes in affect depending on how individuals perceive their targets, as well as themselves.

Buunk and Brenninkmeijer (2001) report an example of how target similarity can influence the comparison process in the context of depression. They found that individuals high in depression experienced a positive mood change when exposed to examples of recovered depressed patients. However, individuals low in depression experienced a negative mood change when exposed to the same targets, showing that perceived similarity with a target can determine the impact of comparison. In addition, these results were moderated by an interaction between the effort of coping of the target (low or high) and degree of social comparison orientation (SCO; a concept regarding the extent to which comparison information is sought; Gibbons & Buunk, 1999). Thus, mood change was more positive for depressed and less negative for nondepressed individuals when targets were, respectively, low- or high-effort copers and high in SCO. These findings suggest that comparison habits vary depending on mental health and have subsequent effects on psychological factors, such as mood, yet are a complex interaction of aspects of comparison. The two-way relationship between mental health and SC is highlighted here, where identifying the processes involved is necessary to gain better understanding of their relationship to depression and anxiety.

Wood (1996, p. 521) defined SC as a process that consists of three facets: (1) acquiring social information (seeking, encountering, or constructing); (2) evaluating the outcome of comparison in relation to the self (relevance and [dis]similarities); and (3) reacting to the evaluation of comparison information via cognitive, affective, or behavioural responses. In the context of this definition, the B. P. Buunk and Brenninkmeijer experiment involves aspects of all three facets: the measure of SCO assesses a general level of engagement with SC, relating to processes in Part 1; the experimental manipulation focuses on Part 2, where the behaviour of targets and perceived similarity will affect the outcome of the comparison in relation to evaluations about the self; and the experimental outcome variable relates to Part 3, where reaction is assessed as change in mood. In order to identify and separate aspects of SC in publications included in this review, we will focus on the three facets of acquiring, evaluating, and reaction, as per Wood's (1996, p. 521) definition, when categorizing relevant variables.

1.2 SC and mental health

Depression and anxiety disorders are characterized by disturbances in emotional, cognitive, and behavioural processes (American Psychiatric Association, 2013). These disorders affect significant proportions of the general population, being the most common mental health disorders, with approximately 19.1% experiencing depression or dysthymia and 28.8% experiencing an anxiety disorder (including post-traumatic stress disorder [PTSD]) during the lifespan (Kessler et al., 2005). SC is likely to play a significant role in the development and maintenance of depressive symptoms, with evidence suggesting that reductions in engaging with general SC precede improvements in depression (Kelly, Roberts, & Bottonari, 2007). Literature further suggests that major depressive episodes are more common in individuals who negatively evaluate themselves compared with others (Sturman & Mongrain, 2008). Fewer publications focus on the role of SC in anxiety disorders; however, diary reports suggest that social anxiety patients are more likely to engage in upward SC, compare on several dimensions at a time, and are more vulnerable to affective reactions than healthy controls (Antony, Rowa, Liss, Swallow, & Swinson, 2005). Therefore, a better understanding of the nature and impact of SC processes in dysfunctional cognitions and behaviours related to depression and anxiety could benefit diagnosis and treatment.

1.3 Present study

As described above, there are a number of approaches to assess SC, such as SCO (Gibbons & Buunk, 1999), the SCS (Allan & Gilbert, 1995), as well as numerous experimental reaction paradigms (Gerber et al., 2018). To better understand the process of SC, we use Wood's definition and code how SC is approached in terms of manipulation and outcome of the three facets: acquiring, evaluating, and reaction. In this context, SCO would be coded as acquiring, as it reflects a tendency of seeking and engaging with SC. The SCS would be coded as evaluating, as it involves making judgements relative to others. Change in affect, cognitive reflections, or behavioural observations following SC would be coded as reaction. Our aim is therefore to assess how these facets of SC have been considered in depression and anxiety disorders. PTSD will also be considered in the literature review due to its classification as an anxiety disorder until the most recent DSM-5 publication (American Psychiatric Association, 2013). Our review involves the qualitative synthesis of a broad range of literature, covering observational and experimental paradigms. We also aimed to produce a quantitative synthesis of relevant data to summarize findings via a meta-analysis.