Tuesday, October 18, 2016

Critique of Meta-Analysis

Sir David Goldberg introduced me to the concept of meta-analysis. He was skeptical, expressing concern that it is difficult to build a sturdy wall from a conglomerate of miscellaneous pebbles, cobbles, and boulders.

(I was reminded of a critique of Immanuel Kant's system of philosophical thought as a sausage machine from which a homogeneous output flows from heterogeneous sources.)

Read what Senn (University of Glasgow Department of Statistics) says about meta-analysis, and try to avoid the errors:

Senn, 2009


p.s. You can learn a lot about a lot if you read Senn's paper carefully. Here is an illustration:

A meta-analysis by Hackshaw et al [16] in the BMJ considered passive smoking. The method involved weighting reported log-odds ratios using reported (or calculated from confidence intervals) standard errors. However, Peter Lee, in an extremely important but sadly neglected article [17] in Statistics in Medicine has pointed out that the fact that the standard error for a log-odds ratio is approximately equal to the square root of the sum of the reciprocals of the frequencies in the corresponding four-fold table provides various lower bounds on the standard error. Conversely, a given standard error implies a minimum sample size. In fact for a given total sample size N, the split of cases and non-cases in exposed and unexposed groups that gives rise to the minimum standard error is an equal split of N/4 subjects in each cell. It follows, for example, that for any reported variance, V [,] the total sample size, N [,] must satisfy the requirement that N ≥ 16/V. Similar inequalities exist for the total of any two cells and for the numbers in any given cell. 


Re-read this sentence fragment with care, and you will learn (or re-learn) a basic principle about 2x2 contingency table analysis that seems to get lost in the dense biostatistics curriculum. [One caution: Senn uses 'N' where an epidemiologist generally should use 'n', so that big N denotes the population size and little n denotes the observed sample size. Statistics and epidemiology don't always share the same conventions here, but within epidemiology we can keep ourselves straight by paying attention to the population N versus the sample n.]

...the standard error for a log-odds ratio is approximately equal to the square root of the sum of the reciprocals of the frequencies in the corresponding four-fold table...
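This formula, and the sample-size bound Senn derives from it, can be checked with a few lines of arithmetic. Here is a minimal sketch (my own illustration, not from Senn's paper or the Hackshaw meta-analysis); the function name and the example N of 400 are chosen for illustration only:

```python
import math

def log_or_se(a, b, c, d):
    """Approximate SE of ln(OR) from a 2x2 table with cell counts a, b, c, d:
    sqrt(1/a + 1/b + 1/c + 1/d)."""
    return math.sqrt(1/a + 1/b + 1/c + 1/d)

# With N subjects split equally (N/4 in each cell), the SE is at its
# minimum, sqrt(16/N). So any reported variance V implies N >= 16/V.
N = 400
se_min = log_or_se(N/4, N/4, N/4, N/4)  # sqrt(16/400) = 0.2
V = se_min ** 2                          # 16/N = 0.04
print(se_min, N >= 16 / V)               # 0.2 True
```

Any unequal split of the same 400 subjects gives a larger SE, which is why a reported variance puts a floor under the total sample size.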


The rest of the same paragraph is worth reading and committing to memory. Doing so, you might get a sense of the rationale for senior epidemiologists always asking about the unweighted cell counts used to derive an X-Y estimate via odds ratios or log-odds ratios. As a cell count goes toward zero, the reciprocal of that frequency grows accordingly, and so does the sum (and its square root). What seems to be a large estimated association can be rendered essentially null by a small count in just one cell of the 2x2 table.
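A small numeric sketch makes the point concrete. The cell counts below are hypothetical, chosen so that the odds ratio is a seemingly impressive 4 while one cell holds only 2 subjects; that one reciprocal (1/2) dominates the variance and drags the confidence interval across the null:

```python
import math

def log_or_ci(a, b, c, d, z=1.96):
    """Approximate 95% CI for the odds ratio from a 2x2 table
    (Woolf/log method): exp(ln(OR) +/- z * SE)."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

# Hypothetical table: OR = (8*100)/(2*100) = 4, but the cell of 2
# contributes 1/2 = 0.5 to the variance, swamping the other terms.
lo, hi = log_or_ci(8, 2, 100, 100)
print(round(lo, 2), round(hi, 2))  # interval spans 1: compatible with no association
```

Despite the point estimate of 4, the interval includes 1, so the apparent association is not distinguishable from the null, exactly the situation a glance at the raw cell counts would have flagged.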

Of course, this postscript is not a critique of meta-analysis. It has to do with practical numeracy in epidemiology at a quite basic level that we could teach in primary school once simple division and square roots are grasped (even before logarithms are mastered). Epidemiologists worth their salt know and apply these principles of our field.
