Q 1. Can you recommend a reference which could serve as a good introduction to study design for clinical research?
A. See Topics 12-16 of the book Medical Statistics at a Glance.
If you are registered with the University of Edinburgh, you can consult the electronic version of this book via the University’s library discovery system, DiscoverEd.
Here are some reference details:
- Title: Medical statistics at a glance
- Authors: Aviva Petrie and Caroline Sabin
- Publisher: Hoboken: Wiley
- Publication date: 2013
- Edition: Third edition
Q 2. What exactly is a pilot study, and when and why should it be conducted?
A. Check out the content at the following Students 4 Best Evidence page: What is a pilot study?
Q 3. I have been advised by a clinician that I need to be aware of confounding when interpreting results. I have a vague idea of what this is but would value gaining a better insight. Can you point me to any relevant resources?
A. It is helpful to be aware that confounding is a phenomenon arising from the presence of an extraneous variable in a statistical model which is correlated with both an independent (predictor) variable and the dependent (outcome) variable. The phenomenon consists either in the masking of a true association between the predictor and outcome variable (false negative) or in the false detection of an association between them (false positive).
Please take time out to refer to the online version of Medical Statistics at a Glance (2nd edition, Topic 34) to see these ideas in context by means of some clinical examples. This should help you form analogies with your own work. If you are registered with the University of Edinburgh (UoE) library,  you can go to the UoE library catalogue and access the details from there!
Note. Statisticians adjust for confounding through the use of multivariable models, which allow associations between predictor and response variables to be tested independently of other predictor variables. This approach is referred to as adjusting for confounding.
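To see adjustment in miniature, here is a small Python sketch using entirely invented counts, in the style of the classic kidney-stone teaching example. It uses adjustment by stratification (a simple cousin of the multivariable-model approach mentioned above): the crude comparison favours treatment B, yet within each severity stratum treatment A does better, because the confounder (severity) is associated with both treatment choice and outcome.

```python
# Hypothetical counts (all figures invented for illustration only) showing
# how a confounder, disease severity, can reverse a crude treatment comparison.

data = {
    # severity stratum: {treatment: (successes, total)}
    "mild":   {"A": (81, 87),   "B": (234, 270)},
    "severe": {"A": (192, 263), "B": (55, 80)},
}

# Crude (unadjusted) comparison: pool the strata and compare success rates.
for trt in ("A", "B"):
    s = sum(data[stratum][trt][0] for stratum in data)
    n = sum(data[stratum][trt][1] for stratum in data)
    print(f"Crude success rate, treatment {trt}: {s / n:.2f}")

# Adjusted comparison: compare within each severity stratum instead.
# Treatment A now wins in both strata, despite losing in the crude comparison,
# because severe cases were disproportionately allocated to treatment A.
for stratum in data:
    for trt in ("A", "B"):
        s, n = data[stratum][trt]
        print(f"{stratum} cases, treatment {trt}: {s / n:.2f}")
```

Running the sketch shows the reversal directly: crude rates favour B, stratum-specific rates favour A.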
Q 4. I am experiencing some difficulties in identifying the differences between cohort, cross-sectional and case-control studies and when to use them. Could you recommend a lucid reference?
A. The 2003 paper Observational research methods. Research design II: cohort, cross-sectional and case-control studies by C J Mann is perfect for your needs. Towards the end of this reference, you will also find a useful bullet-point summary of the key characteristics of each study design, together with a glossary of relevant terms. This reference is usefully complemented by
Non-randomised controlled study (NRS) designs, which provides a list of design types to choose from, helping you to be more precise when defining your own study if it is not a randomized controlled trial.
Q 5. I would like to have a firm handle on what is meant by randomization. Can you recommend a useful read?
A. Some varieties of treatment allocation are really just fake forms of randomization, and you would do well to identify a range of genuine ones, including their respective virtues. To this end, please consider reading through the resource Lesson 8: Treatment Allocation and Randomization, where you can learn about randomization procedures of varied complexity, ranging from simple random sampling to, for example, permuted block randomization with random block sizes. If you would like to read more about the latter randomization design, you should find the Statistics How To guide Permuted Block Randomization very accessible.
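As an illustrative sketch only (not a substitute for validated trial software), the following Python snippet shows what permuted block randomization with random block sizes amounts to; the function name, treatment labels and default block sizes are my own choices for the example.

```python
import random

def permuted_block_randomization(n_patients, treatments=("A", "B"),
                                 block_sizes=(4, 6), seed=None):
    """Allocate treatments in randomly ordered blocks of randomly chosen size.

    Within each block the treatments appear equally often, so the group sizes
    can never drift far apart; choosing the block size at random makes the
    next allocation harder for investigators to predict.
    """
    rng = random.Random(seed)
    allocation = []
    while len(allocation) < n_patients:
        size = rng.choice(block_sizes)      # block size must be a multiple
        assert size % len(treatments) == 0  # of the number of treatments
        block = list(treatments) * (size // len(treatments))
        rng.shuffle(block)                  # random order within the block
        allocation.extend(block)
    return allocation[:n_patients]          # truncate the final block if needed

print(permuted_block_randomization(12, seed=1))
```

Note that because only the final block can be truncated, the imbalance between groups can never exceed half the largest block size.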
Q 6. What exactly is meant by a randomized controlled trial?
A. The randomized controlled trial (RCT) is presented in a very helpful way on the Wikipedia page Randomized controlled trial.
Q 7. Are there different types of RCT?
A. Yes, there are indeed. For example, a factorial randomized controlled trial (RCT) is used to test multiple interventions in addition to a control in the same experiment. For more details of this and other types of RCT, please refer to the National Institute for Health Research website Clinical Trials.
Q 8. Is it always ethical to use randomized controlled trials in medical research?
A. No. Have a look, for example, at the BMJ paper Understanding controlled trials: Why are randomised controlled trials important?
Q 9. Why is it important to randomize the treatment allocation when assessing the effect of a new intervention by comparison with a control? Why can’t I simply consider the treatment allocation assigned by doctors or selected by patients?
A. The BMJ article Treatment allocation in controlled trials: why randomise? answers these questions for you and provides a reminder of a topical clinical trial relating to vitamin supplements that could not be relied upon due to the absence of randomisation.
Q 10. Does randomization always guarantee a fair comparison across groups with different treatment allocations?
A. For small sample sizes, not automatically. The law of large numbers should alert you to the possibility of covariate imbalance between treatment groups. Under simple randomization, the probability of a particular covariate property, such as a family history of disease, ought to be the same across two treatment groups, and for a continuous covariate, such as BMI, the average ought to be the same across the two groups. In practice, however, this ideal is only approached as the sample size grows large, so actual proportions or means may differ considerably in small trials. In this context, a statistician would wish to adjust for covariate imbalance so that a more accurate comparison can be made concerning the true difference between the treatment allocations. To learn more about these issues, together with findings concerning how large a sample is adequate to facilitate covariate balance in RCTs, please refer to the paper Testing baseline imbalance in a randomized study.
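If you would like to see the small-sample problem for yourself, the short Python simulation below (with an invented covariate prevalence of 30%) measures the typical between-arm imbalance in a binary covariate under simple randomization, for arms of increasing size.

```python
import random

def simulate_imbalance(n_per_arm, prevalence=0.3, n_trials=2000, seed=42):
    """Mean absolute between-arm difference in the proportion of patients
    carrying a binary covariate (e.g. a family history of disease) under
    simple randomization. The prevalence figure is an invented assumption."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_trials):
        arm1 = sum(rng.random() < prevalence for _ in range(n_per_arm))
        arm2 = sum(rng.random() < prevalence for _ in range(n_per_arm))
        diffs.append(abs(arm1 - arm2) / n_per_arm)
    return sum(diffs) / n_trials

# Imbalance shrinks as the arms grow, in line with the law of large numbers.
for n in (10, 100, 1000):
    print(f"n = {n:4d} per arm: mean absolute imbalance = {simulate_imbalance(n):.3f}")
```

Typically the average imbalance for arms of 10 patients is around ten times that for arms of 1000, which is why small trials in particular benefit from covariate adjustment.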
Q 11. In a clinical trial or other research study, what exactly is meant by blinding, why is it carried out and what is the difference between single and double blinding?
A. These questions are all addressed in the BMJ article Blinding in clinical trials and other studies.
Q 12. Where can I learn how to assign random treatment allocations to patients?
A. Methods of generating random samples vary in complexity. It is recommended that you start with the simple techniques for random sampling provided in the BMJ article How to randomise. A link to randomisation software is also provided in this article. As this link is no longer active, please refer instead to the list of randomisation software within the directory provided here.
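For a flavour of what the simplest technique amounts to, here is a minimal Python sketch of simple randomisation, where each patient is allocated independently, coin-toss style. The seed and group labels are arbitrary choices of mine, and a real trial should use validated software.

```python
import random

rng = random.Random(2024)  # arbitrary fixed seed so the list is reproducible

# Simple randomisation: each of 20 patients is allocated independently,
# like a fair coin toss. Illustrative sketch only.
allocations = [rng.choice(["Treatment", "Control"]) for _ in range(20)]

print(allocations)
print("Treatment:", allocations.count("Treatment"),
      "Control:", allocations.count("Control"))
```

Notice that nothing forces the two counts to be equal; that observation motivates the blocked designs discussed under Q 5.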
Q 13. Can you recommend a good resource on the use of placebos in clinical trials which also provides an explanation of the placebo effect?
A. Yes; the Wikipedia site Placebo provides very comprehensive details on these topics together with some relevant references from the published literature.
Q 14. What is meant by a longitudinal study?
A. There are a variety of types of longitudinal study. Have a look at the BMJ reference Longitudinal Studies for details.
Q 15. Where can I find a handy summary of the main types of bias to look out for in study design?
A. Have a look at Topic 34 of the 4th edition of the book Medical Statistics at a Glance by Petrie and Sabin, which will also help you to get the names right for the types of bias relevant to your particular study. This book is available as an electronic copy via the University of Edinburgh library catalogue, so if you are registered with this library, just sign in and have a look! You can print off the relevant topic (if you like) without having to purchase the entire book. This topic also refers you to further topics of the same book should you need to learn more about particular types of bias.
Additionally, you ought to be aware of the danger of introducing bias into a statistical analysis by including repeated measures (e.g. interferon-gamma testing carried out multiple times for some but not all patients) among a sample of individuals (the patients) for whom you had previously expected to assume independence (e.g. across interferon-gamma readings). This issue can crop up in many types of analysis, including the chi-square test of association, where, for example, you may wish to compare proportions of individuals across sites and some of the individuals contribute repeated measures. To understand the problem more fully but in simple terms, please refer to the resource Assumption of Independence.
Another form of bias to consider is volunteer bias. You may be concerned about such bias when, for example, you are planning a longitudinal study of the relationships between lifestyle habits and future dementia diagnosis and you are aware that those who have volunteered for the study may be healthier and better educated than the general population of interest. To learn more about volunteer bias, you may wish to consult the resource Volunteer Bias in Psychology: Definition and Importance. For the dementia example referred to above, you may also find it helpful to consider the advice on the calculation of standardised ratios provided on StatsforMedics (see EPIDEMIOLOGY FOR THE RUSTY and COMPARING A STUDY COHORT WITH A GENERAL POPULATION).
Q 16. I am carrying out a research study to compare distress levels for autistic children with those for a control group. I have been advised to match the autistic and control patients on several variables, including age and gender. So as to gain a better understanding of my study design, can you please offer some advice on the pros and cons of matching?
A. Please observe that, through the matching process, you are removing from your data the confounding influence of the matched variables (provided that influence truly exists). Imagine you are carrying out a univariable analysis (which involves comparing the control and case groups for a single outcome at any one time, without incorporating other predictive factors). Then matching can to some extent help alleviate the concern that you have over-estimated (positive bias) or under-estimated (negative bias) the effect of autism on your study outcome. So whilst the matching process will prevent you from comparing autism as a predictive factor for distress against age and gender, it will allow you to feel more confident about the validity of your findings following hypothesis testing at the univariable level.
That said, there are costs of matching which you need to be aware of. Firstly, note that if one of the variables you matched for (e.g. age or gender) is a predictor not only of your study outcome (distress level) but also of the factor you wish to focus on in your study (autism), it is possible that your study design will cause any true difference between autistic and control patients to be dampened. In other words, in these circumstances, the statistical power of your study may be attenuated. To help you reflect further on this problem, please consider reading through
the BMJ article Removal of radiation dose response effects: an example of over-matching.* Secondly, be aware that one major drawback of matching is that once you have matched for particular variables (age and gender, say), it is no longer possible to:
a) quantify the independent influence of these factors on the study outcome, or
b) test for an interactive effect between one of your matched variables (e.g. age) and autism when predicting distress level.
From this perspective, you are not in a position to provide a multivariable predictive model for distress which takes into consideration other factors in addition to autism status.  As a non-specialist in statistics seeking to juggle your research with other mandatory learning within the undergraduate medical curriculum, the development of such a model is well beyond what you should hope to accomplish within the time-frame of a short research project. Such work is best left to a trained statistician.
*N.B. The term 'over-matching' as used in the above article refers to an imprudent choice, in particular circumstances, of the factors by which one matches.
Q 17. I would like to gain a clear understanding of the distinction between the terms ‘bias’, ‘confounding’ and ‘effect modification’.
A.  Please refer to Bias, Confounding and Effect Modification, where in addition to conceptual distinctions, you will gain access to some useful statistical examples illustrating the source and identification of such phenomena.
Q 18. I am trying to estimate what magnitude of effect would be meaningful in assessing whether there is a noteworthy difference in the groups I am to compare. I understand that the notion of clinical significance is important here. However, I am not clear what this notion refers to or how clinical significance relates to statistical significance. Can you point me to a suitable reference?
A. You can learn a lot from consulting the paper Clinical versus statistical significance as they relate to the efficacy of periodontal therapy. This paper highlights the importance of deciding what size of effect you consider important from a clinical perspective prior to your study, and the usefulness of statistical significance (based on statistical hypothesis testing) as a means of assessing whether this outcome is likely to have occurred by chance.
Q 19. What is pseudoreplication?
A. Pseudoreplication is a phenomenon which arises when replicate samples (e.g. samples of glomeruli) are taken from the same large experimental unit (a mouse, say) so as to increase sample size, when the original and more appropriate intention, as defined by the study hypothesis, was to analyse data according to the larger experimental unit. These ideas are reflected in the example described in the solution to Q 20, where a link to a relevant publication is also provided. Even from a common-sense point of view, it should be clear that where one wishes to consider the effect of a treatment in a randomized controlled trial, if the treatment is randomly allocated to mice rather than glomeruli, the treatment effect (e.g. the mean difference across the two groups) should be analysed as though the experimental units are mice; otherwise, there is the danger of confounding due to data arising from the same mice. For the latter reason, pseudoreplication is problematic, as it can lead to spurious results, thus undermining the authenticity of study outcomes.
Q 20. I am analysing data from a drug intervention study comparing parameters measured in 16 mice given placebo and 16 mice given a drug. Most of the measurements are continuous variables. To assess injury in the kidneys we have randomly selected 100 glomeruli per animal and assessed these glomeruli for injury. To do this, we have opted to score the glomeruli in the following way: 0 refers to cases of no injury, while 1 and 2 refer to injury in increasing degrees of severity.
I’m not sure how best to analyse these data. I analysed the count data for the glomeruli using a chi-square test of association, where my test hypothesis was that there is an association between whether the drug was given and the level of injury. I am suspicious that the strong result, reflected by the very high chi-square statistic (34.9) and the very low corresponding p-value (3.4 × 10⁻⁹), is a product of pseudoreplication.
An alternative approach to comparing the two groups would be to calculate for each animal an overall measure of kidney injury (most simply, the proportion of glomeruli with injury) and analyse the resultant data using an appropriate method for comparing groups of continuous data. When I apply such a test to my two groups of size 16, the p-value is considerably higher (0.008). Does this reflect the fact that the second statistical approach is more appropriate?
A. You are correct to be concerned about the initial chi-square test on counts of glomeruli. As you know, the test knows nothing of the background to the data collection and is therefore treating glomeruli as individuals. The main problem here is that the chi-square test of association requires observations to be independent and of course, they are not, given that multiple glomeruli originate from the same source.
One of the simplest approaches to addressing pseudoreplication is taking averages. If you had taken a measurement (diameter, say) for each glomerulus, this would have amounted to obtaining an average glomerulus diameter per mouse and then performing a hypothesis test on the resultant continuous data. With your experiment, the situation is a little different because you are dealing with counts and not measurements. In obtaining a proportion in your later test (after collapsing the three injury categories into two), you have, however, created a type of average, and I see this as a possible solution to the pseudoreplication problem. Bear in mind, however, that when using the latter approach to compare groups of such small sizes (16 mice per group), it is wise to use appropriate tests of Normality to decide whether you should be applying a test for Normally distributed data or a non-parametric test.
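To make the averaging idea concrete, here is a Python sketch using simulated (entirely invented) injury scores for 16 placebo and 16 drug-treated mice, with assumed injury probabilities chosen purely for demonstration. The key step is collapsing each mouse, the true experimental unit, to a single per-animal proportion before any group comparison.

```python
import random
from statistics import mean, stdev

rng = random.Random(7)  # arbitrary seed; all data below are invented

def simulate_mouse(p_injury):
    """Invented injury scores (0, 1 or 2) for 100 glomeruli of one mouse."""
    scores = []
    for _ in range(100):
        if rng.random() < p_injury:
            scores.append(rng.choice([1, 2]))  # injured, severity 1 or 2
        else:
            scores.append(0)                   # no injury
    return scores

placebo = [simulate_mouse(0.40) for _ in range(16)]  # assumed injury rates,
drug    = [simulate_mouse(0.25) for _ in range(16)]  # chosen for illustration

# Key anti-pseudoreplication step: collapse each mouse (the experimental unit)
# to a single summary value -- the proportion of its glomeruli with any injury
# (score > 0) -- before comparing the two groups.
placebo_props = [sum(s > 0 for s in m) / len(m) for m in placebo]
drug_props    = [sum(s > 0 for s in m) / len(m) for m in drug]

print(f"Placebo: mean {mean(placebo_props):.3f}, SD {stdev(placebo_props):.3f}")
print(f"Drug:    mean {mean(drug_props):.3f}, SD {stdev(drug_props):.3f}")
# These 16 + 16 per-mouse proportions (not the individual glomeruli) are what
# should enter a two-group comparison, such as a t-test or Mann-Whitney U test
# after checking Normality.
```

The resulting test then has 16 observations per group rather than thousands of non-independent glomeruli, which is why its p-value is larger but honest.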
The paper The problem of pseudoreplication in neuroscientific studies: is it affecting your analysis? is, in my opinion, very suitable for non-statisticians as a means of understanding pseudoreplication in a clinical context. Please note the reservations expressed under ‘Averaging dependent observations’, recognizing (see later in the paper) that a mixed model approach is considered superior (although much more advanced) to that of averaging. Should you be interested in the latter approach, it is advisable to have a professionally trained statistician carry out the analyses as a member of your research team.
Understanding Different Study Designs by Margaret MacDougall is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.