Bayesian analysis of real‐world data as evidence for drug approval: Remembering Sir Michael Rawlins

British Journal of Clinical Pharmacology – July 17, 2023

Source: OpenAlex

Summary

In a real-world dataset of medical cannabis for childhood epilepsy, all 20 patients improved; a Bayesian analysis with a flat prior gives a 95% probability that the next patient will also improve. For psilocybin in treatment-resistant depression, the probability of a favourable response ranges from 62% to 82%, depending on the depression measure used. The letter argues that Bayesian analysis of real-world data, which updates a prior probability with observed outcomes, directly estimates the probability of treatment success and can do so with far fewer patients than a traditional placebo-controlled trial; a comparable trial of purified cannabidiol used 170 patients.

Abstract

Modern medical research is built on randomized controlled trials (RCTs), in most of which the active treatment is compared with placebo. A recent expert consensus survey endorsed the statement that 'Results from placebo-controlled trials are more reliable than results from any other study design',1 reflecting that placebo-controlled RCTs are considered to be the gold standard. However, in his 2008 Harveian Oration, the prestigious annual lecture of the Royal College of Physicians of London, Sir Michael Rawlins, the former head of the National Institute for Health and Care Excellence (NICE) and the Medicines and Healthcare products Regulatory Agency (MHRA), pointed out that RCTs are not the apex of evidence, but rather a piece of a larger evidence puzzle. He wrote, 'RCTs are often called the "gold standard" for demonstrating (or refuting) the benefits of a particular intervention. Yet the technique has important limitations of which four are particularly troublesome: the null hypothesis, probability, generalisability and resource implications'.2 Here, we follow in the footsteps of Sir Michael Rawlins and highlight how the combination of real-world evidence (RWE) and Bayesian probability analyses could complement the traditional approach of RCTs and null hypothesis significance testing (NHST).

What does Bayesian analysis reveal about medical cannabis for childhood epilepsy? In the dataset, all 20 children experienced a reduction in seizure numbers, so after updating the flat prior (all success rates between 0% and 100% are equally plausible), the probability that the next patient will improve is 95%, with a 95% credible interval of 87%–100%. Even if we choose a sceptical prior, a gentle distribution excluding 0% and 100% with a mean at 25%, the probability that the next patient will improve is still 88%, with a 95% credible interval of 75%–98%. Similar results are obtained for treating treatment-resistant depression with psilocybin. 
In this case, results depend on which exact depression measure is used. The worst-case scenario is using QIDS-16, where the probability of a favourable response is 62% with a 95% credible interval of 42%–82%, and the best-case scenario is using the MADRS measure, where the probability of a favourable response is 82% with a 95% credible interval of 66%–96%. See Figure 1 for the posterior distribution of the treatment success for both datasets; all associated data and code can be found at https://github.com/szb37/Bayesian-RWE.

Using Bayesian analysis of real-world data, we showed with data from just 20 patients that the probability of success treating childhood epilepsy with cannabis is 95%. In comparison, a traditional placebo-controlled RCT of a very similar product, Epidiolex, which contains purified cannabidiol, in a similar clinical population used a cohort of 170 patients to reveal a statistically significant between-treatment difference using NHST.7 It is worth contrasting these two approaches to investigating the efficacy of cannabis. The most obvious difference is sample size. Placebo-controlled RCTs are 'information inefficient' because half of the patients are not receiving the treatment under investigation. This contributes to the high cost of drug development, which is eventually passed on to consumers. Moreover, placebo-controlled RCTs raise ethical concerns when receiving an ineffective treatment, as in the placebo arm, could have severe consequences, for example, suicide in depression or status epilepticus in epilepsy. Furthermore, traditional RCTs generally require a larger sample, because their primary outcome is the 'between-treatment difference', that is, how much better the treatment is relative to placebo, which requires that about half of the patients are randomized to the placebo group. However, this 'between-treatment difference' is not relevant to either patients or doctors when choosing a treatment. 
Patients and doctors experience, observe and care about the 'change over time', that is, how much improvement is to be expected from the treatment. Relatedly, traditional RCTs report the between-treatment P-value, which is the 'probability of obtaining the observed or more extreme data assuming that the treatment is no better than placebo'. Note that this probability is not the probability of treatment success. In contrast, the Bayesian analysis yields the probability of treatment success when treating the next patient, which is what doctors and patients care about.

The main reason why RCTs are held in such high regard is that, after blinding, non-specific treatment effects should be equally distributed between treatment arms8; hence, the between-treatment difference should correspond to the true treatment effect, free of subjective biases. This seems a compelling reason to favour RCTs over alternatives; however, in practice, only a small minority of trials measure blinding integrity and thus empirically demonstrate that patients were genuinely unaware of their treatment allocation.9, 10 In many trials, participants become unblinded due to side effects,11, 12 undermining the purpose of blinding and hence the objectivity of placebo-controlled RCTs.13 In particular, psychedelics elicit conspicuous subjective effects that make them easy to distinguish from placebo; therefore, running truly blinded trials is nearly impossible.14 In a recent trial of psilocybin for the treatment of alcohol use disorder, 94% of participants correctly guessed their treatment allocation (50% would be expected in a truly blind trial) with a mean confidence of 89%.15 We emphasize that the unblinding is not due to incompetence, but rather to the nature of the treatment. Similarly, most exercise-, meditation- and diet-based therapies are tested in an unblinded manner.16

One major criticism of RWE is the lack of a control condition, raising the question of whether the effects could be driven by placebo response. 
While this concern cannot be entirely eliminated, there is more at play here. In both of our case series, patients were treatment resistant: in the depression study, all patients had failed on at least two previous antidepressant drugs, some on more than 10, and all had failed on psychotherapy. It has been shown that for depression, previous failed medication is associated with decreased chances of success for the next treatment,17, 18 and the observed response rate is much higher than what is generally observed in the placebo arms of antidepressant trials (~30%),19 arguing that these results are difficult to explain by the placebo response alone. Moreover, control conditions can be incorporated into RWE. For example, we previously ran a 'self-blinding' trial on psychedelic microdosing, where citizen scientists implemented their own placebo control without clinical supervision.20 Another example of an RWE control condition is provided by a single case where medical cannabis treatment was interrupted due to medicine access problems and seizures rapidly reappeared.21 This case can be viewed as an n = 1 ABA(B) trial, that is, a within-subject crossover trial.22 Therefore, when ethical considerations allow, such designs could help to establish the causal effect of the treatment even in an RWE context.23, 24

Our arguments here should not be read as a call for the abolition of traditional RCTs, but rather as a call to consider complementary forms of evidence. For example, Bayesian analysis of RWE can be implemented as a hypothesis-generating step prior to conducting the more resource-intensive traditional RCTs. We believe Sir Michael Rawlins would support this agenda, as he argued that 'randomised controlled trials […] should be replaced by a diversity of approaches that involve analysing the totality of the evidence base'.2

David Nutt was PI on the two studies. All three authors conceived the analytic design and wrote the text; Balázs Szigeti conducted the Bayesian analysis. 
We would like to acknowledge Rayyan Zafar, who provided useful feedback. We thank the MRC for funding the psilocybin for treatment-resistant depression trial.

B.Sz. and L.P. declare no conflicts of interest. D.N. is an advisor to Neural Therapeutics and Algernon Pharmaceuticals. He has received consulting fees from H. Lundbeck and Beckley Psytech and lecture fees from Takeda, Otsuka and Janssen, and owns stock in Alcarelle, Awakn Life Sciences and Psyched Wellness.

Data are available from https://github.com/szb37/Bayesian-RWE.
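
The flat-prior update for the epilepsy dataset described in the abstract can be illustrated with a short conjugate Beta-binomial calculation. This is a minimal sketch and not the authors' code (which is in the repository linked above). It rests on three assumptions: the flat prior is taken to be Beta(1, 1), 'the probability that the next patient will improve' is taken to be the posterior mean, and the 95% credible interval is taken to be a highest-density interval approximated on a grid. The function names beta_posterior, beta_mean and beta_hdi are hypothetical helpers introduced only for this illustration.

```python
import math


def beta_posterior(successes, failures, prior_a=1.0, prior_b=1.0):
    """Conjugate update: Beta(prior_a, prior_b) prior plus binomial data.

    The default Beta(1, 1) is the flat prior (an assumption of this sketch).
    """
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Posterior mean, i.e. the predictive probability of the next success."""
    return a / (a + b)


def beta_hdi(a, b, mass=0.95, grid_size=20001):
    """Approximate highest-density interval of a unimodal Beta(a, b).

    Evaluates the density at cell midpoints on (0, 1), then keeps the
    highest-density cells until they hold `mass` of the probability.
    """
    step = 1.0 / grid_size
    xs = [(i + 0.5) * step for i in range(grid_size)]
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    dens = [
        math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))
        for x in xs
    ]
    total = sum(dens)
    order = sorted(range(grid_size), key=lambda i: dens[i], reverse=True)
    acc, kept = 0.0, []
    for i in order:
        acc += dens[i] / total
        kept.append(i)
        if acc >= mass:
            break
    # Report the outer edges of the lowest and highest retained cells.
    return xs[min(kept)] - step / 2, xs[max(kept)] + step / 2


# Epilepsy dataset from the text: 20 of 20 children improved.
a, b = beta_posterior(successes=20, failures=0)
low, high = beta_hdi(a, b)
print(f"posterior Beta({a:g}, {b:g}), mean {beta_mean(a, b):.3f}, "
      f"95% HDI {low:.3f} to {high:.3f}")
```

Under these assumptions the posterior is Beta(21, 1), whose mean is 21/22 and whose highest-density interval starts near 0.87, in line with the 95% and 87%–100% quoted in the text. A sceptical prior can be tried by passing different prior_a and prior_b values; the exact prior parameters used by the authors are not stated in the text and would have to be taken from their repository.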
