Why That Observational Study of Hydroxychloroquine in The Lancet Seems Fishy

02/06/2020

The drug hydroxychloroquine, pushed by US President Donald Trump and others in recent months as a possible treatment for people infected with COVID-19, is displayed by a pharmacist at the Rock Canyon Pharmacy in Provo, Utah, May 27, 2020. Photo: Reuters/George Frey.

If you’re following the search for COVID-19 treatments at all, and possibly even if not, you will have seen the flurry of media coverage for the observational study in The Lancet, ‘Hydroxychloroquine or chloroquine with or without a macrolide for treatment of COVID-19: a multinational registry analysis’. It made the news not least because hydroxychloroquine is the drug President Donald Trump says he is taking in the belief that it will reduce his chance of catching COVID-19.

This view is not backed up by evidence, at least until some randomised trials come in. Getting in before the trials, The Lancet study used propensity-score matching to try to control for the non-random treatment. It found that taking hydroxychloroquine or chloroquine was associated with an increased risk of heart problems.

I am highly skeptical of the powers of hydroxychloroquine with relation to COVID-19 (‘skeptical’ in the sense that I have suspended judgement for now – there simply isn’t evidence either way). But I want the test of its properties to be done properly, with randomised controlled trials. And if we are to use observational studies (which I do not object to, they just aren’t as useful as an experiment where you can manipulate the treatment), they have to use real data.

The data in that study, and in at least one preprint on a second treatment, were provided by an Illinois firm called Surgisphere. Allegedly the data represents the treatment and health outcomes of 96,032 patients from 671 hospitals in six continents. However, there is simply no plausible way I can think of that the data are real.

I’ll say that again: I believe with very high probability the data behind that high profile, high consequence Lancet study are completely fabricated.

If Surgisphere can name the 671 participating hospitals or otherwise prove that the data is real, I will retract that statement, delete this post or write whatever humbling apology they desire. But I think there’s nearly zero chance of that happening.

Could Surgisphere really have patient data from 671 hospitals?

I’m far from the first to ask for more information on this amazing new database no-one had heard of, and they’ve had a week to explain. This is what they came up with:

The Surgisphere registry is an aggregation of the de-identified electronic health records of customers of QuartzClinical, Surgisphere’s machine learning program and data analytics platform. Surgisphere directly integrates with the EHRs of our hospital customers to provide them actionable data insights to improve efficiency and effectiveness. As part of these QuartzClinical customer agreements, Surgisphere, as a global healthcare data collaborative, has permission to include these hospitals’ EHR data in its queryable registry/database of real-world, real-time patient encounters.

…

While our data use agreements with these institutions prevents us from sharing patient level data or customer names, we are able to complete appropriate analyses and share aggregate findings to the wider scientific community.

(“EHR” is ‘Electronic Health Record’, i.e. patients’ personal data). Frankly, this doesn’t pass the laugh test.

I can imagine why any hospital customers would not want to be named, because if it came out that they are allowing their data to go to Illinois to be analysed at will – The Lancet article says it was “deemed” that ethical approval was not needed – there would surely be an outcry. This would be a much bigger scandal than Facebook giving data to Cambridge Analytica. After all, what we post on Facebook was seen by many people as quasi-public. Imagine having your electronic health records – patient demographics, medical history, medications, allergies, lab results, radiology results – given to Cambridge Analytica.

In Australia, we recently had a major public controversy about sharing health records between health providers. I can’t imagine the reaction if it was found they were being shared with overseas researchers without permission or knowledge. And the fact that hospitals aren’t named by Surgisphere means that no patient of any hospital in the world knows whether or not their data is used in this study.

But hang on, you might say, this data (which remember, I think doesn’t exist but let’s pretend it does for the sake of argument) isn’t going to a shady outfit like Cambridge Analytica, it’s going to the “global healthcare data collaborative” Surgisphere.

Right, let’s look at Surgisphere. Surgisphere has five employees with LinkedIn accounts. Other than the chief executive and co-author of The Lancet paper, these are a vice-president of business development and strategy, a vice-president of sales and marketing, and two science editors (actually, one science editor and one ‘Scoence Editor’, which does not inspire confidence in their attention to detail while editing). LinkedIn also records one employee of QuartzClinical – a director of sales marketing.

Here are some of the people you might expect to work for a genuine global health care data collaborative that had sold software to 671 hospitals and integrated with their electronic health record (EHR) systems, and that coordinates an ongoing international health research collaboration:

– Global network manager and coordinators
– Hospital/customer liaison team
– Support staff / help desk
– Trainers, and developers of training material
– Researchers
– Legal team to deal with privacy and contract issues in dealing with 670+ hospitals. Issues coming from the EU’s GDPR alone would keep a substantial legal team busy I’m sure.
– Software or database developers. Like, maybe a humble extract-transform-load developer or two to get those billions of rows of transactions data into a database.
– Database administrators and data engineers
– EHR integration solutions specialists
– Data governance lead
– If any of the above are outsourced, a procurement team to handle all the sub-contracting

Surgisphere does not have any of these people, except for Sapan Desai who doubles up as chief executive and medical researcher (a good indication of the size of the firm: most chief executives are not also active publishing researchers). Judging from its LinkedIn profile, his team is three sales executives and two science editors.

Nor does Surgisphere or any of its staff have a presence on GitHub. Nor is there an explanation anywhere of the impressive data engineering that would be required to wrangle all that data. Nor are there journal articles, conference papers or even blog posts describing its network, the APIs that connect it, how proud they are of their Hadoop cluster on AWS[footnote]The sort of infrastructure required to process large amounts of data[/footnote], which database platform they use, etc. – all the things that real firms that have made impressive innovations produce (and the first ever world-wide database of individual level hospital data would be an impressive innovation, if it were real).

Yet Surgisphere claims to have sold software to 671 hospitals. What would it cost to deploy machine-learning data analytics software to a hospital and integrate it with the EHR? This isn’t some light and easy integration like installing a stats package on a PC and giving it an ODBC[footnote]Open Database Connectivity[/footnote] connection to a database. The integration to the EHR systems and the way we know they use the data means, at a minimum, sending all the data to the cloud. That means you need to deal with network and security architects, have extremely robust testing and bullet-proof security (remember, this is some of the most closely guarded sensitive data in the world), and go through who knows what red tape at each hospital in terms of convincing their data governance people of what you are doing.

I don’t know, but $1 million for each deployment can’t be far off the mark. Certainly not less than $300k a pop. So Surgisphere should be a billion dollar company if it’s done this 670 times, but it clearly is not. In fact, Dun and Bradstreet estimate its revenue at $45,245. You couldn’t even do the discovery stage of an EHR integration project at a single hospital for that, never mind deploy anything.
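The back-of-envelope arithmetic above can be laid out explicitly. The following is a minimal Python sketch that only multiplies out the figures already quoted: the 671 claimed hospitals, the guessed $300k to $1 million per deployment, and the Dun and Bradstreet revenue estimate. The variable names are illustrative, and the cost range is a guess from the text, not measured data.

```python
# Back-of-envelope check using only figures quoted in the text.
# The per-deployment cost range is the author's guess, not data.
CLAIMED_HOSPITALS = 671          # hospitals Surgisphere claims as customers
COST_LOW = 300_000               # "certainly not less than $300k a pop"
COST_HIGH = 1_000_000            # "$1 million for each deployment"
DNB_REVENUE_ESTIMATE = 45_245    # Dun and Bradstreet revenue estimate (USD)

# Revenue implied by deploying to every claimed hospital.
implied_low = CLAIMED_HOSPITALS * COST_LOW
implied_high = CLAIMED_HOSPITALS * COST_HIGH

# What the estimated revenue would amount to per claimed hospital.
revenue_per_hospital = DNB_REVENUE_ESTIMATE / CLAIMED_HOSPITALS

# How many times larger the low-end implied figure is than the estimate.
shortfall_factor = implied_low / DNB_REVENUE_ESTIMATE

print(f"Implied deployment revenue: ${implied_low:,} to ${implied_high:,}")
print(f"Estimated revenue per claimed hospital: ${revenue_per_hospital:,.2f}")
print(f"Low-end implied revenue is about {shortfall_factor:,.0f}x the estimate")
```

The comparison is crude, since one figure is an annual revenue estimate and the other a cumulative deployment total. Even so, the low-end guess implies hundreds of millions of dollars, while the estimated revenue works out to under $70 per claimed hospital.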

Of course, EHR integration is a real thing, and it’s done usually to move patient information securely around. For example, a quick Google found this useful presentation about EMR integration (EMR and EHR are basically interchangeable terms) in the Great Lakes region. I notice Surgisphere is conspicuously absent from the list of presenters on slide 10. This makes it kind of surprising (but not really) that they claim in The Lancet article to have data on most COVID-19 hospital cases in North America diagnosed before April 14, 2020 – 63,315 such cases in the study according to Table S1, which would have been a clear majority of all hospital cases.

What about QuartzClinical software?

What about this QuartzClinical software that is claimed to have been sold to 671 hospitals and is sending the data back to Chicago? It has its own website. It claims to use “machine learning and advanced statistical analysis” to help decision-making. Remarkably, it “successfully integrates your electronic health record, financial system, supply chain, and quality programs into one platform”. Let’s revise my $1 million estimate to $10 million over three years, minimum, if that means you’re replacing those things with one platform. But probably it just means a data warehouse that pulls from your various data sources, and has an analytical layer and recommendation engine on top. Straightforward business intelligence stuff, but still a big project for a hospital.

I can’t say more than that because the QuartzClinical site is very light on details. It doesn’t have any customer testimonials. It doesn’t talk about what’s under the hood. It doesn’t have any information on versions or history or the forward roadmap. It does claim to have won some awards though. Let’s see:

– “Grand Prize in Quality, International Hospital Federation 39th World Hospital Congress 2015”. Nope, that actually went to Texas Children’s Hospital for “Advanced Population Health – the critical role of care delivery systems”.

– “Second place Dr Kwang Tae Kim Grand Award, International Hospital Federation 41st World Hospital Congress, 2017”. Nope, the two honourable mentions were “Achieving high reliability through care coordination for patients who require emergency surgery” by Northwest Community Hospital, USA, and “the application of improving clinical alert system to reduce the unexpected cardiac arrest event in Taiwan” (Yuan’s General Hospital, Taiwan). Neither of these sounds like something QuartzClinical would have been part of.

– “Institute for Healthcare Improvement – Four of the Best from the IHI Scientific Symposium (2017).” I couldn’t find this ‘award’ so it’s possible they really did get listed in some such “four of the best” list. The only mention of Quartz Clinical on the ihi.org website is as an exhibitor at the 2018 symposium. It’s possible they also exhibited a year earlier and got some kind of recognition.

– American Hospital Association McKesson Quest for Quality Prize for 2017. This went to the Memorial Medical Center of Springfield, Illinois. From their description of how they won I don’t see anything that seems linked to software like QuartzClinical. Instead, they did things like changing the process for dealing with hip fractures, and placing handrails in hospital rooms. However, according to his LinkedIn profile, Surgisphere CEO Sapan Desai worked for the Memorial Medical Center from mid-2014 to mid-2016 as director of “Quality Alliance and Predictive Analysis”, so it’s plausible he had some role in the program that led to them winning the award, even if QuartzClinical was not involved.

– Frost and Sullivan Healthcare Innovation Technology Award 2019. Yes, this one Surgisphere do seem to have genuinely won. However, the Wikipedia page for Frost and Sullivan says that these awards are “based on research using a proprietary methodology, which is sometimes based on a single article produced by the receiver of the award”, describing them as a vanity award that the recipient pays a fee to communicate. I can’t judge that.

In addition to these five claimed awards, there is this media release saying that Sapan Desai “received an honourable mention for his outstanding achievements in quality and patient safety, corporate social responsibility, innovations in service delivery at affordable cost, healthcare leadership, and management practices” at the IHF’s Dr Kwang Tae Kim Grand Award ceremony in Taipei, Taiwan in 2018. This appears false. The Dr Kwang Tae Kim Awards are for hospitals and health care organisations, not individuals. All five mentions of Sapan Desai on the IHF website relate to him giving conference talks; there is no mention of him getting an award. The fact that his own press release announcing his ‘honourable mention’ does not link to any authoritative source for that is suspicious in itself.

So one correct claim (Frost and Sullivan), one exaggerated (the Memorial Medical Center award, which was not for QuartzClinical but at least was an award, with a plausible connection to Desai), three apparently false (relating to the International Hospital Federation) and one uncertain (the Institute for Healthcare Improvement).

I was particularly puzzled by the 2015 IHF Grand Prize in Quality. It seems such a specific and easily disprovable claim, and as well as being on the QuartzClinical site it is made repeatedly by Sapan Desai as an individual, for example in his bio for this event in 2018: “He is the recipient of the international grand prize in healthcare quality by the International Hospital Federation in 2015.” Was he perhaps working at the Texas Children’s Hospital? (No, he wasn’t.)

Then I came across this piece claiming a “top quality award” at that 2015 IHF 39th congress. Despite the headline, the text actually reports Desai was given “first prize for the best presentation”, for his “Improving the Success of Strategic Management Using Big Data”. There’s no record of this award on the IHF site, although he definitely did give that presentation. It is plausible he got an award for best presentation. I now think that at some point in subsequent CV-garnishing, this evolved into the claimed “Grand Prize in Healthcare Quality”.

My best guess is that the other apparently false claims of awards, if they have any basis, started out as conference awards or honourable mentions for talks and have since been exaggerated into significant awards for software.

How else might we know those awards before 2019 weren’t for QuartzClinical? Well, it was only launched in January 2019 as seen by this ‘review blog’ which transparently just repeats media releases verbatim.

What we’re left with, with QuartzClinical, is a description of software that seems to combine data warehousing from multiple sources with an analytical layer that then provides decision-supporting algorithms. The analysis is apparently done off the customer’s premises (because we know Surgisphere claim they retain all the data for future use). The data sources include both the finance and electronic health records and at a minimum would need some moderately complex data engineering and pipelines for deployment. The firm that owns it has no capability for ICT project management, software development, deployment or support.

There are very few references to this software on the web other than its own promotional material. It has an entry on venddy.com, a site that allows vendors and purchasers of health systems to review each other, but zero user reviews. The promotional material appears on the web from early 2019 onwards so we know it is around a year old. The owner has a record of exaggerating his CV well beyond the point of being misleading (e.g. an honourable mention for giving a paper evolves over time to the Grand Prize in Quality).

What is the probability that a new cloud-based data analytics tool, which integrates with the most sensitive data systems hospitals have (finance and electronic health records) and transfers that data across international boundaries, goes from zero to deployed in 671 hospitals on six continents in 12 months, yet has no user reviews and no discussion on the web from excited IT managers involved in its deployment? Zero, that is the probability; or as close to zero as counts.

‘Surgical outcomes’

Next, a few words about Surgical Outcomes, the international collaborative network of QuartzClinical customers (hospitals and health care centres) that are so trustingly giving their data to Surgisphere. Here is the Surgical Outcomes website. It is an odd combination of hype about machine-learning and six-sigma process improvement. You can join the collaborative for $295 per year and access online education resources for continuing medical education/maintenance of certification. Or pay $2,495 to access other services such as participation in the “research collaborative”.

There are many screenshots of a business intelligence tool, presumably QuartzClinical (which is promoted heavily), allowing the user to drill down (for example) into surgical procedures and understand cost drivers, accompanied by goofy videos on the power of data and importance of performance metrics.

There is a frankly weird blog with about 100 posts, starting in September 2019. These combine basic statistical instruction on topics such as propensity score matching with quality control and project management advice. Some of the statistics is simply wrong; one example chosen at random being this screenshot which incorrectly names the limits of a confidence interval “parameters”.

The oldest blog post on the Surgical Outcomes site is from September 2019 and titled “How do I sign up”. I think we can safely say this is the beginning of the Surgical Outcomes “international collaborative network”. Here is a screenshot from that blog:

You and I know, dear reader, that this is not how hospitals agree to share patients’ personal data. In particular, it is not how hospitals in other countries decide to share their data with a firm in the US. We also know that “a quick technology assessment” is not what is needed before deploying an analytical platform. Not one that draws data from the hospital’s finance and EHR systems, stores it in the cloud, conducts machine-learning on it and returns decision recommendations integrated with the hospital’s own processes.

The article itself

I haven’t even mentioned the data issues we can glean directly from a reading of the article, other than in passing about the surprisingly high proportion of North American hospital cases that were in-sample. Several of the more obvious errors relate to Australia and have been reported on in the media. For example, the article includes many more cases in Australia than existed at the time of the study, as reported in The Guardian. Surgisphere responded that a recently joined hospital (could there be any other kind!) “self-designated as belonging to the Australasia continental designation … This hospital should have more appropriately been assigned to the Asian continental designation.” Hmm, so the secret database has dreadful data quality but sure, mistakes happen.

But as Thomas Lumley points out, the misclassified hospital had to have 546 hospitalised COVID-19 cases by April 14 and describe itself as being in Australasia. Indonesia had enough hospital cases by then but it seems unlikely there was a concentration of this size in one hospital. And would an Indonesian hospital self-describe as Australasian? (No, it would not.) And could this data be shared legally with a firm that doesn’t even know which country’s laws it needs to abide by? (No, it could not.)

Then there’s the smoking rate in North America being three times what it is in South America; the small range in average BMI[footnote]Body mass index[/footnote]; the implausibly detailed data for Africa; the ethnicity data that is illegal to collect in some countries; and on and on.

I don’t want to write any more, it makes me upset and angry just thinking about this. It’s all said better in the links at the bottom of the page anyway.

Previously, I had more or less gone along but thought there was exaggeration when people said “peer review is broken”, but now I really believe it. In the future my motto is really going to be “publish the data and code or it didn’t happen” – not just as “this is good practice” but as in “if you don’t, I need to think you might be making this up”. With sensitive data, we’ll need to find ways to provide synthesised or other disclosure-controlled versions.

Here’s a good quote from Andrew Gelman (link included later):

The good news about this episode is that it’s kinda shut up those people who were criticising that Stanford antibody study because it was an un-peer-reviewed preprint. The problem with the Stanford antibody study is not that it was an un-peer-reviewed preprint; it’s that it had bad statistical analyses and the authors supplied no data or code.

I hope I’m wrong about this whole thing. Maybe Desai’s ETL developers, support staff and EHR integration specialists just aren’t on LinkedIn while his sales people are. Maybe hospitals really are knowingly and happily sharing our data with an American firm, and the data is stored in European servers to comply with the GDPR and there’s even some patient permission given somewhere that hasn’t been mentioned. Perhaps QuartzClinical is wrapped in some other firm’s software so it has been deployed to 671 hospitals without any reviews or discussion because its branding is hidden.

I would feel bad about writing such a long aggressive post as this in that case. But it seems very unlikely. It is dreadful to think that the most likely explanation of what we’re seeing is simply that the data are fabricated, in what is possibly a criminal conspiracy, and the science publication process is so broken that it gets through. It just seems to me very likely that this explanation is the correct one.

Some other criticisms

– An excellent open letter with multiple signatures by various researchers led by James Watson (not the DNA guy). Very measured and asks excellent questions.

– James Todaro’s critique of the article, which focuses much more on Surgisphere’s credibility (like my post above) than the above letter does.

– The Guardian Australia coverage of some aspects of the controversy

– The latest of quite a number of posts on Andrew Gelman’s Statistical Modelling blog

– #LancetGate on Twitter (mostly in French)

Peter Ellis is an Australasian professional statistician and data scientist with a background running analytics/stats/data science teams up to 20 people in size, currently available for consultancy work in his role as Chief Data Scientist at Nous Group.

This article was originally published by Peter Ellis on his blog. It has been republished here under a Creative Commons Attribution-ShareAlike 4.0 International License.

It was lightly edited to abide by the parts of our style guide pertaining to capitalisation, to change instances of American spelling to the British, and to change the headline.