Fake news has been in the public eye due to the spread of the phenomenon during the recent US election, and its impact can be significant. But have you ever considered the impact of fake clinical data being published in medical journals? That is even more alarming, because people's lives can be put at risk by it.
Last year we were scoping out an AI-driven oncology project to do the following:
- review all relevant, authoritative medical literature in oncology and collate it,
- review all clinical trial data in the space and collate it with results, biomarkers and more,
- add in local country and hospital treatment protocols,
- link this to the electronic patient records (longitudinal data) to examine scan images, all blood and diagnostic test results (including genetic testing), along with treatments and patient outcomes,
- collate all that constantly updated data to allow the AI to identify which treatment will have the best possible outcome for a specific, unique combination of factors in a specific patient.
The goal was to take an individual patient's electronic health record, with all data for that specific patient, and analyze it with the combined knowledge of all peer-reviewed journal articles, clinical trials, and hospital patient outcomes, in order to make a clinical suggestion to the oncologist for that specific patient (mainly for complex cases with co-morbidities), linked to the evidence supporting the recommendation.
As you can imagine, we would need to stand by our recommendations, as the patient's survival is at stake.
As we tested the various image analysis algorithms we worked on, we discovered something very disturbing – faked data. Using AI, we were able to detect faked image data in articles published in several prestigious journals. Once the AI identifies it, an expert human can clearly see the fake when directed where to look, but it is done very skillfully and the journals are not picking up on it. I am happy to say that the cases we identified were unrelated to any pharma companies. They were produced by academic physicians and did not concern specific drugs, so theoretically the authors had nothing to gain. Still, it made us think on many levels.
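To give a sense of the simplest form this kind of detection can take: one classic signature of image fraud is a region copy-pasted within the same figure (duplicated western blot bands, for example). The sketch below is purely illustrative and is not our actual detection pipeline; the function name is hypothetical, and hashing fixed tiles catches only exact, pixel-identical duplication. Real forensic tools must also handle rotation, rescaling, compression, and added noise.

```python
import hashlib
from collections import defaultdict

def find_duplicate_tiles(pixels, tile=8):
    """Hash non-overlapping tile x tile blocks of a grayscale image
    (given as a list of rows of 0-255 ints) and report any block whose
    exact pixel content appears more than once -- a crude signal of a
    copy-paste edit within the image."""
    height, width = len(pixels), len(pixels[0])
    seen = defaultdict(list)  # block hash -> list of (x, y) positions
    for y in range(0, height - tile + 1, tile):
        for x in range(0, width - tile + 1, tile):
            block = bytes(pixels[y + dy][x + dx]
                          for dy in range(tile) for dx in range(tile))
            seen[hashlib.sha256(block).hexdigest()].append((x, y))
    # Keep only hashes seen at two or more positions
    return [locs for locs in seen.values() if len(locs) > 1]

# Synthetic example: a 16x16 image with distinct pixel values,
# then one 8x8 tile copied to another location to simulate tampering.
pixels = [[(y * 16 + x) % 251 for x in range(16)] for y in range(16)]
for dy in range(8):
    for dx in range(8):
        pixels[8 + dy][8 + dx] = pixels[dy][dx]

print(find_duplicate_tiles(pixels))  # flags the two matching tile positions
```

Exact-match tiling like this is the trivial baseline; the detectors we used go well beyond it, which is precisely why fakes that fool human reviewers can still be caught.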
Firstly, our AI has to be able to detect all faked data, as it is critical that what we base our recommendations on is real data.
Secondly, there is the scale of the issue and its underlying causes. In academia there is a saying: 'publish or perish'. Academics face enormous pressure to publish articles or risk losing their jobs. This may be why some are risking their reputations with such appalling behavior.
Thirdly, although these journals are peer reviewed, the fakes got through. Peer review does not guarantee fact in all cases, because these fakes are well done.
Discussing this with a pharma executive, I learned that in that company, and probably in most pharma companies, published results are not relied on until the company itself replicates them. I applaud this approach. But regular physicians, who do not have the ability to replicate studies, may be unknowingly putting their patients at risk if they rely on erroneous data.
Big data is often defined by four Vs: volume, variety, velocity and veracity. When using AI to analyze big data from medical journals, veracity, or truthfulness, becomes critical. This is, at its heart, a big data veracity challenge.
Artificial Intelligence to the rescue?
AI has the ability to identify some fake data – certainly faked images, as we discovered. But challenges remain for both humans and AI in this domain. The next challenge for AI will be taking this up a level: identifying even more kinds of faked components.