I am stuck with data. What data should I use?
- This topic has 4 replies, 3 voices, and was last updated 3 years, 11 months ago by Dr Andree Bates.
-
October 30, 2020 at 10:07 am #5147
I think AI could help solve a challenge I have, but I am confused about what data I could use. It is a rare disease, so data is limited. There are only 30 patients on the drug, and while we have data on them, how do I get more data and more insight into why the drug has so few patients when a competitor drug that is not vastly different has more?
Help.
-
October 30, 2020 at 10:27 am #5148
Good question. Rare diseases are definitely more challenging for big data but it is not necessarily impossible.
So, there are a few options. It depends on the disease area and country and what you are trying to figure out. A few avenues to consider are:
1. Patient record data. Could you compare data on the patients on your drug with data on patients on the other drugs? You may also be able to get data on previous patients: if the condition is terminal and some patients who were on the drug have since died, you could still combine their data with the data from living patients. It is still unlikely to amount to many records. That would be small data rather than something for AI, but you may spot a possible reason why the docs are putting only a small number of patients on the product. Are there more patients in a different country? Could you pool all the global patient data? That has its own dangers, though, such as different clinical guidelines and different approaches. Another member and I were discussing a similar issue, and as he is in Japan I suggested the MDV data, which has very good longitudinal patient record data.
2. Patient forums and associations. Are there groups in this space whose discussions you could look at? There may be some clues in that data. I have found groups in a few rare disease areas.
3. Your call center data – both the physician one and the patient one. Take those recordings and put them through a language analysis as there could be clues in there also.
4. Social media data. You can filter this around the condition.
5. Are there any nurse educators who see the patients? If you take transcripts of their conversations and analyze those as well, you may find something.
6. What about CRM data? Is it decent in terms of accuracy and completeness? If so, that may be useful also.
7. And of course, although it is not AI big data, market research data is useful here. And, if you have a lot of it, a tool like the Relative Insight one may provide new insights.
8. GAN? The member in Japan with whom I was discussing the MDV data and I were also brainstorming about GANs (Generative Adversarial Networks). You will have seen examples of these in the trainings. They are used a lot to make fake videos of people, and I showed the Obama video in the training because it shows how realistic they are. They take elements from the real data and create fake data that is so realistic it is almost impossible to distinguish from the real data. So I was thinking (and I have no idea how good this would be) that if you took rare disease patient data and applied a GAN to it, you would generate a lot more data, but the new fake patient data would contain only elements of the real patients' data. So, theoretically, it may be an interesting way of generating more data from rare disease patient data. I will ask one of our data scientists about this to see if I am fantasizing or if this actually could work.
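To make the GAN idea in point 8 a little more concrete, here is a minimal sketch in plain Python. It is only a toy under stated assumptions: one standardized numeric feature per patient, a linear generator, a very small discriminator, and random placeholder values standing in for the 30 patients. The names `train_toy_gan` and `sample_synthetic` are hypothetical, and this is not a pipeline for real patient records.

```python
import math
import random


def sigmoid(s):
    # Clamp the logit so math.exp cannot overflow.
    s = max(-30.0, min(30.0, s))
    return 1.0 / (1.0 + math.exp(-s))


def train_toy_gan(real, steps=2000, lr=0.01, seed=0):
    """Train a one-feature toy GAN; return generator parameters (a, b)."""
    rng = random.Random(seed)
    a, b = 0.1, 0.0            # generator: x_fake = a * z + b, z ~ N(0, 1)
    w1, w2, c = 0.0, 0.0, 0.0  # discriminator: D(x) = sigmoid(w1*x + w2*x*x + c)
    n = len(real)
    for _ in range(steps):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        fake = [a * zi + b for zi in z]

        # Discriminator step: push D towards 1 on real values, 0 on fake ones.
        g1 = g2 = gc = 0.0
        for x in real:
            err = sigmoid(w1 * x + w2 * x * x + c) - 1.0  # grad of -log D
            g1 += err * x
            g2 += err * x * x
            gc += err
        for x in fake:
            err = sigmoid(w1 * x + w2 * x * x + c)        # grad of -log(1 - D)
            g1 += err * x
            g2 += err * x * x
            gc += err
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        c -= lr * gc / n

        # Generator step: non-saturating loss -log D(fake).
        ga = gb = 0.0
        for zi, x in zip(z, fake):
            d = sigmoid(w1 * x + w2 * x * x + c)
            dx = -(1.0 - d) * (w1 + 2.0 * w2 * x)  # d(loss)/d(x_fake)
            ga += dx * zi
            gb += dx
        a -= lr * ga / n
        b -= lr * gb / n
    return a, b


def sample_synthetic(a, b, k, seed=1):
    """Draw k synthetic values from the trained generator."""
    rng = random.Random(seed)
    return [a * rng.gauss(0.0, 1.0) + b for _ in range(k)]


if __name__ == "__main__":
    # Placeholder data: 30 random values, NOT real patient measurements.
    placeholder = random.Random(42)
    real_values = [placeholder.gauss(0.0, 1.0) for _ in range(30)]
    a, b = train_toy_gan(real_values)
    print(sample_synthetic(a, b, 5))
```

The generator only ever sees the real values through the discriminator's feedback, so whatever it produces is built from patterns in those 30 records. Whether that is useful for a real rare disease dataset is exactly the open question to put to a data scientist.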
-
October 30, 2020 at 10:30 am #5149
Look what I found. GANs have been used to detect rare diseases.
https://arxiv.org/abs/1907.01022
and a longer paper on it is here
Food for thought.
-
December 10, 2020 at 2:01 am #5697 Anonymous
Very interesting. Thanks for the link to the paper.
-
January 20, 2021 at 6:26 pm #6015
You are most welcome. 🙂