A chatbot with clinical intelligence

A total of 149 simulated consultations were conducted to evaluate Google's Amie.
In terms of diagnostic accuracy and empathy, the language model performed comparably to, or even better than, real medical professionals.
According to research reported by the scientific journal Nature, an artificial intelligence system trained to conduct medical consultations was able to match, and even outperform, human physicians when conversing with simulated patients. Given the clinical history provided, the chatbot was even able to produce a list of potential diagnoses.
Built on an LLM (large language model) developed by Google, the Articulate Medical Intelligence Explorer (Amie) chatbot diagnosed respiratory and cardiovascular diseases, among other pathologies, more precisely than licensed primary care physicians.
In the experiment, which spanned six medical specialties, the quality of the conversation was assessed against twenty-six criteria, including the ability to explain the condition and its treatment, as well as empathy, courtesy, honesty, and the demonstration of interest and commitment. The artificial intelligence matched or surpassed the physicians on 24 of these criteria, while extracting as much information from text-based conversations with patients as the human doctors did.
Alan Karthikesalingam, a clinical research scientist at Google Health in London and a co-author of the study, cautions that “this does not mean, at all, that a linguistic model is better than doctors when it comes to preparing a clinical history.” He adds: “The participating primary care physicians were likely not accustomed to interacting with patients via text-based chat, and this may have affected their performance.”
Toward the democratization of healthcare?
Amie, developed by Google, is still in its experimental phase. It has not been tested with patients experiencing actual health issues; rather, it has been tested with people trained to play that role.
“We want the results to be interpreted with caution and humility,” Karthikesalingam advised. “To the best of our knowledge, this is the first time a conversational artificial intelligence system has been optimized for diagnostic dialogue and the collection of clinical histories.”
Adam Rodman, an internist at Harvard Medical School in Boston, United States, believes that although Amie is a helpful tool, it should not be used in place of face-to-face consultations with medical practitioners.
In an interview with Nature, he made the observation that “medicine is about human relationships; it is about much more than just compiling information.”
The research was posted on January 11th on arXiv, the repository for scientific manuscripts that have not yet undergone peer review. Its authors hint that, in the long run, the chatbot could contribute to the “democratization of healthcare.”
According to Vivek Natarajan, an artificial intelligence researcher at Google Health in Mountain View, California, one of the issues they encountered was the lack of recordings of actual consultations that could be used as training data for the program.
To make up for this shortcoming, they devised an algorithm that allowed the chatbot to train itself on its own “conversations.” The algorithm drew on electronic medical records and transcribed consultations, with the model playing the role of a patient, of an “empathetic” doctor, and even of a critic assessing interactions of this kind.
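The loop described above — role-played dialogues filtered by a critic — can be sketched in a few lines. Everything below is an illustrative assumption (stub functions, a toy scoring rule, invented names), not Google's actual implementation:

```python
# Hypothetical sketch of a self-play training loop like the one the
# article describes. All names and the scoring heuristic are assumptions.

def patient_turn(scenario: dict) -> str:
    # Stub: role-play a patient reply seeded from a clinical scenario.
    return f"Patient: I have been experiencing {scenario['symptom']}."

def doctor_turn() -> str:
    # Stub: role-play an "empathetic" doctor's follow-up question.
    return "Doctor: I'm sorry to hear that. When did it start?"

def critic_score(dialogue: list) -> int:
    # Stub critic: a real one would rate diagnostic quality and empathy;
    # here it simply counts non-empty turns.
    return sum(1 for turn in dialogue if turn.strip())

def self_play(scenario: dict, n_exchanges: int = 3, threshold: int = 4):
    """Generate one simulated consultation; keep it as training data
    only if the critic rates it above the threshold."""
    dialogue = []
    for _ in range(n_exchanges):
        dialogue.append(patient_turn(scenario))
        dialogue.append(doctor_turn())
    score = critic_score(dialogue)
    return (dialogue if score >= threshold else None), score

dialogue, score = self_play({"symptom": "chest pain"})
```

In a real system the critic would itself be a language model scoring each generated consultation; it is reduced here to a trivial check so that the structure of the loop stays visible.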
Privacy
The researchers recruited twenty “simulated patients,” trained in advance for the task, to hold text-based online consultations with Amie and with twenty licensed physicians, without telling them whether their interlocutor was a human or a bot. After acting out 149 different clinical scenarios, the participants were asked to rate their experience.
The Google team therefore intends to conduct further, more in-depth research to identify potential biases and to ensure the system behaves fairly across different demographic groups. It is also studying the ethical requirements for testing it with people who have real clinical conditions.
Daniel Ting, a clinical scientist specializing in artificial intelligence at the Duke-NUS School of Medicine in Singapore, welcomed the idea of checking the system for bias, which he believes would help ensure the algorithm does not penalize ethnic groups underrepresented in earlier datasets.
According to Ting, another essential factor that must be taken into consideration is the privacy of the users.
“For many of these large commercial language-model platforms, we are still unsure where the data is stored and how it is analyzed,” he noted.