Recently, two researchers, Anish Dhir and Ciarán Lee of Babylon Health, a digital health service and remote consultation platform headquartered in London, came up with a new method for mining causal relations between sets of medical data. Borrowing ideas from quantum cryptography, the duo treat data sets as parties in a conversation and presumed causal relationships as potential eavesdroppers, in order to determine whether a relationship actually exists.
Dhir and Lee chose not to use machine learning because of the complexity of joining multiple layers of data with the same variables while uncovering as many causal relationships as possible. The model was tested on datasets with known causal relationships, such as the size and texture of breast tumors and whether they are malignant or benign. The findings were peer-reviewed and presented at the Association for the Advancement of Artificial Intelligence (AAAI) conference in New York last week.
Researchers believe tapping into causal relations will improve the performance of the company’s artificial intelligence (AI) driven diagnostic tool and medical chatbot, both of which have come under scrutiny. Beyond that, some researchers also believe causal models are a way to make AI less biased.
Determining fairness of an algorithm
According to Matt J. Kusner, Associate Professor in the Department of Computer Science at University College London and Fellow at The Alan Turing Institute, and Joshua R. Loftus, Assistant Professor in the Department of Technology, Operations and Statistics at New York University, there are at least four ways to ensure an algorithm is fair.
The more significant ones include “fairness through unawareness” and “demographic parity”. The former removes, up front, any data that seem likely to introduce bias. For example, if an algorithm is to assist human judges in making parole decisions, factors like previous offences should perhaps be considered while ethnic origin should be excluded. Yet in practice, data are rarely so cleanly separable. Historically, racial bias has influenced whether a person is found guilty, so developers may find themselves in a dilemma as to whether they should keep or discard ethnic origin as training data.
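In code, “fairness through unawareness” amounts to little more than dropping the sensitive fields before training. A minimal sketch, assuming records stored as plain dictionaries with invented key names:

```python
# "Fairness through unawareness": drop attributes deemed sensitive
# before a record reaches the model. Key names here are hypothetical.

SENSITIVE = {"ethnic_origin"}

def strip_sensitive(record):
    """Return a copy of the record without any sensitive attributes."""
    return {k: v for k, v in record.items() if k not in SENSITIVE}

print(strip_sensitive({"previous_offences": 2, "ethnic_origin": "X"}))
# {'previous_offences': 2}
```

The dilemma the authors describe is visible even here: the remaining fields (such as prior offences) may still carry a statistical trace of the removed attribute, so deleting the column does not delete the bias.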
Under the latter, an algorithm should perform equally well regardless of individuals’ demographics. For example, if a model is designed to predict the risk of deterioration for patients in an intensive care unit (ICU), it should be equally accurate for white and non-white patients. If too few non-white patients are included in the training data, the model may lack sensitivity and miss patients who are truly at risk. However, this criterion is sometimes criticized as nonsensical in certain settings: women do have a higher risk of breast cancer than men, for instance.
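Demographic parity is straightforward to measure: compare the rate of positive predictions across groups. A minimal sketch, using invented toy data rather than any real patient records:

```python
# Hedged sketch: measure the demographic-parity gap between two groups,
# i.e. the difference in positive-prediction rates. Data are toy values.

def demographic_parity_gap(predictions, groups):
    """Absolute difference in positive-prediction rates between two groups."""
    rate = {}
    for g in set(groups):
        preds = [p for p, grp in zip(predictions, groups) if grp == g]
        rate[g] = sum(preds) / len(preds)
    a, b = rate.values()
    return abs(a - b)

# Group 0 gets a positive prediction 75% of the time, group 1 only 25%.
preds  = [1, 1, 1, 0, 1, 0, 0, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
print(demographic_parity_gap(preds, groups))  # 0.5
```

A gap of zero satisfies demographic parity; as the text notes, though, a zero gap is not always the right target when the underlying base rates genuinely differ between groups.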
The use of causal models
Indeed, judging from the above, artificial intelligence (AI) is far from bias-free. As such, Kusner and Loftus suggest using causal models to assess the fairness of algorithms. As they wrote in Nature recently, a causal model allows researchers to establish cause-and-effect relations across a wealth of data. This is crucial especially in medicine, where findings are often underpinned by mere correlations.
For example, heart disease and obesity are correlated. Low vitamin D is also correlated with obesity, so are low vitamin D and heart disease somehow related too? If so, how? More independent clinical trials may be needed to settle the question. Kusner and Loftus therefore believe a causal model can reason about how sensitive an algorithm is to two seemingly independent factors and, from there, determine whether any of the observed data could be influenced by a previously unobserved factor.
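The vitamin D example can be made concrete with a toy simulation. The sketch below uses entirely invented probabilities, not clinical data: obesity is set up as a common cause of both low vitamin D and heart disease, with no direct link between the latter two, yet a naive comparison still finds them correlated.

```python
# Hedged sketch: a confounder creating correlation without causation.
# All probabilities are invented for illustration. "Obesity" drives both
# "low vitamin D" and "heart disease"; the latter two never interact.
import random

random.seed(0)
n = 100_000
obesity   = [random.random() < 0.3 for _ in range(n)]
low_vit_d = [random.random() < (0.6 if ob else 0.2) for ob in obesity]
heart     = [random.random() < (0.5 if ob else 0.1) for ob in obesity]

def rate(outcome, condition):
    """Fraction of positive outcomes among records meeting the condition."""
    sel = [o for o, c in zip(outcome, condition) if c]
    return sum(sel) / len(sel)

# Naive comparison: heart disease looks linked to low vitamin D...
print(rate(heart, low_vit_d))                   # noticeably higher...
print(rate(heart, [not v for v in low_vit_d]))  # ...than this

# ...but holding obesity fixed, vitamin D status makes no difference.
both    = [v and ob for v, ob in zip(low_vit_d, obesity)]
only_ob = [(not v) and ob for v, ob in zip(low_vit_d, obesity)]
print(rate(heart, both), rate(heart, only_ob))  # roughly equal
```

Conditioning on the confounder (obesity) makes the spurious link vanish, which is exactly the kind of reasoning a causal model automates and a purely correlational model cannot.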
Besides, a causal model allows us to ask whether a particular outcome would differ had the causal factor been different. Typically, the outcome of an unbiased algorithm will not change under such a counterfactual, and researchers can use this to flag factors that undermine fairness. Most importantly, a causal model can help avoid unexpected repercussions down the line, as researchers are able to intervene when appropriate, particularly as the algorithm ingests ever more data.
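The counterfactual test above can be sketched crudely in code. This is a naive version that simply flips the sensitive attribute and re-runs the model; a full counterfactual-fairness check would also propagate the change through the causal graph to downstream variables. The model functions and record fields are hypothetical:

```python
# Hedged sketch: a naive counterfactual check. Flip the sensitive
# attribute and see whether the prediction changes. (A rigorous check
# would intervene through the causal graph, not just flip one field.)

def prediction_changes(model, record, sensitive_key, alternative_value):
    """Return True if changing the sensitive attribute alters the output."""
    original = model(record)
    counterfactual = dict(record, **{sensitive_key: alternative_value})
    return model(counterfactual) != original

# Two toy models: one peeks at the sensitive attribute, one does not.
biased = lambda r: int(r["offences"] > 2 or r["group"] == "B")
fair   = lambda r: int(r["offences"] > 2)

record = {"offences": 1, "group": "B"}
print(prediction_changes(biased, record, "group", "A"))  # True
print(prediction_changes(fair, record, "group", "A"))    # False
```

The biased model fails the check (its output flips with the group label), while the fair model passes, which mirrors the article’s point: an unbiased algorithm’s outcome should be invariant under such counterfactual changes.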