Unlocking Trust in AI: New Research Improves Fairness, Safety, and Reliability of AI Models
A groundbreaking research project at the Deakin Applied Artificial Intelligence Initiative has delivered new tools and insights to help make artificial intelligence (AI) systems safer, fairer, and more reliable, especially in high-stakes, real-world applications.

Funded by the Australian Research Council Discovery Projects program, the project focused on uncovering and correcting hidden vulnerabilities in AI and machine learning (ML) models and algorithms.
The team tackled key issues like algorithmic bias, security threats, fairness in decision-making, and the ability to explain and assure AI behaviour.
“We wanted to understand when AI systems fail – and more importantly, how to fix them,” said the project's chief investigator, Sunil Gupta. “Our work shows it is possible to systematically test and improve these AI algorithms, even when they are ‘black boxes’.”
Highlights of the Project:
- Fairer AI: New techniques were developed to adjust biased model predictions, boosting fairness without sacrificing accuracy (a simple illustration of this idea follows this list).
- Detecting Compromise: Researchers created tools to detect whether models have been tampered with, a major concern once AI is deployed in the real world.
- Trojan Attack Defence: Cutting-edge defences can now detect and neutralise hidden “Trojan” triggers planted in neural networks.
- Model Improvement via Distillation: Novel knowledge distillation techniques were developed to turn existing models into improved versions of themselves.
- Human-AI Collaboration: The project explored new ways for AI systems to work with human experts, ensuring smarter and more accountable decision-making.
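To give a concrete, if simplified, sense of what adjusting biased model predictions can look like, the sketch below applies one common post-processing idea: choosing group-specific decision thresholds so that positive prediction rates are comparable across groups (demographic parity). It is an illustrative Python example with hypothetical data and function names, not the project's actual method or released code.

```python
# Illustrative sketch only: post-hoc adjustment of a classifier's decision
# thresholds so positive prediction rates are similar across groups
# (demographic parity). Data and names are hypothetical.
import numpy as np

def group_thresholds(scores, groups, target_rate):
    """Pick a per-group score threshold so each group's positive rate
    roughly matches the target rate."""
    thresholds = {}
    for g in np.unique(groups):
        s = scores[groups == g]
        # The (1 - target_rate) quantile leaves about target_rate of this
        # group's scores above the threshold.
        thresholds[g] = np.quantile(s, 1.0 - target_rate)
    return thresholds

def fair_predict(scores, groups, thresholds):
    """Apply the group-specific thresholds to produce adjusted decisions."""
    return np.array([int(s >= thresholds[g]) for s, g in zip(scores, groups)])

# Toy usage with synthetic scores for two groups, A and B.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.beta(2, 5, 500), rng.beta(5, 2, 500)])
groups = np.array(["A"] * 500 + ["B"] * 500)

thr = group_thresholds(scores, groups, target_rate=0.3)
decisions = fair_predict(scores, groups, thr)

for g in ("A", "B"):
    print(f"group {g}: positive rate = {decisions[groups == g].mean():.2f}")
```

Real systems typically target stronger fairness criteria (such as equalised odds) and manage the accuracy-fairness trade-off more carefully than this toy thresholding step.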
As AI becomes more deeply embedded in our lives – from healthcare to justice systems – it is essential that we can trust these systems to be fair, secure, and robust. When these systems do not perform as expected, the fallout can be swift and damaging.
The rollout of IBM’s Watson for oncology care in the USA failed to meet expectations: the AI, trained under controlled conditions, could not handle the messy, unstructured data and complex workflows of real-world healthcare, resulting in unreliable recommendations and poor clinician adoption. This high-profile example highlighted the need for reliable, evidence-based AI in clinical settings.
Concerns about fairness and transparency in AI-driven judicial decisions in the USA were similarly raised when COMPAS, a risk assessment tool used to predict reoffending, was found to exhibit racial bias.
Deakin’s research lays the foundation for AI that not only performs well but behaves responsibly and transparently.
Chief investigator and Deakin Distinguished Professor Svetha Venkatesh said, “Our goal was to push the boundaries of AI not just in terms of performance, but in responsibility. By advancing fairness, security, and collaboration, we are building AI systems that people can trust – and that work with us, not just for us.”
The research team has released several state-of-the-art tools and methods, along with open-source code, to help developers, policymakers, and companies apply these advances in practice.