More and more of the world’s computer systems incorporate neural networks: artificial-intelligence-driven systems that can “learn” how to do something without anyone, including their creators, understanding exactly how they’re doing it.
This has caused some concern, especially in fields where safety is critical. These “black box” systems hide their inner workings, so we don’t actually know when errors are happening – only when they manifest in the real world. And that can be in very rare situations that don’t get caught during a normal testing process.
Even when we do catch them, the inscrutable inner workings of deep learning systems mean that these errors can be hard to fix because we don’t know exactly what caused them. All we can do is give the system negative feedback and keep an eye on the problem.
But that state of affairs may be changing, because a research team from Lehigh University in Pennsylvania and Columbia University in New York City has created a testing tool for deep learning systems. The software, called DeepXplore, examines the decision logic and behaviours of a neural network to find out what it’s doing. The team describes it as a “white box” approach.
To test the system, they ran it on a range of datasets, including self-driving car data, Android and PDF malware data, and image data. They also fed it a selection of production-quality neural networks trained on those datasets – including some that have ranked highly in self-driving car challenges.
The results showed thousands of incorrect behaviours, like self-driving cars crashing into guard rails under certain circumstances. That’s the bad news. But the good news is that the systems can then use that information to automatically improve themselves, fixing the errors.
“DeepXplore is able to generate numerous inputs that lead to deep neural network misclassifications automatically and efficiently,” said Junfeng Yang, an associate professor of computer science at Columbia University, who worked on the project. “These inputs can be fed back to the training process to improve accuracy.”
Yinzhi Cao of Lehigh University added: “Our ultimate goal is to be able to test a system, like self-driving cars, and tell the creators whether it is truly safe and under what conditions.”
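For readers who want a feel for the loop Yang describes, in which inputs that expose mistakes are generated and then fed back into training, here is a toy sketch in Python. It is not DeepXplore’s code or algorithm. It assumes one simple way of flagging suspect inputs without hand labelling: nudge an input until two models trained for the same task stop agreeing on it. The two “models”, their thresholds and the function name `find_disagreements` are all invented for illustration.

```python
def model_a(x):
    """Toy stand-in for a trained network: label 1 if x > 0.50."""
    return 1 if x > 0.50 else 0


def model_b(x):
    """A second toy model with a slightly different decision boundary."""
    return 1 if x > 0.55 else 0


def find_disagreements(models, seeds, step=0.01, max_steps=100):
    """Nudge each seed input upwards until the models disagree on its label.

    Returns the first disagreeing input found for each seed (if any).
    """
    found = []
    for seed in seeds:
        x = seed
        for _ in range(max_steps):
            labels = {m(x) for m in models}
            if len(labels) > 1:  # the models disagree, so at least one is wrong
                found.append(x)
                break
            x += step
    return found


# Hypothetical seed inputs; a real system would start from real test data.
seeds = [0.10, 0.30, 0.48, 0.90]
suspicious = find_disagreements([model_a, model_b], seeds)
print(suspicious)
```

In a real workflow, the inputs collected in `suspicious` would be given correct labels and added to the training set, which is the “fed back to the training process” step in the quote above. Actual neural networks take images or sensor data rather than a single number, and the search for troublesome inputs is far more sophisticated.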
The full details of the testing approach have been published in a peer-reviewed paper that will be presented at SOSP 2017, and the software itself has been released on GitHub.