Security researchers had devised a way to attack a Proofpoint product that uses machine learning to identify spam emails. The system produced email headers that included a “score” of how likely a message was to be spam. But analyzing these scores, along with the contents of messages, made it possible to build a clone of the machine-learning model and craft spam messages that evaded detection.
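To make the cloning idea concrete, here is a minimal Python sketch. It is not Proofpoint's system or the researchers' actual method. The hidden word weights, the `victim_score` function, and the tiny vocabulary are all invented for illustration, and the sketch assumes the filter is a simple linear model that leaks an exact probability.

```python
import math

# Hypothetical victim: a hidden linear spam model that leaks its score.
# These weights are invented; the attacker never reads them directly.
_HIDDEN_WEIGHTS = {"free": 2.0, "winner": 2.5, "invoice": -2.0, "meeting": -2.5}
_HIDDEN_BIAS = -1.0


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def victim_score(message):
    """Spam probability that the filter would expose in an email header."""
    words = set(message.lower().split())
    return _sigmoid(_HIDDEN_BIAS + sum(_HIDDEN_WEIGHTS.get(w, 0.0) for w in words))


def logit(p):
    """Inverse of the sigmoid: turns a leaked probability back into a raw score."""
    return math.log(p / (1.0 - p))


def extract_clone(vocabulary):
    """Recover the bias and per-word weights using only leaked scores."""
    bias = logit(victim_score(""))  # an empty message reveals the bias
    weights = {w: logit(victim_score(w)) - bias for w in vocabulary}
    return bias, weights


def clone_score(message, bias, weights):
    """Score a message offline with the cloned model (no victim queries)."""
    words = set(message.lower().split())
    return _sigmoid(bias + sum(weights.get(w, 0.0) for w in words))


def craft_evasion(message, bias, weights, threshold=0.5):
    """Append words the clone rates as benign until the clone says 'not spam'."""
    words = message.lower().split()
    benign = sorted(
        (w for w in weights if weights[w] < -1e-6 and w not in words),
        key=weights.get,
    )
    for w in benign:
        if clone_score(" ".join(words), bias, weights) < threshold:
            break
        words.append(w)
    return " ".join(words)


vocab = ["free", "winner", "invoice", "meeting", "hello"]
bias, weights = extract_clone(vocab)
spam = "free winner"
evasive = craft_evasion(spam, bias, weights)
print(round(victim_score(spam), 3), round(victim_score(evasive), 3))
```

In this toy setup the attacker spends one query per vocabulary word to build the clone, then tunes the evasive message entirely offline. A real filter would be far less tidy, but the sketch shows why an exposed score is so useful to an attacker.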
The vulnerability notice may be the first of many. As AI is used more widely, new opportunities for exploiting weak spots in the technology are also emerging. That’s given rise to companies that probe AI systems for vulnerabilities, with the goal of catching malicious input before it can wreak havoc.
Startup Robust Intelligence is one such company. Over Zoom, Yaron Singer, its cofounder and CEO, demonstrates a program that uses AI to outwit the AI that reads checks, an early application for modern machine learning.
Singer’s program automatically tweaks the intensity of a few pixels that make up the numbers and letters written on the check. This alters what a widely used commercial check-scanning algorithm perceives. A scammer equipped with such a tool could empty a target’s bank account by modifying a legitimate check to add several zeros before depositing it.
“In a lot of applications, very, very small changes can lead to dramatically different results,” says Singer, a professor at Harvard who is running his company while on sabbatical in San Francisco. “But the problem runs deeper; it’s just the very nature of how we perform machine learning.”
Robust Intelligence’s tech is being used by companies including PayPal and NTT Data, as well as a large ride-share company; Singer says he can’t describe exactly how it is being used, for fear of tipping off would-be adversaries.
The company sells two tools: one that can be used to probe an AI algorithm for weaknesses and another that automatically intercepts potentially problematic inputs—a kind of AI firewall. The probing tool can run an algorithm many times, examining the inputs and outputs and seeking ways to trick it.
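The article gives no detail on how Robust Intelligence's firewall works, so the following Python sketch only illustrates the general idea of screening inputs before they reach a model. The `InputFirewall` class, the range check, and the toy model are assumptions made up for this example, not the company's design.

```python
class InputFirewall:
    """Blocks inputs that fall outside the ranges seen in trusted training data."""

    def __init__(self, model, training_rows, margin=0.1):
        self.model = model
        self.bounds = []
        # Record a slightly widened [min, max] range for every feature column.
        for column in zip(*training_rows):
            lo, hi = min(column), max(column)
            pad = (hi - lo) * margin
            self.bounds.append((lo - pad, hi + pad))

    def is_suspicious(self, row):
        """True if the row has the wrong shape or any out-of-range value."""
        if len(row) != len(self.bounds):
            return True
        # NaN fails both comparisons, so it is flagged as well.
        return any(not (lo <= v <= hi) for v, (lo, hi) in zip(row, self.bounds))

    def predict(self, row):
        """Return the model's output, or None if the input was intercepted."""
        if self.is_suspicious(row):
            return None
        return self.model(row)


def toy_model(row):
    # Stand-in for a real model: just averages the features.
    return sum(row) / len(row)


training_rows = [[0.1, 0.5], [0.4, 0.9], [0.3, 0.7]]
firewall = InputFirewall(toy_model, training_rows)

print(firewall.predict([0.2, 0.6]))  # in range: passed through to the model
print(firewall.predict([5.0, 0.6]))  # far out of range: intercepted
```

A range check like this would miss the subtle, in-range perturbations described elsewhere in this piece; a production system would presumably need much richer checks.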
Such threats are not just theoretical. Researchers have shown how adversarial algorithms can trick real-world AI systems, including autonomous driving systems, text-mining programs, and computer vision code. In one oft-mentioned case, a group of MIT students 3D-printed a turtle that Google software recognized as a rifle, thanks to subtle markings on its surface.
“If you’re developing machine-learning models right now, then you really have no way to do some kind of red teaming, or penetration testing, for your machine-learning models,” Singer says.
Singer’s research focuses on perturbing the input of a machine-learning system to make it misbehave and designing systems to be safe in the first place. Tricking AI systems relies on the fact that they learn from examples and pick up subtle changes in ways that humans do not. By trying multiple carefully chosen inputs—for example, showing altered faces to a face-recognition system—and seeing how the system responds, an “adversarial” algorithm can infer what tweaks to make in order to produce an error or a particular result.
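The query-and-observe loop described above can be sketched in a few lines of Python. This is a toy illustration rather than Singer's algorithm: the eight-pixel "image", the hidden linear classifier in `victim_confidence`, and the step size `eps` are all invented, and the attacker is assumed to see only a confidence score for each query.

```python
import math

# Hypothetical black-box classifier. The attacker can call it but cannot
# read these invented weights.
_W = [0.9, -0.4, 0.7, -0.2, 0.5, -0.6, 0.3, -0.1]
_B = -0.5


def victim_confidence(pixels):
    """Confidence (0 to 1) that the input belongs to the target class."""
    z = _B + sum(w * p for w, p in zip(_W, pixels))
    return 1.0 / (1.0 + math.exp(-z))


def black_box_attack(query, pixels, eps=0.1):
    """Nudge one pixel at a time by at most eps, keeping changes that lower the score."""
    adv = list(pixels)
    best = query(adv)
    queries = 1
    for i in range(len(adv)):
        if best < 0.5:  # the label has flipped, so stop early
            break
        for delta in (-eps, eps):
            candidate = list(adv)
            # Stay within eps of the original value and within valid pixel range.
            candidate[i] = min(1.0, max(0.0, pixels[i] + delta))
            score = query(candidate)
            queries += 1
            if score < best:
                best, adv = score, candidate
    return adv, best, queries


original = [0.6, 0.4, 0.5, 0.5, 0.6, 0.4, 0.5, 0.5]
adversarial, confidence, queries = black_box_attack(victim_confidence, original)
print(victim_confidence(original) >= 0.5, confidence < 0.5, queries)
```

Each pixel moves by no more than `eps`, yet the combined effect is enough to push the toy classifier over its decision boundary, which mirrors the "very small changes, dramatically different results" behavior Singer describes.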
Along with the check-fooling system, Singer demonstrates a way of outwitting an online fraud-detection system as part of probing for weaknesses. This fraud system looks for signs that someone making a transaction is actually a bot, based on a wide range of characteristics, including the browser, the operating system, the IP address, and the time.
Singer also shows how his company’s tech can deceive commercial image-recognition and face-recognition systems with subtle tweaks to a photo. The face-recognition system concludes that a subtly doctored photo of Benjamin Netanyahu actually shows basketball player Julius Barnes. Singer gives the same pitch to prospective customers worried about how their newfangled AI systems could be subverted, and what that might do to their reputation.
Some big companies that use AI are starting to develop their own AI defenses. Facebook, for instance, has a “red team” that tries to hack its AI systems to identify weak spots.