Video Injection Attacks: What Are They and Are We Ignoring the Simple Solution?
In March of 2022, a paper presented at the 13th International Multi-Conference on Complexity, Informatics, and Cybernetics described “Video Injection Attacks on Remote Digital Identity Verification Solution Using Face Recognition.” The paper outlined attacks on biometric systems that were completed with relative ease, leveraging rich data sources culled from video conference streams and social media. Understandably, the paper caused a stir in the digital identity and biometric industry. Multiple companies responded in the press, each proposing technical capabilities aimed at resolving the threat. This is yet another example of the escalation and growing maturity of systematic attacks on platforms designed to prove one’s identity, with the right level of authentication, when accessing a digital service. Like all facets of cybersecurity, digital identity systems - specifically video data acquisition for biometric matching and document verification - appear to be subject to an ever-escalating volley of attacks. This forces “the good guys” - system implementers and integrators - to react to an ever-changing landscape of threats. In many instances, the remedies involve increasing friction and technological requirements, both of which place an undue burden on the consumer/user.
Further, it’s only when researchers uncover potential attack vectors that the “good guys” have an opportunity to be proactive about technology innovation to help stave off the barrage of attacks. The stakes have never been higher: the more we transact online, the greater the monetary value of successful attacks. During the pandemic, hundreds of billions of dollars were fraudulently obtained using stolen or fake identities.
“Based on the findings of the report, my conclusion was that this idea was not a practical deterrent for reasons which at this moment must be all too obvious.”- Dr. Strangelove
But what if there was a better way?
What if you could be assured that you were interacting with the right person when they attempt to remotely prove their identity, without relying on friction-filled user experiences and technology advancements that raise the cost of end-user devices and service implementations? Those costs ultimately impose a barrier to entry into critical systems for the most underserved, who are the least able to afford expensive phones and computers. I believe there is a better way, and that the current arms race is misguided because it focuses only on pieces of the puzzle and not on the entirety of the customer experience.
Video Killed the Identity Star: What exactly is a video injection attack?
What is a video injection attack, anyway? In the context of an identity verification system and for the purposes of this article, a video injection attack attempts to insert fraudulent data streams between the capture device, referred to as the sensor, and the biometric feature extractor during identity verification in an attempt to establish a fraudulent identity. Given the nature of most identity verification systems, the biometric being collected is one or more frames of video of a face image that is compared against the face on an identity document. Any biometric capture system worth its salt employs presentation attack detection and prevention methods. Those range from challenge-response (blink your eyes, turn your head, track movement across the screen) to machine learning-based deep fake video detection.
“If you can’t explain it to a six-year-old, you don’t understand it yourself.” - Albert Einstein
This translates to a fraudster hacking an app on a phone and submitting their video to the identity verification website. Systems deploy techniques that attempt to make the injection of those videos difficult or impossible. As the complexity of the attacks increases, so do the prevention techniques.
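The challenge-response idea mentioned above can be sketched in a few lines. This is only an illustration, not any vendor's implementation: the prompt names, the `issue_challenge` and `responses_match` helpers, and the nonce format are all assumptions made for the example. The point is that the server picks an unpredictable prompt order per session, so a pre-recorded video cannot know in advance what to do.

```python
import secrets

# Hypothetical prompt names; a real system would define its own set.
PROMPTS = ["blink", "turn_head_left", "turn_head_right", "follow_dot"]


def issue_challenge(num_prompts: int = 3) -> dict:
    """Return a session nonce plus an unpredictable, non-repeating prompt order."""
    if not 1 <= num_prompts <= len(PROMPTS):
        raise ValueError("num_prompts must be between 1 and len(PROMPTS)")
    rng = secrets.SystemRandom()  # OS-backed randomness, not a seeded PRNG
    return {
        "nonce": secrets.token_hex(16),  # ties the response to this session
        "prompts": rng.sample(PROMPTS, num_prompts),
    }


def responses_match(challenge: dict, observed: list[str]) -> bool:
    """True only if the observed actions follow the issued prompts in order."""
    return observed == challenge["prompts"]
```

A check like this raises the bar against replayed footage, but it says nothing about whether the frames came from a real camera, which is exactly the gap an injected, real-time face swap exploits.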
There’s Identity Gold in Them There Hills, erm, Internet Sites
Sounds complicated, right? As the authors of the video injection attack paper point out, this turns out to be a relatively simple affair. We live in a world where our online media footprint has permanence. We use video conferencing systems, we post to TikTok, and we have photos littering the social media sphere. It turns out that our media littering the internet is a fraudster’s dream. From samples culled from a variety of sources, attackers can create quality fakes - and what’s especially troubling - easily employ face morphing techniques to seamlessly “overlay” the victim’s face onto their own.
“It's like looking in a mirror. Only... not.” - Castor Troy (Nicolas Cage), Face/Off
This isn’t a pre-canned video - which is difficult to employ in real-time to respond to “random” prompts. This is the attacker “skinned” with the victim's features, with the attacker responding to the prompts in real-time. And to make matters worse, the quality of the video increases with the increase in sample data size. Your video on TikTok? Great! Your video on Snapchat? Awesome. Your public photos on Instagram and Facebook? Even better.
Give Me a Lever Long Enough and a Fulcrum on Which to Place It, and I Shall Hack an App
Ok, that’s not exactly what Archimedes said. But the gist is there. The single largest challenge these systems face is that the user’s access point - their mobile device, their laptop, their home computer - is insecure. Given resources and time - the lever and the fulcrum - an attacker will hack an application running on a local device. With a hacked app in hand, the attacker can inject the video directly into the mobile application, convincing the back-end biometric liveness detection and feature extraction that it is coming from a trusted device. It’s surprisingly easy, and I encourage you to review the work that was performed by the researchers.
The Hour Has Arrived to Abandon Theories and Go Directly to What is Practical - Samael Aun Weor
But I’ll be honest: to me, this really should be purely academic. If digital identity systems considered the entirety of the user journey and addressed fraud holistically, this paper would only be demonstrating attacks on a subset of that system. And those attacks would be mitigated by alternative methods that have nothing to do with the subject of the attack (video injection).
What is this heresy of which you speak? We can’t let attackers submit video samples trying to spoof biometric systems, can we? What’s next…
“Dogs and cats living together??!?... MASS HYSTERIA!” - Dr. Peter Venkman (Bill Murray), Ghostbusters
The core issue here is about trust - specifically the trust in the device. If I let an attacker have unchecked access to my app on their device, they will successfully break in and start submitting fraudulent claims of victims’ identities from their device. The authors of the paper suggest meeting this escalating challenge with force - an ever-widening spiral of technology deployed to combat these threats. The unfortunate part is that this typically means increased costs and friction for users. New devices and smarter capture equipment all sound expensive. Smart cards, flashing strobe lights in-app, increased complexity of user participation - well that’s just added friction for the user. These remedies are missing the boat. Typically, if a user is engaging in a digital identity verification journey, they will be doing so from their device. It would be highly unusual for a user to engage from someone else’s device. So really, the digital identity solutions should be asking the questions: “If I am trying to establish Tim’s identity, shouldn’t Tim be in possession of the device, shouldn’t that device be trustworthy, and finally, shouldn’t Tim own that device?” If they could affirmatively answer these questions, then they would render these video injection attacks obsolete and the methods meant to prevent them would become an unnecessary expense.
It’s Not the Destination, It’s the Journey
Don’t get me wrong: I’m not diminishing the value of a biometric system and liveness detection techniques. If you have a platform that relies on a biometric sample for access control or some other use case beyond identity verification, then employing liveness detection and advanced video analysis makes sense. But you need to consider the journey. Ultimately you are onboarding an identity. As part of that, there are some assumptions you can make about capture methods. Additionally, you are gathering identity information. And it turns out that these little bits of information can go a long way in ensuring that you are interacting with the correct person and not a fraudulent actor. So far, in fact, that they virtually eliminate fraud from the acquisition process.
How does this work in practice?
The key is establishing the trust in the device. The authors of the video injection attack document discuss the necessity to ensure the security of the digital channel. In fact, they suggest:
“Since the smartphone or computer is under the attacker’s control, and because any type of security (code obfuscation, anti-rooting detection, etc.) can be bypassed by the attacker, it is absolutely necessary that the countermeasures, faced with the new vulnerability of image injection, must be placed on the server side in order to ensure greater robustness. Additionally, the client-side (i.e. the smartphone) could be complemented by the implementation of so-called mutual authentication and secure channel protocol (as it exists today in smartcard-based security solutions) between the camera and the Android operating system. In our next work, we will focus in particular our researches [sic] for countermeasures to overcome this new breach facing face recognition.”
My objection is that this approach is expensive and complex, and that it depends on improvements to both hardware and software. As we’ve demonstrated at Prove, there is an easier way. With our Pinnacle platform, we observe that:
- Through authentication, we can establish that the correct individual is in possession of their device. Prove employs passive and active authentication mechanisms, reducing friction when compared to other authentication methods such as document authentication and face recognition. Crucially, passive methods provide less of an attack surface for fraud attempts. This authentication establishes a cryptographic certainty of possession of the device.
- Coupling authentication with a Prove Trust Score® allows Pinnacle to determine whether the right person is using the device. Trust Score leverages many different real-time data signals to calculate this trustworthiness. Pinnacle looks for real-time SIM changes, malware, disconnects, account changes, and other anomalous behavior that could make the device less trustworthy.
- With this information in hand, Pinnacle evaluates the strength of the association of the phone number to the individual being onboarded. Using a variety of different sources of data as well as a wealth of anonymized transactional data, Pinnacle correctly attributes ownership of the phone number to the individual and assigns an assurance level to that phone number indicating the confidence of ownership.
- Finally, if the user is a participant of a larger ecosystem, it’s probable that they have created a reusable identity and are leveraging technology like FIDO2/passkey credentials as a cryptographic analog for the mobile phone. This allows additional passive certificate verifications to occur, further ensuring the trustworthiness of the device.
When taken as a whole, Pinnacle uses cryptographic certainty (SIM) and machine learning to establish that the device is trustworthy. It’s not necessary to deploy costly countermeasures that are meant to address only a small part of the user journey. In fact, one could argue that:
- If I know the device belongs to Tim and he is indeed using this device…
- And I’m required to capture an image of Tim’s face…
- It’s not necessary to deploy complex face liveness detection.
I can argue this because we have established that a trusted and expected device is interacting with the platform as part of my identity verification journey, and therefore I can relax my requirements for face capture.
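To make the argument concrete, here is a minimal sketch of the decision it implies. It is not Pinnacle's API or scoring model: the `DeviceSignals` fields, the score scale, the thresholds, and the two outcome labels are all hypothetical, chosen only to show how device possession, device trust, and ownership confidence could gate how much liveness checking a face capture needs.

```python
from dataclasses import dataclass


@dataclass
class DeviceSignals:
    possession_verified: bool    # authentication showed the user holds the device
    trust_score: int             # hypothetical 0-1000 scale, higher is better
    ownership_confidence: float  # hypothetical 0.0-1.0 phone-number-to-person match


def capture_requirement(signals: DeviceSignals,
                        min_trust: int = 700,
                        min_ownership: float = 0.9) -> str:
    """Pick a face-capture policy from device trust signals.

    The thresholds are illustrative assumptions, not values from the article.
    """
    trusted = (
        signals.possession_verified
        and signals.trust_score >= min_trust
        and signals.ownership_confidence >= min_ownership
    )
    # A trusted, expected device can use a lighter capture step; anything
    # else falls back to the stricter path.
    return "basic_capture" if trusted else "full_liveness"
```

Every signal has to pass before the requirement is relaxed, so a failure in any one of them falls back to the stricter path instead of silently lowering the bar.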
This isn’t just my opinion: our customers are seeing higher completion and approval rates, coupled with minimal fraud. We are seeing 92% approval rates with an extraordinarily low fraud rate of 3.3 basis points (0.033%). That’s amazing.
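For readers less used to basis points, the arithmetic behind the 3.3 bps figure is simple: one basis point is one ten-thousandth. The short sketch below converts the rate and estimates a case count. It assumes the rate is measured per completed verification, which the figures above do not specify, and the helper names and the 100,000 volume are illustrative.

```python
BPS_PER_UNIT = 10_000  # one basis point is 1/10,000 (0.01%)


def bps_to_fraction(bps: float) -> float:
    """Convert basis points to a plain fraction (3.3 bps is about 0.00033)."""
    return bps / BPS_PER_UNIT


def expected_fraud_cases(bps: float, verifications: int) -> float:
    """Expected fraudulent cases for a given volume (volume is an assumption)."""
    return bps_to_fraction(bps) * verifications


if __name__ == "__main__":
    # Illustrative volume only; prints an estimate of about 33 cases.
    print(round(expected_fraud_cases(3.3, 100_000), 1))
```

Put another way, a 3.3 bps rate works out to roughly 33 fraudulent outcomes for every 100,000 verifications under that assumption.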
Never Confuse a Single Defeat with a Final Defeat - F. Scott Fitzgerald
“A common mistake that people make when trying to design something completely foolproof is to underestimate the ingenuity of complete fools.” - Douglas Adams, Hitchhiker’s Guide to the Galaxy
I’m not naive enough to claim final victory. As sure as the sun rises and sets, fraudsters will forever be escalating the war on our infrastructure, searching for ways to exploit systems when the value is high enough. However, when onboarding a person during identity verification, deploying Pinnacle ensures that you are interacting with the correct device for the person being onboarded and that the device is being used by the right person. With that established, you have effectively eliminated the video attack vector by providing a trusted capture pathway. What’s wild about this is that you could apply the same reasoning to document verification. Why do I need to invest heavily in document authentication if we know that the device is associated with the document being captured? Well, I’ll save that for another day and another blog post.
Thanks for reading the random musings of an identity wonk. I look forward to hearing your thoughts.
Sources:
Kévin Carta, Claude Barral, Nadia El Mrabet, and Stéfane Mouille, “Video Injection Attacks on Remote Digital Identity Verification Solution Using Face Recognition,” The 13th International Multi-Conference on Complexity, Informatics and Cybernetics: IMCIC 2022, vol. 2 (March 2022): 92-97. https://www.iiis.org/DOI2022/ZA639OX/
Julia Ainsley and Sarah Fitzpatrick, “Secret Service recovers $286 million in stolen Covid relief funds,” NBC News, August 26, 2022. https://www.nbcnews.com/news/crime-courts/secret-service-recovers-286-million-stolen-covid-relief-funds-rcna44886
Bonin, et al, “Pinnacle™: Prove's Machine Learning Approach to Enable Cryptographic Authentication,” Prove Identity, 2022. https://www.prove.com/pages/pinnacle-proves-machine-learning-approach-to-cryptographic-authentication
Carta et al (n 1) 97