Why GPT-4 is vulnerable to image injection attacks
 
                                                    
                                                    The new OpenAI GPT-4V release supports image loading, which creates an entirely new attack vector, making large language models (LLMs) vulnerable to multimodal image injection attacks. Attackers can inject commands, malicious scripts, and code into images, and the model will execute them.
Multimodal attacks using injected images can leak data, redirect requests, create misinformation, and execute more complex scripts to change the interpretation of data in the LLM. They can redirect the LLM to ignore previous defenses and execute commands that can compromise the organization, ranging from fraud to operational sabotage.
.
While all companies that use LLM in their workflows are at risk, those for whom LLM is a core element of image analysis and classification are most at risk. Attackers using different techniques can quickly change how images are interpreted and classified, leading to more chaotic results due to misinformation.
An attacker using different techniques can quickly change how images are interpreted and classified, leading to more chaotic results due to misinformation.
After redefining LLM hints, there is an increased likelihood that it will become even more «blind» to malicious commands and execution scenarios. By embedding commands in a series of images uploaded to the LLM, attackers can orchestrate fraud and operational sabotage, facilitating social engineering attacks.
Attackers can use the LLM’s hints to create fraud and operational sabotage, facilitating social engineering attacks.
 
                    
                                        
                                            
                






