The Open Worldwide Application Security Project (OWASP) recently launched their ‘Top 10 for Large Language Model (LLM) Applications’. OWASP is a non-profit organization that offers free and open resources, guidance and best practice insights. Its mission is to assist developers and organizations in safeguarding their web applications, mobile applications and large language models against cyber attacks.
What are LLMS?
Large language models (LLMs) are very large deep learning models that are pre-trained on vast amounts of data and can perform a variety of natural language processing (NLP) tasks. Based on user inputs, these models can generate specific and relevant outputs to address queries They are built on a particular type of machine learning (ML) known as a transformer model. While the commonly known ChatGPT conversational AI / chatbot by OpenAI is powered by the GPT-4 LLM model. Many alternative LLMs exist, including the PaLM 2 model for BARD.
OWASP’s Top Ten for LLMs
The OWASP Top 10 for LLMs project aims to increase awareness among developers, designers, architects, managers, and entire organizations regarding the potential security risks associated with deploying and managing LLMs.
This particular project provides a list of the ten most critical vulnerabilities, emphasizing their potential impact, ease of exploitation, and prevalence in real-world applications. OWASP’s key goals are to raise awareness of these vulnerabilities, prevent them where possible, propose remediation strategies, and ultimately improve the security of all LLM applications.
Vulnerabilities for LLMs
We’ll now briefly examine each of OWASP’s 10 critical vulnerabilities:
- Prompt Injection
This vulnerability involves manipulating LLMs by providing deceptive inputs which cause unintended actions. These vulnerabilities can generally be divided into two forms. Firstly, direct injections which overwrite system prompts and secondly, indirect injections which manipulate inputs from external sources.
- Insecure Output Handling
Backend systems can be exposed when LLM output is accepted without undergoing proper validation. This oversight can lead to serious consequences such as cross-site scripting (XSS), cross-site request forgery (CSRF), server-side request forgery (SSRF), privilege escalation and the remote execution of code.
- Training Data Positioning
When training data used for LLMs is tampered with, it can introduce biases and vulnerabilities, compromising security, ethical behavior or even the LLM’s effectiveness.
- Model Denial of Service
Model denial of service attacks occur when resource-intensive operations are imposed on an LLM. This can lead to a degradation in service capabilities, as well as higher operational and rectification costs.
- Supply Chain Vulnerabilities
Vulnerable components or services within LLMs can result in additional exposure to security attacks. These vulnerabilities may arise from utilizing pre-trained models, plugins, or third-party datasets.
- Sensitive Information Disclosure
LLMs possess the potential to inadvertently reveal confidential data in their responses, leading to privacy violations, security breaches, and unauthorized access to data.
- Insecure Plugin Design
A plugin for a LLM may have insecure inputs or inadequate access controls. This means that they become much easier for attackers to exploit, potentially resulting in serious issues such as the remote execution of code.
- Excessive Agency
Due to excessive functionality, permissions issues or an abundance of autonomy, LLM systems may conduct actions which lead to unexpected consequences.
- Over Reliance
Over reliance on LLMs by individuals or systems can lead to increased problems such as miscommunications, misinformation, legal complications, and security vulnerabilities due to insufficient oversight and testing. This reliance also heightens the risk of generating and spreading inaccurate or inappropriate content.
- Model Theft
Model theft involves the unauthorized access, copying and or removal of proprietary LLM models. Such thefts can expose sensitive or confidential information, lead to economic losses, and reduce competitive advantage.
OWASP’s Prevention Strategies
To mitigate each of the above vulnerabilities, OWASP recommends organizations and LLM developers and designers take the following steps:
- Prompt Injection
To prevent prompt injections, it is crucial to implement privilege controls, restricting access to specific actions based on user roles. User consent for privileged actions should also be a part of this process. Segregating user prompts and implementing processes to visually highlight any unreliable responses is recommended.
- Insecure Output Handling
Adopting a zero-trust approach toward LLM output is essential. This involves rigorous validation and sanitization of the output. Encoding LLM outputs to prevent code execution via Markdown or JavaScript is crucial.
- Training Data Positioning
To prevent manipulation of training data, all external data sources should always be verified. Continuous monitoring of data legitimacy throughout the training stages is recommended. If feasible, utilizing separate training models for different use cases can help minimize potential issues or data manipulation.
- Model Denial of Service
Input validation and content filtration should be used to help prevent denial of service attacks. Setting limits on application programming interface (API) requests per user and managing resource usage per request are crucial preventive measures. Employing resource monitoring and queue management can also significantly reduce the risk of such attacks.
- Supply Chain Vulnerabilities
Mitigating vulnerabilities in the supply chain involves conducting thorough evaluations of all providers and suppliers. Utilizing only trusted plugins and implementing robust security measures such as monitoring, anomaly detection, and inventory management are all essential components of effective LLM management.
- Sensitive Information Disclosure
As well as limiting access to information, organizations should use validation to filter malicious inputs, preventing the poisoning of their LLMs. Data sanitization methods can be used to effectively clean user data for training purposes. Extra caution and procedures should be implemented if sensitive data is required to fine-tune the LLM model.
- Insecure Plugin Design
To mitigate insecure plugin design vulnerabilities in LLMs, it is essential to implement parameter controls such as validation layers and type checks for all plugins. Incorporating least-privilege principles, authorization identities, and user confirmation into processes is also crucial. In addition, conducting comprehensive testing using static application security testing (SAST), dynamic application security testing (DAST), and interactive application security testing (IAST) is recommended.
- Excessive Agency
Plugin functions and scope should be limited to prevent excessive agency, while also avoiding open-ended functions. Additionally, restricting permissions and implementing user authentication is essential. Downstream systems should always incorporate authorization measures.
- Over Reliance
Ensuring consistency in LLM outputs involves continuous monitoring and validation. Verification using trusted sources is recommended, along with fine-tuning where feasible. To mitigate risks, complex tasks need to be segmented and clear communication of LLM limitations should be made to all users. For instance, highlighting these risks prominently on the user interface. All users should also clearly understand best practices for LLMs and have clear usage guidelines that they adhere stringently to.
- Model Theft
Implementing stringent access and authentication controls is imperative to prevent model theft. Regular monitoring and auditing of access logs are essential to detect any unauthorized or unintended access. Leveraging ML operations (MLOps) automation for secure deployment and approval workflows is recommended.
Automated Testing to Prevent Risks
As conversational AI based on LLMs continues to become more commonplace in everyday life, the challenges and risks posed are likely to evolve in tandem. As such, it is crucial for organizations to prioritize investments in testing and monitoring their LLMs.
Cyara Botium offers automated testing solutions designed to facilitate continuous, scheduled, and automatic testing and monitoring of LLMs. Botium offers several guardrails to prevent the most prominent risks which may be introduced by LLM generated data. These include:
- Inaccurate data or hallucinations (data which is made up by the LLM)
- Inappropriate data generated by the LLM
- Security, privacy, legal and copyright breaches
- Biased data.
Through automated testing organizations can better streamline testing and monitoring processes for their LLMs and ensure the highest quality service and user experience while mitigating the aforementioned risks.