OpenAI unveils new AI models to improve internet safety

OpenAI has unveiled two new artificial intelligence models designed to improve internet safety. The models, gpt-oss-safeguard-120b and gpt-oss-safeguard-20b, are built to reason about content and classify various types of harmful material on online platforms.


The models are fine-tuned versions of earlier OpenAI releases. The company positions them as open-weight models, meaning the internal parameters that determine their behavior are publicly available. This offers transparency and control, but differs from open-source models, whose full source code is available for users to customize and modify.

According to OpenAI, companies will be able to adapt the models to their own safety policies. As reasoning models, they show the steps behind each decision, allowing developers to understand how a particular conclusion was reached. For example, a product review website could use them to identify fake reviews.

The models were developed in partnership with companies including Discord and SafetyKit, as well as ROOST, an organization focused on building open safety infrastructure for AI. They are initially available as a preview, and OpenAI plans to gather feedback from users and safety experts.

The move may be seen as a response to criticism that the company has prioritized rapid commercialization and growth over ethics and safety. A day earlier, OpenAI also announced the completion of its recapitalization, cementing a structure in which the non-profit retains a controlling stake in the commercial business.
