Meta says in a new policy document that it may halt development of AI systems it deems too risky.
Meta has presented its Frontier AI Framework, a guideline for developing and releasing advanced AI models with a focus on risk and security. The document describes how Meta conducts risk assessments, analyzes threat scenarios, and applies decision-making criteria to manage the impact of AI technology.
AI management and risk assessment
The Frontier AI Framework is part of Meta’s broader AI governance program. It focuses specifically on the most advanced AI models that have the potential for catastrophic risk. Meta takes an “outcomes-led” approach, assessing risk based on potential consequences rather than just technical capabilities.
A key aspect of the framework is threat modeling. Meta identifies scenarios in which AI could contribute to large-scale cybersecurity incidents or to the development of chemical and biological weapons. AI models are subjected to evaluations and tests, including red-teaming exercises that involve external experts.
Thresholds and measures
The framework introduces a three-tier model to categorize AI risks as critical, high, or moderate, each tied to a release decision (see the sketch after this list).
- Critical: The model can directly enable a catastrophic threat scenario. Development is halted until effective mitigations are found.
- High: The model increases the probability of a threat scenario, but cannot fully execute it. It is not released externally.
- Moderate: There is no significantly increased risk. The model can be released, with appropriate security measures.
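To make the tiered logic concrete, here is a minimal sketch of how such a mapping from risk tier to release decision could look. The names and structure are hypothetical illustrations, not code or terminology from Meta's document, which describes these outcomes only in prose.

```python
from enum import Enum


class RiskLevel(Enum):
    # Tiers as described in the framework coverage above
    CRITICAL = "critical"   # directly enables a catastrophic threat scenario
    HIGH = "high"           # increases the probability of a threat scenario
    MODERATE = "moderate"   # no significantly increased risk


def release_decision(level: RiskLevel) -> str:
    """Hypothetical helper mapping an assessed risk tier to an action."""
    if level is RiskLevel.CRITICAL:
        return "Halt development until effective mitigations are in place."
    if level is RiskLevel.HIGH:
        return "Keep the model internal; do not release it externally."
    return "Release the model with appropriate security measures."


if __name__ == "__main__":
    for tier in RiskLevel:
        print(f"{tier.value}: {release_decision(tier)}")
```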
Meta emphasizes that the AI ecosystem is constantly evolving and the framework will be updated in the future based on new technological developments and threat insights.