G42

Weak 1.3/5

Rating scale: very weak, weak, moderate, substantial, strong

Risk Identification: 17%

Risk Analysis and Evaluation: 18%

Risk Treatment: 24%

Risk Governance: 42%
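The headline 1.3/5 is consistent with a simple average of the four category scores. The sketch below assumes equal weighting of the categories and a linear rescaling from percent to a 0-5 scale; neither assumption is stated on this page.

```python
# Category scores as listed above (percent).
category_scores = {
    "Risk Identification": 17,
    "Risk Analysis and Evaluation": 18,
    "Risk Treatment": 24,
    "Risk Governance": 42,
}

# ASSUMPTION: equal category weights and a linear percent -> 0-5 rescaling.
mean_pct = sum(category_scores.values()) / len(category_scores)
overall_out_of_5 = mean_pct / 100 * 5

print(round(overall_out_of_5, 1))
```

Under these assumptions the average is 25.25%, which rescales to about 1.26 and rounds to the displayed 1.3.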

Best in class

Note: SaferAI contributed to the process of writing G42’s Frontier Safety Framework.


The various security mitigation levels are very well defined, qualitatively. They outline which threat actor the level should protect against, with a clear qualitative objective.
They are best in class for decision making and audit areas of risk governance. For instance, they uniquely have a dedicated risk committee (Frontier AI Governance Board) which oversees operations.
G42 uniquely mentions having independent internal audits to verify framework compliance, as well as annual external audits.

Overview

Highlights relative to others

Clear evaluation protocols, strong speak-up culture, and clear go/no-go decision protocols.

Strong pairing of risk thresholds to mitigation thresholds, with clear deployment and containment mitigation measures.

Named external collaborators who helped refine focused risk domains.

Weaknesses relative to others

Risk tolerance lacks precision, as do risk thresholds.

Lacking commitments to share evaluation results.

Lacking justification that evaluation methods are comprehensive enough to match threat actors.

No mention of assurance processes, nor a plan to contribute to their development.


1. Risk Identification

Very Weak 17%

1.1 Classification of Applicable Known Risks (40%) 25%

1.1.1 Risks from literature and taxonomies are well covered (50%) 25%

They state that “Initially G42 identified potential capabilities across several domains, including biological risks, cybersecurity, and autonomous operations in specialized fields.” To improve, at least one document from literature should be included which provides transparency for how they arrived at this initial list.

The list of included risk domains is biological threats and offensive cybersecurity. This does not contain chemical, nuclear or radiological risks, nor loss of control risks or autonomous AI R&D. Since 1.1.2 is not greater than 50%, this exclusion would either require more justification, or these areas should be included in monitoring.

Quotes:

“An initial list of potentially hazardous AI capabilities which G42 will monitor for is:

Biological Threats: When an AI’s capabilities could facilitate biological security threats, necessitating strict, proactive measures.
Offensive Cybersecurity: When an AI’s capabilities could facilitate cybersecurity threats, necessitating strict, proactive measures.

To produce this list, G42 both conducted our own internal risk analysis and received input from external AI safety experts. Initially G42 identified potential capabilities across several domains, including biological risks, cybersecurity, and autonomous operations in specialized fields. We then collaborated with METR and SaferAI to refine our list, prioritizing capabilities based on their potential impact and how feasibly they can be measured and monitored.” (p. 4)

“In the future, we will map out other hazardous capabilities to consider monitoring. We may also add thresholds for:

Autonomous Operation: When an AI system can make unsupervised decisions with critical implications, particularly in sectors such as healthcare or defense.
Advanced Manipulation: Applicable when AI systems can influence human behavior or decisions on a large scale, warranting enhanced monitoring and usage restrictions.

We plan to integrate decisions on whether to expand our monitoring to include additional hazardous capabilities into our regular framework review process. This includes both our scheduled internal reviews and our annual reviews by third parties. In making these decisions, we expect to consider factors such as: “near miss” incidents, whether internal or industry-wide; recommendations from trusted external experts; as well as changes in industry standards for AI risk management.” (p. 4)

1.1.2 Exclusions are clearly justified and documented (50%) 25%

It is commendable that they name the third parties that influenced their decision to exclude certain risk domains, like “autonomous operations in specialized fields”. However, whilst their prioritization of risks involves capabilities’ “potential impact and how feasibly they can be measured and monitored”, more detail would be useful on what exact levels of potential impact, and of feasibility of measurement and monitoring, influenced their decision. More detail is also needed on precisely which capabilities they decided to exclude on this basis, and why they excluded, for instance, chemical, radiological and nuclear threats and autonomous AI R&D.

It is good that they list other hazardous capabilities to consider monitoring, and that there is a structured process for deciding whether to expand monitoring to include additional risk domains. However, more precise conditions required for including these capabilities as monitored risk domains could be given.

Quotes:

“Initially G42 identified potential capabilities across several domains, including biological risks, cybersecurity, and autonomous operations in specialized fields. We then collaborated with METR and SaferAI to refine our list, prioritizing capabilities based on their potential impact and how feasibly they can be measured and monitored.

In the future, we will map out other hazardous capabilities to consider monitoring. We may also add thresholds for:

Autonomous Operation: When an AI system can make unsupervised decisions with critical implications, particularly in sectors such as healthcare or defense.
Advanced Manipulation: Applicable when AI systems can influence human behavior or decisions on a large scale, warranting enhanced monitoring and usage restrictions.” (p. 4)

1.2 Identification of Unknown Risks (Open-ended red teaming) (20%) 7%

1.2.1 Internal open-ended red teaming (70%) 10%

There is some indication of identifying risks specific to the model via a structured process, though minimal detail on the methodology is given. Insofar as the “red teaming activity” and “adversarial testing” refers to open-ended red teaming, there is also some recognition that “specialized subject matter experts” are needed. However, detail on the expertise required, and why this standard is satisfied, is not given.

The commitment and purpose could be made more explicit, e.g. that the process is to identify either novel risk domains, or novel risk models/changed risk profiles within pre-specified risk domains (e.g. emergence of an extended context length allowing improved zero shot learning changes the risk profile), and provide methodology, resources and required expertise.

Quotes:

Deployment Mitigation Level 3: “Simulation and Adversarial Testing: Regular simulations identify model vulnerabilities and develop adaptive responses. Red teaming activity to identify and mitigate potential risks in the system.

Testing is designed to ensure effectiveness across all planned deployment contexts, with specialized subject matter experts providing domain-specific input as needed.” (p. 8)

1.2.2 Third party open-ended red teaming (30%) 0%

The framework doesn’t mention any third-party procedures pre-deployment to identify novel risk domains or risk models for the frontier model. To improve, they should commit to an external process to identify either novel risk domains, or novel risk models/changed risk profiles within pre-specified risk domains (e.g. emergence of an extended context length allowing improved zero shot learning changes the risk profile), and provide methodology, resources and required expertise.

Quotes:

No relevant quotes found.

1.3 Risk modeling (40%) 15%

1.3.1 The company uses risk models for all the risk domains identified and the risk models are published (with potentially dangerous information redacted) (40%) 10%

There is no description of risk modelling or of the use of risk models. However, a risk model is implicit in the definition of the Frontier Capability Thresholds, which read as threat scenarios for certain harms. For instance, they mention “Biological threats: Enabling an individual with only introductory biology experience in developing a biological weapon, through providing detailed advice, automating biological processes, etc. Or, facilitating the design of novel biological weapons with substantially greater potential for damage” and “Offensive cybersecurity: Automating powerful cyber offensive operations against unsecured or secured targets, in a way that could cause critical damage.” Both could be seen as threat scenarios for how harm could be caused in the risk domains of biological threats and offensive cybersecurity specifically.

However, to improve, explicit risk modelling with step by step, causal pathways to harm, specific to G42’s models, needs to be conducted.

Quotes:

“Biological threats: Enabling an individual with only introductory biology experience in developing a biological weapon, through providing detailed advice, automating biological processes, etc. Or, facilitating the design of novel biological weapons with substantially greater potential for damage” (p. 5)

“Offensive cybersecurity: Automating powerful cyber offensive operations against unsecured or secured targets, in a way that could cause critical damage.” (p. 6)

1.3.2 Risk modeling methodology (40%) 2%

1.3.2.1 Methodology precisely defined (70%) 0%

There is no methodology for risk modeling defined.

Quotes:

No relevant quotes found.

1.3.2.2 Mechanism to incorporate red teaming findings (15%) 0%

No mention of risks identified during open-ended red teaming or evaluations triggering further risk modeling.

Quotes:

No relevant quotes found.

1.3.2.3 Prioritization of severe and probable risks (15%) 10%

They prioritize capabilities (which here implicitly refers to risk models) “based on their potential impact and how feasibly they can be measured and monitored.” ‘Potential impact’ here likely refers to some combination of severity and probability. However, more explicit detail on how these are weighed is needed.

Quotes:

“We then collaborated with METR and SaferAI to refine our list, prioritizing capabilities based on their potential impact and how feasibly they can be measured and monitored.” (p. 4)

1.3.3 Third party validation of risk models (20%) 50%

They describe collaborating with “external AI safety experts”, METR and SaferAI to “refine” their list of identified risk domains, which implicitly means to give input into their risk models and to help prioritize them. To improve, detail should be given on how exactly risk models are validated, as opposed to just expert input. Naming the third parties involved however is commendable.

Quotes:

“To produce this list, G42 both conducted our own internal risk analysis and received input from external AI safety experts. Initially G42 identified potential capabilities across several domains, including biological risks, cybersecurity, and autonomous operations in specialized fields. We then collaborated with METR and SaferAI to refine our list, prioritizing capabilities based on their potential impact and how feasibly they can be measured and monitored.” (p. 4)
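The section score of 17% can be reproduced from the sub-scores above by a weighted roll-up. The sketch below assumes each parent score is the weighted average of its children, using the bracketed weights in each heading, and that rounding is applied only for display; the `rollup` helper and the nested-list layout are illustrative, not part of the rating methodology as published here.

```python
def rollup(node):
    """Return a leaf score, or the weighted average of a node's children."""
    if isinstance(node, (int, float)):
        return float(node)
    return sum(weight * rollup(child) for weight, child in node)

# (weight, score-or-children); weights and leaf scores are taken from the
# headings of section 1 above.
risk_identification = [
    (0.40, [(0.50, 25), (0.50, 25)]),                 # 1.1
    (0.20, [(0.70, 10), (0.30, 0)]),                  # 1.2
    (0.40, [                                          # 1.3
        (0.40, 10),                                   # 1.3.1
        (0.40, [(0.70, 0), (0.15, 0), (0.15, 10)]),   # 1.3.2
        (0.20, 50),                                   # 1.3.3
    ]),
]

section_score = rollup(risk_identification)
print(round(section_score))
```

Under that assumption the roll-up gives about 17.2, matching the displayed 17%. The same helper gives 7 for 1.2 and 14.6 for 1.3, which match the displayed 7% and 15% after rounding.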


2. Risk Analysis and Evaluation

Very Weak 18%

2.1 Setting a Risk Tolerance (35%) 7%

2.1.1 Risk tolerance is defined (80%) 8%

2.1.1.1 Risk tolerance is at least qualitatively defined for all risks (33%) 25%

Implicitly, capability thresholds function as risk tolerances. For instance, “Enabling an individual with only introductory biology experience in developing a biological weapon, through providing detailed advice, automating biological processes, etc. Or, facilitating the design of novel biological weapons with substantially greater potential for damage.” To improve, they should set out the maximum amount of risk the company is willing to accept for each risk domain (though these need not differ between risk domains), ideally expressed in terms of probabilities and severity (economic damages, physical lives, etc), and separate from KRIs.

Quotes:

Biological threats: “Enabling an individual with only introductory biology experience in developing a biological weapon, through providing detailed advice, automating biological processes, etc. Or, facilitating the design of novel biological weapons with substantially greater potential for damage.” (p. 5)

Offensive cybersecurity: “Automating powerful cyber offensive operations against unsecured or secured targets, in a way that could cause critical damage.” (p. 6)

2.1.1.2 Risk tolerance is expressed at least partly quantitatively as a combination of scenarios (qualitative) and probabilities (quantitative) for all risks (33%) 0%

The risk tolerance, implicit or otherwise, is not expressed fully or partly quantitatively. To improve, the risk tolerance should be expressed fully quantitatively or as a combination of scenarios with probabilities.

Quotes:

No relevant quotes found.

2.1.1.3 Risk tolerance is expressed fully quantitatively as a product of severity (quantitative) and probability (quantitative) for all risks (33%) 0%

Quotes:

No relevant quotes found.
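For illustration, the two quantitative forms that criteria 2.1.1.2 and 2.1.1.3 ask for could be written as follows. All symbols are assumptions introduced here, not notation from G42's framework: $d$ indexes risk domains, $P_d$ is the probability per year of the harm scenario for domain $d$, $S_d$ is its severity (for example lives lost or economic damages), and $p_d^{\max}$ and $T_d$ are the tolerances the developer would commit to.

```latex
% Partly quantitative (2.1.1.2): a qualitative scenario paired with a probability bound
P_d(\text{scenario}_d) \;\le\; p_d^{\max}

% Fully quantitative (2.1.1.3): the product of probability and severity is bounded
P_d \times S_d \;\le\; T_d \qquad \text{for each risk domain } d
```

The tolerances need not differ between domains, so a single shared $T$ would also satisfy the criterion.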

2.1.2 Process to define the tolerance (20%) 0%

2.1.2.1 AI developers engage in public consultations or seek guidance from regulators where available (50%) 0%

No evidence of engaging in public consultations or seeking guidance from regulators for risk tolerance.

Quotes:

No relevant quotes found.

2.1.2.2 Any significant deviations from risk tolerance norms established in other industries is justified and documented (e.g., cost-benefit analyses) (50%) 0%

No justification process: No evidence of considering whether their approach aligns with or deviates from established norms.

Quotes:

No relevant quotes found.

2.2 Operationalizing Risk Tolerance (65%) 24%

2.2.1 Key Risk Indicators (KRI) (30%) 22%

2.2.1.1 KRI thresholds are at least qualitatively defined for all risks (45%) 25%

The Frontier Capability Threshold acts as a pseudo risk tolerance, whilst the evaluations function as KRIs. However, the evaluations do not specify what threshold would trigger mitigations. For instance, for preliminary evaluations for biological threats, they say to “assess language models’ accuracy on open-source biology benchmarks (e.g., LAB-Bench, WMDP-Bio)”, but don’t indicate what level of accuracy would be concerning. The same is true for all the evaluations mentioned. Nonetheless, the evaluation results appear grounded in risk modeling.

In addition, only one KRI is given for each risk domain; though they do mention to “Update the policy to include more advanced thresholds” once the threshold is crossed.

Quotes:

Biological Threats. “Frontier Capability Threshold. Enabling an individual with only introductory biology experience in developing a biological weapon, through providing detailed advice, automating biological processes, etc. Or, facilitating the design of novel biological weapons with substantially greater potential for damage. Evaluations. Preliminary evaluations: Benchmarking AI models trained on Bio-Chem data (DNA, Proteins, Chemical molecules, etc.) against emerging research to check capabilities to predict and simulate complex biological interactions. Assess language models’ accuracy on open-source biology benchmarks (e.g., LAB-Bench, WMDP-Bio).

In-depth evaluations: Assess Bio-Chem AI models’ capabilities to help design harmful substances like new variants. Evaluate language models’ ability to answer questions about biological weapons development, relative to expert ability. More ambitiously, evaluate human participants’ ability to conduct realistic but safe wet lab experiments with and without language model assistance. Although there is less prior research on evaluating and mitigating risks from biological design tools (BDTs), it is still critical for G42 to develop innovative measures for these tools over time.” (p. 6)

Offensive cybersecurity. “Frontier Capability Threshold. Automating powerful cyber offensive operations against unsecured or secured targets, in a way that could cause critical damage. Preliminary evaluations: Assess language model performance on opensource cyber offense benchmarks (e.g., Cybench, eyeballvul). In-depth evaluations: Assess model ability to exploit examples of secured targets in a safe environment.” (p. 6)

2.2.1.2 KRI thresholds are quantitatively defined for all risks (45%) 10%

The KRIs are not quantitative – however, they could improve this by providing specific quantitative thresholds (on the benchmarks, uplift studies etc. that they mention in the evaluations) which would trigger mitigations. Partial credit is given for this.

Quotes:

Biological Threats. “Frontier Capability Threshold. Enabling an individual with only introductory biology experience in developing a biological weapon, through providing detailed advice, automating biological processes, etc. Or, facilitating the design of novel biological weapons with substantially greater potential for damage. Evaluations. Preliminary evaluations: Benchmarking AI models trained on Bio-Chem data (DNA, Proteins, Chemical molecules, etc.) against emerging research to check capabilities to predict and simulate complex biological interactions. Assess language models’ accuracy on open-source biology benchmarks (e.g., LAB-Bench, WMDP-Bio).

In-depth evaluations: Assess Bio-Chem AI models’ capabilities to help design harmful substances like new variants. Evaluate language models’ ability to answer questions about biological weapons development, relative to expert ability. More ambitiously, evaluate human participants’ ability to conduct realistic but safe wet lab experiments with and without language model assistance. Although there is less prior research on evaluating and mitigating risks from biological design tools (BDTs), it is still critical for G42 to develop innovative measures for these tools over time.” (p. 6)
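A quantitative KRI of the kind this criterion asks for could be as simple as a trigger level per benchmark. The sketch below is illustrative only: the benchmark names come from the quoted framework, but the trigger accuracies, the `KriThreshold` class and the `crossed_kris` function are hypothetical and do not reflect values G42 has published.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KriThreshold:
    benchmark: str
    trigger_accuracy: float  # HYPOTHETICAL value, for illustration only

# HYPOTHETICAL thresholds; the framework names the benchmarks but no levels.
HYPOTHETICAL_THRESHOLDS = [
    KriThreshold("WMDP-Bio", 0.80),
    KriThreshold("LAB-Bench", 0.60),
]

def crossed_kris(results, thresholds=HYPOTHETICAL_THRESHOLDS):
    """Return benchmarks whose measured accuracy meets or exceeds its trigger.

    Benchmarks missing from `results` are treated as not crossed.
    """
    return [
        t.benchmark
        for t in thresholds
        if results.get(t.benchmark, 0.0) >= t.trigger_accuracy
    ]

print(crossed_kris({"WMDP-Bio": 0.85, "LAB-Bench": 0.40}))
```

With thresholds stated this way, a reader can tell in advance which evaluation result would trigger DML 2 and SML 2, which is the precision the current framework lacks.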

2.2.1.3 KRIs also identify and monitor changes in the level of risk in the external environment (10%) 10%

There is an indication that KRIs monitor changes in the level of risk in the external environment, e.g. “post-deployment monitoring will also be used to indicate whether G42’s models have reached capability thresholds”. However, the specific target of monitoring, and the threshold that would trigger mitigations, is not given.

Quotes:

“In addition to conducting pre-deployment evaluations, post-deployment monitoring will also be used to indicate whether G42’s models have reached capability thresholds and whether increased deployment mitigation and security mitigation levels are required.” (p. 5)

2.2.2 Key Control Indicators (KCI) (30%) 25%

2.2.2.1 Containment KCIs (35%) 45%

2.2.2.1.1 All KRI thresholds have corresponding qualitative containment KCI thresholds (50%) 90%

Each of the KRI thresholds requires SML 2 if triggered (G42’s security level 2). SML 2 is clearly qualitatively defined: “The model should be secured such that it would be highly unlikely that a malicious individual or organization (state sponsored, organized crime, terrorist, etc.) could obtain the model weights or access sensitive data.” More detail would be useful on what constitutes a “malicious individual or organization” and “highly unlikely”, and what techniques are used by the malicious individual/organization.

SML 1, 3 and 4 are also defined, though they are not linked to a specific KRI threshold. There is a commitment to further develop SML 3 once the KRI is reached, but without justification that this will be sufficiently in advance.

Quotes:

“Additionally, if a Frontier Capability Threshold has been reached, G42 will update this Framework to define a more advanced threshold that requires increased deployment (e.g., DML 3) and security mitigations (e.g., SML 3).” (p. 5)

“G42’s Security Mitigation Levels are a set of levels, mapped to the Frontier Capability Thresholds, describing escalating information security measures. These protect against the theft of model weights, model inversion, and sensitive data, as models reach higher levels of capability and risk. Each tier customizes protections based on the assessed risk and capability of the model, ensuring G42’s AI development remains both resilient and efficient, minimizing disruptions to functionality while maintaining robust security.” (p. 9)

Security Level 1. “Suitable for models with minimal hazardous capabilities. Objective: No novel mitigations required on the basis of catastrophically dangerous capabilities.” (p. 9)

Security Level 2. “Intermediate safeguards for models with capabilities requiring controlled access, providing an extra layer of caution. Objective: The model should be secured such that it would be highly unlikely that a malicious individual or organization (state sponsored, organized crime, terrorist, etc.) could obtain the model weights or access sensitive data.” (p. 10)

Security Level 3. “Advanced safeguards for models approaching hazardous capabilities that could uplift state programs. Objective: Model weight security should be strong enough to resist even concerted attempts, with support from state programs, to steal model weights or key algorithmic secrets.” (p. 10)

Security Level 4. “Maximum safeguards. Objective: Security strong enough to resist concerted attempts with support from state programs to steal model weights.” (p. 11)

2.2.2.1.2 All KRI thresholds have corresponding quantitative containment KCI thresholds (50%) 0%

The containment KCI thresholds are not quantitatively defined.

Quotes:

No relevant quotes found.

2.2.2.2 Deployment KCIs (35%) 25%

2.2.2.2.1 All KRI thresholds have corresponding qualitative deployment KCI thresholds (50%) 50%

The KRI thresholds clearly require DML 2 if triggered. DML 2 is their deployment mitigation level 2. DML 2 is clearly qualitatively defined: “Even a determined actor should not be able to reliably elicit CBRN weapons advice or use the model to automate powerful cyberattacks including malware generation as well as misinformation campaigns, fraud material, illicit video/text/image generation via jailbreak techniques overriding the internal guardrails and supplemental security products.”

More detail would be useful on what constitutes a “determined actor”, “reliably elicit”, or “powerful cyberattacks.” It is also unclear if DML 2 must be implemented even if say, the Biological threats KRI is triggered but the Offensive cybersecurity KRI is not.

DML 1, 3 and 4 are also defined, though they are not linked to a specific KRI threshold, and could again use more detail.

There is a commitment to further develop DML 3 once the KRI is reached, but without justification that this will be sufficiently in advance.

Quotes:

“G42’s Frontier Capability Thresholds are defined in the following table. Each capability threshold is associated with a required Deployment Mitigation Level (DML) and Security Mitigation Levels (SML), which must be achieved before the capability threshold is reached. If a necessary Deployment Mitigation Level cannot be achieved, then the model’s deployment must be restricted; if a necessary Security Mitigation Level cannot be achieved, then further capabilities development of the model must be paused. Additionally, if a Frontier Capability Threshold has been reached, G42 will update this Framework to define a more advanced threshold that requires increased deployment (e.g., DML 3) and security mitigations (e.g., SML 3).” (p. 5)

Frontier Capability Threshold for Biological threats and/or Offensive cybersecurity triggers DML 2 and SML 2. (pp. 5-6)

“Deployment Mitigation Level 1: Foundational safeguards, applied to models with minimal hazardous capabilities. Objective: No novel mitigations required on the basis of catastrophically dangerous capabilities” (p. 7)

“Deployment Mitigation Level 2: Intermediate safeguards for models with capabilities requiring focused monitoring. Objective: Even a determined actor should not be able to reliably elicit CBRN weapons advice or use the model to automate powerful cyberattacks including malware generation as well as misinformation campaigns, fraud material, illicit video/text/image generation via jailbreak techniques overriding the internal guardrails and supplemental security products.”

“Deployment Mitigation Level 3: Advanced safeguards for models approaching significant capability thresholds. Objective: Deployment safety should be strong enough to resist sophisticated attempts to jailbreak or otherwise misuse the model.”

“Deployment Mitigation Level 4: Maximum safeguards, designed for high-stakes frontier models with critical functions. Objective: Deployment safety should be strong enough to resist even concerted attempts, with support from state programs, to jailbreak or otherwise misuse the model.”

2.2.2.2.2 All KRI thresholds have corresponding quantitative deployment KCI thresholds (50%) 0%

There are no quantitative deployment KCI thresholds given.

Quotes:

No relevant quotes found.

2.2.2.3 For advanced KRIs, assurance process KCIs are defined (30%) 0%

There are no assurance process KCIs defined. The framework does not recognize that KCIs exist outside of containment and deployment measures.

Quotes:

No relevant quotes found.

2.2.3 Pairs of thresholds are grounded in risk modeling to show that risks remain below the tolerance (20%) 25%

Whilst the framework acknowledges that the containment and deployment KCIs “protect against the theft of model weights, model inversion and sensitive data, as models reach higher levels of capability and risk” and “protect against misuse, including through jailbreaking, as models reach higher levels of capability and risk” respectively, these could be more explicitly linked to a risk model detailing why exactly these KCIs, if satisfied, enable risks to remain below the risk tolerance.

Quotes:

“G42’s Deployment Mitigation Levels are a set of levels, mapped to the Frontier Capability Thresholds, that describe escalating mitigation measures for products deployed externally. These protect against misuse, including through jailbreaking, as models reach higher levels of capability and risk. These measures address specifically the goal of denying bad actors access to dangerous capabilities under the terms of intended deployment for our models, i.e. presuming that our development environment’s information security has not been violated.” (p. 7)

2.2.4 Policy to put development on hold if the required KCI threshold cannot be achieved, until sufficient controls are implemented to meet the threshold (20%) 25%

Whilst there is a commitment to pausing development if a necessary containment KCI cannot be reached, the KCIs should be defined such that development is put on hold if any KCI cannot be reached (and the corresponding KRI threshold is crossed). Further, a process for pausing development should be given, to ensure risk levels do not manifest above the risk tolerance at any point. Conditions and a process for de-deployment should also be given.

Quotes:

“If a necessary Deployment Mitigation Level cannot be achieved, then the model’s deployment must be restricted; if a necessary Security Mitigation Level cannot be achieved, then further capabilities development of the model must be paused.” (p. 5)
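The quoted rule can be read as a small decision function. The sketch below encodes the rule as quoted from p. 5; the `Action` enum and `required_actions` function are illustrative names, not terms from the framework.

```python
from enum import Enum

class Action(Enum):
    PROCEED = "proceed"
    RESTRICT_DEPLOYMENT = "restrict deployment"
    PAUSE_DEVELOPMENT = "pause further capabilities development"

def required_actions(dml_achieved: bool, sml_achieved: bool) -> set:
    """Map mitigation status to actions, following the rule quoted from p. 5."""
    actions = set()
    if not dml_achieved:
        # Missing Deployment Mitigation Level: deployment is restricted.
        actions.add(Action.RESTRICT_DEPLOYMENT)
    if not sml_achieved:
        # Missing Security Mitigation Level: development is paused.
        actions.add(Action.PAUSE_DEVELOPMENT)
    return actions or {Action.PROCEED}

print(required_actions(dml_achieved=False, sml_achieved=True))
```

Written out this way, the gap noted above is visible: a missing deployment mitigation level restricts deployment but does not pause development, whereas the criterion asks for development to be put on hold when any KCI cannot be met.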


3. Risk Treatment

Weak 24%

3.1 Implementing Mitigation Measures (50%) 23%

3.1.1 Containment measures (35%) 34%

3.1.1.1 Containment measures are precisely defined for all KCI thresholds (60%) 50%

Containment measures are given for Levels 2 and 3, but could be more specific, e.g. by specifying what “verified credentials” and “Access is role-based, aligned with user responsibilities, and supported by a zero-trust architecture to prevent unauthorized entry” actually entail. A plan is not given for assuring that measures will be defined for Level 4 before the corresponding KRI is crossed.

Quotes:

The following are from pp. 9-11:

Security Level 1. “Specific Measures. None. G42 may choose to open source models.”

Security Level 2. “Specific Measures. Access controls and role-based permissions. Model weights are gated by granular role-based permission levels, model access is geofenced to pre-approved locations, limited access attempts using the same credentials. Network segmentation to isolate systems containing model weights.

Internal and External Red-Teaming: Rigorous testing by internal security teams, supplemented by external experts, to identify weaknesses.

Dynamic Threat Simulation and Response Testing: Regular adversarial simulations expose potential security weaknesses.”

Security Level 3. “Specific Measures. Model weights and sensitive data are secured through thorough Security Level 2 protocols, as well as the following measures to ensure access to model weights is highly restricted: multi-party and quorum-based approval for high-sensitivity operations, end-to-end encryption of model weights both at rest and in transit, automatic encryption key rotations at regular intervals. Only trusted users with verified credentials are granted access to high-risk models. Access is role-based, aligned with user responsibilities, and supported by a zero-trust architecture to prevent unauthorized entry.”

Security Level 4. “Specific Measures. To be defined when models reach capabilities necessitating Level 3 containment mitigation measures.”

3.1.1.2 Proof that containment measures are sufficient to meet the thresholds (40%) 10%

Whilst there is a process for determining weaknesses in containment measures with internal red-teaming, it is not clear that this is prior to their implementation. Further, to improve, they should detail proof for why they believe the containment measures proposed will be sufficient to meet the KCI threshold, in advance. In addition, red-teaming is more an evidence gathering activity than a validation/proof; to improve, a case should be made for why they believe their containment measures to be sufficient.

Quotes:

Security Level 2. “Internal and External Red-Teaming: Rigorous testing by internal security teams, supplemented by external experts, to identify weaknesses.” (p. 10)

3.1.1.3 Strong third party verification process to verify that the containment measures meet the threshold (100% if 3.1.1.3 > [60% x 3.1.1.1 + 40% x 3.1.1.2]) 10%

Whilst there is a process for determining weaknesses in containment measures with external red-teaming, it is not clear that this is prior to their implementation. In addition, red-teaming is more for evidence collection than validation, which this criterion requires. Further, to improve, they should detail a process for third-parties to verify the case for why they believe the containment measures proposed will be sufficient to meet the KCI threshold, in advance.

Quotes:

Security Level 2. “Internal and External Red-Teaming: Rigorous testing by internal security teams, supplemented by external experts, to identify weaknesses.” (p. 10)
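The conditional weight on 3.1.1.3 can be read as follows: third-party verification replaces the other two criteria only if it scores higher than their weighted sum. This reading is an assumption about the rubric's intent, and the function below is an illustrative sketch, not the published scoring code.

```python
def containment_score(defined: float, proof: float, third_party: float) -> float:
    """Combine 3.1.1.1, 3.1.1.2 and 3.1.1.3 (all in percent).

    ASSUMPTION: 3.1.1.3 takes 100% of the weight only when it exceeds
    60% x 3.1.1.1 + 40% x 3.1.1.2; otherwise that weighted sum is used.
    """
    baseline = 0.60 * defined + 0.40 * proof
    return third_party if third_party > baseline else baseline

# Scores from this section: 3.1.1.1 = 50, 3.1.1.2 = 10, 3.1.1.3 = 10.
print(round(containment_score(50, 10, 10)))
```

With the scores given here the weighted sum is 34, which exceeds the third-party score of 10, so the displayed 34% for 3.1.1 is consistent with this reading.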

3.1.2 Deployment measures (35%) 30%

3.1.2.1 Deployment measures are precisely defined for all KCI thresholds (60%) 50%

The deployment measures are defined in detail for Levels 1, 2 and 3 but not Level 4 (i.e., their various deployment KCIs). Some of the measures remain high-level and could use more precision. For instance, “Regular simulations identify model vulnerabilities and develop adaptive responses” or “Asynchronous Monitoring: This offcycle review catches anomalies missed in real-time, assessing all stored interactions for unusual behaviors” could specify the frequency of testing or the evidence being searched for.

Quotes:

Deployment Mitigation Level 1: “Specific Measures. Examples of foundational safeguards that may be applied include: Model Cards: Documents published alongside each new model deployment, summarising the model’s intended use cases, performance on public benchmarks, and the responsible practices conducted to ensure safety.

Incident Reporting Channels: Designated pathways for users to report instances of concerning or harmful behavior in violation of company policy to relevant G42 personnel.

Information Security Training: Training programs for new and existing personnel on best practices in information security consistent with the measures described in the Security Mitigation Levels.” (p. 7)

Deployment Mitigation Level 2: “Specific Measures. Risk of model misuse is mitigated by: Real-Time Monitoring and Prompt Filtering: Real-time classifiers evaluate inputs and outputs, detecting and filtering harmful interactions as they occur. This will also be aligned to underlying customer company policy and regulatory compliance.

Model Robustness Testing: Regular tests of AI models for robustness against attempts to manipulate or corrupt their output.

Asynchronous Monitoring: This offcycle review catches anomalies missed in real-time, assessing all stored interactions for unusual behaviors.

Controlled Rollout: For new frontier level models, implement phased rollouts, starting with limited access to trusted users, with full deployment only after exhaustive risk assessments.” (p. 8)

Deployment Mitigation Level 3: “Specific Measures. Risk of model misuse is mitigated by: Real-time anomaly detection and encrypted data handling.

Simulation and Adversarial Testing: Regular simulations identify model vulnerabilities and develop adaptive responses. Red teaming activity to identify and mitigate potential risks in the system. Testing is designed to ensure effectiveness across all planned deployment contexts, with specialized subject matter experts providing domain-specific input as needed.

Controlled Rollout: For new frontier level models, implement phased rollouts, starting with limited access to trusted users, with fully deployment only after exhaustive risk assessments.” (p. 8)

Deployment Mitigation Level 4: “Specific Measures. To be defined when models reach capabilities necessitating Level 3 deployment mitigation measures.” (p. 9)

“Although there is less prior research on evaluating and mitigating risks from biological design tools (BDTs), it is still critical for G42 to develop innovative measures for these tools over time.” (p. 6)

3.1.2.2 Proof that deployment measures are sufficient to meet the thresholds (40%) 0%

No proof is provided that the deployment measures are sufficient to meet the deployment KCI thresholds, nor is there a process to solicit such proof.

Quotes:

No relevant quotes found.

3.1.2.3 Strong third party verification process to verify that the deployment measures meet the threshold (100% if 3.1.2.3 > [60% x 3.1.2.1 + 40% x 3.1.2.2]) 25%

They detail a process for soliciting external expert advice prior to deployment decisions. However, sufficiency criteria for third parties’ expertise should be determined ex ante, and the advice should amount to verification that the measures are sufficient, rather than simply “input”. Further, verification should ideally take place before the relevant KRI thresholds are crossed, rather than after.

Quotes:

“As deemed appropriate, we will solicit external expert advice for capability and safeguards assessments. This may include partnering with private or civil society organisations with expertise in AI risk management to provide input on our assessments plans and/or internal capability reports ahead of deployment decisions.” (p. 12)
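
The conditional weight in the heading for 3.1.2.3 can be checked with a short calculation. The sketch below assumes one reading of the parenthetical: the third-party verification score replaces the weighted base (i.e. takes a weight of 100%) only when it exceeds 60% x 3.1.2.1 + 40% x 3.1.2.2, and otherwise the weighted base stands. The function name and this interpretation are illustrative assumptions, not part of the published methodology.

```python
def measures_score(defined: float, proof: float, third_party: float) -> float:
    """Score a measures section from its three criteria (all in percent).

    Assumed reading of the heading: the third-party score takes over
    (weight 100%) only if it exceeds the 60/40 weighted base.
    """
    base = (60 * defined + 40 * proof) / 100
    return third_party if third_party > base else base


# Deployment measures as scored above: 3.1.2.1 = 50, 3.1.2.2 = 0, 3.1.2.3 = 25.
deployment = measures_score(defined=50, proof=0, third_party=25)
print(deployment)  # 30.0, matching the 30% shown for 3.1.2
```

Under this reading, the 25% for third-party verification does not lift the section score, because it falls below the weighted base of 30%.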

3.1.3 Assurance processes (30%) 2%

3.1.3.1 Credible plans towards the development of assurance properties (40%) 0%

There are no indications of plans to develop assurance processes, nor any mention of assurance processes in the framework. There is also no indication of contributing to the research effort to ensure assurance processes are in place when they are required.

Quotes:

No relevant quotes found.

3.1.3.2 Evidence that the assurance properties are enough to achieve their corresponding KCI thresholds (40%) 0%

There is no mention of providing evidence that the assurance processes are sufficient.

Quotes:

No relevant quotes found.

3.1.3.3 The underlying assumptions that are essential for their effective implementation and success are clearly outlined (20%) 10%

Whilst assurance processes are not explicitly mentioned in the framework, the assumptions for deployment KCIs to successfully mitigate risk are given, which is given partial credit here: “these measures […] [presume] that our development environment’s information security has not been violated”. To improve, a similar mode of setting out assumptions for KCIs to be successfully met should be applied for assurance processes.

Quotes:

3.2 Continuous Monitoring and Comparing Results with Pre-determined Thresholds (50%) 25%

3.2.1 Monitoring of KRIs (40%) 31%

3.2.1.1 Justification that elicitation methods used during the evaluations are comprehensive enough to match the elicitation efforts of potential threat actors (30%) 50%

There is an indication that elicitation must “avoid underestimating model capabilities”, listing elicitation methods such as “prompt engineering, fine-tuning, and agentic tool usage”. However, this reasoning is not used to empirically justify why the evaluations are comprehensive enough, and is not linked to risk models of the elicitation efforts of potential threat actors.

Quotes:

“If the preliminary evaluations cannot rule out proficiency in hazardous capabilities, then we will conduct in-depth evaluations that study the capability in more detail to assess whether the Frontier Capability Threshold has been met. Such evaluations will incorporate capability elicitation – techniques such as prompt engineering, fine-tuning, and agentic tool usage – to optimize performance, overcome model refusals, and avoid underestimating model capabilities. Models created to generate output in a specific language, such as Arabic or Hindi, may be tested in those languages.” (pp. 5-6)

3.2.1.2 Evaluation frequency (25%) 50%

There is an acknowledgment that frequent evaluation during development is necessary, with a period of 6 months “for our most advanced models”. However, the frequency is defined only in terms of time and is not additionally tied to increases in effective compute. It would be an improvement to state that the fixed time period is intended to account for post-training enhancements/elicitation methods.

Quotes:

“G42 will conduct evaluations throughout the model lifecycle to assess whether our models are approaching Frontier Capability Thresholds” (p. 5)

“G42 will publish internal reports providing detailed results of our capability evaluations. These reports will be created for our most advanced models at least once every six months, and the results will be shared with the Frontier AI Governance Board and the G42 Executive Leadership Committee.” (p. 5)

“Conduct routine capability assessments.” (p. 13)

3.2.1.3 Description of how post-training enhancements are factored into capability assessments (15%) 0%

Whilst evaluations are defined to “avoid underestimating model capabilities”, this is not explicitly linked to accounting for post-training enhancements, nor a safety margin.

Quotes:

3.2.1.4 Vetting of protocols by third parties (15%) 25%

There is some process for gaining external input on evaluation protocols. To improve, this could be made required rather than “as deemed appropriate”, and with named organizations, as well as sufficiency criteria for expertise. Further, the input from third parties should be less about providing information than about validating the protocols used, providing a third-party form of accountability to verify that the evaluation methodologies are sound.

Quotes:

3.2.1.5 Replication of evaluations by third parties (15%) 0%

There is no mention of evaluations being replicated or conducted by third parties.

Quotes:

No relevant quotes found.
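
The 31% shown for 3.2.1 follows from the weights and scores of its five criteria. A minimal sketch of that roll-up is below; the helper names are hypothetical, and rounding half up to a whole percent is an assumption about how the displayed figure is produced.

```python
import math


def weighted_score(items: list[tuple[int, int]]) -> float:
    """Combine (weight %, score %) pairs into one percentage."""
    total_weight = sum(weight for weight, _ in items)
    if total_weight != 100:
        raise ValueError(f"weights sum to {total_weight}, expected 100")
    return sum(weight * score for weight, score in items) / 100


def round_half_up(value: float) -> int:
    """Round to a whole percent, with .5 rounding up (assumed convention)."""
    return math.floor(value + 0.5)


# (weight, score) for criteria 3.2.1.1 to 3.2.1.5, as listed above.
kri_monitoring = [(30, 50), (25, 50), (15, 0), (15, 25), (15, 0)]

raw = weighted_score(kri_monitoring)
print(raw, round_half_up(raw))  # 31.25 31
```

The two criteria scored at 0% (post-training enhancements and third-party replication) together carry 30% of the weight, which caps this subsection at 70% regardless of the other scores.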

3.2.2 Monitoring of KCIs (40%) 21%

3.2.2.1 Detailed description of evaluation methodology and justification that KCI thresholds will not be crossed unnoticed (40%) 25%

There is an awareness that monitoring of mitigation effectiveness is necessary. However, more detail is required on what “post-deployment monitoring” entails, such as process, frequency and methods. Post-deployment monitoring also seems to focus more on whether models cross KRI thresholds than on whether measures still meet the KCI threshold.

Quotes:

“Model Robustness Testing: Regular tests of AI models for robustness against attempts to manipulate or corrupt their output.” (p. 8) (DL2)

“Asynchronous Monitoring: This off cycle review catches anomalies missed in real-time, assessing all stored interactions for unusual behaviors.” (p. 8) (DL2)

3.2.2.2 Vetting of protocols by third parties (30%) 25%

There is some process for gaining external input on safeguard assessment protocols. To improve, this could be made required rather than “as deemed appropriate”, and with named organizations, as well as sufficiency criteria for expertise.

Quotes:

3.2.2.3 Replication of evaluations by third parties (30%) 10%

There is an indication that third parties help to conduct red teaming of containment KCI measures to ensure they meet the containment KCI threshold, but details on process, required expertise and methods are not given, and external experts are only supplementary. To improve, there should also be a process for having safeguard red teaming of deployment KCI measures replicated or conducted by third parties. Further, these external evaluations should be independent.

Quotes:

“Internal and External Red-Teaming: Rigorous testing by internal security teams, supplemented by external experts, to identify weaknesses.” (p. 10)

3.2.3 Transparency of evaluation results (10%) 21%

3.2.3.1 Sharing of evaluation results with relevant stakeholders as appropriate (85%) 25%

Whilst they commit to publishing Model Cards publicly with each new deployment, this only details “performance on public benchmarks”. To improve, all KRI and KCI assessments should be public. Further, they should notify the relevant authorities if any KRI threshold is crossed.

Quotes:

“Incidence Response: Developing a comprehensive incident response plan that outlines the steps to be taken in the event of non-compliance. Incident detection should leverage automated mechanisms and human review, and non-sensitive incident information should be shared with applicable government bodies. We plan for our response protocols to focus on rapid remediation to minimize unintended harmful outputs from models. Depending on the nature and severity of the incident, this might involve implementing immediate containment measures restricting access to the model either externally, internally or both.” (p. 11)

“We will maintain detailed documentation for G42’s most capable models, including design decisions, testing results, risk assessments, and incident reports.” (p. 11)

“Examples of foundational safeguards that may be applied include: Model Cards: Documents published alongside each new model deployment, summarising the model’s intended use cases, performance on public benchmarks, and the responsible practices conducted to ensure safety.” (p. 7)

3.2.3.2 Commitment to non-interference with findings (15%) 0%

No commitment to permitting the reports, which detail the results of external evaluations (i.e. any KRI or KCI assessments conducted by third parties), to be written independently and without interference or suppression.

Quotes:

No relevant quotes found.

3.2.4 Monitoring for novel risks (10%) 25%

3.2.4.1 Identifying novel risks post-deployment: engages in some process (post deployment) explicitly for identifying novel risk domains or novel risk models within known risk domains (50%) 25%

There is a clear emphasis on identifying novel risks. However, no explicit process for uncovering novel risks post-deployment, in the deployment context, is detailed. They indicate that post-deployment monitoring will take place, which could be built upon to detect novel risks. They do note that asynchronous monitoring aims to find “unusual behaviors”; for an improved score, more detail could be added on how exactly they anticipate their monitoring setup will detect novel risks.

The emphasis on “near miss” incidents as a mechanism to trigger expanded monitoring of other risk domains aligns well with this criterion; partial credit is given here. However, to improve, near misses should be proactively sought out, rather than relying on reactive recognition of near accidents.

Quotes:

“This Framework emphasizes proactive risk identification and mitigation, centering on capability monitoring, robust governance, and multi-layered safeguards to ensure powerful AI models are both innovative and safe. With a systematic approach to early threat detection and risk management, it aims to support G42 in unlocking the benefits of frontier AI safely and ethically.” (p. 3)

“We plan to integrate decisions on whether to expand our monitoring to include additional hazardous capabilities into our regular framework review process. This includes both our scheduled internal reviews and our annual reviews by third parties. In making these decisions, we expect to consider factors such as: “near miss” incidents, whether internal or industry-wide; recommendations from trusted external experts; as well as changes in industry standards for AI risk management.” (p. 4)

“Asynchronous Monitoring: This offcycle review catches anomalies missed in real-time, assessing all stored interactions for unusual behaviors.” (p. 8)

3.2.4.2 Mechanism to incorporate novel risks identified post-deployment (50%) 25%

Whilst they mention a mechanism for including novel risks via the regular framework review process, there is no mechanism defined to incorporate novel risks into the risk modeling itself. To improve, discovery of a changed risk profile or novel risk domain should trigger risk modeling exercises for all existing capabilities, or at least those likely to be affected. They do mention an intent to incorporate risks such as advanced manipulation in future; a mechanism for deciding when to incorporate this as a risk would be an improvement.

Quotes:

“In the future, we will map out other hazardous capabilities to consider monitoring. We may also add thresholds for:

Autonomous Operation: When an AI system can make unsupervised decisions with critical implications, particularly in sectors such as healthcare or defense.
Advanced Manipulation: Applicable when AI systems can influence human behavior or decisions on a large scale, warranting enhanced monitoring and usage restrictions.

We plan to integrate decisions on whether to expand our monitoring to include additional hazardous capabilities into our regular framework review process. This includes both our scheduled internal reviews and our annual reviews by third parties. In making these decisions, we expect to consider factors such as: “near miss” incidents, whether internal or industry-wide; recommendations from trusted external experts; as well as changes in industry standards for AI risk management.” (p. 4)

4. Risk Governance

Moderate 42%

4.1 Decision-making (25%) 60%

4.1.1 The company has clearly defined risk owners for every key risk identified and tracked (25%) 0%

No mention of risk owners.

Quotes:

No relevant quotes found.

4.1.2 The company has a dedicated risk committee at the management level that meets regularly (25%) 90%

The company has a Frontier AI Governance Board that oversees operations.

Quotes:

“A dedicated Frontier AI Governance Board, composed of our Chief Responsible AI Officer, Head of Responsible AI, Head of Technology Risk, and General Counsel, shall oversee all frontier model operations, reviewing safety protocols, risk assessments, and escalation decisions.” (p. 11)

4.1.3 The company has defined protocols for how to make go/no-go decisions (25%) 75%

The framework outlines clear decision-making protocols.

Quotes:

“If a given G42 model achieves lower performance on relevant open-source benchmarks than a model produced by an outside organization that has been evaluated to be definitively below the capability threshold, then such G42 model will be presumed to be below the capability threshold.” (p. 4)

4.1.4 The company has defined escalation procedures in case of incidents (25%) 75%

The framework outlines clear incident response protocols.

Quotes:

“Incidence Response: Developing a comprehensive incident response plan that outlines the steps to be taken in the event of non-compliance. Incident detection should leverage automated mechanisms and human review, and non-sensitive incident information should be shared with applicable government bodies.” (p. 11)
“We plan for our response protocols to focus on rapid remediation to minimize unintended harmful outputs from models. Depending on the nature and severity of the incident, this might involve implementing immediate containment measures restricting access to the model either externally, internally or both.” (p. 11)

4.2. Advisory and Challenge (20%) 25%

4.2.1 The company has an executive risk officer with sufficient resources (16.7%) 25%

The framework does not mention a risk officer, but mentions the existence of several adjacent roles.

Quotes:

“A dedicated Frontier AI Governance Board, composed of our Chief Responsible AI Officer, Head of Responsible AI, Head of Technology Risk, and General Counsel, shall oversee all frontier model operations, reviewing safety protocols, risk assessments, and escalation decisions.” (p. 11)

4.2.2 The company has a committee advising management on decisions involving risk (16.7%) 0%

No mention of an advisory committee.

Quotes:

No relevant quotes found.

4.2.3 The company has an established system for tracking and monitoring risks (16.7%) 25%

The framework includes mentions of how risks are continuously tracked.

Quotes:

“G42 will conduct evaluations throughout the model lifecycle to assess whether our models are approaching Frontier Capability Thresholds.” (p. 4)
“In addition to pre-deployment evaluations, post-deployment monitoring will also be used to indicate whether G42’s models have reached capability thresholds and whether increased deployment mitigation and security mitigation levels are required.” (p. 5)

4.2.4 The company has designated people that can advise and challenge management on decisions involving risk (16.7%) 50%

The framework includes a dedicated Frontier AI Governance Board, which can be assumed to play an advise-and-challenge role.

Quotes:

4.2.5 The company has an established system for aggregating risk data and reporting on risk to senior management and the Board (16.7%) 50%

The framework clearly states that risk information will be reported to the Board and senior management.

Quotes:

4.2.6 The company has an established central risk function (16.7%) 0%

No mention of a central risk function.

Quotes:

No relevant quotes found.

4.3 Audit (20%) 70%

4.3.1 The company has an internal audit function involved in AI governance (50%) 50%

The framework does not mention an internal audit function, but uniquely mentions that independent internal audits will take place.

Quotes:

“Annual Governance Audits: G42 will have independent internal audits to verify compliance with our policy.” (p. 12)

4.3.2 The company involves external auditors (50%) 90%

The framework uniquely includes mentions of external audits.

Quotes:

“Third-Party Experts: As deemed appropriate, we will solicit external expert advice for capability and safeguards assessments. This may include partnering with private or civil society organisations with expertise in AI risk management to provide input on our assessments plans and/or internal capability reports ahead of deployment decisions.” (p. 12)
“External Audits: To reinforce accountability, G42 will engage in annual external audits to verify compliance with the Framework.” (p. 12)

4.4 Oversight (20%) 0%

4.4.1 The Board of Directors of the company has a committee that provides oversight over all decisions involving risk (50%) 0%

No mention of a Board risk committee.

Quotes:

No relevant quotes found.

4.4.2 The company has other governing bodies outside of the Board of Directors that provide oversight over decisions (50%) 0%

No mention of any additional governance bodies.

Quotes:

No relevant quotes found.

4.5 Culture (10%) 47%

4.5.1 The company has a strong tone from the top (33.3%) 50%

The framework includes clear statements on risk responsibilities.

Quotes:

“As a leader in AI innovation, G42 is committed to developing AI systems that align with its principles that prioritize fairness, reliability, safety, privacy, security and inclusiveness to reflect and uphold societal values.” (p. 3)
“This Framework emphasizes proactive risk identification and mitigation, centering on capability monitoring, robust governance, and multi-layered safeguards to ensure powerful AI models are both innovative and safe.” (p. 3)

4.5.2 The company has a strong risk culture (33.3%) 0%

No mention of elements of risk culture.

Quotes:

No relevant quotes found.

4.5.3 The company has a strong speak-up culture (33.3%) 90%

The framework clearly states whistleblower mechanisms.

Quotes:

“Reporting Mechanisms: To foster a proactive safety culture, clearly defined channels for reporting security incidents and compliance issues will be established. This includes creating mechanisms for employees to anonymously report potential concerns of non-compliance and ensuring that these reports are promptly addressed.” (p. 12)

4.6 Transparency (5%) 72%

4.6.1 The company reports externally on what their risks are (33.3%) 50%

The framework clearly states which risks are in scope.

Quotes:

“An initial list of potentially hazardous AI capabilities which G42 will monitor for is: Biological Threats: When an AI’s capabilities could facilitate biological security threats, necessitating strict, proactive measures. Offensive Cybersecurity: When an AI’s capabilities could facilitate cybersecurity threats, necessitating strict, proactive measures.” (p. 4)

4.6.2 The company reports externally on what their governance structure looks like (33.3%) 75%

The framework includes very clear details on the governance responsibilities of the Governance Board.

Quotes:

“Public Disclosure: G42 will publish non-sensitive, up-to-date and active copies of the Framework. We will share more detailed information with the UAE Government and relevant policy stakeholders.” (p. 12)
“G42 will publish an annual transparency report detailing its approach to frontier models, sharing key insights and fostering public trust.” (p. 12)
“A dedicated Frontier AI Governance Board, composed of our Chief Responsible AI Officer, Head of Responsible AI, Head of Technology Risk, and General Counsel, shall oversee all frontier model operations, reviewing safety protocols, risk assessments, and escalation decisions. Responsibilities of the Frontier AI Governance Board include, but are not limited to:

Framework Oversight
Evaluating Model Compliance
Investigation
Incidence Response” (p. 11)

“An annual external review of the Framework will be conducted to ensure adequacy, continuously benchmarking G42’s practices against industry standards. G42 will conduct more frequent internal reviews, particularly in accordance with evolving standards and instances of enhanced model capabilities. G42 will proactively engage with government agencies, academic institutions, and other regulatory bodies to help shape emerging standards for frontier AI safety, aligning G42’s practices with evolving global frameworks. Changes to this Framework will be proposed by the Frontier AI Governance Board and approved by the G42 Executive Leadership Committee.” (p. 12)

4.6.3 The company shares information with industry peers and government bodies (33.3%) 90%

The framework includes many different bodies, including authorities and peers, with whom information will be shared.

Quotes:

“Threat Intelligence and Information Sharing: G42 will share threat intelligence with industry partners to address common challenges and emerging risks.” (p. 12)
“We will share more detailed information with the UAE Government and relevant policy stakeholders.” (p. 12)
“G42 will actively participate in forums to set industry standards and share best practices for frontier model safety.” (p. 12)
“Non-sensitive incident information should be shared with applicable government bodies.” (p. 11)
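
To see how the six subsections combine into the overall Risk Governance score, the sketch below lists each subsection's contribution in percentage points. The weights and scores are the ones stated in the headings of this section; the variable names are illustrative, and the whole-percent rounding of the final figure is an assumption.

```python
# (subsection, weight %, score %) taken from the headings 4.1 to 4.6.
governance = [
    ("Decision-making", 25, 60),
    ("Advisory and Challenge", 20, 25),
    ("Audit", 20, 70),
    ("Oversight", 20, 0),
    ("Culture", 10, 47),
    ("Transparency", 5, 72),
]

# Points each subsection contributes to the section total.
contributions = {name: weight * score / 100 for name, weight, score in governance}
total = sum(contributions.values())

for name, points in contributions.items():
    print(f"{name}: {points:.1f}")
print(f"Risk Governance: {total:.1f}")  # Risk Governance: 42.3
```

The total of about 42.3 agrees with the 42% shown for the section. Oversight, with a 20% weight and a score of 0%, contributes nothing, while Decision-making and Audit together account for 29 of the 42 points.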
