China's AI Safety Framework: Balancing Innovation, Control, and Global Influence

Source: carnegieendowment.org

Published on October 17, 2025 at 02:18 PM

China is trying to balance the immense potential of artificial intelligence with the need to control its risks. A new framework reveals how China plans to manage this balancing act, potentially shaping the future of technology worldwide.

Technical Standards as Guardrails

China's AI policy community sees technical standards and model evaluations as crucial tools for managing AI risk. These mechanisms, developed with leading technologists, aim to create guardrails without resorting to heavy-handed regulation.

China is investing heavily in both areas, though its evaluation system currently lags behind that of the United States. The framework offers guidance that could inform future technical standards or regulations to address these risks.

AI+ Plan and Underlying Fears

The Chinese Communist Party is promoting its AI+ Plan to upgrade the economy, society, and government. At the same time, the framework reflects underlying fears about controlling information, managing socioeconomic impacts, and preventing AI from escaping human control.

The term "safety" in Chinese can also mean "security," blurring the lines between extreme risks and broader security issues. The framework addresses catastrophic risks, including the potential misuse of AI for weapons development.

Risks of Open-Source Models

The new version focuses on governing open-source models, acknowledging the risks that come with openly released model weights. These risks include the downstream propagation of model defects and the potential for malicious use.

One proposed countermeasure is to exclude sensitive training data from high-risk fields such as nuclear and biological weapons development. The framework also emphasizes maintaining human control over advanced systems, suggesting mechanisms such as "circuit breakers" that can halt a system's operation.

Expert Insights and Cross-Organizational Effort

Experts see the discussion of extreme risks as a critical update, bringing existential threats into formal policy consideration. The AI Safety Governance Framework 2.0 builds on an earlier version to address developments such as the rise of open-source models, and it was a collaborative effort involving leading experts, research organizations, and companies such as Alibaba and Huawei.

By bridging regulatory, academic, and commercial stakeholders, the framework may serve as both a technical reference and a policy foundation.

Domestic Focus and Global Implications

The framework appears primarily domestic in focus, providing a roadmap for new technical standards. At the same time, it presents a "Chinese approach" to addressing AI risks that could influence global AI governance.

It also offers insight into the CCP's technology policy process, in which officials seek to understand an emerging technology before charting a regulatory path. The Cyberspace Administration of China guided the project, with contributions from TC260, China's national cybersecurity standards committee, and CNCERT/CC, its national computer emergency response team.

Comprehensive Governance Measures

The framework pairs specific risks with technical countermeasures and calls for building an AI safety assessment system. Creating an effective evaluation ecosystem is critical to advancing safety without resorting to heavy regulation.

China still weighs development opportunities more heavily than risks, but safety is gaining weight in its policy calculus. The framework articulates how experts in China understand AI risks and propose to mitigate them.

Labor Market Impacts and Societal Risks

Framework 2.0 also discusses labor market impacts and broader societal risks, noting a decline in demand for traditional labor. It further raises concerns about the erosion of humans' capacity for independent learning and research.

The framework states that version 1.0 aimed to implement the Global AI Governance Initiative and promote coordinated AI safety governance. Both versions were released with official English translations, suggesting an international audience is also in view.

Domestic or International Driver?

The key question is whether the framework is aimed primarily at international or domestic audiences, a distinction that shapes how it should be interpreted. If the focus is domestic, the framework reflects genuine Chinese thinking on AI risks and foreshadows future regulatory action.

TC260 released a draft "AI Safety Standards System (V1.0)" to implement the original framework. Some of those standards, including one covering the labeling of AI-generated content, have since been finalized and put into effect.

Updates and Sectoral Regulations

Framework 2.0 offers a more detailed discussion of risks and potential mitigations than the initial version, deepening the treatment of open-source models, employment effects, and catastrophic risks.

Much about the framework suggests it is directed at domestic actors, including its release during China Cybersecurity Week. It also aims to shape sectoral AI regulations by introducing a risk categorization and grading system.