UK AI Safety Institute: a project for the future of AI?


The UK Frontier AI Taskforce, a government-funded initiative launched in April 2023 as the Foundation Model Taskforce, is evolving into the UK AI Safety Institute.

British Prime Minister Rishi Sunak announced the establishment of the Institute during his closing speech at the AI Safety Summit, held in Bletchley Park, England, on November 2, 2023.

He said the UK government’s ambition for the new entity is to make it a global platform responsible for testing the safety of emerging types of AI.

The Institute will carefully test new types of cutting-edge AI before and after their release to assess the potentially dangerous capabilities of AI models, exploring all risks, from social harms such as bias and misinformation, “to the most unlikely but extreme risks, such as humanity completely losing control of AI,” the British government said in a public statement.

To further this mission, the UK’s AI Safety Institute will partner with national organizations such as the Alan Turing Institute, Imperial College London, TechUK and the Startup Coalition, all of which welcomed the launch of the Institute.

It will also collaborate with private AI companies in the UK and overseas. Some of them, like Google DeepMind and OpenAI, have already publicly supported this initiative.

Partnerships confirmed with the United States and Singapore

Sunak added that the Institute will be at the forefront of the UK government’s AI strategy and will be tasked with cementing the country’s position as a world leader in AI safety.

In taking on this role, the UK’s AI Safety Institute will partner with similar institutions in other countries.

The Prime Minister has already announced two confirmed partnerships to collaborate on AI safety testing with the recently announced US AI Safety Institute and the Singapore government.

Read more: AI Safety Summit: Biden-Harris Administration Launches US AI Safety Institute

Ian Hogarth, chair of the Frontier AI Taskforce, will continue to chair the Institute. The task force’s external advisory board, made up of industry heavyweights ranging from national security to IT, will now advise the new global hub.

Nine AI companies agree to test their models before deployment

In addition, Sunak announced that several countries, including Australia, Canada, France, Germany, Italy, Japan, Korea, Singapore, the United States, the United Kingdom and the European Union, had signed an agreement to test the leading companies’ AI models.

To contribute to this mission, nine companies involved in AI development – Amazon Web Services (AWS), Anthropic, Google, Google DeepMind, Inflection AI, Meta, Microsoft, Mistral AI and OpenAI – have agreed to provide deeper access to their future AI models before they are made public.


However, the advocacy group Pause AI warned that it is dangerous to rely only on pre-deployment testing.

The reasons given are:

  • Models may be leaked (e.g. Meta’s LLaMA model).
  • It is difficult to test for dangerous capabilities. “We don’t know how to test (safely) whether an AI can self-replicate, for example. Or how to test if it fools humans,” Pause AI said.
  • Bad actors can still create dangerous AI, and pre-deployment testing cannot prevent this from happening.
  • Some capabilities are dangerous even inside AI labs. “A self-replicating AI, for example, could escape the laboratory before deployment,” Pause AI wrote.
  • Capabilities can be added or discovered after training, including through fine-tuning, jailbreaking, and runtime improvements.

Read more: 28 countries sign the Bletchley Declaration on Responsible AI Development
