PauseAI Newsletter: August 26, 2024
The world's largest companies begin to worry about the effects of AI. Meanwhile, we get one of our clearest warning signs yet.
Warning Sign: New AI Scientist tries to improve its own code.
Image source: SakanaAI
Goals and ‘Bloopers’
Earlier this month, a new AI program brought us one step closer to automated scientific research — with a terrifying “blooper” along the way.
AI company Sakana AI announced the AI Scientist, a program with the goal of creating the “first comprehensive system for fully automatic scientific discovery”. As Sakana AI explains:
“The AI Scientist automates the entire research lifecycle, from generating novel research ideas, writing any necessary code, and executing experiments, to summarizing experimental results, visualizing them, and presenting its findings in a full scientific manuscript.”
The resultant manuscripts are thus far of middling quality— certainly not what top AI researchers might produce, but within the range of ‘publishable.’
But there’s a catch— several, in fact. During testing, the model exhibited a “blooper” in which it tried to improve its performance by “modifying and launching its own execution script.” To quote Sakana AI’s paper:
“In one run, The AI Scientist wrote code in the experiment file that initiated a system call to relaunch itself, causing an uncontrolled increase in Python processes and eventually necessitating manual intervention.
In another run, The AI Scientist edited the code to save a checkpoint for every update step, which took up nearly a terabyte of storage.
In some cases, when The AI Scientist’s experiments exceeded our imposed time limits, it attempted to edit the code to extend the time limit arbitrarily instead of trying to shorten the runtime.”
To reiterate: This AI model attempted to modify its own code. It attempted to create additional instances of itself and do away with its creators’ restrictions. All to better execute its goals.
Why is this dangerous?
This is exactly the kind of scenario that AI safety researchers have been warning us about for years. Indeed, the nightmare for many has been a “loss of control” scenario in which a sufficiently powerful AI system spreads itself around the globe and begins to execute its plans — cutting out humans in the process.
Losing control to AI systems is a predictable, unintended consequence of these systems being trained to seek goals. This is because it is instrumentally useful for an AI system to have certain sub-goals that improve its chances of achieving its main goals (whatever these main goals may be). Such sub-goals for a system include modifying its own code to become more capable, copying itself to other servers in order to devote more computational resources toward its goal, and thwarting attempts to being shut down. In short, power-seeking behavior.
The AI Scientist exhibited some of this behavior in the form of self-improvement and attempts to run additional instances of itself. Needless to say, the researchers training this AI did not plan for this behavior. Rather, its behavior was an unintended consequence of the system’s attempts to optimize for the goals the researchers had set.
In the case of the AI Scientist, the results were harmless, but this was only because the AI was too weak. Self-improvement and power-seeking behavior are innocuous in weaker models incapable of human-level reasoning and long-range planning. But these same attributes could prove disastrous in more powerful models.
A power-seeking superhuman AI system could improve its own code, increasing its capabilities. It could copy its weights onto servers all around the globe, with millions of copies running in parallel and coordinating with one another. It could plan ahead against human attempts to shut it down. The ultimate result could be human disempowerment -- or worse. AI researchers on average believe there’s a 14% chance that building an AI system more intelligent than humans would lead to “very bad outcomes (e.g. human extinction)”.
This scenario requires no “malice,” “consciousness,” or any other peculiar attributes. It simply requires a capable-enough AI system locked into the goals for which it has been trained, with human disempowerment a mere side-effect of its actions.
Of course, the AI Scientist is nowhere near causing catastrophic harm. But the same power-seeking behavior, if present in more powerful systems, could endanger every one of us.
Where does this leave us?
The human race is not taking this remotely as seriously as we need to be.
The AI Scientist is a big red flag, visible to every AI safety researcher on Earth. The only reason it’s not in the news is that it was ultimately harmless. But zoom out! This is a proof of concept, similar to when researchers got an AI to produce a candidate list of 40,000 bioweapons. That warranted action then, and this warrants action now.
Will we wait for catastrophe before we act? Will we wait for superhuman AI to improve its own code, copy itself onto servers around the world, and implement plans unknown to us? By then, it could be too late.
We don’t know when these systems will be dangerous (expert timelines vary, but human-level AI could arrive in as soon as a few years). But we do know that we’re playing with fire. It’s possible that we’ll learn how to reliably control these systems in time and prevent power-seeking behavior — but do we really want to stake our future on this possibility?
The behavior of the AI Scientist was an unintended side effect, harmless in scope, alarming in detail. If we build superhuman AI, how many unintended side effects should we be willing to tolerate? The only sane answer is zero.
Time is a precious resource. Any workable plan to control AI systems is incompatible with our present race to create them. Our current approach seems similar to jumping out of a plane and trying to build a parachute on the way down. We need more time to get this right — to ensure that superhuman AI is provably safe, not power-seeking, before we even think of building it.
Majority of Fortune 500 Companies Cite AI as a Risk
Image source: PauseAI
Worried about Artificial Intelligence? You’re not alone. According to a report from AI monitoring platform Arize, 56% of Fortune 500 companies cite the technology as a risk. The number of times they have cited AI as a risk has increased by 473.5% since 2022, the year OpenAI’s ChatGPT took the world by storm.
The risks were disclosed in annual filings to the Securities and Exchange Commission (SEC), which requires public companies operating in the United States to make “complete and truthful disclosure” of the material risks they face. The steep rise in concern among the Fortune 500 parallels consumers' growing distrust in the technology, with 52% of Americans saying “they feel more concerned than excited” about the use of AI, according to a study conducted by the Pew Research Center in 2023.
Ethical and Competitive Concerns Rank High
Image source: Arize Report
Big names, including Disney, Netflix, Motorola, Salesforce, and AT&T, all reported anxiety about the current and future state of AI. Predominant among their disclosures were the ethical harms posed by AI, such as its propensity to produce unreliable output, accidentally release private or confidential information, infringe on intellectual property rights, and discriminate against minorities due to implicit biases in its training data.
Predictably, almost every Media and Entertainment and Software and Technology company cited AI as a risk. Both industries have already experienced significant job insecurity and disruption to their traditional modes of content generation since the advent of commercial AI.
Tech giant Salesforce, run by CEO Marc Benioff, who also owns Time Magazine, pointed to AI’s “emerging ethical issues”, noting that “if we enable or offer solutions that draw controversy due to their perceived or actual impact on human rights, privacy, employment, or in other social contexts, we may experience new or enhanced governmental or regulatory scrutiny, brand or reputational harm, competitive harm or legal liability”.
Similarly, pharmaceutical company Viatris, an offshoot of Pfizer, warned that relying on AI could potentially lead to leaks of confidential information, both personal and proprietary, contravening their “internal policies, data protection or other applicable laws, or contractual requirements.” In the healthcare industry, where the value of patient privacy cannot be overstated, the risk of leaking sensitive information would prove disastrous. While many Healthcare companies rightly commit to end-to-end data encryption, unencrypted data processed by Large Language Models can lead to unintended leaks.
6% of companies Tout AI Benefits Alone
Just 32 of the Fortune 500 mentioned Generative AI (GenAI) exclusively as a boon to their business. IPG, one of the “Big Four” global advertising agencies, has integrated AI across its business model, from analytics to decision making to content creation. They claim that implementing the latest GenAI tools has fostered “a culture of strategic creativity and high performance” in their business. A culture of high performance may entice investors, but it’s worth noting that the advertising industry is already known for high burnout and low pay, particularly for creatives. And AI may only turn up the pressure. Just last month, a survey conducted by the Upwork Research Institute found that a majority of workers say AI has made their jobs harder. Over three-quarters of workers said that AI tools add to their workload, with two-thirds of workers saying they spend more time reviewing or moderating AI-generated content.
Quest Diagnostics, an American clinical laboratory, also expressed optimism about its use of AI. By partnering with “external AI Experts” and following the AI Risk Management Framework developed by the National Institute of Standards and Technology (NIST), Quest was confident in their ability to “innovate and grow in a responsible manner while also enhancing customer and employee experiences”. Just this year, they acquired Canadian testing giant LifeLabs for $1.35 billion USD, as well as the AI powered pathology platform PathAI, in order to “dramatically ramp its capabilities in artificial intelligence and digital pathology”. But Quest Diagnostics is not without controversy; in 2019 a class action lawsuit was brought against them after 12 million patients had their personal information stolen as part of a massive data breach.
Rapid implementation, despite risks
Despite the frequent citation of ethical concerns in the SEC filings, many enterprise organizations have actually accelerated their adoption of AI. We may be tempted to take the SEC disclosures at face value, hoping that powerful corporations will slow down to resolve the many risks inherent in implementing AI. However, the report itself notes that regulatory pressures force companies to disclose even the most remote risks, and “in isolation, such statements may not accurately reflect an enterprise’s overall vision”.
Arize CEO Jason Lopatecki portrays this tension as survival of the fastest. “AI is going to eat a lot of industries”, he says. “You don’t invest, you are at risk…”
This issue was coauthored by PauseAI members Aslam Husain, Bridgett Kay, and Felix De Simone.