Companies Are Being Reckless With AI

Today I want to talk about a problem with the incentive structure under which powerful AI is being developed.

Companies are incentivized to be the first to develop new technology since they’ll benefit from the first-mover advantage. The problem is that it’s cheaper and easier to develop powerful unsafe AI than powerful safe(r) AI. So companies are economically incentivized to neglect AI safety. I have been thinking about this issue for a while and now we have a very concrete example of it.

Microsoft’s Bing Chat was released very quickly, almost certainly to prevent competitors from releasing something similar first. Microsoft didn’t share its training methodology for Bing Chat, so all we can do is speculate. But there are strong indicators that Bing Chat wasn’t trained using Reinforcement Learning from Human Feedback (RLHF) despite RLHF seeming to yield safer AI than other methods at the moment.

Microsoft had clear monetary incentives not to use RLHF, since it takes more time and money to implement than other techniques. So naturally, Bing Chat was less aligned and apparently less safe than OpenAI’s ChatGPT, which is itself already misaligned and thus unsafe. I think Microsoft has since improved Bing Chat’s alignment, but their past actions still set a very dangerous precedent.

Companies that keep developing increasingly powerful AI while disregarding safety pose an existential threat to humanity. They can’t be allowed to continue.

In my opinion, the core problem we’re coming up against is that we as a species have no way to limit or even slow down technological development, even when we need to. When new technology comes out, such as the internal-combustion engine, personal computers, smartphones, social media, artificial intelligence, nuclear weapons, etc., it tends to get adopted first and reined in later. We are more reactive than proactive when it comes to technology.

But there is no reacting to artificial general intelligence. It’s going to be smarter and faster than us at making decisions. We have to be proactive. We either have to figure out how to make AGI safe before we figure out how to make AGI at all, or we have to get into a position where the only entities capable of creating AGI won’t activate it unless it’s provably safe. To pull this off, we need global cooperation.

I’m aware that neither of those options is easy to pull off, but it’s hard to see an alternative.

Note: The same logic could apply to nations if there is indeed an AI arms race.