ChatGPT eats cannibals
ChatGPT hype is starting to wane: Google searches for “ChatGPT” are down 40% from their April peak, and web traffic to OpenAI’s ChatGPT website has fallen almost 10% over the past month.
Some cooling off was only to be expected. However, GPT-4 users are also reporting that the model seems considerably dumber (but faster) than it was previously.
One theory is that OpenAI has broken the model up into multiple smaller models, each trained on a specific area, that act in tandem but not quite at the same overall level.
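Nothing about GPT-4’s internals is public, but the rumored design resembles a mixture-of-experts setup: a router dispatches each query to whichever smaller specialist model fits it best. A toy sketch of the idea (the experts, keywords and router below are all hypothetical):

```python
# Toy mixture-of-experts-style routing (hypothetical; GPT-4's internals
# are not public). Each "expert" stands in for a smaller model tuned
# for one domain, and a router picks which expert answers a prompt.

EXPERTS = {
    "code": lambda prompt: f"[code expert] answering: {prompt}",
    "math": lambda prompt: f"[math expert] answering: {prompt}",
    "general": lambda prompt: f"[general expert] answering: {prompt}",
}

KEYWORDS = {
    "code": {"python", "function", "bug", "compile"},
    "math": {"integral", "prove", "equation", "sum"},
}

def route(prompt: str) -> str:
    """Pick the expert whose keywords best match the prompt (toy router)."""
    words = set(prompt.lower().split())
    scores = {name: len(words & kws) for name, kws in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"

def answer(prompt: str) -> str:
    return EXPERTS[route(prompt)](prompt)

print(answer("fix this python function bug"))   # routed to the code expert
print(answer("what is the capital of france"))  # falls back to general
```

The trade-off such a design implies matches the user reports: each specialist can be cheaper and faster than one giant model, but a misrouted query gets a weaker answer.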
But a more intriguing possibility may also be playing a role: AI cannibalism.
The web is now swamped with AI-generated text and images, and this synthetic output gets scraped back up as training data for other AIs, creating a negative feedback loop. The more AI-generated data a model ingests, the worse its output gets in terms of coherence and quality. It’s a bit like making a photocopy of a photocopy: the image gets progressively worse.
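A toy simulation makes the photocopy analogy concrete. This is a sketch of the general mechanism, not any paper’s actual experiment: fit a Gaussian to data, sample from the fit while favoring high-probability outputs (as generative models tend to), and retrain each generation purely on the previous generation’s output:

```python
import random
import statistics

random.seed(0)

def truncated_samples(mu, sigma, n=1000, clip=2.0):
    """Sample a Gaussian but drop rare tail events beyond clip*sigma,
    mimicking a generative model that favors high-probability outputs."""
    out = []
    while len(out) < n:
        x = random.gauss(mu, sigma)
        if abs(x - mu) <= clip * sigma:
            out.append(x)
    return out

# Generation 0: "real" human data.
data = [random.gauss(0.0, 1.0) for _ in range(1000)]

for gen in range(1, 6):
    mu = statistics.fmean(data)
    sigma = statistics.stdev(data)
    print(f"gen {gen}: stdev={sigma:.3f}")  # diversity shrinks each loop
    # Retrain the next generation purely on the previous one's output.
    data = truncated_samples(mu, sigma)
```

The fitted standard deviation (a stand-in for output diversity) drops every generation, from about 1.0 toward 0.6 after just five loops: the rare, interesting tails of the distribution are the first thing to disappear.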
While GPT-4’s official training data cutoff is September 2021, it clearly knows a lot more than that, and OpenAI recently shuttered its web-browsing plugin, suggesting that newer, increasingly AI-generated web content has found its way into the model one way or another.
A new paper from researchers at Rice and Stanford universities came up with a cute acronym for the issue: Model Autophagy Disorder, or MAD.
“Our primary conclusion across all scenarios is that without enough fresh real data in each generation of an autophagous loop, future generative models are doomed to have their quality (precision) or diversity (recall) progressively decrease,” they said.
Essentially the models start to lose the more unique but less well-represented data, and harden up their outputs on less varied data, in an ongoing process. The good news is this means the AIs now have a reason to keep humans in the loop if we can work out a way to identify and prioritize human content for the models. That’s one of OpenAI boss Sam Altman’s plans with his eyeball-scanning blockchain project, Worldcoin.
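The flip side of the paper’s finding is a straightforward mitigation: guarantee a fraction of fresh, verified human data in every training round. A minimal sketch, assuming you already have some provenance signal to separate the two (the pools below are hypothetical):

```python
import random

def build_training_set(human_pool, synthetic_pool, size, human_fraction=0.5):
    """Mix verified human data with synthetic data at a fixed ratio,
    so each training generation gets fresh real data rather than
    feeding purely on model output. (Pools/provenance are hypothetical.)"""
    n_human = int(size * human_fraction)
    batch = random.sample(human_pool, n_human)
    batch += random.sample(synthetic_pool, size - n_human)
    random.shuffle(batch)
    return batch

# Usage: guarantee half of every batch is human-written text.
human_pool = [f"human doc {i}" for i in range(100)]
synthetic_pool = [f"synthetic doc {i}" for i in range(100)]
batch = build_training_set(human_pool, synthetic_pool, size=20)
```

The hard part, of course, is the provenance flag itself, which is exactly the problem proof-of-personhood schemes like Worldcoin claim to solve.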