
OpenAI releases GPT-4: more power, same problems


The relentless pace of AI development continues with OpenAI’s release of GPT-4, the Large Language Model that powers ChatGPT, Bing, and an increasingly large number of other apps.

The rapid homogenisation of the internet continues with OpenAI’s announcement of GPT-4, which is billed as, well, an iterative step ahead of GPT-3.5, the model that most people will have interacted with to date.

The headline new capability is the ability to respond to images. It will be able to generate recipe suggestions from a picture of assorted ingredients, for instance, as well as write captions and picture descriptions for uploaded graphics, which might be a boon for accessibility on the internet.

It can also digest far more text than 3.5 could, up to 25,000 words, allowing it to spit out summaries of that long report you really don’t want to bother yourself reading.

It is not going to be as freely available as 3.5 was, however. GPT-4 will initially be available only to $20-a-month ChatGPT Plus subscribers or to those using Microsoft's Bing search engine. Thank you for helping train our AI, O huddled masses, and goodbye.

All in all, though, it highlights the sheer speed of progress in the field. The research paper describing GPT was only published in 2018 and the company has already iterated through four full versions of it, the latest ones in public as it sought to widen the pool of trainers, set aside the initial concerns about malicious usage that had kept the model behind closed doors, and get monetisation going.

Users should not expect to be blown away by the new version, however. While the company is happy to point out its human-level performance on some professional and academic benchmarks (it passes a simulated bar exam with a score around the top 10% of test takers, whereas GPT-3.5’s score was around the bottom 10%), it also says that in a casual conversation the distinction between GPT-3.5 and GPT-4 “can be subtle. The difference comes out when the complexity of the task reaches a sufficient threshold—GPT-4 is more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5.”

The company is also very keen to address safety concerns. 

“We spent 6 months making GPT-4 safer and more aligned,” it writes. “GPT-4 is 82% less likely to respond to requests for disallowed content and 40% more likely to produce factual responses than GPT-3.5 on our internal evaluations.”

Which is all fine until you realise the sheer number of requests that will be passing over its servers every minute of every day when a service has 100 million active users a month: at that scale, even a small failure rate adds up to a great many failures. So, here’s the rather important disclaimer that also comes with the new release:

“GPT-4 still has many known limitations that we are working to address, such as social biases, hallucinations, and adversarial prompts.”

And this is still a huge potential problem. Yesterday, Microsoft confirmed that Bing has been running on GPT-4 for the past five weeks, and there are any number of stories about how people have been able to break its guardrails to produce entertaining and even rather alarming results. 

That’s not stopping its increasing adoption by other apps, however. Online language learning specialist Duolingo, payment processor Stripe, visual impairment aid Be My Eyes, and investment bank Morgan Stanley are all integrating it into different workflows in different ways. More are to come soon, as GPT-4 is going to be accessible as an API for developers to build on, with a waitlist that is admitting applicants from now.
