GPT-4o

Generative Pre-trained Transformer 4 Omni (GPT-4o)
Developer(s): OpenAI
Initial release: May 13, 2024
Predecessor: GPT-4 Turbo
Type
License: Proprietary
Website: openai.com/index/hello-gpt-4o

GPT-4o (GPT-4 Omni) is a multilingual, multimodal generative pre-trained transformer designed by OpenAI. It was announced by OpenAI's CTO Mira Murati during a live-streamed demo on 13 May 2024 and released the same day.[1] GPT-4o is free to use, with a usage limit that is five times higher for ChatGPT Plus subscribers.[2] It can process and generate text, images and audio.[3] Its API is twice as fast as, and half the price of, its predecessor, GPT-4 Turbo.[1]

Background

GPT-4o was originally shadow-launched on the Large Model Systems Organization (LMSYS) platform as three different models, named gpt2-chatbot, im-a-good-gpt2-chatbot, and im-also-a-good-gpt2-chatbot.[4] On 7 May 2024, Sam Altman tweeted "im-a-good-gpt2-chatbot", which was commonly interpreted as confirmation that these were new OpenAI models being A/B tested.[5][6]

Capabilities

GPT-4o achieved state-of-the-art results in voice, multilingual, and vision benchmarks, setting new records in audio speech recognition and translation.[7][8] GPT-4o scored 88.7 on the Massive Multitask Language Understanding (MMLU) benchmark, compared with 86.5 for GPT-4.[9] Unlike GPT-3.5 and GPT-4, which rely on other models to process sound, GPT-4o natively supports voice-to-voice interaction, which makes its responses nearly instantaneous.[9] Sam Altman noted on 15 May 2024 that GPT-4o's voice-to-voice capabilities were not yet integrated into ChatGPT, and that the old version was still being used.[10]

The model supports over 50 languages,[1] which OpenAI claims cover over 97% of speakers.[11] Mira Murati demonstrated the model's multilingual capability by speaking Italian to the model and having it translate between English and Italian during the live-streamed OpenAI demo event on 13 May 2024. In addition, the new tokenizer uses fewer tokens for certain languages, especially those that are not based on the Latin alphabet, making the model cheaper to use for those languages.[9]

GPT-4o has knowledge up to October 2023[12][13] and a context length of 128k tokens,[12] with output capped at 2,048 tokens.[13]
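
The arithmetic behind these limits can be sketched as follows. This is an illustration only: it assumes that "128k" means 128,000 tokens and that prompt and generated tokens share the same context window, as is common for such APIs. The function name `max_prompt_tokens` is hypothetical and not part of any OpenAI library.

```python
# Figures taken from the text above; reading "128k" as 128,000 is an assumption.
CONTEXT_WINDOW = 128_000
MAX_OUTPUT_TOKENS = 2_048


def max_prompt_tokens(requested_output: int,
                      context_window: int = CONTEXT_WINDOW,
                      output_cap: int = MAX_OUTPUT_TOKENS) -> int:
    """Return how many prompt tokens fit once room for the reply is reserved.

    Assumes prompt and reply share one context window (hypothetical helper).
    """
    if requested_output < 0:
        raise ValueError("requested_output must be non-negative")
    # The reply can never exceed the output cap, so reserve at most that much.
    reserved = min(requested_output, output_cap)
    return context_window - reserved


if __name__ == "__main__":
    # Reserving the full output cap leaves the rest of the window for the prompt.
    print(max_prompt_tokens(2_048))
```

Under these assumptions, asking for more output than the cap allows does not shrink the prompt budget further, because the reservation is clamped to 2,048 tokens.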

As of May 2024, it is the leading model in the Large Model Systems Organization (LMSYS) Elo Arena Benchmarks by the University of California, Berkeley.[14]
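
For readers unfamiliar with Elo ratings, the sketch below shows the classic Elo update applied to a single head-to-head comparison between two models. It is a generic illustration, not the leaderboard's own computation, which may differ. The starting ratings, the K-factor of 32, and the function names are all assumptions chosen for the example.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the classic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, score_a: float,
                   k: float = 32.0) -> tuple[float, float]:
    """Return new (A, B) ratings after one comparison.

    score_a is 1.0 if A's answer was preferred, 0.0 if B's was, 0.5 for a tie.
    k (assumed to be 32 here) controls how far one result moves the ratings.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's actual and expected scores are the complements of A's.
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


if __name__ == "__main__":
    # Two hypothetical models start level; A's answer is preferred once.
    print(update_ratings(1500.0, 1500.0, 1.0))
```

Because the two adjustments are equal and opposite, the total rating in the pool is conserved, and a win over a much higher-rated opponent moves the ratings more than a win over a lower-rated one.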

Scarlett Johansson controversy

As released, GPT-4o offered five voices: Breeze, Cove, Ember, Juniper and Sky. A similarity between the voice of Scarlett Johansson and Sky was quickly noticed. On May 14, Entertainment Weekly questioned whether the likeness was intentional.[15] On May 18, Johansson's husband, Colin Jost, joked about the similarity in a segment on Saturday Night Live (SNL).[16] On May 20, 2024, OpenAI disabled the Sky voice, issuing a statement saying "We've heard questions about how we chose the voices in ChatGPT, especially Sky. We are working to pause the use of Sky while we address them."[17]

American actress Scarlett Johansson starred in Spike Jonze's sci-fi movie Her in 2013, playing the role of Samantha, an artificially intelligent virtual assistant personified through a female voice. As part of the promotion leading up to the release of GPT-4o, Sam Altman on May 13 tweeted a single word: "her".[18][19]

OpenAI claimed that each voice was based on the voice work of a hired actor. Specifically, OpenAI claimed "Sky’s voice is not an imitation of Scarlett Johansson but belongs to a different professional actress using her own natural speaking voice."[17] CTO Mira Murati stated "I don't know about the voice. I actually had to go and listen to Scarlett Johansson's voice." OpenAI further claimed that the voice talent was recruited before the company reached out to Johansson.[19]

On May 21, Johansson issued a statement explaining that OpenAI had repeatedly offered her a deal for permission to use her voice, beginning as early as nine months before the release, and that she had rejected it. She said she was "shocked, angered and in disbelief that Mr. Altman would pursue a voice that sounded so eerily similar to mine that my closest friends and news outlets could not tell the difference." In the statement, Johansson also used the incident to draw attention to the lack of legal safeguards around the use of creative work to power leading AI tools, and her legal counsel demanded that OpenAI detail how the Sky voice was created.[19][20]

Observers noted similarities to Johansson's earlier dispute with The Walt Disney Company, which she sued for breach of contract over the direct-to-streaming rollout of her Marvel film Black Widow.[21] That case ended in a settlement widely speculated to have netted her around $40 million.[22]

References

  1. ^ a b c Wiggers, Kyle (2024-05-13). "OpenAI debuts GPT-4o 'omni' model now powering ChatGPT". TechCrunch. Retrieved 2024-05-13.
  2. ^ Field, Hayden (2024-05-13). "OpenAI launches new AI model GPT-4o and desktop version of ChatGPT". CNBC. Retrieved 2024-05-14.
  3. ^ Claburn, Thomas. "OpenAI unveils GPT-4o, a fresh multimodal AI flagship model". The Register. Retrieved 2024-05-18.
  4. ^ Edwards, Benj (2024-05-13). "Before launching, GPT-4o broke records on chatbot leaderboard under a secret name". Ars Technica. Retrieved 2024-05-17.
  5. ^ Altman, Sam (2024-05-07). "im-a-good-gpt2-chatbot". X (formerly Twitter). https://twitter.com/sama/status/1787222050589028528. Retrieved 2024-05-14.
  6. ^ Zeff, Maxwell (2024-05-07). "Powerful New Chatbot Mysteriously Returns in the Middle of the Night". Gizmodo. Retrieved 2024-05-17.
  7. ^ van Rijmenam, Mark (2024-05-13). "OpenAI Launched GPT-4o: The Future of AI Interactions Is Here". The Digital Speaker. Retrieved 2024-05-17.
  8. ^ Daws, Ryan (2024-05-14). "GPT-4o delivers human-like AI interaction with text, audio, and vision integration". AI News. Retrieved 2024-05-18.
  9. ^ a b c "Hello GPT-4o". OpenAI.
  10. ^ "OpenAI GPT-4o: How to access GPT-4o voice mode; insights from Sam Altman". The Times of India. 2024-05-16. ISSN 0971-8257. Retrieved 2024-05-18.
  11. ^ Edwards, Benj (2024-05-13). "Major ChatGPT-4o update allows audio-video talks with an 'emotional' AI chatbot". Ars Technica. Retrieved 2024-05-17.
  12. ^ a b "Models - OpenAI API". OpenAI. Retrieved 2024-05-17.
  13. ^ a b Conway, Adam (2024-05-13). "What is GPT-4o? Everything you need to know about the new OpenAI model that everyone can use for free". XDA Developers. Retrieved 2024-05-17.
  14. ^ Franzen, Carl (2024-05-13). "OpenAI announces new free model GPT-4o and ChatGPT for desktop". VentureBeat. Retrieved 2024-05-18.
  15. ^ Stenzel, Wesley (May 14, 2024). "ChatGPT launching talking AI that sounds exactly like Scarlett Johansson in 'Her' — on purpose?". Entertainment Weekly. Retrieved 2024-05-21.
  16. ^ Caruso, Nick (2024-05-20). "Scarlett Johansson Says She Was 'Shocked, Angered and in Disbelief' After Hearing ChatGPT Voice That Sounds Like Her — Read Statement". TVLine. Retrieved 2024-05-21.
  17. ^ a b "How the voices for ChatGPT were chosen". OpenAI. 2024-05-19.
  18. ^ Altman, Sam (2024-05-13). "her". X (formerly Twitter). Retrieved 2024-05-21.
  19. ^ a b c Allyn, Bobby (2024-05-20). "Scarlett Johansson says she is 'shocked, angered' over new ChatGPT voice". NPR.
  20. ^ Mickle, Tripp (2024-05-20). "Scarlett Johansson Said No, but OpenAI's Virtual Assistant Sounds Just Like Her". The New York Times. ISSN 0362-4331. Retrieved 2024-05-21.
  21. ^ "Scarlett Johansson took on Disney. Now she's battling OpenAI over a ChatGPT voice that sounds like hers". Yahoo Finance. 2024-05-21. Retrieved 2024-05-21.
  22. ^ Pulver, Andrew (2021-10-01). "Scarlett Johansson settles Black Widow lawsuit with Disney". The Guardian. ISSN 0261-3077. Retrieved 2024-05-21.