| Original author(s) | OpenAI[1] |
| --- | --- |
| Initial release | May 28, 2020 (publication); June 11, 2020 (OpenAI API beta) |
| Predecessor | GPT-2 |
| Successor | GPT-3.5, GPT-4 |
| License | Proprietary |
| Website | openai.com |
Generative Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020.
Like its predecessor, GPT-2, it is a decoder-only[2] transformer deep neural network, an architecture that replaces recurrence- and convolution-based designs with a technique known as "attention".[3] This attention mechanism allows the model to focus selectively on the segments of input text it predicts to be most relevant.[4] GPT-3 has 175 billion parameters, each stored with 16-bit precision; since each parameter occupies 2 bytes, the full model requires 350 GB of storage. It has a context window of 2,048 tokens and has demonstrated strong "zero-shot" and "few-shot" learning abilities on many tasks.[2]
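The following is a minimal NumPy sketch of the generic scaled dot-product attention mechanism described above, including the causal mask used by decoder-only models such as GPT-3; the function name, shapes, and toy data are illustrative assumptions, not OpenAI's implementation:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Generic scaled dot-product attention over one sequence.

    Q, K, V: (seq_len, d_k) arrays of query, key, and value vectors.
    Returns a (seq_len, d_k) array of attention-weighted values.
    """
    d_k = Q.shape[-1]
    # Pairwise relevance scores between tokens, scaled by sqrt(d_k)
    # to keep the softmax well-conditioned.
    scores = Q @ K.T / np.sqrt(d_k)
    # Causal mask: in a decoder-only model, each token may attend only
    # to itself and earlier positions.
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    # Softmax over the key dimension turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted mix of the value vectors, i.e. the
    # model "focuses" on the tokens it scored as most relevant.
    return weights @ V

# Toy example: a context of 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
Q = K = V = rng.standard_normal((4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```

At full scale, the same computation runs over GPT-3's 2,048-token context window rather than four toy tokens.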
On September 22, 2020, Microsoft announced that it had licensed GPT-3 exclusively. OpenAI continues to offer its public-facing API, which allows selected users to send text to GPT-3 or OpenAI's other models and receive output, but only Microsoft has access to GPT-3's underlying model, allowing it to embed, repurpose, and modify it as it pleases.[5]