Hallucination (artificial intelligence)

A Sora-generated video of the Glenfinnan Viaduct that incorrectly shows a second track where the real viaduct has only one, and a second chimney on its apparent interpretation of the train The Jacobite; the windows of the fourth carriage are also distorted.

In the field of artificial intelligence (AI), a hallucination or artificial hallucination (also called bullshitting,^[1]^[2] confabulation^[3] or delusion^[4]) is a response generated by AI that contains false or misleading information presented as fact.^[5]^[6]^[7] This term draws a loose analogy with human psychology, where hallucination typically involves false percepts. However, there is a key difference: AI hallucination is associated with erroneous responses rather than perceptual experiences.^[7]

For example, a chatbot powered by a large language model (LLM), such as ChatGPT, may embed plausible-sounding random falsehoods within its generated content. Researchers have recognized this issue, and by 2023 analysts estimated that chatbots hallucinate as much as 27% of the time,^[8] with factual errors present in 46% of generated texts.^[9] Detecting and mitigating these hallucinations poses significant challenges for the practical deployment and reliability of LLMs in real-world scenarios.^[10]^[8]^[9] Some researchers believe the specific term "AI hallucination" unreasonably anthropomorphizes computers.^[3]

[1] Dolan, Eric W. (9 June 2024). "Scholars: AI isn't "hallucinating" -- it's bullshitting". PsyPost - Psychology News. Retrieved 11 June 2024.

[2] Hicks, Michael Townsen; Humphries, James; Slater, Joe (8 June 2024). "ChatGPT is bullshit". Ethics and Information Technology. 26 (2): 38. doi:10.1007/s10676-024-09775-5. ISSN 1572-8439.

[3] Edwards, Benj (6 April 2023). "Why ChatGPT and Bing Chat are so good at making things up". Ars Technica. Retrieved 11 June 2023.

[4] "Shaking the foundations: delusions in sequence models for interaction and control". www.deepmind.com. 22 December 2023.

[5] "Definition of HALLUCINATION". www.merriam-webster.com. 21 October 2023. Retrieved 29 October 2023.

[6] Maynez, Joshua; Narayan, Shashi; Bohnet, Bernd; McDonald, Ryan (2020). "On Faithfulness and Factuality in Abstractive Summarization". Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL). arXiv:2005.00661. Retrieved 26 September 2023.

[7] Ji, Ziwei; Lee, Nayeon; Frieske, Rita; Yu, Tiezheng; Su, Dan; Xu, Yan; Ishii, Etsuko; Bang, Yejin; Dai, Wenliang; Madotto, Andrea; Fung, Pascale (November 2022). "Survey of Hallucination in Natural Language Generation" (PDF). ACM Computing Surveys. 55 (12). Association for Computing Machinery: 1–38. arXiv:2202.03629. doi:10.1145/3571730. S2CID 246652372. Retrieved 15 January 2023.

[8] Metz, Cade (6 November 2023). "Chatbots May 'Hallucinate' More Often Than Many Realize". The New York Times.

[9] de Wynter, Adrian; Wang, Xun; Sokolov, Alex; Gu, Qilong; Chen, Si-Qing (13 July 2023). "An evaluation on large language model outputs: Discourse and memorization". Natural Language Processing Journal. 4. arXiv:2304.08637. doi:10.1016/j.nlp.2023.100024. ISSN 2949-7191.

[10] Cite error: the named reference "cnbc several errors" was invoked but never defined.
