Hallucination (artificial intelligence)

A Sora-generated video of the Glenfinnan Viaduct that incorrectly shows a second track (the real viaduct has only one), a second chimney on its rendering of the train The Jacobite, and some carriages much longer than others.

In the field of artificial intelligence (AI), a hallucination or artificial hallucination (also called bullshitting,^[1]^[2] confabulation^[3] or delusion^[4]) is a response generated by AI that contains false or misleading information presented as fact.^[5]^[6]^[7] This term draws a loose analogy with human psychology, where hallucination typically involves false percepts. However, there is a key difference: AI hallucination is associated with erroneous responses rather than perceptual experiences.^[7]

For example, a chatbot powered by a large language model (LLM), such as ChatGPT, may embed plausible-sounding random falsehoods within its generated content. Researchers have recognized this issue, and by 2023, analysts estimated that chatbots hallucinate as much as 27% of the time,^[8] with factual errors present in 46% of generated texts.^[9] Detecting and mitigating these hallucinations pose significant challenges for the practical deployment and reliability of LLMs in real-world scenarios.^[8]^[9]^[10] Some researchers believe the specific term "AI hallucination" unreasonably anthropomorphizes computers.^[3]

[1] Dolan, Eric W. (9 June 2024). "Scholars: AI isn't 'hallucinating' -- it's bullshitting". PsyPost - Psychology News. Retrieved 11 June 2024.
[2] Hicks, Michael Townsen; Humphries, James; Slater, Joe (8 June 2024). "ChatGPT is bullshit". Ethics and Information Technology. 26 (2): 38. doi:10.1007/s10676-024-09775-5. ISSN 1572-8439.
[3] Edwards, Benj (6 April 2023). "Why ChatGPT and Bing Chat are so good at making things up". Ars Technica. Retrieved 11 June 2023.
[4] "Shaking the foundations: delusions in sequence models for interaction and control". www.deepmind.com. 22 December 2023.
[5] "Definition of HALLUCINATION". www.merriam-webster.com. 21 October 2023. Retrieved 29 October 2023.
[6] Maynez, Joshua; Narayan, Shashi; Bohnet, Bernd; McDonald, Ryan (2020). "On Faithfulness and Factuality in Abstractive Summarization". Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL). arXiv:2005.00661. Retrieved 26 September 2023.
[7] Ji, Ziwei; Lee, Nayeon; Frieske, Rita; Yu, Tiezheng; Su, Dan; Xu, Yan; Ishii, Etsuko; Bang, Yejin; Dai, Wenliang; Madotto, Andrea; Fung, Pascale (November 2022). "Survey of Hallucination in Natural Language Generation" (PDF). ACM Computing Surveys. 55 (12). Association for Computing Machinery: 1–38. arXiv:2202.03629. doi:10.1145/3571730. S2CID 246652372. Retrieved 15 January 2023.
[8] Metz, Cade (6 November 2023). "Chatbots May 'Hallucinate' More Often Than Many Realize". The New York Times.
[9] de Wynter, Adrian; Wang, Xun; Sokolov, Alex; Gu, Qilong; Chen, Si-Qing (13 July 2023). "An evaluation on large language model outputs: Discourse and memorization". Natural Language Processing Journal. 4. arXiv:2304.08637. doi:10.1016/j.nlp.2023.100024. ISSN 2949-7191.
[10] Cite error: the named reference "cnbc several errors" was invoked but never defined.
