A field of study within computer science focused on developing computer systems that can accomplish tasks typically associated with human intelligence, such as speech recognition, route mapping, and decision-making.
Patterns in an AI model's training data can skew its output, leading it to generate inaccurate or offensive material.
A program designed to communicate with humans in a natural, conversational manner, often to provide information or complete tasks.
A chatbot developed by OpenAI. ChatGPT is a transformer-based AI model that mimics conversation using natural language processing: users write prompts and receive generated text-based responses.
A type of artificial intelligence model that can generate new content, such as text, images, or video, through pattern recognition: it examines large amounts of training data and creates material with characteristics similar to the patterns identified in that data. Examples include ChatGPT, Claude, Midjourney, and DALL-E.
Instances where a generative AI model produces output containing inaccurate or irrelevant information, often while appearing correct. For example, when asked to generate a list of citations on a topic, ChatGPT (or any text-based generative AI) may provide citations that look accurate but refer to source material that does not actually exist.
An AI model trained on large amounts of text, which gives it the capacity to respond to conversational queries. Chatbots such as ChatGPT, Bard, and Claude are built on LLMs.
The programmed capacity to understand conversations and respond in kind.
A structured text-based query that asks a generative AI to generate new content in the form of text, image, video, etc.
The process of refining prompts to elicit more desirable results from generative AI.
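Prompt engineering can be illustrated with a small sketch. The example below is hypothetical: it uses the generic "chat message" structure (a role plus content) common to many text-based generative AI tools, and the model name and prompt wording are illustrative only, not a specific product's API.

```python
# A minimal sketch of prompt engineering: the same request, refined to
# elicit a more useful response. The payload structure and model name
# below are illustrative assumptions, not a real service's API.

def build_request(prompt: str, model: str = "example-model") -> dict:
    """Package a text prompt as a chat-style request payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# First attempt: vague, likely to produce a generic answer.
draft = build_request("Tell me about amphetamines.")

# Refined prompt: adds a role, a narrowed scope, an output format, and a
# constraint against unverifiable citations -- the essence of prompt
# engineering.
refined = build_request(
    "Acting as a research librarian, summarize in three bullet points how "
    "social determinants of health relate to amphetamine use. Cite only "
    "sources you can verify, and say so if you cannot verify one."
)
```

The refinement does not change the tool being used, only the query sent to it; in practice, users iterate on wording like this until the output meets their needs.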
Developing a generative AI model involves feeding it specific types of data, often in large amounts. This process is referred to as “training,” and it determines what kind of content the model can produce. For example, an AI designed specifically to review artwork would be trained only on data containing artwork.
ChatGPT was trained on a large body of text, which allows it to generate text in response to a prompt. Partial lists of the training dataset exist, and ChatGPT will provide a partial list when queried. However, the full body of text used to train ChatGPT is unknown.
When ChatGPT answers a question, it does not automatically provide a reference for where the information came from. This is because it generates text by predicting likely word sequences based on patterns drawn from a wide variety of sources, so a given response rarely derives from any single source. As a result, you typically cannot trace a response back to a parent source or know where the information originated.
Yes, when prompted, ChatGPT can provide references. However, these references may not reflect where the information actually came from and, more importantly, may not correspond to real sources. Despite sounding plausible, ChatGPT can easily hallucinate citations. This can also be an issue with other generative AI tools, such as Gemini.
For example, ChatGPT provided this citation and summary in response to a prompt about needing an academic journal article on the topic of amphetamine use and social determinants of health:
"Here's an example of a peer-reviewed journal article on amphetamine abuse and its intersections with social determinants of health:
Stoové, M. A., Dietze, P. M., Jolley, D., Fry, C. L., Brewer, N., & Hellard, M. E. (2008). Increasing access to sterile injecting equipment and syringe exchange in a rural area with a high prevalence of bloodborne viruses. Australian Journal of Rural Health, 16(2), 99-106.
This article describes a study conducted in a rural area of Australia with a high prevalence of bloodborne viruses, including those associated with amphetamine use..." OpenAI. (2023, February 17). [ChatGPT response to a prompt about locating an academic journal article]. https://chat.openai.com/
Although the summary sounds plausible and the citation looks realistic, this article does not exist. The journal is real, as is the lead author, but Stoové has not published in that journal.
As of 1 March 2023, the cutoff date for the data ChatGPT was trained on is September 2021, meaning the tool has limited access to events and information more recent than that. ChatGPT is updated regularly, so this may change. However, it is important to realize that the information ChatGPT provides lags behind the present, which can undermine credibility, especially for topics where the age of your information matters.
Here is ChatGPT's response to this topic on 1 March 2023:
"My knowledge cutoff date for training documents is 2021-09. This means that I was trained on a large corpus of text up to that point in time, and any information or events that occurred after that date may not be included in my knowledge base. However, I am constantly being updated and fine-tuned by my creators at OpenAI to improve my accuracy and relevance." OpenAI. (2023, March 1). [ChatGPT response to a prompt about its cutoff date for training documents]. https://chat.openai.com/
Update
As of 24 March 2023, OpenAI has begun implementing plugins for ChatGPT that will "help [it] access up-to-date information, run computations, or use third-party services." Access to current information is not yet part of the commonly used ChatGPT research preview.
Gemini and currency
Gemini does not have a cutoff date for the information it was trained on. However, this does not mean Gemini's output is wholly accurate.
Content on this page was adapted from AI, ChatGPT, and the Library Libguide by Amy Scheelke for Salt Lake Community College, licensed CC BY-NC 4.0, except where otherwise noted.