OpenAI working to remove false information from ChatGPT

On May 31, OpenAI announced that it is working to improve ChatGPT’s ability to solve mathematical problems in an effort to reduce instances of artificial intelligence (AI) hallucinations. The ultimate goal is to develop AI that is aligned with human values, and reducing hallucinations is a crucial step toward achieving that.

In March, GPT-4, the model powering the latest version of ChatGPT, was introduced and gained mainstream attention. However, generative AI chatbots have long struggled with factual accuracy, sometimes generating false information, a problem known as “hallucinations.” OpenAI announced its efforts to reduce these hallucinations in a post on its website.

AI hallucinations occur when artificial intelligence systems generate factually incorrect or misleading outputs that are unsupported by real-world data. These hallucinations can take various forms, such as generating false information, inventing nonexistent events or people, or providing inaccurate details about a given topic.

OpenAI conducted research to examine the effectiveness of two types of feedback: “outcome supervision” and “process supervision.” Outcome supervision provides feedback based only on the final result, while process supervision provides feedback on each individual step in a chain of thought. OpenAI evaluated both feedback models on math problems, generating multiple candidate solutions and selecting the solution ranked highest by each feedback model.
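The selection procedure described can be sketched in a few lines of Python. Everything below is illustrative: `Solution`, `outcome_reward`, `process_reward` and the toy scoring rules are hypothetical stand-ins for the trained reward models OpenAI describes, not its actual code or API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Solution:
    steps: List[str]      # chain-of-thought reasoning steps
    final_answer: str     # the answer produced at the end of the chain


def outcome_reward(solution: Solution) -> float:
    """Outcome supervision: score a solution by its final result only (toy heuristic)."""
    return 1.0 if solution.final_answer.strip() == "42" else 0.0


def process_reward(solution: Solution, step_scorer: Callable[[str], float]) -> float:
    """Process supervision: score every reasoning step and aggregate the scores."""
    if not solution.steps:
        return 0.0
    score = 1.0
    for step in solution.steps:
        # Multiplying per-step scores penalizes any single bad step in the chain.
        score *= step_scorer(step)
    return score


def pick_best(solutions: List[Solution], reward: Callable[[Solution], float]) -> Solution:
    """Best-of-N selection: keep the candidate the reward model ranks highest."""
    return max(solutions, key=reward)


if __name__ == "__main__":
    candidates = [
        Solution(steps=["6 * 7 = 42"], final_answer="42"),
        Solution(steps=["6 * 7 = 48"], final_answer="48"),
    ]
    print(pick_best(candidates, outcome_reward).final_answer)  # -> 42
```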

After thorough analysis, the research team found that process supervision yielded superior performance because it encouraged the model to follow a human-approved process. In contrast, outcome supervision proved more difficult to scrutinize consistently.

OpenAI recognizes that the implications of process supervision extend beyond mathematics and that further investigation is necessary to understand its effects in other domains. The company noted that if the observed results hold in broader contexts, process supervision could offer a more favorable combination of performance and alignment than outcome supervision. To facilitate research, the company publicly released the complete process supervision dataset used in the study, inviting exploration and study in this area.

Related: AI demand briefly catapults Nvidia into $1T club

Although OpenAI did not cite specific incidents that prompted its research into hallucinations, two recent events illustrated the problem in real-world scenarios.

In one recent incident, Steven Schwartz, a lawyer in the Mata v. Avianca Airlines case, acknowledged relying on ChatGPT as a research resource. However, the information the chatbot provided turned out to be entirely fabricated, highlighting the issue at hand.

OpenAI’s ChatGPT is not the only AI system to encounter hallucinations. During a February demonstration of its chatbot technology, Microsoft’s Bing AI chatbot examined earnings reports and generated inaccurate figures for companies such as Gap and Lululemon.

Magazine: 25K traders bet on ChatGPT’s stock picks, AI sucks at dice throws, and more