NIXsolutions: OpenAI’s New CriticGPT Is Trained to “Criticize” GPT-4 Outputs

OpenAI has introduced CriticGPT, a new artificial intelligence model designed to identify errors in code generated by ChatGPT. According to the study in which it is described, “LLM Critics Help Catch LLM Bugs,” CriticGPT is meant to serve as an AI assistant for the expert testers who review the program code ChatGPT produces. Based on the GPT-4 family of Large Language Models (LLMs), it analyzes code and flags potential errors, making it easier for reviewers to spot flaws that might otherwise slip past human attention. The researchers trained CriticGPT on a dataset of code samples containing intentionally inserted bugs, teaching it to recognize and flag errors of various kinds.
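To make that concrete, here is a purely hypothetical illustration, not taken from OpenAI’s paper, of the kind of subtle bug such a reviewer is meant to flag: a short Python helper that looks correct at a glance but silently drops its final window.

```python
# Hypothetical example of a subtle bug a CriticGPT-style review targets.
def moving_average(values, window):
    """Return the averages over a sliding window of the given size."""
    averages = []
    for i in range(len(values) - window):  # BUG: stops one window early;
        chunk = values[i:i + window]       # should be len(values) - window + 1
        averages.append(sum(chunk) / window)
    return averages

# A critique in this style would point at the range() bound:
# "The loop stops one window early, so moving_average([1, 2, 3], 2)
# returns [1.5] instead of [1.5, 2.5]."
```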


Impact on Coding and Testing

The researchers found that in 63% of cases involving naturally occurring LLM errors, annotators preferred CriticGPT’s critique over the one written by a human. In addition, reviewers working with CriticGPT produced more comprehensive critiques than those working without the AI assistant, while reducing the rate of confabulations (invented facts and hallucinations). Building the automated “critic” required training the model on a large volume of input data with deliberately introduced errors: experts were asked to take code written by ChatGPT, insert bugs into it, and then write example feedback as if they had just discovered those bugs. This process taught the model to identify and critique many different kinds of errors in code.
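As a rough sketch of what one such training record might look like, assuming a simple key-value layout (the field names below are illustrative, not OpenAI’s actual schema):

```python
# Illustrative (hypothetical) layout of a single "tampering" training record:
# an expert takes ChatGPT-written code, inserts a bug, and writes the critique
# the model should learn to reproduce.
tampering_record = {
    "original_code": "def is_even(n):\n    return n % 2 == 0\n",
    "tampered_code": "def is_even(n):\n    return n % 2 == 1\n",  # inserted bug
    "reference_critique": (
        "The comparison is inverted: `n % 2 == 1` is true for odd numbers, "
        "so is_even() returns the opposite of what its name promises."
    ),
}

# Training then optimizes the critic to emit feedback like
# `reference_critique` when shown only `tampered_code`.
```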

In experiments, CriticGPT demonstrated the ability to catch both deliberately inserted bugs and naturally occurring errors in ChatGPT’s responses. The researchers also developed a technique called Force Sampling Beam Search (FSBS) that helps CriticGPT write more detailed code reviews, letting them tune how aggressively the model hunts for problems while keeping the false-positive rate in check. Interestingly, CriticGPT’s capabilities go beyond simple code review. When the model was applied to a set of ChatGPT training data that humans had previously rated as flawless, it identified errors in 24% of cases, and experts subsequently confirmed those findings. OpenAI believes this demonstrates the model’s potential beyond purely technical tasks, highlighting its ability to catch subtle mistakes that can elude even careful human inspection.
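OpenAI has not published FSBS implementation details, but the thoroughness-versus-precision trade-off it describes can be sketched roughly as follows. Here, generate_critique, reward_model, and the lam knob are all hypothetical stand-ins, not OpenAI’s actual API:

```python
# Simplified sketch of the FSBS trade-off: force the critic to cover more
# code spans, then keep the candidate whose reward-model score, adjusted
# for coverage, is best. Both callables are hypothetical stand-ins.
def fsbs_select(code, generate_critique, reward_model, lam=0.5):
    """Pick the critique balancing reward-model score against coverage.

    A larger `lam` favors longer, more thorough critiques (more real bugs
    found, but also more nitpicks and false positives); a smaller `lam`
    favors precision.
    """
    best, best_score = None, float("-inf")
    for forced_spans in (1, 2, 4):  # force the critic to cover more spans
        critique = generate_critique(code, forced_spans=forced_spans)
        score = reward_model(code, critique) + lam * forced_spans
        if score > best_score:
            best, best_score = critique, score
    return best
```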

Future Challenges and Limitations

Despite its promising results, CriticGPT, like all AI models, has limitations, notes NIXsolutions. It was trained on relatively short ChatGPT responses, which may not fully prepare it to evaluate the longer, more complex tasks that future AI systems will face. The research team also acknowledges that the model is most effective at spotting bugs that can be pinpointed to one specific location in the code, whereas real errors in AI output are often scattered across multiple parts of an answer, posing a challenge for future iterations of the model. Finally, although CriticGPT reduces confabulation, it does not eliminate it entirely, and human experts can still make mistakes when they rely on its false statements. We’ll keep you updated on future developments and improvements.