The automation in this paper had this specific bit: `critiques = [["Review your previous answer and find problems with your answer.`

And because GPT is generative, it may not be able to accurately compare its own answer against the correct answer.

While I am not an OpenAI engineer, I want to talk about some of the ways they probably worked to make GPT-4 more of a reasoning model. But how does GPT do this at all? It doesn't know the context of what it says or does, nor does it even know what a word is.
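To make the automation snippet above concrete: the loop it implies is to generate an answer, ask the model to critique that answer, and then ask it to revise using its own critique. Below is a minimal sketch of such a loop. The `generate()` helper is a hypothetical stand-in for whatever completion API the paper's automation actually called; the critique string is the one quoted above, while the follow-up improvement prompt, the loop depth, and the prompt formatting are assumptions for illustration, not the paper's exact implementation.

```python
# Minimal sketch of a critique-then-improve loop.
# generate() is a hypothetical placeholder for a text-completion API call;
# only the first critique prompt is taken from the quoted snippet, the
# improvement prompt and loop structure are assumptions.

def generate(prompt: str) -> str:
    """Placeholder: send `prompt` to a completion API and return the text."""
    raise NotImplementedError

def critique_and_improve(question: str, rounds: int = 2) -> str:
    # Initial answer, with no self-review yet.
    answer = generate(question)
    for _ in range(rounds):
        # Ask the model to find problems with its own previous answer.
        critique = generate(
            f"Question: {question}\n"
            f"Answer: {answer}\n"
            "Review your previous answer and find problems with your answer."
        )
        # Ask the model to revise the answer using that critique
        # (this second prompt is an assumed follow-up, not from the snippet).
        answer = generate(
            f"Question: {question}\n"
            f"Answer: {answer}\n"
            f"Problems found: {critique}\n"
            "Based on the problems you found, improve your answer."
        )
    return answer
```

Note that nothing in this loop ever shows the model the correct answer; it can only compare its output against its own critique, which is exactly why a generative model may fail to judge its answer accurately.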