Itdaily - ChatGPT vs. Claude vs. Gemini vs. Mistral: Which improves text best?

ChatGPT vs. Claude vs. Gemini vs. Mistral: Which improves text best?

AI is a popular tool for editing text. In a comparative test, we investigate which AI model is currently best suited to the job.

In the third test in our series of AI comparisons, we look at the copy-editing capabilities of AI. We ask each tool to improve a text that is full of errors.

Many users rely on AI to fix the spelling and grammar of their texts. In this comparison, we evaluate how four AI tools improve a faulty Dutch text, on the condition that the meaning and tone stay the same.

The prompt is:

Improve the grammar in this text, keep it objective and formal, and preserve the meaning and tone.

‘Gistere ging ik naar de winkel om eten te kopen maar ik had mijn portoomonee thuis laten liggen. Dat was vervelend wat ik had alle boodschappen al gepakt en sta aan de kassa te wachten. De kassierster was verbaasd en zei dat ik moest betalen, maar kon dus niet. Daarna ben ik snel terug naar huis gejogt en daarna weer terug naar de winkel gegaan om te betalen, dat koste veel tijd.’

Gemini: balanced but formal

Gemini delivers a polished and formal text that stays close to the original. The sentences flow better and the grammatical errors are corrected properly. However, the improved text is much more formal than the original and therefore reads slightly less smoothly.

A nice extra is the overview of the most important adjustments, which lets the user see immediately which choices were made and what has changed. That makes the output not only correct but also valuable. The style remains natural and easy to read without making the text unnecessarily complicated.

Claude: fluid and strong

Claude provides a text that reads smoothly and makes a professional impression. The sentence structure is logical and the phrasing is correct.

Here too we get a nice overview of the changes applied, and once again the tool opts for somewhat more formal language. Because the prompt asked to preserve the original tone, Claude does not entirely meet our requirements. The phrase “… deelde mee dat ik diende te betalen” (roughly, "informed me that I was required to pay") is archaic language that is hardly used in professional emails or texts.

ChatGPT: formal and dry

ChatGPT opts for a more formal and dry style. The text is not only corrected grammatically but also altered slightly in content. Words are replaced by very formal alternatives, which gives the text more body but also makes it deviate too far from the original tone. The sentence structure is logical, but the new text is not faithful to the source.

Mistral: surprisingly natural language

Mistral focuses primarily on correcting basic errors and stays closest to the structure of the original text. The improved text reads much more smoothly, and the tone and writing style are well preserved. Mistral does not volunteer any extra explanation, but that does not detract from the quality of the text.

Conclusion

The four AI tools clearly show different ways of improving a text. Gemini and Claude both strike a strong balance between correctness, readability, and explanation, but neither maintains the original tone. Mistral preserves the tone very well but provides no overview of the changes. Since that overview was not requested, Mistral scores best on the quality of the improved text itself.

Claude, in turn, stands out for its fluid and professional style, while ChatGPT excels mainly at formal reformulation, although it stays less close to the original text.

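| Tool | Tone | Clarity | Usability | Structure | Focus on prompt | Overall score | Note |
|---|---|---|---|---|---|---|---|
| ChatGPT | 6 | 8 | 6 | 7 | 6 | 8.5 | Text changed too drastically |
| Claude | 7 | 9 | 7 | 7 | 6 | 9.2 | Not the same tone |
| Gemini | 6 | 8 | 8 | 7 | 6 | 8.8 | Practical tips |
| Mistral | 8 | 6 | 8 | 8 | 8 | 7 | Perfectly preserved tone |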
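
For readers who want to compare the five criterion scores directly, the sketch below averages them per tool. The unweighted mean is our own assumption: the article does not state how the overall score is derived, so this is not a reconstruction of that column, and the names `scores` and `criterion_means` are purely illustrative.

```python
# Criterion scores copied from the table above, in the order:
# tone, clarity, usability, structure, focus on prompt.
scores = {
    "ChatGPT": [6, 8, 6, 7, 6],
    "Claude": [7, 9, 7, 7, 6],
    "Gemini": [6, 8, 8, 7, 6],
    "Mistral": [8, 6, 8, 8, 8],
}


def criterion_means(table):
    """Return the unweighted mean of the criterion scores per tool.

    Equal weighting is an assumption made for illustration only.
    """
    return {tool: round(sum(vals) / len(vals), 1) for tool, vals in table.items()}


means = criterion_means(scores)

# Sort tools from highest to lowest mean criterion score.
ranking = sorted(means, key=means.get, reverse=True)

for tool in ranking:
    print(f"{tool}: {means[tool]}")
```

With equal weights, Mistral has the highest mean (7.6), followed by Claude (7.2), Gemini (7.0), and ChatGPT (6.6). A different weighting of the criteria would give a different order.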

This article is the third in our series of AI tool comparisons, in which we compare various capabilities of four major AI models. In the complete test, we evaluate coding skills, data analysis, drafting an email, and summarizing a meeting…