Anthropic recently launched a new version of its AI language model, Claude 3.5 Sonnet. The company claims it can generate more human-like text and better code than other models like OpenAI’s ChatGPT 4o and Google’s Gemini 1.5 Pro. Let’s dive deep into Claude 3.5 Sonnet and see how it performs in real-world usage.
What Is Claude 3.5 Sonnet
Just as OpenAI has ChatGPT, Anthropic’s AI model is called Claude. However, Anthropic offers three variants of Claude, each suited to different applications:
- Claude Opus: The most capable model. This model excels in complex writing tasks but is relatively slower.
- Claude Haiku: The fastest model. It is optimized for live customer interactions and translations but lacks the capability for complex tasks.
- Claude Sonnet: A balance of speed and intelligence. It’s faster than Opus but not as capable.
Currently, Sonnet is the only Claude 3.5 model that has been released. You can expect the 3.5 versions of Opus and Haiku to arrive later this year. For now, you can access Claude 3.5 Sonnet for free, but there is a daily message limit that varies with message length and demand. On average, you can send around 30 messages. You can raise the limit by subscribing to the Pro plan, which costs $20 per month.
Let’s Examine the Highlights of the Claude 3.5 Sonnet Model
1. Excels in benchmarks: In Anthropic’s published results, Claude 3.5 Sonnet leads competing models on most benchmarks, spanning math, coding, reasoning, and visual understanding. It shows an especially large improvement in graduate-level reasoning, where it scores 59.4%. For context, as per the Artificial Intelligence Index Report 2024, human experts in the relevant field score around 65% on this kind of test, while regular people score about 34%. So Claude is closing in on the average domain expert.
2. Generates human-like text: Anthropic claims this newer model will show marked improvement in grasping nuance, humor, and complex instructions, and is exceptional at writing high-quality content with a natural, relatable tone.
3. Good at coding: In Anthropic’s internal testing, Claude 3.5 Sonnet solved 64% of coding tasks, compared to Claude 3 Opus, which solved only 38%. It also took the #1 spot on the Aider Leaderboard.
4. Artifacts, a new feature: When generating content like code snippets or text documents, a window appears on the side for the generated code or content. You can edit the generated code or text directly in Claude and work alongside the AI. If it is HTML or JavaScript, you can actually run the code and see the generated output directly on the website.
5. Now Claude works faster: Claude 3.5 Sonnet runs at twice the speed of Claude 3 Opus. Claude’s biggest downside until now has been its speed, which is now on par with ChatGPT or even faster. Anthropic claims it responds at around 80 tokens per second, but there are no official figures for other models to compare against.
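To put two of the numbers above in perspective, here is a small back-of-the-envelope sketch in Python. It only uses figures quoted in this article (about 80 tokens per second, and benchmark scores of 34%, 59.4%, and 65%). The function names and the 400-token response length are my own illustrative assumptions, not anything published by Anthropic.

```python
def seconds_to_generate(num_tokens: int, tokens_per_second: float = 80.0) -> float:
    """Estimate how long a response takes at a given output speed.

    The 80 tokens/second default is the figure quoted in the article;
    real-world speed will vary with load and prompt.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


def position_between(score: float, low: float, high: float) -> float:
    """Return where `score` sits between `low` and `high` (0.0 = low, 1.0 = high)."""
    if high == low:
        raise ValueError("low and high must differ")
    return (score - low) / (high - low)


if __name__ == "__main__":
    # Hypothetical 400-token answer (roughly a few paragraphs of text).
    print(seconds_to_generate(400))  # 5.0

    # How far Claude's 59.4% sits between regular people (34%) and experts (65%).
    print(round(position_between(59.4, 34.0, 65.0), 2))  # 0.82
```

In other words, under these assumptions a 400-token answer would take about five seconds, and Claude's score sits roughly four-fifths of the way from the regular-person score to the expert score.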
Comparing Claude 3.5 Sonnet With ChatGPT 4o
Taking these highlights into consideration, I tested various examples with Claude 3.5 Sonnet, focusing on its human-like text capabilities and coding. I also compared the results with ChatGPT 4o. Here’s how it performed:
1. Creative Writing
I gave both models a detailed prompt asking them to generate a story.
Write a captivating story about a father lion teaching his young son the skills and wisdom needed to become the king of the forest. Explore their bond, the challenges they face, and the lessons learned along the way as the young lion grows into a wise and courageous leader.
After reading both stories, the winner was clear: Claude 3.5 Sonnet. The story generated by ChatGPT was longer but lacked emotional depth.
In contrast, Claude 3.5 Sonnet’s story had emotional depth and good character development, making it more compelling. Additionally, the Artifacts feature allowed me to edit the story and ask Claude to improve specific parts, making it a better tool for writing stories overall.
I also generated poetry, dialogues, and stories in other genres. The results were similar, though this can be subjective. Since the service is free, I recommend giving it a try.
2. Other Texts Like Emails, Articles, and Summaries
Similarly, I generated other types of texts like emails, summaries, articles, and YouTube scripts. While ChatGPT’s email templates were better, Claude’s summaries were much clearer and easier to scan. For example, here’s the summary of this article generated by ChatGPT:
And here’s the summary from Claude:
Overall, both did a similar job at generating professional texts and articles.
3. Conversational Skills
To test the conversational skills, I tried giving this prompt to both Claude and ChatGPT:
I am feeling bit low today. Can you cheer me up?
Here, the results were more nuanced. ChatGPT has a memory feature that remembers details and preferences. It remembered that I like comedy movies and started by recommending some good comedies along with other alternatives.
Claude, on the other hand, was more empathetic in its language and seemed to understand the query better.
Both did a good job at conversation, each in its own style. But judging purely on conversational ability, Claude takes an easy win.
4. Coding Tasks
Claude 3.5 Sonnet has also improved at coding, so I tried a few exercises with each model. Here’s one of them.
Create an HTML and CSS Code for a responsive navigation bar.
Both ChatGPT and Claude also used JavaScript, even though I had not mentioned it. Apart from that, the code generated by both was on par, and the output rendered without any errors.
However, one notable advantage of Claude is that it supports the Artifacts feature, which allows for previewing the output directly within Claude. Additionally, I could edit the code or ask Claude to improve specific aspects.
5. Image Descriptions
I provided a few photos to both services and asked them to explain each one. I expected Claude to understand the humor in memes better than ChatGPT, but the results were nearly identical. For example, here’s ChatGPT explaining the meme:
This is Claude’s response to the same image:
Is Claude 3.5 Sonnet Better Than ChatGPT 4o
They both have their own styles. While I prefer ChatGPT’s professional texts, Claude comes out ahead when it comes to generating stories and having conversations. Since both services are free, it’s best to use each according to the situation. However, do not rely on either service for facts, as both can hallucinate and provide incorrect information.