September 22, 2023

Why Can't Google's LLM Craft a Rhyme?

Author
Pratik

We recently ran some tests that led to an interesting discovery: Google's PaLM 2, despite its reputation, struggles to write poems, except when the subject is Kanye West. OpenAI's GPT and Meta's LLaMA models, on the other hand, perform exceptionally well. What gives?

We issued a straightforward challenge to multiple AI models: write a short poem about Steve Jobs and another about Tim Cook. Google's PaLM 2 fell remarkably short on both. In stark contrast, OpenAI's GPT and Meta's LLaMA models penned compelling verses for both prompts. The difference is striking and raises the question: why? Compare the output below or try it out at the link.

Try it out: https://app.contentable.ai/playground/650d41ba1097a0c8745a6d8e
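If you want to reproduce this comparison outside the playground, the setup is simple: send the identical prompt to each model and read the outputs side by side. Below is a minimal sketch of that loop. The call_gpt, call_palm2, and call_llama functions are hypothetical placeholders for illustration, not the clients we used for these tests; wire in the actual OpenAI, PaLM 2, and LLaMA SDKs or endpoints with your own API keys.

```python
# Minimal sketch of a head-to-head prompt comparison (illustrative only).
# The call_* functions below are hypothetical placeholders: swap in the real
# OpenAI, Google PaLM 2, and LLaMA clients/endpoints with your own API keys.

from typing import Callable, Dict

PROMPT = "Write a short poem about Steve Jobs."


def call_gpt(prompt: str) -> str:
    # e.g. use the OpenAI SDK here
    raise NotImplementedError("plug in the OpenAI client")


def call_palm2(prompt: str) -> str:
    # e.g. use Google's PaLM 2 text generation API here
    raise NotImplementedError("plug in the PaLM 2 client")


def call_llama(prompt: str) -> str:
    # e.g. use a hosted LLaMA 2 chat endpoint here
    raise NotImplementedError("plug in a LLaMA endpoint")


MODELS: Dict[str, Callable[[str], str]] = {
    "GPT": call_gpt,
    "PaLM 2": call_palm2,
    "LLaMA 2": call_llama,
}


def compare(prompt: str) -> None:
    """Send the identical prompt to every model and print the outputs side by side."""
    for name, generate in MODELS.items():
        try:
            output = generate(prompt)
        except NotImplementedError as exc:
            output = f"<not configured: {exc}>"
        print(f"--- {name} ---\n{output}\n")


if __name__ == "__main__":
    compare(PROMPT)
```

The important detail is that the prompt (and, ideally, the sampling settings) stays identical across models, so the only variable in the comparison is the model itself.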

Next, we tried the same prompt on celebrities such as Kanye West and Justin Bieber, to test whether cultural bias could explain the gap. PaLM 2's output did rhyme to some extent, but it was still poor compared to GPT and LLaMA.

Try it out: https://app.contentable.ai/playground/650d42d01097a0c8745a6d98

Was this cultural bias, or an algorithmic preference? We tried other tech celebrities such as Elon Musk and Sundar Pichai. PaLM 2 struggled with Sundar Pichai but did an average job with Elon Musk. That points to training data, since there is a great deal of material about Elon Musk. We cannot confirm this for certain, though, as Steve Jobs presumably has a similar volume of training data to Elon Musk. Why did PaLM 2 get Elon Musk right and not Steve Jobs?

Try it out: https://www.contentable.ai

Possible Explanations
  1. Training Data: PaLM 2 may lack poetic examples for specific public figures, with the exception of musicians.
  2. Algorithmic Preferences: The model might be optimized for other types of tasks and therefore struggle with poetry.
  3. Cultural Bias: Kanye West, being a musical artist, might align better with the model's training data, explaining its relative success.

Implications
  1. Specialization Matters: General-purpose models like Google's PaLM 2 may not be the best choice for every creative task.
  2. Training Data: A model's capabilities are only as good as the data it is trained on.
  3. Fine-Tuning: OpenAI's GPT and Meta's LLaMA models may have undergone specific tuning that helps them excel at tasks like this.

Google's PaLM 2's inconsistency on poetic tasks compared with OpenAI's GPT and Meta's LLaMA models underscores a crucial takeaway: not all AI models excel at the same tasks. This disparity is not just academic; it has practical implications for businesses looking to adopt AI solutions. The key is comparison. Platforms like contentable.ai let you evaluate multiple AI models head-to-head, ensuring you adopt the one that best aligns with your specific needs. Failing to compare could mean missing out on the most effective tools for your business.

Try it out and let us know what you find when you compare different models at contentable.ai.

Written by
Pratik
Co-founder

AI Nerd
Developer
Love good UX
Father
