🌻 E40 - Is ChatGPT Changing the Way We Speak?

Have you noticed how often words like ‘delve’ and ‘realm’ are being used lately?


Every day, ChatGPT handles more than 1.5 million requests. Think about that for a second.

Just this morning, I used a tool called Cursor seven times in the span of two hours while building an application. You’re likely using AI tools too—maybe even without realizing it. Whether we’re typing out prompts or letting these AI systems assist us, one thing is clear: they’re deeply integrated into our lives. And now, it’s starting to show in an unexpected way—our language.

That raises a fascinating question: Could ChatGPT actually be changing the way we speak?

To answer this, researchers from the Center for Humans and Machines and the Center for Adaptive Rationality at the Max Planck Institute for Human Development in Germany dove into the data. They analyzed roughly 280,000 English-language videos (presentations, talks, and speeches) drawn from more than 20,000 YouTube channels tied to academic institutions.

What they found was significant. Certain words, patterns, and styles of communication unique to AI, especially ChatGPT, are starting to seep into how we speak. It’s subtle, but it’s there. And it may just be the beginning of a much larger shift in human culture.

The Daily Newsletter for Intellectually Curious Readers

If you're frustrated by one-sided reporting, our 5-minute newsletter is the missing piece. We sift through 100+ sources to bring you comprehensive, unbiased news—free from political agendas. Stay informed with factual coverage on the topics that matter.

🌸Choice Cuts

🌼 Bigger Isn’t Always Better: Rethinking Synthetic Data Generation

I’ve always had this intuition: a bigger model can generate better synthetic data.

It makes sense, right? When we’re training a smaller, task-specific model—especially for something like math or programming—we rely heavily on synthetic data. This data helps cover all the permutations and combinations of real-world scenarios.

We even apply strategies like Chain of Thought (CoT) or ReAct as templates for the synthetic datasets to optimize training. So naturally, I thought a larger model would be able to generate richer variations of data.
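To make the templating step concrete, here is a minimal sketch of wrapping synthetic math examples in a CoT-style template before they go into a fine-tuning set. The template text and field names are my own illustration, not from any particular framework:

```python
# Toy sketch: render synthetic examples into Chain-of-Thought
# training strings. The template and field names are illustrative.

COT_TEMPLATE = (
    "Question: {question}\n"
    "Let's think step by step.\n"
    "{reasoning}\n"
    "Answer: {answer}"
)

def to_cot_record(question, reasoning, answer):
    """Render one synthetic example into a CoT-style training string."""
    return COT_TEMPLATE.format(
        question=question, reasoning=reasoning, answer=answer
    )

synthetic_examples = [
    {
        "question": "What is 12 * 7?",
        "reasoning": "12 * 7 = (10 * 7) + (2 * 7) = 70 + 14 = 84.",
        "answer": "84",
    },
]

dataset = [to_cot_record(**ex) for ex in synthetic_examples]
print(dataset[0])
```

The same records could be rendered under a ReAct-style template instead; only the template string changes, which is why templating and data generation are usually kept as separate steps.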

But I overlooked one critical issue: bigger models can also hallucinate more.

That’s where smaller models come into play. When fine-tuned for a specific task, they can actually generate more reliable synthetic data, with a lower risk of hallucination.

Google even experimented with this approach, and the results were eye-opening. Smaller, task-specific models can offer a more precise, cost-effective solution for synthetic data generation.

It’s a fascinating shift in thinking—bigger isn’t always better, especially when it comes to training models for specific tasks.

As the paper puts it (WC is the weaker-but-cheaper model, SE the stronger-but-more-expensive one): “Our findings reveal that models finetuned on WC-generated data consistently outperform those trained on SE-generated data across multiple benchmarks and multiple choices of WC and SE models. These results challenge the prevailing practice of relying on SE models for synthetic data generation, suggesting that WC may be the compute-optimal approach for training advanced LM reasoners.”
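The compute-matched argument behind that result can be sketched with toy numbers. Under a fixed compute budget, the cheaper model simply affords many more samples, which buys coverage and diversity. The per-sample costs below are illustrative placeholders, not figures from the paper:

```python
# Toy sketch of compute-matched sampling: a weaker-but-cheaper (WC)
# model vs. a stronger-but-more-expensive (SE) model under the same
# budget. Costs are made-up units, chosen only for illustration.

def samples_under_budget(budget, cost_per_sample):
    """How many synthetic samples a model can produce for a budget."""
    return budget // cost_per_sample

BUDGET = 1_000_000  # arbitrary compute units

se_cost = 10_000    # stronger model: expensive per sample
wc_cost = 1_000     # weaker model: ~10x cheaper per sample

se_samples = samples_under_budget(BUDGET, se_cost)  # 100
wc_samples = samples_under_budget(BUDGET, wc_cost)  # 1000
```

The open question the paper tackles is whether 10x more (noisier) samples beat 10x fewer (cleaner) ones for training a reasoner; their answer, across the benchmarks they tested, leans toward the cheaper model.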

🌼 LanguaShrink: Rethinking Prompt Compression

I used to be a huge fan of LLMLingua. Whenever I retrieved data from multiple sources or agents, I’d run it through LLMLingua as a prompt compressor.

The goal? To cut down on the cost of using generative language models without sacrificing quality.
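To make the idea concrete, here is a deliberately crude sketch of prompt compression. Real compressors like LLMLingua use a small language model’s perplexity to decide which tokens are dispensable; this stand-in just drops stopwords and, if still over budget, trims from the middle of the prompt:

```python
# Toy prompt compressor: fewer tokens reach the (paid) generative
# model. The stopword list is a crude stand-in for the perplexity
# signal that tools like LLMLingua actually use.

STOPWORDS = {"the", "a", "an", "of", "to", "is", "are", "that", "and"}

def compress_prompt(prompt, keep_ratio=0.6):
    tokens = prompt.split()
    # First pass: drop stopwords (low-information tokens).
    kept = [t for t in tokens if t.lower() not in STOPWORDS]
    # If still over budget, truncate from the middle, where LLMs
    # tend to pay the least attention.
    budget = max(1, int(len(tokens) * keep_ratio))
    if len(kept) > budget:
        half = budget // 2
        kept = kept[:half] + kept[len(kept) - (budget - half):]
    return " ".join(kept)

prompt = "Summarize the main findings of the report that is attached below"
print(compress_prompt(prompt))
```

Even this naive version shrinks the token count noticeably; the hard part, and where the real tools earn their keep, is doing it without losing the tokens the model actually needs.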

For the most part, it worked well. But when it hit production, the feedback was mixed—some developers loved it, others weren’t so sure. One of my brilliant colleagues even wrote a blog about it, sharing their thoughts and experiences.

Then came LanguaShrink—a tool that took prompt compression to a whole new level.

Inspired by insights that LLM performance is linked to the density and position of key information in prompts, LanguaShrink applies psycholinguistic principles and even taps into the Ebbinghaus memory curve. It’s designed to be task-agnostic, so it works across different kinds of prompts, and early results look promising.
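As a hedged illustration of that Ebbinghaus-style intuition (information near the edges of a sequence is retained best, so middle tokens are the first candidates for removal), here is a toy scorer of my own. To be clear, this is not LanguaShrink’s actual algorithm, which hasn’t been published:

```python
# Toy illustration of edge-biased token retention: score each token
# by an exponential decay in its distance from the nearest end of
# the prompt, then keep the highest-scoring tokens. My own sketch,
# not LanguaShrink's method.
import math

def retention_score(position, length, strength=0.15):
    """Ebbinghaus-like retention: decays with distance from either end."""
    dist_from_edge = min(position, length - 1 - position)
    return math.exp(-dist_from_edge / (strength * length))

def keep_most_memorable(tokens, budget):
    # Rank positions by retention score (stable sort keeps ties in order).
    ranked = sorted(
        range(len(tokens)),
        key=lambda i: retention_score(i, len(tokens)),
        reverse=True,
    )
    keep = sorted(ranked[:budget])  # restore original token order
    return [tokens[i] for i in keep]

tokens = "please translate this long sentence into formal French now".split()
print(keep_most_memorable(tokens, 4))
```

A real task-agnostic compressor would combine a positional prior like this with a content signal, since the most important token is not always at the edge.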

One caveat: they haven’t released the code just yet, but it’s definitely one to watch. Could this be the future of efficient prompt compression? Time will tell.

Transform the way you run your business using AI (Extended Labour Day Sale) 💰

Imagine a future where your business runs like a well-oiled machine, effortlessly growing and thriving while you focus on what truly matters.
This isn't a dream—it's the power of AI, and it's within your reach.

Join this AI Business Growth & Strategy Masterclass and discover how to revolutionize your approach to business.
In just 4 hours, you’ll gain the tools, insights, and strategies to not just survive, but dominate your market.

What You’ll Experience: 
🌟 Discover AI techniques that give you a competitive edge
💡 Learn how to pivot your business model for unstoppable growth
💼 Develop AI-driven strategies that turn challenges into opportunities
⏰ Free up your time and energy by automating the mundane, focusing on what you love

🗓️ Tomorrow | ⏱️ 10 AM EST

This is more than just a workshop—it's a turning point.
The first 100 to register get in for FREE. Don’t miss the chance to change your business trajectory forever.

🌸 Podcasts

There’s a lot more I could write about, but I figure very few people read this far anyway. If you did, you’re amazing and I appreciate you!

Love MusingsOnAI? Tell your friends!

If your company is interested in reaching an audience of AI professionals and decision-makers, reach us.

If you have any comments or feedback, just respond to this email!

Thanks for reading. Let’s explore the world together!

Raahul
