šŸŒ» E40 - Is ChatGPT Changing the Way We Speak?

Have you noticed how often words like ā€˜delveā€™ and ā€˜realmā€™ are being used lately?

Every day, ChatGPT handles more than 1.5 million requests. Think about that for a second.

Just this morning, I used a tool called Cursor seven times in the span of two hours while building an application. Youā€™re likely using AI tools tooā€”maybe even without realizing it. Whether weā€™re typing out prompts or letting these AI systems assist us, one thing is clear: theyā€™re deeply integrated into our lives. And now, itā€™s starting to show in an unexpected wayā€”our language.

That raises a fascinating question: Could ChatGPT actually be changing the way we speak?

To answer this, researchers from the Center for Humans and Machines and the Center for Adaptive Rationality at the Max Planck Institute for Human Development in Germany dove into the data. They analyzed a massive set of roughly 280,000 English-language videos (presentations, talks, and speeches) drawn from more than 20,000 YouTube channels tied to academic institutions.

What they found was significant. Certain words, patterns, and styles of communication unique to AI, especially ChatGPT, are starting to seep into how we speak. Itā€™s subtle, but itā€™s there. And it may just be the beginning of a much larger shift in human culture.

For Those Who Seek Unbiased News.

Be informed with 1440! Join 3.5 million readers who enjoy our daily, factual news updates. We compile insights from over 100 sources, offering a comprehensive look at politics, global events, business, and culture in just 5 minutes. Free from bias and political spin, get your news straight.

šŸŒøChoice Cuts

šŸŒ¼ Bigger Isnā€™t Always Better: Rethinking Synthetic Data Generation

Iā€™ve always had this intuition: a bigger model can generate better synthetic data.

It makes sense, right? When weā€™re training a smaller, task-specific modelā€”especially for something like math or programmingā€”we rely heavily on synthetic data. This data helps cover all the permutations and combinations of real-world scenarios.

We even apply strategies like Chain of Thought (CoT) or ReAct as templates on top of the synthetic datasets to optimize training. So naturally, I thought a larger model would generate richer variations of data.
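To make the template idea concrete, here's a minimal sketch of wrapping a synthetic Q/A pair in a Chain-of-Thought format before fine-tuning. The template string and function names are my own illustration, not from any specific framework:

```python
# Hypothetical CoT template for formatting synthetic training examples.
COT_TEMPLATE = (
    "Question: {question}\n"
    "Let's think step by step.\n"
    "{reasoning}\n"
    "Answer: {answer}"
)

def to_cot_example(question, reasoning, answer):
    """Format one synthetic sample as a CoT-style training example."""
    return COT_TEMPLATE.format(question=question, reasoning=reasoning, answer=answer)

sample = to_cot_example(
    question="What is 12 * 7?",
    reasoning="12 * 7 = 10 * 7 + 2 * 7 = 70 + 14 = 84.",
    answer="84",
)
```

The same Q/A pair can be re-wrapped in a different template (e.g. a ReAct-style thought/action trace) without regenerating the underlying data.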

But I overlooked one critical issue: bigger models can also hallucinate more.

Thatā€™s where smaller models come into play. When fine-tuned for a specific task, they can actually generate more reliable synthetic data, with far less risk of hallucination.

Google even experimented with this approach, and the results were eye-opening. Smaller, task-specific models can offer a more precise, cost-effective solution for synthetic data generation.

Itā€™s a fascinating shift in thinkingā€”bigger isnā€™t always better, especially when it comes to training models for specific tasks.

From the paper itself (WC = weaker but cheaper; SE = stronger but more expensive):

"Our findings reveal that models finetuned on WC-generated data consistently outperform those trained on SE-generated data across multiple benchmarks and multiple choices of WC and SE models. These results challenge the prevailing practice of relying on SE models for synthetic data generation, suggesting that WC may be the compute-optimal approach for training advanced LM reasoners."
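For intuition on why this comparison is "compute-optimal": sampling cost per token scales roughly linearly with parameter count, so at a fixed FLOPs budget a weaker/cheaper model affords proportionally more samples than a stronger/more expensive one. A back-of-the-envelope sketch (the parameter counts below are hypothetical):

```python
def compute_matched_samples(se_params_b, wc_params_b, se_samples_per_question):
    """At a fixed sampling FLOPs budget, cost per token scales ~linearly
    with parameter count, so a weaker/cheaper (WC) model affords about
    (P_SE / P_WC) times as many samples as the stronger (SE) model."""
    return int(se_samples_per_question * se_params_b / wc_params_b)

# e.g. a hypothetical 27B SE model vs a 9B WC model:
print(compute_matched_samples(27, 9, se_samples_per_question=1))  # → 3
```

More samples per question means broader coverage of solution paths, which is part of why the cheaper model's data can win despite lower per-sample quality.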

I used to be a huge fan of LLMLingua. Whenever I retrieved data from multiple sources or agents, Iā€™d run it through LLMLingua as a prompt compressor.

The goal? To cut down on the cost of using generative language models without sacrificing quality.

For the most part, it worked well. But when it hit production, the feedback was mixedā€”some developers loved it, others werenā€™t so sure. One of my brilliant colleagues even wrote a blog about it, sharing their thoughts and experiences.
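For readers new to prompt compression, here's a toy sketch of the underlying idea: drop the least informative tokens so the prompt shrinks while (hopefully) preserving meaning. This is not LLMLingua's actual algorithm, which scores tokens with a small language model's perplexity; the crude frequency heuristic below just illustrates the concept:

```python
from collections import Counter

def compress_prompt(text, rate=0.5):
    """Toy compressor: treat frequent words as low-information and drop
    them first, keeping roughly `rate` of the words in original order.
    Real compressors like LLMLingua use an LM's perplexity, not counts."""
    words = text.split()
    freq = Counter(w.lower().strip(".,") for w in words)
    keep = max(1, int(len(words) * rate))
    # Rank positions from least to most frequent word, keep the rarest.
    ranked = sorted(range(len(words)),
                    key=lambda i: freq[words[i].lower().strip(".,")])
    kept = sorted(ranked[:keep])
    return " ".join(words[i] for i in kept)
```

For example, `compress_prompt("the the the cat sat on the mat", rate=0.5)` keeps only the content words. A perplexity-based compressor makes the same kind of keep/drop decision, just with a far better estimate of each token's information content.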

Then came LanguaShrinkā€”a tool that took prompt compression to a whole new level.

Inspired by insights that LLM performance is linked to the density and position of key information in prompts, LanguaShrink applies psycholinguistic principles and even taps into the Ebbinghaus memory curve. Itā€™s designed to be task-agnostic, so it works across different kinds of prompts, and early results look promising.
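The Ebbinghaus forgetting curve models retention as R = exp(-t/S), where t is elapsed time and S is a memory-strength constant. As a purely hypothetical illustration of how a position-aware compressor might weight prompt chunks (my own sketch, not LanguaShrink's published method), one could decay each chunk's importance with distance from the start:

```python
import math

def retention_weights(n_chunks, strength=5.0):
    """Hypothetical: weight prompt chunks by the Ebbinghaus curve
    R = exp(-t / S), with t = distance from the prompt's start.
    Earlier chunks get higher weights, mimicking primacy effects."""
    return [math.exp(-t / strength) for t in range(n_chunks)]

weights = retention_weights(4)
```

A compressor could then spend more of its token budget on high-weight chunks and prune aggressively in the low-weight middle, where models are known to lose information.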

One caveat: they havenā€™t released the code just yet, but itā€™s definitely one to watch. Could this be the future of efficient prompt compression? Time will tell.

Transform the way you run your business using AI (Extended Labor Day Sale)šŸ’°

Imagine a future where your business runs like a well-oiled machine, effortlessly growing and thriving while you focus on what truly matters.
This isn't a dreamā€”it's the power of AI, and it's within your reach.

Join this AI Business Growth & Strategy Masterclass and discover how to revolutionize your approach to business.
In just 4 hours, youā€™ll gain the tools, insights, and strategies to not just survive, but dominate your market.

What Youā€™ll Experience: 
šŸŒŸ Discover AI techniques that give you a competitive edge
šŸ’” Learn how to pivot your business model for unstoppable growth
šŸ’¼ Develop AI-driven strategies that turn challenges into opportunities
ā° Free up your time and energy by automating the mundane, focusing on what you love

šŸ—“ļø Tomorrow | ā±ļø 10 AM EST

This is more than just a workshopā€”it's a turning point.
The first 100 to register get in for FREE. Donā€™t miss the chance to change your business trajectory forever.

šŸŒø Podcasts

Thereā€™s a lot more I could write about, but I figure very few people will read this far anyway. If you did, youā€™re amazing and I appreciate you!

Love MusingsOnAI? Tell your friends!

If your company is interested in reaching an audience of AI professionals and decision-makers, reach us.

If you have any comments or feedback, just respond to this email!

Thanks for reading. Letā€™s explore the world together!

Raahul
