🌻 E38: Jevons Paradox, Agent Updates & HelixFold3

🌸 Good Morning - Jevons Paradox of LLMs

Over the past few years, the cost per million tokens in large language models has dropped by a remarkable 79%, falling even faster than the growth in computing power predicted by Moore’s Law.

Let me take you back to the 19th century, when the British economist William Stanley Jevons (1835-1882) described what would later be known as the Jevons Paradox. James Watt's steam engine used significantly less coal than earlier designs, so many believed that coal consumption would decrease. The opposite occurred: cheaper steam power spread to more industries, and coal consumption in the UK soared.

Similarly, consider Apple’s iPod and iTunes, which many thought would simply make music more accessible. Instead, they revolutionized the music industry, leading to an unprecedented surge in music consumption.

The same dynamic explains the widespread adoption of large language models: as tokens get cheaper, usage grows, and everyone's buzzing about AI agents.

However, in real-world production, it's not just about deploying a single agent; we need a whole flock of them, 100 or more, working together seamlessly. From my experience over the past year, I've noticed that while agent integration succeeds 90-95% of the time, the pipeline as a whole often breaks down because full automation is still lacking. This is the real challenge. The company that can crack this 'last mile' will be the next unicorn.
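A rough back-of-the-envelope sketch shows why a 90-95% success rate is not enough at this scale. It assumes (my assumption, not a measured result) that the 90-95% figure applies to each agent step, that steps run in sequence, and that failures are independent. The function names are hypothetical.

```python
def pipeline_success_rate(per_step_success: float, num_steps: int) -> float:
    """Chance that every step in a sequential chain succeeds,
    assuming independent failures and no retries."""
    return per_step_success ** num_steps


def with_retries(per_step_success: float, max_retries: int) -> float:
    """Effective per-step success if a failed step is retried
    up to `max_retries` more times (attempts assumed independent)."""
    return 1.0 - (1.0 - per_step_success) ** (max_retries + 1)


if __name__ == "__main__":
    for p in (0.90, 0.95, 0.99):
        for n in (1, 10, 100):
            print(f"per-step {p:.2f}, {n:3d} steps -> "
                  f"end-to-end {pipeline_success_rate(p, n):.4f}")

    # Same 100-step chain, but each step may be retried twice.
    boosted = with_retries(0.95, max_retries=2)
    print(f"with 2 retries per step: {pipeline_success_rate(boosted, 100):.4f}")
```

Under these assumptions, a 100-step chain at 95% per step completes end to end less than 1% of the time, which is why retries, checkpoints, or human review matter more than squeezing out one more point of per-agent accuracy.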

Prompt: A man sleeps and does not dream!!!

🌸 From The Agent Community

🌼 A Text2SQL Debugger Agent

Mismatches between the generated SQL and the actual database, such as conditions that do not match stored values or violated constraints, often cause errors in real-life Text-to-SQL frameworks.

To tackle this, a tool-assisted agent framework for SQL inspection and refinement has been proposed, equipping LLMs with a retriever and a detector to diagnose and correct these mismatches.

The authors also introduce Spider-Mismatch, a dataset that reflects real-world condition mismatches. The method outperforms baselines on this dataset and achieves top performance in few-shot settings on the Spider and Spider-Realistic datasets.
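To make the retriever-and-detector idea concrete, here is a minimal sketch using only Python's standard library. It is not the paper's implementation: the function names, the `singer` table, and the "empty result means suspicious" heuristic are all my own illustrative assumptions. The detector runs a draft query and flags problems; the retriever looks up stored values that resemble the literal in the condition, which an LLM could then use to rewrite the query.

```python
import difflib
import sqlite3


def retrieve_similar_values(conn, table, column, literal, n=3):
    """Retriever (hypothetical): stored values in table.column that resemble `literal`."""
    rows = conn.execute(f'SELECT DISTINCT "{column}" FROM "{table}"').fetchall()
    values = [str(r[0]) for r in rows if r[0] is not None]
    lowered = {v.lower(): v for v in values}  # compare case-insensitively
    matches = difflib.get_close_matches(literal.lower(), list(lowered), n=n, cutoff=0.6)
    return [lowered[m] for m in matches]


def detect_problem(conn, sql):
    """Detector (hypothetical): a diagnostic string, or None if the query looks fine."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        return f"execution error: {exc}"
    if not rows:
        return "empty result: a condition may not match stored values"
    return None


# Toy database standing in for a real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO singer VALUES (?, ?)",
    [("Ann", "United States"), ("Bo", "France")],
)

# Draft SQL as an LLM might write it: the literal's casing differs from the data.
draft_sql = "SELECT name FROM singer WHERE country = 'united states'"
print(detect_problem(conn, draft_sql))
print(retrieve_similar_values(conn, "singer", "country", "united states"))
```

In a full agent loop, the diagnostic and the retrieved candidates would be fed back to the model as tool output so it can propose a corrected query.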

🌼 WebPilot - Autonomous Multi-Agent System for Web Task Execution

LLM-based autonomous agents often struggle with complex web tasks due to the unpredictable nature of these environments. Traditional agents rely on rigid, expert-designed policies, limiting their adaptability to new tasks. Unlike humans, who adapt through exploration, these agents lack flexibility.

WebPilot is a multi-agent system that enhances Monte Carlo Tree Search (MCTS) with a dual optimization strategy. The Global Optimization phase breaks tasks into subtasks, refining plans based on new observations.

The Local Optimization phase uses tailored MCTS to handle uncertainties and refine decisions. A 93% increase in success rate on WebArena marks a significant advancement in autonomous agent capabilities. It looks similar to MindSearch. The code is still not available, but I want to explore more.
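For readers new to MCTS, the selection step at its core can be sketched as follows. This is the textbook UCT rule, not WebPilot's tailored variant (its code is not public), and the `Node` class, the action names, and the exploration constant are illustrative assumptions.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    action: str                      # e.g. a candidate web action
    visits: int = 0
    total_reward: float = 0.0
    children: List["Node"] = field(default_factory=list)


def uct_score(child: Node, parent_visits: int, c: float = 1.4) -> float:
    """Standard UCT: average reward plus an exploration bonus
    that shrinks as the child is visited more often."""
    if child.visits == 0:
        return math.inf              # always try unvisited actions first
    exploit = child.total_reward / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def select_child(parent: Node) -> Node:
    """Pick the child with the highest UCT score."""
    return max(parent.children, key=lambda ch: uct_score(ch, parent.visits))


root = Node(action="root", visits=10)
root.children = [
    Node(action="click_search", visits=8, total_reward=4.0),
    Node(action="type_query", visits=2, total_reward=1.5),
]
print(select_child(root).action)
```

The less-visited action wins here because its exploration bonus is larger, which is the mechanism that lets a search-based agent keep probing uncertain options instead of committing to the first plan that worked.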

🌸 Choice Cuts

🌼 HelixFold3, a PaddlePaddle-based model that replicates AlphaFold3, has been released as open source. Baidu's PaddleHelix team has matched AlphaFold3's performance with HelixFold3 and made it available to the public.

🌼 The Mamba in the Llama

🌼 GPT-4 is costly for scraping.

🌼 100M context window - A 100M context window means it can probably store everything you’ve ever told it for years.

🌸 Podcasts

Love MusingsOnAI? Tell your friends!

If your company is interested in reaching an audience of AI professionals and decision-makers, reach us.

If you have any comments or feedback, just respond to this email!

Thanks for reading. Let’s explore the world together!

Raahul
