Enhancing Language Models: The Role of Language Agents in Cost-Effective Thinking

Researchers have created an agent designed to enhance the cognitive abilities of large language models (LLMs).
The LLMs that are becoming increasingly prevalent in the technology sector are far from cheap. For example, building a prominent LLM like GPT-4 has cost around $100 million. That figure covers legal fees related to training data, computing power for models with billions or trillions of parameters, the energy and water needed for processing, and the work of numerous developers creating the training algorithms needed for iterative learning.

However, what alternatives exist for a researcher who needs help with a specialized task that a machine could handle more effectively, but who lacks the resources of a large institution like Washington University in St. Louis? Or suppose a parent wants to prepare their child for a challenging exam and needs to provide numerous examples of how to solve complex math problems.

Creating their own LLM would be a daunting and costly undertaking given the expenses mentioned above. Furthermore, using larger models like GPT-4 and Llama 3.1 directly may not be the best fit for the intricate reasoning required by tasks involving logic and mathematics.

A more economical version of an LLM thinker, akin to a generic brand in generative AI, would be advantageous.

To address this issue, researchers at WashU developed an autonomous agent that instructs the reasoning process of large language models. The agent produces a single set of instructions for each task, and those instructions have proven very effective at improving the reasoning of various LLMs across different tasks, according to research led by Chenguang Wang, an assistant professor in computer science and engineering, along with Dawn Song, a professor at the University of California, Berkeley.

Contributors to this research include WashU PhD students Nicholas Crispino and Kyle Montgomery and research analyst Fankun Zeng, who recently presented their findings at a machine learning conference.

This “agent” operates as a large LLM that assists in formulating step-by-step instructions based on information gathered from the internet, according to Crispino. By providing basic task details such as the name of the dataset and a few input examples, the agent generates high-quality instructions for various tasks.

These instructions help guide smaller LLMs in completing specific tasks. This approach offers a more cost-effective solution for generative AI because the large LLM needs to be used only once per dataset, after which the instructions can be handed over to a smaller LLM that takes over the task.
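The two-stage flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the researchers' implementation: `large_model` and `small_model` are hypothetical callables standing in for whatever LLM clients are used, the prompt wording is invented for the example, and the web lookup the agent performs is left out entirely.

```python
from typing import Callable, Dict, List

# Hypothetical type: any function that maps a prompt string to a completion string.
Model = Callable[[str], str]


class InstructionCache:
    """Generates task instructions once per dataset and reuses them afterwards."""

    def __init__(self, large_model: Model) -> None:
        self.large_model = large_model
        self._cache: Dict[str, str] = {}
        self.large_calls = 0  # how many times the expensive model was invoked

    def instructions_for(self, dataset_name: str, examples: List[str]) -> str:
        # Only call the large model the first time a dataset is seen.
        if dataset_name not in self._cache:
            prompt = (
                f"Dataset: {dataset_name}\n"
                "Example inputs:\n"
                + "\n".join(f"- {e}" for e in examples)
                + "\nWrite step-by-step instructions for solving this task."
            )
            self._cache[dataset_name] = self.large_model(prompt)
            self.large_calls += 1
        return self._cache[dataset_name]


def answer(small_model: Model, instructions: str, question: str) -> str:
    """Hand the cached instructions plus one question to the cheaper model."""
    prompt = f"Instructions:\n{instructions}\n\nInput: {question}\nAnswer:"
    return small_model(prompt)


# Stand-in models so the sketch runs without any API access (assumptions).
def fake_large_model(prompt: str) -> str:
    return "1. Identify what is asked. 2. Work through the steps. 3. State the final answer."


def fake_small_model(prompt: str) -> str:
    return f"[small model received {len(prompt)} characters]"


if __name__ == "__main__":
    cache = InstructionCache(fake_large_model)
    for q in ["What is 17 * 3?", "What is 12 + 30?"]:
        instructions = cache.instructions_for("grade-school-math", ["What is 2 + 2?"])
        print(answer(fake_small_model, instructions, q))
    print("large model calls:", cache.large_calls)
```

The cache is the design point here: however many questions are asked about one dataset, the expensive model is consulted a single time.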

“We utilize the expensive model a single time to create effective instructions that direct the reasoning or thinking process of a less costly model,” explained Crispino.
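A back-of-the-envelope calculation shows why a single expensive call can pay off. The cost units below are made-up placeholders chosen for easy arithmetic, not prices or figures from the research, and the sketch ignores the fact that prompts carrying instructions are somewhat longer.

```python
def cost_large_only(n_queries: int, large_cost: int) -> int:
    """Total cost when every query goes to the expensive model."""
    return n_queries * large_cost


def cost_with_instructions(n_queries: int, instruction_cost: int, small_cost: int) -> int:
    """Total cost when the expensive model writes instructions once
    and the cheaper model answers every query."""
    return instruction_cost + n_queries * small_cost


if __name__ == "__main__":
    # Hypothetical units: 50 per large-model call, 1 per small-model call.
    n = 1000
    print("large model for every query:", cost_large_only(n, 50))
    print("instructions once, small model after:", cost_with_instructions(n, 50, 1))
```

The one-time instruction cost is spread over every query in the dataset, so the saving grows with the number of queries.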

Montgomery added, “Our method significantly enhances the performance of leading large language models.”

The team tested their budget-friendly method, called Zero-Shot AgentInstruct, on a range of language processing tasks and compared its effectiveness to zero-shot prompting on the LLMs Vicuna-13b, Llama-2-70b-chat, and GPT-3.5 Turbo.

Compared to “zero-shot chain of thought” prompting, which works by adding the phrase “let’s think step by step,” Zero-Shot AgentInstruct showed better performance across a variety of tasks evaluated on 29 datasets (including 53 subsets).
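The two prompting styles differ mainly in what surrounds the question. Zero-shot chain of thought appends the fixed phrase quoted above, while the instruction-guided style prepends task-specific instructions. The templates below are illustrative assumptions, not the exact wording used in the experiments.

```python
COT_TRIGGER = "Let's think step by step."


def zero_shot_cot_prompt(question: str) -> str:
    """Baseline: the same generic trigger phrase for every task."""
    return f"Q: {question}\nA: {COT_TRIGGER}"


def instruction_guided_prompt(question: str, instructions: str) -> str:
    """Instruction-guided: task-specific steps written ahead of time."""
    return f"Instructions:\n{instructions}\n\nQ: {question}\nA:"


if __name__ == "__main__":
    q = "A train travels 60 miles in 1.5 hours. What is its average speed?"
    print(zero_shot_cot_prompt(q))
    print()
    # Hypothetical instructions of the kind an agent might produce.
    print(instruction_guided_prompt(q, "1. Extract the quantities.\n2. Apply the relevant formula.\n3. State the answer with units."))
```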

“Our advancement in thinking and reasoning is impressive, especially in math and logic,” remarked Wang.

In essence, they capitalize on the strengths of powerful LLMs to break down tasks into step-by-step reasoning methods for the other model, akin to an experienced educator sharing insights with students.

“We’re exploring the limits of enhancing the reasoning abilities of smaller models using larger models without necessitating additional training,” Crispino commented.