Blogroll

How to try GPT-5.2, the new ChatGPT model series from OpenAI

Mashable - Thu, 12/11/2025 - 20:54

OpenAI has announced it's launching GPT-5.2 effective immediately, with access rolling out in phases starting Dec. 11.

This is the second major update to GPT-5, the long-awaited OpenAI model that launched on Aug. 7.

In a blog post announcing the new series of ChatGPT models, the company called GPT-5.2 the "most capable model series yet for professional knowledge work."

GPT-5.2 is actually a series of models; the series itself is part of the larger GPT-5 series. Here's what we're looking at:

  • GPT-5.2 Instant

  • GPT-5.2 Thinking

  • GPT-5.2 Pro

When can you start using GPT-5.2?

With a Dec. 11 release date, GPT-5.2 is available now — though not all users will see it listed as an available model in ChatGPT.

As with previous model rollouts, OpenAI will first deploy the model to paid users on Plus, Pro, Go, Business, and Enterprise plans.

Even for paying customers, OpenAI tends to make its new models available in phases — so if it's not available immediately via the OpenAI API or ChatGPT, keep checking.

"We deploy GPT‑5.2 gradually to keep ChatGPT as smooth and reliable as we can," the company wrote. "If you don’t see it at first, please try again later. In ChatGPT, GPT‑5.1 will still be available to paid users for three months under legacy models, after which we will sunset GPT‑5.1."

If you have a paid ChatGPT account, go to ChatGPT.com to see if the model is available. Eventually, all ChatGPT users should have free access to GPT-5.2, though that access will be limited on free membership plans.

GPT-5.2 benchmarks and performance

OpenAI published a new system card that shows GPT-5.2 makes iterative improvements on key benchmarks compared to GPT-5 and 5.1. In addition, OpenAI said that GPT-5.2 hallucinates less than previous models.

  • GDPval: GPT-5.2 scored 70.9 percent, while GPT-5 scored 38.8 percent

  • SWE-Bench Pro: GPT-5.2 scored 55.6 percent, while GPT-5 scored 50.8 percent

  • AIME 2025 math: GPT-5.2 scored 100 percent, while GPT-5 scored 94 percent

  • FrontierMath (Tier 1–3): GPT-5.2 scored 40.3 percent, while GPT-5 scored 31.0 percent

Disclosure: Ziff Davis, Mashable’s parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.

Categories: IT General, Technology

OpenAI launches GPT-5.2, claiming it hallucinates less and responds better to mental illness

Mashable - Thu, 12/11/2025 - 20:23

OpenAI announced today that it's launching GPT-5.2, the newest model in its GPT-5 series. The new model will start rolling out immediately, with paid ChatGPT customers getting access first.

In a blog post announcing the new model — which is actually a series of models, comprised of GPT‑5.2 Instant, GPT-5.2 Thinking, and GPT-5.2 Pro — OpenAI said that GPT-5.2 makes noticeable improvements in math and science, imaging, coding, handling agentic tasks, and overall accuracy. The company called GPT-5.2 its "most capable model series yet for professional knowledge work."

The new model comes at a difficult time for OpenAI, which is rumored to be in a "code red" state over stronger competition from rivals like Google Gemini and spreading fears of an AI bubble.

Ever since it launched ChatGPT in 2022, OpenAI has been securely on top of the AI industry. However, the company is in an increasingly precarious position. Google has an almost unfathomable amount of training data at its disposal, and Google AI products like Gemini 3, Veo 3, and Nano Banana have outperformed GPT-5, the new model OpenAI launched earlier this year, in many respects.

Still, ChatGPT is by far the most popular AI chatbot in the world, with an estimated 700 million weekly active users.

How to try GPT-5.2

The new GPT-5.2 models will start rolling out immediately, though access may not be available right away to all users. As per usual, OpenAI will launch the models to paid users on the Plus, Pro, Go, Business, and Enterprise accounts.

As of this writing, GPT-5.2 was not yet available for this reporter, and the rollout will likely happen in phases.

"We deploy GPT‑5.2 gradually to keep ChatGPT as smooth and reliable as we can; if you don’t see it at first, please try again later," OpenAI wrote in a blog post. "In ChatGPT, GPT‑5.1 will still be available to paid users for three months under legacy models, after which we will sunset GPT‑5.1."

OpenAI says GPT-5.2 makes key improvements in safety, accuracy, and performance benchmarks

The AI industry relies on standardized benchmark tests to demonstrate how well models perform, and companies like OpenAI also have their own internal tests. In addition, AI leaderboards like LMArena let users compare and rank various AI models. While GPT-5.2 has already appeared near the top of LMArena's AI coding leaderboard, it will take more time to see how users rate the new series of models against the competition. However, OpenAI released a new model card for GPT-5.2 on Dec. 11, which shows that the model makes across-the-board improvements in a variety of areas, which isn't surprising.

Most notably, OpenAI says that GPT-5.2 is more accurate and will produce fewer hallucinations compared to GPT-5.1. OpenAI's documentation states that GPT-5.2 Thinking has an average hallucination rate of 10.9 percent, compared to 16.8 percent and 12.7 percent for GPT-5 Thinking and GPT-5.1 Thinking, respectively. When GPT-5.2 is given access to the web via a browser, its hallucination rate drops to 5.8 percent.

In its blog post, OpenAI also states that GPT-5.2 scores more highly on benchmark tests for coding, science and math, performing economically valuable tasks, computer vision, and agentic work involving third-party tools. OpenAI also highlighted GPT-5.2's improved abilities with spreadsheets, in particular.

OpenAI says GPT-5.2 is safer for users with mental health problems

Lately, OpenAI has been accused of endangering ChatGPT users with mental health issues. Due to well-documented sycophancy problems, ChatGPT reportedly encouraged delusions and conspiratorial thinking in some users, who later died by suicide. OpenAI is now facing wrongful death suits, including a new suit that was just revealed for the first time today by the Wall Street Journal, in which a ChatGPT user killed himself shortly after killing his own mother.

OpenAI says that according to its internal tests, GPT-5.2 has a better response to users with mental health problems.

"With this release, we continued our work to strengthen our models’ responses in sensitive conversations⁠, with meaningful improvements in how they respond to prompts indicating signs of suicide or self harm, mental health distress, or emotional reliance on the model. These targeted interventions have resulted in fewer undesirable responses in both GPT‑5.2 Instant and GPT‑5.2 Thinking as compared to GPT‑5.1 and GPT‑5 Instant and Thinking models."

Mashable has not been able to independently verify these results, and the GPT-5.2 system card has scant details on how safety performance was measured in this context.

For more information, check out the OpenAI blog post announcing GPT-5.2 or read the new GPT-5.2 system card.

If you're feeling suicidal or experiencing a mental health crisis, please talk to somebody. You can call or text the 988 Suicide & Crisis Lifeline at 988, or chat at 988lifeline.org. You can reach the Trans Lifeline by calling 877-565-8860 or the Trevor Project at 866-488-7386. Text "START" to Crisis Text Line at 741-741. Contact the NAMI HelpLine at 1-800-950-NAMI, Monday through Friday from 10:00 a.m. – 10:00 p.m. ET, or email info@nami.org. If you don't like the phone, consider using the 988 Suicide and Crisis Lifeline Chat. Here is a list of international resources.

Disclosure: Ziff Davis, Mashable’s parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.

Categories: IT General, Technology

Why your Home Assistant automations feel slow (and how to speed them up)

How-To Geek - Thu, 12/11/2025 - 20:00

Sluggish automations can make your smart home feel unresponsive, which can ruin the user experience. Diagnosing the issue can be difficult, since there are so many variables you have to consider. Here are some things you should start with when trying to find the cause of your issues.

Categories: IT General, Technology

These 5 Lexus SUVs are ultra reliable and less than 5 years old

How-To Geek - Thu, 12/11/2025 - 19:30

Lexus has built its reputation on long-term dependability, and the brand’s recent SUVs are some of the strongest examples of that legacy. In a market where new-car prices routinely exceed $50,000, finding a nearly new luxury SUV that you can trust for the long haul has never mattered more.

Categories: IT General, Technology

Take a full $1,000 off this Bluetti portable power station and two solar panels

Mashable - Thu, 12/11/2025 - 19:18

SAVE $1,000: As of Dec. 11, the Bluetti Elite 200 V2 solar generator with two 200W solar panels is just $1,299 at Amazon. That's 43% or a full $1,000 off its list price.

Solar generators (aka portable power stations) can be a real lifesaver in many circumstances. Maybe you have a camping trip coming up, but you still need to access your computer for work. Or maybe you live in an area prone to power outages. In any case, having immediate access to one of these gadgets can keep you up and running. They're not the cheapest investments, so when you find a deal — like this Bluetti Elite 200 V2 bundle — it's worth grabbing.

As of Dec. 11, you can pick up the Bluetti Elite 200 V2 solar generator along with two 200W solar panels for only $1,299 at Amazon. That's 43% or $1,000 in savings. That's also cheaper than buying just the solar generator on its own at full price (reg. $1,699).

The Elite 200 V2 packs a 2,073Wh-capacity battery with a 2,600W (3,900W Power Lifting) AC output. That's enough to power nine devices simultaneously via four AC outlets, two 15W USB-A ports, two 100W USB-C ports, and a 120W car port. Meanwhile, it weighs just over 50 pounds and is the size of a small cooler. It's hefty, but still small enough to fit snugly in a corner of your home or portable enough to take on camping trips, tailgates, and other off-grid adventures.

The power station itself can be charged up via a typical AC outlet, with a car port, or with solar power. This bundle includes two 200W solar panels, so you'll be all set. With a 17-year lifespan, it's a smart long-term investment for your home and adventures. And knowing it will last so long definitely makes the price more justified. Save $1,000 while this deal is live.

Categories: IT General, Technology

This hilarious new Netflix mini-series is the perfect Christmas watch

How-To Geek - Thu, 12/11/2025 - 19:00

Christmas is the time of year when all the best holiday TV specials and movies are released, and this season is no different. Rowan Atkinson’s hotly anticipated comedy miniseries is finally out, making it the perfect time to settle on the couch and enjoy it with the family.

Categories: IT General, Technology

Android now lets you share live video on emergency calls

Mashable - Thu, 12/11/2025 - 18:30

Android users can now share live video of their environment while reaching out for emergency services, providing more context and assistance in potentially dire situations.

Launched on Wednesday, Emergency Live Video on Android lets emergency dispatchers request a live video feed while talking to a person who may be in danger. When initiated, callers are prompted with a single-tap button on their screen, which then provides a live video feed to dispatchers who can assess the environment or provide more specific medical assistance, like CPR instructions.

A Google blog post states that the resource works for both emergency calls and texts, and the footage is encrypted by default. The company also says it is looking to partner with global public safety organizations to expand the feature's reach.

Google has launched a variety of emergency safety features across its Android devices, including crash and fall detection, Emergency SOS, and satellite-based services. With the most recent Android 16, Google also added new accessibility features, parental controls, and the ability to mark communications as urgent.

Emergency Live Video has already rolled out to U.S. devices and emergency service calls in select regions of Germany and Mexico. In order to use the feature, devices must be running Android 8+ with Google Play services.

Categories: IT General, Technology

South Park closes Season 28 with a horrifying Jeffrey Epstein twist

Mashable - Thu, 12/11/2025 - 18:28

Most Christmas specials don't feature President Donald Trump giving Vice President JD Vance genital warts, Jesus Christ committing domestic violence, and Satan mpreg. But then again, most Christmas specials aren't South Park.

The show's Season 28 finale doubled as its holiday special, which meant creators Trey Parker and Matt Stone had to wrap up all of Season 27 and 28's fittingly unhinged Trump critiques in a neat Christmas bow.

Most pressingly, South Park had to conclude the tale of Satan and Trump's baby. As the episode starts, Satan is almost due and busy decorating the White House for the Antichrist's impending arrival. However, when he learns that Trump and Vance have absconded to Colorado in order to free Peter Thiel from jail and kill the Antichrist, Satan follows them to South Park to give Trump a piece of his mind.

What follows is a showdown where Jesus, Stan, and the Satanic Woodland Critters from South Park's Season 8 Christmas special square off against Trump, Vance, Peter Thiel, Secretary of Defense Pete Hegseth, and dirtbag Jesus. Due to Stan's pleas for a Christmas miracle, Jesus recognizes the error of his ways and drops his allegiance to Trump.

While that marks a happy ending for one of Season 28's most upsetting storylines — that of Jesus embracing a conservative, hateful version of Christianity — the same can't be said for the Antichrist storyline. Here, South Park goes as dark as it possibly can.

What happens to Trump and Satan's baby in South Park?

Satan never gives birth to the Antichrist. Instead, according to an ultrasound, the baby appears to have died by suicide in the womb.

"It appears that at some point when nobody was watching, the baby hung itself and took its own life," a doctor tells Trump. He then shows him the ultrasound, saying, "I’m afraid you can see it all in the video. The baby got hold of some bedsheets. There are a couple of minutes missing from the ultrasound, but it’s definitely a suicide."

The death is a reference to that of Jeffrey Epstein, who hanged himself with bedsheets in his cell at New York’s Metropolitan Correctional Center while awaiting trial for federal sex trafficking charges. Just like with the ultrasound in South Park, there is missing footage from security cameras outside Epstein's cell on the night of his death, leading to theories that Epstein did not die by suicide.

Trump celebrates the death of the Antichrist at his holiday party, dancing around figures like Apple CEO Tim Cook and Attorney General Pam Bondi, both of whom played roles in South Park Seasons 27 and 28. Satan, for his part, packs up all of his baby supplies and leaves the White House. It's a genuinely sobering moment in an episode packed with madcap shock value, including the darkly horrifying Epstein twist.

With that, South Park wishes us a Merry Christmas and closes out a two-season run that angered the White House, drew ratings highs, and managed to combine memes like 6-7 and Labubus with real-world atrocities like the Trump administration's ICE raids.

South Park is now streaming on Paramount+.

If you're feeling suicidal or experiencing a mental health crisis, please talk to somebody. You can call or text the 988 Suicide & Crisis Lifeline at 988, or chat at 988lifeline.org. You can reach the Trans Lifeline by calling 877-565-8860 or the Trevor Project at 866-488-7386. Text "START" to Crisis Text Line at 741-741. Contact the NAMI HelpLine at 1-800-950-NAMI, Monday through Friday from 10:00 a.m. – 10:00 p.m. ET, or email info@nami.org. If you don't like the phone, consider using the 988 Suicide and Crisis Lifeline Chat. Here is a list of international resources.

Categories: IT General, Technology

Obsidian 1.11.0 brings markdown links to properties and safer renaming

How-To Geek - Thu, 12/11/2025 - 18:28

Obsidian 1.11.0 Early Access is officially rolling out. It completely changes how you access your notes on the go, thanks to a massive focus on widgets and deep system integration across both iOS and Android. There are also many upgrades for desktop users.

Categories: IT General, Technology

Supergirl teaser trailer delivers party vibes, action, and attitude

Mashable - Thu, 12/11/2025 - 18:16

James Gunn's relaunch of the DC Universe is looking up. Following the box office success of Superman, this resurging superhero franchise is next offering Supergirl, a flashy spinoff that's giving Guardians of the Galaxy energy with this first trailer.

Maybe it's Blondie blasting "Call Me" across footage of Supergirl partying and crashing alongside her lovable superpup, Krypto. Maybe it's the collision of outer space shenanigans, rock 'n' roll, and eye-popping action. But we are seated to see House of the Dragon's Milly Alcock reprise the role of Kara, Superman's enchantingly chaotic cousin.

Based on the comic mini-series Supergirl: Woman of Tomorrow, written by Tom King and illustrated by Bilquis Evely, this movie will follow Kara on her party odyssey, which crashes into a quest to help an alien girl named Ruthye Marye Knoll (Eve Ridley) seek justice.

Craig Gillespie (I, Tonya, Cruella) directs Supergirl, with a screenplay by Ana Nogueira. Joining Alcock and Ridley in the superhero action are Matthias Schoenaerts, David Krumholtz, Emily Beecham, and Jason Momoa, who is not playing Aquaman here. Nearly a year ago, Momoa announced on Instagram he'd been asked to play Lobo, a hulking anti-hero who works as an interstellar bounty hunter.

What else can you glean from this teaser? Sound off in the comments.

Supergirl opens only in theaters June 26, 2026.

Categories: IT General, Technology

If you haven't streamed these 4 new Christmas movies, you're missing out

How-To Geek - Thu, 12/11/2025 - 18:15

We all love the Christmas classics like Elf and Home Alone, but it's also worth checking out some new festive treats. JustWatch analyzed the Christmas movies that were released this year to see which are the most popular, and these are the results.

Categories: IT General, Technology

Coupon alert: The M3 iPad Air is on sale for under $650 thanks to this Amazon coupon

Mashable - Thu, 12/11/2025 - 18:11

SAVE $150: The M3 iPad Air, 13-inch (128GB, WiFi) is on sale at Amazon in the purple colorway for $649.99, down from the usual price of $799.99. That's a 19% discount and a price that matches the record low.

As shopping experts who browse each day for the best deals, we find it especially satisfying to uncover a hidden coupon at Amazon. Tucked into the options of colorways, storage sizes, and connectivity options, we stumbled onto a lovely iPad coupon that could be excellent if you need an upgrade.

As of Dec. 11, the M3 iPad Air, 13-inch (128GB, WiFi, purple) is on sale for $649.99 at Amazon, marked down from the usual price of $799.99. That's a 19% discount that shaves $150 off the price. Be sure to clip the on-page coupon to score this iPad Air for a record-low price.

As long as you don't totally hate the color purple, this iPad deal will get you an iPad Air for under $650. The M3 iPad Air is one of Mashable's favorite tablets of 2025, earning the honor of the best iPad upgrade. Mashable went with the less expensive 11-inch model, but today's deal applies to the larger 13-inch display. That extra display size will be a nice addition while watching movies, reading e-books, or creating your next masterpiece artwork.

One of the highlight features that earned the M3 iPad Air a place on the Mashable list is the powerful M3 chip, which allows you to both edit videos and play games. That's not something every tablet can say.

After testing tablets, Mashable also rated this model as a great option for creators. "We also think the Apple iPad Air is the best tablet for creatives and artists," according to the Mashable shopping team who chose the best tablets of 2025. "Content creators can get plenty done on the go with the featherlight and ferocious Apple iPad Air, and it works well as a drawing tablet."

In Mashable's full review of the M3 iPad Air, Senior Editor Stan Schroeder got several days of battery life from the iPad when using it for browsing and reading. It got 14 hours and 46 minutes of battery life when looping TikTok videos at 50% brightness.

While this coupon is still live, get the purple M3 iPad Air, 13-inch (128GB, WiFi) for $649.99 at Amazon. Keep in mind it might not arrive in time for Christmas depending on your delivery address, but who said a late Christmas present isn't just as exciting?

Categories: IT General, Technology

Internet reacts to architects of AI being named Time's 2025 Person of the Year

Mashable - Thu, 12/11/2025 - 18:03

Although it leaked online before the announcement, Time magazine made its 2025 Person of the Year official on Thursday morning: It's the "architects of AI."

Time magazine's official cover(s) for the 2025 Person of the Year edition include Meta CEO Mark Zuckerberg, AMD CEO Lisa Su, Tesla and xAI CEO Elon Musk, Nvidia CEO Jensen Huang, OpenAI CEO Sam Altman, Google DeepMind CEO Demis Hassabis, Anthropic CEO Dario Amodei, and the "godmother of AI" Fei-Fei Li.

However, Time's accolade arrives as generative AI faces increasing blowback over concerns about privacy and inaccuracies, its environmental impact, mass production of slop, and copyright issues. Still, more companies have started using AI for creative works, such as commercials. Coca-Cola and McDonald's are just two examples of big brands that published AI-generated holiday commercials – only to retract the ads after receiving criticism from consumers.

Those anti-AI feelings appear to be spilling over into Time's 2025 Person of the Year announcement as well.

On X, users commented about AI training off of intellectual property it may not have consent to use.

Others knocked how Time was celebrating the CEOs of tech companies rather than the "engineers and researchers who actually build AI."

Some X users focused on Time's choice of a cover for the Person of the Year issue, which recreates the infamous "Lunch atop a Skyscraper" photo from 1932, replacing the iron workers sitting on a steel beam with tech CEOs.

On Reddit, some of the top comments on the highest upvoted posts were very critical of Time's choice.

"Oh god Please burst the AI bubble ASAP." 

"how very disappointing" 

"The editors were so proud of this choice... ..until they all lost their jobs to the 'Person of the Year'."

And, of course, there were a few wise guys who cracked jokes about how "AI" written out looks like we're talking about a guy named "Al."

It's important to note that Time's 2025 Person of the Year article does touch upon many of the negative aspects about AI.

The magazine's Person of the Year designation is not meant to be a moral judgement on the person chosen, but instead a recognition of their impact on the world, whether it's good, bad, or a mix of both. Over Time's history, the publication has given Person of the Year to some truly terrible individuals along with some who have devoted themselves to helping others.

Still, many online seem to believe that AI and its big tech cheerleaders don't deserve the recognition.

Disclosure: Ziff Davis, Mashable’s parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.

Categories: IT General, Technology

How to repurpose an old Samsung phone as a smart home sensor

How-To Geek - Thu, 12/11/2025 - 18:00

Sensors are one of the most underrated parts of a great smart home. You can use them to measure temperature, humidity, presence, and so much more. But what if you didn’t have to buy anything? I used an old Samsung Galaxy phone as a sensor, and it’s surprisingly easy.

Categories: IT General, Technology

Agent Lightning: Adding reinforcement learning to AI agents without code rewrites

Microsoft Research - Thu, 12/11/2025 - 18:00

AI agents are reshaping software development, from writing code to carrying out complex instructions. Yet LLM-based agents are prone to errors and often perform poorly on complicated, multi-step tasks. Reinforcement learning (RL) is an approach where AI systems learn to make optimal decisions by receiving rewards or penalties for their actions, improving through trial and error. RL can help agents improve, but it typically requires developers to extensively rewrite their code. This discourages adoption, even though the data these agents generate could significantly boost performance through RL training.

To address this, a research team from Microsoft Research Asia – Shanghai has introduced Agent Lightning. This open-source framework makes AI agents trainable through RL by separating how agents execute tasks from model training, allowing developers to add RL capabilities with virtually no code modification.

Capturing agent behavior for training

Agent Lightning converts an agent’s experience into a format that RL can use by treating the agent’s execution as a sequence of states and actions, where each state captures the agent’s status and each LLM call is an action that moves the agent to a new state.

This approach works for any workflow, no matter how complex. Whether it involves multiple collaborating agents or dynamic tool use, Agent Lightning breaks it down into a sequence of transitions. Each transition captures the LLM’s input, output, and reward (Figure 1). This standardized format means the data can be used for training without any additional steps.
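The transition format described above can be pictured with a small sketch. This is an illustration only, not Agent Lightning's actual data model: the `LLMCall` and `Transition` classes, the `to_transitions` helper, and the default reward of 0.0 for steps without an immediate reward are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LLMCall:
    """One LLM request observed while the agent ran (hypothetical record)."""
    prompt: str
    response: str
    reward: Optional[float] = None  # immediate reward, if one was observed


@dataclass
class Transition:
    """One training example: the LLM's input, its output, and a reward."""
    prompt: str
    response: str
    reward: float


def to_transitions(calls: List[LLMCall]) -> List[Transition]:
    """Turn an ordered trace of LLM calls into a flat list of transitions.

    Calls without an immediate reward get 0.0 here; that default is an
    assumption of this sketch, not something the article specifies.
    """
    return [
        Transition(c.prompt, c.response, c.reward if c.reward is not None else 0.0)
        for c in calls
    ]


# A two-step RAG-style run: write a search query, then answer from the result.
trace = [
    LLMCall(prompt="Question: who wrote Dune?", response="search: Dune novel author"),
    LLMCall(
        prompt="Docs: Dune is a 1965 novel by Frank Herbert.",
        response="Frank Herbert",
        reward=1.0,
    ),
]
transitions = to_transitions(trace)
print([t.reward for t in transitions])
```

However many tools or sub-agents sit between the LLM calls, the training side only ever sees this flat list.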

Figure 1. An illustration of Agent Lightning’s standardized format using a retrieval-augmented generation (RAG) agent. Left: The full agent workflow, where the agent’s state updates after each component step. The green blocks show assigned variables, and the gray blocks indicate variables without content. Right: The collected transitions are based on the standardized format for the RL training process, with each transition corresponding to one LLM step that contains its prompt, result, and immediate reward.

Hierarchical reinforcement learning

Traditional RL training for agents that make multiple LLM requests involves stitching together all content into one long sequence and then identifying which parts should be learned and which ignored during training. This approach is difficult to implement and can create excessively long sequences that degrade model performance.

Instead, Agent Lightning’s LightningRL algorithm takes a hierarchical approach. After a task completes, a credit assignment module determines how much each LLM request contributed to the outcome and assigns it a corresponding reward. These independent steps, now paired with their own reward scores, can be used with any existing single-step RL algorithm, such as Proximal Policy Optimization (PPO) or Group Relative Policy Optimization (GRPO) (Figure 2).

Figure 2. (a) Single-step GRPO: The LLM completes the task in one call. Multiple responses for the same task are compared to determine how strongly each should be reinforced. (b) Previous multi-step GRPO: The task involves multiple LLM calls. Multiple multi-step runs of the same task are compared, with non-LLM generated tokens (grey boxes) ignored during training. (c) LightningRL: The multi-step run is divided into individual LLM calls. Calls from the same task are compared to determine how strongly each should be reinforced. Each call includes its input, context, output, and reward, assigned by the credit assignment module.
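As a rough sketch of the hierarchical idea, the snippet below splits several runs of one task into per-call rewards and then applies a GRPO-style group normalization. The uniform credit rule (every call inherits the run's final reward) is an assumption chosen for simplicity and may differ from what the LightningRL credit assignment module actually does. The function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def assign_credit(num_calls: int, final_reward: float) -> List[float]:
    """Uniform credit assignment: every call in the run inherits the final reward."""
    return [final_reward] * num_calls


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """GRPO-style normalization: compare each reward with the group's mean and spread."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Three runs of the same task, with different numbers of LLM calls and outcomes.
runs = [
    {"num_calls": 2, "final_reward": 1.0},
    {"num_calls": 3, "final_reward": 0.0},
    {"num_calls": 1, "final_reward": 1.0},
]

# Flatten the runs into independent per-call rewards.
per_call_rewards: List[float] = []
for run in runs:
    per_call_rewards.extend(assign_credit(run["num_calls"], run["final_reward"]))

# Calls from successful runs get positive advantages, the rest negative.
advantages = group_relative_advantages(per_call_rewards)
print(advantages)
```

Because each call is now an independent sample with its own score, a single-step optimizer can consume the list directly without masking or concatenating long sequences.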

This design offers several benefits. It remains fully compatible with widely used single-step RL algorithms, allowing existing training methods to be applied without modification. Organizing data as a sequence of independent transitions lets developers flexibly construct the LLM input as needed, supporting complex behaviors like agents that use multiple tools or work with other agents. Additionally, by keeping sequences short, the approach scales cleanly and keeps training efficient.

Agent Lightning as middleware

Agent Lightning serves as middleware between RL algorithms and agent environments, providing modular components that enable scalable RL through standardized protocols and well-defined interfaces.

An agent runner manages the agents as they complete tasks. It distributes work and collects and stores the results and progress data. It operates separately from the LLMs, enabling them to run on different resources and scale to support multiple agents running concurrently.

An algorithm trains the models and hosts the LLMs used for inference and training. It orchestrates the overall RL cycle, managing which tasks are assigned, how agents complete them, and how models are updated based on what the agents learn. It typically runs on GPU resources and communicates with the agent runner through shared protocols.

The LightningStore serves as the central repository for all data exchanges within the system. It provides standardized interfaces and a shared format, ensuring that the different components can work together and enabling the algorithm and agent runner to communicate effectively.

Figure 3. The Agent Lightning framework

All RL cycles follow two steps: (1) Agent Lightning collects agent execution data (called “spans”) and stores them in the data store; (2) it then retrieves the required data and sends it to the algorithm for training. Through this design, the algorithm can delegate tasks asynchronously to the agent runner, which completes them and reports the results back (Figure 4).

Figure 4. Agent Lightning’s RL cycle
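The two-step cycle can be sketched with a toy, single-process store. Everything here is hypothetical and much simpler than the real system: `InMemoryStore`, `Span`, `run_agent_worker`, and `toy_agent` are illustrative names, not the LightningStore interface, and the real components run asynchronously on separate resources.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List, Optional


@dataclass
class Span:
    """One recorded piece of agent execution (hypothetical shape)."""
    task: str
    prompt: str
    response: str
    reward: float


class InMemoryStore:
    """Toy stand-in for a central store shared by algorithm and runner."""

    def __init__(self) -> None:
        self.tasks: Deque[str] = deque()
        self.spans: List[Span] = []

    def enqueue_task(self, task: str) -> None:
        self.tasks.append(task)

    def next_task(self) -> Optional[str]:
        return self.tasks.popleft() if self.tasks else None

    def add_span(self, span: Span) -> None:
        self.spans.append(span)

    def drain_spans(self) -> List[Span]:
        spans, self.spans = self.spans, []
        return spans


def run_agent_worker(store: InMemoryStore, agent: Callable[[str], Span]) -> None:
    """Runner side: pull tasks, run the agent, write the resulting spans back."""
    while (task := store.next_task()) is not None:
        store.add_span(agent(task))


def toy_agent(task: str) -> Span:
    # Stand-in agent: echoes the task and is "rewarded" only for questions.
    reward = 1.0 if task.endswith("?") else 0.0
    return Span(task=task, prompt=task, response=task.upper(), reward=reward)


store = InMemoryStore()
for task in ["what is 2+2?", "summarize"]:
    store.enqueue_task(task)            # algorithm delegates work
run_agent_worker(store, toy_agent)      # step 1: spans are collected and stored
batch = store.drain_spans()             # step 2: data is retrieved for training
mean_reward = sum(s.reward for s in batch) / len(batch)
print(mean_reward)
```

The algorithm and the runner never call each other directly; both only read from and write to the store, which is what lets them scale and be replaced independently.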

One key advantage of this approach is its algorithmic flexibility. The system makes it easy for developers to customize how agents learn, whether they’re defining different rewards, capturing intermediate data, or experimenting with different training approaches.

Another advantage is resource efficiency. Agentic RL systems are complex, integrating agentic systems, LLM inference engines, and training frameworks. By separating these components, Agent Lightning makes this complexity manageable and allows each part to be optimized independently.

A decoupled design allows each component to use the hardware that suits it best. The agent runner can use CPUs while model training uses GPUs. Each component can also scale independently, improving efficiency and making the system easier to maintain. In practice, developers can keep their existing agent frameworks and switch model calls to the Agent Lightning API without changing their agent code (Figure 5).

Figure 5. On the left, the developer implements the agent code. On the bottom right is the code required for Agent Lightning. The main body of the agent code is unchanged.

Evaluation across three real-world scenarios

Agent Lightning was tested on three distinct tasks, achieving consistent performance improvements across all scenarios (Figure 6):

Text-to-SQL (LangChain): In a system with three agents handling SQL generation, checking, and rewriting, Agent Lightning simultaneously optimized two of them, significantly improving the accuracy of generating executable SQL from natural language queries.

Retrieval-augmented generation (OpenAI Agents SDK implementation): On the multi-hop question-answering dataset MuSiQue, which requires querying a large Wikipedia database, Agent Lightning helped the agent generate more effective search queries and reason better from retrieved content.

Mathematical QA and tool use (AutoGen implementation): For complex math problems, Agent Lightning trained LLMs to more accurately determine when and how to call the tool and integrate the results into their reasoning, increasing accuracy.

Figure 6. Reward curves across the three evaluation scenarios

Enabling continuous agent improvement

By simplifying RL integration, Agent Lightning can make it easier for developers to build, iterate, and deploy high-performance agents. We plan to expand Agent Lightning’s capabilities to include automatic prompt optimization and additional RL algorithms.

The framework is designed to serve as an open platform where any AI agent can improve through real-world practice. By bridging existing agentic systems with reinforcement learning, Agent Lightning aims to help create AI systems that learn from experience and improve over time.

The post Agent Lightning: Adding reinforcement learning to AI agents without code rewrites appeared first on Microsoft Research.

Categories: Microsoft

The dream of the Xperia Play lives on with this new slider phone

How-To Geek - Thu, 12/11/2025 - 17:50

Retro gaming is pretty popular these days, with fans everywhere buying handheld consoles or gaming phones. But what if someone combined the two? After months of teasers, the popular gaming handheld manufacturer AYANEO just unveiled the all-new Pocket Play, a sliding Android phone with physical controls reminiscent of the Sony Xperia Play.

Categories: IT General, Technology

You need to know the difference between a range and an array in Excel

How-To Geek - Thu, 12/11/2025 - 17:31

Whether you're interviewing for an Excel-related job or teaching a beginner, using the right terminology is crucial. Above all else, knowing the difference between a range and an array is the key to understanding how Excel processes data, giving you better insight into modern dynamic functions.

Categories: IT General, Technology

KDE brings new features to Dolphin, Kate text editor, Photos, and other apps

How-To Geek - Thu, 12/11/2025 - 17:24

KDE Gear 25.12 is now available and comes with a substantial list of quality-of-life enhancements to essential apps like Dolphin and Kate. If you’re running the KDE Plasma desktop environment or using a GNU/Linux distribution, you should check out these updates.

Categories: IT General, Technology

Amazon just dropped the AirPods Pro 3 to their best price ever

Mashable - Thu, 12/11/2025 - 17:23

SAVE $50: As of Dec. 11, Amazon has the Apple AirPods Pro 3 on sale for $199. That's 20% off the list price of $249 and their lowest price to date.

If you grabbed Apple's AirPods Pro 3 on Black Friday, I'm sending my condolences. The same earbuds are now $20 cheaper — great news for those who still have them on their list.

As of Dec. 11, the AirPods Pro 3 are on sale for just $199 at Amazon. That's 20% or $50 off the list price of $249 and a new best price ever for the premium Apple earbuds. Walmart kicked off the record-low pricing in the early hours of the day, but it didn't take long for Amazon to give 'em the old price match. At the time of writing, Best Buy and Target have yet to join in on the fun (both retailers currently have them on sale for $219.99).

"The AirPods Pro pop up on most best of lists, and though we've always been fans, the third generation takes these earbuds to a new level," writes Mashable's reviewer. The new Apple AirPods Pro 3 feature better noise cancellation (twice as good as the Pro 2s), better battery life (eight hours with ANC, 10 hours in transparency mode), and new foam-infused tips that come in five sizes to find the right fit. They also brought the heart rate monitoring tech from the Powerbeats Pro 2 (our favorite earbuds for working out) and Fitness app compatibility to the Pro 3s, and introduced a live translation feature.

If you already have the AirPods Pro 2, we don't think it's necessarily worth an upgrade, but if you're debating between the Pro 2 and Pro 3, this price drop really seals the deal. Go with the AirPods Pro 3.

Categories: IT General, Technology

Stop trying to upgrade your PC—level up your setup instead

How-To Geek - Thu, 12/11/2025 - 17:15

The prices of PC components have been on a steady rise over the past few years, but this year's been something else. PC gamers are slowly being priced out of building a PC that'll be good enough to handle AAA games for the next few years.

Categories: IT General, Technology