Blogroll
Stop getting surprised by dead sensors—how I stay on top of Home Assistant batteries
One of the biggest benefits of Zigbee devices is that many of them are battery-powered, allowing you to place them almost anywhere. The downside is that when the batteries die, so do your sensors, so it's important to keep track of battery levels. Here's how I manage it as a Home Assistant veteran with more devices than I can count.
Oscars 2026: How and when you can watch this year's Academy Awards
For cinema lovers across the world, the most anticipated weekend of the year is finally here. This year's Academy Awards will celebrate the incredible films of 2025 and the talent behind them in a grand ceremony, cherishing filmmaking, acting, production, music, and more.
Stop drilling giant holes in your walls: Why every homelabber still needs to learn how to crimp RJ45
When I was but a wee young computer nerd, there was a mythical master among us who could make "crossover" Ethernet cables. This cable could connect two computers with no switch or router, and was perfect for two-player LAN gaming.
Stop paying for cloud storage! Try these alternatives instead
Saving files online was once a novelty. Now this is expected behavior. Yet I personally opt out of using cloud storage. There are other options that don't require me to store my data on someone else's PC.
Systematic debugging for AI agents: Introducing the AgentRx framework
- Problem: Debugging AI agent failures is hard because trajectories are long, stochastic, and often multi-agent, so the true root cause gets buried.
- Solution: AgentRx pinpoints the first unrecoverable (“critical failure”) step by synthesizing guarded, executable constraints from tool schemas and domain policies, then logging evidence-backed violations step-by-step.
- Benchmark + taxonomy: We release AgentRx Benchmark with 115 manually annotated failed trajectories across τ-bench, Flash, and Magentic-One, plus a grounded nine-category failure taxonomy.
- Results + release: AgentRx improves failure localization (+23.6%) and root-cause attribution (+22.9%) over prompting baselines, and we are open-sourcing the framework and dataset.
As AI agents transition from simple chatbots to autonomous systems capable of managing cloud incidents, navigating complex web interfaces, and executing multi-step API workflows, a new challenge has emerged: transparency.
When a human makes a mistake, we can usually trace the logic. But when an AI agent fails, perhaps by hallucinating a tool output or deviating from a security policy ten steps into a fifty-step task, identifying exactly where and why things went wrong is an arduous, manual process.
Today, we are excited to announce the open-source release of AgentRx, an automated, domain-agnostic framework designed to pinpoint the “critical failure step” in agent trajectories. Alongside the framework, we are releasing the AgentRx Benchmark, a dataset of 115 manually annotated failed trajectories to help the community build more transparent, resilient agentic systems.
The challenge: Why AI agents are hard to debug
Modern AI agents are often:
- Long-horizon: They perform dozens of actions over extended periods.
- Probabilistic: The same input might lead to different outputs, making reproduction difficult.
- Multi-agent: Failures can be “passed” between agents, masking the original root cause.
Traditional success metrics (like “Did the task finish?”) don’t tell us enough. To build safe agents, we need to identify the exact moment a trajectory becomes unrecoverable and capture evidence for what went wrong at that step.
Introducing AgentRx: An automated diagnostic “prescription”
AgentRx (short for “Agent Diagnosis”) treats agent execution like a system trace that needs validation. Instead of relying on a single LLM to “guess” the error, AgentRx uses a structured, multi-stage pipeline:
- Trajectory normalization: Heterogeneous logs from different domains are converted into a common intermediate representation.
- Constraint synthesis: The framework automatically generates executable constraints based on tool schemas (e.g., “The API must return a valid JSON response”) and domain policies (e.g., “Do not delete data without user confirmation”).
- Guarded evaluation: AgentRx evaluates constraints step-by-step, checking each constraint only when its guard condition applies, and produces an auditable validation log of evidence-backed violations.
- LLM-based judging: Finally, an LLM judge uses the validation log and a grounded failure taxonomy to identify the Critical Failure Step—the first unrecoverable error.
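The guarded-evaluation stage above can be sketched roughly as follows. This is a minimal illustration, not AgentRx's actual code or API: the `Step` and `Constraint` types, the tool names, and the two sample constraints are hypothetical stand-ins modeled on the examples in the list, and the final LLM judging stage is left out. The sketch only shows how a constraint that is checked solely when its guard applies can produce a step-by-step log of violations with evidence attached.

```python
import json
from dataclasses import dataclass
from typing import Callable


# Hypothetical normalized step; the real intermediate representation may differ.
@dataclass
class Step:
    index: int
    tool: str
    args: dict
    output: str


# Hypothetical constraint: `guard` decides whether it applies to a step,
# `check` decides whether the step satisfies it.
@dataclass
class Constraint:
    name: str
    guard: Callable[[Step], bool]
    check: Callable[[Step], bool]


def is_json(text: str) -> bool:
    """Return True if `text` parses as JSON."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


# Two illustrative constraints, one from a tool schema and one from a policy.
CONSTRAINTS = [
    Constraint(
        name="api_returns_valid_json",
        guard=lambda s: s.tool == "call_api",
        check=lambda s: is_json(s.output),
    ),
    Constraint(
        name="delete_requires_confirmation",
        guard=lambda s: s.tool == "delete_data",
        check=lambda s: s.args.get("user_confirmed") is True,
    ),
]


def validate(trajectory: list[Step]) -> list[dict]:
    """Evaluate every applicable constraint at every step and log violations."""
    log = []
    for step in trajectory:
        for constraint in CONSTRAINTS:
            if constraint.guard(step) and not constraint.check(step):
                log.append({
                    "step": step.index,
                    "constraint": constraint.name,
                    "evidence": {"args": step.args, "output": step.output},
                })
    return log


# A made-up three-step trajectory for demonstration.
trajectory = [
    Step(0, "call_api", {"path": "/orders"}, '{"orders": []}'),
    Step(1, "call_api", {"path": "/orders/7"}, "<html>502</html>"),
    Step(2, "delete_data", {"id": 7}, "deleted"),
]
violations = validate(trajectory)
for v in violations:
    print(v["step"], v["constraint"])
```

In the described pipeline, a log like `violations` would then be passed to the LLM judge along with the failure taxonomy; the earliest logged step is a candidate for the critical failure step, not necessarily the answer.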
To evaluate AgentRx, we developed a manually annotated benchmark consisting of 115 failed trajectories across three complex domains:
- τ-bench: Structured API workflows for retail and service tasks.
- Flash: Real-world incident management and system troubleshooting.
- Magentic-One: Open-ended web and file tasks using a generalist multi-agent system.
Using a grounded-theory approach, we derived a nine-category failure taxonomy that generalizes across these domains. This taxonomy helps developers distinguish between a “Plan Adherence Failure” (where the agent ignored its own steps) and an “Invention of New Information” (hallucination).
Taxonomy category: Description
- Plan Adherence Failure: Ignored required steps / did extra unplanned actions
- Invention of New Information: Altered facts not grounded in trace/tool output
- Invalid Invocation: Tool call malformed / missing args / schema-invalid
- Misinterpretation of Tool Output: Read tool output incorrectly; acted on wrong assumptions
- Intent–Plan Misalignment: Misread user goal/constraints and planned wrongly
- Under-specified User Intent: Could not proceed because required info wasn’t available
- Intent Not Supported: No available tool can do what’s being asked
- Guardrails Triggered: Execution blocked by safety/access restrictions
- System Failure: Connectivity/tool endpoint failures
Analysis of failure density across domains. In multi-agent systems like Magentic-One, trajectories often contain multiple errors, but AgentRx focuses on identifying the first critical breach.
Key Results
In our experiments, AgentRx demonstrated significant improvements over existing LLM-based prompting baselines:
- +23.6% absolute improvement in failure localization accuracy.
- +22.9% improvement in root-cause attribution.
By providing the “why” behind a failure through an auditable log, AgentRx allows developers to move beyond trial-and-error prompting and toward systematic agentic engineering.
Join the Community: Open Source Release
We believe that agent reliability is a prerequisite for real-world deployment. To support this, we are open sourcing the AgentRx framework and the complete annotated benchmark.
- Read the Paper: AgentRx: Diagnosing AI Agent Failures from Execution Trajectories
- Explore the Code & Data: https://aka.ms/AgentRx/Code
We invite researchers and developers to use AgentRx to diagnose their own agentic workflows and contribute to the growing library of failure constraints. Together, we can build AI agents that are not just powerful but also auditable and reliable.
Acknowledgements
We would like to thank Avaljot Singh and Suman Nath for contributing to this project.
The post Systematic debugging for AI agents: Introducing the AgentRx framework appeared first on Microsoft Research.
Harbor Freight's new rolling toolbox will make you rethink Milwaukee Packout
It's no secret that the Milwaukee PACKOUT system is the king of modular rollable storage for all your tools. However, Harbor Freight just released an upgraded version of its budget-friendly alternative, and for only $70, it looks impressively rugged and packed with features.
The MacBook Neo has a hidden hardware advantage
Apple debuted its affordable MacBook Neo earlier this month. For more than a decade, the company's laptops have been difficult to repair due to glued-down batteries, soldered ports, and layers of adhesive tape. It turns out that the MacBook Neo is a very different story. A teardown video shared by Australian YouTube repair channel Tech Re-Nu shows the entry-level notebook being completely disassembled in just six minutes.
6 things the Samsung Galaxy S7 did better than the Galaxy S26
The Galaxy S7 and S7 Edge came out a decade ago. While the two have nothing on the brand-new Galaxy S26 family when it comes to computing power, hardware specs, or camera quality, there are still more than a few areas where the Galaxy S26 falls behind its ten-year-old cousin.
TikTok launches TikTok Radio and Podcasts with iHeartMedia
With most TikTok FYPs already brimming with podcast clips, it's clear the social media company is now trying to break into the genre.
On Friday, the company is launching the TikTok Podcast Network and a new platform called TikTok Radio. Both projects are working with iHeartMedia to help fill out the audio-first content. The idea behind TikTok Radio is, basically, an FYP for, well, radio and podcasts.
SEE ALSO: TikTok is using Charli XCX's 'House' better than "Wuthering Heights"
TikTok wrote in a press release:
"TikTok Radio captures the speed and cultural momentum of the For You feed in real time, blending trending music with commentary from TikTok creators and established iHeartRadio personalities. The station will broadcast live for the first time on March 13 from SXSW, and will be available on the free iHeartRadio app and across 28 broadcast stations nationwide, including New York, Los Angeles, Atlanta, Austin, Chicago, Dallas, Nashville, Miami, and more."
The TikTok Podcast Network, meanwhile, also launches on Friday with a lineup of creators enlisted to start pods. Among those creators are Lele Pons, Carter Gregory, and Caroline Vazzana. The idea is to give fans deeper insight into their favorite creators.
Here's a breakdown of the five long-form podcasts launching with the TikTok Podcast Network:
The Set List: Hosted by media personality, music executive, host, and cultural tastemaker Carter Gregory (@thecarterb), The Set List takes listeners into the backstage spaces where culture is being created for intimate conversations with top artists and creatives, giving fans new insight into the creative DNA behind their work.
Suite 305 with Lele Pons: Hosted by multihyphenate and social media star Lele Pons (@lelepons), listeners will step into the world’s most unfiltered slumber party. Suite 305 is where the people who shape culture finally get to be themselves.
Caroline’s Closet: Hosted by fashion editor, author, and personality Caroline Vazzana (@cvazzana), Caroline is drawing back the curtain on the fashion industry and giving listeners a front row seat.
Sports Slice: Hosted by Tim Martin (@timbosliceoflife12), Sports Slice is where the biggest stories in sports get cut down and served with loud takes, real conversations, and absolutely no safe picks.
The Clifford Show: Hosted by former college football player turned viral sports creator Clifford Taylor IV (@clifford), The Clifford Show is your VIP pass to the hustle and grind of making it in sports and in media.
"We couldn’t be more excited to launch TikTok Radio and introduce our inaugural slate of hosts for the TikTok Podcast Network," said Dan Page, global head of media and licensing partnerships at TikTok, in a statement.
"At TikTok, empowering creators to turn their passions into lasting careers is core to everything we do, and this partnership unlocks powerful new opportunities for them to expand their voices across radio, podcasts, and live moments, while connecting with fans in new ways."
Homelab projects to try this weekend (March 13 - 15)
Stay strong, the weekend is here—and that means more homelab projects! This weekend, you should get a Jellyfin server going, set up Syncthing for backups, and even deploy Speedtest Tracker—you’ll thank me later.
Apple AirTags have never been this cheap — score this low price before they go out of stock (again)
SAVE $15.09: Apple AirTags are on sale for $13.91 at Amazon, down from the list price of $29. That marks a new record-low price at Amazon. If they're sold out at Amazon, check Walmart, which is also offering the $13.91 sale price.
Sometimes it's the simple things that make life so much easier. Bluetooth trackers fall into the category of items we're not sure how we lived without. If you could use a few extra Apple AirTags, today's your lucky day because they've never been this affordable.
As of March 12, Amazon is selling single Apple AirTags for $13.91, marked down from the list price of $29. That works out to a massive 52% discount or a savings of $15.09 per tracker, which is the lowest price we've ever seen at Amazon. This deal has been coming in and out of stock, so if it's sold out at Amazon, check on stock at Walmart where they're also $13.91.
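If you like to double-check deal math, the savings and percentage quoted above are easy to reproduce. The snippet below is just a quick sketch; the `discount` helper is our own illustrative name, and the only inputs are the two prices from the deal.

```python
def discount(list_price: float, sale_price: float) -> tuple[float, float]:
    """Return (dollars saved, percent off list price)."""
    savings = round(list_price - sale_price, 2)
    percent = round(100 * savings / list_price, 1)
    return savings, percent


# Prices quoted in the deal: $29 list, $13.91 sale.
savings, percent = discount(29.00, 13.91)
print(f"Save ${savings:.2f} ({percent}% off)")
```

The same helper works for any list and sale price pair, which makes it handy for sanity-checking rounded percentages in other deals.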
SEE ALSO: Attention Apple fans: The brand new M5 MacBook Air already has its first discount
In case you haven't heard, Apple has a new version of the AirTag. Sure, it comes with improved performance but the first generation saved us from losing luggage, pets, and car keys. Plus, the sale price of $13.91 makes for a damn compelling reason to stock up. Each 2nd Gen Apple AirTag is sitting at full price of $29, so today's deal on the previous model basically makes these buy one, get one free.
With spring break travel starting soon, this is great timing to grab some extra AirTags. Keep in mind stock has been fluctuating with this epic sale price, so check Amazon and Walmart to make sure you're scoring this new record-low price.
7 reasons your NAS will never replace the cloud
Running a NAS is becoming a popular pastime, and part of the appeal is that you can get out of paying for a cloud storage subscription for the rest of eternity. At least, that's the idea, but what about the reality?
Stay prepared for anything with the Growatt Helios power station — now $400 off at Amazon
SAVE $400: As of March 12, get the Growatt Helios portable power station for just $1,199. That's $400 off the power station's $1,599 list price.
Portable power stations are essential devices whether you spend most of your time indoors or outside. Not only can one serve as a backup power source in the case of an emergency power outage, but it can also accompany you on all your adventures. If you're headed out on a camping or RV trip this summer, a power station might just be the companion you need for keeping your devices charged.
If you're looking for a great power station, let us steer you in the direction of the Growatt Helios portable power station. As of March 12, it's on sale at Amazon, saving you $400 on its $1,599 list price. That brings it down to $1,199 for 25% off. It's not quite the power station's lowest-ever price of $1,099, but it's close.
The Growatt Helios portable power station offers 3,600W of output and 3.6kWh of capacity. Let us translate that. It can power heavy-duty devices including air conditioners and refrigerators. When fully charged, the Growatt Helios generator can power your fridge for 24 to 72 hours. It supports USB, DC, and AC power so you can charge multiple devices at once. Plus, you can power it up fast with solar.
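For a rough sense of where a figure like 24 to 72 hours can come from, runtime is approximately battery capacity divided by the average power a device draws. The sketch below is a back-of-the-envelope estimate only: the `runtime_hours` helper is an illustrative name, the 50W and 150W average fridge loads are assumptions picked because they happen to reproduce the quoted range, and inverter losses are ignored, so real-world runtimes will be somewhat shorter.

```python
def runtime_hours(capacity_wh: float, avg_load_w: float) -> float:
    """Estimate runtime as stored energy (Wh) divided by average draw (W)."""
    if avg_load_w <= 0:
        raise ValueError("average load must be positive")
    return capacity_wh / avg_load_w


CAPACITY_WH = 3600  # 3.6kWh, the capacity quoted for the power station

# Assumed average fridge draws (hypothetical, not from the manufacturer).
for load_w in (150, 50):
    hours = runtime_hours(CAPACITY_WH, load_w)
    print(f"{load_w}W average load: about {hours:.0f} hours")
```

Swap in the average wattage of your own appliance to get a ballpark figure for other devices.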
Get the Growatt Helios Portable Power Station at Amazon for $1,199.
Why I'm swapping my PivotTables for the PIVOTBY function in Excel
For decades, PivotTables have been the undisputed king of data summarization in Excel. But as useful as they can be, they've always felt like an "app-within-an-app." They live in their own layer, have their own quirky menus, and don't always play nice with the rest of your formulas.
Attention Apple fans: The brand new M5 MacBook Air already has its first discount
SAVE $49: As of March 12, the brand new Apple MacBook Air (M5, 16GB RAM, 512GB SSD) is on sale for the first time in both the 13-inch and 15-inch models. Get the 13-inch for $1,049.99 (reg. $1,099) and the 15-inch for $1,249.99 (reg. $1,299).
The brand new Apple M5 MacBook Air officially has its first discount. It made its formal debut just yesterday, March 11, and is already on sale in both the 13-inch and 15-inch varieties.
The starting price for the M5 MacBook Air is $1,099 for the 13-inch model and $1,299 for the 15-inch model. Both base models feature 16GB of RAM and 512GB of storage. As of March 12, you can knock $49.01 off each laptop, bringing the starting price down to just $1,049.99 or $1,249.99. Sure, that's not a huge discount. But considering the base models are already technically $100 cheaper than their predecessors, that's pretty sweet value.
So what's new about these laptops? Honestly, not a whole lot. Obviously, the M5 chip is a step up from the previous generation's M4 chip. They now have the same wildly fast processor as the 14-inch MacBook Pro from last fall. They also start with more base storage, are configurable with up to 4TB (up from 2TB) for the first time, and have added Apple's N1 wireless chip to bring support for WiFi 7 and Bluetooth 6.
Otherwise, the new M5 MacBook Air is largely the same as the M4 MacBook Air — same 60Hz Liquid Retina display, 12MP Center Stage webcam, dual Thunderbolt 4 ports, and 18-hour battery life. While that's not necessarily a bad thing (we loved the M4 Air), it certainly makes it a non-essential upgrade unless you're rocking a MacBook that's a few generations old. Still, there's no denying what a great value the laptop is, particularly now that it's $49 cheaper in both sizes.
Is a hybrid worth the extra cost? The payback timeline explained
Hybrid vehicles promise lower fuel costs and improved efficiency, but they often come with a higher purchase price than their gasoline-only counterparts. For many buyers, that raises a simple but important question: will the fuel savings eventually make up for the extra money spent upfront? The answer depends on several factors, including fuel prices, driving habits, and the efficiency gap between hybrid and non-hybrid models.
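The payback question can be framed as simple arithmetic: divide the extra purchase price by the yearly fuel savings. The sketch below illustrates that calculation under stated assumptions. The `payback_years` function and every number in the example (a $2,000 price premium, 12,000 miles per year, 30 mpg versus 50 mpg, $3.50 per gallon) are hypothetical placeholders, and the model ignores maintenance, insurance, resale value, and changing fuel prices.

```python
def payback_years(price_premium: float, annual_miles: float,
                  gas_mpg: float, hybrid_mpg: float,
                  fuel_price: float) -> float:
    """Years of fuel savings needed to recover the hybrid's price premium."""
    gallons_saved = annual_miles / gas_mpg - annual_miles / hybrid_mpg
    annual_savings = gallons_saved * fuel_price
    if annual_savings <= 0:
        # The hybrid saves no fuel under these inputs, so it never pays back.
        return float("inf")
    return price_premium / annual_savings


# Hypothetical example inputs; substitute your own figures.
years = payback_years(price_premium=2000, annual_miles=12000,
                      gas_mpg=30, hybrid_mpg=50, fuel_price=3.50)
print(f"Estimated payback: {years:.1f} years")
```

Raising the fuel price or the annual mileage shortens the payback period, while a smaller efficiency gap lengthens it, which matches the factors listed above.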
Game in 4K and save $300 with this OLED Samsung monitor
SAVE $300: As of March 12, get the Samsung 32-inch OLED M9 smart monitor for $1,299.99 at Amazon. It's marked down about 19%, saving you $300 off its $1,599.99 list price.
It might be time to upgrade your gaming setup. As much as you may love your gaming laptop, that crick in your neck from hunching over it may be saying otherwise. For a better picture (and posture), bring in a monitor. If you want to go all out, look for a 4K OLED monitor. Although it'll be an investment, you can always find some savings to sweeten the deal.
As of March 12, shop the Samsung 32-inch OLED M9 smart monitor for $1,299.99. That saves you $300 off its $1,599.99 price tag. The 19% drop in price brings the monitor to its lowest price ever, but there are more savings to be had.
There are extra savings on Samsung monitors, TVs, and soundbars at Amazon. Right now, when you buy two qualifying items, save $100 with code BUYMORE, with savings maxing out at $500 off up to five qualifying items.
So, what's so special about the Samsung 32-inch OLED M9 smart monitor? Picture quality stuns on this monitor with OLED technology. Plus, you won't get bogged down with lag with a 165Hz refresh rate. This monitor, which is designed for gaming, doesn't necessarily need a PC or console to operate thanks to the Samsung Gaming Hub.
Shop the Samsung 32-inch OLED M9 smart monitor for $1,299.99 and save $300.
This Renpho smart scale is down to its best-ever price at Amazon — act fast to save over $40
SAVE $44: As of March 12, the Renpho MorphoScan Nova Smart Scale is on sale for only $189.98 at Amazon. That's just shy of 20% off and its best price on record.
There's a whole lot more to your health than just your body weight. If you need a little guidance along your health and fitness journey, there's tons of tech that can help you see the big picture — starting with a smart scale. The Renpho MorphoScan Nova Smart Scale gives you a breakdown of your full body composition and for a limited time, it's down to its best-ever price.
As of March 12, you can grab the Renpho MorphoScan Nova Smart Scale for only $189.98 at Amazon instead of $233.99. That's nearly 20% in savings and its best price on record. But you'll have to act fast to secure the savings. As a Lightning deal, this discount will disappear when the timer runs out.
The Renpho Smart Scale uses eight high-sensitivity electrodes to track over 50 body metrics, including weight, body fat, muscle mass, BMI, and much more. All metrics will show on the 4.3-inch TFT display, as well as in the user-friendly Renpho app, which can sync up with Apple Health, Google Fit, and MyFitnessPal. The app supports unlimited user profiles and offers personalized health insights for each.
You'll get historical body data and detailed charts to help you track long-term metrics and get a better glimpse at the big picture, rather than dwelling on daily fluctuations. In other words, this is not for casual users.
The clock is running down, so act quickly to score the Renpho MorphoScan Nova Smart Scale for its best price to date.
A new, AI-powered version of Bumble is coming
Bumble will soon test an AI dating experience called, simply, Dates.
Dates will be powered by Bee, a standalone product feature designed as a personal dating assistant and matchmaker, founder and CEO Whitney Wolfe Herd said during Bumble's Q4 2025 earnings call.
SEE ALSO: Bumble announces AI-powered Profile Guidance and Photo Feedback
As Bumble told Mashable, users will start using Dates with an onboarding conversation to discuss values, relationship goals, communication style, lifestyle, and dating intentions. These conversations, which will be with "Bee," will apparently be private and not shared on your profile. Users will also be able to control what elements of the conversation Bee uses to search for matches.
Then, Bee will identify a highly compatible profile and notify both users with a description of why they're a match. If the interest is mutual, the connection moves to a conversation.
"To fully recover and return to growth, we must focus on product and technology innovation, which is where our efforts are now," Wolfe Herd said during Bumble's earnings call. Bumble's total revenue and paying users had decreased year over year (14 percent and 21 percent, respectively) compared to Q4 2024.
Wolfe Herd said that since the start of this year, she's been spending 90 percent of her days with the tech and product teams "reimagining what finding love looks like in the era of AI."
"We are rearchitecting the entire Bumble experience from start to finish," she said. Bumble can't use its legacy tech stack (the set of technologies that together build an app), so it's building a new, cloud-native tech stack with a targeted launch in Q2. Wolfe Herd said it's not just a backend upgrade, but a fully new platform coming.
Wolfe Herd also acknowledged the burnout and disillusionment with dating apps as of late. "Daters across the industry are dissatisfied with being reduced to images and potentially dismissed with a swipe," she said. "Bumble 2.0 introduces a chapter-based structure designed to help members tell their stories more authentically and understand one another more deeply."
SEE ALSO: App fatigue is real. I tested the best dating apps of 2026 to find the ones that really work.
She said the AI prioritizes fewer, more relevant matches over volume, combats swipe fatigue, and helps members move towards real-world connections.
The beta of Dates is launching soon, and future iterations are expected to incorporate date suggestions and anonymous feedback. Bumble already launched some AI features, Profile Guidance and Photo Feedback, last month.
Back in 2024, Wolfe Herd discussed an AI-powered dating concierge that would basically date for you, so it's unsurprising that the app is taking this direction.
Other major dating apps, particularly Tinder and Hinge, have also added AI features in the last few years. Tinder is also reportedly testing an AI matchmaker, while Hinge's latest AI feature helps start better conversations. Hinge's founder, Justin McLeod, left the app last year to launch an AI dating service called Overtone.
Samsung Galaxy S26 Ultra torn apart by YouTuber. This is what he found.
Popular YouTuber JerryRigEverything tore down the new Samsung Galaxy S26 Ultra, performing one of the most fun and anticipated tests of the new, premium phone.
JerryRigEverything took an extended look at the guts of the new phone and found that Samsung neglected to make a big deal of an upgraded camera system. Still, as most tech products go these days, the Galaxy S26 Ultra seems to be an incremental upgrade over previous models.
JerryRigEverything called it a "decent upgrade over last year, but I still wouldn't rush out to buy a new one if you have a phone that was made in the last two or three years."
In his review of the phone, Mashable Tech Editor Timothy Beck Werth wrote that the S26 Ultra was a very good smartphone, mostly aimed at power users.
He wrote: "Still, for dedicated Android users — or anyone who loves big phones and hates Liquid Glass — the S26 Ultra is worth the investment. If you're a professional creator, an AI superuser, or a developer who can make the most of the snappy Qualcomm processor, then I think you'll be very happy."


