Blogroll

Frozen light, DNA cassettes, and laser-etched glass: Sci-fi storage tech that makes your SSD look like a floppy disk

How-To Geek - 1 hour 17 min ago

I've written extensively about how fragile our data storage technology is. So far, the most robust media we've come up with are carvings in stone or clay tablets, which is why we can still read a complaint about poor-quality copper written in 1750 BCE.

Categories: IT General, Technology

BIOS updates are no longer optional

How-To Geek - 1 hour 23 min ago

If you own a Windows PC or laptop, you’ve probably been told to avoid BIOS updates unless something is wrong, likely to prevent a catastrophic issue like bricking your motherboard. However, with security, performance, and compatibility all at stake, I think they're essential for all machines.

Categories: IT General, Technology

Everything Apple just announced: iPhone 17e, MacBook Neo, displays

Mashable - 1 hour 39 min ago

Apple has had an unusually busy week — no keynote required.

In a flurry of press releases, the Cupertino company unveiled a new iPhone, a refreshed MacBook Air, a new MacBook Pro, a pair of new desktop displays, and the chips that power it all.

If you're just getting up to speed, here's every major product Apple announced — and more importantly, what you need to know about each one.

iPhone 17e

Credit: Mashable/Apple

The iPhone 17e, announced March 2, is built around Apple's latest-generation A19 chip — the same processor powering the flagship iPhone 17 lineup. It also adds C1X, a next-generation cellular modem the company says is roughly twice as fast as the modem in the iPhone 16e.

The 6.1-inch Super Retina XDR display on the 17e now features Ceramic Shield 2, which Apple says offers three times the scratch resistance of the previous generation.

SEE ALSO: Comparing iPhone 17e vs. iPhone 17: Is the new $599 phone good enough?

The 17e's camera system has been overhauled with a 48MP Fusion lens that Apple says can function like two cameras in one — offering an optical-quality 2x telephoto crop in addition to the standard wide angle. Portrait mode has been improved with a smarter image pipeline that can automatically detect people, dogs, and cats and save depth data in the background, so you can apply bokeh after the fact.

The most consumer-friendly change: iPhone 17e now ships with MagSafe, Apple's magnetic wireless charging ecosystem, supporting up to 15W. The iPhone 16e topped out at 7.5W over standard Qi. Baseline storage has also doubled, to 256GB, at the same $599 starting price.

iPhone 17e comes in black, white, and a new soft pink color. Pre-orders open March 4; the phone is officially available on March 11.

MacBook Air with M5

Apple refreshed the MacBook Air laptop with its M5 chip. The result is up to four times faster for AI tasks than the MacBook Air with M4, the company says, and up to 9.5 times faster than the M1 model. The new chip features a 10-core CPU and a 10-core GPU, with a Neural Accelerator built into each core.

Storage gets a meaningful upgrade too. The new MacBook Air now starts at 512GB — double the previous standard — and can be configured up to 4TB for the first time. Apple claims the new SSD also delivers read/write speeds that are twice as fast as those in the M4 MacBook Air.

The new Apple N1 wireless chip brings Wi-Fi 7 and Bluetooth 6 to the Air, delivering improved performance and reliability. Battery life is unchanged, promising up to 18 hours on a charge. The design — a fanless aluminum chassis in 13- and 15-inch options — is unchanged too. Colors include sky blue, midnight, starlight, and silver.

The 13-inch MacBook Air with M5 starts at $1,099 (or $999 for education). The 15-inch starts at $1,299 ($1,199 for education). Pre-orders open March 4, and the laptop ships March 11.

MacBook Neo

Apple also unveiled the MacBook Neo, a brand-new entry-level laptop starting at $599 — or $499 for students and educators — marking the company's most affordable Mac ever.

The 13-inch machine runs on Apple's A18 Pro chip, the same processor found in the iPhone 16 Pro lineup, paired with 8GB of unified memory that cannot be upgraded. It features a Liquid Retina display, up to 16 hours of battery life, and comes in four colors: blush, indigo, silver, and citrus.

But as Mashable's Stan Schroeder noted in an early spec breakdown, the low price comes with tradeoffs — Touch ID costs an extra $100, the battery is considerably smaller than the one in the MacBook Air, and prospective buyers who need more than 8GB of RAM are simply out of luck. MacBook Neo is available for pre-order now, and ships on March 11.

MacBook Pro with M5 Pro and M5 Max

The new 14- and 16-inch MacBook Pro models are powered by M5 Pro and M5 Max, which Apple says deliver up to four times the AI performance of the M4 Pro and M4 Max, and up to eight times the AI performance of M1-era models. Both chips are built on a new "Fusion Architecture" that combines two dies into a single system-on-a-chip, enabling performance gains that Apple says wouldn't be possible with a traditional single-die design.

SEE ALSO: How to preorder the new Apple MacBook Pros with the M5 Pro and M5 Max chips — preorders now live

MacBook Pro with M5 Pro is aimed at data modelers, sound designers, and complex coders. It pairs a CPU with up to 18 cores and a GPU with up to 20 cores, and supports up to 64GB of unified memory. The M5 Max doubles down with a GPU of up to 40 cores and up to 128GB of unified memory, a figure Apple says meaningfully improves token-generation speeds for large language models (LLMs) running locally.

Storage starts at 1TB for the M5 Pro models, and 2TB for the M5 Max. Apple says SSD speeds have roughly doubled over the previous generation, reaching up to 14.5GB/s read/write. The MacBook Pro also adds the N1 chip for Wi-Fi 7 and Bluetooth 6, and ships with three Thunderbolt 5 ports. Battery life is rated at up to 24 hours.

The 14-inch MacBook Pro with M5 Pro starts at $2,199; the 16-inch version starts at $2,699. M5 Max configurations start at $3,599 for the 14-inch model and $3,899 for the 16-inch model.

All models come in space black and silver. Pre-orders open March 4; availability March 11.

iPad Air M4

Apple also refreshed the iPad Air lineup, bumping it to the M4 chip with 12GB of unified memory — a 50 percent increase over the previous generation. The tablet is available in 11- and 13-inch sizes and, according to Apple, delivers performance up to 30 percent faster than the M3 model and more than twice as fast as the M1 version.

SEE ALSO: The new Apple iPad Air is live on Walmart: Pre-order now to save up to $60

Both the N1 wireless chip for Wi-Fi 7 and the C1X cellular modem make their iPad debut here, with Apple claiming the latter cuts modem power consumption by roughly 30 percent compared to the M3 model.

Pricing holds steady at $599 for the 11-inch Wi-Fi model and $799 for the 13-inch. Pre-orders open March 4; availability starts March 11.

Studio Display and Studio Display XDR

Apple announced a refresh of its external display lineup, introducing both a new Studio Display and an entirely new Studio Display XDR. The Studio Display gets a notable upgrade in the form of Thunderbolt 5 connectivity — two ports that support daisy-chaining up to four displays — and a new 12MP Center Stage camera that now includes support for Desk View, which simultaneously shows the caller and a top-down view of their workspace.

The core display panel remains a 27-inch 5K Retina panel at 600 nits, with P3 wide color.

The Studio Display XDR is a bigger story. Apple is positioning it as a replacement for the Pro Display XDR at a significantly lower price. It features the same 27-inch 5K Retina canvas, but with a mini-LED backlight system using over 2,000 local dimming zones, up to 2,000 nits of peak HDR brightness, a 1,000,000:1 contrast ratio, and a 120Hz refresh rate with Adaptive Sync.

The XDR display adds support for the Adobe RGB color gamut alongside P3 and introduces new DICOM medical imaging presets, pending FDA clearance, aimed at radiologists who want to use the display for diagnostic work.

The new Studio Display with a tilt-adjustable stand starts at $1,599. Studio Display XDR with a tilt- and height-adjustable stand starts at $3,299 — that's $2,700 less than the original Pro Display XDR at launch.

As with everything else on Apple's list, pre-orders for the displays open March 4, with availability on March 11.

Categories: IT General, Technology

The 3 best accessories to plug into your TV’s USB port

How-To Geek - 2 hours 2 min ago

Modern TVs are smart, meaning all you need to do is plug them in and turn them on. Most of the setup involves downloading the streaming services you subscribe to and signing in to them.

Categories: IT General, Technology

Every streaming service that's raised its prices in 2026 (so far)

How-To Geek - 2 hours 32 min ago

The streaming industry is in its "nothing stays still" era. Catalogs continuously rotate, bundles get reshuffled and dropped, services rebrand and merge, and corporate consolidation keeps rewriting the map. The latest whiplash example is Paramount Skydance's victory over Netflix to buy Warner Bros. Discovery, a reminder that the industry is still very volatile and that these corporate wars often come with a higher monthly bill for us.

Categories: IT General, Technology

Disinformation on U.S.-Iran war takes over the internet

Mashable - 2 hours 47 min ago

Before the dust had settled on the ruins of the Shajareh Tayyebeh school — a casualty of the recent U.S.-Israel military strikes against Iran, and one which resulted in the deaths of up to 168 adults and children — people were already engagement-farming online. Clips of digital flight simulators were passed off as real-time ops footage, while out-of-context images of battleships and old videos of aerial missile attacks were repurposed to sell users a tale of Iranian dominance. AI-edited content proliferated.

According to experts, the posts had accumulated hundreds of millions of views in just a handful of days.

SEE ALSO: AI has made us all surveillance targets. This tool helps you fight back.

The growing number of viral posts — and the potential for even more to pop up as users earned cash for the viral falsehoods — was alarming enough to prompt X to edit its policies on misinformation. As of yesterday, X says it will suspend users from its Creator Revenue Sharing program if they post AI-generated content depicting armed conflict without labeling it as such.  

And not even Google searches are safe from misinformation these days. 

The proliferation of digital misinformation is the product of a web of bots and engagement farming accounts, all with the shared goal of being the loudest, most clicked-on account in the room. 

Some hope to win political and social influence, others just want the money. Meanwhile, users, prone to confirmation bias and a reliance on digital news sources, repeatedly fall victim to their racket. Engagement farming, no longer just exchanging the currency of memes and clickbait, has become a dangerous, politically fraught game.

What users are seeing as the U.S.-Iran conflict rages

Recent posts engaging in active disinformation about the conflict in Iran primarily involve exaggerating the scale and success of Iranian counterattacks, experts explain. 

A recent investigation by Wired documented hundreds of posts across Elon Musk's X that included misleading footage and photos — including AI-manipulated content — or promoted false claims about the scale of the attacks, many of which were posted in the immediate aftermath of missile strikes. A post with more than 4 million views claimed to show ballistic missiles sailing over Dubai, but actually depicted an Iranian attack on Tel Aviv in Oct. 2024. Another with more than 375,000 impressions shows a fictitious before-and-after image of the shelled compound of assassinated Iranian leader Ali Hosseini Khamenei. 

According to Wired, nearly all of the posts were shared by premium subscriber accounts with blue checkmarks, including state-funded media outlets in Iran. 

As in previous military conflicts, accounts have also attempted to pass off video game footage as verified news clips, including AI-manipulated images of downed F-35 fighter jets ripped from flight simulator games. The images have been shared across TikTok, some with links to Russian influence operations, the BBC reported. 

In addition to out-of-context footage and misleading content, the BBC also documented a handful of completely AI-generated videos that had amassed nearly 100 million total views, shared by what the outlet calls notorious "super-spreaders" of disinformation. 

Visuals are a good way for us to process what is going on in war when we can't comprehend the scale of these conflicts. - Sofia Rubinson, NewsGuard

A report from misinformation watchdog NewsGuard also chronicled a cadre of users sharing viral posts circulating false claims of targeted military strikes against U.S. and Israeli strongholds, predominantly using repurposed video footage and out-of-context or completely recontextualized images of destruction.

"[These videos] are posted by anonymous accounts that tend to report on geopolitical conflicts. These are accounts that are known to NewsGuard for spreading exaggerated claims, usually from a pro-Iran perspective," said Sofia Rubinson, senior editor of NewsGuard's Reality Check newsletter and co-author of the report. From there, Rubinson explains, other accounts with larger followings pick up and spread the false claims. 

For example, hours after initial reports of the U.S.'s military strikes in Iran, users on X began reposting an image of a sinking naval aircraft carrier. Users claimed that it showed a recent attack on the aircraft carrier USS Abraham Lincoln in the Arabian Sea. The U.S. military's Central Command issued a statement refuting the claim that same day. NewsGuard confirmed the image actually showed the intentional sinking of the USS Oriskany that took place nearly 20 years ago. The claim was shared by unverified "news" accounts and even Kenyan parliamentary member Peter Salasya. Salasya's post has been viewed more than 6 million times.

Multiple accounts, including Salasya's, shared another video allegedly showing Israel's Dimona nuclear power plant under siege by air. The video racked up hundreds of thousands of impressions across anti-Israel and pro-Iran pages — an X Community Note now appears below the video on Salasya's page, clarifying the images are of a March 2017 attack in Balaklia, Ukraine.

NewsGuard found that such posts have already garnered at least 21.9 million views across X. 

Posts inducing fear of domestic retaliatory attacks have also circulated online, including an unverified list of U.S. cities alleged to be top targets for Iranian sleeper cells — the list appears to have been written in Apple's Notes app.

Disinformation is only going to get worse

The acceleration of advanced generative AI and relaxed moderation policies across social media platforms have exacerbated an online misinformation crisis, experts have warned.

Particularly over recent months, including during the U.S.-led capture of Venezuelan leader Nicolas Maduro, NewsGuard researchers have noticed a pattern in online disinformation emerging over periods of breaking news.

"People now have a shorter window for the lapse between an event occurring and authentic visuals coming out of the media," explained Rubinson. To put it more bluntly: Users are losing their patience, used to an online environment where information is usually right at your fingertips. 

These brief periods, or voids, between breaking news reports and confirmed video or photos become fertile ground for disinformation bots and engagement farmers, Rubinson says. They also threaten to reinforce conspiratorial thinking — that mainstream news outlets are keeping information from the public, for example — and lend themselves to a user's own confirmation bias.

Political conflict is particularly ripe for the spreading of such misinformation, which is in turn strengthened by active disinformation campaigns from both sides of armed conflict. Researchers have found that a lack of proximity to events makes it easier to believe out-of-context or exaggerated information.

"It's an attempt to fill this fog of war," said Rubsinson. "It can be very overwhelming for people. They want to make sense of it, and visuals are a good way for us to process what is going on in war when we can't comprehend the scale of these conflicts." 

This becomes a greater problem as individuals increasingly use social media platforms as sole sources for news and as previously reliable fact-checking tools, including straightforward Google searches, become more unreliable.

SEE ALSO: U.S. government creates website to get around European content bans

AI is harming more than helping

AI chatbots and search have become embedded into the very fiber of real-world crisis events, as users turn to them as real-time fact-checkers. Rubinson said that nearly every X post NewsGuard analyzed included the same reply: "@Grok is this true?"

But AI assistants and platform chatbots, including X's Grok, are notoriously unreliable at disseminating and verifying breaking news. They are also inconsistent at applying their own platforms' moderation policies. The BBC found that Grok erroneously verified recent AI-generated images depicting Iranian military movements, for example. 

According to a second report by NewsGuard published March 3, Google's AI-powered Search Summaries have repeated misleading claims about the U.S.-Iran conflict when prompted with reverse image searches. For example, NewsGuard researchers uploaded a frame from a video shared online claiming to show the destruction of a CIA outpost in Dubai. Google's AI summary verified the story, writing: "The image shows a fire at a high-rise residential building in Dubai, UAE, reportedly occurring on March 1, 2026, following regional tensions. … Conflicting reports emerged regarding the cause, with some sources mentioning a drone strike and others referring to the building as a specific intelligence facility."

The video actually depicts a 2015 residential fire in the city of Sharjah.     

Security experts have sounded alarm bells over such "AI information threats," including AI tools used to generate and amplify misleading content. A report by the UK Centre for Emerging Technology and Security suggests the worsening information environment may pose existential threats to public safety, national security, and democracy without direct intervention. 

Meanwhile, civilians and journalists on the ground in Iran are fighting back against a near total internet blackout, following a massive push by the Trump administration and its ally Elon Musk to get Starlink internet connections to those on the ground. Bad actors, on the other hand, are still finding their way through the block and back onto sites like X.

Categories: IT General, Technology

Phi-4-reasoning-vision and the lessons of training a multimodal reasoning model

Microsoft Research - 2 hours 56 min ago
At a glance
  • Phi-4-reasoning-vision-15B is a compact and smart open‑weight multimodal reasoning model that balances reasoning power, efficiency, and training data needs. It is a broadly capable model that allows for natural interaction across a wide array of vision-language tasks and excels at math and science reasoning and at understanding user interfaces.
  • We share lessons learned and best practices for training a multimodal reasoning model, showing the benefits of careful architecture choices, rigorous data curation, and a mixture of reasoning and non-reasoning data.

We are pleased to announce Phi-4-reasoning-vision-15B, a 15 billion parameter open‑weight multimodal reasoning model, available through Microsoft Foundry (opens in new tab), HuggingFace (opens in new tab) and GitHub (opens in new tab). Phi-4-reasoning-vision-15B is a broadly capable model that can be used for a wide array of vision-language tasks such as image captioning, answering questions about images, reading documents and receipts, helping with homework, reasoning about changes in sequences of images, and much more. Beyond these general capabilities, it excels at math and science reasoning and at understanding and grounding elements on computer and mobile screens. In particular, our model presents an appealing value relative to popular open-weight models, pushing the pareto-frontier of the tradeoff between accuracy and compute costs: it is competitive with much slower models that require ten times or more compute time and tokens, and more accurate than similarly fast models, particularly when it comes to math and science reasoning.

Figure 1: Phi-4-reasoning-vision-15B presents a compelling option compared to existing models, pushing the pareto-frontier of the tradeoff between accuracy and compute costs. It is competitive with much slower models that require more time and tokens, and more accurate than similarly fast models. These values were computed by averaging accuracy, time, and output token counts over four benchmarks where we had logged these values: ChartQA_TEST, MathVista_MINI, MMMU_VAL, and ScreenSpot_v2.

In this post, we share the motivations, design choices, experiments, and learnings that informed its development, as well as an evaluation of the model’s performance and guidance on how to use it. Our goal is to contribute practical insight to the community on building smaller, efficient multimodal reasoning models and to share an open-weight model that is competitive with models of similar size at general vision-language tasks, excels at computer use, and excels on scientific and mathematical multimodal reasoning.

A focus on smaller and faster vision–language models

Many popular vision-language models (VLMs) have trended towards growing in parameter count and, in particular, in the number of tokens they consume and generate. This increases training and inference-time cost and latency, and impedes their usability for downstream deployment, especially in resource‑constrained or interactive settings.

A growing countertrend towards smaller (opens in new tab) models aims to boost efficiency, enabled by careful model design and data curation – a goal pioneered by the Phi family of models (opens in new tab) and furthered by Phi-4-reasoning-vision-15B. We specifically build on learnings from the Phi-4 and Phi-4-Reasoning language models and show how a multimodal model can be trained to cover a wide range of vision and language tasks without relying on extremely large training datasets, architectures, or excessive inference‑time token generation. Our model is intended to be lightweight enough to run on modest hardware while remaining capable of structured reasoning when it is beneficial, and it was trained with far less compute than many recent open-weight VLMs of similar size. We used just 200 billion tokens of multimodal data, building on Phi-4-reasoning (trained with 16 billion tokens), which is itself based on the core Phi-4 model (400 billion unique tokens), compared to the more than 1 trillion tokens used to train multimodal models like Qwen 2.5 VL (opens in new tab) and 3 VL (opens in new tab), Kimi-VL (opens in new tab), and Gemma3 (opens in new tab). We can therefore present a compelling option compared to existing models, pushing the pareto-frontier of the tradeoff between accuracy and compute costs.

Figure 2: Phi-4-Reasoning-Vision can help with a wide range of everyday tasks.

Lessons from training a multimodal model

Training a multimodal reasoning model raises numerous questions and requires many nuanced design choices around model architecture, dataset quality and composition, and the interaction between reasoning‑heavy and non-reasoning perception‑focused tasks.

Model architecture: Early- vs mid-fusion

Model architectures for VLMs differ primarily in how visual and textual information is fused. Mid-fusion models use a pretrained vision encoder to convert images into visual tokens that are projected into a pretrained LLM’s embedding space, enabling cross-modal reasoning while leveraging components already trained on trillions of tokens. Early-fusion models process image patches and text tokens in a single transformer, yielding richer joint representations but at significantly higher compute, memory, and data cost. We adopted a mid-fusion architecture as it offers a practical trade-off for building a performant model with modest resources.
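To make the mid-fusion pattern concrete, here is a minimal, self-contained PyTorch sketch. The modules and dimensions are stand-ins of our own choosing, not the actual Phi-4-reasoning-vision-15B components: a vision encoder produces visual tokens, a linear projector maps them into the language model's embedding space, and the language model attends over the concatenated sequence.

```python
import torch
import torch.nn as nn

class MidFusionVLM(nn.Module):
    """Minimal mid-fusion sketch. All components are illustrative
    stand-ins (tiny transformers), not the real Phi-4 modules."""

    def __init__(self, vision_dim=256, llm_dim=512, vocab_size=1000):
        super().__init__()
        # Stand-in "pretrained vision encoder": patch projection + transformer.
        self.patch_embed = nn.Linear(16 * 16 * 3, vision_dim)
        self.vision_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(vision_dim, nhead=4, batch_first=True),
            num_layers=2)
        # Projector maps visual tokens into the LLM's embedding space.
        self.projector = nn.Linear(vision_dim, llm_dim)
        # Stand-in "pretrained LLM": token embedding + transformer + LM head.
        self.tok_embed = nn.Embedding(vocab_size, llm_dim)
        self.llm = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(llm_dim, nhead=8, batch_first=True),
            num_layers=2)
        self.lm_head = nn.Linear(llm_dim, vocab_size)

    def forward(self, patches, token_ids):
        # patches: (B, N_img, 16*16*3) flattened RGB image patches.
        visual = self.vision_encoder(self.patch_embed(patches))  # (B, N_img, vision_dim)
        visual = self.projector(visual)                          # (B, N_img, llm_dim)
        text = self.tok_embed(token_ids)                         # (B, N_txt, llm_dim)
        # Mid-fusion: the LLM attends over [visual tokens; text tokens].
        joint = torch.cat([visual, text], dim=1)
        return self.lm_head(self.llm(joint))

model = MidFusionVLM()
patches = torch.randn(1, 36, 16 * 16 * 3)   # a 6x6 grid of 16x16 RGB patches
tokens = torch.randint(0, 1000, (1, 12))    # a short text prompt
logits = model(patches, tokens)             # (1, 36 + 12, vocab_size)
```

An early-fusion model would instead feed the raw patch embeddings and text tokens into one jointly trained transformer from the start, which is why it demands far more data and compute.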

Model architecture: Vision encoder and image processing

We build on the SigLIP-2 (opens in new tab) vision encoder and the Phi-4-Reasoning backbone. In previous research, we found that multimodal language models sometimes struggled to solve tasks not because of a lack of reasoning proficiency, but because of an inability to extract and select relevant perceptual information from the image. An example would be a high-resolution screenshot that is information-dense with relatively small interactive elements.

Several open-source multimodal language models have adapted their methodologies accordingly, e.g., Gemma3 (opens in new tab) uses pan-and-scan and NVILA (opens in new tab) uses Dynamic S2. However, their trade-offs are difficult to understand across different datasets and hyperparameters. To this end, we conducted an ablation study of several techniques, training a smaller 5 billion parameter Phi-4-based proxy model on a dataset of 10 million image-text pairs, primarily composed of computer-use and GUI grounding data. We compared:
  • Dynamic S2, which resizes images to a rectangular resolution that minimizes distortion while admitting a tiling by 384×384 squares;
  • Multi-crop, which splits the image into potentially overlapping 384×384 squares and concatenates their encoded features on the token dimension;
  • Multi-crop with S2, which broadens the receptive field by cropping into 1536×1536 squares before applying S2; and
  • Dynamic resolution, using the Naflex variant of SigLIP-2, a natively dynamic-resolution encoder with adjustable patch counts.
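For intuition, here is a rough sketch of the multi-crop style of tiling described above: split the image into 384×384 squares, letting edge tiles shift inward so they overlap rather than being dropped. This is an illustration under our own assumptions (function name and edge-handling policy are ours), not the ablation code itself.

```python
from PIL import Image

TILE = 384  # crop size used by the multi-crop / S2-style methods above

def multi_crop(img: Image.Image, tile: int = TILE):
    """Cover an image with tile x tile crops. Edge tiles are clamped to
    the image border, so the last row/column overlaps its neighbors.
    Images smaller than one tile yield a single (padded) crop."""
    w, h = img.size
    nx = max(1, -(-w // tile))  # ceil(w / tile)
    ny = max(1, -(-h // tile))  # ceil(h / tile)
    crops = []
    for j in range(ny):
        for i in range(nx):
            # Clamp the crop window so it never runs past the border.
            left = min(i * tile, max(0, w - tile))
            top = min(j * tile, max(0, h - tile))
            crops.append(img.crop((left, top, left + tile, top + tile)))
    return crops

# Example: a 1920x1080 screenshot yields a 5x3 grid of 384x384 crops.
screenshot = Image.new("RGB", (1920, 1080))
print(len(multi_crop(screenshot)))  # 15
```

Each crop is then encoded separately and the resulting features are concatenated on the token dimension, which is why crop count directly drives the visual token budget.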

Our primary finding is that dynamic resolution vision encoders perform best overall, and especially well on high-resolution data. It is particularly interesting to compare dynamic resolution with 2048 vs 3600 maximum tokens: the latter roughly corresponds to native HD 720p resolution and enjoys a substantial boost on high-resolution benchmarks, particularly ScreenSpot-Pro. Reinforcing the high-resolution trend, we find that multi-crop with S2 outperforms standard multi-crop despite using fewer visual tokens (i.e., fewer crops overall). The dynamic resolution technique produces the most tokens on average; due to their tiling subroutine, S2-based methods are constrained by the original image resolution and often only use about half the maximum tokens. From these experiments we choose the SigLIP-2 Naflex variant as our vision encoder.

Method | Max Tokens | MathVista | ScreenSpot | ScreenSpot-Pro | V*Bench
Dynamic-S2 | 3096 | 42.9 | 78.4 | 9.4 | 52.9
Multi-crop | 3096 | 43.4 | 67.8 | 5.4 | 51.8
Multi-crop with S2 | 2048 | 43.4 | 79.1 | 10.6 | 57.1
Dynamic resolution | 2048 | 45.2 | 81.5 | 9.2 | 51.3
Dynamic resolution | 3600 | 44.9 | 79.7 | 17.5 | 56.0

Table 1: Results with different resolution handling approaches. The top two configurations on each benchmark are in bold.

Data: Quality and composition

As with its language backbone Phi-4-Reasoning, Phi-4-reasoning-vision-15B was trained with a deliberate focus on data quality. Our final dataset consists primarily of data from three sources: open-source datasets which were meticulously filtered and improved; high-quality domain-specific internal data; and high-quality data from targeted acquisitions. The overwhelming majority of our data lies in the first category: data which originated as open-source data, which were significantly filtered and improved, whether by removing low-quality datasets or records, programmatically fixing errors in data formatting, or using open-source images as seeds to synthetically generate higher-quality accompanying text.

The process of improving open-source data began by manually reviewing samples from each dataset. Typically, 5 to 10 minutes were sufficient to classify data as excellent-quality, good questions with wrong answers, low-quality questions or images, or high-quality with formatting errors. Excellent data was kept largely unchanged. For data with incorrect answers or poor-quality captions, we re-generated responses using GPT-4o and o4-mini, excluding datasets where error rates remained too high. Low-quality questions proved difficult to salvage, but when the images themselves were high quality, we repurposed them as seeds for new caption or visual question answering (VQA) data. Datasets with fundamentally flawed images were excluded entirely. We also fixed a surprisingly large number of formatting and logical errors across widely used open-source datasets.

We extracted additional value from existing datasets through reformatting, diversification, and using images as seeds for new data generation. We generated detailed image descriptions alongside original QA pairs for math and science data, had data perform “double-duty” by embedding instruction-following requirements directly into domain-specific QA, created “scrambled,” “caption-matching,” and “what’s changed?” records to improve multi-image reasoning and sequential navigation for CUA scenarios, and diversified prompt styles to encourage robustness beyond perfectly structured questions.

To supplement the improved open-source data, we utilize high-quality internal datasets, several math-specific datasets which were acquired during training of the Phi-4 language model, and also some domain-specific curated data; for example, LaTeX-OCR data generated by processing and rendering equations from arXiv documents.

Figure 3: Phi-4-reasoning-vision-15B training data composition and examples

Data: Mathematics vs. computer-use data proportion

One of our goals was to train a model that performs well across general vision-language tasks, while excelling at mathematical and scientific reasoning and computer-use scenarios. How to structure datasets for generalizable reasoning remains an open question—particularly because the relationship between data scale and reasoning performance can lead to starkly different design decisions, such as training a single model on a large dataset versus multiple specialized models with targeted post-training.

Research on long-tailed classification robustness has suggested that balancing or removing data from overrepresented tasks or subgroups (opens in new tab) is an effective method for ensuring good performance. Nevertheless, these insights are not fully utilized or explored when it comes to training VLMs, which at times have favored scale over careful data balancing. To achieve our goals, we conducted a set of experiments to analyze a range of data ratios between our focus domains.

Using the same 5 billion parameter proxy model as for previous experiments, we trained while varying the amount of mathematics and science vs. computer-use data for each run. Each dataset included the same subset of 1 million general image-text pairs as a baseline. For mathematics and science data, we used a subsample of 150,000 records, optionally duplicating each one up to three times. Next, we included up to 450,000 computer-use records, and optionally an additional 400,000 from Phi-Ground.

We found that multimodal mathematics and science performance was not harmed by additional computer-use data, and vice versa. Interestingly, we found that increasing mathematics data by 3x while keeping computer-use data constant improved math, science, and computer-use benchmarks.
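A minimal sketch of how one such mixture could be assembled, assuming the datasets are simple record lists. The function and variable names are hypothetical; the post does not describe the real pipeline's sampling logic.

```python
import random

def build_mixture(general, math_sci, cua, math_dup=3):
    """Assemble one training mixture from the ablation grid: a fixed
    general-data baseline, math/science records duplicated `math_dup`
    times, and a variable amount of computer-use (CUA) data."""
    mixture = list(general)
    # Duplicating math/science records, e.g. 150K records -> 450K samples.
    mixture += [rec for rec in math_sci for _ in range(math_dup)]
    mixture += list(cua)
    random.shuffle(mixture)  # interleave domains within each epoch
    return mixture

# e.g., the "1M / 450K / 850K" row of Table 2 (names hypothetical):
# mixture = build_mixture(general_1m, math_150k, cua_850k, math_dup=3)
```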

General | Math and Science | CUA | Total | MMMU | MathVista | ScreenSpot-V2
1M | 150K | 450K | 1.6M | 44.0 | 37.4 | 48.2
1M | 150K | 850K | 2.0M | 44.1 | 37.3 | 60.0
1M | 450K | 450K | 1.9M | 45.3 | 36.0 | 48.3
1M | 450K | 850K | 2.3M | 43.4 | 38.9 | 63.1
1M | 150K | 150K | 1.3M | 44.2 | 36.9 | 29.8
1M | 150K | 250K | 1.4M | 45.4 | 37.4 | 37.7

Table 2: Varying the ratios of math and CUA data. Increasing math data by 3x while keeping computer-use data constant improves both math and computer-use benchmarks.

Data: Synthetic data for text-rich visual reasoning

Recent work (opens in new tab) suggests that targeted synthetic data can materially improve multimodal reasoning, particularly for text-rich visual domains such as charts, documents, diagrams, and rendered mathematics. Using images, questions, and answers that are programmatically generated and grounded in the visual structure enables precise control over visual content and supervision quality, resulting in data that avoids many annotation errors, ambiguities, and distributional biases common in scraped datasets. This enables cleaner alignment between visual perception and multi-step inference, which has been shown to translate into measurable gains on reasoning-heavy benchmarks.

Synthetic text-rich images expand coverage of long-tail visual formats that are underrepresented in real data but disproportionately impact reasoning accuracy, improving not only visual grounding but also downstream reasoning by ensuring that failures are less often caused by perceptual errors. We found that programmatically generated synthetic data is a useful augmentation to high-quality real datasets — not a replacement, but a scalable mechanism for strengthening both perception and reasoning that complements the training objectives in compact multimodal models such as Phi-4-reasoning-vision-15B.
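As a toy illustration of this idea, the sketch below renders a bar chart and emits a question-answer pair computed from the same values that drew the chart, so the supervision is correct by construction. This is our own minimal example, not the generation pipeline used for the model.

```python
import random
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

def make_chart_qa(path="chart.png"):
    """Render a simple bar chart and return a QA record whose answer
    is derived programmatically from the plotted values, avoiding the
    annotation errors common in scraped chart-QA data."""
    categories = ["North", "South", "East", "West"]
    values = [random.randint(10, 90) for _ in categories]

    fig, ax = plt.subplots()
    ax.bar(categories, values)
    ax.set_ylabel("Sales (units)")
    ax.set_title("Regional sales")
    fig.savefig(path)
    plt.close(fig)

    # The answer is grounded in the exact data behind the image.
    top = categories[values.index(max(values))]
    question = "Which region has the highest sales in the chart?"
    return {"image": path, "question": question, "answer": top}

print(make_chart_qa())
```

Because both the image and the label come from the same underlying values, generation can be scaled to cover long-tail chart styles, fonts, and layouts at will.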

Mixing non-reasoning and reasoning as a design objective

In language-only settings, reasoning traces have improved performance on many tasks, but they require additional compute, which adds undesired latency. In multimodal settings, this tradeoff is less clear-cut: for tasks such as image captioning and optical character recognition (OCR), reasoning is often unnecessary and can even be harmful (opens in new tab), while mathematical and scientific problem-solving benefits from multi-step reasoning. Thus, the choice of when to reason or not can be quite nuanced.

Training approaches for multimodal reasoning models

Language-only reasoning models are typically created through supervised fine-tuning (SFT) or reinforcement learning (RL): SFT is simpler but requires large amounts of expensive reasoning trace data, while RL reduces data requirements at the cost of significantly increased training complexity and compute. Multimodal reasoning models follow a similar process, but the design space is more complex. With a mid-fusion architecture, the first decision is whether the base language model is itself a reasoning or non-reasoning model. This leads to several possible training pipelines:

  • Non-reasoning LLM → reasoning multimodal training: Reasoning and multimodal capabilities are trained together.
  • Non-reasoning LLM → non-reasoning multimodal → reasoning multimodal training: Multimodal capabilities are learned first, then reasoning is added.
  • Reasoning LLM → reasoning multimodal training: A reasoning base is used, but all multimodal data must include reasoning traces.
  • Our approach: Reasoning LLM → mixed non-reasoning / reasoning multimodal training. A reasoning-capable base is trained on a hybrid data mixture, learning when to reason and when to respond directly.

Approaches 1 and 2 offer flexibility in designing multimodal reasoning behavior from scratch using widely available non-reasoning LLM checkpoints but place a heavy burden on multimodal training. Approach 1 must teach visual understanding and reasoning simultaneously and requires a large amount of multimodal reasoning data, while Approach 2 can be trained with less reasoning data but risks catastrophic forgetting, as reasoning training may degrade previously learned visual capabilities. Both risk weaker reasoning than starting from a reasoning-capable base. Approach 3 inherits strong reasoning foundations, but like Approach 1, it requires reasoning traces for all training data and produces reasoning traces for all queries, even when not beneficial.

Our approach: A mixed reasoning and non-reasoning model

Phi-4-reasoning-vision-15B adopts the 4th approach listed previously, as it balances reasoning capability, inference efficiency, and data requirements. It inherits a strong reasoning foundation but uses a hybrid approach to combine the strengths of alternatives while mitigating their drawbacks. Our model defaults to direct inference for perception-focused domains where reasoning adds latency without improving accuracy, avoiding unnecessary verbosity and reducing inference costs, and it invokes longer reasoning paths for domains, such as math and science, that benefit from structured multi-step reasoning (opens in new tab).

Our model is trained with SFT, where reasoning samples wrap a chain-of-thought section in dedicated reasoning markers before the final answer, covering domains like math and science. Non-reasoning samples are tagged to start with a special direct-response token, and cover perception-focused tasks such as captioning, grounding, OCR, and simple VQA. Reasoning data comprises approximately 20% of the total mix. Starting from a reasoning-capable backbone means this data grounds existing reasoning in visual contexts rather than teaching the model to reason from scratch.
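A sketch of what such a hybrid SFT target might look like. The control-token strings below are placeholders of our own choosing for illustration, not the model's actual special tokens.

```python
# Hypothetical control tokens: placeholders only, not the real strings
# used by Phi-4-reasoning-vision-15B.
THINK_OPEN, THINK_CLOSE = "<think>", "</think>"
NO_THINK = "<answer_direct>"

def format_sample(question, answer, reasoning=None):
    """Format one SFT target: reasoning samples (about 20% of the mix)
    wrap a chain of thought in think-markers before the final answer;
    non-reasoning samples lead with a direct-response marker."""
    if reasoning is not None:
        return f"{THINK_OPEN}{reasoning}{THINK_CLOSE}{answer}"
    return f"{NO_THINK}{answer}"

# Perception task -> direct response; math task -> reasoning trace.
print(format_sample("What does the sign say?", "STOP"))
print(format_sample("What is the area of the triangle in the image?",
                    "24", reasoning="base 8, height 6, so area = 8*6/2 = 24."))
```

Trained this way, the model learns the mode from the data distribution: perception-style queries elicit the direct-response token, while math and science queries elicit a reasoning trace first.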

This approach is not without limitations. The balance between modes is a direct function of design choices we made, informed by recent literature (opens in new tab) and observed model behavior during training—though the boundary between modes can be imprecise as it is learned implicitly from the data distribution. Our model allows control through explicit prompting with the dedicated reasoning or direct-response control tokens when the user wants to override the default reasoning behavior. The 20/80 reasoning-to-non-reasoning data split may not be optimal for all domains or deployment contexts. Evaluating the ideal balance of data and the model’s ability to switch appropriately between modes remains an open problem.

We view this mixed approach not as a definitive solution, but as one practical and well-motivated point in the design space for balancing latency, accuracy, and flexibility in multimodal systems.

Applications

Figure 4: Phi-4-Reasoning-Vision can interpret sequences of images

Phi-4-reasoning-vision-15B is a high-performing model across many vision-language tasks. It sees and understands the world by looking at a photo, document, chart, or screen and making sense of it. In practice that covers an enormous range of applications — just a few examples include describing images and answering questions about them, interpreting changes and trends in image sequences, recognizing objects and landmarks, and transcribing text.

Highlights: Scientific and mathematical reasoning and supporting computer-using agents (CUA)

In addition to general vision and language tasks, Phi-4-reasoning-vision-15B was designed to excel at tasks that combine visual input with structured inference: solving math problems presented in visual form (such as handwritten or diagram-based questions), extracting and reasoning over quantitative information in documents and charts, and supporting multi-step reasoning in educational or scientific analysis contexts.

Figure 5: Phi-4-reasoning-vision-15B is great at math and science

Figure 6: Phi-4-reasoning-vision-15B can help with written math problems

In addition, we trained Phi-4-reasoning-vision-15B to have skills that can enable agents to interact with graphical user interfaces by interpreting screen content and selecting actions. With strong high-resolution perception and fine-grained grounding capabilities, Phi-4-reasoning-vision-15B is a compelling base model for training agentic models, such as ones that navigate desktop, web, and mobile interfaces by identifying and localizing interactive elements such as buttons, menus, and text fields. Due to its low inference-time needs, it is a great fit for interactive environments where low latency and compact model size are essential.

Figure 7: Phi-4-reasoning-vision-15B can help navigate computer UIs

Evaluation

Phi-4-reasoning-vision-15B was evaluated for accuracy and timing using two complementary open-source frameworks to ensure both rigorous and standardized analysis: Eureka ML Insights (opens in new tab) and VLMEvalKit (opens in new tab).

Benchmark | Phi-4-reasoning-vision-15B | Phi-4-reasoning-vision-15B – force nothink | Phi-4-mm-instruct | Kimi-VL-A3B-Instruct | gemma-3-12b-it | Qwen3-VL-8B-Instruct-4K | Qwen3-VL-8B-Instruct-32K | Qwen3-VL-32B-Instruct-4K | Qwen3-VL-32B-Instruct-32K
AI2D_TEST | 84.8 | 84.7 | 68.6 | 84.6 | 80.4 | 82.7 | 83 | 84.8 | 85
ChartQA_TEST | 83.3 | 76.5 | 23.5 | 87 | 39 | 83.1 | 83.2 | 84.3 | 84
HallusionBench | 64.4 | 63.1 | 56 | 65.2 | 65.3 | 73.5 | 74.1 | 74.4 | 74.9
MathVerse_MINI | 44.9 | 43.8 | 32.4 | 41.7 | 29.8 | 54.5 | 57.4 | 64.2 | 64.2
MathVision_MINI | 36.2 | 34.2 | 20 | 28.3 | 31.9 | 45.7 | 50 | 54.3 | 60.5
MathVista_MINI | 75.2 | 68.7 | 50.5 | 67.1 | 57.4 | 77.1 | 76.4 | 82.5 | 81.8
MMMU_VAL | 54.3 | 52 | 42.3 | 52 | 50 | 60.7 | 64.6 | 68.6 | 70.6
MMStar | 64.5 | 63.3 | 45.9 | 60 | 59.4 | 68.9 | 69.9 | 73.7 | 74.3
OCRBench | 76 | 75.6 | 62.6 | 86.5 | 75.3 | 89.2 | 90 | 88.5 | 88.5
ScreenSpot_v2 | 88.2 | 88.3 | 28.5 | 89.8 | 3.5 | 91.5 | 91.5 | 93.7 | 93.9

Table 3: Accuracy comparisons relative to popular open-weight, non-thinking models

Benchmark | Phi-4-reasoning-vision-15B | Phi-4-reasoning-vision-15B – force thinking | Kimi-VL-A3B-Thinking | gemma-3-12b-it | Qwen3-VL-8B-Thinking-4K | Qwen3-VL-8B-Thinking-40K | Qwen3-VL-32B-Thinking-4K | Qwen3-VL-32B-Thinking-40K
AI2D_TEST | 84.8 | 79.7 | 81.2 | 80.4 | 83.5 | 83.9 | 86.9 | 87.2
ChartQA_TEST | 83.3 | 82.9 | 73.3 | 39 | 78 | 78.6 | 78.5 | 79.1
HallusionBench | 64.4 | 63.9 | 70.6 | 65.3 | 71.6 | 73 | 76.4 | 76.6
MathVerse_MINI | 44.9 | 53.1 | 61 | 29.8 | 67.3 | 73.3 | 78.3 | 78.2
MathVision_MINI | 36.2 | 36.2 | 50.3 | 31.9 | 43.1 | 50.7 | 60.9 | 58.6
MathVista_MINI | 75.2 | 74.1 | 78.6 | 57.4 | 77.7 | 79.5 | 83.9 | 83.8
MMMU_VAL | 54.3 | 55 | 60.2 | 50 | 59.3 | 65.3 | 72 | 72.2
MMStar | 64.5 | 63.9 | 69.6 | 59.4 | 69.3 | 72.3 | 75.5 | 75.7
OCRBench | 76 | 73.7 | 79.9 | 75.3 | 81.2 | 82 | 83.7 | 85
ScreenSpot_v2 | 88.2 | 88.1 | 81.8 | 3.5 | 93.3 | 92.7 | 83.1 | 83.1

Table 4: Accuracy comparisons relative to popular open-weight, thinking models

Our model balances thinking and non-thinking performance – on average showing better accuracy in its default “mixed-reasoning” behavior than when forced into thinking or non-thinking mode. Only in a few cases does forcing a specific mode improve performance (MathVerse_MINI and MMMU_VAL for thinking, and ScreenSpot_v2 for non-thinking). Compared to recent popular, open-weight models, our model provides a desirable trade-off between accuracy and cost (as a function of inference-time compute and output tokens), as discussed previously.

Note: All numbers here are the result of running benchmarks ourselves and may be lower than other previously shared numbers. Instead of quoting leaderboards, we performed our own benchmarking, so we could understand scaling performance as a function of output token counts for related models. We made our best effort to run fair evaluations and used recommended evaluation platforms with model-specific recommended settings and prompts provided for all third-party models. For Qwen models we use the recommended token counts and also ran evaluations matching our max output token count of 4096. For Phi-4-reasoning-vision-15B, we used our system prompt and chat template but did not do any custom user-prompting or parameter tuning, and we ran all evaluations with temperature=0.0, greedy decoding, and 4096 max output tokens. These numbers are provided for comparison and analysis rather than as leaderboard claims. For maximum transparency and fairness, we will release all our evaluation logs publicly. For more details on our evaluation methodology, please see our technical report (opens in new tab).
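For reference, here is a minimal sketch of running the model under the decoding settings described above (greedy decoding, 4096 max output tokens) via Hugging Face transformers. The model identifier, prompt format, and processor usage are assumptions patterned after other Phi multimodal releases; consult the model card for the actual values, chat template, and system prompt.

```python
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM

# Assumed model id; check the Hugging Face model card for the real one.
MODEL_ID = "microsoft/Phi-4-reasoning-vision-15B"

processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto",
    trust_remote_code=True)

image = Image.open("chart.png")  # any local image
# Prompt format assumed from other Phi vision models; may differ here.
prompt = "<|user|><|image_1|>What is the highest value in this chart?<|end|><|assistant|>"
inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)

# Decoding settings matching the evaluation setup described above:
# greedy decoding (no sampling) with a 4096-token output budget.
outputs = model.generate(**inputs, do_sample=False, max_new_tokens=4096)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```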

Safety

As with other Phi models, Phi-4-reasoning-vision-15B was developed with safety as a core consideration throughout training and evaluation. The model was trained on a mixture of public safety datasets and internally generated examples designed to elicit behaviors the model should appropriately refuse, in alignment with Microsoft’s Responsible AI Principles. For further details, check out our technical report (opens in new tab).

Open release and community engagement

Phi-4-reasoning-vision-15B is available on Microsoft Foundry (opens in new tab) and HuggingFace (opens in new tab) with additional examples and details on GitHub (opens in new tab). For additional guidance on how to use our model properly and safely, please refer to our Model card (opens in new tab). For further details on the technical aspects of the model, training, and evaluation, see our technical report (opens in new tab).

In line with our goal of supporting future AI development in the community, Phi-4-reasoning-vision-15B is released under a permissive license with model weights, fine‑tuning code, and benchmark logs. We intend this release to complement existing work by providing concrete artifacts that help close gaps in understanding how compact multimodal reasoning models can be built and studied.

Looking forward

Smaller vision–language models with selective, task‑aware reasoning offer one promising direction for making multimodal systems more practical and accessible. We present our model and its learnings to inform ongoing research in multimodal modeling, computer‑using agents, and mathematical and scientific reasoning. We hope these details are useful to researchers exploring similar tradeoffs and invite critical evaluation, replication, and extension by the community. If you’d like to join us and help shape the future of multimodal models, please apply for one of our open roles.

Acknowledgements

We thank Rachel Ward for her extensive work on data collection and curation. We thank the GenDatasets, PhiGround, SimCity, and Fara-7B efforts for invaluable training data. We thank Harkirat Behl, Mojan Javaheripi, and Suriya Gunasekar for providing us with Phi-4 checkpoints and guidance on training with Phi models. We additionally thank Sahaj Agarwal, Ahmed Awadallah, Qi Dai, Gustavo de Rosa, Rafah Hosn, Ece Kamar, Piero Kauffmann, Yash Lara, Chong Luo, Caio César Teodoro Mendes, Akshay Nambi, Craig Presti, Matthew Rosoff, Corby Rosset, Marco Rossi, Kashyap Patel, Adil Salim, Sidhartha Sen, Shital Shah, Pratyusha Sharma, Alexey Taymanov, Vibhav Vineet, John Weiss, Spencer Whitehead, the AI Frontiers Team and Leadership, and Microsoft Research Leadership, for their valuable help, insightful discussions, and continued support throughout this work.


The post Phi-4-reasoning-vision and the lessons of training a multimodal reasoning model appeared first on Microsoft Research.

Categories: Microsoft

I don’t care if my phone gets long-term updates

How-To Geek - 3 hours 2 min ago

Lengthy software support has become one of the main selling points in modern smartphones. Manufacturers that offer only three or four years of updates are often criticized when competitors promise half a decade or more of support, even on mid-range phones.

Categories: IT General, Technology

Dead laptops, old DVRs, and PS4s: How to harvest free SATA drives for your PC

How-To Geek - 3 hours 17 min ago

You probably have a lot of old hardware that has outlived its usefulness. And if you're desperately looking for storage in these dire times of price hikes, you might be pleasantly surprised.

Categories: IT General, Technology

Keep a tidier home with $400 off the Mova Z60 Ultra Roller Complete Robot Vacuum and Mop

Mashable - 3 hours 27 min ago

SAVE $400.01: As of March 4, get the Mova Z60 Ultra Roller Complete Robot Vacuum and Mop for $1,098.99 at Amazon, down from its usual price of $1,499. That's a discount of 27%.


Tired of spending all your extra time vacuuming and mopping your home? It's 2026, and you've got better things to do. You can offload those tasks to a robot vacuum and recoup that lost time doing things you actually like. And we've found a great model that can save you both time and money, so you can get back to living your life instead of doing menial tasks.

As of March 4, get the Mova Z60 Ultra Roller Complete Robot Vacuum and Mop for $1,098.99 at Amazon, down from its usual price of $1,499. That's a discount of 27%.

SEE ALSO: The Shark Matrix Plus 2-in-1 robot vacuum is down to a record-low $299.99 at Amazon

This powerful robot vacuum and mop combo can handle all the dirty work you don't want to do. It has 28,000Pa of suction combined with a tangle-free brush, so it can not only cut through dirt and debris while capturing up to 99% of large dirt particles, but also pick up human and pet hair without tangling. Its TurboForce 8 high-speed motor ensures it does all this without any hiccups.

After you've had the robovac go over your home with a fine-toothed comb to pick up the dirt, you can return with the mop, which uses a real-time clean-water spray to rinse the mop as it cleans, avoiding cross-contamination. It also uses smart fluffing to better maintain the mop, so it doesn't get dingy and worn over time, which would ruin its performance.

If you're ready to turn your cleaning routine over to the robots, this is an excellent option to rely on that can pretty much handle itself. And with $400 off, now's the perfect time to buy it, too.

Categories: IT General, Technology

Stop fixing Excel formulas: 5 vital habits for data integrity

How-To Geek - 3 hours 32 min ago

Whether you're an Excel novice or a seasoned spreadsheet pro, we all share one dream: for our formulas to just... work. These five data integrity habits ensure your data stays clean, and your formulas stay functional, giving you more time to focus on the stuff that actually matters.

Categories: IT General, Technology

Your Meta Ray-Ban smart glasses recordings aren't private

Mashable - 3 hours 41 min ago

The things you record with your AI-powered Meta Ray-Ban glasses — yes, even those intimate moments where you think you're alone — are probably being seen by strangers.

An investigation by Swedish outlets Svenska Dagbladet and Göteborgs-Posten found that offshore Meta workers in Kenya were asked to analyze intimate and even "disturbing" videos taken by glasses wearers, including videos taken in bathrooms, footage featuring nudity and sexual content, and images showing personal information like bank accounts. It's part of a process known as data labeling, used to train AI models with footage first reviewed and annotated by humans so that the AI can understand what it's "looking" at.

SEE ALSO: TikTok won't encrypt your DMs

Workers told the publication that many of the videos appear to be moments captured when users weren't aware they were being recorded. The group works under Sama, the same Meta contractor facing a class action lawsuit on behalf of content moderators who allege they have been exploited and forced to review traumatic content without proper working conditions.

"You understand that it is someone’s private life you are looking at, but at the same time you are just expected to carry out the work. You are not supposed to question it. If you start asking questions, you are gone," one employee told the publications.

Meta's Terms of Service reserves the right to send users' interactions with its AI services, including its always-on live AI features, to human moderators — the company referred to this policy when asked for comment by the news outlets.

The Meta Ray-Ban smart glasses collaboration initially launched in 2023 to mixed reviews about its photo and video capabilities and AI features. Meta released the upgraded AI-powered Meta Ray-Ban Display model in September, complete with a new Neural Band interface and promises of AI assistant integrations that would turn them into glasses of the future.

Sales of the glasses tripled in 2025, CNBC reported, with more than 7 million units sold.

But in the months since, Meta's wearable eye camera device has received widespread blowback, following a rise in influencer content depicting Meta glasses wearers secretly recording and even harassing unsuspecting strangers. Wearers have deduced ways to obscure the glasses' always-on recording light, intended to alert the public when a user is taking video, and instead turned the smart device into a tool for viral pickup artists and pranksters.

In addition to concerns about personal consent, the device has prompted worries about a fast-growing web of surveillance and facial recognition tech, which Meta has previously come under fire for. The company later said it was moving ahead with live AI features, including potential facial recognition, in 2025 — with the upgrade, a device would "always keep its cameras and sensors turned on and use AI to remember what its wearer encountered throughout a day." Privacy advocates also warn the technology could one day be harnessed by third parties, including the federal government's own militarized police forces.

Categories: IT General, Technology

Amazon has the Ninja Slushi on sale for $50 off and it comes with a free $15 Amazon credit

Mashable - 3 hours 42 min ago

SAVE $50 + GET A FREE $15 AMAZON CREDIT: The Ninja Slushi is on sale at Amazon for $299.99, down from the normal price of $349.99. That's a 14% discount, and it comes with a free $15 Amazon credit.


We move the clocks forward by one hour this weekend. Yes, that means we all lose an hour of sleep. But it also means that long summer nights are coming. Soon we'll all be outside in the backyard until sunset at 8 p.m. If you're planning epic parties for the summer, check out this deal at Amazon.

As of March 4, the Ninja Slushi is on sale at Amazon for $299.99, down from the normal price of $349.99. That's a 14% discount, and it comes with a free $15 Amazon credit.

With an 88-ounce capacity, the Ninja Slushi is ready to party. It has five pre-set programs that include slush, spiked slush, frappe, milkshake, and frozen juice. That means the Ninja Slushi can get invited to every party from the kids' end-of-school celebration to the 4th of July barbecue.

Depending on your ingredients, you can whip up frozen drinks in as little as 15 minutes with the Ninja Slushi. Others will take up to one hour, but that's still acceptable for party prep. Plus, the Slushi can keep drinks properly chilled and slush-ified to the perfect texture for up to 12 hours.

SEE ALSO: The Dreame L10s Ultra robot vacuum is under $300 at Amazon — save over $40

When it comes time to clean up, the Slushi has a rinse cycle that'll get it prepped, and many of the parts are dishwasher-safe. On Mashable's list of the best Ninja appliances, the Slushi earns the top spot as the best for making frozen drinks.

Mashable Senior Shopping Reporter Leah Stoddart tested out the Ninja Slushi and wrote, "The refreshing spin on mundane bevs is a serotonin booster worth trying." She also mentioned it's quiet while operating and she enjoyed slushing everything from frosé to Pepsi and even a pre-workout electrolyte drink.

While the Ninja Slushi is on sale for under $300 at Amazon, make the upgrade to a more hydrated summer. Amazon is also tossing in a free $15 credit so you can grab some fancy glasses for your frozen bevvies.

Categories: IT General, Technology

These 5 Sonos features are still better than any alternative I’ve tried

How-To Geek - 3 hours 47 min ago

I’m old enough to say that I’ve been a Sonos user pretty much since the start, and let’s be honest—the last couple of years have been tough on the brand. They’re still recovering from the app overhaul gaffe that frustrated a lot of customers (myself included), while competitors like Bluesound and Wiim have taken a few bites out of their lunch.

Categories: IT General, Technology

The $599 MacBook Neo is the new budget laptop to beat

How-To Geek - 4 hours 1 min ago

After months of rumors, leaks, and speculation, Apple has finally revealed its new affordable laptop: the MacBook Neo. There are some notable downgrades compared to the MacBook Air, but it looks like a fantastic computer for $599, and education customers can get it for just $499.

Categories: IT General, Technology

Review: The $499 Pixel 10a does something Samsung and Apple can't

Mashable - 4 hours 2 min ago

Google's new Pixel 10a lacks novelty, but for a budget phone, that's not such a bad thing.

The Android maker's $500 handset walks the same road that its predecessors have for several years now: It's a slightly downgraded version of last year's Pixel 10.

So, what's new this time around? Besides the lower price and slightly downgraded specs, the camera array is now completely flush with the rest of the phone, eliminating the camera bump entirely (take notes, Samsung and Apple). The end result is a phone that's a lot like the more expensive Pixel 10 and still $100 less than the new budget iPhone 17e Apple just announced.

The Pixel 10a may not be a conversation starter, but not every phone needs to be one, and it's a good smartphone nonetheless.

At launch, you can buy the Google Pixel 10a at Amazon and choose either a free $100 gift card or a free pair of Pixel Buds 2a.

SEE ALSO: Samsung Galaxy S26 vs. Google Pixel 10: Comparing specs, prices

Google Pixel 10a: Specs Credit: Joe Maldonado / Mashable

Without wasting too much time, here's what you can expect specs-wise from the Pixel 10a:

  • 6.3-inch display with 1080x2424 resolution and 60-120Hz adaptive refresh rate

  • Up to 3,000 nits peak brightness

  • Google Tensor G4 processor

  • 5,100mAh battery

  • 8GB RAM

  • 128/256GB storage

The two most important points of comparison here are going to be the Pixel 9a and Pixel 10. Let's start with the former. The display size and specs are nearly identical, though the new Pixel 10a sports 3,000 nits of peak brightness, making it the brightest A-series Pixel phone to date. Google kept the processor and battery size the same, and didn't mess with RAM or storage, either. You're not getting less with Pixel 10a than you got with Pixel 9a, but you're not really getting much more, either.

Perhaps more crucially, the Pixel 10a is very similar specs-wise to the Pixel 10, a phone that starts at $800. The display is basically identical, the storage options are too, and the battery cell is actually slightly bigger in Pixel 10a than Pixel 10.

The main things Google cut to get the price down to $500 are the RAM (Pixel 10 has 12GB) and the processor. Pixel 10a uses the comparatively old Tensor G4 chip rather than the Tensor G5 introduced with the Pixel 10 lineup last year. This is a little strange and a departure from how Google usually handles the A-series phones, but it doesn't actually affect practical, daily use much.

Finally, there's also one notable camera downgrade from Pixel 10 to 10a, but we'll get to that later.

Google Pixel 10a: Design No more camera bump. Credit: Joe Maldonado / Mashable

Pixel 10a comes in four colors: Lavender, Berry, Fog, and Obsidian. Our review unit is the Lavender model, and I think it looks lovely. That said, for once, the colors aren't the most interesting part of the redesign.

Last year, with the Pixel 9a, Google removed the iconic horizontal camera bar that adorns regular Pixel phones, in favor of something much less visible and intrusive. Pixel 10a maintains the same basic look, but the camera has been sanded down even more, so it's completely flush with the backside of the device.

Yes, unlike most other modern smartphones, you can lay the Pixel 10a down on its back on a flat surface and there will be no wobble whatsoever. Your expensive Samsung Galaxy or iPhone 17 could never.

It's not a huge change, but it's welcome, nonetheless. I was at first opposed to making Pixel A-series phones look different, but thanks to this alteration, I think I now prefer the way they look over the big boy phones.

For what it's worth, Google also slimmed down the bezels, which is always nice. The new phone also uses Corning Gorilla Glass 7i, as opposed to Gorilla Glass 3 on Pixel 9a. It should be more durable now, but frankly, I'm not going to do a bunch of violent drop tests with a review unit phone to find out. That's what JerryRigEverything is for.

Google Pixel 10a: Software Credit: Joe Maldonado / Mashable

This is really the only somewhat sketchy part of the Pixel 10a. Remember earlier when I mentioned that Google didn't bring forward the Tensor G5 chipset from last year? It turns out this decision has some consequences for which flashy AI features are and aren't available on the Pixel 10a.

Put simply, this phone does not have complete AI parity with the Pixel 10. Perhaps the biggest missing feature is Magic Cue, a context-dependent assistant that brings up relevant information based on what's happening on your screen. On Pixel 10, if someone texted you to ask about an event in your calendar, Magic Cue would automatically surface the calendar entry for your convenience. That will not happen on Pixel 10a, because Magic Cue just isn't available here.

Credit: Google

The same goes for Daily Hub, a Pixel 10 feature that would act as a, well, hub for news, sports information, and YouTube videos based on your personal interests. To be fair, though, this may not be a chipset problem; Google removed Daily Hub from Pixel 10 to iron out some kinks, and as of January, it's still not back. Automatic voice translation in phone calls is also missing, which is unfortunate because that was probably the best AI feature of the bunch last year.

While the Pixel 10a can't fully replicate the Pixel 10's AI portfolio, it does bring forward Camera Coach, a Pixel 10 feature that uses AI to help amateur photographers find the best shot composition for whatever photo they're taking. That's kind of it, though. The other noteworthy AI features, like Best Take (which can combine several shots to produce a perfect take), Circle to Search, and Gemini support, were already present on the Pixel 9a and don't seem to work any differently here.

Google Pixel 10a: Performance and battery life Credit: Joe Maldonado / Mashable

For as much as it kinda stinks, from a software-availability perspective, that Google stuck with the older Tensor G4 chip, the choice doesn't affect day-to-day performance much. Apps load quickly, everything runs smoothly, and generally speaking, I can't think of anything performance-wise that went horribly wrong in my time with the Pixel 10a. I can't provide Geekbench benchmarking metrics because that app isn't compatible with the Pixel 10a at the time of writing, but trust me, the phone works fine.

Bringing along the newer Tensor G5 processor might have helped with battery life, though. Battery life on the Pixel 10a isn't bad by any means, but it isn't noticeably better than the Pixel 9a's, either. The cell size is the same, and Google still rates it for about 30 hours of use, which you can easily hit as long as you don't do too much YouTubing or other battery-intensive activity. My charges lasted closer to 24 hours, which is acceptable for a $500 phone but not remarkable.

The good news is that Pixel 10a now supports 30W fast wired charging and 10W wireless charging, so if you have compatible adapters or chargers, you won't need to wait too long for the Pixel 10a to fill up on juice. I wish Google had improved the battery more, but giving users more charging options is nice, at any rate.

Google Pixel 10a: Cameras Credit: Google

Aside from Camera Coach, Google doesn't seem to have added any fun new photography features to Pixel 10a. The camera specs themselves are also identical to Pixel 9a:

  • 48MP wide

  • 13MP ultra-wide

  • 13MP selfie

One major thing that's missing from the Pixel 10a is a third telephoto lens. That was one of the biggest and best additions to the Pixel 10 family last year, and it's not included here, presumably to keep costs down. Still, the digital zoom does a decent enough job by itself.

No zoom. Credit: Alex Perry/Mashable Max zoom. Credit: Alex Perry/Mashable

While the camera setup here isn't any better than Pixel 9a, it's also not any worse. Colors pop and images look sharp, as you can see in the photographs I took while testing the Pixel 10a.

Feeling blue? Credit: Alex Perry/Mashable

Last year, Google brought Macro Focus to Pixel 9a, enabling up-close shots of tiny objects. It's still here, and it still works fine.

Macro-licious. Credit: Alex Perry/Mashable

And Night Sight continues to do its thing, enabling nice shots in the dark.

Spring hasn't sprung yet in New York. Credit: Alex Perry/Mashable Google Pixel 10a: Final thoughts

The Google Pixel 10a is a perfectly usable phone that doesn't cost too much money. That's really all you can say about it; these mid-range spin-off phones seem to get less and less interesting every year.

Its cameras are fine, but they're no different from the Pixel 9a's. The same goes for the processor and most of the software. It is really disappointing that Google couldn't give the Pixel 10a fuller AI parity with the Pixel 10, though. This feels like a slightly souped-up Pixel 9a with a flatter backside and some nice charging options.

Hopefully, Google can make the Pixel 11a a little more wow-worthy next year.

Categories: IT General, Technology

How to watch Real Sociedad vs. Athletic Bilbao in the Copa del Rey online for free

Mashable - 4 hours 2 min ago

TL;DR: Live stream Real Sociedad vs. Athletic Bilbao in the Copa del Rey for free on ITVX. Access this free streaming platform from anywhere in the world with ExpressVPN.

The Copa del Rey semifinals were set up to be entertaining before a ball was kicked. Barcelona vs. Atlético Madrid is obviously a huge matchup, but the Basque Derby is one of the biggest games in football.

Real Sociedad won the first leg 1-0 at the San Mames Stadium. They now need to defend that lead at the Anoeta Stadium. Expect Athletic Bilbao to go on the offensive from the first whistle. It's going to be a fascinating contest between two extremely talented sides. The winner will book a spot in the showpiece event against Atlético Madrid after Diego Simeone's side survived a Barcelona comeback in the second leg.

If you want to watch Real Sociedad vs. Athletic Bilbao in the Copa del Rey from anywhere in the world, we have all the information you need.

When is Real Sociedad vs. Athletic Bilbao?

Real Sociedad vs. Athletic Bilbao in the Copa del Rey kicks off at 3 p.m. ET on March 4. This fixture takes place at the Anoeta Stadium.

How to watch Real Sociedad vs. Athletic Bilbao for free

Real Sociedad vs. Athletic Bilbao in the Copa del Rey is available to live stream for free on ITVX.

ITVX is geo-restricted to the UK, but anyone can access this free streaming platform with a VPN. These tools can hide your real IP address (digital location) and connect you to a secure server in the UK, meaning you can unblock ITVX to stream the Copa del Rey for free from anywhere in the world.

Live stream Real Sociedad vs. Athletic Bilbao in the Copa del Rey for free by following these simple steps:

  1. Subscribe to a streaming-friendly VPN (like ExpressVPN)

  2. Download the app to your device of choice (the best VPNs have apps for Windows, Mac, iOS, Android, Linux, and more)

  3. Open up the app and connect to a server in the UK

  4. Visit ITVX

  5. Watch Real Sociedad vs. Athletic Bilbao for free from anywhere in the world

The best VPNs for streaming are not free, but most offer free trials or money-back guarantees. By leveraging these offers, you can access free live streams of the Copa del Rey without actually spending anything. This obviously isn't a long-term solution, but it gives you enough time to stream Real Sociedad vs. Athletic Bilbao in the Copa del Rey before recovering your investment.

What is the best VPN for ITVX?

ExpressVPN is the best choice for bypassing geo-restrictions to stream live sport on ITVX, for a number of reasons:

  • Servers in 105 countries including the UK

  • Easy-to-use app available on all major devices including iPhone, Android, Windows, Mac, and more

  • Strict no-logging policy so your data is secure

  • Fast connection speeds free from throttling

  • Up to 10 simultaneous connections

  • 30-day money-back guarantee

A two-year subscription to ExpressVPN is on sale for $68.40 and includes an extra four months for free, an 81% saving for a limited time. This plan includes a year of free unlimited cloud backup and a generous 30-day money-back guarantee. Alternatively, you can get a one-month plan for $12.95 (with a money-back guarantee).

Live stream Real Sociedad vs. Athletic Bilbao in the Copa del Rey for free with ExpressVPN.

Categories: IT General, Technology

MacBook Neo's real killer feature: Its $499 education pricing

Mashable - 4 hours 11 min ago

Apple's budget laptop, the MacBook Neo, is here. There has been plenty of hype around Apple's cheapest-ever MacBook and its $599 starting price. At that price, the MacBook line is finally within financial reach for many potential users who just couldn't see themselves spending $1,000 or more on a laptop.

While $599 is a pretty affordable deal for a MacBook, knocking off another $100 would give the budget PC laptop market some truly stiff competition. And that's exactly what Apple has done with its education pricing.

If you go to Apple's Education Store for students and education professionals, Apple is selling the base MacBook Neo, which comes with 256GB of storage, for a discounted $499. If you want to double the storage to 512GB and add Touch ID, the education price is only $599, the exact same price as the 256GB base model without the education discount.

SEE ALSO: Apple's MacBook Neo comes in citrus yellow and blush pink. The internet has thoughts.

This is the killer deal that makes the MacBook Neo a real problem for Windows laptop manufacturers. In fact, of everything Apple announced this week, the MacBook Neo is the only new product whose announcement mentions the $499 education pricing in the body copy and not just in the pricing bullet points.

A $599 MacBook is certainly cheap, but a sub-$500 one, even if the difference is mostly psychological, puts it within reach of far more potential buyers.

Credit: Mashable

At $499, Apple opens the door for a whole new user base of students and other first-time Mac owners who aren't necessarily in creative industries or art schools and don't require a super powerful laptop to handle 3D modeling, video editing, or graphic design work.

SEE ALSO: How to preorder the 2026 Apple Studio Display, including the mini-LED Studio Display XDR

There's also that not-so-secret "loophole" that many will undoubtedly take advantage of when buying the MacBook Neo: Apple does not check or verify whether consumers are eligible for the education discount. Technically, anyone can order the $499 MacBook Neo in Apple's Education Store.

By mentioning that $499 education pricing in its announcement, though, maybe Apple doesn't even mind, so long as the trade-off is taking over the budget laptop market with the MacBook Neo, the company's most affordable MacBook yet.

Categories: IT General, Technology

The Shark Matrix Plus 2-in-1 robot vacuum is down to a record-low $299.99 at Amazon

Mashable - 4 hours 13 min ago

SAVE 57%: As of March 4, the Shark Matrix Plus 2-in-1 robot vacuum-mop is on sale for $299.99, down from $699.99, at Amazon. That's a 57% discount or $400 in savings.

If you're sick of vacuuming and mopping your floors but refuse to pay $1,000 to have a robot do it for you, it’s time to look past Roomba (they filed for bankruptcy for a reason!). Shark makes some of the best hybrid vacuums on the market, and one of our favorite models is currently on sale for less than $300.

As of March 4, the Shark Matrix Plus 2-in-1 robot vacuum-mop is on sale for $299.99, down from $699.99, at Amazon. That's a 57% discount or $400 in savings. Plus, if you happen to get approved for an Amazon Business Prime Card, you can knock another $125 off the total.

SEE ALSO: Review: I can't believe how much I loved the Shark AI Ultra 2-in-1 robot vacuum

Our vacuum expert, Leah Stodart, tested this exact model and named it the "Best Budget Shark Robot Vacuum and Mop Combo," mostly because it actually mops. While most budget hybrids just drag a wet rag across your floor, the Matrix Plus uses sonic mopping that scrubs 100 times per minute to lift dried stains.

"In my testing, I watched the Matrix Plus 2-in-1 successfully clear crumbs near the kitchen counter, kitty litter in my bathroom, and minor drops on hardwood or tile several times — messes that similarly-priced Roombas I tested couldn't conquer in one pass," Stodart writes.

It also features a self-cleaning brushroll that won't get tangled with pet hair, and a bagless, self-emptying base that holds up to 60 days of dirt.

Categories: IT General, Technology

Lossless Scaling isn't just frame generation: 5 unique ways to use your NVIDIA GPU's most versatile feature

How-To Geek - 4 hours 17 min ago

Lossless Scaling often gets framed as cheap frame generation. It's true: It costs $7, and it can do incredible things, such as bringing your old Nvidia GTX 1060 out of retirement to serve as a second GPU.

Categories: IT General, Technology