Google LiteRT Accelerator: 100x AI Speed on Snapdragon Android Devices (2025)

Google's latest innovation is a game-changer for AI on Android! 🦾🤖️

Google has unveiled a powerful new LiteRT accelerator built on Qualcomm AI Engine Direct (QNN), designed to revolutionize AI performance on Android devices with Qualcomm's Snapdragon 8-series SoCs. But here's the twist: it's not just about raw power. The QNN accelerator promises to supercharge on-device AI, delivering an astonishing speedup of up to 100x over CPU and 10x over GPU execution. This is a huge leap forward, but why the need for such a boost?

The GPU Conundrum:

According to Google's engineers Lu Wang, Weiyi Wang, and Andrew Wang, relying solely on GPUs for AI tasks can lead to performance issues. Imagine running a text-to-image model while processing a live camera feed; even high-end mobile GPUs might struggle, resulting in a less-than-smooth user experience. And this is the part most people miss: GPUs are also power-hungry, so sustained AI workloads drain the battery and invite thermal throttling.

Enter the Neural Processing Units (NPUs):

Many modern mobile devices now feature NPUs, custom-built AI accelerators that consume less power and outperform GPUs in AI tasks. QNN takes full advantage of these NPUs, offering a unified and simplified development workflow. It integrates various SoC compilers and runtimes, exposing them through a streamlined API, making it a developer's dream.
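The "streamlined API over many backends" idea boils down to a familiar pattern: prefer the best accelerator the device actually has, and fall back gracefully. Here's a toy sketch of that pattern; this is not LiteRT's real API, and the backend names are purely illustrative:

```python
# Toy sketch of accelerator selection with fallback (illustrative only;
# not LiteRT's actual API).

def pick_accelerator(available, preference=("NPU", "GPU", "CPU")):
    """Return the first preferred backend the device actually supports."""
    for backend in preference:
        if backend in available:
            return backend
    raise RuntimeError("no supported backend on this device")

# A Snapdragon phone with an NPU gets the NPU; an older device falls back.
print(pick_accelerator({"CPU", "GPU", "NPU"}))  # NPU
print(pick_accelerator({"CPU", "GPU"}))         # GPU
```

The point of the unified API is that the developer writes this decision once, while the runtime hides each SoC vendor's compiler and runtime details behind it.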

QNN supports 90 LiteRT operations and aims for full model delegation, which is crucial for peak performance. It also includes specialized optimizations for models like Gemma and FastVLM, pushing their capabilities even further. Google's benchmarks across 72 ML models were impressive: 64 models achieved full NPU delegation, with up to 100x faster performance compared to the CPU.
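"Full model delegation" means every operation in the model's graph is among the ops the accelerator supports, so the entire model runs on the NPU with no costly round-trips back to the CPU. A minimal sketch of that coverage check (the op names and supported set here are made up for illustration):

```python
# Illustrative coverage check: a model is fully delegated only if every
# op it uses is in the accelerator's supported set (QNN covers 90 LiteRT
# ops; this toy set is just for demonstration).
SUPPORTED_OPS = {"CONV_2D", "FULLY_CONNECTED", "SOFTMAX", "ADD", "RESHAPE"}

def delegation_report(model_ops):
    """List which ops would fall back to the CPU, if any."""
    fallback = [op for op in model_ops if op not in SUPPORTED_OPS]
    return {"fully_delegated": not fallback, "fallback_ops": fallback}

print(delegation_report(["CONV_2D", "ADD", "SOFTMAX"]))
# {'fully_delegated': True, 'fallback_ops': []}
print(delegation_report(["CONV_2D", "CUSTOM_OP"]))
# {'fully_delegated': False, 'fallback_ops': ['CUSTOM_OP']}
```

Even one unsupported op forces a graph partition and a CPU fallback segment, which is why 64 of 72 models fully delegating is the headline number.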

On the Snapdragon 8 Elite Gen 5, QNN unlocks a new world of live AI experiences. Over 56 models run in under 5 ms, while only 13 models achieve this on the CPU. Google even developed a concept app using Apple's FastVLM-0.5B model, achieving lightning-fast scene interpretation with a time-to-first-token (TTFT) of 0.12 seconds on high-resolution images.
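Why does 5 ms matter? Because it leaves real headroom inside a live camera frame budget. A quick back-of-the-envelope check, using the article's 5 ms figure and a standard 30 fps camera feed:

```python
# Back-of-the-envelope frame-budget math for live camera AI.
fps = 30
frame_budget_ms = 1000 / fps            # ~33.3 ms available per frame
inference_ms = 5                        # NPU latency cited in the article
share = inference_ms / frame_budget_ms  # fraction of the frame spent on AI
print(f"{frame_budget_ms:.1f} ms per frame, {share:.0%} used by inference")
```

At 5 ms per inference, the model consumes only about 15% of each 30 fps frame, leaving the rest for camera processing, rendering, and UI. A model that takes 30+ ms on the CPU eats the whole budget and drops frames.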

However, QNN currently supports a limited range of Android hardware, mainly Snapdragon 8 and 8+ devices. So, while it's a significant step forward, it's not a universal solution... yet.

What do you think? Is QNN the future of AI acceleration on mobile devices, or are there other approaches you'd like to see? Share your thoughts in the comments below!


Author: Duncan Muller