AR Briefs: The Convergence of AR and AI

AR Insider
Sep 11, 2024

The breakout tech topic of the past year has clearly been AI. Though AI has been around for years, its latest flavors, including large language models and generative art, have captured everyone’s attention and imagination. The technology is not only captivating but relatable and broadly applicable.

But one looming question is how these latest developments will intersect with the focus of this publication: AR and spatial computing. Though AI has been characterized as a technology that will steal XR’s thunder, the truth is that it will do the opposite. They’ll elevate each other.

How will that happen? There are several ways, including the area we call generative XR. This is the topic of a recent report by our research arm, ARtillery Intelligence. It’s also the topic of a recent episode of ARtillery Briefs, which we’ve summarized and embedded below.

Intersections & Integrations

First, we’ll acknowledge that AI is all around us and has been for a while, as noted. It’s everything from auto-complete on your iPhone to Alexa. But the latest inflections in AI raise the bar considerably with large language models that enable generative and conversational AI.

Back to our theme of AI’s intersections and integrations with spatial computing, each of these types of AI factors in differently. For example, generative AI can streamline XR creation workflows, while conversational AI can create better user interfaces for smart glasses.

Taking those one at a time, we’re starting to see some innovation (though there’s still a long way to go) in generative AI tools that can help XR creators build experiences. Think of these as counterparts to popular generative AI tools such as Midjourney, but tuned for 3D graphics and interactions.

For example, Snap’s new GenAI suite offers tools for lens creators to do just that. As Snap CTO Bobby Murphy explained on stage at AWE USA in June (a discussion we moderated), this is just the first step towards AI empowering AR creators and democratizing the field.

Sleeping Giant

Moving on to other AI and AR synergies, there’s an opportunity to rethink the latter’s user interfaces. For example, the rapid evolution of GPT-based chat agents presents opportunities to make user inputs more conversational and intuitive in smart glasses and mixed-reality headsets.

This could fill a key gap because such devices are otherwise limited in their inputs. Though hand tracking and gestural inputs are advancing quickly (see Apple Vision Pro), more capable and conversational inputs could be a force multiplier for XR adoption and interest.

Beyond Apple Intelligence integrations that could come to Vision Pro, we’re seeing others do something similar. Chief among them is object recognition in Ray-Ban Meta smart glasses, which pair visual input with audible output (multimodal AI) to contextualize physical scenes and objects.

This could end up being the real killer app that smart glasses have been waiting for. Indeed, visual search has long been a sleeping giant. More capable multimodal AI — married with a head-worn, line-of-sight interface — could finally unlock the potential of tools like Google Lens.

For more color, see the full ARtillery Briefs episode below and stay tuned for much more on this topic…

Originally published at https://arinsider.co on September 11, 2024.

Written by AR Insider

A publication about spatial computing | Brought to you by ARtillery Intelligence.