This is Part II of ‘THE METAVERSE PRIMER’, which focuses on the role of Hardware in ‘The Metaverse’. Here, Hardware is defined as “The sale and support of physical technologies and devices used to access, interact with, or develop the Metaverse. This includes, but is not limited to, consumer-facing hardware (such as VR headsets, mobile phones, and haptic gloves) as well as enterprise hardware (such as those used to operate or create virtual or AR-based environments, e.g. industrial cameras, projection and tracking systems, and scanning sensors). This category does not include compute-specific hardware, such as GPU chips and servers, nor networking-specific hardware, such as fiber optic cabling or wireless chipsets.”
Every year, consumer hardware benefits from better and more capable sensors, longer battery life, more sophisticated/diverse haptics, richer screens, sharper cameras, and so on. We also see an ever-expanding number of smart devices, such as watches and VR headsets (and soon, AR glasses). All of these advances enhance and extend user immersion, even though software delivers the actual experience or ‘magic’.
Consider, as a limited example, live avatar applications such as Bitmoji, Animoji, and Snapchat AR. These depend on fairly capable CPUs/GPUs (see Section #3), as well as sophisticated software. But they also require, and are enriched by, powerful face-tracking cameras and sensors that continue to improve. Newer iPhone models now track 30,000 points on your face via infrared sensors. While this is most commonly used for Face ID, it can now be connected to apps such as Epic Games’ Live Link Face application, thereby enabling any consumer to create (and stream) a real-time, Unreal Engine-based high-fidelity avatar. It’s clear that Epic’s next step will be to use this functionality to live map a Fortnite player’s face onto their in-game character.
Apple’s Object Capture, meanwhile, enables users to create high-fidelity virtual objects using photos from their standard-issue iPhone in a matter of minutes. These objects can then be transplanted into other virtual environments, thereby reducing the cost and increasing the fidelity of synthetic goods, or overlaid into real environments for art, design, and other AR experiences.
Many new smartphones, including the iPhone 11 and iPhone 12, feature new ultra-wideband chips that emit 500,000,000 RADAR pulses per second, alongside receivers that process the return information. This enables smartphones to create extensive RADAR maps of everything from your home, to your office, to the street you’re walking down, and to place you within these maps, relative to other local devices, down to a few centimeters. This means your front door can unlock when you approach it from the outside, but stay locked when approached from the inside. And using a live RADAR map, you’ll be able to navigate much of your home without ever needing to remove your VR headset.
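The centimeter-level positioning described above reduces to a classic geometry problem: given measured distances from the phone to a few fixed anchors (each derived from pulse time-of-flight), solve for the phone’s position. Below is a minimal trilateration sketch in Python; the anchor coordinates and the `trilaterate` helper are invented for illustration, not taken from any vendor’s API:

```python
import math

def trilaterate(anchors, distances):
    """Solve for a 2D position from distances to three known anchors.

    Subtracting the first circle equation from the others linearizes
    the system: 2*(a_i - a_0) . p = (|a_i|^2 - |a_0|^2) - (d_i^2 - d_0^2).
    """
    (x0, y0), (x1, y1), (x2, y2) = anchors
    d0, d1, d2 = distances
    a11, a12 = 2 * (x1 - x0), 2 * (y1 - y0)
    a21, a22 = 2 * (x2 - x0), 2 * (y2 - y0)
    b1 = (x1**2 + y1**2 - x0**2 - y0**2) - (d1**2 - d0**2)
    b2 = (x2**2 + y2**2 - x0**2 - y0**2) - (d2**2 - d0**2)
    det = a11 * a22 - a12 * a21  # assumes the anchors are not collinear
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det)

# Hypothetical UWB anchors in a room (meters) and a device at (2.0, 1.5)
anchors = [(0.0, 0.0), (5.0, 0.0), (0.0, 4.0)]
device = (2.0, 1.5)
distances = [math.dist(device, a) for a in anchors]
print(trilaterate(anchors, distances))  # recovers approximately (2.0, 1.5)
```

Real devices refine this with many more anchors, noise filtering, and 3D geometry, but the core idea — intersecting distance spheres around known transmitters — is the same.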
That all of this is possible through standard consumer-grade hardware is astonishing. And the growing role of this functionality in our daily lives explains why the iPhone has been able to increase its average sales price from roughly $450 in 2007 to over $750 in 2021, rather than merely offering greater capability at the same price.
XR headsets are another great example of both progress and outstanding needs in hardware. The first consumer Oculus Rift (2016) had a resolution of 1080×1200 per eye, while the Oculus Quest 2, released four years later, offers 1832×1920 per eye (roughly equivalent to 4K across both eyes). Palmer Luckey, one of Oculus’s founders, believes more than twice this resolution is required for VR to overcome pixelation and become a mainstream device. The first Oculus Quest, meanwhile, peaked at a 72Hz refresh rate, while the Quest 2 achieves 90Hz, and up to 120Hz when connected to a gaming PC via Oculus Link. Many believe 120Hz is the minimum threshold for avoiding the disorientation and nausea some users experience in VR. And ideally this would be achieved without needing a gaming-level PC and tether.
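These resolution and refresh-rate targets compound, since rendering load scales as pixels per eye × two eyes × frames per second. A back-of-envelope calculation using the figures above (the `pixel_rate` helper is mine, and I assume “twice the resolution” means doubling each axis):

```python
def pixel_rate(width, height, hz, eyes=2):
    """Pixels a GPU must fill per second for a stereo headset."""
    return width * height * hz * eyes

# Quest 2 panel (1832x1920 per eye) at its native and Oculus Link refresh rates
quest2_native = pixel_rate(1832, 1920, 90)
quest2_link = pixel_rate(1832, 1920, 120)

# Luckey's target: more than twice the resolution, at the 120Hz threshold
target = pixel_rate(2 * 1832, 2 * 1920, 120)

print(f"Quest 2 @ 90Hz:  {quest2_native / 1e9:.2f} GPix/s")  # 0.63 GPix/s
print(f"Quest 2 @ 120Hz: {quest2_link / 1e9:.2f} GPix/s")    # 0.84 GPix/s
print(f"Target:          {target / 1e9:.2f} GPix/s")         # 3.38 GPix/s
```

Even before accounting for the rendering cost of each pixel, the target is more than five times the Quest 2’s native workload, which is why hitting it without a tethered gaming PC is such a hard problem.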
While humans have an average field of view of roughly 210°, Microsoft’s HoloLens 2 display covers only 52° (up from 34° in the original). Snap’s forthcoming glasses cover only 26.3°. For these devices to take off, we likely need far wider coverage. And these are primarily hardware challenges, not software ones. What’s more, we need to make these advances while also improving the quality of the other hardware inside a wearable (e.g. speakers, processors, batteries), and ideally shrinking it, too.
“Unity CEO Predicts AR-VR Headsets Will Be as Common as Game Consoles by 2030” (Matthew Ball, @ballmatthew, June 21, 2021: https://t.co/D6Bcw2Q9wg)
Another example is Google’s Project Starline, a hardware-based booth designed to make video conversations feel as though you’re in the same room as the other participant. It is powered by a dozen depth sensors and cameras, a fabric-based, multi-dimensional light-field display, and spatial audio speakers. The experience is brought to life through volumetric data processing and compression, then delivered via WebRTC, but hardware is critical to capturing and presenting a level of detail that ‘seems real’.
Given what’s possible with consumer-grade devices, it’s no surprise that industrial/enterprise hardware at multiples of the price and size will stun. Leica now sells $20,000 photogrammetric cameras with up to 360,000 “laser scan set points per second”, which are designed to capture entire malls, buildings, and homes with greater clarity and detail than the average person would ever see in person. Epic Games’ Quixel, meanwhile, uses proprietary cameras to generate environmental “Megascans” composed of tens of billions of pixel-precise triangles.
These devices make it easier and cheaper for companies to produce high-quality ‘mirror worlds’ or ‘digital twins’ of physical spaces, as well as use scans of the real world to produce higher-quality and less-expensive fantasy ones. Fifteen years ago, we were stunned by Google’s ability to capture (and finance) 360° 2D images of every street in the world. Today, scores of businesses can purchase LIDAR cameras and scanners to build fully immersive, 3D photogrammetric reproductions of anything on earth.
These cameras get particularly interesting when they go beyond static image capture and virtualization, and into real-time rendering and updating of the real world. Today, for example, the cameras at an Amazon Go retail store track dozens of shoppers at the same time. In the future, this sort of tracking system will be used to reproduce these shoppers, in real time, in a virtual mirror world. Technologies such as Google’s Starline will then allow remote workers to be ‘present’ in the store (or a museum, or a DMV, or a casino) from a sort of offshore ‘Metaverse call center’, or perhaps at home in front of their iPhones.
When you go to Disneyland, you might be able to see virtual (or even robot) representations of your friends at home and collaborate with them to defeat Ultron or collect the Infinity Stones. These experiences require far more than hardware — but they’re constrained, enabled, and realized through it.
This is part two of the nine-part ‘METAVERSE PRIMER’.
Matthew Ball (@ballmatthew)