ARKit 2.0 – For Humans

At its annual developer conference this week, Apple announced ARKit 2.0, which will be part of iOS 12, its next update to the operating system that the iPhone and iPad rely on. We’ve been exploring the betas of the new software, examining the SDK, and trying out the sample projects, and have a few thoughts to share.

The new version of ARKit adds a number of features that will be of interest to AR developers and companies that are using AR; brief code sketches of several of them follow the list:

– Persistent AR Experiences: With earlier versions of ARKit, your experiences would only last as long as you kept your app active. Once you moved on to something else, you couldn’t come back to your work in progress. ARKit 2.0 adds the ability to save a session in progress and come back to it later with your augmented objects still in the same place. Users can now start designing their living room decor in the morning and pull their work back up to share with their housemates that evening. This makes AR a viable tool to do real, substantive work — creating persistent designs and stories — rather than just providing fun but short-lived experiences.

– Shared AR Experiences: AR has previously been a solitary activity. There was no way for the items in your AR session to be visible to others using different devices. In ARKit 2.0, the same mechanisms that allow persisting an AR session also provide the ability to share it with others. Multiple designers can style a car together, each exploring and making changes to the same vehicle while examining it from her own viewpoint. Or the florist, caterer, decorator, baker, and photographer can all plan out the space for a wedding reception, combining their individual elements in a shared environment, ensuring the “Just Married” banner doesn’t end up in the cake.

– More Flexible Image Tracking: ARKit 1.5 added the ability to identify static images, like posters, murals, or signs in your environment. With ARKit 2.0, an app can see and respond to images that move around — boxes, magazines, books, etc. — making it useful for augmented books or finding the gluten-free options among all those cereals on the supermarket shelf.

– Object Tracking: In addition to flat images, the new version also tracks 3D objects in a scene and responds to them. With robust object tracking, maintenance technicians will be able to use their ARKit device to quickly identify engine parts and pull up technical references and interactive guides for common maintenance procedures. In a retail setting, workers will be able to conduct inventories simply by showing the products on a shelf to an iPad, which will recognize and tally the store’s stock. (These are both a little beyond the beta ARKit’s capabilities currently, but the technology will doubtless continue to improve at a rapid clip.)

– Reflection Texturing: One of the subtle details that contribute to making AR objects look real is their reflection of the environment around them. Creating accurate reflections is extremely difficult for a variety of technical reasons. Apple added some clever engineering to ARKit 2.0, combining spatial mapping with image capture to generate the reflections it can know about and using machine learning to fill in the remaining gaps. Your shiny pretend teapot can now accurately reflect the real banana sitting right next to it.

– In addition to the new capabilities, ARKit 2.0 also adds a common file format for storing and sharing AR content: the awkwardly named USDZ. Apple is baking support for this format deep into its operating systems, so that when you visit a web page with an embedded USDZ model of that juicer/blender you’ve been considering, you’ll be able to view it from all angles on the web page or switch to an AR view to see what it would look like right on your kitchen counter.
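For developers curious what these features look like in code, here are a few minimal Swift sketches. First, persistence: saving the current session’s ARWorldMap to disk and restoring it later. This is a bare-bones illustration rather than production code; the file name and the minimal error handling are placeholders.

import ARKit

// Where the saved session will live; "livingRoom.worldmap" is just a placeholder name.
let mapURL = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
    .appendingPathComponent("livingRoom.worldmap")

// Capture the current session so the user can come back to it later.
func saveWorldMap(from session: ARSession) {
    session.getCurrentWorldMap { worldMap, _ in
        guard let map = worldMap,
              let data = try? NSKeyedArchiver.archivedData(withRootObject: map,
                                                           requiringSecureCoding: true)
        else { return }
        try? data.write(to: mapURL, options: .atomic)
    }
}

// Later (even after relaunching the app), relocalize against the saved map.
func restoreWorldMap(into session: ARSession) {
    do {
        let data = try Data(contentsOf: mapURL)
        guard let map = try NSKeyedUnarchiver.unarchivedObject(ofClass: ARWorldMap.self, from: data)
        else { return }
        let configuration = ARWorldTrackingConfiguration()
        configuration.initialWorldMap = map
        session.run(configuration, options: [.resetTracking, .removeExistingAnchors])
    } catch {
        print("Could not load the saved world map: \(error)")
    }
}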
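Sharing builds on the same ARWorldMap: the archived map is simply sent to nearby devices. Apple’s multiuser sample uses MultipeerConnectivity for the transport, and the sketch below assumes an already-connected MCSession (peer discovery and the delegate plumbing are omitted).

import ARKit
import MultipeerConnectivity

// Send the current world map to everyone in an already-connected MCSession.
func shareCurrentScene(arSession: ARSession, mcSession: MCSession) {
    arSession.getCurrentWorldMap { worldMap, _ in
        guard let map = worldMap,
              let data = try? NSKeyedArchiver.archivedData(withRootObject: map,
                                                           requiringSecureCoding: true)
        else { return }
        try? mcSession.send(data, toPeers: mcSession.connectedPeers, with: .reliable)
    }
}

// On the receiving device, MCSessionDelegate's session(_:didReceive:fromPeer:) hands back
// the same Data; unarchive it into an ARWorldMap and run an ARWorldTrackingConfiguration
// with it as the initialWorldMap, exactly as in the persistence sketch above.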
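Image tracking is mostly configuration. “Cereal Boxes” below is a placeholder name for an AR Resource Group the developer adds to the app’s asset catalog; it is not something ARKit ships with.

import ARKit

func runImageTracking(on session: ARSession) {
    guard let boxArt = ARReferenceImage.referenceImages(inGroupNamed: "Cereal Boxes", bundle: nil)
    else { return }
    let configuration = ARImageTrackingConfiguration()
    configuration.trackingImages = boxArt
    configuration.maximumNumberOfTrackedImages = 4   // follow several moving images at once
    session.run(configuration)
    // Matches show up as ARImageAnchor instances in the session's delegate callbacks.
}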
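Object detection works the same way, but against ARReferenceObjects that were scanned ahead of time with Apple’s object-scanning sample; “Engine Parts” is again a placeholder resource group name.

import ARKit

func runObjectDetection(on session: ARSession) {
    let configuration = ARWorldTrackingConfiguration()
    configuration.detectionObjects =
        ARReferenceObject.referenceObjects(inGroupNamed: "Engine Parts", bundle: nil) ?? []
    session.run(configuration)
    // When a known part is recognized, the session adds an ARObjectAnchor, which an app
    // could use to attach labels, manuals, or an inventory tally.
}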
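Reflection texturing is the simplest of all to adopt: a single configuration setting asks ARKit to generate environment probes, and reflective materials on virtual objects pick them up automatically.

import ARKit

func runWithReflections(on session: ARSession) {
    let configuration = ARWorldTrackingConfiguration()
    configuration.environmentTexturing = .automatic   // ARKit builds environment maps for reflections
    session.run(configuration)
}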
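Finally, USDZ: on the web, Safari handles the embedded model itself, but an app can hand a local .usdz file to the system viewer via QuickLook and get the same object and AR modes. “juicer.usdz” is a placeholder asset assumed to be bundled with the app.

import UIKit
import QuickLook

final class USDZPreviewSource: NSObject, QLPreviewControllerDataSource {
    // NSURL conforms to QLPreviewItem, so a file URL is all the viewer needs.
    let fileURL = Bundle.main.url(forResource: "juicer", withExtension: "usdz")! as NSURL

    func numberOfPreviewItems(in controller: QLPreviewController) -> Int { return 1 }

    func previewController(_ controller: QLPreviewController,
                           previewItemAt index: Int) -> QLPreviewItem {
        return fileURL
    }
}

// From a view controller: create a QLPreviewController, set its dataSource to a
// USDZPreviewSource you keep a strong reference to, and present it.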

While many of the features Apple built into ARKit 2.0 are ones we’ve already seen in Vuforia and ARCore (Google’s analogue to ARKit on Android), Apple’s ARKit tends to work exceptionally well thanks to the company’s control over both the hardware and the software it runs on. In addition, for iOS developers who are used to Swift/Objective-C and UIKit, ARKit provides a very capable solution with a familiar API.

We’re excited about the possibilities that ARKit 2.0 brings, and are already busily exploring how best to bring these capabilities to our customers!

 

The State of AR: 2018

Augmented Reality is in the midst of its moment in the sun. While Virtual Reality has had a death grip on the hype spotlight since Facebook acquired Oculus in 2014, AR has been oddly quiet. According to Gartner’s Hype Cycle, AR is in the “Trough of Disillusionment,” 5-10 years away from its plateau, whereas VR sits on the “Slope of Enlightenment,” 2-5 years from its plateau. We believe that this accurately reflects mainstream acceptance of head-mounted AR – but mobile-based AR and similar platforms (e.g., heads-up displays) are delivering value today across many industries under the umbrella of technologies we identify as “AR”.

AR has crept into our daily lives without the fanfare some would expect, via technologies not traditionally classified as AR: car-based heads-up displays, photo filters, etc. Indeed, much of the AR research that companies such as Apple, Microsoft, and Google have invested in remains mostly under wraps. Shrouded in secrecy and patent filings, Magic Leap has been steadily building a mountain of hype around a new generation of AR products that promise to blend the virtual with the physical so convincingly that they compete with our own biology’s ability to tell them apart. They claim the ability to deliver a world where data is no longer restricted to a glass rectangle but is instead woven seamlessly into our environment. These visions of the future may seem distant and lofty, but their genesis is already underway, and we will soon integrate these capabilities into our everyday lives.

The Back Story

For years, a single type of AR reigned supreme. Marker-Based AR, known colloquially as “QR Code AR,” has been around since the turn of the century. The idea is simple: a camera points at a fiducial marker (e.g., a QR code), the program finds the target it was looking for, and it displays the 3D model – reorienting the model to match the pose computed from the camera’s view of the marker. It performs these steps as fast as the camera captures frames, updating the position, rotation, and scale of the virtual object in real time. All of this conveys the illusion that the object is in front of the camera, but the method has no true awareness of the space surrounding the marker. For any virtual object to persist in the physical world, the system must understand the physical world’s spatial properties. Newly developed computer vision (CV) techniques, alongside performance gains in modern computing hardware, have enabled a new class of AR applications that can do just that.

Enter “Markerless AR”

Spearheaded by the development of the Microsoft HoloLens in 2016, Markerless AR has quickly become a movement unto itself. Google experimented with a suite of devices and custom software, collectively called Project Tango, which, like the HoloLens, used depth sensors to map environments in real time and allowed users to place virtual objects in space with respect to physical boundaries like actual floors and walls. No longer were markers needed to interact with holograms. However, this hardware proved to be quite expensive to mass-produce, especially so early in the technology’s lifecycle. The Microsoft HoloLens sells for a whopping $3,000 USD, and although Tango devices did make it to market, they were never intended for consumer use. With only a handful of developers able to afford the hardware – and near-zero consumer adoption – Google shut down Project Tango, and the HoloLens became little more than a marketing tool for Microsoft (albeit a powerful one!). However, everything changed in mid-2017, when Facebook, Apple, Google, and Snapchat each announced their own Markerless AR solutions for mobile devices.

While Facebook and Snapchat added world-tracking features to their existing camera apps, Apple developed an entire AR platform from the ground up for iOS. While ARKit doesn’t offer some of the advanced features that the HoloLens supports, such as cross-session persistence (saving room scans and recognizing them automatically) or head-mounted holograms (it’s still a handheld iPhone), it did effectively eliminate AR’s barrier to entry. For the first time, consumers had instant access to high-quality Markerless AR content. With consumer adoption in hand, developer interest was piqued, and the largest market for immersive content ever to exist was instantly created.

2018 and Beyond

Now, in 2018, there is more interest in Augmented Reality than ever before. Google answered Apple’s ARKit with ARCore, and in a few months Magic Leap will release the first consumer-facing, untethered Markerless AR Head-Mounted Display. Soon we will be interacting with the world in ways we could never have imagined, dwarfing the creativity of fantasy and science fiction and prompting future generations to ask: “What is a screen?”

The Creation of “Starfox AR” – VR Austin Jam

Since it was first announced, I have been interested in experimenting with the iPhone X’s fancy new TrueDepth front-facing camera. As soon as I got my hands on one, I downloaded the Unity ARKit Plugin and started digging into the new face tracking APIs. The creepy grey mask in the example project immediately reminded me of Andross, the final boss from Starfox (SNES). I found this video of the final battle from Starfox and thought it would make an awesome face tracking experience. All of this came together just days before the VR Austin Jam 2017 was set to begin, giving me the perfect idea for my Jam entry.

I knew going into the weekend that the secret to a successful hackathon is limiting scope. So I decided to focus on getting the face-tracked Andross rig working first, while my dev partner, Kenny Bier, focused on game mechanics. Luckily, Jeff Arthur (Banjo’s talented 3D artist) supplied me with the low-poly Andross model, Starfox’s Arwing, and the door-like Andross projectiles before the Jam began, so I had assets to work with.

This Unity blog post got me started by explaining, at a high level, how to access the iPhone X user’s face position, rotation, and blend shape properties. Basically, you start a session with an ARKitFaceTrackingConfiguration, subscribe a FaceUpdated handler to the ARFaceAnchorUpdatedEvent, and read the blend shape values inside that handler from the ARFaceAnchor’s blendShapes dictionary.

using System.Collections.Generic;
using UnityEngine;
using UnityEngine.XR.iOS;   // Unity ARKit Plugin

public class AndrossFaceController : MonoBehaviour
{
    public GameObject andross;   // low-poly Andross head with MouthOpen / blink blend shapes

    private UnityARSessionNativeInterface m_session;
    private ARFaceAnchor mAnchorData;
    private Dictionary<string, float> currentBlendShapes;
    private int mouthOpenInt;
    private float jawOpenAmt, l_eyeOpenAmt, r_eyeOpenAmt;

    void Awake()
    {
        m_session = UnityARSessionNativeInterface.GetARSessionNativeInterface();
    }

    void Start()
    {
        Application.targetFrameRate = 60;
        ARKitFaceTrackingConfiguration config = new ARKitFaceTrackingConfiguration();

        config.alignment = UnityARAlignment.UnityARAlignmentGravity;
        config.enableLightEstimation = true;

        if (config.IsSupported)
        {
            m_session.RunWithConfig(config);
            UnityARSessionNativeInterface.ARFaceAnchorAddedEvent += FaceAdded;
            UnityARSessionNativeInterface.ARFaceAnchorUpdatedEvent += FaceUpdated;
            UnityARSessionNativeInterface.ARFaceAnchorRemovedEvent += FaceRemoved;
        }
    }

    void FaceAdded(ARFaceAnchor anchorData)
    {
        mAnchorData = anchorData;
    }

    void FaceRemoved(ARFaceAnchor anchorData)
    {
        // Face lost; nothing to drive until a new anchor arrives.
    }

    void FaceUpdated(ARFaceAnchor anchorData)
    {
        mAnchorData = anchorData;

        currentBlendShapes = anchorData.blendShapes;
        mouthOpenInt = andross.GetComponent<SkinnedMeshRenderer>().sharedMesh.GetBlendShapeIndex("MouthOpen");   // index of the MouthOpen blend shape on the imported mesh

        // Open mouth
        currentBlendShapes.TryGetValue("jawOpen", out jawOpenAmt);
        andross.GetComponent<SkinnedMeshRenderer>().SetBlendShapeWeight(0, jawOpenAmt * 100);

        // Left eye blink
        currentBlendShapes.TryGetValue("eyeBlink_L", out l_eyeOpenAmt);
        andross.GetComponent<SkinnedMeshRenderer>().SetBlendShapeWeight(1, l_eyeOpenAmt * 100);

        // Right eye blink
        currentBlendShapes.TryGetValue("eyeBlink_R", out r_eyeOpenAmt);
        andross.GetComponent<SkinnedMeshRenderer>().SetBlendShapeWeight(2, r_eyeOpenAmt * 100);
    }
}

Once you have hooks into the iPhone X blend shape values, you route them to the corresponding blend shapes on your imported model. As a test, I imported a fully rigged model from the Unity Asset Store and got the mouth flapping.

*IMPORTANT: ARKit’s blend shape values range from 0 to 1, but your mesh’s blend shape weights range from 0 to 100, so remember to multiply by 100 or you won’t see any animation*

Next, I had to rig my own model to be driven by these values.

I knew very little about creating blend shapes going in, but I found this article that explains it fairly well. Normally, you would rig an entire face before creating the blend shapes so that the animation would render realistic muscle movement. However, due to the low-poly nature of my Andross face, I could skip the rigging step and just manipulate the individual vertices by hand. I created three blend shapes: left eye closed, right eye closed, and mouth open.

Once I exported the face mesh out of Maya with the blend shapes attached and imported it into Unity, I could manipulate the blend shape weights in the editor. 

After swapping out some variables, I replaced my example face rig with Andross and got my first retro game boss animoji working as intended.

I wanted all of the user input to rely on facial expressions, such as opening your mouth to fire and closing your eyes to turn ‘invisible’, allowing the Starfox bullets to pass through Andross without hurting him. So all I had to do was trigger functions based on the blend shape weight values (and control firing with a coroutine so that there weren’t a million projectiles coming out of Andross’s mouth!).

The bulk of the time spent after this was just creating the *game* part of it: randomizing enemy flight paths, firing projectiles, health systems, placing UI elements (all retro 2D assets created by the lovely/talented Kaci Lambeth), game over/win conditions, and generally attempting to make it fun. After squashing a litany of bugs and balancing gameplay, Starfox AR was ready… to make people look strange in public!

Download here: https://rigelprime.itch.io/starfox-ar 

PLNAR: Replacing your Tape Measure

Working with Apple and SmartPicture to Launch an ARKit App on Day 1

SmartPicture approached Banjo prior to the iOS 11 and ARKit launch event with an interesting challenge: Develop a consumer version of the pro-level SmartPicture 3D room planning application in five (5?!) weeks and have it ready for launch. Apple hosted the combined Banjo/SmartPicture team in Cupertino pre-launch to consult and ensure that we hit no roadblocks.

On September 12, 2017, Banjo and SmartPicture launched PLNAR – and it has since racked up over 200,000 downloads from users measuring their rooms and automatically creating floor plans.

As one of the first developers to deploy a large-scale augmented reality application on iOS, Rigel has shared some of the UX considerations and challenges faced when asking users to interact with a third dimension on a two-dimensional screen.

Check out our case study on PLNAR, or download the app from the App Store.
