Baby Steps with the Azure Spatial Anchors Service

NB: The usual blog disclaimer for this site applies to posts around HoloLens. I am not on the HoloLens team. I have no details on HoloLens or Azure Mixed Reality other than what is on the public web, so what I post here is just from my own experience experimenting with pieces that are publicly available. You should always check out the official developer site for the product documentation.

One of the many, many strands of the exciting, recent announcements around Mixed Reality (see the video here) was the announcement of a set of Azure Mixed Reality Services.

You can find the home page for these services on the web here and they encompass;

  • Azure Spatial Anchors
  • Azure Remote Rendering

Both of these are, to my mind, vital foundational services that Mixed Reality application builders have needed for quite some time so it’s great to see them surface at Azure.

At the time of writing, the Azure Remote Rendering service is in a private preview so I’m not looking at that right now. The Azure Spatial Anchors service is in a public preview, though, so I wanted to experiment with it a little and thought I would write up some notes here as I went along.

Before I do that though…

Stop – Read the Official Docs

There’s nothing that I’m going to say in this post that isn’t covered by the official docs so I’d recommend that you read those before reading anything here and I’m providing some pointers below;

  1. Check out the overview page here if you’re not familiar with spatial anchors.
  2. Have a look at the Quick Start here to see how you can quickly get started in creating a service & making use of it from Unity.
  3. Check out the samples here so that you can quickly get up and running rather than fumbling through adding library references etc. (note that the Quick Start will lead you to the samples anyway).

With that said, here are some rough notes that I made while getting going with the Azure Spatial Anchors service from scratch.

Please keep in mind that this service is new to me so I’m really writing up my experiments & I may well make some mistakes.

A Spatial WHAT?!

If you’re not coming from a HoloLens background or from some other type of device background where you’re doing MR/AR and creating ‘spatial anchors’ then you might wonder what these things are.

To my mind, it’s a simple concept that’s no doubt fiendishly difficult to implement. Here’s my best attempt;

A spatial anchor is a BLOB of data providing a durable representation of a 3D point and orientation in a space.

That’s how I think of it. You might have other definitions. These BLOBs of data usually involve recognising ‘feature points’ that are captured from various camera frames taken from different poses in a space.

If you’re interested in more of the mechanics of this, I found this video from Apple’s 2018 WWDC conference to be one of the better references that I’ve seen;

Understanding ARKit Tracking and Detection

So, a ‘spatial anchor’ is a BLOB of data that allows a device to capture a 3D point and orientation in space & potentially to identify that point again in the future (often known as ‘re-localising the anchor’). It’s key to note that devices and spaces aren’t perfect and so it’s always possible that a stored anchor can’t be brought back to life at some future date.

I find it useful sometimes to make a human analogy around spatial anchors. I can easily make a ‘spatial anchor’ to give to a human being and it might contain very imprecise notions of positioning in space which can nonetheless yield accurate results.

As an example, I could give this description of a ‘spatial anchor’ to someone;

Place this bottle 2cm in on each side from the corner of the red table which is nearest to the window. Lay the bottle down pointing away from the window.

You can imagine being able to walk into a room with a red table & a window and position the bottle fairly accurately based on that.

You can also imagine that this might work in many rooms with windows and red tables & that humans might even adapt and put the bottle onto a purple table if there wasn’t a red one.

Equally, you can imagine finding yourself in a room with no table and saying “sorry, I can’t figure this out”.

I think it’s worth saying that having this set of ‘instructions’ does not tell the person how to find the room nor whether they are in the right room; that is outside of the scope. The same is true for spatial anchors – you have to be vaguely in the right place to start with or use some other mechanism (e.g. GPS, beacons, markers, etc.) to get to that place before trying to re-localise the anchor.

Why Anchor?

Having been involved in building applications for HoloLens for a little while now, I’ve become very used to the ideas of applying anchors and, to my mind, there are 3 main reasons why you would apply an anchor to a point/object and the docs are very good on this;

  • For stability of a hologram or a group of holograms that are positioned near to an anchor.
    • This is essentially about preserving the relationship between a hologram and a real point in the world as the device alters its impression of the structure of the space around it. As humans, we expect a hologram placed on the edge of a table to stay on the edge of that table even if a device is constantly refining its idea of the mesh that makes up that table and the rest of the space around it.
  • For persistence.
    • One of the magical aspects of mixed reality enabled by spatial anchors is the ability for a device to remember the positions of holograms in a space. The HoloLens can put the hologram back on the edge of the table potentially weeks or months after it was originally placed there.
  • For sharing.
    • The second magical aspect of mixed reality enabled by spatial anchors is the ability for a device to read a spatial anchor created by another device in a space and thereby construct a transform from the co-ordinate system of the first device to that of the second. This forms the basis for those magical shared holographic experiences.
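
To make the ‘sharing’ idea a little more concrete, here is a minimal sketch of the maths as I understand it. It is not taken from any SDK: the SharedAnchorMath class and its method names are hypothetical, and it uses System.Numerics rather than Unity types so that it stands alone. The assumption is that both devices have located the same anchor and each knows the anchor’s pose in its own co-ordinate system;

```csharp
using System;
using System.Numerics;

// Hypothetical helper (not part of any SDK) illustrating how a shared anchor
// lets one device's co-ordinates be mapped into another's.
public static class SharedAnchorMath
{
    // Builds a pose matrix (rotation first, then translation) using
    // System.Numerics' row-vector convention: worldPoint = localPoint * pose.
    public static Matrix4x4 Pose(Quaternion rotation, Vector3 position)
    {
        return Matrix4x4.CreateFromQuaternion(rotation) *
               Matrix4x4.CreateTranslation(position);
    }

    // Given the same anchor's pose as seen by device A and by device B,
    // returns the matrix that maps points from A's co-ordinates into B's.
    public static Matrix4x4 DeviceAToDeviceB(Matrix4x4 anchorInA, Matrix4x4 anchorInB)
    {
        if (!Matrix4x4.Invert(anchorInA, out var aToAnchor))
        {
            throw new ArgumentException("Anchor pose is not invertible.", nameof(anchorInA));
        }
        // A's world -> anchor-local -> B's world.
        return aToAnchor * anchorInB;
    }
}
```

With this, a point sitting exactly on the anchor in device A’s co-ordinates comes out exactly on the anchor in device B’s co-ordinates when passed through Vector3.Transform with the resulting matrix, which is the property that a shared experience relies on.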

Can’t I Already Anchor?

At this point, it’s key to note that for HoloLens developers the notion of ‘spatial anchors’ isn’t new. The platform has supported anchors since day 1 and they work really well.

Specifically, if you’re working in Unity then you can fairly easily do the following;

  • Add the WorldAnchor component to your GameObject in order to apply a spatial anchor to that component.
    • It’s fairly common to use an empty GameObject which then acts as a parent to a number of other game objects.
    • The isLocated property is fairly key here as is the OnTrackingChanged event and note also that there is an opaque form of reference to the underlying BLOB via GetNativeSpatialAnchorPtr and SetNativeSpatialAnchorPtr.
  • Use the WorldAnchorStore class in order to maintain a persistent set of anchors on a device indexed by a simple string identifier.
  • Use the WorldAnchorTransferBatch class in order to;
    • Export the blob representing the anchor
    • Import a blob representing an anchor that has previously been exported

With this set of tools you can quite happily build HoloLens applications that;

  • Anchor holograms for stability.
  • Persist anchors over time such that holograms can be recreated in their original locations.
  • Share anchors between devices such that they can agree on a common co-ordinate system and present shared holographic experiences.

and, of course, you can do this using whatever transfer or networking techniques you like including, naturally, passing these anchors through the cloud via means such as Azure Blob Storage or ASP.NET SignalR or whatever you want. It’s all up for grabs and has been for the past 3 years or so.

Why A New Spatial Anchor Service?

With all that said, why would you look to the new Azure Spatial Anchor service if you already have the ability to create anchors and push them through the cloud? For me, I think there are at least 3 things;

  1. The Azure Spatial Anchor service is already built and you can get an instance with a few clicks of the mouse.
    1. You don’t have to go roll your own service and wonder about all the “abilities” of scalability, reliability, availability, authentication, authorisation, logging, monitoring, etc.
    2. There’s already a set of client-side libraries to make this easy to use in your environment.
  2. The Azure Spatial Anchor service/SDK gives you x-platform capabilities for anchors.
    1. The Azure Spatial Anchor service gives you the ability to transfer spatial anchors between applications running on HoloLens, ARKit devices and ARCore devices.
  3. The Azure Spatial Anchor service lets you define metadata with your anchors.
    1. The SDK supports the notion of ‘nearby’ anchors – the SDK lets you capture a group of anchors that are located physically near to each other & then query in the future to find those anchors again.
    2. The SDK also supports adding property sets to anchors to use for your own purposes.

Point #2 above is perhaps the most technically exciting feature here – i.e. I’ve never before seen anchors shared across HoloLens, iOS and Android devices so this opens up new x-device scenarios for developers.

That said, point #1 shouldn’t be underestimated – having a service that’s already ready to run is usually a lot better than trying to roll your own.

So, how do you go about using the service? Having checked out the samples, I then wanted to do a walkthrough on my own and that’s what follows here but keep a couple of things in mind;

  • I’m experimenting here, I can get things wrong.
  • The service is in preview.
  • I’m going to take a HoloLens/Unity centric approach as that’s the device that I have to hand.
  • There are going to be places where I’ll overlap with the Quick Start and I’ll just refer to it at that point.

Using the Service Step 1 – Creating an Instance of the Service

Getting to the point where you have a service up and running using (e.g.) the Azure Portal is pretty easy.

I just followed this Quick Start step labelled “Create a Spatial Anchors Resource” and I had my service visible inside the portal inside of 2-3 minutes.

Using the Service Step 2 – Making a Blank Project in Unity

Once I had a service up and running, I wanted to be able to get to it from Unity and so I went and made a blank project suitable for holographic development.

I’m using Unity 2018.3.2f1 at the time of writing (there are newer 2018.3 versions).

I’ve gone through the basics of setting up a project for HoloLens development many times on this blog site before so I won’t cover them here but if you’re new to this then there’s a great reference over here that will walk you through getting the camera, build settings, project settings etc. all ok for HoloLens development.

Using the Service Step 3 – Getting the Unity SDK

Ok, this is the first point at which I got stuck. When I click on this page on the docs site;

image

then the link to the SDK takes me to the detailed doc pages but it doesn’t seem to tell me where I get the actual SDK from – I was thinking of maybe getting a Unity package or similar but I’ve not found that link yet.

This caused me to unpick the sample a little and I learned a few things by doing that. In the official Unity sample you’ll see that the plugins folder (for HoloLens) contains these pieces;

image

and if you examine the post-build step here in this script you’ll see that there’s a function which essentially adds the nuget package Microsoft.Azure.SpatialAnchors.WinCPP into the project when it’s built;

image

and you can see that the script can cope with .NET projects and C++ projects (for IL2CPP) although I’d flag a note in the readme right now which suggests that this doesn’t work for IL2CPP anyway today;

### Known issues for HoloLens

For the il2cpp scripting backend, see this [issue]( https://forum.unity.com/threads/httpclient.460748/ ).

The short answer to the workaround is to:

1. First make a mcs.rsp with the single line `-r:System.Net.Http.dll`. Place this file in the root of your assets folder.
2. Copy the `System.net.http.dll` from `<unityInstallDir>\Editor\Data\MonoBleedingEdge\lib\mono\4.5\System.net.http.dll` into your assets folder.

There is an additional issue on the il2cpp scripting backend case that renders the library unusable in this release.

so please keep that in mind given that IL2CPP is the new default backend for these applications.

I haven’t poked into the iOS/Android build steps at the time of writing so can’t quite say what happens there just yet.

This all means that when I build from Unity I end up with a project which includes a reference to Microsoft.Azure.SpatialAnchors.WinCPP as a Nuget package as below (this is taken from a .NET backend project);

image

so, what’s in that thing? Is it a WinRT component?

I don’t think it is. I had to go and visit the actual Nuget package to try and figure it out but when I took a look I couldn’t find any .winmd file or similar. All I found in that package was a .DLL;

image

and as far as I can tell this is just a flat DLL with a bunch of exported flat functions like these;

image

I can only guess but I suspect then that the SDK is built in C/C++ so as to be portable across iOS, Android & UWP/Unity and then packaged up in slightly different ways to hit those different target environments.

Within Unity, this is made more palatable by having a bridge script which is included in the sample project called AzureSpatialAnchorsBridge.cs;

image

which then does a bunch of PInvokes into that flat DLL like this one;

image

so that’s how that seems to work.

If I then want to take this across to a new project, it feels like I need to package up a few things and I tried to package;

image

hoping to come away with the minimal set of pieces that I need to make this work for HoloLens and that seemed to work when I imported this package into my new, blank project.

I made sure that project had both the InternetClient and Microphone capabilities and, importantly, SpatialPerception. With that in place, I’m now in my blank project and ready to write some code.

Using the Service Step 4 – Getting a Cloud Session

In true Unity tradition, I made an empty GameObject and threw a script onto it called ‘TestScript’ and then I edited in a small amount of infrastructure code;

using Microsoft.Azure.SpatialAnchors;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using UnityEngine;
using UnityEngine.Windows.Speech;
using UnityEngine.XR.WSA;
#if ENABLE_WINMD_SUPPORT
using Windows.Media.Core;
using Windows.Media.Playback;
using Windows.Media.SpeechSynthesis;
#endif // ENABLE_WINMD_SUPPORT

public class TestScript : MonoBehaviour
{
    public Material cubeMaterial;

    void Start()
    {
        this.cubes = new List<GameObject>();

        var speechActions = new Dictionary<string, Func<Task>>()
        {
            ["session"] = this.OnCreateSessionAsync,
            ["cube"] = this.OnCreateCubeAsync,
            ["clear"] = this.OnClearCubesAsync
        };
        this.recognizer = new KeywordRecognizer(speechActions.Keys.ToArray());

        this.recognizer.OnPhraseRecognized += async (s) =>
        {
            if ((s.confidence == ConfidenceLevel.Medium) || 
                (s.confidence == ConfidenceLevel.High))
            {
                Func<Task> value = null;

                if (speechActions.TryGetValue(s.text.ToLower(), out value))
                {
                    await value();
                }
            }
        };
        this.recognizer.Start();
    }
    async Task OnCreateSessionAsync()
    {
        // TODO: Create a cloud anchor session here.
    }
    async Task OnCreateCubeAsync()
    {
        var cube = GameObject.CreatePrimitive(PrimitiveType.Cube);

        cube.transform.localScale = new Vector3(0.2f, 0.2f, 0.2f);

        cube.transform.position = 
            Camera.main.transform.position + 2.0f * Camera.main.transform.forward;

        cube.GetComponent<Renderer>().material = this.cubeMaterial;

        this.cubes.Add(cube);

        var worldAnchor = cube.AddComponent<WorldAnchor>();
    }
    async Task OnClearCubesAsync()
    {
        foreach (var cube in this.cubes)
        {
            Destroy(cube);
        }
        this.cubes.Clear();
    }
    public async Task SayAsync(string text)
    {
        // Ok, this is probably a fairly nasty way of playing a media stream in
        // Unity but it sort of works so I've gone with it for now 🙂
#if ENABLE_WINMD_SUPPORT
        if (this.synthesizer == null)
        {
            this.synthesizer = new SpeechSynthesizer();
        }
        using (var stream = await this.synthesizer.SynthesizeTextToStreamAsync(text))
        {
            using (var player = new MediaPlayer())
            {
                var taskCompletionSource = new TaskCompletionSource<bool>();

                player.Source = MediaSource.CreateFromStream(stream, stream.ContentType);

                player.MediaEnded += (s, e) =>
                {
                    taskCompletionSource.SetResult(true);
                };
                player.Play();
                await taskCompletionSource.Task;
            }
        }

#endif // ENABLE_WINMD_SUPPORT
    }
#if ENABLE_WINMD_SUPPORT
    SpeechSynthesizer synthesizer;
#endif // ENABLE_WINMD_SUPPORT

    KeywordRecognizer recognizer;
    List<GameObject> cubes;
}

and so this gives me the ability to say “session” to create a session, “cube” to create a cube with a world anchor and “clear” to get rid of all my cubes.
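
The keyword-to-handler dictionary is the only real ‘infrastructure’ in that script, so here is a small standalone sketch of the same idea pulled out into its own class. The VoiceCommandTable name and its members are my own invention for illustration (they are not part of Unity or the Spatial Anchors SDK), and the sketch assumes a case-insensitive lookup is acceptable so that there is no need to call ToLower on the recognised text;

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper: maps spoken keywords to async handlers.
public sealed class VoiceCommandTable
{
    // OrdinalIgnoreCase means "Cube" and "cube" find the same handler.
    readonly Dictionary<string, Func<Task>> handlers =
        new Dictionary<string, Func<Task>>(StringComparer.OrdinalIgnoreCase);

    public void Add(string keyword, Func<Task> handler)
    {
        this.handlers[keyword] = handler;
    }

    // The keywords to hand to a recogniser such as KeywordRecognizer.
    public string[] Keywords => this.handlers.Keys.ToArray();

    // Runs the handler for a recognised phrase; returns false if none matched.
    public async Task<bool> DispatchAsync(string phrase)
    {
        if (phrase != null && this.handlers.TryGetValue(phrase, out var handler))
        {
            await handler();
            return true;
        }
        return false;
    }
}
```

In the script above, Start would then fill the table, pass Keywords to the KeywordRecognizer constructor and call DispatchAsync(s.text) from the OnPhraseRecognized handler once the confidence check has passed.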

Into that, it’s fairly easy to add an instance of CloudSpatialAnchorSession and create it but note that I’m using the easy path at the moment of configuring it with the ID and Key for my service. In the real world, I’d want to configure it to do auth properly and the service is integrated with AAD auth to make that easier for me if I want to go that way.

I added a member variable of type CloudSpatialAnchorSession and then just added in a little code into my OnCreateSessionAsync method;

    async Task OnCreateSessionAsync()
    {
        if (this.cloudAnchorSession == null)
        {
            this.cloudAnchorSession = new CloudSpatialAnchorSession();
            this.cloudAnchorSession.Configuration.AccountId = ACCOUNT_ID;
            this.cloudAnchorSession.Configuration.AccountKey = ACCOUNT_KEY;
            this.cloudAnchorSession.Error += async (s, e) => await this.SayAsync("Error");
            this.cloudAnchorSession.Start();
        }
    }

and that’s that. Clearly, I’m using speech here to avoid having to make “UI”.

Using the Service Step 5 – Creating a Cloud Anchor

Ok, I’ve already got a local WorldAnchor on any and all cubes that get created here so how do I turn these into cloud anchors?

The first thing of note is that the CloudSpatialAnchorSession has 2 floating point values (0-1) which tell you whether it is ready or not to create a cloud anchor. You call GetSessionStatusAsync and it returns a SessionStatus which reports them as ReadyForCreateProgress and RecommendedForCreateProgress.

If it’s not ready then you need to get your user to walk around a bit until it is ready with some nice UX and so on and you can even query the UserFeedback to see what you might suggest to the user to get them to improve on the situation.

It looks like you can also get notified of changes to these values by handling the SessionUpdated event as well.

Consequently, I wrote a little method to try and poll these values, checking for ReadyForCreateProgress reaching at least 1.0f;

    async Task WaitForSessionReadyToCreateAsync()
    {
        while (true)
        {
            var status = await this.cloudAnchorSession.GetSessionStatusAsync();

            if (status.ReadyForCreateProgress >= 1.0f)
            {
                break;
            }
            await Task.Delay(250);
        }
    }

and that seemed to work reasonably although, naturally, the hard-coded 250ms delay might not be the smartest thing to do.
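
One thing that loop can’t do is give up, so if the user never moves around enough it will poll forever. Below is a hedged sketch of a more general polling helper that takes a CancellationToken so that a caller can impose a timeout. ProgressPolling and WaitUntilReadyAsync are hypothetical names of my own rather than anything from the SDK, and the progress value is supplied via a delegate so that the helper has no dependency on CloudSpatialAnchorSession;

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: polls a 0-1 progress value until it reaches 1.0.
public static class ProgressPolling
{
    // Cancelling the token (for instance from a timeout) ends the wait
    // with an OperationCanceledException instead of looping forever.
    public static async Task WaitUntilReadyAsync(
        Func<Task<float>> readProgress,
        TimeSpan pollInterval,
        CancellationToken cancellationToken)
    {
        while (true)
        {
            cancellationToken.ThrowIfCancellationRequested();

            if (await readProgress() >= 1.0f)
            {
                return;
            }
            // Task.Delay also observes the token, so cancellation is prompt.
            await Task.Delay(pollInterval, cancellationToken);
        }
    }
}
```

To use it from the script, I would pass a delegate that awaits GetSessionStatusAsync and returns ReadyForCreateProgress, along with the token from a CancellationTokenSource constructed with a timeout (say 30 seconds), and catch OperationCanceledException to tell the user that the session never became ready.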

With that in place though I can then add this little piece of code to my OnCreateCubeAsync method just after it attaches the WorldAnchor to the cube;

        var cloudSpatialAnchor = new CloudSpatialAnchor(
            worldAnchor.GetNativeSpatialAnchorPtr(), false);

        await this.WaitForSessionReadyToCreateAsync();

        await this.cloudAnchorSession.CreateAnchorAsync(cloudSpatialAnchor);

        this.SayAsync("cloud anchor created");

and sure enough I see the portal reflecting that I have created an anchor in the cloud;

image

Ok – anchor creation is working! Let’s move on and see if I can get an anchor re-localised.

Using the Service Step 6 – Localising an Anchor

As far as I can work out so far, the process of ‘finding’ one or more anchors comes down to using a CloudSpatialAnchorWatcher and asking it to look for some anchors for you in one of two ways by using this AnchorLocateCriteria;

  • I can give the watcher one or more identifiers for anchors that I have previously uploaded (note that the SDK fills in the cloud anchor ID (string (guid)) in the Identifier property of the CloudSpatialAnchor after it has been saved to the cloud).
  • I can ask the watcher to look for anchors that are nearby another anchor.

I guess the former scenario works when my app has some notion of a location based on something like a WiFi network name, a marker, a GPS co-ordinate or perhaps just some setting that the user has chosen and this can then be used to find a bunch of named anchors that are supposed to be associated with that place.

Once one or more of those anchors has been found, the nearby mode can perhaps be used to find other anchors near to that site. The way in which anchors become ‘nearby’ is documented in the “Connecting Anchors” help topic here.

It also looks like I have a choice when loading anchors as to whether I want to include the local cache on the device and whether I want to load anchors themselves or purely their metadata so that I can (presumably) do some more filtering before deciding to load. That’s reflected in the properties BypassCache and RequestedCategories respectively.

In trying to keep my test code here as short as possible, I figured that I would simply store in memory any anchor IDs that have been sent off to the cloud and then I’d add another command “reload” which attempted to go back to the cloud, get those anchors and recreate the cubes in the locations where they were previously stored.

I set the name of the cube to be the anchor ID from the cloud, i.e. after I create the cloud anchor I just do this;

        await this.cloudAnchorSession.CreateAnchorAsync(cloudSpatialAnchor);

        // NEW!
        cube.name = cloudSpatialAnchor.Identifier;

        this.SayAsync("cloud anchor created");

and so that stores the IDs for me. I also need to change the way in which I create the session in order to handle 2 new events, AnchorLocated and LocateAnchorsCompleted when I create the CloudSpatialAnchorSession;

    async Task OnCreateSessionAsync()
    {
        if (this.cloudAnchorSession == null)
        {
            this.cloudAnchorSession = new CloudSpatialAnchorSession();
            this.cloudAnchorSession.Configuration.AccountId = ACCOUNT_ID;
            this.cloudAnchorSession.Configuration.AccountKey = ACCOUNT_KEY;
            this.cloudAnchorSession.Error += async (s, e) => await this.SayAsync("Error");

            // NEW
            this.cloudAnchorSession.AnchorLocated += OnAnchorLocated;

            // NEW
            this.cloudAnchorSession.LocateAnchorsCompleted += OnLocateAnchorsCompleted;

            this.cloudAnchorSession.Start();
        }
    }

and then I added a new voice command “reload” (registered in the speechActions dictionary alongside the others) which grabs all the IDs from the cubes and attempts to create a watcher to reload them;

    async Task OnReloadCubesAsync()
    {
        if (this.cubes.Count > 0)
        {
            var identifiers = this.cubes.Select(c => c.name).ToArray();

            await this.OnClearCubesAsync();

            var watcher = this.cloudAnchorSession.CreateWatcher(
                new AnchorLocateCriteria()
                {
                    Identifiers = identifiers,
                    BypassCache = true,
                    RequestedCategories = AnchorDataCategory.Spatial,
                    Strategy = LocateStrategy.AnyStrategy
                }
            );
        }
    }

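and then finally the event handler for each located anchor is as follows – I basically recreate the cube and attach the anchor (note that relocalizedCubeMaterial is a second Material field that needs declaring alongside cubeMaterial);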

    void OnAnchorLocated(object sender, AnchorLocatedEventArgs args)
    {
        UnityEngine.WSA.Application.InvokeOnAppThread(
            () =>
            {
                var cube = GameObject.CreatePrimitive(PrimitiveType.Cube);

                cube.transform.localScale = new Vector3(0.2f, 0.2f, 0.2f);

                cube.GetComponent<Renderer>().material = this.relocalizedCubeMaterial;

                var worldAnchor = cube.AddComponent<WorldAnchor>();

                worldAnchor.SetNativeSpatialAnchorPtr(args.Anchor.LocalAnchor);

                cube.name = args.Identifier;

                SayAsync("Anchor located");
            },
            false
        );
    }

and the handler for when all anchors have been located just tells me that the process has finished;

    void OnLocateAnchorsCompleted(object sender, LocateAnchorsCompletedEventArgs args)
    {
        SayAsync("Anchor location completed");
        args.Watcher.Stop();
    }

and that’s pretty much it – I found that my anchors reload in much the way that I’d expect them to.
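
One gap in my test code is that it never checks whether every anchor that I asked for actually came back, and as I said earlier it’s always possible that a stored anchor can’t be re-localised. Here is a small sketch of one way to keep track of that. AnchorReloadTracker is a hypothetical class of my own (nothing like it is part of the SDK as far as I know) and it deals purely in identifier strings;

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: remembers which anchor identifiers were requested
// and which of them have been located so far. Not thread-safe, so call it
// from a single thread (e.g. inside the InvokeOnAppThread callback).
public sealed class AnchorReloadTracker
{
    readonly HashSet<string> pending;

    public AnchorReloadTracker(IEnumerable<string> requestedIdentifiers)
    {
        this.pending = new HashSet<string>(requestedIdentifiers, StringComparer.Ordinal);
    }

    // Call when an anchor is found. Returns false for an identifier that
    // was not requested or that has already been marked as located.
    public bool MarkLocated(string identifier)
    {
        return this.pending.Remove(identifier);
    }

    // Identifiers still outstanding. Once the locate operation completes,
    // these are the anchors that could not be re-localised.
    public IReadOnlyCollection<string> Missing => this.pending.ToList();
}
```

The idea would be to construct the tracker from the identifiers array in OnReloadCubesAsync, call MarkLocated(args.Identifier) from OnAnchorLocated and then inspect Missing in OnLocateAnchorsCompleted to report any cubes that didn’t come back.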

Wrapping Up

As I said at the start of the post, this was just me trying out a few rough ideas and I’ve covered nothing that isn’t already present in the official samples but I found that I learned a few things along the way and I feel like I’m now a little more conversant with this service. Naturally, I need to revisit and go through the process of updating/deleting anchors and also of looking at gathering ‘nearby’ anchors and re-localising them but I think that I “get it” more than I did at the start of the post.

The other thing I need to do is to try this out from a different kind of device, more than likely an Android phone, but that’s for another post.