Experiments with Shared Holograms and Azure Blob Storage/UDP Multicasting (Part 1)

NB: The usual blog disclaimer for this site applies to posts around HoloLens. I am not on the HoloLens team. I have no details on HoloLens other than what is on the public web and so what I post here is just from my own experience experimenting with pieces that are publicly available and you should always check out the official developer site for the product documentation.

I’ve written a number of posts about the idea of shared holographic experiences including the posts below;

Experiments with Shared Holographic Experiences and Photon Unity Networking

Experiments with Shared Holographic Experiences and AllJoyn (Spoiler Alert- this one does not end well!)

Hitchhiking the HoloToolkit-Unity, Leg 13–Continuing with Shared Experiences

Hitchhiking the HoloToolkit-Unity, Leg 11–More Steps with Sharing Holographic Experiences

Hitchhiking the HoloToolkit-Unity, Leg 10–Baby Steps with Sharing Holographic Experiences

Hitchhiking the HoloToolkit-Unity, Leg 5–Baby Steps with World Anchors and Persisting Holograms

It sometimes feels like I can’t leave this topic alone :) but that’s perhaps to be expected, as I’ve seen a lot of developer interest in applications that offer a shared holographic experience and so maybe it’s right that I keep coming back to it.

Additionally, when I’ve worked with developers to set up shared holograms in their applications I’ve found that they sometimes get bogged down in the enabling technologies that underpin the Mixed Reality Toolkit (for Unity) such as UNET or the networked sharing service.

With that in mind, I wanted to return to the shared holograms topic again with some experiments around enabling shared holograms without using the Mixed Reality Toolkit/Unity and without using some external enabling library such as UNET or Photon.

This is the first in a small series of posts taking a look at that type of experiment and it starts with thinking about what needs to be passed over the network.

Enabling Shared Holograms

In order to enable shared holograms, we need some kind of networking that enables two things;

  1. The ability to quickly send messages between HoloLens devices on a common network – examples might be “Hologram has been created” or “Hologram has moved”.
    • I’m assuming that ‘common network’ is ok here and not considering the case when the HoloLens devices are separated by network infrastructure or the internet etc.
  2. The ability to send serialized blobs representing HoloLens ‘spatial anchors’ between HoloLens devices on a common network.
    • This is necessary to establish a common co-ordinate system between devices.
    • Based on my past experiments, I think it’s safe to assume that these serialized blobs can easily grow to 1-10MB in size, which means that transfers can take some time and that some networking technologies might not really be appropriate.

There are many, many ways in which this can be achieved and some of them appear in the previous posts that I’ve referred to above.

For this small set of posts I’m going to use two technologies to try and implement 1 and 2 above;

  1. I’m going to use UDP multicasting to distribute small network messages to a set of HoloLens devices.
  2. I’m going to use Azure Blob Storage to store serialized spatial anchor blobs (the sort of thing sketched just below).
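To make 2 a little more concrete, a hedged sketch of the Azure piece is below – it simply uploads/downloads a byte array by name using the WindowsAzure.Storage client library, and the connection string, container name and class/method names are placeholders of my own rather than anything from the posts’ code (from Unity’s .NET 3.5 environment the REST API might be needed instead);

namespace AnchorStorageSketch
{
    using System.IO;
    using System.Threading.Tasks;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Blob;

    public static class AnchorBlobStore
    {
        // Placeholder values – a real app would keep these out of source code.
        const string connectionString = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...";
        const string containerName = "anchors";

        // Upload a serialized spatial anchor under a well-known name so that
        // other devices can be told (e.g. via a multicast message) to fetch it.
        public static async Task UploadAnchorAsync(string blobName, byte[] anchorBits)
        {
            var container = CloudStorageAccount.Parse(connectionString)
                .CreateCloudBlobClient()
                .GetContainerReference(containerName);

            await container.CreateIfNotExistsAsync();

            var blob = container.GetBlockBlobReference(blobName);
            await blob.UploadFromByteArrayAsync(anchorBits, 0, anchorBits.Length);
        }

        // Download a serialized spatial anchor by name on the receiving device.
        public static async Task<byte[]> DownloadAnchorAsync(string blobName)
        {
            var blob = CloudStorageAccount.Parse(connectionString)
                .CreateCloudBlobClient()
                .GetContainerReference(containerName)
                .GetBlockBlobReference(blobName);

            using (var stream = new MemoryStream())
            {
                await blob.DownloadToStreamAsync(stream);
                return stream.ToArray();
            }
        }
    }
}

The name of the blob (e.g. a GUID) is then the only thing that needs to travel between devices over the lightweight messaging channel.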

In many ways, this is similar to what I did in the blog post;

Experiments with Shared Holographic Experiences and AllJoyn (Spoiler Alert- this one does not end well!)

except that I’m taking AllJoyn out of the equation and using UDP multicasting instead.

Against that backdrop, I needed a library to enable UDP multicasting of messages between devices and so I made one…

A Quick UDP Multicasting Library

I wrote a quick library for UDP Multicasting of messages and dropped it onto github over here.

There’s nothing radical or clever in there; the intention is to multicast single, UDP-packet-sized messages around the network, making it easy to send them and pick them up again on different machines (the library doesn’t currently support multiple processes on the same machine).

I wanted the library to target both UWP (specifically SDK level 14393 for HoloLens) and .NET Framework (specifically I’ve targeted the “Unity 3.5 .net Subset BCL”) and so inside the solution are a few projects;

[Screenshot: the Visual Studio solution – a SharedCode folder plus the two projects that reference it]

and so the code lives 100% in the SharedCode folder, referenced from the other 2 projects, and at the end of the build process out should pop one library targeting .NET 3.5 for Unity and another targeting UWP.

Despite the (bad!) naming of the solutions and projects in the screenshot above, both projects should build a library called BroadcastMessaging.dll. That’s intentional because it should allow me to use the .NET 3.5 version as a placeholder library for Unity’s editor while the UWP library is the one actually used at runtime – Unity likes placeholder and replacement libraries to have the same name.

The way that the library is intended to work is by requiring both message senders and receivers to have common types derived from the base class Message in the library, with any state serialized by overriding the Load/Save methods, as in the example message class below which is intended to carry around a GUID;

namespace SharedTestAppCode
{
    using BroadcastMessaging;
    using System;
    using System.IO;

    public class TestGuidMessage : Message
    {
        public override void Load(BinaryReader reader)
        {
            this.Id = Guid.Parse(reader.ReadString());
        }
        public override void Save(BinaryWriter writer)
        {
            writer.Write(this.Id.ToString());
        }
        public Guid Id { get; set; }
    }
}

In fact, identification of a message type is done purely by its short, unqualified .NET type name, so MyNamespaceOne.MyMessage and MyNamespaceTwo.MyMessage would be confused by my library – clearly, it could be a lot more robust than it currently is.

Once a message type is defined there’s a type called a MessageRegistrar which needs to know how to create these messages and so there’s a call to register them;

            MessageRegistrar messageRegistrar = new MessageRegistrar();

            messageRegistrar.RegisterMessageFactory<TestGuidMessage>(
                () => new TestGuidMessage());

and that means that the system can now create one of these message types should it need to. The registrar also deals with adding callbacks if code wants to be notified when one of these message types arrives on the network;

            messageRegistrar.RegisterMessageHandler<TestGuidMessage>(
                msg =>
                {
                    var testMessage = msg as TestGuidMessage;

                    if (testMessage != null)
                    {
                        Console.WriteLine(
                            $"\tReceived a message from {testMessage.Id}");
                    }
                }
            );

and once that is set up there’s a class named MessageService which does the work of sending/receiving messages and it’s instantiated as below;

            var messageService = new MessageService(messageRegistrar, ipAddress);

and there are extra parameters here to control the multicast address being used (it defaults to 239.0.0.0) and the port being used (it defaults to 49152), and there’s also a need to pass the local IP address to which the UdpClient that this library uses should bind itself.
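For reference, the underlying UdpClient mechanics that this kind of library wraps – binding to a local address, joining the multicast group and sending/receiving single datagrams – look roughly like the sketch below. This is illustrative rather than the library’s actual code and the local address is a placeholder;

namespace MulticastSketch
{
    using System;
    using System.Net;
    using System.Net.Sockets;
    using System.Text;

    class Program
    {
        static void Main()
        {
            // The same defaults that the library uses.
            var groupAddress = IPAddress.Parse("239.0.0.0");
            var port = 49152;

            // Placeholder – this is the local address discussed below.
            var localAddress = IPAddress.Parse("192.168.0.10");

            var client = new UdpClient();
            client.Client.SetSocketOption(
                SocketOptionLevel.Socket, SocketOptionName.ReuseAddress, true);
            client.Client.Bind(new IPEndPoint(localAddress, port));
            client.JoinMulticastGroup(groupAddress, localAddress);

            // Send a single, packet-sized message to the group...
            var bytes = Encoding.UTF8.GetBytes("hello");
            client.Send(bytes, bytes.Length, new IPEndPoint(groupAddress, port));

            // ...and block waiting for the next datagram to arrive.
            var remote = new IPEndPoint(IPAddress.Any, 0);
            var received = client.Receive(ref remote);

            Console.WriteLine($"Received {received.Length} bytes from {remote}");
        }
    }
}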

I find that getting hold of that local IP address isn’t always “so easy” across .NET Framework and UWP on my machines, so I added another type into my library to encapsulate a process for trying to obtain it;

            // Note: FirstOrDefault() returns null if no suitable address is found,
            // so a real app would want to check that before calling ToString().
            var ipAddress =
                NetworkUtility.GetConnectedIpAddresses(false, true, AddressFamilyType.IP4)
                .FirstOrDefault()
                .ToString();

and the library routine NetworkUtility.GetConnectedIpAddresses takes parameters controlling whether to only consider WiFi networks, whether to rule out networks that appear to be ‘virtual’ and which address family (IP4/6) to consider. I’m sure that this code will have problems on some systems and could be improved but it seems to work reasonably well on the PCs that I’ve tried it on to date.

With that all set up, the only remaining thing to do is to send messages via the MessageService and that code looks something like;

messageService.Open();
Console.WriteLine("Hit X to exit, S to send a message");

// 'guid' was created earlier to identify this instance of the test app
var msg = new TestGuidMessage()
{
    Id = guid
};
messageService.Send(msg,
    sent =>
    {
        Console.WriteLine($"\tMessage sent? {sent}"); 
    }
);

with the callback being an optional part of sending a message. I’ve used callbacks here rather than exposing Task-based asynchronous code because I want the API surface to be the same across .NET Framework 3.5 and UWP clients. For similar reasons, I ended up with some methods taking object-based parameters as I struggled with some co/contravariance pieces across the two different .NETs.

The other projects in the solution here are test apps, with one being a .NET console application targeting 3.5 and the other a UWP application targeting 14393.

Both applications create a GUID to identify themselves and then multicast it over the network, logging when they receive a message from somewhere else on the network.
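Pulling those fragments together, a minimal console test along the lines of the one in the solution might look something like the sketch below (the structure is mine rather than a verbatim copy of the test app);

namespace SharedTestAppCode
{
    using BroadcastMessaging;
    using System;
    using System.Linq;

    class Program
    {
        static void Main()
        {
            // A GUID that identifies this instance of the app on the network.
            var guid = Guid.NewGuid();

            var messageRegistrar = new MessageRegistrar();

            messageRegistrar.RegisterMessageFactory<TestGuidMessage>(
                () => new TestGuidMessage());

            messageRegistrar.RegisterMessageHandler<TestGuidMessage>(
                msg =>
                {
                    var testMessage = msg as TestGuidMessage;

                    if (testMessage != null)
                    {
                        Console.WriteLine($"\tReceived a message from {testMessage.Id}");
                    }
                });

            var ipAddress =
                NetworkUtility.GetConnectedIpAddresses(false, true, AddressFamilyType.IP4)
                .First() // throws if no suitable address is found
                .ToString();

            var messageService = new MessageService(messageRegistrar, ipAddress);
            messageService.Open();

            Console.WriteLine("Hit X to exit, S to send a message");

            while (true)
            {
                var key = Char.ToUpper(Console.ReadKey(true).KeyChar);

                if (key == 'X')
                {
                    break;
                }
                if (key == 'S')
                {
                    messageService.Send(
                        new TestGuidMessage() { Id = guid },
                        sent => Console.WriteLine($"\tMessage sent? {sent}"));
                }
            }
        }
    }
}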

You can see those “in action” ;) below – here’s the console app receiving/sending messages;

[Screenshot: the console test app sending and receiving messages]

and here’s the UWP app sending/receiving messages;

[Screenshot: the UWP test app sending and receiving messages]

And that’s pretty much it for this post. In the next post, I’ll start to add some code inside of Unity that makes use of this layer and ties it in with Azure blob storage to build up the basics of some shared holograms.

HoloLens<->Immersive Headset Tracking Experiment Part 2

NB: The usual blog disclaimer for this site applies to posts around HoloLens. I am not on the HoloLens team. I have no details on HoloLens other than what is on the public web and so what I post here is just from my own experience experimenting with pieces that are publicly available and you should always check out the official developer site for the product documentation.

Just a small update to a recent post where I used the sharing service from the Mixed Reality Toolkit in order to experiment with the idea of creating a common co-ordinate system between an app running on an immersive Mixed Reality headset and the same app running on the holographic HoloLens headset.

As part of that, I had the immersive headset broadcast its position to the HoloLens such that the HoloLens could track it as it moved around.

It seemed obvious that this should also work in the other direction – i.e. the HoloLens could also broadcast its position such that the immersive headset could track it as it moved around and so I’ve modified the code to make that change.

The code is where it was previously – it’s quite rough but quite good fun :)

The screenshot below shows a capture from the Unity editor where the app is running on an immersive headset; the glasses represent where the immersive headset thinks the HoloLens is in the ‘real’ world around it – it seems to work reasonably well.

[Screenshot: Unity editor capture – the glasses showing where the immersive headset thinks the HoloLens is]

HoloLens Tracking of an Immersive Headset (or “Manual Spatial Anchoring”)

NB: The usual blog disclaimer for this site applies to posts around HoloLens. I am not on the HoloLens team. I have no details on HoloLens other than what is on the public web and so what I post here is just from my own experience experimenting with pieces that are publicly available and you should always check out the official developer site for the product documentation.

This post falls mainly into the category of “just for fun” but since I first got an immersive Windows Mixed Reality headset (an Acer) I’ve been quite keen to set up a situation where I could track its position using my HoloLens.

I don’t really know why and I don’t know whether there’s a tangible use for this, I just wanted to experiment with it.

What do I mean by track? Here’s a video example to explain. Please keep in mind that this was captured using mixed reality capture on a HoloLens which means that the quality is much lower than the on-device experience would be;

[Video: mixed reality capture of the HoloLens tracking the immersive headset]

Tracking with Multiple HoloLens Devices

In building shared holographic experiences it’s not too tricky to have multiple HoloLens devices all in one scene, networked together, with each device capable of displaying the position, orientation and gaze vector of the other devices or some other shared holograms.

For the purposes of this post, I’m using the term “track” to describe the ability of one HoloLens to know the position and orientation of another device but it’s my own term rather than some official one.

There’s a script in the Mixed Reality Toolkit named RemoteHeadManager which does some of this for you and in previous blog posts like this one I’ve shown examples of doing that, as demonstrated in the picture below;

[Image: a remote HoloLens rendered in the scene, displaying its gaze ray]

where you can see a HoloLens floating and displaying its gaze ray. In that particular example the participant was remote and so there’s no local human being attached to that HoloLens but, hopefully, you get the idea.

Co-ordinate Systems

Being able to do this piece of magic ultimately comes down to being able to agree a common co-ordinate system between the multiple devices or at least a transformation from the co-ordinate system of one device to that of another.

When you first run an application on a HoloLens the starting device (or head) position is taken as the origin of the Unity scene (i.e. a Vector3(0,0,0)) with the X,Y,Z axes pointing to the right, up and forward in the natural way with respect to the device and/or the user’s head.

This means that if multiple HoloLens devices are present in a location then, unless they all run the application from the exact same physical start-up spot, they are all going to have different starting positions in that location, meaning that their origin points (0,0,0) will be in different physical positions and their X,Y,Z axes are likely to be pointing in different directions.

How to rationalise across these different co-ordinate systems in order to be able to display consistent content? The devices need to agree on something :)

HoloLens sprinkles in some magic here because the device supports the idea of Spatial Anchors – a blob of data that represents a position and orientation in physical space.

The magic comes when you first learn that a HoloLens can export a spatial anchor, pass it over the network to another HoloLens and then that receiving device can attempt to import the same spatial anchor and locate it in the same space.

If that all works successfully (and generally it does) then the two devices now have an agreement about how a (position, rotation) within the room space is represented in their respective co-ordinate systems – this makes it “relatively easy” to consistently display objects.

A common way of then achieving that is to have each device maintain a GameObject locked to the position and orientation of the spatial anchor and then parent all content to be shared across devices from that GameObject such that all that content effectively has its origin and its axes determined by the anchored object.

This then means that e.g. a co-ordinate of (3,3,3) relative to the spatial anchored object on one device will show up in the same physical place in the world as a co-ordinate of (3,3,3) relative to the spatial anchored object on another device.
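In Unity terms, a minimal sketch of that pattern (using the UnityEngine.XR.WSA APIs from the Unity 2017.2 era rather than any code from these posts – the anchor id, method names and the byte handling here are placeholders) looks something like this;

using UnityEngine;
using UnityEngine.XR.WSA;
using UnityEngine.XR.WSA.Sharing;

public class SharedOriginSketch : MonoBehaviour
{
    // All content shared across devices is parented under this object.
    public GameObject anchorRoot;

    // Device A: anchor the root in physical space and serialize the anchor.
    public void AnchorAndExport()
    {
        var anchor = this.anchorRoot.AddComponent<WorldAnchor>();

        var batch = new WorldAnchorTransferBatch();
        batch.AddWorldAnchor("root", anchor);

        WorldAnchorTransferBatch.ExportAsync(
            batch,
            data => { /* accumulate the bytes, then send/upload them */ },
            reason => { /* check for SerializationCompletionReason.Succeeded */ });
    }

    // Device B: import the serialized anchor and lock the local root to it.
    public void ImportAndLock(byte[] anchorBits)
    {
        WorldAnchorTransferBatch.ImportAsync(
            anchorBits,
            (reason, importedBatch) =>
            {
                if (reason == SerializationCompletionReason.Succeeded)
                {
                    importedBatch.LockObject("root", this.anchorRoot);
                }
            });
    }
}

Once both devices have done their half of this, a child of anchorRoot at a local position of (3,3,3) shows up in the same physical place on each of them.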

So, for HoloLens this is all good because of the magic of Spatial Anchors. What about an immersive headset?

Tracking with a HoloLens and an Immersive Headset

If you’ve looked at the immersive Mixed Reality headsets then you’ll know that they feature inside-out tracking, and so it’s perhaps natural to assume that an application running on a PC and displaying on an immersive headset would be able to import a spatial anchor from a HoloLens, meaning that the code here would be the same as in the HoloLens-to-HoloLens scenario.

As far as I know, that’s not the case and I don’t believe it’s possible today to share a spatial anchor between an immersive headset and a HoloLens although I can’t quite find the definitive link that tells me that at the time of writing.

I’d be happy to be wrong here – it’d make the rest of the post redundant but that’d be a good thing :)

Additionally, it’s relevant to consider that on an immersive headset the origin (0,0,0) and axis orientation (X,Y,Z) are not simply determined by the place and direction in which the headset is physically sitting at the point when the application first runs.

The documentation on coordinate systems explains the different scales of experience as being orientation, seated, standing, room and world, and the different frames of reference that make these experiences possible.

One of these is the stage frame of reference where the origin is going to be on the floor of the room at the point that the user defined it when they set up their headset. So, for instance it’s perfectly possible for an app to start on an immersive headset at some position of (2,0.5,2) rather than at (0,0,0) as it would on HoloLens.

So, if I’ve got a HoloLens and an immersive headset operating in the same physical space then they almost certainly will have different origins within the space and differently aligned axes.

In order then for the HoloLens to somehow track the immersive headset in its own co-ordinate system, some form of manual means is going to be needed to agree on some common reference point that can be used to span co-ordinate systems.

Now, one way of doing this might be to use something like a Vuforia tag but the immersive headsets don’t have a web camera on them and so I’m not sure this would be feasible like it would on HoloLens.

With that in mind, I set about an approach of doing this manually along the lines of the following steps;

  1. HoloLens app runs up and displays some marker object that can be positioned in physical space.
    1. The HoloLens app can then create an empty GameObject at this co-ordinate with the same orientation
  2. Immersive headset runs up and is physically moved to the same place as the HoloLens marker object with the same orientation.
    1. The immersive app can then be informed (e.g. via a voice command) to create an empty GameObject at this co-ordinate with the same orientation
  3. The immersive headset sends its subsequent camera positions over the network relative to the GameObject created at step 2.1 above.
  4. The HoloLens headset can now reposition its marker object using the co-ordinates sent from the immersive headset relative to the game object created at step 1.1 above.

and, while quite “manual”, this seems to work out relatively nicely and the human being does the work of telling the devices how to align their respective co-ordinate systems.

It’s like spatial anchors for the generation who remember black and white TV ;)

In terms of how that was put together…

Putting it Together

The implementation of this seems relatively simple. I made a new Unity project in Unity 2017.2.0f3, brought in the Mixed Reality Toolkit and set up my project using the provided dialogs for;

  • Project Settings
  • Scene Settings
  • UWP Capabilities (including the microphone)

and so in the scene below, the only part that I created is the GameObject labelled Root with the rest coming from the toolkit dialogs;

[Screenshot: the Unity scene hierarchy – the Root GameObject alongside the objects created by the toolkit dialogs]

From there, I wanted a shared experience and was happy to make use of the sharing server, so I brought in the SharingStage prefab from the toolkit, configured it with my local PC’s IP address and added the AutoJoinSessionAndRoom script from the toolkit so that it would automatically join a default session and room on connection;

[Screenshot: the SharingStage prefab configured with the local PC’s IP address and the AutoJoinSessionAndRoom script]

Also on that same object is a script called Logic.cs which simply tries to enable (in a clunky way) one of two child objects named HoloLens and Immersive depending on which type of headset the code is running on;

[Screenshot: the Logic.cs script attached to the same object]
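I haven’t reproduced the actual script here but, as a sketch, the check can hang off the HolographicSettings.IsDisplayOpaque property that Unity 2017.2 provides (true for the immersive headsets, false for HoloLens), with the child object names matching the ones in my scene;

using UnityEngine;
using UnityEngine.XR.WSA;

// Sketch of the sort of thing Logic.cs does – enable one of the two child
// objects depending on which type of headset the code is running on.
public class Logic : MonoBehaviour
{
    void Start()
    {
        var isImmersive = HolographicSettings.IsDisplayOpaque;

        this.transform.Find("Immersive").gameObject.SetActive(isImmersive);
        this.transform.Find("HoloLens").gameObject.SetActive(!isImmersive);
    }
}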

From there, the HoloLens object looks like this;

[Screenshot: the HoloLens GameObject with the CustomMessages and HoloLensLogic scripts]

and so it makes use of a modified version of the CustomMessages.cs script taken from the toolkit’s tests project and then also contains this HoloLensLogic.cs script which essentially;

  • Creates the prefab representing the spectacles 1.5m in front of the user and locks them to their gaze (this is a cheap way of positioning them)
  • Waits for a click event and then
    • Creates a new game object at the position where the spectacles are to be used as the parent representing that transform in space
    • Registers to receive broadcasts of the immersive headset position and forward vector
  • On receipt of a broadcast
    • Updates the position of the spectacles (relative to the parent) to reflect the update from the remote immersive headset – roughly as sketched below
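I’ve not reproduced HoloLensLogic.cs itself but the receipt-handling piece boils down to something like the sketch below, where the names and the method shape are mine rather than the script’s;

using UnityEngine;

public class HoloLensTrackingSketch : MonoBehaviour
{
    // The GameObject created where the spectacles were placed on the tap.
    public GameObject parent;

    // The glasses prefab instance that tracks the immersive headset.
    public GameObject spectacles;

    // Called when a position/forward broadcast arrives from the immersive
    // headset – both values are relative to its own reference GameObject.
    public void OnRemotePose(Vector3 relativePosition, Vector3 relativeForward)
    {
        // Map the relative values back into this device's co-ordinate system
        // via the locally created parent object.
        this.spectacles.transform.position =
            this.parent.transform.TransformPoint(relativePosition);

        this.spectacles.transform.rotation =
            Quaternion.LookRotation(
                this.parent.transform.TransformDirection(relativeForward));
    }
}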

On the immersive side, the game object is as below;

[Screenshot: the Immersive GameObject with the CustomMessages script, speech keyword handling and the ImmersiveLogic script]

and so it also uses the same CustomMessages script, sets itself up to handle the speech keyword “mark” and has the ImmersiveLogic.cs script set up to provide that handling, which;

  • Waits for the speech keyword “mark” and then creates a GameObject to represent the position, orientation of the Camera in space at the point when that keyword is received.
  • Once the common co-ordinate point has been defined, transmits its Camera position and forward vector relative to that GameObject over the network to the HoloLens on every Update() – roughly as sketched below.
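Again, this isn’t the real ImmersiveLogic.cs but the essential transform maths on this side looks something like the sketch below – the names and the stand-in for the network send are placeholders of mine;

using UnityEngine;

public class ImmersiveTrackingSketch : MonoBehaviour
{
    // Created at the camera's pose when the "mark" keyword is heard.
    GameObject reference;

    // Invoked by the speech keyword "mark".
    public void OnMark()
    {
        var camera = Camera.main.transform;

        this.reference = new GameObject("Reference");
        this.reference.transform.SetPositionAndRotation(camera.position, camera.rotation);
    }

    void Update()
    {
        if (this.reference != null)
        {
            var camera = Camera.main.transform;

            // Express the camera pose relative to the shared reference point
            // before broadcasting it over the network.
            var relativePosition =
                this.reference.transform.InverseTransformPoint(camera.position);

            var relativeForward =
                this.reference.transform.InverseTransformDirection(camera.forward);

            // Stand-in for the CustomMessages broadcast to the HoloLens.
            Debug.Log("Would send relative pose " + relativePosition + ", " + relativeForward);
        }
    }
}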

Wrapping Up

That’s pretty much it – nothing too complicated once I’d decided on an approach although it took me a little while to figure things out initially and I learned a couple of things during the process.

As I said at the start of the post, this was “just for fun” and I’m not yet decided on the use cases for establishing a common co-ordinate system across HoloLens/immersive, but something in me would like to take it one step further and add the code to make the immersive headset display the actual position of the HoloLens in its surroundings, even if that doesn’t necessarily 100% make sense in an immersive environment.

Maybe I could also then add some code to create other holograms consistently positioned across the two devices. I’m not sure what it would ‘feel’ like to position an object in the real world with HoloLens and then to don an immersive headset and have that object appear “in the same location” given that I couldn’t see that location! Perhaps I need to try :)

I may update the code to do that at a later point – in the meantime, it’s all over here on github.