Hitchhiking the HoloToolkit-Unity, Leg 14–More with Spatial Understanding

NB: The usual blog disclaimer for this site applies to posts around HoloLens. I am not on the HoloLens team. I have no details on HoloLens other than what is on the public web and so what I post here is just from my own experience experimenting with pieces that are publicly available and you should always check out the official developer site for the product documentation.

I experimented with Spatial Understanding back in this blog post;

Hitchhiking the HoloToolkit-Unity, Leg 3–Spatial Understanding (& Mapping)

but I’ve done more with it in the meantime and wanted to try and write up a small part of that here.

Firstly, let’s outline a basic project with some pieces within it. In the Unity project in the screenshot below I have brought in the HoloToolkit-Unity, applied the HoloLens project and scene settings, and made sure that the microphone and spatial perception capabilities are switched on.

I’ve then brought in the standard SpatialMapping, SpatialUnderstanding and InputManager prefabs.

image

I’ve done little to alter them other than the settings shown below for SpatialMapping;

image

and these settings for SpatialUnderstanding;

image

and then I’ve added to my Placeholder object some components from the toolkit;

image

These include a keyword manager to handle voice commands, plus the Space Visualizer and App State components, which I brought in to try and visualise the ‘smoothed’ mesh produced by spatial understanding.

With a small amount of script in place to handle my two voice commands of Scan and Stop;

using HoloToolkit.Unity;
using UnityEngine;

public class Placeholder : MonoBehaviour
{
    bool isScanning;
    bool isStopping;

    void Start()
    {
        // Get told whenever the scan state changes.
        SpatialUnderstanding.Instance.ScanStateChanged += OnScanStateChanged;
    }
    void OnScanStateChanged()
    {
        // Once the scan has completed, allow a new one to be started.
        if (SpatialUnderstanding.Instance.ScanState == SpatialUnderstanding.ScanStates.Done)
        {
            this.isScanning = false;
            this.isStopping = false;
        }
    }
    // Handler for the 'Scan' voice command.
    public void OnScan()
    {
        if (!this.isScanning)
        {
            this.isScanning = true;
            SpatialUnderstanding.Instance.RequestBeginScanning();
        }
    }
    // Handler for the 'Stop' voice command.
    public void OnStop()
    {
        if (this.isScanning && !this.isStopping)
        {
            this.isStopping = true;
            SpatialUnderstanding.Instance.RequestFinishScan();
        }
    }
}

 

and that’s enough for me to be able to use Scan and Stop in order to see the smoothed mesh from Spatial Understanding over my environment.

image

and then if I want to (e.g.) find and visualize the largest wall in my environment then I can make use of convenience methods on the SpaceVisualizer class – e.g.

    void OnScanStateChanged()
    {
        if (SpatialUnderstanding.Instance.ScanState == SpatialUnderstanding.ScanStates.Done)
        {
            this.isScanning = false;
            this.isStopping = false;

            // Use Space Visualizer to find a large wall
            SpaceVisualizer.Instance.Query_Topology_FindLargeWall();
        }
    }

and that method in SpaceVisualizer looks like;

        public void Query_Topology_FindLargeWall()
        {
            ClearGeometry();

            // Only if we're enabled
            if (!SpatialUnderstanding.Instance.AllowSpatialUnderstanding)
            {
                return;
            }

            // Query
            IntPtr wallPtr = SpatialUnderstanding.Instance.UnderstandingDLL.PinObject(resultsTopology);
            int wallCount = SpatialUnderstandingDllTopology.QueryTopology_FindLargestWall(
                wallPtr);
            if (wallCount == 0)
            {
                AppState.Instance.SpaceQueryDescription = "Find Largest Wall (0)";
                return;
            }

            // Add the line boxes
            float timeDelay = (float)lineBoxList.Count * AnimatedBox.DelayPerItem;
            lineBoxList.Add(
                new AnimatedBox(
                    timeDelay,
                    resultsTopology[0].position,
                    Quaternion.LookRotation(resultsTopology[0].normal, Vector3.up),
                    Color.magenta,
                    new Vector3(resultsTopology[0].width, resultsTopology[0].length, 0.05f) * 0.5f)
            );
            AppState.Instance.SpaceQueryDescription = "Find Largest Wall (1)";
        }

Now, there are some interesting calls in here, referencing SpatialUnderstanding.Instance, SpatialUnderstandingDllTopology and SpatialUnderstanding.Instance.UnderstandingDLL. They produce a result like the one blurred out below, where the pink lines represent an edge of the largest wall that the device could see;

image

So, the result is fine but what’s going on with the structure of the code here? I find the spatial understanding pieces confusing to use, and I forget how they are layered together each time that I come back to this library, so I’m writing that layering down here as far as I understand it.

SpatialUnderstanding (the Unity script)

The script named SpatialUnderstanding is a singleton accessed via the static SpatialUnderstanding.Instance and is what showed up in the editor as the SpatialUnderstanding component.

It has properties such as AutoBeginScanning, UpdatePeriod_DuringScanning, UpdatePeriod_AfterScanning and it controls the scanning process via methods like RequestBeginScanning/RequestFinishScan and maintains the current state via the ScanState property. It has an Update method which drives the scanning process.

It’s fairly clear what this does for us but where does the functionality ultimately come from?

SpatialUnderstanding (the native DLL)

Inside of the HoloToolkit (rather than the HoloToolkit-Unity) there is a native SpatialUnderstanding project which looks something like this;

image

This builds out into a Windows 8.1 WinRT DLL and ultimately shows up in Unity under the Plugins folder;

image

and it’s essentially a regular, flat DLL with a number of exports – these are declared in the various Dll_*.h header files so, for example, Dll_Interface.h declares functions like;

	// Init/term
	EXTERN_C __declspec(dllexport) int SpatialUnderstanding_Init();
	EXTERN_C __declspec(dllexport) void SpatialUnderstanding_Term();

	// Scan flow control
	EXTERN_C __declspec(dllexport) void GeneratePlayspace_InitScan(
		float camPos_X, float camPos_Y, float camPos_Z,
		float camFwd_X, float camFwd_Y, float camFwd_Z,
		float camUp_X, float camUp_Y, float  camUp_Z,
		float searchDst, float optimalSize);

and then Dll_Topology.h includes functions such as;

	EXTERN_C __declspec(dllexport) int QueryTopology_FindLargestWall(
		_Inout_ TopologyResult* wall);
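
To make the "flat DLL with exports" idea concrete, here is a minimal sketch of the pattern that those headers follow: a C-linkage function which writes into a caller-supplied struct and returns a count. The names DemoResult and Demo_FindLargest are hypothetical and the export macro is cut down to a portable stand-in, so this is an illustration rather than code from the toolkit.

```cpp
// Hypothetical stand-in for the EXTERN_C __declspec(dllexport) pair, reduced
// so that the sketch also compiles with compilers other than MSVC.
#if defined(_MSC_VER)
#define DEMO_EXPORT extern "C" __declspec(dllexport)
#else
#define DEMO_EXPORT extern "C"
#endif

// Plain-old-data result that can cross the native/managed boundary unchanged.
struct DemoResult
{
    float position[3];
    float normal[3];
    float width;
    float length;
};

// Hypothetical export: fills in the caller's struct and returns the number of
// results written (0 or 1), which is the same shape as the
// QueryTopology_FindLargestWall declaration above.
DEMO_EXPORT int Demo_FindLargest(DemoResult* result)
{
    if (result == nullptr)
    {
        return 0;
    }
    // Made-up values standing in for a real query result.
    *result = DemoResult{ { 0.0f, 1.5f, 2.0f }, { 0.0f, 0.0f, -1.0f }, 3.0f, 2.4f };
    return 1;
}
```

Because the function has C linkage and only uses plain data, the caller needs nothing more than the function name and a matching struct layout.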

Now, calling these functions from C# in Unity is going to require using PInvoke and so…

SpatialUnderstandingDll*.cs in the Toolkit

There are a number of scripts in the toolkit which then provide PInvoke wrappers over these different functional areas exported from the native DLL;

image

and so if I dig into a file like SpatialUnderstandingDll.cs I’ll find there’s a nested, public class called Imports which has PInvoke signatures like;

            [DllImport("SpatialUnderstanding")]
            public static extern int SpatialUnderstanding_Init();

and so that’s how I might call that one exported function from my C# code. If I dig into SpatialUnderstandingDllTopology.cs as another example then I’ll find;

        [DllImport("SpatialUnderstanding")]
        public static extern int QueryTopology_FindLargestWall(
            [In, Out] IntPtr wall);             // TopologyResult

and so these provide the wrappers that make these functions callable but there’s another ‘trick’ here…

SpatialUnderstandingDll.cs Script in the Toolkit

When it comes to calling into functions like this one below;

            [DllImport("SpatialUnderstanding")]
            public static extern int QueryPlayspaceStats(
                [In] IntPtr playspaceStats);    // PlayspaceStats

so, what do I provide as the IntPtr here? Well, the same file defines the managed type that it points to;

            [StructLayout(LayoutKind.Sequential, Pack = 1)]
            public class PlayspaceStats
            {
                public int IsWorkingOnStats;				// 0 if still working on creating the stats

                public float HorizSurfaceArea;              // In m2 : All horizontal faces UP between Ground - 0.15 and Ground + 1.f (include Ground and convenient horiz surface)
                public float TotalSurfaceArea;              // In m2 : All !
                public float UpSurfaceArea;                 // In m2 : All horizontal faces UP (no constraint => including ground)
                public float DownSurfaceArea;               // In m2 : All horizontal faces DOWN (no constraint => including ceiling)
                public float WallSurfaceArea;               // In m2 : All Vertical faces (not only walls)
                public float VirtualCeilingSurfaceArea;     // In m2 : estimation of surface of virtual Ceiling.
                public float VirtualWallSurfaceArea;        // In m2 : estimation of surface of virtual Walls.

                public int NumFloor;                        // List of Area of each Floor surface (contains count)
                public int NumCeiling;                      // List of Area of each Ceiling surface (contains count)
                public int NumWall_XNeg;                    // List of Area of each Wall XNeg surface (contains count)
                public int NumWall_XPos;                    // List of Area of each Wall XPos surface (contains count)
                public int NumWall_ZNeg;                    // List of Area of each Wall ZNeg surface (contains count)
                public int NumWall_ZPos;                    // List of Area of each Wall ZPos surface (contains count)
                public int NumPlatform;                     // List of Area of each Horizontal not Floor surface (contains count)

                public int CellCount_IsPaintMode;           // Number paint cells (could deduce surface of painted area) => 8cm x 8cm cell
                public int CellCount_IsSeenQualtiy_None;    // Number of not seen cells => 8cm x 8cm cell
                public int CellCount_IsSeenQualtiy_Seen;    // Number of seen cells => 8cm x 8cm cell
                public int CellCount_IsSeenQualtiy_Good;    // Number of seen cells good quality => 8cm x 8cm cell
            };

and so that’s good – I guess that I just have to alloc one of these, pin it and pass it across the boundary before unpinning it after the method completes?

The class helps me with that again. Firstly, it has private members like this one (there are others following the same pattern);

        private Imports.PlayspaceStats reusedPlayspaceStats = new Imports.PlayspaceStats();
        private IntPtr reusedPlayspaceStatsPtr;

and then it provides a method like this one;

        public Imports.PlayspaceStats GetStaticPlayspaceStats()
        {
            return reusedPlayspaceStats;
        }

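and so rather than having to make my own instance of PlayspaceStats I can just ‘borrow’ this one when I need it, using the companion GetStaticPlayspaceStatsPtr method to get the IntPtr for it that the DLL call expects – e.g;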

            var ptr = SpatialUnderstanding.Instance.UnderstandingDLL.GetStaticPlayspaceStatsPtr();
            SpatialUnderstandingDll.Imports.QueryPlayspaceStats(ptr);

and the SpatialUnderstandingDll class itself also has PinObject and PinString methods, each of which pins its argument and puts the handle onto a list which can later be cleared via UnpinAllObjects, so it’s essentially helping with the mechanics involved in calling the underlying DLL.
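
On the native side, the pointer that arrives is just the address of that pinned block of memory, so the field order and packing have to agree with the sequential, Pack = 1 layout declared in C#. Here is a minimal sketch of that idea using a hypothetical, cut-down struct (DemoStats) and a hypothetical query function (Demo_QueryStats); neither is part of the toolkit, and the sketch assumes 4-byte int and float.

```cpp
#include <cstddef>

static_assert(sizeof(int) == 4 && sizeof(float) == 4, "sketch assumes 4-byte int and float");

// Hypothetical, cut-down analogue of PlayspaceStats. Pack = 1 on the C# side
// corresponds to 1-byte packing here, so no padding sits between fields.
#pragma pack(push, 1)
struct DemoStats
{
    int   IsWorkingOnStats;
    float HorizSurfaceArea;
    float TotalSurfaceArea;
    int   NumFloor;
};
#pragma pack(pop)

// The layout that both sides must agree on: four 4-byte fields, in order.
static_assert(sizeof(DemoStats) == 16, "unexpected padding in DemoStats");
static_assert(offsetof(DemoStats, NumFloor) == 12, "unexpected field offset");

// Hypothetical query: the caller owns the memory (in Unity it would be the
// pinned managed object) and the native code only writes into it.
extern "C" int Demo_QueryStats(DemoStats* stats)
{
    if (stats == nullptr)
    {
        return 0;
    }
    // Made-up values standing in for real statistics.
    stats->IsWorkingOnStats = 1;
    stats->HorizSurfaceArea = 12.5f;
    stats->TotalSurfaceArea = 60.0f;
    stats->NumFloor = 1;
    return 1;
}
```

The static_assert lines are the useful part: if a field is added on one side only, the mismatch shows up at compile time on the native side rather than as garbage values in Unity.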

Extending Functionality

I was looking into the library here because I wanted to extend the functionality provided.

There’s lots of different functionality supported including examples like;

  • Getting hold of meshes
  • Raycasting to determine what type of object the user is looking at (wall, ceiling, etc)
  • Getting the space alignment, floor, ceilings etc.
  • Creating objects of specific sizes on walls, floors, ceilings etc. or randomly away from those obstacles.
  • Finding positions for objects on walls, ceilings, etc.

and quite a lot more. What I wanted though was a simple list of the walls that the library has found within my environment. I didn’t find a method in the library that already did this, so I thought it might be useful to step through how I added that functionality, both for my own future reference and for anyone else that wants to do something similar.

The functionality relating to walls seems to reside in these classes in the native SpatialUnderstanding project;

image

and so the first thing I did was to visit TopologyAnalyzer_W.h and add a new method signature;


	void	GetWalls(Vec3fDA& _outPos, Vec3fDA& _outNormal, 
		FloatDA& _outWidths, FloatDA& _outLengths, Bool _bAllowVirtualWall = TRUE);

and then I added an implementation of this in the .cpp file which essentially just copies the centre, normal, width, height of a wall out of the array it resides in;

void TopologyAnalyzer_W::GetWalls(Vec3fDA& _outPos, Vec3fDA& _outNormal, 
	FloatDA& _outWidths, FloatDA&_outLengths, Bool _bAllowVirtualWall)
{
	for (S32 w = 0; w<m_daWalls.GetSize(); w++)
	{
		Wall& wall = m_daWalls[w];

		if (wall.m_bIsVirtual && !_bAllowVirtualWall)
			continue;

		_outPos.Add(wall.m_vCentroid);
		_outNormal.Add(wall.m_vNormal);
		_outLengths.Add(wall.m_fHeight);
		_outWidths.Add(wall.m_fWidth);
	}
}
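
To show the shape of that copy loop outside of the toolkit, here is a self-contained sketch in which std::vector stands in for the toolkit’s Vec3fDA/FloatDA arrays. DemoVec3, DemoWall and DemoWallList are hypothetical stand-ins for the analyzer’s own types, so treat this as an illustration of the pattern rather than the real implementation.

```cpp
#include <cassert>
#include <vector>

struct DemoVec3 { float x, y, z; };

// Hypothetical stand-in for the analyzer's Wall record.
struct DemoWall
{
    DemoVec3 centroid;
    DemoVec3 normal;
    float width;
    float height;
    bool isVirtual;
};

// Parallel output arrays, one entry per wall that passes the filter.
struct DemoWallList
{
    std::vector<DemoVec3> positions;
    std::vector<DemoVec3> normals;
    std::vector<float> widths;
    std::vector<float> lengths;
};

DemoWallList GetWalls(const std::vector<DemoWall>& walls, bool allowVirtualWalls)
{
    DemoWallList out;
    for (const DemoWall& wall : walls)
    {
        // Skip virtual walls unless the caller asked for them.
        if (wall.isVirtual && !allowVirtualWalls)
        {
            continue;
        }
        out.positions.push_back(wall.centroid);
        out.normals.push_back(wall.normal);
        out.widths.push_back(wall.width);
        out.lengths.push_back(wall.height);   // a wall's "length" is its height
    }
    // The four arrays must stay in step with each other.
    assert(out.positions.size() == out.lengths.size());
    return out;
}
```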

With that in place, I can modify the Dll_Topology.h and .cpp files to expose that new function;

	EXTERN_C __declspec(dllexport) int QueryTopology_FindWalls(
		_In_ int locationCount,
		_Inout_ TopologyResult* locationData);

and;

EXTERN_C __declspec(dllexport) int QueryTopology_FindWalls(
	_In_ int locationCount,
	_Inout_ Dll_Interface::TopologyResult* locationData)
{
	UnderstandingMgr_W &UnderstandingMgr = UnderstandingMgr_W::GetUnderstandingMgr();

	Vec3fDA outPos, outNormal;
	FloatDA outWidths, outLengths;

	UnderstandingMgr.GetPlayspaceInfos().m_TopologyAnalyzer.GetWalls(
		outPos, outNormal, outWidths, outLengths, FALSE);

	return(OutputLocations(locationCount, locationData, outPos, outNormal, outWidths,
		outLengths));
}

and that leans heavily on the existing OutputLocations function within that file.
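
I haven’t reproduced OutputLocations here, so the following is only a guess at the contract that it appears to fulfil, based on how it is called above: copy as many results as fit into the caller’s buffer and return how many were written. Every name in this sketch (DemoVec3, DemoTopologyResult, DemoOutputLocations) is hypothetical, with std::vector again standing in for the toolkit’s array types.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct DemoVec3 { float x, y, z; };

// Hypothetical flat result record, similar in spirit to TopologyResult.
struct DemoTopologyResult
{
    DemoVec3 position;
    DemoVec3 normal;
    float width;
    float length;
};

// Copies at most 'capacity' results into the caller-owned buffer and returns
// the number of entries written. Assumes the four input arrays are parallel.
int DemoOutputLocations(
    int capacity,
    DemoTopologyResult* buffer,
    const std::vector<DemoVec3>& positions,
    const std::vector<DemoVec3>& normals,
    const std::vector<float>& widths,
    const std::vector<float>& lengths)
{
    assert(positions.size() == normals.size());
    assert(positions.size() == widths.size());
    assert(positions.size() == lengths.size());

    if (buffer == nullptr || capacity <= 0)
    {
        return 0;
    }
    // Never write past the end of the space the caller said it allocated.
    const int count = std::min(capacity, static_cast<int>(positions.size()));
    for (int i = 0; i < count; ++i)
    {
        buffer[i] = DemoTopologyResult{ positions[i], normals[i], widths[i], lengths[i] };
    }
    return count;
}
```

Under that assumed contract, the caller can trust only the first returned-count entries of its buffer.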

I can then alter the PInvoke wrapper in SpatialUnderstandingDllTopology.cs in order to surface this new API;

        [DllImport("SpatialUnderstanding")]
        public static extern int QueryTopology_FindWalls(
            [In] int locationCount,             // Pass in the space allocated in locationData
            [In, Out] IntPtr locationData);     // TopologyResult

and then I’m set up to be able to make that call although naturally I need to make sure that I recompile my C++ code, get the resultant DLL and ensure that Unity has picked it up instead of the one that lives in the HoloToolkit-Unity by default.

Making Use of the Extension

I can then perhaps try this new ‘FindWalls’ functionality out by adding a method to the existing SpaceVisualizer to make this call and visualize it with rectangles;

        public void Query_Topology_FindWalls(int top)
        {
            // Only if we're enabled
            if (!SpatialUnderstanding.Instance.AllowSpatialUnderstanding)
            {
                return;
            }
            var resultsTopology = new SpatialUnderstandingDllTopology.TopologyResult[128];

            IntPtr resultsTopologyPtr =
                SpatialUnderstanding.Instance.UnderstandingDLL.PinObject(resultsTopology);
            
            int locationCount = SpatialUnderstandingDllTopology.QueryTopology_FindWalls(
                resultsTopology.Length,
                resultsTopologyPtr);

            if (locationCount != 0)
            {
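                // Only the first locationCount entries were filled in by the DLL,
                // so sort those by area and keep at most 'top' of them
                // (this needs 'using System.Linq;' at the top of the file).
                var rects = resultsTopology
                    .Take(locationCount)
                    .OrderByDescending(r => r.width * r.length)
                    .Take(top);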

                foreach (var rect in rects)
                {
                    float timeDelay = (float)lineBoxList.Count * AnimatedBox.DelayPerItem;
                    lineBoxList.Add(
                        new AnimatedBox(
                            timeDelay,
                            rect.position,
                            Quaternion.LookRotation(rect.normal, Vector3.up),
                            Color.blue,
                            new Vector3(rect.width, rect.length, 0.05f) * 0.5f));
                }
            }
        }

and I could then call that from my original code when the status of the scanning goes –> Done;

    void OnScanStateChanged()
    {
        if (SpatialUnderstanding.Instance.ScanState == SpatialUnderstanding.ScanStates.Done)
        {
            this.isScanning = false;
            this.isStopping = false;

            // Use Space Visualizer to find and draw (up to) the 10 largest walls
            SpaceVisualizer.Instance.Query_Topology_FindWalls(10);

        }
    }

and that produces some ‘interesting’ results in giving me walls around this chair;

image

but, naturally, I could then filter down by size or filter down by angle to remove some of those from my result set but I’ve managed to get the data that I wanted out from the native code and into my C# code so that I can further work on it.
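
That kind of filtering could look something like the sketch below: keep only results above a minimum area whose normal is close to horizontal (as a real wall’s would be), then sort them largest first and keep the top few. It’s written in C++ so that it stands alone without Unity; DemoWallResult, FilterWalls and the thresholds are hypothetical, and the normals are assumed to be unit length.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct DemoWallResult
{
    float normalX, normalY, normalZ;   // assumed to be unit length
    float width;
    float length;
};

float Area(const DemoWallResult& wall)
{
    return wall.width * wall.length;
}

// Keeps walls of at least minArea whose normal tilts no more than
// maxTiltDegrees away from horizontal, sorted largest first, capped at 'top'.
std::vector<DemoWallResult> FilterWalls(
    const std::vector<DemoWallResult>& walls,
    float minArea,
    float maxTiltDegrees,
    std::size_t top)
{
    const float pi = 3.14159265f;
    // A unit normal tilted t degrees from horizontal has |y| = sin(t).
    const float maxNormalY = std::sin(maxTiltDegrees * pi / 180.0f);

    std::vector<DemoWallResult> kept;
    for (const DemoWallResult& wall : walls)
    {
        if (Area(wall) >= minArea && std::fabs(wall.normalY) <= maxNormalY)
        {
            kept.push_back(wall);
        }
    }
    std::sort(kept.begin(), kept.end(),
        [](const DemoWallResult& a, const DemoWallResult& b) { return Area(a) > Area(b); });
    if (kept.size() > top)
    {
        kept.resize(top);
    }
    return kept;
}
```

Something like FilterWalls(walls, 1.0f, 10.0f, 10) would then drop the small fragments around a chair and anything that looks more like a floor or table top than a wall.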

It’s possible that I’ve missed a “get all walls” method in here so feel free to comment and let me know if that’s the case, but I thought I’d write up these rough notes in the meantime as I know I’ll be coming back to read them myself when I next revisit spatial understanding.