A Modern Approach to 3D Data Visualization

By Team "Move37"

In today's technology-driven world, most of our everyday tools have moved to web-based applications. New and promising technologies such as web components, progressive web apps and WebAssembly have proven that the web is capable of far more than displaying simple informative websites, and they confirm that people increasingly prefer tools they can use anywhere. The same trend can be observed with virtual reality headsets, with new and improved devices being released that work in a standalone format. To push the limits even further, we implemented a tool that visualizes 3D word embeddings on both the web and in virtual reality.

What are we trying to create?

We were commissioned by KdG and Textgain to create a platform that visualizes 3D word embeddings in an interactive and intuitive way. Word embeddings are a technique from natural language processing that encodes the meaning of words as vectors, so that related words end up close to each other. These datasets carry a lot of information in their raw, high-dimensional format, so to display them in a 3D environment Textgain reduced each embedding to just three numbers, which translate directly to X, Y, Z coordinates. Our project focuses on rendering those reduced datasets in the most user-friendly way possible.
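To give an idea of the data we work with, a reduced entry could look something like this (the field names here are our own illustration, not Textgain's actual format):

// Illustrative shape of one reduced word embedding (field names are assumptions):
{ "word": "language", "x": 0.42, "y": -1.87, "z": 3.05 }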

The application we designed consists of two major components: a web component and a virtual reality component. For the web component we concentrated on an easy-to-use interface and seamless navigation through the dataset, which makes it ideal for getting an immediate overview of the complete dataset. For the virtual reality component we focused more on interaction: the user is fully immersed in an environment of words and can grab individual words, fly around in the word cloud, as we like to call it, or fly towards a selected word.

Our technology stack

For the web application we chose React, one of the best-known and most performant front-end libraries available. To visualize the 3D environments we selected three.js, a lightweight JavaScript library that uses WebGL to render its scenes. To integrate the rendering library into React, we used a rendering tree that wraps three.js objects in React components. This let us use the full power and flexibility of React by making our three.js components respond to state changes.
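As a minimal sketch, such a rendering tree looks like the following, assuming a react-three-fiber-style renderer (which matches the JSX in our code sample further below); the scene contents are purely illustrative:

import React from 'react';
import { Canvas } from 'react-three-fiber';

// The Canvas sets up the WebGL renderer, scene and camera;
// everything inside it is a three.js object expressed as a React component.
export default function Scene() {
  return (
    <Canvas camera={{ position: [0, 0, 50] }}>
      <ambientLight intensity={0.5} />
      <pointLight position={[10, 10, 10]} />
      <mesh position={[0, 0, 0]}>
        <sphereBufferGeometry attach="geometry" args={[1, 8, 6]} />
        <meshStandardMaterial attach="material" color="purple" />
      </mesh>
    </Canvas>
  );
}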

For the VR application, we used the Unity game engine and the C# language. To visualize the 3D environment we built our own prefab around a TextMesh, so the words themselves are written out instead of being represented by spheres as in the web application. To make everything work in VR we used the Oculus VR assets, which let us support VR without the hassle of building everything ourselves.


Performance challenges

One of the most challenging aspects of this modern approach was finding the right balance between performance and usability. This was mostly an issue in the web application, since JavaScript is an interpreted language and therefore slower than the compiled C# that Unity uses.

Most of our performance problems came from the large amount of data we had to process. Our first approach used a separate mesh for each and every word rendered in the scene, and we quickly realised that rendering that many words was not feasible on most machines. We spent a good amount of time perfecting this method, with a lot of testing and benchmarking, and concluded that with separate meshes we could render at most 5,000 words before running into performance and usability problems.

That's the moment we started our research on performance. Our main goal was to reduce the total number of draw calls sent to WebGL, because every draw call adds to the time it takes to generate a single frame. To put this into perspective: for a smooth experience we want to render 60 frames per second, which gives us 1000 milliseconds to complete 60 frames, or 16.667 milliseconds per frame. Whenever the work for a frame, including all of its draw calls, takes longer than 16.667 milliseconds, our frame rate drops below 60.

The solution to our draw call problem was an instanced mesh. As the name suggests, an instanced mesh is a special version of a mesh with support for instanced rendering: it can render a large number of objects that share the same geometry and material but have different world transformations. This gave us excellent performance, with only a single draw call for the entire dataset.

<instancedMesh
      ref={mesh}
      args={[null, null, points]}
      onClick={onClickHandler}
      onPointerOver={onHoverHandler}
    >
      {/* args: the geometry and material come from the children (hence the two
          nulls); `points` is the number of instances to draw */}
      {/* a low-poly sphere: radius, width segments, height segments */}
      <sphereBufferGeometry attach="geometry" args={[size / 2, 8, 6]}>
        {/* one RGB triplet per instance, so every sphere can have its own color */}
        <instancedBufferAttribute
          ref={attribute}
          attachObject={['attributes', 'color']}
          args={[colorArray, 3]}
        />
      </sphereBufferGeometry>
      {/* vertexColors makes the material read the per-instance color attribute */}
      <meshStandardMaterial
        attach="material"
        vertexColors
      ></meshStandardMaterial>
    </instancedMesh>

This code is everything we need to render our words, in the form of spheres, in a single draw call.
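The JSX only declares the mesh; the position of each sphere is set imperatively inside the same component. A minimal sketch of that step, assuming `mesh` is the ref from the snippet above and `words` holds the reduced X, Y, Z coordinates:

import { useEffect } from 'react';
import * as THREE from 'three';

// Sketch: write one transformation matrix per word once the data is available.
useEffect(() => {
  const matrix = new THREE.Matrix4();
  words.forEach((word, i) => {
    matrix.setPosition(word.x, word.y, word.z); // place instance i at the word's coordinates
    mesh.current.setMatrixAt(i, matrix);
  });
  mesh.current.instanceMatrix.needsUpdate = true; // re-upload the matrices to the GPU
}, [words]);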

In the VR application we had little to no performance trouble rendering the dataset itself. Issues did start to arise, however, when we added a usability tweak that makes every rendered word face the user's camera no matter where the user is positioned in the VR world. With datasets that can exceed 20,000 words this caused massive performance hits, which is no surprise: calling a method on every word, every frame, is bound to degrade performance. To remedy this, we made sure that only the words currently seen by the user's camera execute this method, rather than every single word.
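A simplified C# sketch of that idea (the class and field names are illustrative, not our exact implementation): Unity's visibility callbacks tell us when a renderer enters or leaves a camera's view, so the LookAt call can be skipped for off-screen words.

using UnityEngine;

// Attached to each word prefab: rotate the label toward the camera,
// but only while the word is actually visible.
public class WordBillboard : MonoBehaviour
{
    private Transform cameraTransform;
    private bool isVisible;

    private void Start()
    {
        // Assumes the player camera is tagged "MainCamera".
        cameraTransform = Camera.main.transform;
    }

    // Unity invokes these when a camera starts or stops rendering this object.
    private void OnBecameVisible() { isVisible = true; }
    private void OnBecameInvisible() { isVisible = false; }

    private void Update()
    {
        if (!isVisible) return; // skip the work for off-screen words

        // Face the camera; rotate 180 degrees so the TextMesh reads front-on.
        transform.LookAt(cameraTransform);
        transform.Rotate(0f, 180f, 0f);
    }
}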

Usability

Another important task was the overall usability of the tools we created. We developed every feature with user experience in the back of our minds: a feature that works but is barely usable, or too complex, is not a feature worth having.

On the front end we provide the user with floating boxes that contain the basic functionality: information about the selected words, the configuration details that define how the cluster behaves, and a search input to look up a word.

[Figure: the floating information, configuration and search boxes]

For better visualization of the selected word and its surrounding neighbours, we color code the spheres in the cluster. Red represents the currently selected word, and purple indicates the words that are in range according to the relation delta. When hovering over a word, its sphere turns green and an information box appears near the bottom of the screen.

[Figure: color-coded spheres showing a selected word and its in-range neighbours]
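Under the hood, hovering only rewrites one RGB triplet in the shared color buffer from the instanced mesh snippet above. A sketch of such a handler, assuming `colorArray` and `attribute` are the array and ref from that snippet:

import * as THREE from 'three';

// Sketch: recolor a single hovered instance.
const onHoverHandler = (event) => {
  const i = event.instanceId;                          // index of the hovered sphere
  new THREE.Color('green').toArray(colorArray, i * 3); // overwrite its r, g, b values
  attribute.current.needsUpdate = true;                // re-upload the color buffer
};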

We have also added support for a different view mode on the front end, which we call 'text'. When this view mode is toggled, the selected word appears as 3D text along with its neighbouring words, and all words that are not in range are no longer rendered. This improves both readability and performance.

[Figure: the 'text' view mode showing the selected word and its neighbours as 3D text]
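Deciding which words count as "in range" is a simple distance check against the relation delta. A hypothetical helper, assuming plain Euclidean distance in the reduced 3D space:

// Hypothetical helper: keep only the neighbours within the relation delta.
const wordsInRange = (selected, words, delta) =>
  words.filter((word) => {
    const dx = word.x - selected.x;
    const dy = word.y - selected.y;
    const dz = word.z - selected.z;
    return Math.sqrt(dx * dx + dy * dy + dz * dz) <= delta;
  });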

In the VR application, we provide the user with a basic, minimalistic UI. This was very important to us, since we wanted the focus aimed at the word cloud instead of the UI built around it. We therefore created a wrist menu where the user can search for a word, load one of the available datasets to explore, and toggle the relationship lines on and off.

To make sure the user doesn't feel like they are somewhere else when switching from web to VR or the other way around, we also used the exact same color scheme across both apps.

[Figure: the VR application's wrist menu]

Conclusion

This story doesn't end here. We think there is still a lot to discover, especially within three.js; for more details, be sure to visit the three.js documentation. And even though Unity isn't typically used for visualizing data, we really enjoyed exploring this environment and learned a lot about the game engine.