Visualising Space-Time Data

Blog / Marco Tamassia / January 17, 2020

In this blog post, we show how to visualise space-time data using a visualisation called space-time cube using Python and Plotly. A space-time cube is a 3D cartesian plot where one of the three axes (the vertical, usually) represents time, while the others represent space.

This technique enables a data analyst to better visualise movements over time, be it player movements in sports and video games, or soldier movements in military simulations. This can help one see at a glance patterns both in space and in time, such as competing players meeting in certain areas of the map, or soldiers being more likely to succeed when reaching the objective location early.

Plotly allows us to produce interactive visualisations, which are much easier to inspect than static “snapshots” of space-time cubes. This allows the analyst to more easily form a mental model of the data being presented, as she explores the plot from various points of view, rotating it interactively to change perspective.

In the context of this article, we will be using a DOTA 2 dataset to generate a space-time cube. DOTA 2 is a video game played by two teams of 5 players with competing objectives and it is highly location-based, which makes it a perfect candidate for spatio-temporal analysis.

Showing 4D data on a surface

Visually analysing time-dependent data is a common task in many fields. This, however, becomes problematic when the data is collected in a spatial context. The longstanding problem is: how does one visualise 2 or 3-dimensional data that evolves over time using computer screens and paper? In other words, how do you visualise 4-dimensional data on mediums that are inherently 2-dimensional? There are different approaches one can take.

The first option is reducing the dimensionality of your data. For example, Principal Component Analysis and Self-Organising Maps can sometimes be used to produce a “good-enough” 2-dimensional representation of 3-dimensional data. More simply, some problems do not require all of the original dimensions: for example, movement data on a flat terrain do not require an extra dimension to represent altitude.

A second option is to use time as an extra dimension. This is nothing new: for more than a century people have been using a fast-changing sequence of images to trick the eye into perceiving something as smoothly moving over time. This technique has been widely used to produce content such as animations, movies films, etc. Animations augment your medium of visualisation by a dimension, giving you more space to manoeuvre and represent your data.

A third option is to use stereoscopic images to transmit to the brain the third dimension using a 2-dimensional medium. The method relies on the ability of the brain to “understand” two images (of a 3-dimensional scene) taken at slightly different angles as a 3-dimensional scene. This, however, requires that each image is sent to one of the eyes; options to accomplish this include 1) Virtual Reality headsets, 2) polarised glasses, 3) bi-coloured goggles and 4) simply crossing your eyes (see Figure 1). The requirement for additional equipment and the awkwardness of crossing one’s eyes make this option not widely viable.

Fig.1 – Hyades – the movement of stars in 300,000 years. To view this picture you need cross-eyed viewing (source: Wikipedia).

Chromostereopsis is another option, which we mention for completeness even though it is probably too rudimental to be a valid alternative. Chromostereopsis is a visual illusion whereby different colours physically lying at the same distance from the eye are perceived at slightly different distances. The effect is due to chromatic aberration, where an imperfect lens, such as the human pupil, refracts different wavelengths differently and causes them to be perceived at different distances. The effect is subtle and arguably annoying, but it has widely been used to create visual illusions.

Fig.2 – An example of chromostereopsis: the red circle on the inside should look like it is slightly deeper than the blue ring on the outside.

One more option is to simulate an extra dimension on a 2-dimensional medium. This has been done for decades in movies and video-games: the scene is shown as flat, but the brain perceives it in 3 dimensions. This can be accomplished using perspective and parallax movement, which transmit depth information to the brain.

In this blog post we focus on the latter option, and in particular on how to achieve an interactive 3-dimensional visualisation in Python using the Plotly library. Plotly provides the perspective representation out-of-the-box, and its plots are interactive, which provides the movement much needed by the brain to interpret the plot effectively. This type of visualisation is called a space-time cube and was invented in the 70s by Torsten Hägerstraand [1].

Showing images in Plotly 3D spaces

When representing data whose context is spatial, plotting a terrain is most helpful to understand “where” the data points are located. Unfortunately, Plotly at the moment does not support plotting a 2-dimensional image in a 3-dimensional plot (e.g. sticking it onto a plan embedded in the 3D scene).

However, it does support colouring a flat surface with a colourmap based on some value. We will bend this feature to our needs; that is, to visually represent an approximation of a terrain. The trick we will use involves two steps:

  1. associating a scalar value to each pixel, and
  2. associating values to colours, the same way heatmaps do.

The trick is to create a custom colourmap. Our colourmap, instead of going from, say, green to black, will include a selection of colours representative enough that we can draw the map using those colours only.

As a use-case, we will take a video game, Defense of the Ancients 2 (more commonly known as DotA 2). The game sees 2 teams of 5 players each, each team controlling half of the map, and having a base deep in their controlled area. The two teams fight each other with the objective to destroy the opponents base. Figure 3 shows a sketch of the map.

Fig.3 – The map used in DotA 2 (version 7.07, source: Gamepedia).

To start with, we need to reduce the number of colours present in the image. We can achieve this in several ways. To keep this blog post limited, we will use an external tool, ImageMagick to do the job. While we are at it, we will also crop and resize the image, because high resolutions are rendered slowly and high numbers of colours will look worse (more on this later).

Resize and reduce the number of colours in the image

convert Minimap_7.07.png \
  -crop 1000x940+10+30 \
  -resize 640000@ \
  +dither \
  -colors 64 \
  Minimap_7.07_64colors.png

I will be assuming you are running the code in a Jupyter Notebook. The requirements for running the code in this blog post are:

Requirements list

Cython==0.29.14
numpy==1.17.4
scipy==1.4.0
pandas==0.25.3
scikit-learn==0.22
matplotlib==3.1.2
plotly==4.4.1
scikit-image==0.16.2
imageio==2.6.1
Jupyter==1.0.0
colormath==3.0.0
ACO-pants==0.5.2

First of all, let’s import all the modules and functions we will need.

Import all the tools we need

import re
import subprocess
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import plotly.graph_objs as go
from copy import deepcopy
from imageio import imread
from matplotlib.pyplot import imshow
from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True)
%matplotlib inline

We can now load and display the image using Python with the following code.

Load and show map

img = imread('images/Minimap_7.07.png') / 255
plt.figure(figsize = (10,10))
imshow(img)
print(f"Size: {img.shape[0]}x{img.shape[1]} {img.dtype}")

Python informs us that this image has a resolution of 387×412, with 4 values for the colours (RGB plus alpha). The next step is to find the unique colours chosen by ImageMagick; that is, all the colours present in the image. We can also display the colours found.

Find the colour palette

# Find and show unique colors
img_array = img[:, :, :3].reshape((img.shape[0] * img.shape[1], 3))
colors = np.unique(img_array, axis=0)
n_colors = colors.shape[0]

def show_colors(colors):
    # colors_matrix = np.reshape(colors, [4, n_colors // 4, 3])
    imshow(np.reshape(colors, (1, -1, 3)), aspect='auto')
    plt.xticks([])
    plt.yticks([])
    plt.gcf().set_size_inches(10, 1)
show_colors(colors)

Python informs us that there are 64 colours, which is what we expect since that is how many colours we asked ImageMagick to use. We then use the colours to create a custom Plotly colourmap, which is nothing but a list of tuples with a float as the first element and a string in the format 'rgb({}, {}, {})' as the second element, with integer values in the range 0-255. Plotly uses linear interpolation to determine the colours in between the points we provide.

Create a custom Plotly colourmap

# Create a custom colormap
color_to_value = {tuple(color[:3]): i / (n_colors - 1) for i, color in enumerate(colors)}
my_cmap_ply = [(value, 'rgb({}, {}, {})'.format(*color)) for color, value in color_to_value.items()]

Now that we have a colourmap, we map each pixel to the value that corresponds to its colour within the colourmap.

Convert RGBs to appropriate values based on the colourmap

# Map pixels to values
fun_find_value = lambda x: color_to_value[tuple(x[:3])]
values = np.apply_along_axis(fun_find_value, 2, np.flipud(img))

We are now ready to show the terrain map in a 3-dimensional plot.

Show the map on a 3D interactive plot

# Display terrain
yy = np.linspace(0, 1, img.shape[0])
xx = np.linspace(0, 1, img.shape[1])
zz = np.zeros(img.shape[:2])

surf = go.Surface(
    x=xx, y=yy, z=zz,
    colorscale=my_cmap_ply,
    surfacecolor=values,
    showscale=False
)
fig = go.Figure(data=[surf], layout=go.Layout())
iplot(fig, filename='terrain.html')

Visualising player movements over time

The terrain is nice and all, but it is not very useful in itself. The core point of space-time cubes is visualising data, so let’s add data!

The good Prof. Anders Drachen from the DC Labs at the University of York happens to have a dataset of DotA2 games, including movement data. The dataset is not publicly available, but he allowed us to showcase a match for this blog post.

he data of each match was provided in a CSV file with a series of columns containing heroes data over time, plus a column indicating the time, labelled tick. We will use this data for the rest of the post. The columns we will use are spatial coordinates, cumulative death counters and a flag indicating whether the hero was alive at the time (as opposed to being dead and waiting to respawn). Column names follow a specific pattern, which can be matched against to extract information about the players. The data in all columns (except time) is sparse, meaning that values are only present when a value changes. Let’s load the data, forward-fill the data, and visualise a random row:

Load, forward-fill and display data

# Load and display data
df = pd.read_csv('data/2842231742.csv')
cols = [col for col in df.columns if
        col.startswith('X_') or
        col.startswith('Y_') or
        col.startswith('Deaths_') or
        col.startswith('IsAlive_')
       ] + ['tick']
df = df[cols]
df.ffill(inplace=True)
print(df.iloc[1000,:])

The snippet above outputs:

Deaths_0_R_enigma_L                  0.00000
IsAlive_0_R_enigma_L                 1.00000
X_0_R_enigma_L                   -5029.00000
Y_0_R_enigma_L                    1187.00000
...
Deaths_9_D_juggernaut_W              2.00000
IsAlive_9_D_juggernaut_W             1.00000
X_9_D_juggernaut_W               -5471.00000
Y_9_D_juggernaut_W                5630.00000
tick                               176.70154
Name: 1000, dtype: float64

Now let’s extract players information from the columns and have a look at it:

Extract players information

# Infer players data from column names
players = set()
pattern = re.compile(r'[^_]+_([0-9])_(R|D)_([a-z_]+)_(W|L)')
for col in df.columns:
    match = re.match(pattern, col)
    if match:
        player_id, team, hero, outcome = match.groups()
        players.add((player_id, team, hero, outcome))
print("\n".join([str(t) for t in sorted(players)]))

which outputs

('0', 'R', 'enigma', 'L')
('1', 'R', 'life_stealer', 'L')
('2', 'R', 'axe', 'L')
('3', 'R', 'witch_doctor', 'L')
('4', 'R', 'earth_spirit', 'L')
('5', 'D', 'ursa', 'W')
('6', 'D', 'disruptor', 'W')
('7', 'D', 'bounty_hunter', 'W')
('8', 'D', 'legion_commander', 'W')
('9', 'D', 'juggernaut', 'W')

Now we have player IDs, their team, hero names and whether the team won or lost. Instead of generating movement traces straight-away, we will first create a dictionary containing the style for each hero in Plotly format. We will refer to these styles later, when plotting the legend.

Setting the styles

# Set styles
styles = {}
for player_id, team, hero, outcome in players:
    color = '#0088FF' if team == 'R' else '#FF530D'
    styles[hero] = {
        'mode': 'lines',
        'line': go.scatter3d.Line(color=color),
        'legendgroup': hero.replace('_', ' ').title(),
        'name': '{hero} ({team})'.format(hero=hero.replace('_', ' ').title(), team=team)
    }

The next step is to generate Plotly traces by iterating through players. Instead of drawing a continuous line start-to-end, we will split the data of each hero using the deaths counter column. We do this because in DotA2, when heroes die, they re-appear in their base: deaths would be shown as long, straight lines, which could be confusing.

Generating movement traces

# Generate movement traces
traces = []
col_suffix_pattern = '_{player_id}_{team}_{hero}_{outcome}'
for player_id, team, hero, outcome in players:
    col_suffix = col_suffix_pattern.format(player_id=player_id, team=team, hero=hero, outcome=outcome)
    for _, sub_df in df.groupby('Deaths' + col_suffix):
        sub_df = sub_df[sub_df['IsAlive' + col_suffix] == 1]
        xx = sub_df['X' + col_suffix].values
        yy = sub_df['Y' + col_suffix].values
        zz = sub_df['tick'].values
        style = styles[hero]
        trace = go.Scatter3d(
            x=xx, y=yy, z=zz,
            showlegend=False,
            **style
        )
        traces.append(trace)

Instead, we will draw death “teleportations” as dashed lines.

Generating death teleportation traces

# Generate death traces
for player_id, team, hero, outcome in players:
    col_suffix = col_suffix_pattern.format(player_id=player_id, team=team, hero=hero, outcome=outcome)

    spawn_locs, death_locs = [], [(np.nan, np.nan, np.nan)]
    for _, sub_df in df.groupby('Deaths' + col_suffix):
        sub_df = sub_df[sub_df['IsAlive' + col_suffix] == 1]
        xx = sub_df['X' + col_suffix].values
        yy = sub_df['Y' + col_suffix].values
        zz = sub_df['tick'].values
        # Save spawn and death location for this "life"
        spawn_locs.append((xx[0], yy[0], zz[0]))
        death_locs.append((xx[-1], yy[-1], zz[-1]))
    spawn_locs.append((np.nan, np.nan, np.nan))

    # Pairwise iterate death and spawn locations (misaligned on purpose with those NaNs)
    for death_loc, spawn_loc in zip(death_locs, spawn_locs):
        style = deepcopy(styles[hero])
        # noinspection PyTypeChecker
        style['line'] = go.scatter3d.Line(color=style['line']['color'], dash='dash')
        xx = [death_loc[0], spawn_loc[0]]
        yy = [death_loc[1], spawn_loc[1]]
        zz = [death_loc[2], spawn_loc[2]]
        trace = go.Scatter3d(
            x=xx, y=yy, z=zz,
            showlegend=False,
            **style
        )
        traces.append(trace)

The last step is to add a legend which can be used to turn off single hero lines (hence having a style for each hero instead of one per team).

Setting up the legend

# Setup legend
for legend_group, style in styles.items():
    trace = go.Scatter3d(
        x=[np.nan], y=[np.nan], z=[np.nan],
        **style
    )
    traces.append(trace)

The final step is to generate and save the final result. We will use the orthographic projection, which allows to more easily inspect space-time cubes from the top or the sides.

Visualising the result

yy = np.linspace(-8000, +8000, img.shape[0])
xx = np.linspace(-8000, +8000, img.shape[1])
zz = np.full(img.shape[:2], -90)

surf = go.Surface(
    x=xx, y=yy, z=zz,
    colorscale=my_cmap_ply,
    surfacecolor=values,
    showscale=False
)

layout = go.Layout(
    margin=dict(l=0,r=0,b=0,t=0),
    scene=go.layout.Scene(
        xaxis=go.layout.scene.XAxis(title='', showticklabels=False),
        yaxis=go.layout.scene.YAxis(title='', showticklabels=False),
        zaxis=go.layout.scene.ZAxis(title='Time (s)'),
        aspectratio=dict(x=1, y=1, z=1.3),
        camera=go.layout.scene.Camera(
            projection=go.layout.scene.camera.Projection(
                type='orthographic'
            )
        )
    )
)
fig = go.Figure(data=[surf] + traces, layout=layout)
iplot(fig, filename=f'terrain-64colors+paths.html')

Bonus – Sorting colour palette for enhanced aesthetics

If you zoom in on the first terrain visualisation, you can notice artefacts between colours. This is an unwanted consequence of bringing down a 3-dimensional space (the space of RGB colours) to a 1-dimensional space (the colourmap). When we did that, we artificially put an order to the colours, and now when Plotly wants to smoothly transition from colour A to colour B, it will use all the colours in between A and B. Since the colours are arranged in whatever order numpy found them in, it can happen that black is in between two shades of green: this will cause there being a black line between any two adjacent pixels with those shades of green. There is no way around this, but there is a way to mitigate this effect: sorting the colours such that they are in a “visually smooth” order, whatever that means.

The way I approached this problem is to find the “shortest visual path” through all the colours. This is an instance of the famous Traveling Salesman Problem, and as such is NP-hard. As far as science knows, there is no way to solve this problem efficiently; that is, there is no way to find the best solution. However, there are a number of algorithms to compute approximate solutions. The first coming to my mind are meta-heuristics, strategies to solve optimisation problems that “tend to work”; the most famous examples are evolutionary computation algorithms. Conveniently, there is a Python package that approximately solves the TSP problem using ant-colony optimisation, ACO-pants.

We can use the library to solve our colours sorting problem if we can provide a function computing our notion of “visual distance” between two colours. It turns out this is a common enough problem that standards have been created and that a Python implementation exists. Enter Delta E CIE 2000 and the colormath package.

We are all set to solve the problem.

Sort colours “visually”

from colormath.color_diff import delta_e_cie2000
from colormath.color_objects import LabColor, sRGBColor
from colormath.color_conversions import convert_color
from pants import World, Solver
def rgb_distance(color1, color2):
    color1 = sRGBColor(*color1)
    color2 = sRGBColor(*color2)
    color1 = convert_color(color1, LabColor)
    color2 = convert_color(color2, LabColor)
    return float(delta_e_cie2000(color1, color2))
colors = [tuple(c) for c in colors]
solution = Solver().solve(World(colors, rgb_distance))
colors = np.array(solution.tour)
print(f"Sorted colors array: {colors.shape[0]}x{colors.shape[1]}")
show_colors(colors)

Conclusion

In this blog post, we saw how to visualise space-time data using Python and Plotly, and dealt with the library limitations to achieve our objective. This type of visualisation, while not new, has only recently become accessible to many with the advent of powerful visualisation libraries such as Plotly. I hope space-time cubes will help you to find great insights into your data and bring you some nerd joy! You can find the whole code used in this post in this repository.

Bibliography

  1. T. Hägerstraand, “What about people in regional science?,” Papers in regional science, vol. 24, no. 1, pp. 7–24, 1970.

Header image by Aksonsat Uanthoeng from Pexels.