1. Overview

This tutorial describes how to use the evaluation scripts provided with the framework and how to create a new evaluation script. To perform the steps in this tutorial, you will need this archive. It contains:

- a set of slam log files (created by running a VSLAM system on an example sequence)
- the repository items that were used to create the sequence. These items are also required for ground truth tracing.

(The creation of a slam log is covered in the API Documentation.)

2. Getting started

First, let's get an overview of the directory layout of the framework's evaluation part. In the base directory you will find a directory evaluation, which initially contains the directories data and evaluation_scripts, a Makefile and a README. You may read the README if you can't resist.

$ cd evaluation
$ ls
data/ evaluation_scripts/ Makefile README

The data directory is where we will put the slam log files for later processing. The operations performed on these files are defined by a set of evaluation scripts, which we find under evaluation_scripts.

$ cd evaluation_scripts
$ ls
camera_errors.py    feature_inits_list.py          groundtruth_map.py
common.py           frame_errors.py                groundtruth_measurements.py
feature_clouds.py   gnuplots.py                    __init__.py
feature_errors.py   groundtruth_feature_inits.py   skeleton

The files common.py, skeleton and __init__.py have special meanings, which will be explained later. Each of the remaining scripts performs a dedicated task on the slam log files, such as computing the error of the estimated camera position with respect to the ground truth. We will analyse the structure of these files in the last part of this tutorial.

Now we need data to work on. Download this archive and extract it in the framework base directory. This will put example slam log data into the evaluation/data directory. In the next section, we will see how we run the evaluation on this data.

3. Performing Evaluation

Let's go back to the evaluation directory.

$ cd evaluation

The Makefile in this directory takes care of running all evaluation scripts (and resolving their dependencies) on every set of slamlog files it can find. Each set of slamlog files is expected to be placed in a subdirectory of data; the subdirectory may have any name and any nesting depth. This allows you to structure the results of your experiments. We have just one set of slamlog files in this tutorial.
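To make the discovery step more concrete, here is a rough Python sketch that walks a data directory and reports every subdirectory that looks like a set of slamlog files. It is not how the Makefile is implemented. The function name `find_slamlog_sets` is made up for this example, and the rule "a directory is a set if it contains all five files of the example set used in this tutorial" is an assumption.

```python
import os

# File names of the example set used in this tutorial. Treating their
# presence as the definition of a "set of slamlog files" is an assumption.
SLAMLOG_FILES = ("feature_inits.xml", "maps.xml", "measurements.xml",
                 "poses.xml", "setup.xml")


def find_slamlog_sets(data_dir):
    """Return the sorted paths of all directories below data_dir that
    contain every file listed in SLAMLOG_FILES."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(data_dir):
        if all(name in filenames for name in SLAMLOG_FILES):
            found.append(dirpath)
    return sorted(found)


if __name__ == "__main__":
    # prints nothing if there is no data directory or no complete set
    for path in find_slamlog_sets("data"):
        print(path)
```

Because `os.walk` descends recursively, the nesting depth of a set below data does not matter in this sketch.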

$ ls data
$ ls data/bmvc_09/
feature_inits.xml  maps.xml  measurements.xml  poses.xml  setup.xml

That's all we need to perform the evaluation. A call to make lets the actual work begin:

$ make

You'll see the output of the evaluation scripts. Each of the seven evaluation scripts states what it is going to do and produces more or less verbose output about it.

4. Inspecting the Results

After a few seconds, the work of the evaluation scripts should be done. The results are stored in subdirectories of the slamlog directory:

$ ls data/bmvc09/
feature_inits.xml  maps.xml          misc/   poses.xml
groundtruth/       measurements.xml  plots/  setup.xml

The outcomes of the evaluation scripts are written to the three directories groundtruth, misc and plots. Some of them are .xml files (the files in groundtruth). These files are used as intermediate results in the evaluation process to produce the second type of files, the .dat files in plots. These files contain relational data in plaintext and can therefore easily be inspected with a plotting program such as gnuplot. The misc directory is used for any file that is not of the two types. Currently, it just stores a file which is needed as input to the povray tracing implementation.

Each of the .dat files starts with a comment block describing its contents. For instance, let's take a look at camera_position_errors.dat:

# Camera Position Errors
# euclidean distance of estimated to ground truth camera positions
#:label camera position error
#:column_names time, error
#:column_units s, m
0.0 0.0
0.033333 0.00121525203304
0.066666 0.00620133129292
0.1 0.00370625882526
0.133333 0.00818630970122

This file contains the camera position errors, that is, the Euclidean distance between the estimated and the ground truth camera positions. The last two header lines indicate that the data is organised in two columns: the first value in every line is the time (in s), the second value is the error (in m).
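If you want to process such a file outside of gnuplot, the header convention is easy to read programmatically. The following is a minimal sketch; the function name `parse_dat` is made up for this example, and the interpretation of `#:` lines as "key value" pairs is inferred from the sample file above rather than taken from the framework.

```python
def parse_dat(text):
    """Split the text of a .dat file into metadata, comments and data rows.

    Lines starting with "#:" are read as "key value" metadata, other lines
    starting with "#" are free-form comments, and all remaining non-empty
    lines are whitespace-separated numeric columns.
    """
    metadata, comments, rows = {}, [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#:"):
            key, _, value = line[2:].partition(" ")
            metadata[key] = value.strip()
        elif line.startswith("#"):
            comments.append(line[1:].strip())
        else:
            rows.append([float(field) for field in line.split()])
    return metadata, comments, rows


example = """\
# Camera Position Errors
# euclidean distance of estimated to ground truth camera positions
#:label camera position error
#:column_names time, error
#:column_units s, m
0.0 0.0
0.033333 0.00121525203304
"""

metadata, comments, rows = parse_dat(example)
names = [name.strip() for name in metadata["column_names"].split(",")]
print(names)      # ['time', 'error']
print(rows[1])    # [0.033333, 0.00121525203304]
```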

For most of the .dat files the framework also generates plots (using gnuplot). For instance, camera_position_errors.ps is generated from the above file.

[Figure: example plot]

At the moment, more sophisticated plots (showing data from multiple experiments in one diagram, for example) must be generated by hand from the .dat files. We are currently writing a script that automates this and creates LaTeX code from plot files. Once it is mature enough, we will provide it together with our framework. Here is a combined example plot of the files camera_position_errors.dat and num-features.dat:

[Figure: combined example plot]

Note: For an explanation of the output files of each evaluation script, please refer to the manual.

The evaluation scripts initially provided by our framework perform fundamental tasks, such as computing the camera's position and rotation error, determining the map error with respect to the real feature positions, and detecting measurement errors. If you want to go beyond this functionality, you can do so by writing your own evaluation script.

5. Writing an Evaluation Script

In the previous section we saw the output of an evaluation script that computes the map error at every frame. We define the map error as the mean squared Euclidean distance between estimated and ground truth feature positions. Suppose we want to have a deeper understanding about the map error: we want to know how the error evolves for every individual feature. Let's write an evaluation script that performs this task.
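To make the definition concrete, here is a small sketch of the map error for a single frame. It is not the framework's implementation: the function names and the toy positions are invented for illustration, and features are represented as plain dictionaries from feature id to position.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def map_error(estimated, ground_truth):
    """Mean squared distance over all estimated features of one frame.

    Both arguments map a feature id to an (x, y, z) position.
    """
    if not estimated:
        raise ValueError("the estimated map contains no features")
    errors = [squared_distance(position, ground_truth[feature_id])
              for feature_id, position in estimated.items()]
    return sum(errors) / float(len(errors))


# invented toy data: two features, each slightly misplaced
ground_truth = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0)}
estimated    = {1: (0.0, 0.0, 0.5), 2: (1.0, 1.0, 0.0)}

print(map_error(estimated, ground_truth))   # 0.625
```

The single number 0.625 hides that feature 1 contributes 0.25 and feature 2 contributes 1.0, which is exactly the per-feature information the new script is meant to expose.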

As mentioned above, all evaluation scripts can be found under evaluation_scripts. Let's go there:

$ cd evaluation/evaluation_scripts

In this directory you will find a file called skeleton. It serves as a template for writing evaluation scripts. We will call our new script feature_errors.py. Let's copy skeleton and inspect its content:

$ cp skeleton feature_errors.py

#!/usr/bin/env python

from common import *

script = EvaluationScript(
    __file__,  # the script's filename (assumed form of the first argument)
    [], # your requirements go here (class Requirement)
    []  # your products go here     (class Product)
)

if __name__ == "__main__":

  # makes the paths to your requirements/products accessible
  # via the above specified variable names
  script.parseCommandLine()

  # perform evaluation, write results

The first line just states how the script has to be run and points out that this script needs to be executable.

$ chmod 755 feature_errors.py

The second statement tells Python to include all definitions from common.py in the same directory. This is basically the definition of the class EvaluationScript, which is used in the third statement. The instantiation of the class EvaluationScript tells the framework how the evaluation script file is called, which files it needs and which files it produces.

The if statement in the middle of the file is a bit of Python magic. Everything that follows is executed only if this script was called directly, i.e. was not imported from another script. Our framework relies on this: each evaluation script is imported indirectly to check its requirements and products. The actual work should only be done if the script is called directly. In that case, our framework passes some command line arguments to it, which need to be processed. This is done by the obligatory statement script.parseCommandLine().

Now let's extend this skeleton to our needs.

Each evaluation script has to instantiate EvaluationScript. The arguments of the constructor are the script's filename, its requirements (i.e. slamlog files or files created by other scripts) and its products (i.e. the files that are provided by this script).

For our task we need to know the estimated feature positions at every frame and the ground truth feature positions. The estimated feature positions can be found in the slamlog file maps.xml and the ground truth in groundtruth/map.xml, a file created by the evaluation script groundtruth_map.py. Each requirement is given to the constructor of EvaluationScript as an instance of the class Requirement. The arguments to the constructor of Requirement are the name of the file we need (without the path to the slamlog set) and the name of the variable that will hold the path to the file we requested.

In our case this means constructing two requirements:

Requirement("maps.xml",            "maps_file")
Requirement("groundtruth/map.xml", "groundtruth_file")

The third argument to the constructor of EvaluationScript is built in the same way: it is a list of the files our script creates, and the corresponding class is called Product. We create just one Product:

Product("plots/feature_errors.dat", "outfile")

Putting this all together, the EvaluationScript instantiation in our script reads as follows:

script = EvaluationScript(
    __file__,  # the script's filename (assumed form of the first argument)
    [Requirement("maps.xml",              "maps_file"),
     Requirement("groundtruth/map.xml",   "groundtruth_file")],
    [Product("plots/feature_errors.dat",  "outfile")]
)
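How the variable names given to Requirement and Product end up holding full paths is handled inside common.py, which this tutorial does not show. Purely to illustrate the idea, here is a self-contained sketch with stand-in classes. The classes `SketchFile` and `SketchScript`, the method `bind` and the directory name `data/example_set` are hypothetical and not part of the framework; the real implementation may work differently.

```python
import os


class SketchFile(object):
    """A file name relative to the slamlog set plus a variable name."""

    def __init__(self, filename, variable):
        self.filename = filename
        self.variable = variable


class SketchScript(object):
    """Hypothetical stand-in, NOT the framework's EvaluationScript."""

    def __init__(self, requirements, products):
        self.requirements = requirements
        self.products = products

    def bind(self, slamlog_dir):
        """Expose each declared file as an attribute holding its full path."""
        for entry in self.requirements + self.products:
            setattr(self, entry.variable,
                    os.path.join(slamlog_dir, entry.filename))


sketch = SketchScript(
    [SketchFile("maps.xml", "maps_file"),
     SketchFile("groundtruth/map.xml", "groundtruth_file")],
    [SketchFile("plots/feature_errors.dat", "outfile")])

# the directory of the slamlog set is only known at run time
sketch.bind("data/example_set")
print(sketch.maps_file)
print(sketch.outfile)
```

The point of declaring files up front is that the declarations can be inspected without running the evaluation, while the concrete paths are filled in only once a slamlog set has been chosen.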

The next step is the actual computation of the per-feature distances. We do this by appending source code to the end of the file. We need to iterate over the estimated map of each frame and compare every feature of that map with the ground truth. To parse the files maps.xml and groundtruth/map.xml we will use the XML parsers provided by the framework. To calculate the distance between two feature positions we will use numpy, a numerics library for Python. Therefore, we need to import certain modules first:

  # add this after the script.parseCommandLine() statement:

  from numpy                        import array, linalg
  from parsers.MapsParser           import MapsParser
  from parsers.GroundTruthMapParser import GroundTruthMapParser

As we stated in the constructor of EvaluationScript, we want access to the full paths of the files we require through variables named maps_file and groundtruth_file. These are accessible via the script object. We now use these paths to instantiate the XML parsers of the framework:

  # create xml parsers
  maps_parser         = MapsParser(script.maps_file)
  groundtruth_parser  = GroundTruthMapParser(script.groundtruth_file)

Note: There are parsers available for every XML format in the framework. See the manual for details.

The ground truth parser provides a list of features, each having an id. We need to request certain features by their id later, so it will be a good idea to create a lookup table to avoid searching in the list over and over again:

  # create groundtruth lookup table
  groundtruth = {}
  for feature in groundtruth_parser.features:
    groundtruth[feature.feature_id] = feature

Before we start with the outer iteration over every map in maps.xml, let's make our script say what it is doing:

  print("Calculating per feature errors")
  print("------------------------------")

This is the output you will see later when you type make in the evaluation directory.

Now let's perform the iterations and store the feature errors in a hash:

  # create feature error hash
  # it will contain the feature errors for every frame
  feature_errors          = {}

  for map in maps_parser.maps:

    # create a hash that will map from a feature id to the corresponding
    # error of this feature
    feature_errors[map.timestamp] = {}

    for feature in map.features:

      # get feature id
      id = feature.feature_id

      # get corresponding ground truth position as numpy array
      # (the y and z components are assumed to mirror the x component)
      gt = array([groundtruth[id].feature_position.x,
                  groundtruth[id].feature_position.y,
                  groundtruth[id].feature_position.z])

      # put estimated feature position in a numpy array
      ft = array([feature.feature_position.x,
                  feature.feature_position.y,
                  feature.feature_position.z])

      # calculate squared distance
      squ_dist = sum((a-b)**2 for a, b in zip(gt, ft))

      # store squared distance value with feature id
      feature_errors[map.timestamp][id] = squ_dist

As you can see, iterating over the parsed data is straightforward. The structure of the parser objects reflects the structure of the corresponding XML file: the maps parser holds a list of maps, each of which holds a list of features, which in turn have a feature_position, and so on. At the end of this block we have a hash that maps from timestamps (each representing a frame) to another hash. This second hash maps from feature ids to the error of the respective feature. All that is left is to write the results to the file we claimed to produce:

  # open the output file
  outfile = open(script.outfile, 'w')

  # write header
  outfile.write("# Feature Errors\n")
  outfile.write("# squared euclidean distance between estimated and ")
  outfile.write("real feature positions for every frame\n")
  outfile.write("#:label feature error\n")
  outfile.write("#:column_names time, feature id, error\n")
  outfile.write("#:column_units s, , m^2\n")

  for timestamp in sorted(feature_errors.keys()):

    for id in feature_errors[timestamp]:

      outfile.write(str(timestamp) + " ")
      outfile.write(str(id) + " ")
      outfile.write(str(feature_errors[timestamp][id]) + "\n")

  # close the file so that all results are flushed to disk
  outfile.close()

Writing the header is not necessary, but it is nice because it gives others a clue on how to use this file.
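Since the goal was to see how the error evolves for each individual feature, it can be handy to regroup the nested hash by feature id, for example before plotting. A minimal sketch; the helper name `errors_by_feature` is made up for this example and the sample values are invented.

```python
def errors_by_feature(feature_errors):
    """Turn {timestamp: {feature_id: error}} into
    {feature_id: [(timestamp, error), ...]}, ordered by timestamp."""
    series = {}
    for timestamp in sorted(feature_errors):
        for feature_id, error in feature_errors[timestamp].items():
            series.setdefault(feature_id, []).append((timestamp, error))
    return series


# invented sample data: feature 7 is first seen in the second frame
sample = {
    0.0:      {42: 0.25},
    0.033333: {42: 0.5, 7: 1.0},
}

print(errors_by_feature(sample)[42])   # [(0.0, 0.25), (0.033333, 0.5)]
```

Features that are initialised late simply get a shorter series, so no placeholder values are needed for frames in which a feature does not exist yet.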

That's all. Let's give our evaluation script a try. To introduce a new script to the framework, we need to edit the __init__.py we found in the current directory. Open it and add an entry for the new evaluation script:

__all__ = [ "common",
            "feature_errors",   # <--- add this line
            "groundtruth_measurements" ]

Now change to the evaluation directory and type make:

$ cd ..
$ make

After a few seconds, a new file called plots/feature_errors.dat will appear in the experiment directory. You can now use the plotting program of your choice to inspect the results. For example, the following gnuplot commands plot the error for feature 42:

set terminal postscript color rounded;
set output 'feature_error_42.ps';
set xlabel 'time [s]';
set ylabel 'squared position error [m^2]';
plot 'feature_errors.dat' using ($1):($2 == 42 ? $3 : 1/0) title 'feature error' with points;

Here is the resulting plot:

[Figure: example plot]
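The gnuplot expression `$2 == 42 ? $3 : 1/0` keeps the third column only for rows whose second column equals 42; for all other rows it evaluates to an undefined value, which gnuplot skips. The same selection can be sketched in Python. The function name `select_feature` and the sample lines are invented for illustration, and the sketch assumes that feature ids are written as integers.

```python
def select_feature(lines, feature_id):
    """Return (time, error) pairs for one feature from the lines of a
    feature_errors.dat file (columns: time, feature id, error)."""
    selected = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        time, current_id, error = line.split()
        # assumption: feature ids are written as integers
        if int(current_id) == feature_id:
            selected.append((float(time), float(error)))
    return selected


sample_lines = [
    "#:column_names time, feature id, error",
    "0.0 42 0.25",
    "0.0 7 1.0",
    "0.033333 42 0.5",
]

print(select_feature(sample_lines, 42))   # [(0.0, 0.25), (0.033333, 0.5)]
```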