Orientation Problems? A GPX comparison

On a trail run with Lukas we felt lost quite often, even though we had planned the tour beforehand and had a GPX file. We don't have navigation watches (yet), and getting the phone out of your pocket is a bit tedious. So I want to compare our planned and actual route using Python.

Loading GPX Data

First, gpxpy is used to load the GPX files. The data is summarized and stored in a pandas DataFrame for later use. In addition, the elevation difference, cumulative distance and vertical speed are calculated.

import os
import pandas as pd
import matplotlib.pyplot as plt
import gpxpy
from geopy import distance
import folium


def load_gpx(file_name):
    # Use a context manager so the file handle is closed after parsing
    with open(file_name) as gpx_file:
        gpx = gpxpy.parse(gpx_file)
    track = gpx.tracks[0]
    segment = track.segments[0]
    print(f"File: {file_name}")
    print(f"Points: {len(segment.points)}")
    print(f"Length: {segment.length_2d() / 1000:.1f} km")
    print(f"Up:     {segment.get_uphill_downhill().uphill:.0f} m")
    print(f"Down:   {segment.get_uphill_downhill().downhill:.0f} m\n")

    # Load the data into a Pandas dataframe (by way of a list)
    data = []
    for point_idx, point in enumerate(segment.points):
        data.append([point.longitude, point.latitude, point.elevation,
                     point.time, segment.get_speed(point_idx)
                     ])

    columns = ['long', 'lat', 'ele', 'time', 'speed']
    gpx_df = pd.DataFrame(data, columns=columns)

    # Calculate elevation diff, cumulative distance and vertical speed
    gpx_df['ele_d'] = gpx_df.ele.diff(1)
    gpx_df = get_distance(gpx_df)
    if segment.has_times():
        # Time difference between consecutive points as float seconds
        gpx_df['td'] = pd.to_timedelta(gpx_df.time.diff(1)).dt.total_seconds()
        gpx_df['vert_speed'] = gpx_df.ele_d / gpx_df.td
    return gpx_df


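def get_distance(_df):
    _df = _df.reset_index(drop=True)
    # Accumulate the geodesic distance between consecutive points (in meters).
    # Building a list avoids chained assignment on the DataFrame.
    dist = [0.0] if len(_df.index) else []
    for pnt in range(1, len(_df.index)):
        step = distance.distance((_df.lat.iat[pnt - 1], _df.long.iat[pnt - 1]),
                                 (_df.lat.iat[pnt], _df.long.iat[pnt])).meters
        dist.append(dist[-1] + step)
    _df['dist'] = dist
    return _df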


tour = 'brecherspitz'

df_plan = load_gpx(os.path.join(tour, 'plan.gpx'))
df_track = load_gpx(os.path.join(tour, 'track.gpx'))
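
To make the cumulative distance step more concrete, here is a minimal sketch in plain Python. It is not the code used above: it swaps geopy's ellipsoidal geodesic for the simpler haversine formula on a spherical Earth, so results differ slightly. The names `haversine_m` and `cumulative_distance_m` and the sample coordinates are made up for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def cumulative_distance_m(points):
    """Running distance along a list of (lat, lon) tuples, starting at 0."""
    total = 0.0
    cumulative = []
    for i, (lat, lon) in enumerate(points):
        if i > 0:
            prev_lat, prev_lon = points[i - 1]
            total += haversine_m(prev_lat, prev_lon, lat, lon)
        cumulative.append(total)
    return cumulative


# Three made-up points, each 0.001 degrees of latitude apart
demo_points = [(47.000, 11.0), (47.001, 11.0), (47.002, 11.0)]
demo_dist = cumulative_distance_m(demo_points)
print([round(d, 1) for d in demo_dist])
```

The loop has the same structure as `get_distance`: each entry is the previous total plus the distance of the latest step.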

The summary yields interesting results. We know that we didn't follow the path all the time, but we definitely didn't do more than 1000 m of extra vertical. An extra 5 km also seems too much. So it's time to look closer at the GPS data.

[Out]
File: brecherspitz\plan.gpx
Points: 1163
Length: 20.0 km
Up:     1874 m
Down:   1871 m

File: brecherspitz\track.gpx
Points: 6723
Length: 25.4 km
Up:     3006 m
Down:   3010 m
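
One effect that may contribute to such an inflated vertical gain is that summing every small positive elevation step also sums measurement noise. The sketch below uses made-up elevation values, not the recorded track, and a hypothetical helper `total_ascent` with an optional `min_step` threshold. It does not claim to reproduce how gpxpy computes uphill and downhill.

```python
def total_ascent(elevations, min_step=0.0):
    """Sum uphill meters, counting a change only once it exceeds min_step
    relative to the last reference elevation."""
    ascent = 0.0
    reference = elevations[0]
    for ele in elevations[1:]:
        delta = ele - reference
        if delta > min_step:
            # Climb is large enough: count it and move the reference up
            ascent += delta
            reference = ele
        elif delta < -min_step:
            # Descent is large enough: move the reference down
            reference = ele
    return ascent


# A flat trail at about 1000 m with a few meters of made-up sensor noise
noisy_flat = [1000, 1002, 999, 1001, 998, 1001, 1000]

naive = total_ascent(noisy_flat)                     # every positive step counts
thresholded = total_ascent(noisy_flat, min_step=5.0)  # ignore changes up to 5 m
print(naive, thresholded)
```

On this toy series the naive sum reports 7 m of climbing on a flat trail, while the thresholded version reports none.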

Plotting

The plots are done using Matplotlib.

def plot_tour(_df_plan, _df_track, _title="GPS Plot"):
    fig, ax = plt.subplots()
    ax.plot(_df_plan.long, _df_plan.lat, label="Plan", alpha=1)
    ax.plot(_df_track.long, _df_track.lat, label="Track", alpha=0.8)
    ax.legend(loc='upper left')
    # x holds the longitude and y the latitude, so label the axes accordingly
    ax.set(title=_title, xlabel="Longitude", ylabel="Latitude")

    fig, ax = plt.subplots()
    ax.plot(_df_track.dist / 1000, _df_track.ele)
    ax.set(title=_title, xlabel="Distance [km]", ylabel="Elevation [m]")

    fig, ax = plt.subplots()
    ax.plot(_df_track.dist / 1000, _df_track.vert_speed)
    ax.set(title=_title, xlabel="Distance [km]", ylabel="Vertical speed [m/s]")

    fig, ax = plt.subplots()
    ax.plot(_df_track.dist / 1000, _df_track.speed)
    ax.set(title=_title, xlabel="Distance [km]", ylabel="Speed [m/s]")

    plt.show()


plot_tour(df_plan, df_track, "Brecherspitz Bodenschneid")
Elevation profile with wrong peaks and more than 1000 m extra vertical
Analyzing and plotting the data reveals some flaws. First, the elevation profile has false peaks at 5, 7, 12 and 24 km. They are caused by my phone, which I used as the tracker: it doesn't have the best GPS module and has no barometer.

Speed: we were running at up to 120 m/s
The speed also shows faulty peaks, which often coincide with the wrong elevation data. We were certainly not running at ~120 m/s, roughly ten times faster than Usain Bolt, not even on our downhills 😉. The time data only has one-second resolution, so the computed speed can be wrong if two data points are too close together.
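
The effect of whole-second timestamps can be illustrated with a small worked example. The numbers are made up, and `speed_m_per_s` is a hypothetical helper, not how gpxpy computes speed. The sketch only shows how truncating timestamps to seconds distorts the speed between two nearby points.

```python
def speed_m_per_s(distance_m, t_start_s, t_end_s):
    """Average speed between two points; None if no time has passed."""
    dt = t_end_s - t_start_s
    if dt <= 0:
        return None
    return distance_m / dt


# Made-up example: 6 m covered between t = 10.1 s and t = 11.9 s
true_speed = speed_m_per_s(6.0, 10.1, 11.9)

# The same two points with timestamps truncated to whole seconds (10 s and 11 s)
logged_speed = speed_m_per_s(6.0, int(10.1), int(11.9))

# Two points logged within the same second give no usable time difference
same_second = speed_m_per_s(6.0, int(10.1), int(10.9))

print(true_speed, logged_speed, same_second)
```

Here the truncated timestamps almost double the apparent speed. A peak like 120 m/s is more likely to need a position jump as well, which is what the map suggests.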

Vertical speed: 15 m/s vertical isn't too bad either
Vertical speed shows a similar story with a 15 m/s maximum.

Map
Looking at the coordinates themselves, it stands out that the GPS was jumping around in the bottom right and that we lost the trail or took a wrong turn from time to time.

Filtering

def filter_df(_df, _max_speed=6, _max_ele_diff=5, _ele_diff_reps=20, _max_vert_speed=1):
    n_raw = len(_df.index)
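    # Drop points with an implausible horizontal speed; copy so later assignments hit a real frame
    _df = _df[_df.speed < _max_speed].copy()
    # Repeatedly drop points whose elevation jumps by _max_ele_diff or more.
    # Rows with ele_d = NaN (the first row after each diff) fail both comparisons and are dropped too.
    for i in range(_ele_diff_reps):
        _df = _df[(_df.ele_d < _max_ele_diff) & (_df.ele_d > -_max_ele_diff)].copy()
        _df['ele_d'] = _df.ele.diff(1)
    # Asymmetric vertical speed window: from -(_max_vert_speed + 1) to +_max_vert_speed m/s
    _df = _df[(_df.vert_speed < _max_vert_speed) & (_df.vert_speed > -_max_vert_speed - 1)]
    _df = get_distance(_df)
    _df['ele_d'] = _df.ele.diff(1)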
    n_filtered = len(_df.index)
    print("Filtered Track: ")
    print(f"Points: {n_filtered} ({n_filtered / n_raw * 100:.1f}%)")
    print(f"Length: {_df.dist.iat[-1] / 1000:.1f} km")
    print(f"Up:     {_df[_df.ele_d > 0].ele_d.sum():.0f} m")
    print(f"Down:   {-_df[_df.ele_d < 0].ele_d.sum():.0f} m")
    return _df

df_track_filtered = filter_df(df_track)
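I used a few simple filters to clean the data: a speed limit, a repeated elevation-jump filter and a vertical speed window. Using roughly 80% of the original points, this gives a more realistic result of 22.5 km and about 2,000 vertical meters.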

Filtered Track:
Points: 5567 (82.8%)
Length: 22.5 km
Up:     2034 m
Down:   2038 m
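
Why the elevation filter in `filter_df` runs several times can be seen on a toy series. The following sketch is a plain-Python approximation with made-up values and a hypothetical function name `drop_elevation_jumps`. Unlike the pandas version, it always keeps the first point and stops early once nothing changes.

```python
def drop_elevation_jumps(elevations, max_diff=5.0, reps=20):
    """Repeatedly drop points whose elevation differs from the previous
    remaining point by max_diff or more."""
    points = list(elevations)
    for _ in range(reps):
        kept = points[:1] + [
            cur for prev, cur in zip(points, points[1:])
            if abs(cur - prev) < max_diff
        ]
        if len(kept) == len(points):
            break  # nothing removed in this pass, so further passes change nothing
        points = kept
    return points


# Made-up profile: steady climb with a false plateau of about +50 m over three points
profile = [100, 101, 150, 151, 152, 102, 103, 104, 105]
cleaned = drop_elevation_jumps(profile)
print(cleaned)  # [100, 101, 105]
```

A single pass only removes the point where the jump starts and the point where it ends, so a spike spanning several points needs several passes. The example also shows the price: each pass eats one valid point after the spike, which is one reason the filtered track keeps only part of the data.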

The plots are much better after filtering.

Elevation profile filter: less than 5m difference for a maximum of 20 data points

Speed limit: 6 m/s

Vertical speed limit: -2 to +1 m/s

Map

Map

To put the GPS data into context, I plotted the filtered track using Folium.

def create_points_html(_df):
    # Folium expects (lat, long) pairs; zip avoids relying on a 0..n-1 index
    return list(zip(_df.lat, _df.long))


def create_html_map(_df_plan, _df_track, _zoom=13):
    _points_plan = create_points_html(_df_plan)
    _points_track = create_points_html(_df_track)
    mymap = folium.Map(location=[_df_track.lat.mean(), _df_track.long.mean()], zoom_start=_zoom, tiles=None)
    folium.TileLayer('https://{s}.tile.opentopomap.org/{z}/{x}/{y}.png',
                     attr='Map data: &copy; <a href="https://www.openstreetmap.org/copyright">OpenStreetMap</a> '
                          'contributors, <a href="http://viewfinderpanoramas.org">SRTM</a> | Map style: &copy; '
                          '<a href="https://opentopomap.org">OpenTopoMap</a> '
                          '(<a href="https://creativecommons.org/licenses/by-sa/3.0/">CC-BY-SA</a>)',
                     name='OpenTopoMap').add_to(mymap)
    folium.PolyLine(_points_plan, color='blue', weight=4.5, opacity=.5).add_to(mymap)
    folium.PolyLine(_points_track, color='red', weight=4.5, opacity=.5).add_to(mymap)
    mymap.save('map.html')


Result

Due to the bad GPS data I had to do a lot of filtering, but after filtering our run was roughly 10% longer than planned in both distance and elevation. So I think it was not too bad. We felt lost without a trail quite a few times, but we were pretty close to it, so I didn't expect much difference in the metrics.
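
The "roughly 10%" can be checked with the numbers printed above. This is a small sketch using the planned and filtered summaries; the helper name `extra_percent` is made up.

```python
# Values taken from the printed summaries of plan.gpx and the filtered track
plan = {"length_km": 20.0, "up_m": 1874}
filtered = {"length_km": 22.5, "up_m": 2034}


def extra_percent(actual, planned):
    """Relative surplus of actual over planned, in percent."""
    return (actual / planned - 1) * 100


extra_length = extra_percent(filtered["length_km"], plan["length_km"])
extra_up = extra_percent(filtered["up_m"], plan["up_m"])
print(f"Extra length:   {extra_length:.1f} %")
print(f"Extra vertical: {extra_up:.1f} %")
```

That gives about 12.5% more distance and about 8.5% more ascent than planned.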