TubeNoise is an attempt to show the real-time position of tube trains on the Northern Line around Kennington. This is a somewhat complicated area of the tube network due to the joining up of the Charing Cross and Bank branches of the Northern Line at Kennington.
Around half of the trains are turned around at Kennington on the Kennington loop - which allows Southbound Charing Cross trains to turn around and head North. There is also a siding at Kennington that allows Southbound trains from either branch of the line to return North on either branch of the line.
Determining the real-time position of tube trains presents some challenges. Below is a description of how TubeNoise works and the approach I've taken to addressing these challenges. Please feel free to provide me with feedback / suggestions / corrections.
The approach is organised into answering four questions:
1. Where are the trains in the tunnels?
2. Where are the tunnels?
3. How to predict train locations?
4. When will the train go round the loop?
The Trackernet Data Service from TfL provides information about how long it will be before each train arrives in the next station. The data provided by Trackernet is similar to the data shown on the tube wait time displays on all underground platforms.
You can see the current XML data returned by Trackernet for Kennington station by clicking here.
An example of the data provided for one train approaching Kennington station is below.
<T LCID="1104740" SetNo="120" TripNo="10" SecondsTo="90" TimeTo="1:30" Location="Between Waterloo and Kennington" Destination="Kennington via CX" DestCode="249" Order="0" DepartTime="16:06:49" DepartInterval="90" Departed="0" Direction="0" IsStalled="0" TrackCode="TN20794" LN="N"/>
This represents a train (T) with a Leading Car Identifier (LCID) of 1104740 that is 90 seconds from Kennington station. The location is described as 'Between Waterloo and Kennington' and this train is on the Charing Cross branch. We can also see something called the 'TrackCode' set to TN20794.
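As a rough illustration, the fields discussed above can be pulled out of a single train element with Python's standard library. This is only a sketch: the function name and the returned dictionary keys are my own, and the assumption that a track code is a letter prefix followed by digits is based purely on the examples shown here.

```python
import re
import xml.etree.ElementTree as ET

# The sample <T> element quoted above, trimmed to the attributes used here.
SAMPLE = (
    '<T LCID="1104740" SetNo="120" TripNo="10" SecondsTo="90" '
    'Location="Between Waterloo and Kennington" '
    'Destination="Kennington via CX" TrackCode="TN20794" LN="N"/>'
)


def parse_train(xml_text):
    """Return the interesting fields of one <T> element as a plain dict."""
    element = ET.fromstring(xml_text)
    track_code = element.get("TrackCode", "")
    # Assumption: a track code is a letter prefix followed by digits ("TN20794").
    match = re.fullmatch(r"([A-Z]+)(\d+)", track_code)
    if match is None:
        raise ValueError(f"unexpected track code format: {track_code!r}")
    return {
        "lcid": element.get("LCID"),
        "seconds_to": int(element.get("SecondsTo")),
        "location": element.get("Location"),
        "track_code": track_code,
        "track_number": int(match.group(2)),  # numeric part, used for arithmetic
    }


train = parse_train(SAMPLE)
print(train)
```

Keeping the numeric part of the track code separately makes the later distance arithmetic straightforward.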
What's not provided is how far this train is from Kennington - the description isn't specific enough. However the TrackCode is intriguing. If it is granular enough, it might allow the actual position in the tunnel to be determined.
There is no publicly available data for the TrackCodes beyond the TfL description: 'The current section of track the train occupies'. However by looking at the returns from Trackernet for multiple trains it's clear the TrackCodes change in a regular pattern.
A TrackCode is also returned for each platform. As an example:
<P N="Northbound - Platform 1" Num="1" TrackCode="TN20634">
This is the Northbound Charing Cross platform at Kennington - platform 1 - with a track code of TN20634.
By repeating this exercise for multiple stations all the track codes for each platform can be determined. The table below shows the TrackCodes for the Northbound platforms of the Northern Line from Kennington to Euston and Kennington to Archway.
In general, the track code for a station is a bigger number than the track code of the previous station - the Elephant & Castle track code of TN20100 is 35 units bigger than the Kennington track code of TN20065. However, there are some discontinuities. Angel's track code of TN60010 looks like it is set on a different basis from Old Street's at TN20260. Similarly, Embankment's track code of TN30080 is substantially more than Waterloo's of TN20709.
There is a helpful Freedom of Information request made to TfL that asked for the distance between each of the stations on the tube network. This has the distance in kilometres between each station. Adding this information against the track codes and ignoring the discontinuities shows that each track code represents approximately 25 metres of track.
Armed with the track code length it's then easy to work out the track distances between Kennington and the 3 connecting stations - taking into account the potential difference in length between Northbound and Southbound tracks:
So now the train track code can be mapped to the distance the train is from a station.
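A minimal sketch of that mapping is below. The function names are hypothetical, the two-letter 'TN' prefix is assumed from the examples on this page, and the calculation is only meaningful when both track codes sit on the same numbering run (it will give nonsense across the discontinuities mentioned above).

```python
METRES_PER_TRACK_CODE = 25  # approximate figure derived from the FOI distances


def track_number(track_code):
    """Strip the assumed two-letter prefix: 'TN20709' -> 20709."""
    return int(track_code[2:])


def distance_to_platform_m(train_code, platform_code):
    """Approximate track distance in metres between a train and a platform.

    Only valid when both codes belong to the same continuous numbering run.
    """
    codes_apart = abs(track_number(train_code) - track_number(platform_code))
    return codes_apart * METRES_PER_TRACK_CODE


# Waterloo Northbound (TN20709) to Kennington Platform 1 (TN20634)
waterloo_to_kennington = distance_to_platform_m("TN20709", "TN20634")

# The example train at TN20794 heading for Kennington Platform 2 (TN20747)
train_to_platform_2 = distance_to_platform_m("TN20794", "TN20747")

print(waterloo_to_kennington, train_to_platform_2)
```

The first value is 75 track codes, or 1,875m, which is the Northbound tunnel length used later when drawing the tunnels.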
The GPS location of each tube station's ticket office has also been the subject of a Freedom of Information request, but clearly the tracks are not straight connections between the tube stations. In addition Waterloo station is several hundred metres in size.
There are very few maps showing the locations of the tube tunnels - the one below shows an engineering drawing of the Northern Line when it was the 'City and South London Railway' - originally created in the late 1800s.
There are some detailed maps of parts of the Northern Line that have been created as part of the proposed Northern Line Extension. In particular, the Detailed Route Map issued on 31st August 2012 (and now out of date) shows a reasonable level of detail of the tracks south of Kennington station including the Kennington Loop. The map includes markings that correspond to distance in metres.
By far the most helpful source though is OpenStreetMap. This map shows the location of the tunnels and includes the Kennington Loop as well as the sidings and connections between the sidings and the main lines.
The approach used started with the OpenStreetMap data showing the location of the four platforms at Kennington and the tube lines from Kennington. The platforms at Waterloo, Elephant & Castle and Oval are not shown on OpenStreetMap, so the lines were drawn from Kennington towards each of the three stations for the distance previously calculated using the track codes. So the Northbound tunnel from Kennington follows the OpenStreetMap tunnel path for 1,875m, and it is assumed (but not verified) that the Waterloo Northbound platform is located at the end of that path. The same approach was used for each of the 6 main tunnels leaving Kennington.
The other tunnel connections around Kennington were added using OpenStreetMap data then modified using the Northern Line Extension map where there was a mismatch. In particular the junction labelled as 16 point in the chart at the top of the page appears too far North on OpenStreetMap when compared to the Northern Line Extension map.
The Trackernet data includes the TrackCode as well as the estimated seconds to the next station. The data is refreshed every 30 seconds so a forecast for the next 30 seconds is required. Mapping the estimated seconds against track code should allow the relationship to be determined.
Over a million train locations for the 4 stations of interest were recorded to act as the data source for determining the relationship between track code and seconds to the next station. Using this data and taking each section of track in turn reveals something unexpected. The chart below shows this relationship for the 6 main routes: Waterloo - Kennington and return, Elephant & Castle - Kennington and return, Oval - Kennington and return.
Looking at these charts two points are clear - and somewhat surprising:
Firstly, the charts are all straight lines, and straight lines imply the trains move at a constant speed. This is clearly inconsistent with what is experienced on a real tube train, where the train accelerates for a few seconds as it leaves a station, then generally travels at a constant speed before decelerating into the next station.
Secondly, the train is always assumed to spend exactly 50 seconds at the platform - regardless of time of day, and regardless of how long the train has already spent in the station. This can be seen by comparing the last point (or first point if the track codes decrease towards the station) which represents the train in the station before it leaves with the next point which represents the departure of the train from the platform. This gap is consistently 50 seconds for all stations and platforms and for all times of day.
The relationship between track codes and seconds to station embedded in the Trackernet data is:
Seconds To Station = (Number Of TrackCodes To Station * 1.95) + (Train In Station ? 50 : 0)
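Written out as code, the relationship looks like the sketch below. The function and constant names are mine; the 1.95 and 50 figures are the ones fitted from the recorded data, not anything published by TfL.

```python
SECONDS_PER_TRACK_CODE = 1.95   # slope of the straight lines in the charts
PLATFORM_DWELL_SECONDS = 50     # fixed stop that Trackernet appears to assume


def trackernet_seconds_to_station(track_codes_to_station, train_in_station):
    """Reproduce the 'SecondsTo' value that Trackernet appears to report."""
    dwell = PLATFORM_DWELL_SECONDS if train_in_station else 0
    return track_codes_to_station * SECONDS_PER_TRACK_CODE + dwell


# The example train earlier was 47 track codes from Kennington Platform 2
# (TN20794 to TN20747) and was reported as 90 seconds away.
estimate = trackernet_seconds_to_station(47, train_in_station=False)
print(round(estimate, 2))
```

For the example train this comes out within a couple of seconds of the SecondsTo value in the feed, which is about as close as a rounded figure would allow.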
Using a linear approach to train location combined with fixed duration stops in the stations certainly simplifies predicting train locations in the tunnels - but seems like it will be less than accurate.
The actual train movement around the track can be analysed using the train locations dataset. As mentioned above, each train has a unique identifier - the LCID (Leading Car Identifier). This can be used to analyse where a particular train was and where it moved to when the Trackernet data refreshed. This data should allow the train speeds on different sections of track to be estimated as well as the train wait time in each station.
The chart below is for the section of track between Borough and Elephant & Castle. The track codes for this section start at 20443 at Borough and run to 20409 at Elephant & Castle. The chart shows the starting track code of the train on the x-axis and how far the train went in 25 seconds on the y-axis. 25 seconds is used as this is one of the common update intervals for the Trackernet data.
What's shown by the chart is an initial period of acceleration when the train is close to Borough station (left hand side of chart) followed by a period of fairly constant speed where the train travels around 300 metres every 25 seconds, followed by deceleration into Elephant & Castle station.
A model can be created to forecast the position of the train on the track by estimating the average speed on each section of track. For this section of track the speeds were estimated and then iterated until a reasonably good fit was found - shown in the chart below.
This model has speeds ranging from 10km/hr close to the stations to 48km/hr in the middle section between the two stations.
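To make the idea concrete, here is a minimal sketch of a piecewise constant-speed model. The segment lengths are invented for illustration (they simply add up to the 34 track codes, roughly 850m, between Borough and Elephant & Castle), and only the 10km/hr and 48km/hr end points come from the fitted model; the real model has its own sections and intermediate speeds.

```python
KMH_TO_MS = 1000 / 3600

# Hypothetical profile: (segment length in metres, speed in km/hr),
# listed in order from the departure platform to the arrival platform.
PROFILE = [
    (100, 10),   # pulling out of the station
    (650, 48),   # middle section
    (100, 10),   # running into the next station
]


def advance(position_m, seconds, profile):
    """Move a train forward for `seconds`, starting at `position_m` metres
    from the departure platform, and return its new position in metres."""
    segment_start = 0.0
    for length_m, speed_kmh in profile:
        segment_end = segment_start + length_m
        if position_m < segment_end:
            speed_ms = speed_kmh * KMH_TO_MS
            time_to_segment_end = (segment_end - position_m) / speed_ms
            if seconds <= time_to_segment_end:
                return position_m + speed_ms * seconds
            # Use up the time needed to finish this segment, then carry on.
            seconds -= time_to_segment_end
            position_m = segment_end
        segment_start = segment_end
    return position_m  # the train has reached the arrival platform


# Where does a train 100m out of the station get to in 25 seconds?
print(round(advance(100, 25, PROFILE), 1))
```

Iterating the speeds in a profile like this until the predicted 25 second movements match the observed ones is essentially the fitting exercise described above.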
By following this approach with all the sections of track around Kennington the train speeds on most sections of the track around Kennington were estimated. The section of track where this doesn't work is the Kennington Loop - more on this below.
The Trackernet data provides a snapshot of the train locations at approximately 30 second intervals. By taking the trains that were in a station at the start of a period and measuring the proportion that had left the station by the end of the period, the train wait time in stations can be estimated. The formula is:
Time In Station = Snapshot Interval / % Trains Leaving Station
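The reasoning is that if trains wait T seconds on average, then roughly Snapshot Interval / T of the trains seen at a platform will have gone by the next snapshot, provided the interval is shorter than the wait. A small sketch of the calculation, with a hypothetical function name and made-up counts:

```python
def estimate_time_in_station(snapshot_interval_s, trains_in_station, trains_departed):
    """Estimate the average wait from how many trains left between snapshots.

    trains_in_station: trains seen at the platform at the start of a period
    trains_departed:   how many of those had left by the end of the period
    """
    if trains_in_station <= 0 or trains_departed <= 0:
        raise ValueError("need at least one train seen and one departure")
    fraction_leaving = trains_departed / trains_in_station
    return snapshot_interval_s / fraction_leaving


# Made-up counts: 1,000 sightings at a platform, 750 gone 30 seconds later.
print(estimate_time_in_station(30, 1000, 750))
```

Grouping the counts by hour before applying the formula gives the time-of-day breakdown.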
This can also be calculated by time of day. This is shown for Elephant & Castle station in the chart below. There is, as expected, quite a difference in wait time depending on time of day. For Elephant & Castle Southbound there is a pronounced jump in wait time - from around 40 seconds to 55 seconds during the morning rush hour - followed by smaller jumps at 6pm and 10pm.
By taking the same approach with all the stations, the average time in each of the stations, as well as how this time varies by hour of the day, can be calculated - shown in the charts below.
There is a substantial difference between the average time trains spend in stations - from a low of 40 seconds average for Elephant & Castle to a high of 110 seconds average for Kennington Southbound trains at Platform 2.
The time in station also varies materially by time of day - although not in a consistent way. For stations like Waterloo Southbound and Elephant & Castle Southbound there is a jump in time in station around the rush hours. For the other stations something else seems to be going on. As a guess, the trains are being held at the platforms to even out the spacing of trains on the Northern Line.
As mentioned previously, southbound trains on the Charing Cross branch of the Northern Line typically turn around at Kennington - going around a section of track known as the Kennington Loop. This takes trains heading south from Kennington Platform 2 to heading north on Kennington Platform 1.
Trains that are going around the loop seem to 'disappear' from the Trackernet data when they go South of Platform 2 and reappear later heading towards Platform 1. An example helps here. The data below shows 5 consecutive requests into Trackernet from 17:58:12 to 18:00:53. The data has been cleaned up to show only the relevant train information.
<S Code="KEN" Mess="" N="Kennington." CurTime="17:58:12"> <P N="Northbound - Platform 1" Num="1" TrackCode="TN20634"> <!-- other trains --> </P> <P N="Southbound - Platform 2" Num="2" TrackCode="TN20747"> <T LCID="1221787" SetNo="111" TripNo="11" SecondsTo="0" TimeTo="-" Location="At Platform" Destination="Kennington via CX" DestCode="249" Order="0" DepartTime="17:57:39" DepartInterval="0" Departed="0" Direction="0" IsStalled="0" TrackCode="TN20745" LN="N" /> <!-- other trains --> </P> </S>
<S Code="KEN" Mess="" N="Kennington." CurTime="17:59:08"> <P N="Northbound - Platform 1" Num="1" TrackCode="TN20634"> <!-- other trains --> </P> <P N="Southbound - Platform 2" Num="2" TrackCode="TN20747"> <!-- other trains --> </P> </S>
<S Code="KEN" Mess="" N="Kennington." CurTime="17:59:35"> <P N="Northbound - Platform 1" Num="1" TrackCode="TN20634"> <!-- other trains --> </P> <P N="Southbound - Platform 2" Num="2" TrackCode="TN20747"> <!-- other trains --> </P> </S>
<S Code="KEN" Mess="" N="Kennington." CurTime="18:00:25"> <P N="Northbound - Platform 1" Num="1" TrackCode="TN20634"> <!-- other trains --> </P> <P N="Southbound - Platform 2" Num="2" TrackCode="TN20747"> <!-- other trains --> </P> </S>
<S Code="KEN" Mess="" N="Kennington." CurTime="18:00:53"> <P N="Northbound - Platform 1" Num="1" TrackCode="TN20634"> <T LCID="1221787" SetNo="111" TripNo="12" SecondsTo="26" TimeTo="0:30" Location="Kennington Loop" Destination="Check Front of Train" DestCode="0" Order="0" DepartTime="18:00:31" DepartInterval="26" Departed="0" Direction="0" IsStalled="0" TrackCode="TN20620" LN="N" /> </P> <!-- other trains --> <P N="Southbound - Platform 2" Num="2" TrackCode="TN20747"> <!-- other trains --> </P> </S>
The first request shows a train with LCID 1221787 and Trip Number 11 at Kennington Platform 2. The Platform 1 request doesn't show that this train will appear at Platform 1 at some future time. For the next 3 requests this train effectively disappears from Trackernet reporting. Then on the fifth request the train re-appears with the same LCID of 1221787 and a Trip Number that has incremented to 12. At this point the train's location is listed as 'Kennington Loop', track code TN20620.
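Spotting these gaps in the recorded data can be done by tracking which LCIDs appear in each snapshot. The sketch below is one possible way to do it: the function names are hypothetical, the second LCID is invented to stand in for the 'other trains', and it assumes all the snapshots fall on the same day.

```python
from datetime import datetime


def seconds_between(start, end):
    """Seconds from one 'HH:MM:SS' time to a later one on the same day."""
    fmt = "%H:%M:%S"
    return int((datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds())


def find_gaps(snapshots):
    """Find trains that vanish for one or more snapshots and then return.

    snapshots: list of (time, set of LCIDs seen) in time order.
    Returns (lcid, last seen time, reappeared time, gap in seconds) tuples.
    """
    last_seen = {}  # LCID -> index of the snapshot it last appeared in
    gaps = []
    for index, (time_str, lcids) in enumerate(snapshots):
        for lcid in sorted(lcids):
            previous = last_seen.get(lcid)
            if previous is not None and index - previous > 1:
                previous_time = snapshots[previous][0]
                gaps.append((lcid, previous_time, time_str,
                             seconds_between(previous_time, time_str)))
            last_seen[lcid] = index
    return gaps


# The five requests above; "9999999" is a made-up LCID for another train.
SNAPSHOTS = [
    ("17:58:12", {"1221787", "9999999"}),
    ("17:59:08", {"9999999"}),
    ("17:59:35", {"9999999"}),
    ("18:00:25", {"9999999"}),
    ("18:00:53", {"1221787", "9999999"}),
]

gaps = find_gaps(SNAPSHOTS)
print(gaps)
```

For the example above, train 1221787 is missing from the feed for 161 seconds between its last sighting at Platform 2 and its reappearance on the loop.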
Trains disappearing from Trackernet certainly complicates estimating their location. From reviewing the Trackernet data, there are two challenges in predicting when the train will reappear on Platform 1. The first is that the trains take a long and varied time to travel from Platform 2 around the loop. The second is that it appears that trains going around the loop are often held part way around at track code TN20620 (see image below for the approximate location of the different track codes).
Both the average time to travel around the loop and the average holding time at track code TN20620 can be measured using the same approach as used for the other track sections. However, the observed data shows a large variation in these times. The chart below shows this variation in the time taken to travel around the loop. The average is 364 seconds but with a broad distribution of times. The TubeNoise train plots are based on the averages, but the train position on the Kennington Loop remains an area where the train location accuracy is not very high - be warned!
The track position estimate forms the starting point for the TubeNoise plot - and this is visible as the grey lines overlaid on top of Google Maps.
The Trackernet data is queried every 30 seconds to pull down the latest train positions for the relevant train stations.
Each web browser that connects to TubeNoise makes an API request every 10 seconds for the latest train data using jQuery / AJAX. The current train positions are calculated by taking the last known position of each train from Trackernet and advancing it by the train speed at that position multiplied by the number of seconds between when the Trackernet data was created and the browser request.
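In its simplest form, ignoring changes of speed along the way, that advance looks like the sketch below. The function and parameter names are hypothetical and the times are plain epoch seconds; the 12 metres per second figure is just the 300 metres every 25 seconds seen on the Borough to Elephant & Castle section.

```python
def current_position_m(last_position_m, speed_ms, data_created_at, requested_at):
    """Advance the last known position by the age of the Trackernet data.

    last_position_m: metres along the track when Trackernet reported the train
    speed_ms:        estimated train speed at that position, in metres/second
    data_created_at: epoch seconds when the Trackernet snapshot was created
    requested_at:    epoch seconds when the browser asked for positions
    """
    # Guard against clock skew making the data look like it is from the future.
    elapsed_s = max(0.0, requested_at - data_created_at)
    return last_position_m + speed_ms * elapsed_s


# A train last seen 400m along the track, with data that is 8 seconds old.
print(current_position_m(400.0, 12.0, 1_000_000.0, 1_000_008.0))
```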
From the current position the track speed data and station wait time data are combined to predict the train's movement over the rest of the visible track section. Trains that have disappeared from Trackernet as they go round the Kennington Loop are added back in to the list of trains on the track. Trains that are in a station have the remaining time in station estimated based on their last position and time before the station.
If you've gone through the above you'll recognise that I've made a number of assumptions! If you have any thoughts or comments on how to make TubeNoise more accurate please get in touch.