Train Spotting With The Network Rail API
January 2, 2019 2:37 pm
My trains to and from work have been pretty bad recently thanks to Southern Railway. This got me wondering whether I could quantify how bad the delays have been, and whether there are particular times when the service is problematic.
I discovered that Network Rail provide data feeds that include the movements (arrivals and departures) of every train across the network. If you want to set this up yourself you will need to register for an account on their website: https://datafeeds.networkrail.co.uk/ntrod/login.
Once you have created an account, make sure that you are subscribing to some train movement feeds. Either use “All TOCs” for all train movements or select the train operator you are interested in tracking:
With a bit of Python code (available on GitHub here) I was able to stream the movements of every train on Network Rail into a Postgres database that I set up on Amazon EC2. To set this up for yourself, you will need to enter your own database credentials on line 11 and your own Network Rail account details on line 15 of the Python script. If you have chosen to subscribe to something other than “All TOCs” then you will also need to modify line 108 to reflect your chosen subscription. Next, you will need to run this SQL script to create the table that will hold your train movement data.
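To give a feel for what the script has to do with each message before it reaches Postgres, here is a minimal sketch of the parsing step. It is not the script linked above. It assumes each frame from the feed is a JSON array of messages with a `header` and a `body`, that movement messages carry `msg_type` "0003", and that the timestamps are epoch milliseconds sent as strings; the field names (`loc_stanox`, `planned_timestamp`, `actual_timestamp` and so on) and the helper names are my assumptions, so check them against the messages you actually receive.

```python
import json
from datetime import datetime, timezone

# Assumption: "0003" marks a train movement message in the feed.
MOVEMENT_MSG_TYPE = "0003"


def _to_utc(ms):
    """Convert an epoch-milliseconds string to an aware UTC datetime (None if blank)."""
    if not ms:
        return None
    return datetime.fromtimestamp(int(ms) / 1000, tz=timezone.utc)


def parse_movements(payload):
    """Turn one JSON frame from the feed into a list of flat row dicts."""
    rows = []
    for message in json.loads(payload):
        if message.get("header", {}).get("msg_type") != MOVEMENT_MSG_TYPE:
            continue  # skip anything that is not a movement message
        body = message.get("body", {})
        rows.append({
            "train_id": body.get("train_id"),
            "stanox": body.get("loc_stanox"),
            "event_type": body.get("event_type"),
            "planned": _to_utc(body.get("planned_timestamp")),
            "actual": _to_utc(body.get("actual_timestamp")),
            "toc_id": body.get("toc_id"),
        })
    return rows
```

Each row dict could then be written to the movements table with a parameterised INSERT through psycopg2, which keeps the parsing separate from the database code and easy to test on its own.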
Now that you are set up, run the Python script and you should see the train movements appear in the console in real time. Note that you will need to install the Python libraries the script depends on, including psycopg2:
Once I had collected some data, I wanted to start visualising it in Tableau. The data had no spatial element and was recorded at the Stanox (Station Number) level, which made it difficult to map. I wasn’t able to find a dataset that provides coordinates for all Stanox locations in the UK, so I combined a few datasets to create a lookup table that converts each Stanox code to a Tiploc (Timing Point Location) code and then to a station name. The data for this lookup table is available in .csv format here. By uploading this data into my Postgres database I was able to join the lookup table with the train data I was collecting:
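The shape of that join can be sketched in a few lines. This is an illustration only: it uses an in-memory SQLite database as a stand-in for Postgres so that it runs without a server, and the table names, column names and sample values (`movements`, `stanox_lookup`, the made-up codes and station names) are all placeholders rather than the real schema or real locations.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with placeholder movement and lookup tables."""
    conn = sqlite3.connect(":memory:")  # stand-in for the Postgres database
    conn.executescript("""
        CREATE TABLE movements (train_id TEXT, stanox TEXT, delay_minutes REAL);
        CREATE TABLE stanox_lookup (
            stanox TEXT PRIMARY KEY, tiploc TEXT, station_name TEXT
        );
    """)
    # Invented sample data: these are not real Stanox or Tiploc codes.
    conn.executemany(
        "INSERT INTO movements VALUES (?, ?, ?)",
        [("T1", "00001", 2.0), ("T1", "00002", 5.0), ("T2", "99999", 0.0)],
    )
    conn.executemany(
        "INSERT INTO stanox_lookup VALUES (?, ?, ?)",
        [("00001", "EXMPLA", "Example Station A"),
         ("00002", "EXMPLB", "Example Station B")],
    )
    return conn


def movements_with_station_names(conn):
    """Attach Tiploc and station name to every movement via its Stanox code."""
    query = """
        SELECT m.train_id, m.stanox, l.tiploc, l.station_name, m.delay_minutes
        FROM movements AS m
        LEFT JOIN stanox_lookup AS l ON l.stanox = m.stanox
        ORDER BY m.train_id, m.stanox
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in movements_with_station_names(build_demo_db()):
        print(row)
```

The sketch uses a LEFT JOIN so that movements at locations missing from the lookup table are kept with empty name columns instead of silently disappearing; an inner join would be the alternative if you only care about named stations.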
Now I could start building my viz, which allows you to track the trains between your chosen stations to understand how late they run. The data I have uploaded to Tableau Public contains around one month’s data in order to stay below the 10 million record limit (click on viz to view on Tableau Public):
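If you want a quick check of the same question outside Tableau, the lateness can be summarised with a few lines of Python. This is a rough sketch rather than what the viz does: it assumes you already have pairs of planned and actual datetimes for the movements you care about, and the function name `mean_delay_by_hour` is purely illustrative.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def mean_delay_by_hour(movements):
    """Return {planned hour: mean delay in minutes} for (planned, actual) pairs."""
    delays = defaultdict(list)
    for planned, actual in movements:
        minutes_late = (actual - planned).total_seconds() / 60
        delays[planned.hour].append(minutes_late)  # negative means early
    return {hour: mean(values) for hour, values in sorted(delays.items())}


if __name__ == "__main__":
    # Invented sample movements, for illustration only.
    sample = [
        (datetime(2019, 1, 2, 8, 0), datetime(2019, 1, 2, 8, 5)),
        (datetime(2019, 1, 2, 8, 30), datetime(2019, 1, 2, 8, 37)),
        (datetime(2019, 1, 2, 17, 15), datetime(2019, 1, 2, 17, 15)),
    ]
    print(mean_delay_by_hour(sample))
```

Grouping by the planned hour rather than the actual hour keeps a badly delayed train counted against the slot it was supposed to run in, which is usually what you want when asking which times of day are problematic.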