Utrecht, the Netherlands, 17 July 2017 – ProRail and Nederlandse Spoorwegen each have a Hadoop cluster to analyse the vast amounts of data they produce and collect. To make the most of these clusters, thirteen of their data scientists took part in our PySpark training.
In three days we covered the Spark architecture, the RDD and DataFrame APIs, and machine learning. To put this new knowledge to the test, we worked through two real-world use cases based on their own data. It’s wonderful to see these two organisations learning and working together.