Data Science Workshops

Together you’ll learn better.

Jeroen trains and supports researchers, developers, and analysts in Python, R, data science, statistics, and machine learning. His approach is hands-on and casual, but always professional. Both in person and online.

Python workshops

Programming in Python

In this two-day workshop, we will help you get started learning how to program in Python, one of the most popular languages for quick scripts, production software, and doing data science.

Through realistic examples, you’ll be introduced to various fundamental programming concepts, like variables, functions, and control flow. The workshop will be hands-on, with challenging exercises to complete. What makes this workshop unique is that we’ll be using JupyterLab, a popular environment to run code interactively and do data science.

This workshop will not only prepare you for more advanced Python workshops but will also provide you with a solid and reliable foundation upon which to base your data science journey.

2 days
online or on location
Data Analysis with Python and Pandas

Learn how to accelerate your data analyses using Pandas, a Python library specifically designed for working with medium-sized data sets. Together with JupyterLab, it provides a convenient environment for interactive data analysis.

Pandas is part of the so-called PyData ecosystem. In this workshop, we’ll start with an overview of PyData and explain where Pandas stands and how it interacts with other libraries such as NumPy and Seaborn. Pandas introduces a few new data structures, most importantly the DataFrame, which you need to understand in order to work with tabular data efficiently.

Pandas offers many features, and in one day, through a good balance of presentation and interactive exercises, we’re going to cover the most important ones, including: importing, filtering, grouping, joining, exploring, and visualising data. By the end of this workshop, you’ll understand the fundamentals of Pandas, be aware of common pitfalls, and be ready to perform your own analyses.
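To give a flavour of the operations listed above, here is a minimal sketch of importing, filtering, joining, and grouping with Pandas. It is not taken from the workshop material; the data, column names, and variable names are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical data, invented for illustration only.
orders_csv = io.StringIO(
    "order_id,customer_id,amount\n"
    "1,a,20.0\n"
    "2,b,35.5\n"
    "3,a,12.5\n"
    "4,c,50.0\n"
)
customers = pd.DataFrame(
    {"customer_id": ["a", "b", "c"], "city": ["Rotterdam", "Utrecht", "Rotterdam"]}
)

# Importing: read CSV text into a DataFrame.
orders = pd.read_csv(orders_csv)

# Filtering: keep only orders with an amount of at least 20.
large_orders = orders[orders["amount"] >= 20]

# Joining: add each customer's city to their orders.
orders_with_city = orders.merge(customers, on="customer_id", how="left")

# Grouping: total order amount per city.
total_per_city = orders_with_city.groupby("city")["amount"].sum()

print(large_orders)
print(total_per_city)
```

Each step returns a new DataFrame or Series, so the intermediate results can be inspected one by one in an interactive environment such as JupyterLab.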

1 day
online or on location
Web Scraping and Crawling with Python

The internet is not just a collection of webpages; it’s a gigantic resource of interesting data. Being able to extract that data is a valuable skill. It’s certainly challenging, but with the right knowledge and tools, you’ll be able to leverage a wealth of information for your personal and professional projects.

Imagine building a web scraper that legally gathers information about potential houses to buy, a process that automatically fills in that tedious form to download a report, or a crawler that enriches an existing data set with weather information. In this hands-on workshop we’ll teach you how to accomplish just that using Python and a handful of packages.

You’ll learn about the concepts underlying HTML, CSS selectors, and HTTP requests, and how to inspect those using the developer tools of your browser. We’ll show you how to turn messy HTML into structured data sets, how to automate interacting with dynamic websites and forms, and how to set up crawlers that can traverse thousands or millions of websites. Through plenty of exercises you’ll be able to apply this new knowledge to your own projects in no time.
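As a small illustration of turning HTML into structured data, the sketch below parses a snippet using only Python's standard library. The packages used in the workshop are not named here, so this is not the workshop's method; the HTML, the class names, and the `HouseParser` helper are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical HTML snippet, invented for illustration only.
PAGE = """
<ul>
  <li class="house"><span class="city">Rotterdam</span> <span class="price">350000</span></li>
  <li class="house"><span class="city">Utrecht</span> <span class="price">425000</span></li>
</ul>
"""


class HouseParser(HTMLParser):
    """Collect the text of <span class="city"> and <span class="price"> elements."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one dict per <li class="house">
        self._field = None  # name of the span currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "house":
            self.rows.append({})
        elif tag == "span" and css_class in ("city", "price"):
            self._field = css_class

    def handle_data(self, data):
        # Only store text while inside one of the spans we care about.
        if self._field and self.rows:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


parser = HouseParser()
parser.feed(PAGE)

# Convert the scraped strings into a typed, structured data set.
houses = [{"city": row["city"], "price": int(row["price"])} for row in parser.rows]
print(houses)
```

Real pages are rarely this tidy, which is why dedicated parsing packages and CSS selectors tend to be more practical than a hand-written parser.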

1 day
online or on location
Machine Learning with Python

Machine learning has become an essential component in many applications and projects that involve data. With the power of Python and the scikit-learn package, this exciting field is no longer exclusive to large companies with extensive research teams. If you use Python, even as a beginner, machine learning applications are limited only by your imagination.

During this workshop, we will take a hands-on approach to learning about machine learning algorithms. Topics include: regression, classification, outlier detection, dimensionality reduction, and clustering. Over two days, we’ll explore various algorithms such as linear regression, logistic regression, decision trees, neural networks, and many more.

By the end of this workshop you’ll confidently select and employ machine learning algorithms using Python and scikit-learn. You’ll have gained a new understanding of the inner workings of machine learning algorithms and know how to leverage them to produce valuable results and insights.

2 days
online or on location
Data Science with Python and Spark

Apache Spark is an open-source distributed engine for querying and processing data. In this three-day hands-on workshop, you will learn how to leverage Spark from Python to process large amounts of data.

After a presentation of the Spark architecture, we’ll begin manipulating Resilient Distributed Datasets (RDDs) and work our way up to Spark DataFrames. The concept of lazy execution is discussed in detail and we demonstrate various transformations and actions specific to RDDs and DataFrames. You’ll learn how DataFrames can be manipulated using SQL queries.

We’ll show you how to apply supervised machine learning models such as linear regression, logistic regression, decision trees, and random forests. You’ll also see unsupervised machine learning models such as PCA and K-means clustering.

By the end of this workshop, you will have a solid understanding of how to process data using PySpark and you will understand how to use Spark’s machine learning library to build and train various machine learning models.

3 days
online or on location

R workshops

Programming in R

R is a statistical environment and programming language that is widely used among statisticians and data scientists to work with data. This one-day workshop will be your guide, taking you through different programming aspects with R.

You will learn to work with powerful R tools and techniques. You’ll be able to boost your productivity with the most popular R packages and tackle data structures such as data frames, lists, and matrices. You’ll see how to create vectors, handle variables, and perform other core functions. You’ll be able to tackle issues with data input/output and will learn to work with strings and dates.

Moving forward, we’ll look into more advanced concepts such as metaprogramming with R and functional programming. Finally, you’ll get a glimpse of R’s data visualization and data manipulation capabilities.

1 day
online or on location
Transforming Data with R and the Tidyverse

In this one-day hands-on workshop, RStudio certified instructor Jeroen Janssens will walk you through the so-called tidyverse to transform data. The tidyverse is an ecosystem of R packages that share an underlying design philosophy, grammar, and data structures.

We’ll start at the beginning, with importing CSV data using readr and spreadsheets using readxl. We’ll cover the most important functions from dplyr and tidyr for generic data wrangling and cleaning. We’ll also look at dealing with dates, factors, and textual data specifically using the packages lubridate, forcats, and stringr, respectively. Note that this workshop does not cover ggplot2; for that we recommend our one-day workshop Data Visualization with R and ggplot2.

By the end of this workshop, you’ll have a good understanding of the tidyverse ecosystem and you’ll be able to apply many of its packages to your own data.

1 day
online or on location
Data Visualization with R and ggplot2

In this one-day hands-on workshop, we’re going to have a close look at ggplot2, a widely used R package that implements the so-called grammar of graphics. Its concise and consistent syntax allows you to quickly and iteratively create high-quality data visualisations that are suitable for both exploration and communication.

By the end of this workshop you’ll have a solid understanding of the grammar of graphics and how to create data visualisations in R for your daily work. But beware: there’s a good chance you will want to learn more about R.

1 day
online or on location

Other popular workshops

Data Science at the Command Line

The Unix command line, although invented decades ago, is an amazing environment for efficiently performing tedious but essential data science tasks. By combining small, powerful command-line tools (like parallel, jq, and csvkit), you can quickly scrub and explore your data and hack together prototypes.

This hands-on workshop is based on the O’Reilly book Data Science at the Command Line, written by instructor Jeroen Janssens. You’ll learn how to build fast data pipelines, how to leverage R and Python at the command line, and how to quickly visualise data. No prior knowledge about the Unix command line is required.

By the end of this workshop you will have a solid understanding of how to integrate the command line in your data science workflow. Even if you’re already comfortable processing data with, for example, R or Python, being able to also leverage the power of the command line can make you a more effective and efficient data scientist.

2 days
online or on location
Under the Hood of Data Science

Ask a dozen people what “data science” means, and you get back thirteen different answers. This vagueness is unfortunately accompanied by a lot of hype and misaligned expectations. In this inspiration session, we aim to mitigate this by taking a good look under the hood of data science.

In three hours, we not only explain in clear terms what data science entails, but we also let participants experience what a typical data scientist does by working through a practical use case using a real-world dataset and a programming language such as Python or R. This session is meant for everybody who wants to know what data science is (and isn’t) about. Even if you never intend to work with data yourself, it can be eye-opening to have experienced it. Caution: there’s a chance you’ll want to learn more afterwards!

3 hours
online or on location
Version Control with Git and GitHub

Description will be added soon.

1 day
online or on location

Testimonials from participants

I found Jeroen to be a wonderfully welcoming, knowledgeable, and patient instructor. He covered content at a very nice pace, and made the workshop feel like a welcoming space where any question was fair game. Thanks to our small class, I really appreciated how he took interest in what each participant wanted to get out of the class, and often asked for feedback so that he could adjust to our level as needed. When we asked him some questions that were outside the scope of the workshop’s main content, he even spent some of his breaks researching them to find us answers!

Carolina Simao Roe-Raymond, Ph.D.
Visualization Analyst at Princeton University

The 5-day workshop Data Science with R gave us a glimpse of the possibilities that R has to offer. During the workshop we got to work on actual current projects under the supervision of Jeroen, which was very insightful. Jeroen gave us a lot of practical examples and tips. We’re not quite there yet, but Data Science Workshops gave us a head start!

Arjen Verhulst
Analyst at Gemeente Nijmegen

I learned a lot from the in-company workshop on Programming in Python, and thought it was great that Jeroen could improvise around our specific learning needs and thus offer exactly what we wanted to learn without making it too difficult. Also, it was really useful to see what different solutions there were for an exercise, and what was and wasn’t “good” about it—that’s exactly something which is difficult to pick up with self-study.

Floor Buschenhenke
PhD Candidate at Huygens ING (KNAW)

The Anomaly Detection masterclass by Jeroen provided us with very useful tools to address business issues where (early) detection of anomalies is of the greatest importance. Consider, for instance, early detection of DDoS attacks, credit fraud or insurance fraud. Recommended!

Rik Kleine
Data Science Consultant at KPN ICT Consulting

Great workshop! Very well done and very useful information delivered in an excellent and interactive manner. Jeroen anticipated very well on the different knowledge levels within the group. I would highly recommend the Data Science at the Command Line workshop to anyone that is interested in either kickstarting their command-line experiences or improving their data science with Unix power tools.

Sanne Bouwman
Data Scientist at Teradata

At Brabant Water, most of us were still using spreadsheets to clean, analyse, and model our data. Thanks to Jeroen, who delivered an engaging, hands-on workshop at our office, many of us have switched to Python and Jupyter Notebook, which allows our analyses to be much more advanced and reliable.

Stijn de Jong
Senior Advisor Water Supply at Brabant Water

Attending the bespoke course Data Munging with Pandas at Textkernel has proven to be an excellent choice. Jeroen’s personal approach and highly interactive way of teaching made this course valuable to a diverse group of developers and analysts, as did the possibility to apply theory on our own data and API during the courses. I’ve since been able to code cleaner and more efficient, and applied the pandas package in several monitoring and analytics scripts.

Karlijn Dinnissen
Data Quality Analyst at Textkernel

Even experienced data scientists need to keep working on their skills and knowledge. For the past half a year, Data Science Workshops has come to our office once a month, to teach us about a variety of topics, ranging from NoSQL to t-SNE. This is a great way to stay fresh and look beyond the tools and techniques that you’re already familiar with.

Anne-Marie Dekkers
Data Scientist at ProRail

About Jeroen

Jeroen Janssens, PhD, is a data science consultant and certified instructor. His expertise lies in visualizing data, implementing machine learning models, and building solutions using Python, R, JavaScript, and Bash. He’s passionate about helping and teaching others to do such things.

Since 2013, Jeroen has run Data Science Workshops, a training and coaching firm that organizes open enrollment workshops, in-company courses, inspiration sessions, hackathons, and meetups. Clients include Amazon, eHealth Africa, Schiphol Airport, The New York Times, and T-Mobile.

Previously, he was an assistant professor at Jheronimus Academy of Data Science and a data scientist at Elsevier in Amsterdam and various startups in New York City. He is the author of Data Science at the Command Line (O’Reilly Media, 2021). Jeroen holds a PhD in machine learning from Tilburg University and an MSc in artificial intelligence from Maastricht University.

He lives with his wife and two kids in Rotterdam, the Netherlands.

If you would like to know more about his services, fees, and availability, then please email Jeroen. You can also find him on Twitter, GitHub, and LinkedIn.

You’ll be in good company

Dozens of organisations trust Jeroen to train and support their developers, analysts, and researchers so they can achieve more with data.

eHealth Africa
O'Reilly Media
Princeton University
Schiphol Amsterdam Airport
The New York Times

Testimonials from managers

Jeroen organised a ten-week course on Data Science with R for KPN. The combination of training, on-site coaching, and remote support ensured that our analysts are applying the new knowledge and skills in their daily projects. For instance, they’re now capable of implementing complex predictive models using R. We’re looking forward to the follow-up course on Advanced Machine Learning.

Wouter Egberink
Manager Commercial Analytics at KPN

Data Science Workshops facilitated a data hackathon for the data team of Transavia. They made sure it was inspiring, helpful, and leading to valuable insights in the way of working with Python for multiple projects and analyses that Transavia is currently implementing.

Charles Verstegen
Head of Partner Sales and Data & Analytics at Transavia

We received a personalized coaching session from Jeroen and this was very valuable to me. It’s such a skill to correctly estimate the level of the participants, but this was rightly done and the training was perfectly adapted to our needs. Our training had a practical focus and we could use it as a starting point for the rest of our work. This training provided us with the tools and way of thinking to perform the data analyses independently.

Mirthe Groothuis
Project Lead at Dutch Institute for Clinical Auditing

Besides demonstrating a good knowledge and experience in command-line tools for data science, the instructor had very good training skills, clear communication, and managed to adapt the level of the training to the level of the audience, which is not always easy!

Marc Canaleta
CTO at Social Point

Our DataLab team enjoyed a three-day PySpark course from Jeroen. Jeroen’s approach is personal and professional. I recommend Data Science Workshops to anyone in the field of data science.

Laurens Koppenol
Lead Data Scientist at ProRail

Data Science Workshops was able to skillfully differentiate, addressing various Unix Consultants at Snow with very different skill sets. Jeroen made some people rise above themselves.

Joost Helberg
CTO at Snow

Our Insight & Analytics team followed a five-week course in R provided by Jeroen. The training was built around our own data and challenges and therefore was easily applicable in our daily business. The atmosphere during our training days was always very pleasant and everybody looks back on a very successful training.

Yannick Jacobs
Manager Insight & Analytics at DPD

Before the six-day workshop with Data Science Workshops, our team of engineers only had some theoretical knowledge of data science and we primarily used costly tools such as Tableau to do data analysis. However, after four days of interactive hands-on sessions with Jeroen, we were able to use Python, our preferred programming language at eHealth Africa, to analyse our data, create some amazing visualisations and even start making machine learning predictions. We moved from theory to real application in a very short period of time, making this workshop extremely valuable. I highly recommend Data Science Workshops.

Aboubacar Sidiki Douno
Senior Software Engineering Manager at eHealth Africa

I’d love to hear from you.

Do you have a question about one of my workshops? Would you like to know how I facilitate in-company or online training? Are you looking for advice on what to learn next? Send an email to jeroen@datascienceworkshops.com.