Jan 2021 - Workshop

Open Data Science with ODI (Open Data Institute)


Online


Two-day training on open data science in the sports sector

Dates: 21st and 29th January 2021 

Venue: Online

Link: https://us02web.zoom.us/j/88251505459?pwd=QWJuaitEbURWM3NSQW83M0U1MGVqdz09

Times: 09:00-16:30 (CET)

As a partner of the RAIS project, the Open Data Institute will be delivering the first “hands-on” research training event in Q1 2021. This two-day training event will focus on Data Innovation in the Sports Sector.

The training will match closely with the research areas of the RAIS project including:

  • Distributed Sensing Infrastructure & Networking for Internet of Sports

  • Security, Privacy, and Trust

  • Data Mining and Edge Analytics

  • Predictive Analytics

This hands-on training will act as a stimulus to help PhD students think about the opportunities of applying data science techniques in the sports sector. In addition to the taught material, a number of guest speakers will offer insights into the applications of data science in sport and in other sectors.

The training will be split into a number of sessions covering the following topics:

Day 1

  • Open Data Science - What, how and why?

  • Open data, closed data, personal data: a data spectrum.

  • The internet of sports data - including guest speakers from OpenActive

  • Applying data science in the sports data landscape

Mid-training practical

  • Hands-on - Building a dashboard from large quantities of sports data 

Day 2

  • Data science @ WildAI - Guest speaker

  • Predictive analytics and machine learning

  • Hands-on - Building a machine learning algorithm for predictive classification

  • Practical steps to build and maintain trust

By the end of the course students will be able to apply a broad range of data science skills and knowledge in their own work. We will do this by:

  1. Building a profile of a data scientist and identifying the knowledge and skills required. 

  2. Exploring a number of case studies of data science applied in the sports and other sectors

  3. Creating a spectrum of sports data, covering open data, closed data, shared data and personal data

  4. Analysing the development of the sports data ecosystem, identifying future opportunities

  5. Evaluating how to build and maintain trust, security and privacy when dealing with different sources of data

  6. Applying a number of analysis techniques on data to discover insight

  7. Examining the implications of applying predictive analytics and machine learning techniques to data

  8. Creating a number of practical outputs to take away

About the ODI 

The Open Data Institute (ODI) was co-founded in 2012 by the inventor of the web, Sir Tim Berners-Lee, and artificial intelligence expert Sir Nigel Shadbolt to show the value of open data, and to advocate for the innovative use of open data to effect positive change across the globe.

The ODI works with companies and governments to build an open, trustworthy data ecosystem.

To further this mission, the ODI strives to bring about sustainable behaviour change within companies and governments that hold and use data. This is done through three key activities:

  1. Sector programmes – coordinating organisations to tackle a social or economic problem with data and an open approach.

  2. Practical advocacy – working as a critical friend with businesses and government, and creating products they can use to support change.

  3. Peer networks – bringing together peers in similar situations to learn together.

One of the ODI’s key sectors is Sport, specifically through the OpenActive project. OpenActive is a community-led initiative with the ambition to help people in England get active using open data. The OpenActive project has two key areas of focus:

  1. Working with the community to develop standards for the release of open data from the sports sector.

  2. Stimulating growth in the sector through the OpenActive accelerator programme which focuses on delivering benefits from the newly released data. 

Detailed Agenda - Day 1

Time Activity

LO 1, LO 2

Session 1: Open Data Science - What, how and why?

A modern data scientist is expected to be a catalyst for change in an organisation. In this session we take a holistic look at the skills required of a modern data scientist and why an open approach is essential. Through stories, we’ll look at how open data science has been applied in many sectors and the risks of not having a balanced set of skills.

10:30 Break

LO 3

Session 2: Discovering Open Data and Licensing

This session looks at the spectrum of data from open to closed and the importance of understanding copyright and licensing when using data. Participants will be challenged to build and analyse a number of data licences before building a spectrum of sports data.

12:30 LUNCH
(12:45 UK)

GUEST SPEAKER 1: Tara Lee - OpenActive Engagement Lead

Sport England’s vision is that everyone in England feels able to take part in sport or activity, regardless of age, background or ability. This talk will introduce the key role data plays to get people active and how analysis can inform future direction. 

Tara joined the ODI in September 2018 to work on OpenActive – the community-led initiative to help people in England get active using open data. Her role is to lead on the initiative’s engagement strategy, building an empowered community of data publishers and users in the physical activity sector.


LO 4

Session 3: The data science process, simple tools, powerful outcomes

In this session we will explore the process of discovering insight from data: the techniques that can be used and the important role that high-quality, standardised data plays. We’ll also look at how to improve the quality of data ready for analysis.

A key part of this session will be to look at the OpenActive project, which is funded by Sport England and led by the ODI. We’ll introduce the activities that are creating a rich ecosystem of sports data that anyone can access, use and share.

Day 1 end

Detailed Agenda - Mid-training practicals

Practical 1: Personal data and the rights of individuals survey

Participants will be given a short survey evaluating the morning’s learning from Day 1, covering open and closed data, personal data and individuals’ rights.

Practical 2: Building a dashboard from OpenActive data

Participants will be provided with a practical, self-paced exercise guiding them through the process of building a dashboard from OpenActive data. In building a powerful faceted browser for data, participants will discover how having a complete view of the data can reveal insight not contained in a single source.
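
As a rough illustration of the idea behind a faceted (cross-filter) browser, the sketch below filters a handful of records by one facet and recounts the others. The records and field names are made up for illustration and are not the OpenActive data model; the actual practical may use different tools entirely.

```python
from collections import Counter

# Hypothetical session records; field names are illustrative only,
# not the OpenActive schema.
SESSIONS = [
    {"activity": "Yoga", "day": "Mon", "city": "Leeds"},
    {"activity": "Yoga", "day": "Tue", "city": "York"},
    {"activity": "Football", "day": "Mon", "city": "Leeds"},
    {"activity": "Swimming", "day": "Mon", "city": "York"},
    {"activity": "Football", "day": "Sat", "city": "Leeds"},
]


def cross_filter(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [r for r in records if all(r[k] == v for k, v in selected.items())]


def facet_counts(records, facet):
    """Count how many records fall under each value of one facet."""
    return Counter(r[facet] for r in records)


# Selecting one facet (day) updates the counts shown for the others.
monday = cross_filter(SESSIONS, day="Mon")
print(facet_counts(monday, "activity"))
print(facet_counts(monday, "city"))
```

Each selection narrows the record set, and every other facet is recounted against what remains. That recounting is what lets a combined view surface patterns that no single source shows on its own.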

Detailed Agenda - Day 2

Time Activity
Feedback from the mid-training cross-filter practical


LO 5

Data science - The maths and stats behind machine learning and AI

One of the most essential skills of a data scientist is that of applying appropriate statistical models to data. In this session we start to look at the maths and stats behind machine learning and discover how shapes and trends in data can be used to discover insight. We also look at how, when it goes wrong, it can go very wrong.




LO 6

LO 7

Data science - Big data and machine learning - Part 1

Building on the last session, we look at how the processing of big data can cause challenges and what to do to ensure analysis is scientifically robust. Specifically, in this session we look at how machine learning is used to do classification and recommendations based upon ever-changing inputs.

In this session we will challenge participants to build a classification algorithm for a set of data.   
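
To give a flavour of what a classification algorithm can look like, here is a minimal k-nearest-neighbours sketch. The choice of k-nearest neighbours, the feature names and the data are all assumptions made for illustration; the session does not specify which algorithm or dataset participants will use.

```python
import math
from collections import Counter

# Hypothetical training data (not from the course):
# (average pace in min/km, session length in minutes) -> activity label.
TRAINING = [
    ((4.5, 30.0), "run"),
    ((5.0, 45.0), "run"),
    ((4.8, 40.0), "run"),
    ((9.5, 60.0), "walk"),
    ((10.0, 90.0), "walk"),
    ((11.0, 45.0), "walk"),
]


def classify(point, training, k=3):
    """Label a point by majority vote among its k nearest training examples."""
    # Sort training examples by Euclidean distance to the new point.
    by_distance = sorted(training, key=lambda item: math.dist(point, item[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


print(classify((5.2, 35.0), TRAINING))
print(classify((10.5, 70.0), TRAINING))
```

Note that the two features are on different scales, so session length dominates the distance here. In practice features are usually rescaled first, which is one small example of the care needed to keep an analysis robust.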


(11AM UK)


WILD AI is an AI-powered, science-backed fitness app and smart coach for female athletes. Sahana Gopal, sports science researcher at WILD AI, will be introducing the science behind the app and how AI provides insights specifically for women. In addition to leading research into female health, Sahana is a Strength and Conditioning coach for the British Olympic Diving team, and is a badminton and Olympic weightlifting athlete herself.




Data science - Big data and machine learning - Part 2

This session will review the answers from the classification exercise and look at strategies to avoid teaching machines bad habits.


LO 4

Ethics, security, privacy and trust (including feedback from mid-training survey)

This session asks the question “What happens when machines learn bad habits and what can be done about it?” We introduce a number of case studies and discuss the importance of considering ethics, security, privacy, trust and openness when using data. 

We will introduce the ODI Data Ethics Canvas as a tool to help guide any projects or research that use data and spend time examining how it can be applied in different situations.