Introducing Twenty3’s Sports Data Platform

13th May 2021

By Simon McMillan

The promise of ‘data’

Over the course of the last few years, the world has become enchanted by the potential of ‘data’: the potential to make more accurate decisions, faster and at lower cost than before, by using data as the vital ingredient in a variety of related disciplines, from Business Intelligence through Analytics to Machine Learning.

The sports world is in no way immune to this enchantment. Over the last five to ten years, we’ve seen the emergence of a vast array of businesses focused on collecting, processing and presenting data about sport. The opportunity is pretty clear: sports teams can make better decisions about tactics, opponents and player acquisition; broadcasters can more readily put relevant content into the hands of fans. More can be done, quicker and at lower cost; previously hidden insights can be extracted through models or algorithms which can be executed repeatedly and improved iteratively.

The reality is, inevitably, not so simple.

The challenge of using data

Early in Twenty3’s development, we took the well-trodden path which I suspect many others have taken. We entered into long-term contracts with more than one data supplier before we had really established what was possible and what was problematic. We encountered at least three specific challenges that we now believe any organisation, even one with the most sophisticated users, will face when executing a data strategy.

Firstly, whilst the promise of a data-driven approach is that it drives efficiency, the reality is that achieving a return on investment is very difficult. The unit economics of data contracts are driven by the cost of data collection which remains, largely, a manual process. What’s more, the output of this process isn’t really in a usable state from which the layman can readily extract value. For these reasons, whilst it’s easy to sign up to a multi-year contract with one or more data providers, it is more difficult to get a return on the substantial investment these contracts require.

Secondly, dealing with data is a multi-disciplinary problem. The mythical ‘Data Scientist’ is someone with a vast array of relevant skills that add up to an almost magical capability to produce winning answers from raw data. The truth is that an effective data pipeline needs cloud computing experts, database admins, developers, data engineers, data scientists and subject matter experts. It’s unlikely that even the most sophisticated organisations will have expert-level capability in each of these areas.

Thirdly, sports data may take a number of forms and be procured from a number of vendors. There is event data which provides ‘on-ball’ actions, optical tracking, broadcast tracking, even proprietary data-sets developed in-house. Whilst the most complete data strategy will frequently require more than one of those data-sets working in concert, there is little or no incentive for competing vendors to make interoperability a priority.

Faced with the above, we at Twenty3 decided to respond to these challenges. We wanted to find a way to effectively put data from multiple sources into the hands of a variety of users; to find a way to free up the time of experts so that they can do the most valuable activities; and to ensure that executives in an organisation are able to realise the return on investment that they set out to achieve when they decided to pursue a data-driven strategy.

Twenty3’s Sports Data Platform

The Sports Data Platform (SDP) is a technology platform that collects, processes, stores and presents a variety of data, from a number of sources, for use in different scenarios.

SDP is the foundation for Twenty3’s Toolbox family of products but it also works as a standalone platform and has been built using the same principles adopted by the most sophisticated users of technology in areas such as fintech, e-commerce and logistics.

The SDP is built for the cloud

Many sports organisations are undoubtedly using sports data to create very sophisticated outputs, but very few have implemented a pipeline that embraces the latest thinking in data engineering. It is not uncommon for models to run only on a single user’s machine, or for data-sets to be replicated many times across multiple users.

SDP utilises the full benefits of a cloud-based infrastructure. It is deployed as Infrastructure as Code using a CI/CD pipeline and the infrastructure is fully containerised. The infrastructure is therefore replicable, scalable and reliable. If an organisation wants to deploy a new development environment they can do so with minimal effort; if additional compute power is needed to train a Machine Learning model then it can be added as needed.

SDP is fully managed

Anecdotally, I have heard that Pelé was a great goalkeeper, arguably as good as the goalkeepers he played with in the great Brazilian teams. That said, it would have been a tragedy if one of the world’s footballing geniuses had played in goal for a team that rarely had to defend. Similarly, I am sure that organisations that have invested in highly paid data professionals would rather these experts spent their time on the most valuable activities possible, whether that be finding the next Erling Haaland or working out a way to beat the Gegenpress.

SDP provides a fully managed platform, thereby removing the need for many of the time-consuming and repetitive tasks, such as data cleansing and database maintenance, that a team would otherwise have to undertake.

SDP services a variety of users

Clearly, there are many types of users of data in an organisation with varying expertise, using a diverse set of tools, with different tasks to complete. Use-cases may be as diverse as a recruitment team wishing to identify potential players eligible to play in a particular geography; a media team wishing to highlight some standout stats to engage with fans; or a product team wishing to deploy a new front-end environment.

SDP is able to present data to these users in a variety of different ways: a data scientist can hook up a Jupyter Notebook, an analyst can connect using Tableau and a developer can consume a RESTful API with a built-in authentication layer. All users can, of course, access the platform using Twenty3’s Toolbox – crucially though, they are not restricted to this.
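To make the developer route concrete, here is a minimal sketch of what calling an authenticated RESTful API can look like from Python. The article does not publish SDP's URL, routes or authentication scheme, so the base URL, the `/players` path, the `competition` parameter and the bearer-token header are all assumptions chosen for illustration. The request is built but deliberately not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical base URL: the article does not publish SDP's real address.
BASE_URL = "https://sdp.example.com/api"


def build_request(path, token, **params):
    """Build (but do not send) an authenticated GET request.

    Bearer-token authentication is an assumption; the article only says
    the API has a built-in authentication layer.
    """
    query = f"?{urlencode(params)}" if params else ""
    return Request(
        f"{BASE_URL}/{path.lstrip('/')}{query}",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )


# Hypothetical route and query parameter, for illustration only.
req = build_request("/players", token="YOUR_TOKEN", competition="example-league")
print(req.full_url)
print(req.get_header("Authorization"))
```

Passing `req` to `urllib.request.urlopen` would send it. A data scientist could make the same call from a Jupyter Notebook cell and load the JSON response into whatever analysis library they prefer.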

The impact of SDP

In developing the SDP, we believe that we have created something that can massively help any user of sports data. Over the course of the last two years, we have successfully integrated data from a number of the world’s leading data providers including Stats Perform, Hudl, Sportlogiq, Ortec and others. Licensed customers of these data providers can readily access their data in a secure and efficient manner. Those who want to combine data-sets, whether that be from multiple vendors or proprietary in-house data – any source really – can do so readily.
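As a rough illustration of why combining data-sets takes work, the sketch below joins toy event data and toy tracking data that identify the same players with different vendor IDs. It is not a description of how SDP does this internally: the mapping table, the field names and every value are invented for the example.

```python
# Toy records: every ID, field name and value is invented for illustration.
event_data = [  # e.g. on-ball event data from one vendor
    {"vendor_a_id": "a-101", "passes": 54},
    {"vendor_a_id": "a-102", "passes": 31},
]
tracking_data = [  # e.g. tracking data from a second vendor
    {"vendor_b_id": "b-9", "distance_km": 10.4},
    {"vendor_b_id": "b-7", "distance_km": 11.2},
]
# Hypothetical mapping table linking each vendor's ID to one shared player ID.
id_map = [
    {"player_id": "p1", "vendor_a_id": "a-101", "vendor_b_id": "b-7"},
    {"player_id": "p2", "vendor_a_id": "a-102", "vendor_b_id": "b-9"},
]


def combine(id_map, event_data, tracking_data):
    """Return one row per player, merging fields from both sources."""
    events = {row["vendor_a_id"]: row for row in event_data}
    tracking = {row["vendor_b_id"]: row for row in tracking_data}
    combined = []
    for link in id_map:
        event_row = events.get(link["vendor_a_id"], {})
        tracking_row = tracking.get(link["vendor_b_id"], {})
        combined.append({
            "player_id": link["player_id"],
            # A value is None when a source has no record for the player.
            "passes": event_row.get("passes"),
            "distance_km": tracking_row.get("distance_km"),
        })
    return combined


combined = combine(id_map, event_data, tracking_data)
for row in combined:
    print(row)
```

The hard part in practice is building and maintaining the mapping table itself, since competing vendors have little incentive to share identifiers.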

Our aim has always been to enable others – to make their workflows more efficient while allowing them to leverage their own expertise, whether that be a content creator creating graphics in the Content Toolbox’s Smart Graphics tool or an analyst using the Analytics Toolbox to gain insight about a player. Our approach to SDP has been no different: we’ve created something that allows users to focus on the most valuable activities and helps them to realise the great potential of data.

We help our customers maximise the potential of football data. Whether you’re a data novice or expert, the Twenty3 Toolbox gives you the tools to do your job quicker and better.

If you think your organisation – whether in the media, broadcast, agency or pro club sector – could benefit from our product, you can request a demo here.