The Next-Generation Sports Database
Why is a Sports Database So Hard?
The only route to success in sports at the highest level involves detailed analysis of data from a vast range of domains. Practice, clean living and pep talks, on their own, simply don’t win championships any longer.
Although there are plenty of software systems that apply to individual domains rather well (web sites for uploading GPS data, stats and charting packages for number crunching and visualisations, apps for high tech devices like wind tunnels and stationary bikes bristling with sensors), there is no single system that brings all this data together into a queryable form.
So it is difficult, if not impossible, to gain insights into the true relationships between, say, a nutrition regime and resulting endurance, a bike configuration and sustainable performance, weather and its role in forming strategy, and eventually winning.
Such a system hasn’t existed because the underlying data problems are deceptively complex, although not intractable.
A sophisticated, comprehensive sports database presents a unique set of problems to the would-be system designer. Many of these issues are hard to anticipate, and once discovered are deceptively difficult to solve gracefully; few have obvious pat answers.
Let’s discuss these problems.
It is Tempting to Think in Terms of Traditional Approaches — And It is Wrong
A key issue is that the sub-systems required do all appear to be amenable to a straightforward row-and-column database approach. A system tracking, say, weight-training exercises can assume a traditional record-based design — date, athlete, exercise, reps, weights — and do a decent job. The summary of a run in the mountains or a few hours on the bike likewise seems tractable — distance, total time, altitude gain, average speed.
So how difficult can it be, really?
Adding to this deceptive sense of simplicity is the fact that there are already plenty of public-facing web sites that handle exactly these scenarios: don’t the likes of Strava and Garmin Connect prove this can all be done pretty easily?
The answer is an emphatic “Yes”, as long as you are talking about simple solutions dedicated to a few general types of training that come from very similar, limited domains. Biking and trail running are good examples, amenable to the same general-purpose algorithms and solutions, provided you are content to be limited to the stats that can be gleaned from a simple GPS file.
In Fact, It is Complicated
But the moment an athlete, a researcher, a performance consultant or an equipment manufacturer wants to bring several different heterogeneous domains of data together, and query them as a single data resource, the real issues start to emerge. And we should make no mistake, in order genuinely to improve performance, any serious amateur or professional needs to be tracking so much more than reps and weights, lap times and altitude gain. The low-hanging performance fruit has already been harvested, and everyone knows any significant further advance requires shaving fractions off in a whole range of domains.
Let’s take a Garmin-equipped cyclist as an example. Between the GPS device and Connect web site, the athlete can get some basic stats about previous sessions, personal bests, and so on. It is good for browsing, and the Sunday afternoon amateur can learn a fair amount that way.
Cross-Domain Integration of Data is a Must
But how can you ask more intelligent questions of such a system? How do you rank your ascents by more sophisticated measures, for example VAM (velocità ascensionale media, the vertical metres climbed per hour)? Once you’ve found areas of improvement, how do you discover what influenced them? Did nutrition have an impact? How about that recent wind-tunnel session? What was special about the bike configuration that shaved 0.5% off your previous best time? Did those £2,000 rims really make an improvement? More so than the £200 helmet? And when did you have them installed, anyway?
So one of the key questions becomes how you integrate a bunch of simple row-and-column sub-systems into a coherent whole, one that is queryable and capable of helping the athlete or consultant make new discoveries.
In short, how do you move from rudimentary browsing to true performance insight?
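To make the kind of cross-domain question raised above concrete, here is a minimal Python sketch, not a description of any real product. It ranks climbs by VAM (vertical metres climbed per hour) and looks up which equipment entry was current on the day of each ride. The `Climb` and `EquipmentChange` record layouts, the helper names and all the sample figures are invented purely for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Climb:
    """One timed ascent, e.g. extracted from a GPS file (hypothetical schema)."""
    ride_date: date
    name: str
    ascent_m: float    # vertical metres gained
    duration_s: float  # time taken, in seconds


@dataclass(frozen=True)
class EquipmentChange:
    """One dated entry in an equipment log (hypothetical schema)."""
    installed_on: date
    description: str


def vam(climb: Climb) -> float:
    """VAM: vertical metres climbed per hour."""
    return climb.ascent_m * 3600.0 / climb.duration_s


def config_on(day: date, changes: list[EquipmentChange]) -> str:
    """Return the latest equipment entry installed on or before `day`.

    `changes` must already be sorted by installation date.
    """
    install_dates = [c.installed_on for c in changes]
    idx = bisect_right(install_dates, day)
    return changes[idx - 1].description if idx else "unknown"


# Invented sample data, for illustration only.
climbs = [
    Climb(date(2024, 3, 2), "Local hill, attempt 1", 600.0, 1800.0),
    Climb(date(2024, 4, 6), "Short ramp", 450.0, 1500.0),
    Climb(date(2024, 5, 11), "Local hill, attempt 2", 600.0, 1728.0),
]
changes = [
    EquipmentChange(date(2024, 1, 10), "stock wheels"),
    EquipmentChange(date(2024, 4, 20), "deep-section rims"),
]

# Rank the ascents by VAM and show what was on the bike at the time.
ranked = sorted(climbs, key=vam, reverse=True)
for climb in ranked:
    print(f"{climb.ride_date}  {climb.name:<22} "
          f"VAM {vam(climb):7.1f}  [{config_on(climb.ride_date, changes)}]")
```

Even this toy version has to join two differently shaped data sets on time, and it says nothing about whether the rims caused the improvement. Answering that properly would also need weather, nutrition, fatigue and the rest.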
The Route to Performance Insight…
The quick answer is to start tracking additional domains: nutritional data, injury status, medical records, one-off analysis (bike fitting, wind tunnel sessions), weather integration, weight-training, well-being, and so on; but to do so in a way that unifies all those diverse sub-systems. This is where complexity rapidly starts to multiply. Let’s take a few examples.
…is Riddled with Pitfalls…
First, there is just plain combinatorics: adding more systems means the number of potential questions you can ask goes up exponentially. How is x related to y? To z, to a, b and c?
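A rough way to put numbers on this, assuming (purely for illustration) that a "question" is any combination of two or more domains examined together: the number of simple pairs grows quadratically, while the number of multi-domain combinations grows exponentially. The function names below are invented for this sketch.

```python
from math import comb


def pairwise_questions(n_domains: int) -> int:
    """Number of distinct 'how is x related to y?' pairs."""
    return comb(n_domains, 2)


def multi_domain_questions(n_domains: int) -> int:
    """Number of combinations that involve two or more domains.

    All subsets (2**n) minus the empty set and the n single-domain subsets.
    """
    return 2 ** n_domains - n_domains - 1


for n in (2, 4, 8, 16):
    print(f"{n:>2} domains: {pairwise_questions(n):>4} pairs, "
          f"{multi_domain_questions(n):>6} combinations")
```

With eight domains there are already 28 pairs and 247 combinations; with sixteen, 120 pairs and 65,519 combinations.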
Units and unit conversions become an issue: not just an athlete’s body weight (kilograms, pounds, stone?), but medication characteristics (milligrams, micrograms, duration, side-effects), blood tests (grams per decilitre, millimoles per litre), pedal forces, wind resistance, angles of various body parts, coefficient of drag, and so on.
Or take a specific sub-system like bike fitting. The modern system (or systems; the field is hardly standardised) is actually not so amenable to a row-and-column approach. Raw video also needs to be stored, as well as the results of semi-automated video analysis. Various bike fitting consultants use different methodologies. The workflow is iterative, but not necessarily sequential: the final position may not prove to be the best. There is no simple, single way forward.
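The simplest case, body weight, can be handled by normalising every reading to one canonical unit at the point of entry. The sketch below is one possible approach under that assumption; the `KG_PER_UNIT` table and `to_kilograms` function are hypothetical names, not part of any existing system.

```python
# Conversion factors to kilograms. The pound is defined as exactly
# 0.45359237 kg, and a stone is 14 pounds.
KG_PER_UNIT = {
    "kg": 1.0,
    "lb": 0.45359237,
    "st": 14 * 0.45359237,
}


def to_kilograms(value: float, unit: str) -> float:
    """Convert a body-weight reading to kilograms.

    Raises ValueError for an unrecognised unit rather than guessing.
    """
    try:
        factor = KG_PER_UNIT[unit]
    except KeyError:
        raise ValueError(f"unknown weight unit: {unit!r}") from None
    return value * factor


# The same athlete, recorded three different ways.
for value, unit in [(70.0, "kg"), (154.0, "lb"), (11.0, "st")]:
    print(f"{value:>6} {unit} = {to_kilograms(value, unit):.2f} kg")
```

A flat lookup table like this only works when the units measure the same physical quantity. Converting a blood test from grams per decilitre to millimoles per litre also depends on the molar mass of the substance being measured, so the conversion has to know what was tested, not just which unit was used.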
…and the IT Challenges are Legion….
Moving on from the sports-related issues, the IT domain presents its own complexities. A fully-functional system would need to be API-based, but support web, mobile, desktop and Internet of Things input. The system has to account for users of vastly different levels of sophistication: whereas some will need simple charts explained to them, others will be downloading the processed data and generating their own sophisticated charts. The IT tools involved need to be various and powerful: modern databases, the ability to cope with big data, powerful statistical and visualisation components, deploy-everywhere technology, agile development capabilities….
Then there is the question of security: knowing that Ronaldo’s knee injury is actually far worse than publicly thought could mean a lot to an opponent, not to mention a bookie. Research projects need to be properly anonymised. Performance peaks need to be hidden; training strategies need to remain confidential.
Winners Don’t Just “Keep Up”
Finally, it is a fast-moving world out there: manufacturers come up with new products, performance analysts come up with new metrics, nutritionists come up with new dietary regimes, and so on. Any system that doesn’t keep ahead of the game will never help its users stand on the podium.
So the list of complications involved in producing such a system goes on and on; the risk of designing yourself into an IT corner is ever-present. A sophisticated, comprehensive sports database is truly a difficult nut to crack.
Where OwnGoal Fits In
We at OwnGoal don’t imagine we’ve solved or even discovered all the problems out there. But we have thought about them for a long time, and have put in considerable effort working towards solutions. We believe we’ve identified the tools, strategies and structures, and created the basic framework, to solve the problems we’ve identified as well as the ones we haven’t yet.
The OwnGoal framework was designed by a sports performance consultant with extensive experience in Olympic performance, university post-grad-level research and commercial product development, together with an IT professional who has spent his career in data-, stats- and visualisation-driven applications. We believe it can help professional teams, performance consultants and equipment manufacturers win Gold in this highly competitive field.
We feel that anyone who is keen on cutting-edge sports performance cannot ignore the benefits, or the complexities, of data integration across all relevant domains; that is, assuming they hope to win.
OwnGoal can offer you the head start necessary to cross the line first.