Earlier this month I had the chance to present on Topological Data Analysis and its applications to football at a seminar series held at Queen’s University. The talk was split into three parts: the first was an introduction to the relation between player space and passing options, and the second used the homology groups of graphs to identify team play styles. There wasn’t enough time to get to the last part, which was about parsing out the more dynamical nature of sport, so I thought I’d write about one of its main ideas here: the intersection of topology with dynamical systems theory.
There’s actually quite a bit to get into, so I’m going to split the ideas over two posts. In this first one, I’ll set out some fundamental ideas about dynamical systems and link them to topological data analysis, so that we can start to parse out the underlying patterns and behaviours in how teams occupy space.
In fact, in a previous post, I already went over the notion of persistent homology, which is a tool to help us identify “natural” structures or features within data – identify the “shape” of data in a sense. And that’s the main topological tool we need for this particular application. The other concepts we need come from dynamical systems. So what is a dynamical system in the first place?
A dynamical system is a triple (a set of three objects) defined by a state space (M), a set of indices (T, usually thought of as time), and an evolution function (ϕ: M × T → M) which maps the state space back into itself, i.e. describes how the system evolves from one state to the next. It’s worth going over some examples of how these three objects work together to model systems.
Let’s start with a simple system: a binary system. Our state space contains every possible state for our system, which in this case is the set of all infinite binary sequences (M = 2^N, where N is the set of natural numbers). Our evolution function (ϕ) can be anything that takes a binary sequence as an input and outputs another binary sequence; in this case, let’s just choose to “shift” our sequence, i.e. delete the first entry of our sequence. Then our indices just help us keep track of how many times we’ve applied ϕ to a sequence (in this case, how many digits we’ve deleted from the original sequence). Here’s what this “system” might look like, if we started with two different sequences:
Even from this simple system, we can identify a couple of important features: the existence of fixed points and of periodic orbits. The sequence of all 1s (or all 0s) never changes no matter how many times ϕ is applied, hence each of them is “fixed” in the system. The sequence of repeating 10s represents a periodic orbit because after 2 applications of ϕ the sequence is identical to the original sequence, hence it is a point in our system that is “2-periodic” (or represents a “period-2 orbit”).
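To make the shift map concrete, here is a minimal Python sketch. An infinite sequence cannot be stored directly, so the sketch assumes we only look at purely periodic sequences and represents each one by its repeating block; the names shift, canonical and period are made up for this illustration.

```python
def shift(block):
    """Apply the shift map to the periodic sequence that repeats `block` forever.

    Deleting the first digit of a periodic sequence gives the periodic
    sequence whose repeating block is rotated by one place.
    """
    return block[1:] + block[:1]


def canonical(block):
    """Reduce a block to its shortest repeating unit, e.g. (1, 0, 1, 0) -> (1, 0)."""
    n = len(block)
    for k in range(1, n + 1):
        if n % k == 0 and block[:k] * (n // k) == block:
            return block[:k]


def period(block):
    """Smallest t >= 1 such that t applications of the shift give the sequence back."""
    start = canonical(block)
    current, t = shift(start), 1
    while current != start:
        current, t = shift(current), t + 1
    return t


print(period((1,)))       # all 1s: a fixed point, period 1
print(period((1, 0)))     # repeating 10s: period 2
print(period((1, 1, 0)))  # repeating 110s: period 3
```

A fixed point is just the special case of period 1, which is why the same function covers both features.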
But how about we consider a more “real” system, a pendulum, and see how it might be represented as a dynamical system. We can describe the state of the pendulum by its position and velocity, so our state space can be represented by the plane, M = ℝ². And if we plot position on one axis and velocity on another, we can find the “orbit” for our system:
This leads us to another important feature of dynamical systems: attractors.
An attractor is a subset of our state space that a system tends towards. In the pendulum example above, the states trace out a closed loop, which is basically a circle (a subset of ℝ², our state space). It plays the role of a “limit-cycle” attractor here, although strictly speaking a frictionless pendulum has a whole family of such loops, one for each starting energy, rather than a single loop that nearby states are drawn onto. The shape of the attractor gives us information about the system dynamics and how it tends to evolve (i.e. behave) over time. For example, consider another pendulum but this time with damping applied so that the pendulum slows down over time:
Now the path of the system has another shape, a spiral, which winds in towards what’s called a “fixed point” attractor, since the system tends towards a fixed state (in this case, a motionless pendulum hanging straight down). So now that we have dynamical systems, and have seen how attractors can help us understand the underlying dynamics and patterns, let’s try to design an attractor for a more complex system: a game of football.
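Both pendulum pictures can be reproduced numerically. The sketch below is only an illustration under stated assumptions: it uses the standard pendulum equation θ'' = −(g/L)·sin θ − c·θ', a simple semi-implicit Euler step, and parameter values picked for convenience. The function names are hypothetical.

```python
import math


def simulate_pendulum(theta0, omega0, damping=0.0, g_over_l=9.81,
                      dt=0.001, steps=20000):
    """Return the path of (angle, angular velocity) states in the plane.

    Assumed model: theta'' = -(g/L) * sin(theta) - damping * theta'.
    Semi-implicit Euler: update the velocity first, then the angle.
    """
    theta, omega = theta0, omega0
    path = [(theta, omega)]
    for _ in range(steps):
        omega += (-g_over_l * math.sin(theta) - damping * omega) * dt
        theta += omega * dt
        path.append((theta, omega))
    return path


def energy(theta, omega, g_over_l=9.81):
    """Energy per unit mass and length squared; constant along an undamped orbit."""
    return 0.5 * omega ** 2 + g_over_l * (1.0 - math.cos(theta))


free = simulate_pendulum(0.5, 0.0)                 # no damping: closed loop
damped = simulate_pendulum(0.5, 0.0, damping=0.5)  # damping: inward spiral

print("undamped energy, start vs end:", energy(*free[0]), energy(*free[-1]))
print("damped final state:", damped[-1])
```

Without damping the energy stays essentially constant, so the state keeps circling the same loop; with damping the energy drains away and the state spirals in towards (0, 0), the pendulum hanging motionless.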
In the same vein as our pendulum system, let’s describe the “state” of a team by the area covered by its players and the speed at which it is expanding/contracting. That gives us 4 variables to describe a state: Team 1 Area, Team 1 Expansion, Team 2 Area, Team 2 Expansion.
However, let’s try to compress this a bit and instead look at the difference of area and expansion rate between teams, therefore giving us a state space in ℝ². Here’s what that might look like:
Here the red team is Team 1 (the home side), the blue team is Team 2 (the away side) and the Net Area/Net Expansion are calculated as (Team 1 − Team 2) for the respective values. So looking at the state space, when the system state is in the right quadrants → home team has larger area, left → away team has larger area, top → home team expands faster, bottom → away team expands faster.
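As a rough sketch of how one such state could be computed from tracking data: the version below assumes that “area” means the area of the convex hull of a team’s player positions and that “expansion” is the change in that area between two consecutive frames divided by the time step. Neither choice is spelled out above, so treat both as assumptions; the positions and function names are made up for illustration.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_area(points):
    """Area covered by a team, taken here to be its convex hull (shoelace formula)."""
    hull = convex_hull(points)
    n = len(hull)
    twice = sum(hull[i][0] * hull[(i + 1) % n][1] - hull[(i + 1) % n][0] * hull[i][1]
                for i in range(n))
    return abs(twice) / 2.0


def match_state(home_prev, home_now, away_prev, away_now, dt):
    """Return (net area, net expansion), both computed as home minus away."""
    home_area, away_area = hull_area(home_now), hull_area(away_now)
    home_exp = (home_area - hull_area(home_prev)) / dt
    away_exp = (away_area - hull_area(away_prev)) / dt
    return home_area - away_area, home_exp - away_exp


def describe(net_area, net_expansion):
    """Name the quadrant of the state space that the state falls in."""
    side = "home larger" if net_area > 0 else "away larger"
    rate = "home expanding faster" if net_expansion > 0 else "away expanding faster"
    return side + ", " + rate


# Toy positions (not real tracking data), one second apart.
home_prev = [(0, 0), (10, 0), (10, 10), (0, 10)]
home_now = [(0, 0), (20, 0), (20, 10), (0, 10), (5, 5)]
away_prev = [(0, 0), (10, 0), (10, 10), (0, 10)]
away_now = [(0, 0), (10, 0), (10, 5), (0, 5)]

state = match_state(home_prev, home_now, away_prev, away_now, dt=1.0)
print(state, describe(*state))
```

Evaluating match_state frame by frame gives the sequence of points in the plane whose path is traced out in the state space.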
So the state space here gives us a nice way to categorize the state of the match, along with tracing out these “orbits” which describe what patterns are occurring; here we have four:
The three smaller orbits (in red) show the small back/forth between the teams when Team 1 has possession (right), Team 2 has possession (left) and during a more contested scenario between the two (middle). The larger orbit in blue describes the overall back and forth between the two teams as they move to/from the smaller orbits. Now this is a general state of play we’re examining here, but what if we looked at when these two teams scored goals? Does the attractor change shape? Is it more or less balanced for both teams?
On the top, we have Team 1 scoring, through a sequence of controlled possession and slower build-up; meanwhile on the bottom we have Team 2 scoring off of a counter-attack and a much quicker build-up. The shapes of the attractors reflect this as well.
With Team 1’s goal, we see this large orbit fixed in the right half of the state space → Team 1 occupies more space as it tries to pass the ball and pull Team 2 out of position → the moment Team 2 begins to react to Team 1 and begins to expand, Team 1 quickly closes in → Team 2 tries to react again but the gaps have already been exploited by Team 1, who scores.
With Team 2’s goal, we have two much tighter orbits: on one side Team 1 occupies more space but is closely matched in movement by Team 2 → as soon as Team 2 wins the ball, they rapidly expand in order to move the ball down the field quickly → by the time Team 1 has reacted (second small orbit), Team 2 has already breached the defense and scores.
So how can we actually classify these different plays? Well, we have the attractors and want to identify the orbits that define these attractors… in a sense, we want to find holes, persistent holes, in our state space. This is where those notions of persistent homology come into play, so let’s look at the barcodes for these two attractors:
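For a feel of where such a barcode comes from, here is a small, self-contained sketch of the loop-detecting (degree 1) part of persistent homology. It rests on several assumptions: it uses a Vietoris-Rips filtration with mod-2 coefficients, it is written for clarity rather than speed, and it runs on twelve toy points on a circle rather than on match data. In practice a dedicated library such as Ripser or GUDHI would do this job; the function name below is hypothetical.

```python
import math
from itertools import combinations


def rips_h1_barcode(points, max_scale):
    """Return the H1 bars (birth, death) of a Vietoris-Rips filtration, longest first."""
    n = len(points)

    def dist(i, j):
        return math.dist(points[i], points[j])

    # Every simplex enters the filtration at the length of its longest edge.
    simplices = [((i,), 0.0) for i in range(n)]
    for i, j in combinations(range(n), 2):
        if dist(i, j) <= max_scale:
            simplices.append(((i, j), dist(i, j)))
    for i, j, k in combinations(range(n), 3):
        scale = max(dist(i, j), dist(i, k), dist(j, k))
        if scale <= max_scale:
            simplices.append(((i, j, k), scale))
    # Sort by scale, then by dimension, so faces always precede their cofaces.
    simplices.sort(key=lambda s: (s[1], len(s[0])))
    index = {simplex: pos for pos, (simplex, _) in enumerate(simplices)}

    pivots = {}       # lowest row index -> the reduced column that owns it
    creators = set()  # edges whose column reduced to zero: each one closes a loop
    bars = []
    for j, (simplex, scale) in enumerate(simplices):
        if len(simplex) == 1:
            continue  # vertices have an empty boundary
        # Boundary of the simplex as a set of face indices (mod-2 arithmetic).
        column = {index[face] for face in combinations(simplex, len(simplex) - 1)}
        while column and max(column) in pivots:
            column ^= pivots[max(column)]
        if not column:
            if len(simplex) == 2:
                creators.add(j)
            continue
        low = max(column)
        pivots[low] = column
        if len(simplex) == 3:  # this triangle fills in the loop created by edge `low`
            creators.discard(low)
            birth = simplices[low][1]
            if scale > birth:  # skip loops that are filled in the moment they appear
                bars.append((birth, scale))
    # Loops that are never filled in below max_scale get an infinite bar.
    bars += [(simplices[j][1], math.inf) for j in creators]
    return sorted(bars, key=lambda bar: bar[0] - bar[1])


# Toy "orbit": 12 evenly spaced points on a unit circle.
circle = [(math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
          for k in range(12)]
bars = rips_h1_barcode(circle, max_scale=1.9)
print(bars[0])  # the single long bar: the hole in the middle of the orbit
```

The long bar is born when neighbouring points first link up into a ring and dies only once triangles span the interior, so a larger orbit produces a bar that ends later.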
Again, this lines up with what we would expect from seeing the attractors: Team 1’s goal has a larger orbit corresponding to the larger bars appearing further along the plot, while Team 2’s goal has those smaller orbits corresponding to the shorter bars appearing earlier in the plot. But now we have a way to actually vectorize our attractors (i.e. characterize different types of play) by considering the size of the bars in our barcode and where they show up; in this way we can start comparing different moments across the match and finding similar phases of play. In this case we’d be finding moments in the match that had similar dynamics to the goal-scoring moments, even if they didn’t end up in a goal; key for further analysis into critical moments of a match.
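One simple way to turn a barcode into a vector is sketched below. This is an illustrative choice rather than the exact method behind the results here: it keeps the birth and length of the k longest bars, pads with zeros, and compares vectors with Euclidean distance. The barcode values are invented toy numbers and the function names are hypothetical; persistence images or landscapes are more standard vectorizations.

```python
import math


def barcode_vector(bars, k=2):
    """Describe a barcode by the (birth, length) of its k longest bars, zero-padded."""
    longest = sorted(bars, key=lambda bar: bar[1] - bar[0], reverse=True)[:k]
    vector = []
    for birth, death in longest:
        vector += [birth, death - birth]
    return vector + [0.0] * (2 * k - len(vector))


def most_similar(reference, candidates, k=2):
    """Name of the candidate barcode whose vector is closest to the reference's."""
    ref = barcode_vector(reference, k)
    return min(candidates,
               key=lambda name: math.dist(ref, barcode_vector(candidates[name], k)))


# Invented barcodes, for illustration only.
build_up_goal = [(0.8, 3.0)]                  # one long bar appearing late
counter_goal = [(0.2, 0.9), (0.3, 0.8)]       # two short bars appearing early
windows = {
    "window_a": [(0.7, 2.8), (0.1, 0.2)],
    "window_b": [(0.25, 0.85), (0.3, 0.7)],
}

print(most_similar(build_up_goal, windows))
print(most_similar(counter_goal, windows))
```

With vectors like these in hand, any standard clustering or nearest-neighbour method can group moments of the match by their dynamics.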
Just as an example, here are three moments for each team that exhibit similar dynamics to their goals, all found within the first 5 minutes of the match by clustering their barcodes (left is based on Team 1’s goal, right is based on Team 2’s goal):
While what we’ve done here is a good start in terms of trying to frame the underlying dynamics of a football match, clearly there are also a lot of questions that stem from this approach. Surely, the state of a match is affected by more than just the relative team areas and expansion rates? What about the relative positions of the teams? Or the distribution of players? Or a host of other possible variables… In fact, how many dimensions should we even be considering here?
All valid considerations that we can start addressing in Part II.