The idea of this article arose from considering topological objects associated to directed graphs, or digraphs. Digraphs have been playing an ever increasing role in the study of networks where connections between nodes have a prescribed direction. A notable example of such networks are neural networks, where the use of digraphs is particularly prevalent (Rubinov and Sporns 2010). The digraphs we consider in this article are finite and simple i.e., loop-free and any pair of distinct vertices may be connected by reciprocal edges, but not by more than one edge in the same direction.

An n -clique is a complete graph on n-vertices. Any undirected graph \({\mathcal {G}}\) gives rise to a simplicial complex called the clique complex, or flag complex, where n-simplices are the \((n+1)\)-cliques in \({\mathcal {G}}\). The clique complex is a very common construction that can be found in both theoretical and applied works and is highly prevalent in topological data analysis. However, in many contexts one needs to deal with digraphs rather than undirected graphs, in which case forgetting edge orientation will imply loss of information. One solution for this problem was proposed in Reimann et al. (2017) with the introduction of the directed flag complex.

By analogy to the undirected case, a directed n -clique is a digraph, whose underlying undirected graph is an n-clique, and such that the orientation of its edges defines a linear order on its vertices. Given a digraph \({\mathcal {G}}\), one associates with it the directed flag complex (Reimann et al. 2017), that is, the complex whose n-simplices are the directed \((n+1)\)-cliques in \({\mathcal {G}}\). The directed flag complex was used successfully in Reimann et al. (2017) as a way of gaining information about structural and functional properties of the Blue Brain Project reconstruction of the neocortical microcircuitry of a juvenile rat (Markram et al. 2015). However, while more naturally fitting for use with digraphs, this construction still raises two natural problems. The first is that, while edge orientation dictates which cliques are used in the construction of the complex, once the construction is done, it plays no further role in computations. A second problem is that the directed flag complex does not give a proper topological representation to cliques that are not directed, i.e. those cliques whose vertices are not linearly ordered via the orientation of their edges. This leads to the central idea of this article, namely to the construction of a complex built of all possible cliques in the digraph, while preserving partial information that is determined by edge orientation.

Finite digraphs whose underlying undirected graphs are cliques are referred to in the combinatorics literature as tournaments. Tournaments have been studied since the 1940’s by many authors from an applicable point of view (Alway 1962; Kendall and Babington Smith 1940; Moran 1947; Landau 1951), as well as theoretically (Moon 2015). Tournaments naturally lend themselves to be considered as standard simplices, since an induced subgraph of a tournament is clearly also a tournament. This lead us to the idea that there is a natural family of complexes that are built out of tournaments. We refer to such complexes as tournaplexes. The directed flag complex of a digraph, and in fact any ordered simplicial complex is in particular a tournaplex. By analogy to the flag complex of a graph and the directed flag complex of a digraph, we introduce the flag tournaplex associated to a digraph. The flag tournaplex of a digraph \({\mathcal {G}}\) contains the directed flag complex of \({\mathcal {G}}\) as a subcomplex, as well as other naturally occurring sub complexes.

A key advantage of the flag tournaplex, over other combinatorial constructions one can associate to a digraph, is that it invites the use of persistent homology (Edelsbrunner and Harer 2008), where the filtration parameters arise naturally from the digraph in question, without any extra weights. Furthermore, this can be done in a number of different ways, which makes the flag tournaplex, at least in theory, a more powerful instrument by which one can study properties of digraphs and networks. The idea arose from considering a simple numerical invariant of tournaments that was introduced in Reimann et al. (2017)—the directionality of a tournament.

In this paper we define three types of directionality, two that are local in nature and do not depend on an ambient tournaplex the tournament is a part of, and a third that does depend on the containing tournaplex. Local directionality invariants divide tournaments into equivalence classes, and since they are always defined by means of the orientation of the edges in a tournament, they could potentially provide an efficient dimension reduction method in studying dynamics on a digraph. Since the number of non-isomorphic n-tournaments grows very rapidly with n, such a method is essential for any applicable purposes. However, in this work we concentrate on the filtrations of tournaplexes that arise from directionality invariants, examine the way in which certain simple examples behave with respect to those filtrations, and apply the technique to some more involved data sets. Thus, we aim to convince the reader that the flag tournaplex of a digraph, combined with its directionality invariants and persistent homology, are a powerful tool in the study of neuroscience, network theory, and in general any subject that involves digraphs. Furthermore, we show that these methods have the potential to reveal more information about a digraph than some other, more standard, topological tools.

After introducing the concept of a tournaplex in Sect. 1, we proceed in Sect. 2 by defining several types of directionality. We mainly discuss three types: local directionality, 3-cycle directionality and global directionality. We concentrate on the first two, which are closely related to each other, as demonstrated in Proposition 2.3. We then discuss some basic properties of local and 3-cycle directionality, and in particular derive in Proposition 2.10 a probabilistic formula for the expectation of tournaments of a given order (number of vertices) and local directionality in a random digraph.

In Sect. 3 we show how directionality can be used to define filtrations on tournaplexes. In particular we produce simple examples that show the potential these filtrations have to distinguish tournaplexes with the use of persistent homology in cases where the homotopy types of the corresponding geometric realisations are identical. We note that each isomorphism type of a tournament gives rise to a directionality invariant on tournaplexes. This is similar to the idea behind 3-cycle directionality, which is basically an invariant that associates with a tournament the number of regular 3-tournaments it contains (see Remark 3.4). In particular we observe (Figs. 5 and  6) that different directionality filtrations can be used in conjunction to achieve a greater separation power among tournaplexes. Furthermore, the variety of directionality filtrations on tournaplexes suggests the possibility of associating multi-parameter persistence modules to them. This of course raises the question whether a multi-parameter persistence module may contain more information about the tournaplex than each of the filtrations used to create it individually. We demonstrate by means of a simple example two 8-tournaments whose local directionality filtration and 3-cycle filtration each give isomorphic 1-parameter persistence modules, while taken together they are distinguished by their corresponding 2-parameter persistence modules. In these examples we do not compute the full persistence module structure (i.e., we don’t compute the morphisms in the module), as the values at the bifiltration points suffice to separate the samples.

Finally in Sect. 4 we give two examples of applications, where we use the flag tournaplex of a digraph and the associated local directionality filtration as a classifier of certain families of graphs.

The first family is one of random graphs generated by a specific regime that depends on two parameters p and q. We show that applying the method to 200 such random graphs with a fixed value of p and four different values of q, we obtain an almost perfect separation of the graphs into four families using a technique as simple as k-means clustering. This is used as a test case, which shows that the technique we develop here is indeed sensitive to network structure.

A second example is taken from data that was generated by the Blue Brain Project for the paper (Reimann et al. 2017). Here we have response signals of a digital reconstruction of neocortical tissue of a rat (Markram et al. 2015) to a collection of stimuli that depend on two parameters and appear in 5 seeds (repeats of the same experiment). This gives 45 spike train datasets, which we then use to generate a family of 450 tournaplexes. Once more, k-means clustering is applied to separate the signals almost perfectly with respect to one of the two parameters and across all seeds. In both cases we compare the performance of this technique to that of a similar approach using only the Betti numbers of the corresponding directed flag complex (as was done, using a much larger spiking dataset and without an attempt at classification, in Reimann et al. (2017)), and observe that using tournaplexes with local directionality filtration gives much better results. See Sect. 4.2 for more detail about the Blue Brain Project model, and the data used in this example.

We emphasise that the methods proposed in this article are applicable to any system or phenomenon that can be encoded on a digraph. In many such applications researchers consider motifs, namely subgraphs on a small number of vertices, as a means of characterising various graph regimes and graph dynamics. Tournaments are nothing but motifs whose underlying undirected graph is a clique, and can thus be considered as building blocks of any other motif. Directionality invariants give ways of controlling the distribution of tournaments in a digraph, without getting lost in the abundance of their isomorphism types, and the associated filtrations provide the means of studying persistent homology on a digraph with intrinsically defined weights. These features are independent of any specific application one may wish to apply the methods to, rather than only to neuroscience.

Most of our computations were carried out by purpose designed modification of the software package Flagser (Lütgehetmann et al. 2020), designed for computation of persistent homology of directed flag complexes. The package which we named Tournser is designed specifically for the computation and manipulation of flag tournaplexes associated to digraphs. Flagser and Tournser are freely available at Lütgehetmann (2020) and Smith (2020a), respectively, and interactive online versions are available at Smith (2020c) and Smith (2020b).

As a “spin-off” project, the first named author investigated the homotopy types that may arise as directed flag complexes associated to tournaments. He showed in particular that this gives an invariant of tournaments that provides a finer partition within a given directionality class of tournaments (Govc 2020). The methods in this paper were instrumental in finding the example in Sect. 3 demonstrating that a multi-parameter persistence module of tournaplexes associated to two or more directionality invariants can be more informative than the the corresponding 1-parameter modules associated to each of them separately.

The authors acknowledge support from EPSRC, grant EP/P025072/ - “Topological Analysis of Neural Systems”, and from École Polytechnique Fédérale de Lausanne via a collaboration agreement with the second named author. We also thank Jānis Lazovskis, Henri Riihimäki and Pedro Conceição for useful discussions.

1 Tournaments and tournaplexes

We start by defining the basic objects of study and some of their properties. All digraphs considered in this article are assume to be finite and simple. By simple we mean loop-free, namely edges of the form (vv) are not allowed for any vertex v, and if v and w are two distinct vertices then a digraph may contain the reciprocal edges (vw) and (wv), but two edges in the same orientation between v and w are not allowed.

If \({\mathcal {G}}=(V,E)\) and \({\mathcal {G}}'=(V',E')\) are digraphs, then a morphism \(f:{\mathcal {G}}\rightarrow {\mathcal {G}}'\) is a pair of functions \((\phi , \psi )\), where \(\phi :V\rightarrow V'\) and \(\psi :E\rightarrow E'\), such that if \(e=(v,v')\) is a directed edge in \({\mathcal {G}}\), then \(\psi (e) = (\phi (v),\phi (v'))\) is a directed edge in \({\mathcal {G}}'\). If \({\mathcal {G}}\) is any digraph, then by its underlying undirected graph we mean the graph \({\widehat{{\mathcal {G}}}}\) on the same vertices and edges, where edge orientation is ignored (i.e. any directed edge or pair of reciprocal edges is replaced by a single undirected edge).

Definition 1.1

For a non-negative integer n, an n-tournament is a digraph with no reciprocal edges whose underlying undirected graph is an n-clique.

We will generally denote tournaments by greek letters, \(\sigma , \tau \) etc.

Definition 1.2

An n-tournament \(\sigma \) is said to be

  • transitive, if its edge orientation defines a total ordering on its vertex set,

  • semi-regular, if for each vertex \(v\in \sigma \) its in-degree and out-degree differ by at most 1, and

  • regular if for each vertex \(v\in \sigma \) its in-degree and out-degree are equal.

Clearly a regular tournament must be odd, and a regular tournament is always semi-regular. A tournament containing a vertex whose out-degree and in-degree differ by exactly one must be even.

Let \(\sigma \) be an n-tournament. Then for any set of k vertices in \(\sigma \), the induced subgraph is a k-tournament. Such sub-tournaments will be referred to as the faces of \(\sigma \).

We can now define complexes built out of tournaments in a way that is totally analogous to the definition of an ordinary abstract simplicial complex.

Definition 1.3

A tournaplex is a collection X of tournaments, such that if \(\sigma \in X\) and \(\sigma '\) is a face of \(\sigma \), then \(\sigma ' \in X\).

If X is a finite tournaplex, then the dimension of X is the largest integer n, such that X contains at least one \((n+1)\)-tournament and no k-tournaments for \(k>n+1\). Thus we will sometimes refer to \((n+1)\)-tournaments as n -dimensional.

1.1 Geometric realisation

One may think of tournaplexes abstractly as in Definition 1.3. It is also possible to associate a topological space, a chain complex and homology with a tournaplex. To do so, the simplest way is to think of tournaplexes as semi-simplicial sets.

Definition 1.4

Let X be a tournaplex together with a fixed linear ordering on its set of vertices (1-tournaments). Define a semi-simplicial set \(X_\bullet \), whose n-simplices are the set \(X_n\) of all \((n+1)\)-tournaments in X. For every tournament \(\sigma \in X\), the vertices of \(\sigma \) inherit a total order \(v_0, v_1,\ldots , v_n\). For all \(n>0\) and \(0\le i\le n\), define face operators \(\partial _i:X_n\rightarrow X_{n-1}\), where \(\partial _i(\sigma )\) is the sub-tournament spanned by all \(v_j\) except \(v_i\).

The face operators \(\partial _i\) clearly satisfy the usual simplicial face relations, and hence \(X_\bullet \) is a well defined semi-simplicial set. Clearly, up to isomorphism of semi-simplicial sets, this does not depend on the choice of total ordering on \(X_0\).

Definition 1.5

The geometric realisation of a tournaplex X is defined to be the geometric realisation of the corresponding semi-simplicial set \(X_\bullet \).

The chain complex of a tournaplex X and its homology are simply the usual chain complex and homology of the corresponding semi-simplicial set.

We proceed with some naturally occurring examples.

Example 1.6

Let X be an abstract simplicial complex, and let \(X_1\) denote the collection of its 1-simplices. Fix an arbitrary orientation on each simplex \(\tau \in X_1\). Thus every simplex \(\sigma \in X\) becomes a tournament, and turns X into a tournaplex.

Notice that the homeomorphism type of the geometric realisation of the resulting tournaplex in Example 1.6 is independent of the choice of orientations on the 1-simplices. It is in fact just that of the original simplicial complex.

Example 1.7

Let X be an ordered simplicial complex, i.e. a collection of finite ordered sets that is closed under subsets. Then each ordered simplex \(\sigma \in X\) can be thought of as a transitive tournament. Hence ordered simplicial complexes are special cases of tournaplexes in which every tournament is transitive.

Fig. 1
figure 1

A digraph consisting of two 5-tournaments “glued” along a 1-face. The flag tournaplex (Definition 1.8) of this graph consists of two 4-simplices glued along an edge, and as such is contractible. By contrast, the directed flag complex of the graph has the homotopy type of a wedge of two circles and a 2-sphere. However, the topology of the directed flag complex that is embedded in the flag tournaplex can be revealed using the 3-cycle filtration (see Sect. 3)

Definition 1.8

Let \({\mathcal {G}}\) be a digraph. The flag tournaplex associated to \({\mathcal {G}}\) is the tournaplex \(\mathrm {tFl}({\mathcal {G}})\), whose n-simplices are the subgraphs of \({\mathcal {G}}\) that are \((n+1)\)-tournaments.

Notice that not every tournaplex is a flag tournaplex. For instance the boundary of a 2-simplex, with any edge orientation is a tournaplex because it consists of three 2-tournaments and three 1-tournaments, but it is not a flag tournaplex because it does not contain the interior of the 2-simplex. However, just like the case of ordinary flag complexes, every tournaplex is contained as a sub-tournaplex in the flag tournaplex generated by its 1-skeleton.

The directed flag complex associated to a digraph \({\mathcal {G}}\) is the ordered simplicial complex \(\mathrm {dFl}({\mathcal {G}})\) whose simplices are the transitive tournaments in \({\mathcal {G}}\). Thus \(\mathrm {dFl}({\mathcal {G}})\subseteq \mathrm {tFl}({\mathcal {G}})\), where equality holds if and only if every tournament in \({\mathcal {G}}\) is transitive. Moreover, if \({\mathcal {G}}\) has no reciprocal edges then \(\mathrm {tFl}({\mathcal {G}})\) is isomorphic to the flag complex of the underlying undirected graph. See Fig. 1 for an example of a digraph, a description of the associated directed flag complex and flag tournaplex.

We end this section with a comment on the number of tournaments up to digraph isomorphism. In low dimensions the numbers are small: the number of non-isomorphic tournaments on 2, 3, 4 and 5 vertices is 1, 2, 4, and 12, respectively. However the numbers grow rapidly with dimension with nearly 200,000 non-isomorphic tournaments on 9 vertices and nearly 10 million of them on 10 vertices. It is clear therefore that it may be useful to find invariants that can divide tournaments into a relatively small number of classes in a meaningful way. This is what we aim to do in Sect. 2.

2 Directionality invariants of tournaments

Given a digraph \({\mathcal {G}}=(V,E)\), one may associate with each vertex \(v\in V\) its signed directionality. This allows us to define certain integer valued functions on the set of simplices of a tournaplex. This in turn gives natural ways of filtering tournaplexes, and as such they become natural candidates for analysis by means of techniques of persistent homology.

Definition 2.1

Let \({\mathcal {G}}= (V, E)\) be a digraph. For a vertex \(v\in V\), define the signed degree of v in \({\mathcal {G}}\), by

$$\begin{aligned}{\mathrm {sd}}_{\mathcal {G}}(v) = {\mathrm {indeg}}_{\mathcal {G}}(v) - {\mathrm {outdeg}}_{\mathcal {G}}(v).\end{aligned}$$

Let \(U\subseteq V\) be a subset of vertices.

  1. i)

    Define the signed degree of U relative to \({\mathcal {G}}\) by

    $$\begin{aligned} {\mathrm {sd}}_{\mathcal {G}}(U) = \sum _{v\in U} {\mathrm {sd}}_{\mathcal {G}}(v). \end{aligned}$$
  2. ii)

    Define the directionality of U relative to \({\mathcal {G}}\) by

    $$\begin{aligned} {\mathrm {Dr}}_{\mathcal {G}}(U) = \sum _{v\in U} {\mathrm {sd}}_{\mathcal {G}}(v)^2. \end{aligned}$$

Let U be a finite set, and let \(\mathbb {R}U\) be the real vector space generated by U, and let \(\mathbb {R}U^*\) be the dual space of linear functionals \(\mathbb {R}U \rightarrow \mathbb {R}\). Equip \(\mathbb {R}U^*\) with an inner product defined by

$$\begin{aligned} (\alpha , \beta ) = \sum _{v\in U} \alpha (v)\beta (v) \end{aligned}$$

for \(\alpha , \beta \in \mathbb {R}U^*\). If U is the vertex set of a digraph \({\mathcal {G}}\), then the functions \({\mathrm {indeg}}\), \({\mathrm {outdeg}}\) and \({\mathrm {sd}}\) can be extended to functionals \({\mathrm {indeg}}_{\mathcal {G}}, \,{\mathrm {outdeg}}_{\mathcal {G}}, \,{\mathrm {sd}}_{\mathcal {G}}\in \mathbb {R}U^*\), and \({\mathrm {Dr}}_{\mathcal {G}}(U) = ({\mathrm {sd}}_{\mathcal {G}}, {\mathrm {sd}}_{\mathcal {G}})\). In the norm on \(\mathbb {R}U^*\) defined by the inner product, \({\mathrm {Dr}}_{\mathcal {G}}(U)\) is the square length of the functional \({\mathrm {sd}}_{\mathcal {G}}\), or equivalently the square distance between the functionals \({\mathrm {indeg}}_{\mathcal {G}}(U)\) and \({\mathrm {outdeg}}_{\mathcal {G}}(U)\). Clearly, if \(f:{\mathcal {G}}\rightarrow \mathcal {H}\) is an isomorphism of digraphs, and \(W=f(U)\), then \({\mathrm {sd}}_{\mathcal {G}}(U) = {\mathrm {sd}}_\mathcal {H}(W)\) and \({\mathrm {Dr}}_{\mathcal {G}}(U)= {\mathrm {Dr}}_\mathcal {H}(W)\). We now use these constructions to define graph invariants on tournaments.

Definition 2.2

For any n-tournament \(\sigma \) in a graph \({\mathcal {G}}\), let \(V_\sigma \) denote the vertex set of \(\sigma \), and define:

  1. (i)

    Local directionality: \({\mathrm {Dr}}(\sigma ) = {\mathrm {Dr}}_{\sigma }(V_\sigma )\).

  2. (ii)

    3-cycle directionality: Let \(c_3(\sigma )\) denote the number of regular 3-sub-tournaments in \(\sigma \).

  3. (iii)

    Global directionality: \({\mathrm {Dr}}_{\mathcal {G}}(\sigma ) = {\mathrm {Dr}}_{\mathcal {G}}(V_\sigma )\).

Notice that local directionality and the 3-cycle directionality are both invariants of a single tournament, independent from the ambient graph containing it. Global directionality on the other hand takes into account the general connectivity of the ambient graph. Next, we observe that the local directionality of a tournament is strongly related to its 3-cycle directionality.

Proposition 2.3

Let \(\sigma \) be an n-tournament. Then

$$\begin{aligned} {\mathrm {Dr}}(\sigma ) = 2{\left( {\begin{array}{c}n+1\\ 3\end{array}}\right) } - 8c_3(\sigma ). \end{aligned}$$
(1)

Proof

For \(n=3\) the tournament \(\sigma \) is either transitive or regular. In the first case \({\mathrm {Dr}}(\sigma ) = 8\) and in the other \({\mathrm {Dr}}(\sigma )=0\). Thus the claim follows. Proceed by induction on n. Assume (1) holds for all \((n-1)\)-tournaments and prove it holds for n-tournaments. Choose a vertex \(v_0\in \sigma \), and let \(\sigma '\) denote the face of \(\sigma \) corresponding to removing \(v_0\). We split the vertices \(V_{\sigma '}\) of \(\sigma '\) into two parts: \(V_{\mathrm {in}}\) consisting of the \(v\in V_{\sigma '}\) such that \([v_0\rightarrow v]\in \sigma \) and \(V_{\mathrm {out}}\) consisting of all vertices \(v\in V_{\sigma '}\) such that \([v\rightarrow v_0]\in \sigma \).

Define \(\sigma _{\mathrm {in}}\) to be the subgraph of \(\sigma '\) induced by \(V_{\mathrm {in}}\), and let \(\sigma _{\mathrm {out}}\) be the subgraph induced by \(V_{\mathrm {out}}\). Let \(\sigma _{\mathrm {in-out}}\) be the subgraph with vertex set \(V_{\sigma '}\), and whose edges are incident to one vertex in \(V_{\mathrm {in}}\) and the other in \(V_{\mathrm {out}}\).

Now, write:

$$\begin{aligned} \sum _{v\in V_{\sigma }}{\mathrm {sd}}_{\sigma }(v)^2&= \sum _{v\in V_{\mathrm {in}}}( {\mathrm {sd}}_{\sigma '}(v)+1)^2+ \sum _{v\in V_{\mathrm {out}}}( {\mathrm {sd}}_{\sigma '}(v)-1)^2+(|V_{\mathrm {out}}|-|V_{\mathrm {in}}|)^2\\&=\sum _{v\in V_{\sigma '}}( {\mathrm {sd}}_{\sigma '}(v))^2+ 2\sum _{v\in V_{\mathrm {in}}} {\mathrm {sd}}_{\sigma '}(v)-2\sum _{v\in V_{\mathrm {out}}} {\mathrm {sd}}_{\sigma '}(v) \\&\quad +|V_{\sigma '}|+(|V_{\mathrm {out}}|-|V_{\mathrm {in}}|)^2. \end{aligned}$$

Note that \(\sum _{v\in V_X}{\mathrm {sd}}_{X}(v)=0\) for any graph X with vertex set \(V_X\), since this is equivalent to sum of all in-degrees minus the sum of all out-degrees. So we can rewrite the second and third term as follows:

$$\begin{aligned} 2\sum _{v\in V_{\mathrm {in}}} {\mathrm {sd}}_{\sigma '}(v)-2\sum _{v\in V_{\mathrm {out}}} {\mathrm {sd}}_{\sigma '}(v)&=2\sum _{v\in V_{\sigma '}} {\mathrm {sd}}_{\sigma '}(v)-4\sum _{v\in V_{\mathrm {out}}} {\mathrm {sd}}_{\sigma '}(v)\\&=2\sum _{v\in V_{\sigma '}} {\mathrm {sd}}_{\sigma '}(v)-4\sum _{v\in V_{\mathrm {out}}}{\mathrm {sd}}_{\sigma _{\mathrm {out}}}(v)-4\sum _{v\in V_{\mathrm {out}}}{\mathrm {sd}}_{\sigma _{\mathrm {in-out}}}(v)\\&=-4\sum _{v\in V_{\mathrm {out}}}{\mathrm {sd}}_{\sigma _{\mathrm {in-out}}}(v)\\&=4|V_{\mathrm {in}}||V_{\mathrm {out}}|-8\ell , \end{aligned}$$

where \(|V_{\mathrm {in}}||V_{\mathrm {out}}|\) arises as the number of all edges in \(\sigma _{\mathrm {in-out}}\) and \(\ell \) is the number of edges \(v_1\rightarrow v_2\) with \(v_1\in V_{\mathrm {in}}\) and \(v_2\in V_{\mathrm {out}}\). Therefore,

$$\begin{aligned} {\mathrm {Dr}}(\sigma )= \sum _{v\in V_\sigma }{\mathrm {sd}}_{\sigma }(v)^2&= {\mathrm {Dr}}(\sigma ')-8\ell +|V_{\sigma '}|+(|V_{\mathrm {out}}|-|V_{\mathrm {in}}|)^2+4|V_{\mathrm {in}}||V_{\mathrm {out}}|\\&={\mathrm {Dr}}(\sigma ')-8\ell +|V_{\sigma '}|+|V_{\sigma '}|^2. \end{aligned}$$

Using the inductive hypothesis, this simplifies to

$$\begin{aligned} {\mathrm {Dr}}(\sigma ) = 2{\left( {\begin{array}{c}n\\ 3\end{array}}\right) } + n(n-1) - 8(c_3(\sigma ') + \ell ) = 2{\left( {\begin{array}{c}n+1\\ 3\end{array}}\right) }- 8(c_3(\sigma ') + \ell ). \end{aligned}$$

Finally, observe that \(\ell \) is precisely the number of regular 3-tournaments in \(\sigma \) that are not already present in \(\sigma '\), and the proof is complete. \(\square \)

Note that Proposition 2.3 can be derived from the first corollary to (Moon 2015, Theorem 4). However, we include the above proof as we feel it gives a simpler combinatorial understanding of the link between local directionality and the 3-cycle directionality.

Corollary 2.4

(Reimann et al. 2017, Supp. Meth. 2.1, Proposition 1) Let \(\sigma \) be an n-tournament. Then

$$\begin{aligned} {\mathrm {Dr}}(\sigma ) \le 2{\left( {\begin{array}{c}n+1\\ 3\end{array}}\right) }, \end{aligned}$$

with equality obtained if and only if \(\sigma \) is transitive.

Corollary 2.5

Let \(\sigma \) be an n-tournament, and let \(\sigma '\) be a face of codimension 1. Then

$$\begin{aligned} c_3(\sigma ) - c_3(\sigma ')\le \left( \frac{n-1}{2}\right) ^2. \end{aligned}$$

Proof

Let \(\sigma '\) be an \((n-1)\)-face of \(\sigma \), and let \(v_0\in V_\sigma \) be the vertex not present in \(V_{\sigma '}\). Partition \(V_{\sigma '}\) into two disjoint subsets \(V_{\mathrm {in}}\) and \(V_{\mathrm {out}}\), where \(V_{\mathrm {in}}\) contains the vertices \(v\in V_{\sigma '}\) such that the edge between \(v_0\) and v is oriented towards v, and \(V_{\mathrm {out}}=V_{\sigma '}\setminus V_{\mathrm {in}}\). Set \(\ell =|V_{\mathrm {in}}|\), so \(|V_{\mathrm {out}}|=n-1-\ell \). Then the largest value of \(c_3(\sigma )-c_3(\sigma ')\), that is, the number of directed 3-cycles in \(\sigma \) that are not in \(\sigma '\), is obtained if every edge between a vertex \(v'\in V_{\mathrm {in}}\) and a vertex \(v''\in V_{\mathrm {out}}\) is oriented from \(v'\) to \(v''\), and that number is \(\ell (n-1-\ell ) = -\ell ^2 + (n-1)\ell \). This quadratic function of \(\ell \) obtains its maximum exactly when \(\ell =\frac{n-1}{2}\), and so the maximal number of directed 3-cycles that can appear in \(\sigma \) and are not present in \(\sigma '\) is \(\left( \frac{n-1}{2}\right) ^2\). \(\square \)

Example 2.6

If \(\sigma \) is an n-tournament and \(\tau \) is a k-face of \(\sigma \), then it is not true in general that \({\mathrm {Dr}}(\tau )\le {\mathrm {Dr}}(\sigma )\). For instance a regular 3-tournament has local directionality 0 but each of its 2-faces has local directionality 2. Similarly a semi-regular 4-tournament has local directionality 4 but has two 3-faces with local directionality 8.

One natural question in view of Proposition 2.3 is whether every number that can theoretically appear as the local directionality of a tournament, is indeed realisable as such. The answer to this question follows from three theorems by Kendall and Babington-Smith in their 1940 paper (Kendall and Babington Smith 1940, 8.(1)–(3)). Using our terminology these results read as follows:

Theorem 2.7

(Kendall and Babington Smith 1940) The following statements hold for an n-tournament \(\sigma \):

  1. (i)

    \(c_3(\sigma )\le {\left\{ \begin{array}{ll} \frac{n^3-n}{24} &{} n\; \text {odd}\\ \frac{n^3-4n}{24} &{} n\; \text {even} \end{array}\right. }.\) Furthermore, these bounds are sharp in the sense that there exists an n-tournament with this number of 3-cycles.

  2. (ii)

    For any integer k between 0 and the upper bounds in i), there exists at least one n-tournament \(\sigma \) such that \(c_3(\sigma ) = k\).

Corollary 2.8

For each pair of non-negative integers n and k, where k is at most as large as the bounds in Theorem 2.7, there exists at least one n-tournament \(\sigma \) such that \({\mathrm {Dr}}(\sigma ) = 2{\left( {\begin{array}{c}n+1\\ 3\end{array}}\right) }-8k\).

Corollary 2.9

For any \(n\ge 0\) there exists an n-tournament \(\sigma \) of minimal local directionality that is regular if \({\mathrm {Dr}}(\sigma )=0\) (n odd) and semi-regular if \({\mathrm {Dr}}(\sigma )=n\) (n even).

2.1 Distribution of tournaments in digraphs by local directionality

The motivation for introducing tournaplexes is the idea that tournaments are basic building blocks in a geometric object that can be associated to a digraph. Rather than ignoring edge orientation and considering the ordinary flag complex, or neglecting cliques that are not linearly ordered and considering the directed flag complex, the flag tournaplex allows us to consider all tournaments as simplices, and as such forms a topological object that is richer in structure, and better informing about properties of the digraph in question. See Fig. 2 for a simple example of how these three constructions differ. However, the vast number of isomorphism types of tournaments in higher dimensions makes using them as individual data units quite impractical. We have thus introduced directionality as one way of taking advantage of general tournaments, without the constraints imposed by the large number of isomorphism types.

Fig. 2
figure 2

The flag complex of this digraph (ignoring orientation) is a 2-simplex. The directed flag complex is a 2-simplex with an additional arc on its bottom 1-face. The flag tournaplex is a cone formed of two 2-simplices attached on two of their edges. Its homotopy type can be distinguished from that of the flag complex by using the directionality of its 3-tournaments

To demonstrate the potential of tournaments to inform on network structure, we examined a number of networks, and recorded the distribution of tournaments by local directionality in Fig. 3. In particular the probability of each possible value of directionality of a random tournament in a fixed dimension is known up to \(n=10\), by calculations of Alway (1962) and Kendall and Babington Smith (1940). More generally it was shown by Moran (1947) that the distribution of the number of 3-cycles in a random n-tournament tends to normal for n sufficiently large (see also (Moon 2015, Theorem 6)). All these results are stated in terms of the number of 3-cycles in a tournament.

Let \(T_{n,j}\) be the number of all n-tournaments \(\sigma \) on a fixed vertex set, such that \(c_3(\sigma )=j\). One can compute the distribution of local directionality in n-tournaments and \(T_{n,j}\) recursively (Alway 1962). The following gives the said distributions for \(n\le 5\).

figure a

Using \(T_{n,j}\) we can express the expected number of tournaments of various 3-cycle directionalities in Erdős-Rényi graphs. Recall that an Erdős-Rényi digraph with n vertices and connection probability p is a random subgraph of the complete digraph with n vertices, where each of the \(n(n-1)\) possible directed edges is included with probability p independently from every other directed edge.

Proposition 2.10

Let \({\mathcal {G}}\) be a directed Erdős-Rényi graph with n vertices and connection probability p. Let \(X_k\) be the total number of k-tournaments that occur in \({\mathcal {G}}\) and let \(X_{k,j}\) be the total number of k-tournaments containing exactly j 3-cycles that occur in \({\mathcal {G}}\). Then the expected values of \(X_k\) and \(X_{k,j}\) are given by the formulas

$$\begin{aligned} E(X_k)=\left( {\begin{array}{c}n\\ k\end{array}}\right) 2^{\left( {\begin{array}{c}k\\ 2\end{array}}\right) }p^{\left( {\begin{array}{c}k\\ 2\end{array}}\right) }\qquad \text {and}\qquad E(X_{k,j})=\left( {\begin{array}{c}n\\ k\end{array}}\right) T_{k,j}p^{\left( {\begin{array}{c}k\\ 2\end{array}}\right) }. \end{aligned}$$

Proof

First note that the number of all possible k-tournaments on an n-vertex set is given by

$$\begin{aligned} \left( {\begin{array}{c}n\\ k\end{array}}\right) 2^{\left( {\begin{array}{c}k\\ 2\end{array}}\right) }, \end{aligned}$$

as each one is obtained by choosing k out of n vertices and then orienting each of the \(\left( {\begin{array}{c}k\\ 2\end{array}}\right) \) edges in one of two possible ways.

The number of all possible k-tournaments on an n-vertex set containing exactly j 3-cycles is given by

$$\begin{aligned} \left( {\begin{array}{c}n\\ k\end{array}}\right) T_{k,j}, \end{aligned}$$

as we have to choose k vertices and then by definition \(T_{k,j}\) is the number of tournaments on that vertex set containing exactly j 3-cycles.

Now, any specific k-tournament \(\sigma \) will occur in \({\mathcal {G}}\) with probability

$$\begin{aligned} P(\sigma \subseteq {\mathcal {G}})=p^{\left( {\begin{array}{c}k\\ 2\end{array}}\right) }, \end{aligned}$$

as it has exactly \({\left( {\begin{array}{c}k\\ 2\end{array}}\right) }\) edges, each of which occurs independently with probability p. Let \(Y_\sigma \) be the random variable which takes value 1 if \(\sigma \) occurs in \({\mathcal {G}}\) and 0 otherwise. It follows that

$$\begin{aligned} E(Y_\sigma )=p^{\left( {\begin{array}{c}k\\ 2\end{array}}\right) }. \end{aligned}$$

Note that \(X_k\) is just the sum of \(Y_\sigma \) over all possible k-tournaments \(\sigma \). Similarly, \(X_{k,j}\) is the sum of \(Y_\sigma \) over all possible k-tournaments \(\sigma \) containing exactly j 3-cycles. Therefore, the result follows by linearity of expectation. \(\square \)

We can consider \(T_{k,j}\) as the theoretical distribution of k-tournaments \(\sigma \) with \(c_3(\sigma )=j\). In Fig. 3 we show the distribution of tournaments in all values of local directionality in a number of sample networks. Notice the rather accurate estimates in the case of Erdős-Rényi graphs. It is also worth noticing how close the distribution of local directionality values is between the connectivity graph of Varshney et al. (2011) and that of the Blue Brain Project microcircuit simulating a section of the neocortex of a rat (Blue Brain Project 2020). The bottom three datasets can be found in Takahata (1991), Leskovec (2020), Ewing et al. (2007). The bottom five datasets are available in Tournser format at Aberdeen University Neuro-Topology Group (2020).

Fig. 3
figure 3

Distribution of tournaments by local directionality in various networks. The rows correspond to the graphs indicated on the left. The columns are ordered left to right by the number n of vertices in a tournament for \(n\ge 3\). Each histogram shows the distribution of n-tournaments by local directionality. The numbers on top refer to the number \(c_3\) of 3-cycles in the tournaments represented by the relevant column of the histogram. From this number local directionality can be computed as \(2{n+1\atopwithdelims ()3} - 8c_3\). The asterisk indicates the existence of further distributions to the right.

3 Filtrations

In this section we discuss three different ways to define a filtration on a tournaplex using the idea of directionality.

Let K be a tournaplex with vertex set V. An increasing filtration on K is an increasing sequence of sub-tournaplexes

$$\begin{aligned} \varnothing =F_0(K)\subseteq F_1(K)\subseteq F_2(K)\subseteq \cdots \subseteq F_n(K) = K. \end{aligned}$$

An increasing filtering weight function on K is a function \(W:K\rightarrow \mathbb {R}\) with the property that if \(\sigma \) is a tournament in K and \(\sigma '\subseteq \sigma \) is a face, then \(W(\sigma ')\le W(\sigma )\). Similarly one defines decreasing filtrations and decreasing filtering weight functions.

An increasing filtering weight function W on a tournaplex K naturally gives rise to increasing filtrations on K as follows: Fix an increasing sequence of real numbers \(r_1<r_2<\cdots <r_{n-1}\). Set \(F_0(K)=\varnothing \), \(F_n(K)=K\), and

$$\begin{aligned} F_i(K) = \{\sigma \in K\;|\; W(\sigma )\le r_i\}, \end{aligned}$$

for \(1\le i\le n-1\). Similarly a decreasing filtering weight function gives rise to a decreasing filtration on K. If the weight function on K is a step function, the one has an obvious choice for the sequence defining the filtration, namely the sequence of all possible values of the function in increasing order. We now use local, 3-cycle and global directionality as ways of defining filtrations on any tournaplex.

We start with local directionality. As pointed out in Example 2.6, it is not generally the case that if \(\sigma \) is a tournament and \(\tau \) is a face of \(\sigma \), then \({\mathrm {Dr}}(\tau )\le {\mathrm {Dr}}(\sigma )\). Thus one is led to make the following definition.

Definition 3.1

Let K be a tournaplex. For each n-tournament \(\sigma \in K\) define the local directionality weight of \(\sigma \) to be

$$\begin{aligned} W_{{\mathrm {Dr}}}(\sigma ) \overset{\text {def}}{=}{\mathrm {Dr}}(\sigma ) + 2{\left( {\begin{array}{c}n\\ 3\end{array}}\right) }. \end{aligned}$$

Lemma 3.2

Let K be a tournaplex and let \(W_{{\mathrm {Dr}}}:K\rightarrow \mathbb {N}\) be the local directionality weight function. Then \(W_{{\mathrm {Dr}}}\) defines an increasing filtration on K.

Proof

It suffices to show that if \(\sigma \) is an n-tournament for \(n>2\) and \(\tau \subseteq \sigma \) is a face of codimension 1, then \(W_{{\mathrm {Dr}}}(\tau )\le W_{{\mathrm {Dr}}}(\sigma )\). By Proposition 2.3 and Corollary 2.5

$$\begin{aligned} W_{{\mathrm {Dr}}}(\sigma )-W_{{\mathrm {Dr}}}(\tau ) \ge&\;2\left( {\left( {\begin{array}{c}n+1\\ 3\end{array}}\right) }-{\left( {\begin{array}{c}n-1\\ 3\end{array}}\right) }\right) -2(n-1)^2=0. \end{aligned}$$

The claim follows. \(\square \)

Notice that it is possible for an n-tournament \(\sigma \) to have a face \(\tau \) of codimension larger than 1, whose directionality is larger than that of all codimension 1 faces. This justifies adding the maximal possible directionality of a codimension 1 face to the local directionality in order to create a filtration. See Fig. 4 for such an example.

Fig. 4
figure 4

A regular 5-tournament (\({\mathrm {Dr}}=0\)) in which each 4-face is semi-regular (\({\mathrm {Dr}}=4\)). Hence each 4-face contains two 3-faces that are transitive (\({\mathrm {Dr}}=8\))

Definition 3.3

Let K be a tournaplex. For each n-tournament \(\sigma \in K\) define the 3-cycle directionality weight of \(\sigma \) to be \(W_{c_3}(\sigma )=c_3(\sigma )\).

Clearly if \(\sigma \) is a tournament and \(\tau \) is a face of \(\sigma \), then \(c_3(\tau )\le c_3(\sigma )\), and hence this defines a filtration on any tournaplex.

Remark 3.4

Observe that tournaplexes may be naturally filtered in many other ways that are similar to the 3-cycle filtration. For instance, if K is a tournaplex, define \(W_T:K\rightarrow \mathbb {R}\) by letting \(W_T(\sigma )\) be the number of transitive 3-tournaments in \(\sigma \). Pushing this idea further, one can consider any fixed (small) tournament \(\tau \), and filter K by the number of times \(\tau \) appears as a face in each \(\sigma \in K\). This automatically yields a filtration, and relates nicely to well known approaches that regard the prevalence of certain motifs in networks as a determining factor in the emerging dynamics (see for instance Schiller et al. 2015).

By Proposition 2.3, the 3-cycle weight function and the local directionality weight function on an arbitrary n-tournament \(\sigma \) are related by the formula:

$$\begin{aligned} W_{{\mathrm {Dr}}}(\sigma ) = {\left( {\begin{array}{c}n\\ 2\end{array}}\right) }\frac{4n-2}{3} - 8W_{c_3}(\sigma ). \end{aligned}$$
(2)

Thus, these filtrations are related to each other by a naturally occurring monotone transform that is a function of simplex dimension. However, because of the intrinsic geometric data that gives rise to the filtrations, the functions \(W_{{\mathrm {Dr}}}\) and \(W_{c_3}\) are far from inducing similar filtrations, in spite of this close relationship. Specifically, the local directionality filtration of a tournaplex has its vertices in filtration 0, its edges and regular 3-tournaments in filtration 2 and is, roughly speaking, a refinement of the dimension filtration. The relationship between the 3-cycle filtration and dimension is more subtle. In filtration 0 one has the sub-tournaplex consisting of all transitive tournaments (in any dimension). In higher stages of the 3-cycle filtration the minimal possible dimension of an added simplex increases with the filtration value (a 3-tournament cannot have a 3-cycle filtration larger than 1), but the dimension of a simplex that is added on in any filtration value is not bounded above.

In Fig. 5 we have four digraphs whose flag tournaplexes realise the same homotopy type, but which are distinguished by the associated persistence diagrams with respect to local directionality filtration. Similarly it is easy to see that each of the four digraphs contains a different number of 3-cycles, and hence they are also distinguished by their persistence diagrams corresponding to their 3-cycle filtration. There are however examples of graphs that cannot be distinguished by local directionality or 3-cycle filtration, as in Fig. 6. Thus, a related question is whether there could be an advantage in using one of these filtrations as opposed to another, or if there is a point in using a combination of both. In Table 1 we show an example of the two non-isomorphic tournaments in Fig. 7, where the 3-cycle filtration is always contractible (see the bottom row), and where the local directionality filtration is much more interesting (see the right column). This gives a simple answer to the first question. The second will be discussed further below.

Fig. 5
figure 5

Four non-isomorphic digraphs and their local directionality filtrations reflected in their persistence diagrams. The signed degree of each vertex in each digraph is 0. Hence the global directionality filtration cannot distinguish these graphs. \(H_0\), \(H_1\) and \(H_2\) are denoted in the diagrams by blue, orange and red dots, respectively, and the numbers next to some dots correspond to rank. All four tournaplexes have the homotopy type of a 2-sphere. These graphs can however be distinguished by the homotopy type of their directed flag complexes (colour figure online)

Definition 3.5

Let K be a tournaplex. For each tournament \(\sigma \in K\), define the global directionality weight of \(\sigma \) to be \(W_K(\sigma ) = {\mathrm {Dr}}_{K_1}(V_\sigma )=\sum _{v\in V_\sigma } {\mathrm {sd}}_{K_1}(v)^2\), where \(K_1\) is the 1-skeleton of K considered as a digraph.

Fig. 6
figure 6

Four non-isomorphic digraphs and their global directionality filtrations. The numbers on the vertices denote the corresponding squared signed directionality, and the chart below each digraph is the persistence diagram corresponding to the global directionality filtration on the corresponding tournaplex. \(H_0\), \(H_1\) and \(H_2\) are denoted in the diagrams by blue, orange and red dots, respectively. The numbers next to some dots correspond to rank. Each 3-tournament in all four tournaplexes is transitive. Hence these graphs cannot be distinguished by local directionality filtration or by 3-cycle filtration (colour figure online)

The global directionality weight function is clearly an increasing filtering function and as such induces an increasing filtration on a tournaplex. Its properties are quite different from the local and the 3-cycle directionality functions in that any positive integer value can be achieved as the global directionality of a tournament in some digraph. See Fig. 6 for an example of some digraphs whose flag tournaplexes can be distinguished by the global directionality filtration, but not by the local or the 3-cycle filtrations. On the other hand, the digraphs in Fig. 5 cannot be distinguished by global directionality.

Fig. 7
figure 7

The two tournaments \({\mathcal {G}}_1\) (left) and \({\mathcal {G}}_2\) (right), whose 2D-persistence module is given in Table 1. With their common edges shaded grey, and differing edges in black (colour figure online)

Table 1 The homotopy types of all bifiltration stages of \(\mathrm {tFl}({\mathcal {G}}_1)\) and \(\mathrm {tFl}({\mathcal {G}}_2)\)

The multitude of directionality filtrations on tournaplexes suggest that two or more of them can be used in conjunction to create a combined filtration or a multi-parameter persistence module. We examined this idea with respect to local directionality and 3-cycle filtrations. Let \({\mathcal {G}}_1\) and \({\mathcal {G}}_2\) be the 8-tournaments depicted in Fig. 7. The flag tournaplexes \(\mathrm {tFl}({\mathcal {G}}_1)\) and \(\mathrm {tFl}({\mathcal {G}}_2)\) have identical 1-parameter persistence modules with respect to the 3-cycle directionality filtration and with respect to the local directionality filtration. The 3-cycle filtration stages are all contractible (see the bottom row of Table 1). The local directionality filtration is more interesting and is summarised in (3) (cf. also the right column in Table 1) below.

$$\begin{aligned} H_i(X)={\left\{ \begin{array}{ll}{[}0,2)^7\oplus [0,\infty );&{}i=0,\\ {[}2,10)^{12};&{}i=1,\\ {[}10,12)^{13}\oplus [10,20)^{14}\oplus [10,28)^8;&{}i=2,\\ {[}20,28)^5\oplus [28,36)^6\oplus [28,44)^{16}\oplus [28,52)^6\oplus [28,60)^2;&{}i=3,\\ {[}44,62)^3\oplus [52,70)^2\oplus [52,78)^6\oplus [60,86)^6\oplus [60,94)^3\oplus [60,102);&{}i=4,\\ {[}78,110)\oplus [94,134)^3\oplus [102,142)^2\oplus [102,150);&{}i=5,\\ {[}150,208);&{}i=6,\\ \end{array}\right. } \end{aligned}$$
(3)

where X is \(\mathrm {tFl}({\mathcal {G}}_j)\) with local directionality filtration, for \(j=1,2\).

However taking the two filtrations together yields two distinct 2-parameter persistence modules. Considering the bifiltration of the geometric realisation of each tournaplex, and applying the homotopy theoretic techniques used in Govc (2020), we were able to compute the homotopy types of each bifiltration pair which are displayed in Table 1. The difference between \(\mathrm {tFl}({\mathcal {G}}_1)\) and \(\mathrm {tFl}({\mathcal {G}}_2)\) reveals itself in bifiltration (44, 3). In both cases we did not compute the full persistence module structure (namely the maps between bifiltration stages). The point of this calculation was to show that there are examples that are not distinguishable by local directionality or by 3-cycle filtration alone, but a combination of the two does separate them. Similarly one can construct a 1-parameter filtration by taking a function of two variables combining the two filtration functions. For example, in our case, consider

$$\begin{aligned} f(\sigma ) \overset{\text {def}}{=}\max \{3W_{{\mathrm {Dr}}}(\sigma ), 44 W_{c_3}(\sigma )\}. \end{aligned}$$

Then \(f(\sigma )\le 132\) if and only if \(W_{{\mathrm {Dr}}}(\sigma )\le 44\) and \(W_{c_3}(\sigma )\le 3\). It is then easy to check, given the data in Table 1, that the filtration function f will distinguish the tournaments in Fig. 7 by 1-parameter persistence.

4 Applications

The flag complex associated to a graph reveals higher order properties of the graph as they are encoded in its topology. Similarly, the directed flag complex does the same for digraphs. However, as already pointed out, the directed flag complex “sees” tournaments that are not transitive in the graph only by neglecting them, and as such they may or may not be expressed as homology classes in the resulting complex. We introduced the flag tournaplex, motivated exactly by the idea that it will contain more information about the digraph in question than the directed flag complex. This point is clear even without any empirical evidence since in any tournaplex X, the subcomplex of transitive tournaments appears as the 0-th stage in the 3-cycle filtration. Thus if X is a flag tournaplex, then any higher filtration can only add on the information present in the directed flag complex. In this section we demonstrate by several examples that this is indeed the case.

We consider the flag tournaplexes arising from two data sets. The first is a set of directed Erdős-Rényi type digraphs, and the second is a collection of activity simulation data from the Blue Brain Project that was used in Reimann et al. (2017). In both cases we consider a set of digraphs divided into subsets by some known parameters, and examine the capability of the topological metrics one can associate to flag tournaplexes with the local directionality filtration to cluster the data into distinct classes, and compare the performance to that of the metrics associated to the directed flag complex.

Given a collection \({\mathbb {G}}\) of digraphs we wish to partition these graphs into groups of graphs with similar properties. In order to do this we must first map each graph \({{\mathcal {G}}\in {\mathbb {G}}}\) to a feature vector \(V({\mathcal {G}})\) of length k for a suitable positive integer k. We shall produce such a vector by two methods. The first is based on a computation of the Betti numbers of directed flag complexes of the graphs in question. The second uses the persistent homology of the flag tournaplex \(\mathrm {tFl}({\mathcal {G}})\) with the local directionality filtration. Each method produces a matrix with as many rows as the number of graphs to be clustered and k columns. The rows of these matrices are fed into a k-means clustering algorithm to produce our results, as explained below.

Definition 4.1

For a digraph \({\mathcal {G}}\), represent the persistent homology of the flag tournaplex \(\mathrm {tFl}({\mathcal {G}})\) with respect to the local directionality filtration as a list of triples (mbd), corresponding to (dimensionbirthdeath). Let \(T^\ell ({\mathcal {G}})\) denote the resulting data set. Set \({\widehat{T}}^\ell ({\mathcal {G}})\) as the set of the unique triples in \(T^\ell ({\mathcal {G}})\) and \(N_{{\mathcal {G}}}(m,b,d)\) as the number of times the triple (mbd) appears in \(T^\ell ({\mathcal {G}})\).

4.1 Erdős-Rényi type digraphs with varying local directionality distributions

In this section we describe a validation experiment we ran where we attempt to show that using local directionality filtration and persistent homology provides greater discriminatory power than that of Betti numbers of the associated directed flag complex for a fixed artificial dataset. This is a data set of random graphs denoted \({\mathcal {G}}(n,p,q)\) where n is the number of vertices, indexed \(1,\ldots , n\), and p and q are probability parameters. An instantiation of \({\mathcal {G}}(n,p,q)\) is generated by adding a directed edge (ij) with probability

$$\begin{aligned} {\left\{ \begin{array}{ll} p,&{}\text{ if } i>j\\ q,&{}\text{ if } i<j\end{array}\right. }. \end{aligned}$$

By varying the probability for different orientations we can control the frequency of tournaments with different local directionality.

We constructed 50 instantiations in 4 groups with \(n=250\), \(p=0.25\) and \(q = 0, 0.025, 0.05\) and 0.075, a total of 200 graphs, which we denote by \({\mathbb {G}}_{ER}\). We use small values for q so that the number of transitive tournaments remains largely unchanged, thus causing minimal difference to the directed flag complexes. To check that this procedure achieved different distributions of local directionality, we computed the average distribution by dimension and directionality. The result is summarised in Fig. 8. We then applied Algorithm 1 to produce the matrix \(V^{6}_\mathcal {T}({\mathbb {G}}_{ER})\).

Remark 4.2

The aim of Algorithm 1 is to use persistent homology of tournaplexes to extract the key functional information from the spike trains and express said data as a feature vector that can be used for classification. We begin by applying persistent homology and expressing the resulting information as a vector, this is only possible since we know the directionality filtration has a finite number of possible values. However, the dimension of the resulting vector is too high to apply clustering algorithms. So the remaining steps of Algorithm 1 are a form of feature selection designed to extract the most important entries of the vector. This is done by using standard deviation and selecting the filtration value and time bin pairs which vary the most, the intuition being that these entries are the most likely to vary between input, thus giving the best classification accuracy.

Fig. 8
figure 8

Distribution of local directionalities as a function of the probability parameter q, normalised by connection density. The x-axis is labeled by Frequency/Density. The different colours correspond to the number n of vertices in a tournament, with blue corresponding to a single vertex and pink to 7 vertices. The vertical axis is labeled by the possible local directionality weights in each dimension (colour figure online)

figure b

Let \(V_\beta ^6({\mathbb {G}}_{ER})\) denote the matrix where the i-th row is the vector \(V_\beta ^6({\mathcal {G}}_i)\overset{\text {def}}{=}[\beta _0,\beta _1,\ldots ,\beta _5]\) of Betti numbers of the directed flag complex of \({\mathcal {G}}_i\in {\mathbb {G}}_{ER}\). Next, we applied k-means clustering to the rows of each of the matrices \(V_\mathcal {T}^6({\mathbb {G}}_{ER})\) and \(V_\beta ^6({\mathbb {G}}_{ER})\), using Python package scikit-learn (Pedregosa et al. 2011). The results are displayed in Fig. 9. Comparing the results to the distribution of directionalities in Fig. 8, one notices that the distribution of transitive tournaments remains roughly the same regardless of q. Hence, it is to be expected that the Betti numbers of the directed flag complex will perform poorly in clustering the four families, which is indeed the case. By contrast, the associated tournaplexes, filtered by local directionality give almost perfect separation, which is particularly remarkable in the cases \(q=0.05\) and \(q=0.075\) that give very similar distributions in Fig. 8. This demonstrates that the topology of the tournaplex holds more information about the orientation of the edges in a digraph, compared to the directed flag complex.

Remark 4.3

We select \(d=6\) in this computation for two reasons. Firstly, the Betti numbers are always zero for dimension 6 and above in the directed flag complex. Secondly, for k-means clustering to give reliable results we require that the size of the vectors to be clustered is significantly smaller than the size of the data set. Also, we use k-means clustering as it is a simple well known technique, but similar results are obtained using decision tree learning or linear discriminant analysis, as verified by Henri Riihimäki.

Fig. 9
figure 9

The cluster assignment of k-means clustering applied to the columns of the matrices \(V^6_{\mathcal {T}}({\mathbb {G}})\) (top) and \(V^6_\beta ({\mathbb {G}})\) (bottom), where \({\mathbb {G}}\) is a set of 200 directed Erdős-Rényi type graphs with parameters \(p=0.25\) and varying q. The four colours represent the different clusters, each row corresponds to a different value of q (colour figure online)

4.2 Brain activity simulation digraphs

The data in this example was taken from Blue Brain Project’s reconstruction of the neocortical column of a juvenile rat (Markram et al. 2015). The reconstruction is a digital model of brain tissue based on basic biological principles. The particular model we used contains approximately 31,000 digitally simulated neurons with roughly 8 million synaptic connections. There are 55 morphological types of neurons simulated in the column, in correspondence with known biological classification, and the column is built in 6 layers, again following known biological principles. With the possible exception of more recent developments by the Blue Brain Project team, this model is the most accurate approximation of real brain tissue currently available. In particular it allows researchers to obtain full connectivity data of the neural microcircuitry, and can be stimulated by spike trains that are either recorded in vivo or artificially generated.

The data set we use here was used in Reimann et al. (2017) to demonstrate the capability of the homology of the directed flag complex to express certain properties of the simulations. In this data set nine different stimuli were applied to the Blue Brain Project microcircuit. These stimuli, which take the form of spike trains that are injected directly into the digital tissue, can be distinguished by two properties, their class and their grouping, with three possible values for each property. Each class, denoted 5, 15 or 30, represents a different temporal input of the stimulus, and each grouping, denoted a, b or c, represents a different spatial input of the stimulus. See (Reimann et al. 2017, Fig. 4a) for further information about the stimuli. Rather than using the entire microcircuit, we extracted spike trains only from L5TTPC1 neurons (Markram et al. 2015). The code name stands for Layer 5 Thick Tufted Pyramidal Cell of Type 1, and these neurons are one of the most prevalent morphological types in the neocortex and are assumed to be of major significance in information processing in the brain. The structural data is publicly available at Blue Brain Project (2020).

figure c

We consider five repetitions of each experiment, referred to as seed s, for \(s=1,\ldots ,5\). Thus we have 45 distinct data sets, each consisting of a 250 ms spike train, that is a list of pairs (tg) where t is a time (in resolution of 0.1 ms) and g is a neuron (vertex) that spiked at time t. Each data set is then converted to a sequence of digraphs, using the transmission-response method introduced in Reimann et al. (2017). The construction relies on three items of input: (1) the underlying structural graph \({\mathcal {G}}\), (2) a spike train data set D and( 3) a pair of integers \(t_1\) and \(t_2\). We then split the spike train into time bins of size \(t_1\)-ms, and construct a graph \({\mathcal {G}}_{r}\) for each time bin, where \({\mathcal {G}}_{r}\) has the same vertices of \({\mathcal {G}}\) and the edge (ij) if all of the following conditions hold:

  1. (a)

    (ij) is an edge in \({\mathcal {G}}\),

  2. (b)

    neuron i fired at time \(t_0\) in the r-th time bin, and

  3. (c)

    neuron j fired at time \(t_0<t\le t_0+t_2\).

See (Reimann et al. 2017) for further background on transmission-response graphs.

figure d
Fig. 10
figure 10

The cluster assignment of k-means clustering applied to \({\widehat{V}}^6_{\mathcal {T}}({\mathbb {D}})\) (top) and \({\widehat{V}}^6_\beta ({\mathbb {D}})\) (bottom), where \({\mathbb {D}}\) is a set of spike trains on the Blue Brain microcircuit reconstruction

A typical collection \({\mathbb {D}}\) of spike train data sets consists of n distinct instantiations, denoted by \(D_r\), \(r=1,\ldots , n\), each of length L ms. Out of this collection we produce an \(n\times m\) feature matrix \({\widehat{V}}^m_{\mathcal {T}}({\mathbb {D}})\), for some natural number m. The rows of this matrix are then fed into a k-means clustering algorithm. To obtain \({\widehat{V}}^m_{\mathcal {T}}({\mathbb {D}})\) we apply Algorithm 2, which is a similar procedure to Algorithm 1, but is adapted for use with functional data. As noted above, for the collection \({\mathbb {D}}\) at our disposal we have \(n=45\) and \(L=250\), and we set \(m=6\) so that the size of our vectors is suitable compared to the size of the data set (see Remark 4.3).

For the directed flag complex we compute \({\widehat{V}}^m_\beta ({\mathbb {D}})\) by applying Algorithm 3, which is very similar to Algorithm 2, but using the Betti numbers instead of persistent pairs. We only use the first three Betti numbers (\(d=3\) in the algorithm), as in this data set all higher Betti numbers were zero.

Once we have computed both \({\widehat{V}}^6_\beta ({\mathbb {D}})\) and \({\widehat{V}}^6_{\mathcal {T}}({\mathbb {D}})\), we apply k-means clustering to their rows. The results of this are displayed in Fig. 10. Here too one sees that the data is split into three classes almost perfectly using k-means clustering on the corresponding flag tournaplexes filtered by local directionality, while applying the analogous clustering method on the Betti numbers of the directed flag complexes yields poorer separation. This suggests once more that the tournaplex construction stores different, and possibly more detailed information about the network than the directed flag complex.