We have been talking about the elements of topological data analysis. In my previous post, I introduced simplicial complexes, concerning the ways to connect points together. In topology, it is the shape and geometry, not distances, which matter ( although while constructing the distance does play a role).

With the simplicial complexes, we can go ahead to describe its topology. We will use the techniques in algebraic topology without going into too much details. The techniques involves homology, but a full explanation of it requires the concepts of normal subgroup, kernel, image, quotient group in group theory. I will not talk about them, although I admit that there is no easy ways to talk about computational topology without touching them. I highly recommend the readers can refer to Zomorodian’s textbook for more details. [Zomorodian 2009]

I will continue with the Python class

SimplicialComplex

that I wrote in the previous blog post. Suppose we have an k-simplex, then the n-th face is any combinations with n+1 vertices. A simplicial complex is such that a face contained in a face is also a face of the complex. In this, we can define the boundary operator by

$\partial_k \sigma = \sum_i (-1)^i [v_0 v_1 \ldots \hat{v}_i \ldots v_k]$,

where $\hat{v}_i$ indicates the i-th vertex be removed. This operator gives all the boundary faces of a face $\sigma$. The faces being operated are k-faces, and this operator will be mapped to a (k-1)-faces. Then the boundary operator can be seen as a $(n_k \times n_{k-1})$-matrix, where $n_k$ is the number of k-faces. This can be easily calculated with the following method:

class SimplicialComplex:
...
def boundary_operator(self, i):
source_simplices = self.n_faces(i)
target_simplices = self.n_faces(i-1)

if len(target_simplices)==0:
S = dok_matrix((1, len(source_simplices)), dtype=np.float32)
S[0, 0:len(source_simplices)] = 1
else:
source_simplices_dict = {}
for j in range(len(source_simplices)):
source_simplices_dict[source_simplices[j]] = j
target_simplices_dict = {}
for i in range(len(target_simplices)):
target_simplices_dict[target_simplices[i]] = i

S = dok_matrix((len(target_simplices), len(source_simplices)), dtype=np.float32)
for source_simplex in source_simplices:
for a in range(len(source_simplex)):
target_simplex = source_simplex[:a]+source_simplex[(a+1):]
i = target_simplices_dict[target_simplex]
j = source_simplices_dict[source_simplex]
S[i, j] = -1 if a % 2==1 else 1 # S[i, j] = (-1)**a
return S


With the boundary operator, we can calculate the Betti numbers that characterize uniquely the topology of the shapes. Actually it involves the concept of homology groups that we are going to omit. To calculate the k-th Betti numbers, we calculate:

$\beta_k = \text{rank} (\text{ker} \partial_k) - \text{rank} (\text{Im} \partial_{k+1})$.

By rank-nullity theorem, [Jackson]

$\text{rank} (\text{ker} \partial_k) +\text{rank} (\text{Im} \partial_k) = \text{dim} (\partial_k)$

the Betti number is then

$\beta_k = \text{dim} (\partial_k) - \text{rank}(\text{Im} \partial_k)) - \text{rank} (\text{Im} \partial_{k+1})$

where the rank of the image of an operator can be easily computed using the rank method available in numpy. Then the method of calculating the Betti number is

class SimplicialComplex:
...
def betti_number(self, i):
boundop_i = self.boundary_operator(i)
boundop_ip1 = self.boundary_operator(i+1)

if i==0:
boundop_i_rank = 0
else:
try:
boundop_i_rank = np.linalg.matrix_rank(boundop_i.toarray())
except np.linalg.LinAlgError:
boundop_i_rank = boundop_i.shape[1]
try:
boundop_ip1_rank = np.linalg.matrix_rank(boundop_ip1.toarray())
except np.linalg.LinAlgError:
boundop_ip1_rank = boundop_ip1.shape[1]

return ((boundop_i.shape[1]-boundop_i_rank)-boundop_ip1_rank)


If we draw a simplicial complex on a 2-dimensional plane, we almost have $\beta_0$, $\beta_1$ and $\beta_2$. $\beta_0$ indicates the number of components, $\beta_1$ the number of bases for a tunnel, and $\beta_2$ the number of voids.

Let’s have some examples. Suppose we have a triangle, not filled.

e1 = [(0, 1), (1, 2), (2, 0)]
sc1 = SimplicialComplex(e1)


Then the Betti numbers are:


In [5]: sc1.betti_number(0)
Out[5]: 1

In [6]: sc1.betti_number(1)
Out[6]: 1

In [7]: sc1.betti_number(2)
Out[7]: 0


Let’s try another example with multiple components.

e2 = [(1,2), (2,3), (3,1), (4,5,6), (6,7), (7,4)]
sc2 = SimplicialComplex(e2)


We can graphically represent it using networkx:

import networkx as nx
import matplotlib.pyplot as plt
n2 = nx.Graph()
nx.draw(n2)
plt.show()


And its Betti numbers are as follow:


In [13]: sc2.betti_number(0)
Out[13]: 2

In [14]: sc2.betti_number(1)
Out[14]: 2

In [15]: sc2.betti_number(2)
Out[15]: 0


A better illustration is the Wolfram Demonstration, titled “Simplicial Homology of the Alpha Complex”.

On top of the techniques in this current post, we can describe the homology of discrete points using persistent homology, which I will describe in my future posts. I will probably spend a post on homotopy in comparison to other types of quantitative problems.