Transforms the challenge of understanding high-dimensional data into graph analysis, a more intuitive domain for human interpretation.
Given: Dataset \( X \subset \mathbb{R}^n \) and filter function \( f: X \to \mathbb{R}^d \)
\( f: X \to \mathbb{R}^d \) projects the data to lower dimensions. Common choices: PCA (the most common, 47% usage), t-SNE, UMAP, MDS, and domain-specific functions.
Create overlapping intervals \(U_i\) covering \(f(X)\), using either a uniform or a balanced cover. The overlap is typically 20-50%; adaptive methods are emerging.
Cluster the points in each preimage \(f^{-1}(U_i)\): HACA (most popular), DBSCAN (density-based), or custom algorithms.
Connect clusters sharing data points: nodes = clusters, edges = shared points \(\Rightarrow\) topological graph \(G = (N, E)\)
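The steps above (filter, cover, cluster, connect) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation behind the results reported here: it assumes a one-dimensional filter (\(d = 1\)), a uniform cover, and a simple distance-threshold clustering standing in for HACA or DBSCAN. The names `uniform_cover`, `threshold_clusters`, and `mapper_graph` are hypothetical.

```python
import math
from itertools import combinations


def uniform_cover(lo, hi, n_intervals, overlap):
    """Equal-length intervals covering [lo, hi]; neighbours share `overlap` of their length."""
    # The length L solves: L + (n - 1) * L * (1 - overlap) = hi - lo
    length = (hi - lo) / (1 + (n_intervals - 1) * (1 - overlap))
    step = length * (1 - overlap)
    intervals = [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]
    intervals[-1] = (intervals[-1][0], hi)  # guard against floating-point round-off
    return intervals


def threshold_clusters(points, indices, eps):
    """Assumed stand-in clustering: connected components of the graph
    linking points at distance <= eps (a single-linkage cut)."""
    unvisited = set(indices)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in unvisited if math.dist(points[i], points[j]) <= eps}
            unvisited -= near
            component |= near
            frontier.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper_graph(points, filter_fn, n_intervals=4, overlap=0.3, eps=0.5):
    """Return (nodes, edges): nodes are clusters of point indices,
    edges join clusters that share at least one point."""
    values = [filter_fn(p) for p in points]
    nodes = []
    for a, b in uniform_cover(min(values), max(values), n_intervals, overlap):
        preimage = [i for i, v in enumerate(values) if a <= v <= b]
        nodes.extend(threshold_clusters(points, preimage, eps))
    edges = {
        (u, v)
        for u, v in combinations(range(len(nodes)), 2)
        if nodes[u] & nodes[v]
    }
    return nodes, edges


# Toy data: 40 points on the unit circle, filtered by their x-coordinate.
circle = [
    (math.cos(2 * math.pi * k / 40), math.sin(2 * math.pi * k / 40))
    for k in range(40)
]
nodes, edges = mapper_graph(circle, lambda p: p[0], n_intervals=4, overlap=0.3, eps=0.3)
print(len(nodes), "nodes,", len(edges), "edges")
```

On this toy circle the resulting graph should be a single closed loop, which is the sense in which the graph \(G = (N, E)\) summarises the shape of the data. The overlap matters: with zero overlap no two clusters would share points and the graph would have no edges.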
Given graphs \(G_1, G_2\) that are WL-equivalent (indistinguishable by the Weisfeiler-Leman test), construct covers \(U_1, U_2\) such that:
| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| knot0009 | 1.00 | 1.00 | 1.00 | 28 |
| knot0020 | 0.92 | 1.00 | 0.96 | 24 |
| knot0021 | 1.00 | 0.80 | 0.89 | 10 |
| knot0034 | 1.00 | 0.92 | 0.96 | 24 |
| knot0035 | 0.88 | 1.00 | 0.93 | 14 |
| accuracy | | | 0.96 | 100 |
| macro avg | 0.96 | 0.94 | 0.95 | 100 |
| weighted avg | 0.96 | 0.96 | 0.96 | 100 |

| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| knot0009 | 1.00 | 0.60 | 0.75 | 25 |
| knot0020 | 1.00 | 1.00 | 1.00 | 25 |
| knot0021 | 1.00 | 1.00 | 1.00 | 25 |
| knot0034 | 1.00 | 0.96 | 0.98 | 25 |
| knot0035 | 0.69 | 1.00 | 0.82 | 25 |
| accuracy | | | 0.91 | 125 |
| macro avg | 0.94 | 0.91 | 0.91 | 125 |
| weighted avg | 0.94 | 0.91 | 0.91 | 125 |
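The summary rows of the two reports above follow from the per-class rows: the macro average is the unweighted mean over classes, the weighted average weights each class by its support, and accuracy equals the support-weighted recall. The short check below recomputes them. It is only a sketch: the helper names are hypothetical, and because the per-class values are already rounded to two decimals, agreement is expected only after rounding.

```python
# Per-class rows copied from the two tables: (precision, recall, f1, support).
report_1 = {
    "knot0009": (1.00, 1.00, 1.00, 28),
    "knot0020": (0.92, 1.00, 0.96, 24),
    "knot0021": (1.00, 0.80, 0.89, 10),
    "knot0034": (1.00, 0.92, 0.96, 24),
    "knot0035": (0.88, 1.00, 0.93, 14),
}
report_2 = {
    "knot0009": (1.00, 0.60, 0.75, 25),
    "knot0020": (1.00, 1.00, 1.00, 25),
    "knot0021": (1.00, 1.00, 1.00, 25),
    "knot0034": (1.00, 0.96, 0.98, 25),
    "knot0035": (0.69, 1.00, 0.82, 25),
}
PRECISION, RECALL, F1, SUPPORT = range(4)


def macro_avg(rows, col):
    """Unweighted mean of one metric over all classes."""
    return sum(r[col] for r in rows.values()) / len(rows)


def weighted_avg(rows, col):
    """Mean of one metric weighted by each class's support."""
    total = sum(r[SUPPORT] for r in rows.values())
    return sum(r[col] * r[SUPPORT] for r in rows.values()) / total


def accuracy(rows):
    """Overall accuracy, which equals the support-weighted recall."""
    return weighted_avg(rows, RECALL)


for name, rows in [("first report", report_1), ("second report", report_2)]:
    macro = [round(macro_avg(rows, c), 2) for c in (PRECISION, RECALL, F1)]
    weighted = [round(weighted_avg(rows, c), 2) for c in (PRECISION, RECALL, F1)]
    print(name, "macro:", macro, "weighted:", weighted, "accuracy:", round(accuracy(rows), 2))
```

In the second report every class has the same support (25), so the macro and weighted averages coincide; in the first report they differ because the supports are unequal.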