Identify Significant Features in Persistent Homology

Uses an empirical (bootstrap) method to separate features that represent signal from those that represent noise, based on the magnitude of each feature's persistence relative to the others. Note: you must have at least 5 features of a given dimension to use this function.

id_significant(features, dim = 1, reps = 100, cutoff = 0.975)

Arguments

features	n x 3 data frame of features (one row per feature); the first column must be dimension, the second birth, and the third death
dim	dimension of features of interest
reps	number of bootstrap replicates
cutoff	quantile cutoff (between 0 and 1) past which features are considered significant
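
The bootstrap idea described above can be sketched in base R. This is only one plausible reading of the method, not the package's actual implementation: the helper name `bootstrap_threshold`, the choice to resample persistence values with replacement, and the use of the replicate mean as the summary statistic are all assumptions made for illustration.

```r
# Hypothetical helper (not part of any package): bootstrap a persistence threshold.
# Assumes `features` has dimension, birth, and death in columns 1, 2, and 3.
bootstrap_threshold <- function(features, dim = 1, reps = 100, cutoff = 0.975) {
  features <- as.matrix(features)

  # keep only the rows for the requested dimension
  feats <- features[features[, 1] == dim, , drop = FALSE]
  if (nrow(feats) < 5) {
    stop("need at least 5 features of the given dimension")
  }

  # persistence of each feature = death - birth
  persist <- feats[, 3] - feats[, 2]

  # resample persistence values with replacement and record each replicate's mean
  # (assumption: the mean is the bootstrapped statistic)
  boot_means <- vapply(
    seq_len(reps),
    function(i) mean(sample(persist, replace = TRUE)),
    numeric(1)
  )

  # the threshold is the `cutoff` quantile of the bootstrap distribution
  unname(stats::quantile(boot_means, probs = cutoff))
}

set.seed(1)
# toy input: nine short-lived loops and one long-lived loop
toy <- data.frame(dimension = rep(1, 10),
                  birth     = rep(0.1, 10),
                  death     = c(rep(0.15, 9), 1.1))
toy_thresh <- bootstrap_threshold(toy, dim = 1, reps = 200)
```

In this sketch, raising `reps` stabilizes the estimated quantile, and raising `cutoff` makes the threshold stricter.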

Examples

# get dataset (noisy circle) and calculate persistent homology
angles <- runif(100, 0, 2 * pi)
x <- cos(angles) + rnorm(100, mean = 0, sd = 0.1)
y <- sin(angles) + rnorm(100, mean = 0, sd = 0.1)
annulus <- cbind(x, y)
phom <- calculate_homology(annulus)

# find threshold of significance
# expecting 1 significant feature of dimension 1 (Betti-1 = 1 for annulus)
thresh <- id_significant(features = as.data.frame(phom),
                         dim = 1,
                         reps = 500,
                         cutoff = 0.975)

# generate flat persistence diagram (persistence on the vertical axis)
# every feature whose persistence is higher than `thresh` is significant
plot_persist(phom, flat = TRUE)
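
To list the significant features rather than read them off the plot, the threshold can be compared against each feature's persistence. The sketch below is self-contained and uses a small hand-made matrix in place of `phom` and a made-up value in place of `thresh`; the names `phom_toy` and `thresh_toy` are hypothetical, and it assumes the threshold is expressed on the persistence (death minus birth) scale.

```r
# toy stand-in for a feature matrix: columns are dimension, birth, death
phom_toy <- cbind(dimension = c(0, 1, 1, 1),
                  birth     = c(0, 0.2, 0.3, 0.25),
                  death     = c(0.1, 0.25, 0.35, 1.2))

# hypothetical threshold; in practice this would be a value such as `thresh`
thresh_toy <- 0.4

# persistence of each feature = death (column 3) - birth (column 2)
persist <- phom_toy[, 3] - phom_toy[, 2]

# keep dimension-1 features whose persistence exceeds the threshold
keep <- phom_toy[, 1] == 1 & persist > thresh_toy
significant <- phom_toy[keep, , drop = FALSE]
```

With the real objects from the example, the same indexing would apply to `phom` and `thresh`, provided `phom` follows the dimension, birth, death column order described under Arguments.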
