evomap.metrics
Module for evaluating maps.
Module Contents
Functions

misalign_score – Calculate misalignment of a sequence of maps.
align_score – Calculate alignment of a sequence of maps.
hitrate_score – Calculate the Hitrate of nearest neighbor recovery for a single map.
adjusted_hitrate_score – Calculate the Hitrate of nearest neighbor recovery for a single map, adjusted for random agreement.
avg_hitrate_score – Calculate the average Hitrate of nearest neighbor recovery for a sequence of maps.
avg_adjusted_hitrate_score – Calculate the average Adjusted Hitrate of nearest neighbor recovery for a sequence of maps, adjusted for random agreement.
persistence_score – Calculate persistence of a sequence of maps as the average Pearson correlation coefficient between objects’ subsequent map movements.
 evomap.metrics.misalign_score(X_t, normalize=True)
Calculate misalignment of a sequence of maps.
Misalignment is measured as the average Euclidean distance between objects’ subsequent map positions. The final score is averaged across all objects.
 Parameters:
X_t (list of ndarrays, each of shape (n_samples, n_dims)) – Map coordinates.
normalize (bool, optional) – If True, misalignment is normalized by the average inter-object distance on the map, which is useful for comparing maps across differently scaled coordinate systems. By default True.
 Returns:
Misalignment score, bounded within [0, inf). Lower values indicate better alignment.
 Return type:
float
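The definition above can be sketched in plain Python, with lists of coordinate tuples standing in for ndarrays. This is an illustration of the described metric, not the library's implementation: the name `misalign_sketch` is hypothetical, and normalizing each step by the mean pairwise distance within the earlier map is an assumption about how the normalization is applied.

```python
import math

def misalign_sketch(X_t, normalize=True):
    """Average distance each object moves between consecutive maps."""
    n = len(X_t[0])
    steps = []
    for prev, curr in zip(X_t, X_t[1:]):
        # Mean Euclidean displacement of all objects between two periods.
        moved = sum(math.dist(p, c) for p, c in zip(prev, curr)) / n
        if normalize:
            # Assumption: scale by the mean pairwise distance within the earlier map.
            pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
            scale = sum(math.dist(prev[i], prev[j]) for i, j in pairs) / len(pairs)
            moved /= scale
        steps.append(moved)
    return sum(steps) / len(steps)

maps = [
    [(0.0, 0.0), (3.0, 4.0)],
    [(0.0, 0.0), (3.0, 4.0)],  # nothing moves
    [(0.0, 5.0), (3.0, 9.0)],  # every object shifts by 5 units
]
print(misalign_sketch(maps, normalize=False))  # 2.5
print(misalign_sketch(maps, normalize=True))   # 0.5
```

With normalization, the same movement is expressed relative to the spread of the map, so rescaling all coordinates by a constant leaves the score unchanged.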
 evomap.metrics.align_score(X_t)
Calculate alignment of a sequence of maps.
Alignment is measured as the mean cosine similarity of objects’ subsequent map positions. The final score is averaged across all objects. Cosine similarity measures the cosine of the angle between two vectors, thus providing a scale-invariant metric of similarity.
 Parameters:
X_t (list of ndarray, each of shape (n_samples, n_dims)) – Sequence of map coordinates.
 Returns:
Alignment score, bounded within [-1, 1]. Higher values indicate better alignment.
 Return type:
float
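As a rough illustration of the mean cosine similarity described above, the following sketch uses plain Python lists in place of ndarrays. The names `cosine` and `align_sketch` are hypothetical, and the sketch assumes no object sits exactly at the origin (a zero vector would cause a division by zero).

```python
import math

def cosine(u, v):
    """Cosine of the angle between two position vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def align_sketch(X_t):
    """Mean cosine similarity of each object's positions in consecutive maps."""
    sims = [
        cosine(p, c)
        for prev, curr in zip(X_t, X_t[1:])
        for p, c in zip(prev, curr)
    ]
    return sum(sims) / len(sims)

maps = [
    [(1.0, 0.0), (0.0, 2.0)],
    [(3.0, 0.0), (0.0, -1.0)],  # first object keeps its direction, second flips
]
print(align_sketch(maps))  # 0.0
```

The first object moves three times farther from the origin but keeps its direction, so it contributes 1; the second flips to the opposite direction and contributes -1.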
 evomap.metrics.hitrate_score(X, D, n_neighbors=10, inc=None, input_format='dissimilarity')
Calculate the Hitrate of nearest neighbor recovery for a single map. The score is averaged across all objects.
 Parameters:
X (ndarray of shape (n_samples, d)) – Map coordinates.
D (ndarray) – Input data, either a similarity/dissimilarity matrix of shape (n_samples, n_samples), or a matrix of feature vectors of shape (n_samples, d_input).
n_neighbors (int, optional) – Number of neighbors considered when calculating the hitrate, by default 10.
inc (ndarray of shape (n_samples,), optional) – Inclusion array of 0s and 1s, indicating whether each object is present, by default None.
input_format (str, optional) – One of ‘vector’, ‘similarity’, or ‘dissimilarity’, indicating the type of the input D, by default ‘dissimilarity’.
 Returns:
Hitrate of nearest neighbor recovery, bounded within [0,1]. Higher values indicate better recovery.
 Return type:
float
 Raises:
ValueError – If the input dimensions mismatch or unsupported input format is provided.
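A minimal sketch of nearest neighbor recovery, assuming a dissimilarity matrix as input and no ties among distances. It compares, for each object, the set of its k nearest neighbors in D with the set of its k nearest neighbors on the map. The names `knn` and `hitrate_sketch` are hypothetical; this is not the library's code.

```python
import math

def knn(dist_row, i, k):
    """Indices of the k objects closest to object i, excluding i itself."""
    others = [j for j in range(len(dist_row)) if j != i]
    return set(sorted(others, key=lambda j: dist_row[j])[:k])

def hitrate_sketch(X, D, n_neighbors):
    """Average share of input-space neighbors that are also map neighbors."""
    n = len(X)
    total = 0.0
    for i in range(n):
        # Distances from object i to every object on the map.
        map_row = [math.dist(X[i], X[j]) for j in range(n)]
        hits = knn(D[i], i, n_neighbors) & knn(map_row, i, n_neighbors)
        total += len(hits) / n_neighbors
    return total / n

# Dissimilarities between four objects located at 0, 1, 3 and 7 on a line.
D = [
    [0, 1, 3, 7],
    [1, 0, 2, 6],
    [3, 2, 0, 4],
    [7, 6, 4, 0],
]
faithful = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
scrambled = [(0.0, 0.0), (7.0, 0.0), (3.0, 0.0), (1.0, 0.0)]  # objects 1 and 3 swapped

print(hitrate_sketch(faithful, D, n_neighbors=1))   # 1.0
print(hitrate_sketch(scrambled, D, n_neighbors=1))  # 0.0
```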
 evomap.metrics.adjusted_hitrate_score(X, D, n_neighbors=10, inc=None, input_format='dissimilarity')
Calculate the Hitrate of nearest neighbor recovery for a single map, adjusted for random agreement. The score is averaged across all objects.
 Parameters:
X (ndarray) – Map coordinates, shape (n_samples, d).
D (ndarray) – Input data, either a similarity/dissimilarity matrix of shape (n_samples, n_samples), or a matrix of feature vectors of shape (n_samples, d_input).
n_neighbors (int, optional) – Number of neighbors considered when calculating the hitrate, by default 10.
inc (ndarray of shape (n_samples,), optional) – Inclusion array of 0s and 1s, indicating whether each object is present, by default None.
input_format (str, optional) – One of ‘vector’, ‘similarity’, or ‘dissimilarity’, by default ‘dissimilarity’.
 Returns:
Adjusted Hitrate of nearest neighbor recovery, bounded within [0,1]. Higher values indicate better recovery. Adjusted hitrate corrects the raw hitrate by the expected hitrate due to chance.
 Return type:
float
 Raises:
ValueError – If parameters are out of expected range or input dimensions mismatch.
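To make the chance correction concrete: if k neighbors were picked at random among the other n_samples - 1 objects, the expected hitrate would be k / (n_samples - 1). The sketch below assumes the adjustment is a simple subtraction of that chance level; the exact form of the correction is an assumption here, and both function names are hypothetical.

```python
def chance_hitrate(n_samples, n_neighbors):
    """Expected hitrate when map neighbors are chosen uniformly at random."""
    return n_neighbors / (n_samples - 1)

def adjusted_sketch(raw_hitrate, n_samples, n_neighbors):
    # Assumption: the adjustment subtracts the chance level from the raw hitrate.
    return raw_hitrate - chance_hitrate(n_samples, n_neighbors)

print(chance_hitrate(21, 10))         # 0.5
print(adjusted_sketch(0.75, 21, 10))  # 0.25
```

The example shows why the adjustment matters: with 21 objects and 10 neighbors, a raw hitrate of 0.75 is only 0.25 above what random placement would achieve.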
 evomap.metrics.avg_hitrate_score(X_t, D_t, n_neighbors=10, inc_t=None, input_format='dissimilarity')
Calculate the average Hitrate of nearest neighbor recovery for a sequence of maps. The score is averaged across all maps within the sequence.
 Parameters:
X_t (list of ndarray) – List of map coordinates for each time period, each of shape (n_samples, d).
D_t (list of ndarray) – List of input data matrices for each time period, each either a similarity/dissimilarity matrix of shape (n_samples, n_samples), or a matrix of feature vectors of shape (n_samples, d_input).
n_neighbors (int, optional) – Number of neighbors considered when calculating the hitrate for each map, by default 10.
inc_t (list of ndarray, optional) – List of inclusion arrays of 0s and 1s, one per time period, each indicating whether an object is present, by default None. If provided, each array should match the number of samples in the corresponding X and D.
input_format (str, optional) – Specifies the input format of D_t, one of ‘vector’, ‘similarity’, or ‘dissimilarity’, by default ‘dissimilarity’.
 Returns:
Average hitrate of nearest neighbor recovery, bounded between [0,1]. Higher values indicate better recovery.
 Return type:
float
 Raises:
ValueError – If there are inconsistencies in array sizes or unsupported input format is specified.
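The role of the inclusion arrays can be illustrated with a small sketch that averages per-map hitrates over a sequence. It assumes that objects flagged 0 are dropped from both the map and the dissimilarity matrix before scoring, which is one plausible reading of the parameter description, not confirmed library behavior. The helper names `hitrate` and `avg_hitrate_sketch` are hypothetical.

```python
import math

def hitrate(X, D, k):
    """Share of each object's k nearest neighbors in D that are also nearest on the map."""
    n = len(X)

    def knn(row, i):
        others = sorted((j for j in range(n) if j != i), key=lambda j: row[j])
        return set(others[:k])

    hits = 0
    for i in range(n):
        map_row = [math.dist(X[i], X[j]) for j in range(n)]
        hits += len(knn(D[i], i) & knn(map_row, i))
    return hits / (n * k)

def avg_hitrate_sketch(X_t, D_t, n_neighbors, inc_t=None):
    """Mean of the per-map hitrates across all periods."""
    scores = []
    for t, (X, D) in enumerate(zip(X_t, D_t)):
        if inc_t is not None:
            # Assumption: objects flagged 0 are dropped before scoring.
            keep = [i for i, flag in enumerate(inc_t[t]) if flag == 1]
            X = [X[i] for i in keep]
            D = [[D[i][j] for j in keep] for i in keep]
        scores.append(hitrate(X, D, n_neighbors))
    return sum(scores) / len(scores)

D = [[0, 1, 3, 7], [1, 0, 2, 6], [3, 2, 0, 4], [7, 6, 4, 0]]
X1 = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
X2 = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (0.4, 0.0)]  # last object absent, placeholder position
inc_t = [[1, 1, 1, 1], [1, 1, 1, 0]]

print(hitrate(X2, D, 1))                                # 0.25
print(avg_hitrate_sketch([X1, X2], [D, D], 1, inc_t))   # 1.0
```

Scoring the second map without the mask lets the absent object's placeholder position distort the neighborhoods; with the mask, only the three objects that are present are compared.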
 evomap.metrics.avg_adjusted_hitrate_score(X_t, D_t, n_neighbors=10, inc_t=None, input_format='dissimilarity')
Calculate the average Adjusted Hitrate of nearest neighbor recovery for a sequence of maps, adjusted for random agreement. The score is averaged across all maps within the sequence.
 Parameters:
X_t (list of ndarray) – List of map coordinates for each time period, each of shape (n_samples, d).
D_t (list of ndarray) – List of input data matrices for each time period, each either a similarity/dissimilarity matrix of shape (n_samples, n_samples), or a matrix of feature vectors of shape (n_samples, d_input).
n_neighbors (int, optional) – Number of neighbors considered when calculating the adjusted hitrate for each map, by default 10.
inc_t (list of ndarray, optional) – List of inclusion arrays of 0s and 1s, one per time period, each indicating whether an object is present, by default None. If provided, each array should match the number of samples in the corresponding X and D.
input_format (str, optional) – Specifies the input format of D_t, one of ‘vector’, ‘similarity’, or ‘dissimilarity’, by default ‘dissimilarity’.
 Returns:
Average adjusted hitrate of nearest neighbor recovery, bounded between [0,1]. Higher values indicate better recovery.
 Return type:
float
 Raises:
ValueError – If there are inconsistencies in array sizes or unsupported input format is specified.
 evomap.metrics.persistence_score(X_t)
Calculate persistence of a sequence of maps as the average Pearson correlation coefficient between objects’ subsequent map movements (i.e., the first differences of their map positions). The score is averaged across all objects.
 Parameters:
X_t (list of ndarrays, each of shape (n_samples, n_dims)) – Sequence of map coordinates. Each ndarray represents a map at a different time.
 Returns:
Persistence score, bounded within (-1, 1). Higher positive values indicate higher persistence of map movements across time periods.
 Return type:
float
 Raises:
ValueError – If fewer than two maps are provided or if maps do not have consistent dimensions.
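One way to read the definition above is sketched below: take the first differences of the map positions, flatten each period's movements into one vector, correlate consecutive movement vectors, and average the correlations. Flattening across objects and dimensions is an assumption of this sketch, and the names `pearson` and `persistence_sketch` are hypothetical. The sketch needs at least three maps so that two consecutive movements exist.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def persistence_sketch(X_t):
    # First differences: every object's movement between consecutive maps,
    # flattened into one vector per period (an assumption of this sketch).
    moves = [
        [c - p for prev_pt, curr_pt in zip(prev, curr) for p, c in zip(prev_pt, curr_pt)]
        for prev, curr in zip(X_t, X_t[1:])
    ]
    corrs = [pearson(a, b) for a, b in zip(moves, moves[1:])]
    return sum(corrs) / len(corrs)

steady = [
    [(0.0, 0.0), (0.0, 0.0)],
    [(1.0, 0.0), (0.0, 2.0)],
    [(3.0, 0.0), (0.0, 6.0)],  # objects keep moving in the same directions
]
reversing = [
    [(0.0, 0.0), (0.0, 0.0)],
    [(1.0, 0.0), (0.0, 2.0)],
    [(0.0, 0.0), (0.0, 0.0)],  # objects move back to where they started
]
print(persistence_sketch(steady))     # close to 1
print(persistence_sketch(reversing))  # close to -1
```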