evomap.metrics

Module for evaluating maps.

Module Contents

Functions

misalign_score(X_t[, normalize])

Calculate misalignment of a sequence of maps.

align_score(X_t)

Calculate alignment of a sequence of maps.

hitrate_score(X, D[, n_neighbors, inc, input_format])

Calculate the Hitrate of nearest neighbor recovery for a single map.

adjusted_hitrate_score(X, D[, n_neighbors, inc, ...])

Calculate the Hitrate of nearest neighbor recovery for a single map, adjusted for random agreement.

avg_hitrate_score(X_t, D_t[, n_neighbors, inc_t, ...])

Calculate the average Hitrate of nearest neighbor recovery for a sequence of maps.

avg_adjusted_hitrate_score(X_t, D_t[, n_neighbors, ...])

Calculate the average Adjusted Hitrate of nearest neighbor recovery for a sequence of maps, adjusted for random agreement.

persistence_score(X_t)

Calculate persistence of a sequence of maps as the average Pearson correlation coefficient between objects’ subsequent map movements.

evomap.metrics.misalign_score(X_t, normalize=True)

Calculate misalignment of a sequence of maps.

Misalignment is measured as the average Euclidean distance between objects’ subsequent map positions. The final score is averaged across all objects.

Parameters:
  • X_t (list of ndarrays, each of shape (n_samples, n_dims)) – Map coordinates.

  • normalize (bool, optional) – If True, misalignment is normalized by the average interobject distance on the map, which is useful for comparing maps across differently scaled coordinate systems. By default True.

Returns:

Misalignment score, bounded within [0, inf). Lower values indicate better alignment.

Return type:

float
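The computation described above can be sketched in plain NumPy. This is only an illustration of the stated definition, not the library's implementation: the names misalign_sketch and mean_interobject_distance are hypothetical, and normalizing by the mean pairwise distance averaged over all maps is an assumption about how the "average interobject distance" is taken.

```python
import numpy as np

def mean_interobject_distance(X):
    """Mean Euclidean distance over all unordered pairs of objects on one map."""
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    rows, cols = np.triu_indices(X.shape[0], k=1)  # each pair counted once
    return dists[rows, cols].mean()

def misalign_sketch(X_t, normalize=True):
    """Average distance objects move between subsequent maps (hypothetical helper)."""
    X = np.stack(X_t)  # shape (n_periods, n_samples, n_dims)
    # Distance each object travels between consecutive maps
    steps = np.linalg.norm(np.diff(X, axis=0), axis=2)
    score = steps.mean()
    if normalize:
        # Assumption: scale by the mean interobject distance, averaged over maps
        score = score / np.mean([mean_interobject_distance(X_i) for X_i in X_t])
    return float(score)

# Two objects 5 units apart; both shift by 1 unit between the two maps
A = np.array([[0.0, 0.0], [3.0, 4.0]])
maps = [A, A + np.array([1.0, 0.0])]
raw = misalign_sketch(maps, normalize=False)
scaled = misalign_sketch(maps, normalize=True)
```

In this toy case every object moves one unit, so the unnormalized value is the step length itself, while the normalized value expresses that step relative to the typical distance between objects.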

evomap.metrics.align_score(X_t)

Calculate alignment of a sequence of maps.

Alignment is measured as the mean cosine similarity of objects’ subsequent map positions. The final score is averaged across all objects. Cosine similarity measures the cosine of the angle between two vectors, thus providing a scale-invariant metric of similarity.

Parameters:

X_t (list of ndarray, each of shape (n_samples, n_dims)) – Sequence of map coordinates.

Returns:

Alignment score, bounded within [-1, 1]. Higher values indicate better alignment.

Return type:

float
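To illustrate the scale invariance mentioned above, here is a minimal NumPy sketch of the mean cosine similarity between objects' subsequent positions. The name align_sketch is hypothetical, and the sketch assumes no object sits exactly at the origin (where the cosine is undefined); the library may handle that case differently.

```python
import numpy as np

def align_sketch(X_t):
    """Mean cosine similarity of each object's positions on consecutive maps."""
    X = np.stack(X_t)            # shape (n_periods, n_samples, n_dims)
    prev, nxt = X[:-1], X[1:]    # positions at t and t+1
    dots = np.sum(prev * nxt, axis=2)
    norms = np.linalg.norm(prev, axis=2) * np.linalg.norm(nxt, axis=2)
    # Assumption: all position vectors have non-zero length
    return float(np.mean(dots / norms))

base = np.array([[1.0, 0.0], [0.0, 2.0]])
stretched = np.array([[2.0, 0.0], [0.0, 5.0]])   # same directions, different scale
rotated = np.array([[0.0, 1.0], [-2.0, 0.0]])    # every object rotated by 90 degrees

same_direction = align_sketch([base, stretched])
orthogonal = align_sketch([base, rotated])
opposite = align_sketch([base, -base])
```

Stretching the map leaves the score at its maximum because only the angle between position vectors matters, whereas a rotation or a reflection through the origin lowers it.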

evomap.metrics.hitrate_score(X, D, n_neighbors=10, inc=None, input_format='dissimilarity')

Calculate the Hitrate of nearest neighbor recovery for a single map. The score is averaged across all objects.

Parameters:
  • X (ndarray of shape (n_samples, d)) – Map coordinates.

  • D (ndarray) – Input data, either a similarity/dissimilarity matrix of shape (n_samples, n_samples), or a matrix of feature vectors of shape (n_samples, d_input).

  • n_neighbors (int, optional) – Number of neighbors considered when calculating the hitrate, by default 10.

  • inc (ndarray of shape (n_samples,), optional) – Inclusion array, indicating whether each object is present (via 0s and 1s), by default None.

  • input_format (str, optional) – One of ‘vector’, ‘similarity’, or ‘dissimilarity’, indicating the type of the input D, by default ‘dissimilarity’.

Returns:

Hitrate of nearest neighbor recovery, bounded within [0,1]. Higher values indicate better recovery.

Return type:

float

Raises:

ValueError – If the input dimensions do not match or an unsupported input format is provided.
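The idea behind the hitrate can be sketched as follows: for each object, compare its n_neighbors nearest neighbors on the map with its nearest neighbors in the input data and record the share that coincide. This is a hedged illustration for the 'dissimilarity' input format only; hitrate_sketch, pairwise_distances, and knn_indices are hypothetical helpers, and the library's handling of ties, inclusion arrays, and other input formats is not reproduced here.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix between all rows of X."""
    return np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

def knn_indices(dist, k):
    """Indices of the k nearest neighbors per row, excluding the object itself."""
    d = np.asarray(dist, dtype=float).copy()
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]

def hitrate_sketch(X, D, n_neighbors=10):
    """Share of input-space neighbors recovered on the map, averaged over objects."""
    nn_map = knn_indices(pairwise_distances(X), n_neighbors)
    nn_input = knn_indices(D, n_neighbors)
    hits = [len(set(a) & set(b)) for a, b in zip(nn_map, nn_input)]
    return float(np.mean(hits) / n_neighbors)

# Five objects on a line; D_good matches the map, D_bad reverses the ordering
X = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [7.0, 0.0], [15.0, 0.0]])
D_good = pairwise_distances(X)
D_bad = D_good.max() - D_good

perfect = hitrate_sketch(X, D_good, n_neighbors=2)
worst = hitrate_sketch(X, D_bad, n_neighbors=1)
```

When the dissimilarities equal the map distances, every neighbor is recovered; when the ordering is reversed, the single nearest neighbor on the map is never the nearest neighbor in the data.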

evomap.metrics.adjusted_hitrate_score(X, D, n_neighbors=10, inc=None, input_format='dissimilarity')

Calculate the Hitrate of nearest neighbor recovery for a single map, adjusted for random agreement. The score is averaged across all objects.

Parameters:
  • X (ndarray) – Map coordinates, shape (n_samples, d).

  • D (ndarray) – Input data, either a similarity/dissimilarity matrix of shape (n_samples, n_samples), or a matrix of feature vectors of shape (n_samples, d_input).

  • n_neighbors (int, optional) – Number of neighbors considered when calculating the hitrate, by default 10.

  • inc (ndarray of shape (n_samples,), optional) – Inclusion array, indicating whether each object is present (via 0s and 1s), by default None.

  • input_format (str, optional) – One of ‘vector’, ‘similarity’, or ‘dissimilarity’, by default ‘dissimilarity’.

Returns:

Adjusted Hitrate of nearest neighbor recovery, bounded within [0,1]. Higher values indicate better recovery. Adjusted hitrate corrects the raw hitrate by the expected hitrate due to chance.

Return type:

float

Raises:

ValueError – If parameters are outside the expected range or the input dimensions do not match.
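As a rough illustration of the chance correction: if an object's n_neighbors map neighbors were drawn at random from the other n_samples - 1 objects, the expected share of true neighbors among them would be n_neighbors / (n_samples - 1). The sketch below subtracts that baseline from a raw hitrate. Both function names are hypothetical, and the simple subtraction is an assumption; the library may apply a different correction or rescaling.

```python
def expected_random_hitrate(n_samples, n_neighbors):
    """Expected hitrate when map neighbors are picked uniformly at random."""
    return n_neighbors / (n_samples - 1)

def adjusted_hitrate_sketch(raw_hitrate, n_samples, n_neighbors):
    """Raw hitrate minus the chance baseline (assumed form of the adjustment)."""
    return raw_hitrate - expected_random_hitrate(n_samples, n_neighbors)

# With 11 objects and 5 neighbors, random placement already recovers half
baseline = expected_random_hitrate(n_samples=11, n_neighbors=5)
adjusted = adjusted_hitrate_sketch(0.8, n_samples=11, n_neighbors=5)
```

The example shows why the adjustment matters for small maps: a raw hitrate that looks high can sit only modestly above what random placement would achieve.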

evomap.metrics.avg_hitrate_score(X_t, D_t, n_neighbors=10, inc_t=None, input_format='dissimilarity')

Calculate the average Hitrate of nearest neighbor recovery for a sequence of maps. The score is averaged across all maps within the sequence.

Parameters:
  • X_t (list of ndarray) – List of map coordinates for each time period, each of shape (n_samples, d).

  • D_t (list of ndarray) – List of input data matrices for each time period, each either a similarity/dissimilarity matrix of shape (n_samples, n_samples), or a matrix of feature vectors of shape (n_samples, d_input).

  • n_neighbors (int, optional) – Number of neighbors considered when calculating the hitrate for each map, by default 10.

  • inc_t (list of ndarray, optional) – List of inclusion arrays for each time period, each indicating whether an object is present (via 0s and 1s), by default None. If provided, each should match the number of samples in the corresponding X and D.

  • input_format (str, optional) – Specifies the input format of D_t, one of ‘vector’, ‘similarity’, or ‘dissimilarity’, by default ‘dissimilarity’.

Returns:

Average hitrate of nearest neighbor recovery, bounded within [0, 1]. Higher values indicate better recovery.

Return type:

float

Raises:

ValueError – If there are inconsistencies in array sizes or an unsupported input format is specified.
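The per-period averaging, including one plausible way to apply inclusion arrays, might look like the sketch below. It assumes that a value of 1 marks an object as present, that absent objects are dropped from both X and a square D before scoring, and that score_fn is any single-map metric taking (X, D). The names subset_included, average_over_periods, and score_fn are hypothetical and do not come from the library.

```python
import numpy as np

def subset_included(X, D, inc):
    """Keep only objects flagged as present (assumes D is square)."""
    keep = np.asarray(inc) == 1
    return X[keep], D[np.ix_(keep, keep)]

def average_over_periods(score_fn, X_t, D_t, inc_t=None):
    """Apply a single-map metric to each period and average the results."""
    scores = []
    for t, (X, D) in enumerate(zip(X_t, D_t)):
        if inc_t is not None:
            X, D = subset_included(X, D, inc_t[t])
        scores.append(score_fn(X, D))
    return float(np.mean(scores))

# Toy metric that just reports how many objects reached it
count_objects = lambda X, D: X.shape[0]

X_t = [np.zeros((4, 2)), np.zeros((4, 2))]
D_t = [np.zeros((4, 4)), np.zeros((4, 4))]
inc_t = [np.array([1, 1, 1, 1]), np.array([1, 0, 1, 0])]

avg_count = average_over_periods(count_objects, X_t, D_t, inc_t)
```

The toy metric makes the filtering visible: four objects are scored in the first period and two in the second, so the average reflects both.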

evomap.metrics.avg_adjusted_hitrate_score(X_t, D_t, n_neighbors=10, inc_t=None, input_format='dissimilarity')

Calculate the average Adjusted Hitrate of nearest neighbor recovery for a sequence of maps, adjusted for random agreement. The score is averaged across all maps within the sequence.

Parameters:
  • X_t (list of ndarray) – List of map coordinates for each time period, each of shape (n_samples, d).

  • D_t (list of ndarray) – List of input data matrices for each time period, each either a similarity/dissimilarity matrix of shape (n_samples, n_samples), or a matrix of feature vectors of shape (n_samples, d_input).

  • n_neighbors (int, optional) – Number of neighbors considered when calculating the adjusted hitrate for each map, by default 10.

  • inc_t (list of ndarray, optional) – List of inclusion arrays for each time period, each indicating whether an object is present (via 0s and 1s), by default None. If provided, each should match the number of samples in the corresponding X and D.

  • input_format (str, optional) – Specifies the input format of D_t, one of ‘vector’, ‘similarity’, or ‘dissimilarity’, by default ‘dissimilarity’.

Returns:

Average adjusted hitrate of nearest neighbor recovery, bounded within [0, 1]. Higher values indicate better recovery.

Return type:

float

Raises:

ValueError – If there are inconsistencies in array sizes or an unsupported input format is specified.

evomap.metrics.persistence_score(X_t)

Calculate persistence of a sequence of maps as the average Pearson correlation coefficient between objects’ subsequent map movements (i.e., the first differences of their map positions). The score is averaged across all objects.

Parameters:

X_t (list of ndarrays, each of shape (n_samples, n_dims)) – Sequence of map coordinates. Each ndarray represents a map at a different time.

Returns:

Persistence score, bounded within [-1, 1]. Higher positive values indicate higher persistence of map movements across time periods.

Return type:

float

Raises:

ValueError – If fewer than two maps are provided or if the maps do not have consistent dimensions.
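One possible reading of the persistence definition is sketched below: take first differences of the map positions, then correlate the movements in each period with the movements in the following period, and average over periods. This is an assumption about how the correlation is arranged, not the library's implementation; persistence_sketch is a hypothetical name, and this variant needs at least three maps so that two sets of movements exist.

```python
import numpy as np

def persistence_sketch(X_t):
    """Average lag-1 Pearson correlation of map movements (one possible reading)."""
    if len(X_t) < 3:
        raise ValueError("this sketch needs at least three maps")
    moves = np.diff(np.stack(X_t), axis=0)  # shape (n_periods - 1, n_samples, n_dims)
    corrs = [
        np.corrcoef(moves[t].ravel(), moves[t + 1].ravel())[0, 1]
        for t in range(len(moves) - 1)
    ]
    return float(np.mean(corrs))

start = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
step = np.array([[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0]])

# Objects keep moving in the same direction in every period
steady = persistence_sketch([start, start + step, start + 2 * step])
# Objects move out and then straight back
reverting = persistence_sketch([start, start + step, start])
```

Steady drift in one direction gives the highest possible value under this reading, while movements that are immediately undone give the lowest.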