evomap.mapping._sammon

Nonlinear Sammon Mapping, as proposed in:

Sammon, J. W. (1969). A nonlinear mapping for data structure analysis. IEEE Transactions on Computers, C-18(5), 401-409.

Classes

Sammon([n_dims, n_iter, n_iter_check, init, ...])

Functions

_check_prepare_input_sammon(D)

Check and prepare the input distance matrix for Sammon Mapping.

_sammon_stress_function(positions, disparities[, ...])

Compute the Sammon stress function and its gradient.

_sammon_stress_gradient(Y, D_map, D)

Compute the gradient of the Sammon stress function.

Module Contents

class evomap.mapping._sammon.Sammon(n_dims=2, n_iter=2000, n_iter_check=50, init=None, verbose=0, input_type='distance', max_halves=5, tol=0.001, n_inits=1, step_size=1)
n_dims = 2
n_iter = 2000
n_iter_check = 50
init = None
verbose = 0
input_type = 'distance'
max_halves = 5
tol = 0.001
n_inits = 1
step_size = 1
method_str = 'SAMMON'
__str__()

Return a string representation of the Sammon instance with input_type and user-modified parameters.

fit(X)

Fit the Sammon model to the input data.

Parameters:

X (np.array of shape (n_samples, n_features) or (n_samples, n_samples)) – The input data. If input_type is ‘vector’, X should be the feature vectors of the samples. If input_type is ‘distance’, X should be the pairwise distance matrix.

Returns:

self – The fitted Sammon instance, with the configuration matrix stored in the attribute Y_.

Return type:

object

fit_transform(X)

Fit the Sammon mapping model and return the transformed coordinates.

Parameters:

X (np.array of shape (n_samples, n_features) or (n_samples, n_samples)) – The input data. If input_type is ‘vector’, X should be the feature vectors of the samples. If input_type is ‘distance’, X should be the pairwise distance matrix.

Returns:

The transformed coordinates of the samples in the reduced-dimensional space.

Return type:

np.array of shape (n_samples, n_dims)

Raises:

ValueError – If the input_type is not ‘distance’ or ‘vector’.
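The two accepted shapes of X can be illustrated with plain NumPy. The sketch below builds a toy feature matrix and the matching pairwise distance matrix. The data, the variable names, and the choice of Euclidean distance are assumptions made for the illustration; the commented lines only indicate how the documented signature would be called if evomap is installed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 6 samples with 4 features each.
# This (n_samples, n_features) layout matches input_type='vector'.
X_vec = rng.normal(size=(6, 4))

# Pairwise Euclidean distances, shape (n_samples, n_samples).
# This square, symmetric layout matches input_type='distance'.
diff = X_vec[:, None, :] - X_vec[None, :, :]
X_dist = np.sqrt((diff ** 2).sum(axis=-1))

print(X_vec.shape, X_dist.shape)

# With evomap installed, the distance matrix could then be passed on, e.g.:
# from evomap.mapping._sammon import Sammon
# Y = Sammon(n_dims=2, input_type='distance').fit_transform(X_dist)
# Per the documented return type, Y would have shape (n_samples, n_dims).
```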

evomap.mapping._sammon._check_prepare_input_sammon(D)

Check and prepare the input distance matrix for Sammon Mapping.

This function ensures that the input distance matrix is valid for Sammon Mapping. It checks for strictly positive off-diagonal dissimilarities and adds a small diagonal correction if necessary.

Parameters:

D (ndarray of shape (n_samples, n_samples)) – Input distance matrix. Should be a square, symmetric matrix representing pairwise dissimilarities between samples.

Returns:

Prepared distance matrix with a diagonal correction applied.

Return type:

ndarray of shape (n_samples, n_samples)

Raises:

ValueError – If any off-diagonal entries in the distance matrix are non-positive.
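A minimal sketch of such a check follows. It is not evomap's implementation: the function name, the eps parameter, the extra squareness test, and the exact form of the diagonal correction (a tiny constant added to the diagonal) are all assumptions. Only the ValueError on non-positive off-diagonal entries is taken from the description above.

```python
import numpy as np

def check_prepare_distances(D, eps=1e-12):
    """Hypothetical stand-in for the input check described above."""
    D = np.asarray(D, dtype=float)
    if D.ndim != 2 or D.shape[0] != D.shape[1]:
        # Assumption: the documentation only says D "should be" square.
        raise ValueError("D must be a square matrix.")

    # Boolean mask selecting every entry except the diagonal.
    off_diag = ~np.eye(D.shape[0], dtype=bool)
    if np.any(D[off_diag] <= 0):
        raise ValueError("Off-diagonal distances must be strictly positive.")

    # Assumed form of the diagonal correction: a tiny constant on the
    # diagonal so that later divisions by D never hit an exact zero.
    return D + eps * np.eye(D.shape[0])
```

Sammon stress divides by the input distances, which is presumably why zero off-diagonal entries are rejected rather than silently accepted.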

evomap.mapping._sammon._sammon_stress_function(positions, disparities, compute_error=True, compute_grad=True)

Compute the Sammon stress function and its gradient.

The Sammon stress function measures the discrepancy between the input distances (disparities) and the distances among the estimated positions in the reduced space. Optionally, it can also compute the gradient of the stress function for optimization purposes.

Parameters:
  • positions (ndarray of shape (n_samples, n_dims)) – The estimated positions of the samples in the reduced-dimensional space.

  • disparities (ndarray of shape (n_samples, n_samples)) – The input distance (or dissimilarity) matrix.

  • compute_error (bool, optional) – Whether to compute the stress (error) value, by default True.

  • compute_grad (bool, optional) – Whether to compute the gradient of the stress function, by default True.

Returns:

  • float or None – The Sammon stress value (cost), or None if compute_error is False.

  • ndarray or None – The gradient of the stress function, or None if compute_grad is False.
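For orientation, Sammon's stress in its standard textbook form is E = (1 / sum_{i<j} D_ij) * sum_{i<j} (D_ij - d_ij)^2 / D_ij, where D_ij are the input distances and d_ij the distances between the mapped positions. The NumPy sketch below evaluates that formula. The function name is hypothetical, and evomap's own normalization or handling of the diagonal may differ.

```python
import numpy as np

def sammon_stress(positions, disparities):
    """Standard Sammon stress; an illustrative sketch, not evomap's code."""
    n = positions.shape[0]
    iu = np.triu_indices(n, k=1)  # each pair (i < j) counted once

    # Euclidean distances between the mapped positions.
    diff = positions[:, None, :] - positions[None, :, :]
    d_map = np.sqrt((diff ** 2).sum(axis=-1))[iu]
    d_in = disparities[iu]

    # Squared errors weighted by 1 / D_ij, normalized by the sum of D_ij.
    return ((d_in - d_map) ** 2 / d_in).sum() / d_in.sum()
```

Because each squared error is divided by the corresponding input distance, mismatches on small distances are penalized more heavily than the same mismatch on large ones.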

evomap.mapping._sammon._sammon_stress_gradient(Y, D_map, D)

Compute the gradient of the Sammon stress function.

Parameters:
  • Y (ndarray of shape (n_samples, n_dims)) – The current positions of the samples in the reduced-dimensional space.

  • D_map (ndarray of shape (n_samples, n_samples)) – The pairwise Euclidean distances between the estimated positions.

  • D (ndarray of shape (n_samples, n_samples)) – The input distance (or dissimilarity) matrix.

Returns:

The computed gradient of the Sammon stress function, flattened for optimization.

Return type:

ndarray of shape (n_samples * n_dims,)
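Differentiating the standard Sammon stress with respect to the position y_i gives dE/dy_i = -(2 / c) * sum_{j != i} ((D_ij - d_ij) / (D_ij * d_ij)) * (y_i - y_j), with c = sum_{i<j} D_ij and d_ij the entries of D_map. The sketch below implements that expression in NumPy and flattens the result. Both function names are hypothetical, and the code assumes the standard stress definition and distinct mapped points (no zero off-diagonal entries in D_map); evomap's implementation may differ in scaling or numerical safeguards.

```python
import numpy as np

def pairwise_distances(Y):
    """Euclidean distances between the rows of Y."""
    diff = Y[:, None, :] - Y[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def sammon_gradient(Y, D_map, D):
    """Gradient of the standard Sammon stress; an illustrative sketch."""
    n = Y.shape[0]
    c = D[np.triu_indices(n, k=1)].sum()  # normalizing constant

    # Pairwise weights (D_ij - d_ij) / (D_ij * d_ij), zero on the diagonal.
    off = ~np.eye(n, dtype=bool)
    W = np.zeros_like(D, dtype=float)
    W[off] = (D[off] - D_map[off]) / (D[off] * D_map[off])

    # sum_j W_ij (y_i - y_j) = (row sum of W)_i * y_i - (W @ Y)_i
    grad = -(2.0 / c) * (W.sum(axis=1)[:, None] * Y - W @ Y)
    return grad.ravel()  # flattened to length n_samples * n_dims
```

Flattening the gradient in row-major order keeps it compatible with optimizers that treat the configuration as a single parameter vector, which can be reshaped back with `reshape(n_samples, n_dims)`.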