Extending

You can customize scipr to use your own matching or transformation functions (e.g. a complicated deep neural network) by subclassing the abstract base classes below and overriding the necessary methods.

The interface is functional: methods such as fit() must always return the new object state rather than set object attributes. scipr takes care of managing the object state for you.

The source code of the functions provided in the API is a good starting point and can serve as an example of how to implement your own.

Custom matching functions

Match base class

class scipr.matching.Match

Base class for all matching function objects.

Your matching functions should also subclass this class.

match(A, B, kd_tree_B)

Find matching pairs of cells between two batches.

Parameters:
  • A (numpy.ndarray) – The “source” batch of cells to align. Dimensions are (cellsA, genes).
  • B (numpy.ndarray) – The “target” (or “reference”) batch data to align to. Dimensions are (cellsB, genes).
  • kd_tree_B (scipy.spatial.ckdtree.cKDTree) – A KD tree of the B batch for fast queries. Since B is the stationary “reference” batch which does not move, SCIPR computes the tree once at the beginning and passes it to the matching algorithm at each step.
Returns:

  • A_indices (numpy.ndarray) – Index array into A, selecting the matched cells in A. Of length S, the number of matched pairs.
  • B_indices (numpy.ndarray) – Index array into B, selecting the matched cells in B. Also of length S. Each element in this array corresponds to its match in A_indices.
  • distances (numpy.ndarray) – The distances between the pairs in A_indices and B_indices, also of length S.
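As an illustration of the match() contract described above, here is a minimal sketch of a custom matcher that pairs every cell in A with its nearest cell in B. The class name NearestNeighborMatch is hypothetical, and the try/except fallback to object is only there so the sketch runs without scipr installed; the only library call relied on is cKDTree.query from SciPy.

```python
import numpy as np
from scipy.spatial import cKDTree

try:
    from scipr.matching import Match
except ImportError:
    # Fallback (assumption) so this sketch runs without scipr installed.
    Match = object


class NearestNeighborMatch(Match):
    """Hypothetical matcher: pair each cell in A with its nearest cell in B."""

    def match(self, A, B, kd_tree_B):
        # cKDTree.query returns (distances, indices) of the nearest neighbors.
        distances, B_indices = kd_tree_B.query(A, k=1)
        # Every cell in A is matched here, so S equals the number of rows in A.
        A_indices = np.arange(A.shape[0])
        return A_indices, B_indices, distances


A = np.array([[0.0, 0.0], [5.0, 5.0]])
B = np.array([[5.0, 4.0], [0.0, 1.0], [9.0, 9.0]])
A_idx, B_idx, dist = NearestNeighborMatch().match(A, B, cKDTree(B))
# A_idx is [0, 1], B_idx is [1, 0], and both distances are 1.0
```

The three returned arrays have the same length, and position i in each describes one matched pair.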

Custom transformation functions

Transformer base class

class scipr.transform.Transformer

Base class for all transformation function objects.

Your transformation functions should also subclass this class.

fit(A, B)

Return a model to transform A onto B.

Each point in A corresponds to the point in B at the same index (as chosen by the matching algorithm of SCIPR). The Transformer is fitted to learn a function that moves points in A closer to their corresponding points in B.

Parameters:
  • A (numpy.ndarray) – The selected “source” cells to align. Dimensions are (cells, genes).
  • B (numpy.ndarray) – The “target” cells which correspond to each of the “source” cells in A. Dimensions are the same as A, so that each row in B is a cell that is paired up with the same row (cell) in A.
Returns:

model – The fitted model parameters (state) of the transformation function. For example, weights and biases.

Return type:

dict
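To make the functional fit() contract concrete, the sketch below fits an affine map by least squares and returns its parameters as a dict instead of storing them on the object. The class name AffineSketch and the dict keys "W" and "b" are hypothetical choices for illustration, not scipr's own implementation; in real use the class would subclass scipr.transform.Transformer.

```python
import numpy as np


class AffineSketch:
    """Hypothetical transformer; in real use it would subclass
    scipr.transform.Transformer."""

    def fit(self, A, B):
        # Append a column of ones so the bias is learned with the weights.
        A_aug = np.hstack([A, np.ones((A.shape[0], 1))])
        # Least-squares solution of A_aug @ params ~= B.
        params, *_ = np.linalg.lstsq(A_aug, B, rcond=None)
        # Return the fitted state rather than setting attributes on self.
        return {"W": params[:-1], "b": params[-1]}


A = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
B = A + np.array([2.0, -1.0])  # each paired target is the source shifted by a constant
model = AffineSketch().fit(A, B)
# model["W"] is close to the identity and model["b"] is close to [2, -1]
```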

transform(model, A)

Use the given model to transform the data.

Parameters:
  • model (dict) – The fitted model parameters (state) of the transformation function to use to transform A.
  • A (numpy.ndarray) – The cells to transform (i.e. to “align”), dimensions are (cells, genes).
Returns:

The transformation of A, with the same shape as the input A.

Return type:

numpy.ndarray
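A matching transform() for an affine model could look like the sketch below. It assumes a hypothetical model dict with keys "W" (genes by genes weights) and "b" (bias), which are illustrative names rather than anything scipr prescribes; the model is passed in explicitly and nothing is read from self.

```python
import numpy as np


class AffineSketch:
    """Hypothetical transformer; in real use it would subclass
    scipr.transform.Transformer."""

    def transform(self, model, A):
        # Rows are cells, so the map is applied as A @ W + b.
        return A @ model["W"] + model["b"]


model = {"W": 2.0 * np.eye(2), "b": np.array([1.0, -1.0])}
A = np.array([[1.0, 2.0], [3.0, 4.0]])
A_new = AffineSketch().transform(model, A)
# A_new is [[3, 3], [7, 7]], the same shape as A
```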

chain(model, step_model, step_number)

Update the overall model.

Update the transformation function’s parameters with the fitted parameters of the latest step. The overall alignment function we are learning is the composition of the transformation functions learned at each step, and you need to provide the logic for how to compose these functions here.

Parameters:
  • model (dict) – The current state of the overall model parameters, before fitting the latest step.
  • step_model (dict) – The fitted model parameters (state) of the transformation function from the latest step.
  • step_number (int) – The number of the current step in the SCIPR algorithm.
Returns:

model – The updated overall model: the fitted weights from all of the prior steps, composed with the weights from the latest step.

Return type:

dict
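For affine step functions the composition can be written in closed form, since (A @ W1 + b1) @ W2 + b2 equals A @ (W1 @ W2) + (b1 @ W2 + b2). The sketch below implements chain() on that basis. The class name, the "W" and "b" keys, and the assumption that the overall model is None before the first step are all hypothetical and not confirmed by the documentation above.

```python
import numpy as np


class AffineSketch:
    """Hypothetical transformer; in real use it would subclass
    scipr.transform.Transformer."""

    def chain(self, model, step_model, step_number):
        # Assumption: there is no overall model before the first step.
        if model is None:
            return dict(step_model)
        # Compose: apply the overall map first, then the latest step.
        return {
            "W": model["W"] @ step_model["W"],
            "b": model["b"] @ step_model["W"] + step_model["b"],
        }


step1 = {"W": 2.0 * np.eye(2), "b": np.array([1.0, 0.0])}
step2 = {"W": np.array([[0.0, 1.0], [1.0, 0.0]]), "b": np.array([0.0, 3.0])}

t = AffineSketch()
overall = t.chain(None, step1, 0)
overall = t.chain(overall, step2, 1)

A = np.array([[1.0, 2.0], [3.0, 4.0]])
two_steps = (A @ step1["W"] + step1["b"]) @ step2["W"] + step2["b"]
one_step = A @ overall["W"] + overall["b"]
# two_steps and one_step agree, so the chained model replaces both steps
```

Returning a new dict rather than mutating the incoming one keeps the method in line with the functional interface.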

finalize(model, A_orig, A_final)

Finalize the overall model at the end of SCIPR.

If any final operations are necessary to fit the overall model, perform them here.

Parameters:
  • model (dict) – The current state of the overall model parameters, after fitting and updating at all of the steps of SCIPR.
  • A_orig (numpy.ndarray) – The original “source” batch before the first step of SCIPR, which we are fitting to align.
  • A_final (numpy.ndarray) – The final state of the “source” batch after the last step of SCIPR, the result of transforming A_orig at each step.
Returns:

model – The final overall model that is the fitted weights from all of the steps of the SCIPR algorithm.

Return type:

dict
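One possible use of finalize(), sketched below under stated assumptions, is to refit a single map directly from A_orig to A_final, which can be useful when the per-step models are hard to compose exactly. The class name and the "W" and "b" keys are hypothetical. A transformer whose chain() already yields the full composition could instead simply return model unchanged.

```python
import numpy as np


class AffineSketch:
    """Hypothetical transformer; in real use it would subclass
    scipr.transform.Transformer."""

    def finalize(self, model, A_orig, A_final):
        # Refit one affine map from the original positions to the final ones.
        # This sketch ignores the incoming model; returning it unchanged is
        # an equally valid choice when chain() already composed every step.
        A_aug = np.hstack([A_orig, np.ones((A_orig.shape[0], 1))])
        params, *_ = np.linalg.lstsq(A_aug, A_final, rcond=None)
        return {"W": params[:-1], "b": params[-1]}


W_true = np.array([[0.0, 2.0], [2.0, 0.0]])
b_true = np.array([0.0, 4.0])
A_orig = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
A_final = A_orig @ W_true + b_true

final_model = AffineSketch().finalize(None, A_orig, A_final)
# final_model recovers W_true and b_true up to floating-point error
```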