# Matrix Module

HiDi’s matrix module exposes functionality for transforming matrices.

class hidi.matrix.ApplyTransform(fn)[source]

Bases: hidi.transform.Transform

Apply a function to an input.

Takes a single argument, fn: the function to apply. fn must accept one positional argument (the input to transform) and kwargs.

Parameters: fn (function) – The function to be applied to transform input.
transform(x, **kwargs)[source]
Parameters: x – The input to the function fn.
Return type: Any
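
As a rough illustration of the interface described above, the following sketch defines a hypothetical stand-in class, `ApplySketch`, that stores a function and calls it on the input. It does not import HiDi, and the way keyword arguments are forwarded is an assumption based on the documented signature, not a copy of the library's implementation.

```python
class ApplySketch:
    """Hypothetical stand-in mirroring the documented ApplyTransform interface."""

    def __init__(self, fn):
        # fn must accept one positional argument plus keyword arguments.
        self.fn = fn

    def transform(self, x, **kwargs):
        # Assumption: kwargs are forwarded to fn unchanged.
        return self.fn(x, **kwargs)


def double(x, **kwargs):
    # Ignores any extra keyword arguments passed along by the transform.
    return [2 * value for value in x]


result = ApplySketch(double).transform([1, 2, 3], note="unused")
print(result)
```
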
class hidi.matrix.SimilarityTransform(axis=0)[source]

Bases: hidi.transform.Transform

Takes the dot product of a link*item matrix.

Returns either a link*link or an item*item similarity matrix. If axis is 0, an item*item matrix is returned; if axis is 1, a link*link matrix is returned.

The transform function returns a tuple containing the similarity matrix and either the links or the items, depending on the axis.

Parameters: axis (int[0,1]) – The axis to perform the dot product for.
transform(M, items, links, **kwargs)[source]
Parameters:
M (numpy ndarray-like) – The matrix to create a similarity matrix from.
items (array) – Array of item_ids in the same order that they appear in M.
links (array) – Array of link_ids in the same order that they appear in M.
Return type: numpy.ndarray-like
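
To make the axis behaviour concrete, here is a minimal NumPy sketch. It assumes that rows of M index links and columns index items (as "link*item" suggests) and that the dot product is taken between M and its transpose; neither detail is confirmed by this page. The helper `similarity_sketch` is hypothetical and is not part of HiDi.

```python
import numpy as np


def similarity_sketch(M, items, links, axis=0):
    """Hypothetical helper: multiply M by its transpose along the chosen axis."""
    M = np.asarray(M)
    if axis == 0:
        # item*item: (items x links) @ (links x items)
        return M.T @ M, items
    # link*link: (links x items) @ (items x links)
    return M @ M.T, links


# Two links (rows) by three items (columns).
M = np.array([[1, 0, 2],
              [0, 3, 1]])
items = ["a", "b", "c"]
links = ["x", "y"]

item_sim, item_labels = similarity_sketch(M, items, links, axis=0)
link_sim, link_labels = similarity_sketch(M, items, links, axis=1)
print(item_sim.shape, link_sim.shape)
```

Both results are symmetric, and each diagonal entry is the dot product of an item (or link) vector with itself.
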
class hidi.matrix.ScalarTransform(fn=<ufunc 'log'>)[source]

Bases: hidi.transform.Transform

Scale the matrix using a function or class method.

ScalarTransform takes an fn argument that specifies the function to apply to the matrix. If fn is a string, the transform tries to call a method of that name on the matrix; if fn is a function reference, the transform calls that function with the matrix as input.

Parameters: fn (str | function) – The scalar function to use. If fn is a string then an attribute of that name will be looked up and called. If fn is a function, that function will be called with the input given to transform.
transform(matrix_to_scale, **kwargs)[source]

Takes a matrix_to_scale as a numpy ndarray-like object and performs scaling on it, then returns the result.

Return type: Any
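
The string-or-function dispatch described above can be sketched as follows. `scale_sketch` is a hypothetical helper written for this illustration, not HiDi's implementation; it assumes that a string names a method that can be called with no arguments.

```python
import numpy as np


def scale_sketch(matrix, fn=np.log):
    """Hypothetical helper showing the string-or-callable dispatch."""
    if isinstance(fn, str):
        # Look up a method of that name on the matrix and call it.
        return getattr(matrix, fn)()
    # Otherwise treat fn as a callable and pass the matrix in.
    return fn(matrix)


M = np.array([[0.0, 1.4], [2.6, 3.0]])

by_name = scale_sketch(M, "round")        # calls M.round()
by_callable = scale_sketch(M, np.log1p)   # calls np.log1p(M)
print(by_name)
print(by_callable)
```

Because the documented default is NumPy's `log`, zero entries map to negative infinity under the default; a function such as `np.log1p` avoids this when the matrix contains zeros.
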
class hidi.matrix.SparseTransform[source]

Bases: hidi.transform.Transform

Make a sparse item*link matrix using SciPy’s sparse compressed row matrix implementation.

transform(*func_args, **func_kwargs)[source]

Takes a dataframe that has link_id, item_id and score columns.

Returns a SciPy csr_matrix.

Parameters: df (pandas.DataFrame) – The DataFrame to make a sparse matrix from. Must have link_id, item_id, and score columns.
Return type: scipy.sparse.csr_matrix
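
One way to build such a matrix by hand is sketched below with pandas and SciPy. The use of `pd.factorize` to assign row and column positions is a choice made for this example, and the row/column order it produces is an assumption; HiDi may order items and links differently.

```python
import pandas as pd
from scipy.sparse import csr_matrix

df = pd.DataFrame({
    "link_id": ["x", "x", "y"],
    "item_id": ["a", "b", "a"],
    "score": [1.0, 2.0, 3.0],
})

# Map each identifier to a row or column position, in order of first appearance.
item_codes, items = pd.factorize(df["item_id"])
link_codes, links = pd.factorize(df["link_id"])

# Rows are items and columns are links (item*link).
M = csr_matrix(
    (df["score"].to_numpy(), (item_codes, link_codes)),
    shape=(len(items), len(links)),
)
print(M.toarray())
```
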
class hidi.matrix.DenseTransform[source]

Bases: hidi.transform.Transform

Transform a sparse matrix to its dense representation.

transform(M, **kwargs)[source]

Takes a sparse matrix and transforms it into its dense representation.

Parameters: M (scipy.sparse classes) – A sparse matrix.
Return type: numpy.ndarray
class hidi.matrix.ItemsMatrixToDFTransform[source]

Bases: hidi.transform.Transform

Create a Pandas DataFrame object with items as the index.

transform(M, items, **kwargs)[source]

Takes a numpy ndarray-like object and a list of item identifiers to be used as the index for the DataFrame.

Return type: pandas.DataFrame
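
The result is roughly what you would get from constructing a DataFrame directly with the item identifiers as its index. The snippet below is a sketch of that equivalent pandas call, not HiDi's own code; the column labels are whatever pandas assigns by default.

```python
import numpy as np
import pandas as pd

M = np.array([[0.1, 0.9],
              [0.8, 0.2],
              [0.5, 0.5]])
items = ["a", "b", "c"]

# One row per item, labelled by its identifier.
df = pd.DataFrame(M, index=items)
print(df)
```
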
class hidi.matrix.KerasEvaluationTransform(keras_model, validation_matrix, tts_seed=42, tt_split=0.25, **keras_kwargs)[source]

Bases: hidi.transform.Transform

Generalized transform for Keras algorithm

This transform takes a Keras sequential model, a validation matrix, and its keyword arguments upon initialization.

Parameters:
keras_model (Keras Sequential model) – A Keras sequential model, which is documented here: https://keras.io/getting-started/sequential-model-guide/
validation_matrix (pandas.DataFrame) – A validation matrix is a dataframe that has an item_id index and other ‘label’ columns. It will be inner joined with the M matrix and then fed into the Keras sequential model.
tts_seed (int) – Random state seed for train_test_split.
tt_split (float) – The proportion of the dataset to include in the test split for train_test_split.
transform(M, **kwargs)[source]

Takes a dataframe that has an item_id index and other ‘features’ columns for prediction, and applies a Keras sequential model to it.

Parameters: M (pandas.DataFrame) – A dataframe that has an item_id index and “features” columns.
Returns: A tuple with the trained Keras model and its keyword arguments.
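
The inner join between M and the validation matrix can be pictured with plain pandas. The sketch below covers only that join step; the column names `f0`, `f1`, and `label` are made up for illustration, and the train/test split and model fitting are left out.

```python
import pandas as pd

# Feature matrix M, indexed by item_id.
M = pd.DataFrame(
    {"f0": [0.1, 0.8, 0.5], "f1": [0.9, 0.2, 0.5]},
    index=pd.Index(["a", "b", "c"], name="item_id"),
)

# Validation matrix with a label column; item "d" has no features in M.
validation = pd.DataFrame(
    {"label": [1, 0, 1]},
    index=pd.Index(["a", "b", "d"], name="item_id"),
)

# An inner join keeps only the items present in both frames.
joined = M.join(validation, how="inner")

features = joined[["f0", "f1"]].to_numpy()
labels = joined["label"].to_numpy()
print(joined)
```

Items that appear in only one of the two frames ("c" and "d" here) are dropped, so the model would see only items that have both features and labels.
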
class hidi.matrix.KerasKfoldTransform(keras_model, validation_matrix, kfold_n_splits=10, kfold_seed=42, kfold_shuffle=True, classification=False, **keras_kwargs)[source]

Bases: hidi.transform.Transform

Generalized transform for Keras algorithm with k-fold cross-validation evaluation

Parameters:
keras_model (Keras Sequential model) – A Keras sequential model, which is documented here: https://keras.io/getting-started/sequential-model-guide/
validation_matrix (pandas.DataFrame) – A validation matrix is a dataframe that has an item_id index and other ‘label’ columns. It will be inner joined with the M matrix and then fed into the Keras sequential model.
kfold_n_splits (int) – Number of folds for kfold. Must be at least 2.
kfold_seed (None, int or RandomState) – Random state seed for kfold.
kfold_shuffle (boolean) – Whether to shuffle the data before splitting into batches for kfold.
transform(M, **kwargs)[source]

Takes a dataframe that has an item_id index and other ‘features’ columns for prediction, and applies a Keras sequential model to it.

Parameters: M (pandas.DataFrame) – A dataframe that has an item_id index and “features” columns.
Returns: A tuple with the trained Keras model and its keyword arguments.
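
To show what the kfold parameters control, the sketch below splits sample indices into folds using only NumPy. `kfold_indices` is a hypothetical, simplified stand-in for a k-fold splitter; the parameter names resemble scikit-learn's `KFold`, but this page does not say which splitter HiDi uses, and the exact fold assignment here will differ from any library's.

```python
import numpy as np


def kfold_indices(n_samples, n_splits=10, seed=42, shuffle=True):
    """Yield (train, test) index arrays for each fold (simplified sketch)."""
    if n_splits < 2:
        raise ValueError("n_splits must be at least 2")
    indices = np.arange(n_samples)
    if shuffle:
        # Shuffle once, reproducibly, before cutting into folds.
        np.random.default_rng(seed).shuffle(indices)
    for fold in np.array_split(indices, n_splits):
        # Everything outside the current fold is used for training.
        train = np.setdiff1d(indices, fold)
        yield train, fold


folds = list(kfold_indices(n_samples=10, n_splits=5))
for train, test in folds:
    print(len(train), len(test))
```

Each sample lands in exactly one test fold, so every row is used for evaluation once and for training in the remaining folds.
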
class hidi.matrix.KerasPredictionTransform(model)[source]

Bases: hidi.transform.Transform

Generalized transform for Keras model prediction

This transform takes a trained Keras model. It applies the trained model to the input when transform is called.

Parameters: model – A trained Keras model.
transform(M, **kwargs)[source]

Takes a numpy ndarray-like object and applies a trained Keras model to it.

Returns the predictions from the trained Keras model

Parameters: M (pandas.DataFrame) – A dataframe that has an item_id index and “features” columns.
Returns: An ndarray-like object with its kwargs.
class hidi.matrix.SkLearnTransform(SkLearnAlg, **sklearn_args)[source]

Bases: hidi.transform.Transform

Generalized transform for SciKit Learn algorithms.

This transform takes a SciKit Learn algorithm, and its keyword arguments upon initialization. It applies the algorithm to the input when transform is called.

The algorithm to be applied is likely, but not necessarily, a sklearn.decomposition algorithm.

transform(M, **kwargs)[source]

Takes a numpy ndarray-like object and applies a SkLearn algorithm to it.

Return type: numpy.ndarray
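
The wrapper pattern described above (store an algorithm class and its keyword arguments, then apply it on transform) might look like the following sketch. `SkLearnSketch` and the toy `CenterColumns` estimator are hypothetical names invented here so the example runs without scikit-learn, and calling `fit_transform` is an assumption about how the algorithm is applied.

```python
import numpy as np


class CenterColumns:
    """Toy estimator with a scikit-learn-style fit_transform method."""

    def __init__(self, scale=1.0):
        self.scale = scale

    def fit_transform(self, X):
        X = np.asarray(X, dtype=float)
        # Subtract each column's mean, then apply the scale factor.
        return (X - X.mean(axis=0)) * self.scale


class SkLearnSketch:
    """Hypothetical stand-in for the documented wrapper pattern."""

    def __init__(self, alg_class, **alg_kwargs):
        # Keep the class and its keyword arguments until transform is called.
        self.alg_class = alg_class
        self.alg_kwargs = alg_kwargs

    def transform(self, M, **kwargs):
        # Assumption: the algorithm is applied through fit_transform.
        return self.alg_class(**self.alg_kwargs).fit_transform(M)


out = SkLearnSketch(CenterColumns, scale=2.0).transform([[1, 2], [3, 6]])
print(out)
```

With scikit-learn installed, a class such as `sklearn.decomposition.TruncatedSVD`, which also provides `fit_transform`, could be passed in place of the toy estimator.
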
class hidi.matrix.SVDTransform(**svd_kwargs)[source]

Perform Truncated SVD on the matrix.

This uses SciKit Learn’s Truncated SVD implementation, which is documented here: http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.TruncatedSVD.html

All kwargs given to SVDTransform’s initialization function will be given to sklearn.decomposition.TruncatedSVD.

Please refer to the sklearn docs when using this transform.

class hidi.matrix.NimfaTransform(NimfaAlg, **nimfa_kwargs)[source]

Bases: hidi.transform.Transform

Generalized Nimfa transform.

This transform takes a nimfa algorithm, and its keyword arguments upon initialization. It applies the algorithm to the input when transform is called.

transform(M, **kwargs)[source]
Return type: numpy.ndarray
class hidi.matrix.SNMFTransform(**snmf_kwargs)[source]

Perform Sparse Nonnegative Matrix Factorization.

This wraps nimfa’s snmf function, which is documented here: http://nimfa.biolab.si/nimfa.methods.factorization.snmf.html

All kwargs given to SNMFTransform’s initialization function will be given to nimfa.Snmf.

Please refer to the nimfa docs when using this transform.