Matrix Module

HiDi’s matrix module exposes functionality for transforming matrices.

class hidi.matrix.ApplyTransform(fn)

Bases: hidi.transform.Transform

Apply a function to an input.

Takes a single argument, fn, the function to apply. fn must accept one positional argument (the input passed to transform) as well as keyword arguments.

Parameters:	fn (function) – The function to be applied to transform input.

transform(x, **kwargs)

Parameters:	x – The input to the function `fn`.
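The apply pattern described above can be sketched without HiDi installed. `ApplySketch` and `double` are hypothetical stand-ins, not HiDi's code, and returning the result together with the kwargs is an assumption based on the tuple return values documented for other transforms in this module.

```python
class ApplySketch:
    """Hypothetical stand-in for ApplyTransform; not HiDi's own code."""

    def __init__(self, fn):
        # The function that will be applied to every input.
        self.fn = fn

    def transform(self, x, **kwargs):
        # Assumption: fn receives the input plus the keyword arguments,
        # and the kwargs are handed back so a later transform can use them.
        return self.fn(x, **kwargs), kwargs


def double(x, **kwargs):
    """Example function: accepts one input and ignores extra kwargs."""
    return [2 * value for value in x]


doubler = ApplySketch(double)
result, passed_kwargs = doubler.transform([1, 2, 3], items=["a", "b", "c"])
```

Here `result` holds the doubled values and `passed_kwargs` carries `items` through unchanged.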

class hidi.matrix.SimilarityTransform(axis=0)

Bases: hidi.transform.Transform

Takes the dot product of a link*item matrix.

Returns either a link*link or an item*item similarity matrix: if axis is 0, an item*item matrix is returned; if axis is 1, a link*link matrix is returned.

The transform function returns a tuple containing the similarity matrix, and the links or items, depending on the axis.

Parameters:	axis (int[0,1]) – The axis to perform the dot product for.

transform(M, items, links, **kwargs)

Parameters:

M (numpy ndarray-like) – The matrix to create a similarity matrix from.
items (array) – Array of `item_ids` in the same order that they appear in `M`.
links (array) – Array of `link_ids` in the same order that they appear in `M`.
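The effect of axis can be illustrated with plain NumPy. This is a sketch of the dot product described above, not HiDi's implementation; `similarity_sketch` is a hypothetical name, and it assumes rows are links, columns are items, and no normalization is applied.

```python
import numpy as np


def similarity_sketch(M, items, links, axis=0):
    """Illustrative only: dot product of a link*item matrix with itself."""
    M = np.asarray(M)
    if axis == 0:
        # (item x link) @ (link x item) -> item x item
        return M.T @ M, items
    # (link x item) @ (item x link) -> link x link
    return M @ M.T, links


# Two links (rows) by three items (columns).
M = np.array([[1, 0, 2],
              [0, 3, 1]])
items = ["i1", "i2", "i3"]
links = ["l1", "l2"]

item_sim, item_labels = similarity_sketch(M, items, links, axis=0)
link_sim, link_labels = similarity_sketch(M, items, links, axis=1)
```

With this input, `item_sim` is 3x3 and `link_sim` is 2x2; both are symmetric, since each is a matrix multiplied by its own transpose.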

class hidi.matrix.ScalarTransform(fn=<ufunc 'log'>)

Bases: hidi.transform.Transform

Scale the matrix using a function or class method.

ScalarTransform takes an fn argument that specifies the function to apply to the matrix. If fn is a string, the transform looks up an attribute of that name on the matrix and calls it. If fn is a function reference, the transform calls that function with the matrix as input.

Parameters:	fn (str | function) – The scalar function to use. If `fn` is a string then an attribute of that name will be looked up and called. If `fn` is a function, that function will be called with the input given to transform.

transform(matrix_to_scale, **kwargs)

Takes a matrix_to_scale as a numpy ndarray-like object, performs scaling on it, and returns the result.
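The string-versus-function dispatch can be sketched as follows. `scale_sketch` is a hypothetical helper written for illustration; how HiDi performs the lookup internally is an assumption here.

```python
import numpy as np
from scipy.sparse import csr_matrix


def scale_sketch(matrix, fn=np.log):
    """Illustrative dispatch on `fn`; not HiDi's own implementation."""
    if isinstance(fn, str):
        # String: look up the attribute of that name on the matrix and call it.
        return getattr(matrix, fn)()
    # Callable: call it with the matrix as input.
    return fn(matrix)


dense = np.array([[1.0, np.e], [np.e ** 2, 1.0]])
logged = scale_sketch(dense)  # default fn, element-wise natural log

sparse = csr_matrix([[0.0, 3.0], [8.0, 0.0]])
scaled_sparse = scale_sketch(sparse, "log1p")  # calls sparse.log1p()
```

The string form is convenient for SciPy sparse matrices, which expose element-wise methods such as `log1p`; `log1p` maps 0 to 0, so the zero entries stay zero.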

class hidi.matrix.SparseTransform

Bases: hidi.transform.Transform

Make a sparse item*link matrix using SciPy’s compressed sparse row (CSR) matrix implementation.

transform(*func_args, **func_kwargs)

Takes a dataframe that has link_id, item_id and score columns.

Returns a SciPy csr_matrix.

Parameters:	df (pandas.DataFrame) – The DataFrame to make a sparse matrix from. Must have `link_id`, `item_id`, and `score` columns.
Return type:	scipy.sparse.csr_matrix
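One way to build such a matrix from a DataFrame with the three required columns is sketched below. This is not HiDi's implementation: the use of `pandas.factorize` and the orientation (links as rows, items as columns, matching the link*item layout described for SimilarityTransform) are assumptions made for illustration.

```python
import pandas as pd
from scipy.sparse import csr_matrix

df = pd.DataFrame({
    "link_id": ["l1", "l1", "l2", "l3"],
    "item_id": ["a", "b", "b", "c"],
    "score": [1.0, 2.0, 3.0, 4.0],
})

# factorize maps each distinct id to an integer position,
# numbered in order of first appearance.
link_codes, links = pd.factorize(df["link_id"])
item_codes, items = pd.factorize(df["item_id"])

# Assumption: links index the rows and items index the columns.
M = csr_matrix(
    (df["score"].to_numpy(), (link_codes, item_codes)),
    shape=(len(links), len(items)),
)
```

`links` and `items` record which id each row and column stands for, which is the ordering information SimilarityTransform asks for.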

class hidi.matrix.DenseTransform

Bases: hidi.transform.Transform

Transform a sparse matrix to its dense representation.

transform(M, **kwargs)

Takes a sparse matrix and transforms it into its dense representation.

Parameters:	M (scipy.sparse classes) – a sparse matrix
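For reference, SciPy itself offers two dense conversions. Which one DenseTransform uses is not stated here, so the snippet below only shows the SciPy calls and makes no claim about HiDi's choice.

```python
import numpy as np
from scipy.sparse import csr_matrix

sparse = csr_matrix(np.array([[0, 1], [2, 0]]))

dense_matrix = sparse.todense()  # returns a numpy.matrix
dense_array = sparse.toarray()   # returns a plain numpy.ndarray
```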

class hidi.matrix.ItemsMatrixToDFTransform

Bases: hidi.transform.Transform

Create a Pandas DataFrame object with items as the index.

transform(M, items, **kwargs)

Takes a numpy ndarray-like object and a list of item identifiers to be used as the index for the DataFrame.
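The resulting shape of the data can be shown with pandas directly. This is a minimal sketch of the idea, assuming one row of `M` per entry in `items`; the item names are made up for the example.

```python
import numpy as np
import pandas as pd

M = np.array([[0.1, 0.9],
              [0.8, 0.2],
              [0.5, 0.5]])
items = ["item-a", "item-b", "item-c"]  # hypothetical item identifiers

# Each row of M becomes a DataFrame row labelled by its item id.
df = pd.DataFrame(M, index=items)
item_b_row = df.loc["item-b"].tolist()
```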

class hidi.matrix.KerasEvaluationTransform(keras_model, validation_matrix, tts_seed=42, tt_split=0.25, **keras_kwargs)

Bases: hidi.transform.Transform

Generalized transform for Keras algorithm

This transform takes a Keras sequential model, a validation matrix, and its keyword arguments upon initialization.

Parameters:

keras_model (Keras Sequential model) – a Keras sequential model which is documented here: https://keras.io/getting-started/sequential-model-guide/
validation_matrix (pandas.DataFrame) – A dataframe with an item_id index and other ‘label’ columns. It is inner joined with the M matrix and then fed into the Keras sequential model.
tts_seed (int) – random state seed for train_test_split
tt_split (float) – the proportion of the dataset to include in the test split for train_test_split

transform(M, **kwargs)

Takes a dataframe that has an item_id index and other ‘features’ columns for prediction, and applies a Keras sequential model to it.

Parameters:	M (pandas.DataFrame) – a dataframe that has an `item_id` index and other ‘features’ columns
Return type:	a tuple with trained Keras model and its keyword arguments
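The data preparation implied by validation_matrix, tts_seed, and tt_split can be sketched without Keras. The snippet assumes the inner join is done on the item_id index and that the split uses scikit-learn's `train_test_split`; the column names `f0`, `f1`, and `label` are hypothetical, and model fitting is left out.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

item_ids = [f"item-{i}" for i in range(10)]

# Feature matrix M: item_id index plus feature columns.
M = pd.DataFrame(
    np.arange(20.0).reshape(10, 2),
    index=pd.Index(item_ids, name="item_id"),
    columns=["f0", "f1"],
)

# Labels are known for only the first eight items.
validation_matrix = pd.DataFrame(
    {"label": [0, 1, 0, 1, 0, 1, 0, 1]},
    index=pd.Index(item_ids[:8], name="item_id"),
)

# The inner join keeps only items present in both frames.
joined = M.join(validation_matrix, how="inner")
X = joined[["f0", "f1"]].to_numpy()
y = joined["label"].to_numpy()

# tt_split -> test_size, tts_seed -> random_state (defaults from the signature).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)
```

Items without a label drop out at the join, so only labelled rows reach the train and test sets.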

class hidi.matrix.KerasKfoldTransform(keras_model, validation_matrix, kfold_n_splits=10, kfold_seed=42, kfold_shuffle=True, classification=False, **keras_kwargs)

Bases: hidi.transform.Transform

Generalized transform for Keras algorithm with k fold cross validation evaluation

Parameters:

keras_model (Keras Sequential model) – a Keras sequential model which is documented here: https://keras.io/getting-started/sequential-model-guide/
validation_matrix (pandas.DataFrame) – A dataframe with an item_id index and other ‘label’ columns. It is inner joined with the M matrix and then fed into the Keras sequential model.
kfold_n_splits (int) – Number of folds for kfold. Must be at least 2.
kfold_seed (None, int or RandomState) – random state seed for kfold
kfold_shuffle (boolean) – Whether to shuffle the data before splitting into batches for kfold

transform(M, **kwargs)

Takes a dataframe that has an item_id index and other ‘features’ columns for prediction, and applies a Keras sequential model to it.

Parameters:	M (pandas.DataFrame) – a dataframe that has an `item_id` index and other ‘features’ columns
Return type:	a tuple with trained Keras model and its keyword arguments
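The kfold_* parameters read like the arguments of scikit-learn's `KFold`. Assuming that mapping (it is not confirmed here), the splitting step alone looks like this, with model training omitted:

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)  # ten samples, two features

# kfold_n_splits -> n_splits, kfold_shuffle -> shuffle, kfold_seed -> random_state
kfold = KFold(n_splits=5, shuffle=True, random_state=42)

fold_sizes = []
held_out = []
for train_idx, test_idx in kfold.split(X):
    # A model would be fit on X[train_idx] and scored on X[test_idx] here.
    fold_sizes.append((len(train_idx), len(test_idx)))
    held_out.extend(test_idx.tolist())
```

Every sample is held out exactly once across the folds, which is what makes the per-fold scores comparable.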

class hidi.matrix.KerasPredictionTransform(model)

Bases: hidi.transform.Transform

Generalized transform for Keras model prediction

This transform takes a trained Keras model. It applies the trained model to the input when transform is called.

Parameters:	model – a trained Keras model

transform(M, **kwargs)

Takes a numpy ndarray-like object and applies a trained Keras model to it.

Returns the predictions from the trained Keras model

Parameters:	M (pandas.DataFrame) – a dataframe that has an `item_id` index and other ‘features’ columns
Return type:	ndarray-like object with its kwargs

class hidi.matrix.SkLearnTransform(SkLearnAlg, **sklearn_args)

Bases: hidi.transform.Transform

Generalized transform for SciKit Learn algorithms.

This transform takes a SciKit Learn algorithm, and its keyword arguments upon initialization. It applies the algorithm to the input when transform is called.

The algorithm to be applied is likely, but not necessarily, a sklearn.decomposition algorithm.

transform(M, **kwargs)

Takes a numpy ndarray-like object and applies a SkLearn algorithm to it.
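The wrapper idea can be sketched with a small stand-in class. `SkLearnSketch` is hypothetical; in particular, calling `fit_transform` and returning the kwargs alongside the result are assumptions, since the text above does not say which estimator method is used.

```python
import numpy as np
from sklearn.decomposition import PCA


class SkLearnSketch:
    """Hypothetical stand-in for SkLearnTransform; not HiDi's own code."""

    def __init__(self, algorithm, **algorithm_kwargs):
        # Store the estimator class and the kwargs it will be built with.
        self.algorithm = algorithm
        self.algorithm_kwargs = algorithm_kwargs

    def transform(self, M, **kwargs):
        # Assumption: build the estimator, then fit and transform in one step.
        model = self.algorithm(**self.algorithm_kwargs)
        return model.fit_transform(M), kwargs


M = np.random.RandomState(0).rand(6, 4)
reduced, _ = SkLearnSketch(PCA, n_components=2).transform(M)
```

Passing the estimator class rather than an instance lets the same wrapper serve any algorithm that offers `fit_transform`.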

class hidi.matrix.SVDTransform(**svd_kwargs)

Bases: hidi.matrix.SkLearnTransform

Perform Truncated SVD on the matrix.

This uses SciKit Learn’s Truncated SVD implementation, which is documented here: http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.TruncatedSVD.html

All kwargs given to SVDTransform’s initialization function will be given to sklearn.decomposition.TruncatedSVD.

Please reference the sklearn docs when using this transform.
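Because the kwargs go straight to `TruncatedSVD`, it helps to see what that estimator does with them. The sketch below calls scikit-learn directly rather than HiDi; the matrix is random sample data and the kwarg values are only examples.

```python
from scipy.sparse import random as sparse_random
from sklearn.decomposition import TruncatedSVD

# A random 20x10 sparse matrix standing in for real data.
M = sparse_random(20, 10, density=0.3, format="csr", random_state=0)

# The same dictionary could be unpacked into SVDTransform(**svd_kwargs).
svd_kwargs = {"n_components": 3, "random_state": 0}
svd = TruncatedSVD(**svd_kwargs)

# One row per input row, one column per retained component.
embedding = svd.fit_transform(M)
```

`TruncatedSVD` accepts sparse input as-is, so no dense conversion is needed before this step.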

class hidi.matrix.NimfaTransform(NimfaAlg, **nimfa_kwargs)

Bases: hidi.transform.Transform

Generalized Nimfa transform.

This transform takes a nimfa algorithm, and its keyword arguments upon initialization. It applies the algorithm to the input when transform is called.

class hidi.matrix.SNMFTransform(**snmf_kwargs)

Bases: hidi.matrix.NimfaTransform

Perform Sparse Nonnegative Matrix Factorization.

This wraps nimfa’s snmf function, which is documented here: http://nimfa.biolab.si/nimfa.methods.factorization.snmf.html

All kwargs given to SNMFTransform’s initialization function will be given to nimfa.Snmf.

Please reference the nimfa docs when using this transform.