vsm.model.TfMulti

class vsm.model.TfMulti(corpus=None, context_type=None)

Trains a term-frequency model.

In a term-frequency model, the number of occurrences of a word type in a context is counted for all word types and contexts. Word types correspond to matrix rows and contexts correspond to matrix columns.

The data structure is a sparse integer matrix.
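The layout described above can be illustrated with a minimal, standard-library sketch. It is not the vsm implementation: the function name `term_frequency_coo` and the toy `contexts` list are hypothetical, and contexts are assumed to be plain lists of tokens. The sketch produces the (row, column, count) triplets of a coordinate-format sparse matrix, with one row per word type and one column per context.

```python
from collections import Counter


def term_frequency_coo(contexts):
    """Count word-type occurrences per context and return COO triplets.

    `contexts` is a list of token lists; each inner list is one context.
    (Hypothetical helper for illustration, not part of vsm.)
    """
    # Assign each word type a row index in order of first appearance.
    vocab = {}
    for tokens in contexts:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))

    rows, cols, data = [], [], []
    for col, tokens in enumerate(contexts):
        # Counter gives the number of occurrences of each word type
        # in this context; only nonzero counts are stored.
        counts = Counter(tokens)
        for tok in sorted(counts, key=vocab.get):
            rows.append(vocab[tok])
            cols.append(col)
            data.append(counts[tok])

    shape = (len(vocab), len(contexts))
    return vocab, rows, cols, data, shape


contexts = [["the", "cat", "sat"], ["the", "the", "dog"]]
vocab, rows, cols, data, shape = term_frequency_coo(contexts)
```

If SciPy is available, the triplets can be handed to `scipy.sparse.coo_matrix((data, (rows, cols)), shape=shape)` to obtain a sparse integer matrix of the kind this class stores.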

See Also:	`vsm.model.base.BaseModel`, `vsm.corpus.Corpus`, `scipy.sparse.coo_matrix`

Methods

`__init__`([corpus, context_type])	Initialize TfMulti.
`load`(f)	Takes a filename or file object and loads it as an npz archive into a BaseModel object.
`save`(f)	Takes a filename or file object and saves self.matrix in an npz archive.
`train`(n_procs)	Takes a number of processes n_procs over which to map and reduce.

__init__(corpus=None, context_type=None)

Initialize TfMulti.

Parameters:
    corpus (Corpus, optional) – A Corpus object containing the training data.
    context_type (string, optional) – A string specifying the type of context over which the model trainer is applied.

static load(f)

Takes a filename or file object and loads it as an npz archive into a BaseModel object.

Parameters:	f (str-like or file-like object) – Designates the file to read. If f is a string ending in .gz, the file is first gunzipped. See numpy.load for further details.
Returns:	A dictionary storing the data found in f.
See Also:	`numpy.load()`

save(f)

Takes a filename or file object and saves self.matrix in an npz archive.

Parameters:	f (str-like or file-like object) – Designates the file to which to save data. See numpy.savez for further details.
Returns:	None
See Also:	`numpy.savez()`

train(n_procs)

Trains the model, taking a number of processes n_procs over which to map and reduce.

Parameters:	n_procs (int) – Number of processes over which to map and reduce.
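The map-and-reduce pattern can be sketched as follows. This is an illustration under stated assumptions, not the way TfMulti is known to partition its work: the names `count_chunk`, `merge_counts`, `split`, and `train_sketch` are hypothetical, contexts are assumed to be lists of tokens, and the contexts are simply split into contiguous chunks. Each chunk is counted independently (map) and the partial counts are then merged (reduce).

```python
from collections import Counter
from functools import reduce


def count_chunk(chunk):
    """Map step: count (word, context_index) pairs in one chunk."""
    counts = Counter()
    for ctx_index, tokens in chunk:
        for tok in tokens:
            counts[(tok, ctx_index)] += 1
    return counts


def merge_counts(a, b):
    """Reduce step: merge two partial count tables."""
    a.update(b)  # Counter.update adds counts instead of replacing them
    return a


def split(items, n_chunks):
    """Split items into at most n_chunks contiguous chunks."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division, at least 1
    return [items[i:i + size] for i in range(0, len(items), size)]


def train_sketch(contexts, n_procs, mapper=map):
    """Count term frequencies with a pluggable map function.

    `mapper` defaults to the built-in, sequential `map`.
    """
    indexed = list(enumerate(contexts))
    chunks = split(indexed, n_procs)
    partials = mapper(count_chunk, chunks)
    return reduce(merge_counts, partials, Counter())


contexts = [["the", "cat", "sat"], ["the", "the", "dog"], ["dog"]]
counts = train_sketch(contexts, n_procs=2)
```

To run the map step in separate processes, the `map` method of a `multiprocessing.Pool(n_procs)` can be passed as `mapper`, provided the script is guarded with `if __name__ == "__main__":`. The result does not depend on the number of chunks, because merging counts is associative and commutative.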
