scQUEST.individuality module
Summary
Classes:
Individuality: Computes the individuality of each observation in the data set according to [Wagner2019].
Reference
- class Individuality(n_neighbors=100, radius=1.0, graph_type='knn', graph=None, prior='frequency', metric='minkowski', metric_params=<factory>, nn_params=<factory>)[source]
Bases: object
Computes the individuality of each observation in the data set according to [Wagner2019].
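- n_neighbors
number of neighbors in the kNN graph
- Type
Union[None, int]
- radius
radius in the radius graph
- Type
Union[None, float]
- graph_type
type of graph to build, either knn or radius
- Type
str
- graph
if provided, this graph is used instead of constructing a new one
- Type
Union[None, numpy.ndarray, scipy.sparse._csr.csr_matrix, scipy.sparse._csc.csc_matrix]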
- prior
either frequency, uniform, or custom prior probabilities for each class/group/label. If set to frequency, the empirical class/group probabilities are used as the prior. If set to uniform, all classes/groups have the same prior probability.
- Type
Union[str, numpy.ndarray]
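- metric
distance metric to use when constructing the graph topology
- Type
Union[Callable[[numpy.ndarray, numpy.ndarray], float], str]
- metric_params
additional kwargs passed to metric
- Type
dict
- nn_params
additional kwargs passed to sklearn.neighbors.NearestNeighbors
- Type
dict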
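To illustrate what a kNN graph topology under a Minkowski metric looks like, here is a minimal NumPy sketch. The helper name knn_graph, the dense 0/1 adjacency output, and the choice to exclude each observation from its own neighborhood are assumptions made for this example; the class itself may build its graph differently.

```python
import numpy as np


def knn_graph(X, n_neighbors, p=2):
    """Hypothetical helper: dense 0/1 adjacency matrix of a kNN graph.

    Distances are Minkowski distances of order p (p=2 is Euclidean).
    Self-loops are excluded, which is an assumption of this sketch.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Minkowski distances, shape (N, N).
    diff = np.abs(X[:, None, :] - X[None, :, :])
    dist = (diff ** p).sum(axis=-1) ** (1.0 / p)
    # Never pick an observation as its own neighbor.
    np.fill_diagonal(dist, np.inf)
    # Column indices of the n_neighbors closest observations per row.
    idx = np.argsort(dist, axis=1)[:, :n_neighbors]
    g = np.zeros(dist.shape, dtype=int)
    rows = np.repeat(np.arange(X.shape[0]), n_neighbors)
    g[rows, idx.ravel()] = 1
    return g


# Four one-dimensional observations; the last one is an outlier.
X = np.array([[0.0], [1.0], [2.0], [10.0]])
g = knn_graph(X, n_neighbors=2)
```

Each row of g marks the two nearest neighbors of the corresponding observation. A precomputed matrix of this kind is what the graph attribute accepts in place of an internally constructed graph.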
Notes
The posterior class probabilities for each observation are computed as follows [Bishop]:
\[\begin{split}p(c_i | x) &= \frac{p(x | c_i) \cdot p(c_i)}{p(x)} \\ p(x | c_i) &= \frac{K_i}{N_i \cdot V} \\ p(x) &= \sum_{c_j \in C} p(x | c_j) \cdot p(c_j) \\ p(c_i) &= \frac{N_i}{N} \;\; \texttt{if prior=frequency} \\ p(c_i) &= \frac{1}{|C|} \;\; \texttt{if prior=uniform}\end{split}\]
Where
\(C\) is the set of classes and \(c_i \in C\) a particular class
\(x\) is the feature vector of an observation
\(K_i\) is the number of neighbors of \(x\) that belong to class \(c_i\)
\(V\) is the volume of the sphere around \(x\) that either contains the n_neighbors nearest neighbors or has radius radius
\(N\) the total number of observations and \(N_i\) the number of observations belonging to class \(c_i\)
The resulting observation-level class probability estimates (\(N \times |C|\)) are aggregated (averaged) by class to give class-level individuality estimates (\(|C| \times |C|\)).
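The posterior formula in the Notes can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the library's implementation: the function name posterior and its arguments K (per-observation neighbor counts per class) and N_i (class sizes) are hypothetical. Because the same volume \(V\) appears in every class likelihood of a given observation, it cancels in the normalization and is left out.

```python
import numpy as np


def posterior(K, N_i, prior="frequency"):
    """Hypothetical sketch of p(c_i | x) from neighbor counts.

    K     : (N, |C|) array, K[n, i] = neighbors of observation n in class i
    N_i   : (|C|,) array, number of observations per class
    prior : 'frequency', 'uniform', or an array of custom class priors
    """
    K = np.asarray(K, dtype=float)
    N_i = np.asarray(N_i, dtype=float)
    if isinstance(prior, str):
        if prior == "frequency":
            p_c = N_i / N_i.sum()                      # p(c_i) = N_i / N
        elif prior == "uniform":
            p_c = np.full(N_i.shape, 1.0 / N_i.size)   # p(c_i) = 1 / |C|
        else:
            raise ValueError(f"unknown prior: {prior!r}")
    else:
        p_c = np.asarray(prior, dtype=float)
    # p(x | c_i) up to the common factor 1 / V, which cancels below.
    likelihood = K / N_i
    joint = likelihood * p_c
    # Dividing by the row sum implements the division by p(x).
    return joint / joint.sum(axis=1, keepdims=True)


K = np.array([[8, 2], [3, 3]])
N_i = np.array([60, 40])
post_freq = posterior(K, N_i, prior="frequency")  # [[0.8, 0.2], [0.5, 0.5]]
post_unif = posterior(K, N_i, prior="uniform")
```

With the frequency prior the class sizes cancel and the posterior reduces to the fraction of neighbors in each class. With the uniform prior, neighbors from smaller classes carry more weight. An observation without any neighbors (possible in a radius graph) would produce a division by zero in this sketch.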
- Returns
Instance of Individuality.
- graph: Union[None, numpy.ndarray, scipy.sparse._csr.csr_matrix, scipy.sparse._csc.csc_matrix] = None
- prior: Union[str, numpy.ndarray] = 'frequency'
- metric: Union[Callable[[numpy.ndarray, numpy.ndarray], float], str] = 'minkowski'
- predict(ad, labels, layer=None, inplace=True)[source]
Predicts the individuality of each observation and aggregates (averages) the results for each label. To access the posterior probabilities of each observation (cell), use compute_individuality().
- Parameters
X – matrix with observations as rows and features as columns.
labels (Union[Iterable, Categorical]) – indicator for the group/sample an observation belongs to.
- Returns
DataFrame with rows as observations and columns with the estimated probability of belonging to the given group/sample
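The aggregation step that predict performs, averaging observation-level posteriors per label, can be illustrated as follows. The function name aggregate_by_class is hypothetical, and the sketch assumes sequential numeric labels and at least one observation per class.

```python
import numpy as np


def aggregate_by_class(posteriors, labels):
    """Hypothetical sketch: average (N, |C|) posteriors per class.

    Returns a (|C|, |C|) array whose row c is the mean posterior of all
    observations labelled c. Labels are assumed to be 0, 1, ..., |C|-1.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    labels = np.asarray(labels)
    n_classes = posteriors.shape[1]
    return np.vstack(
        [posteriors[labels == c].mean(axis=0) for c in range(n_classes)]
    )


obs_posteriors = np.array([[0.8, 0.2], [0.6, 0.4], [0.1, 0.9]])
obs_labels = np.array([0, 0, 1])
class_level = aggregate_by_class(obs_posteriors, obs_labels)
```

A diagonal entry close to 1 means that observations of a class are surrounded mostly by observations of the same class, while large off-diagonal entries indicate classes whose observations intermix.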
- static compute_individuality(g, num_seq_labs, prior)[source]
Computes the observation-level individuality based on a given graph structure and labels. See Individuality for an in-depth description.
- Parameters
g (Union[ndarray, csr_matrix, csc_matrix]) – graph encoding the observation-level interactions
num_seq_labs (ndarray) – labels of the observations. Must be sequential and numeric, i.e. [0,1,2,…,N]
prior (Union[str, ndarray]) – either frequency, uniform, or custom prior probabilities for each class/group/label. If set to frequency, the empirical class/group probabilities are used as the prior. If set to uniform, all classes/groups have the same prior probability.
- Returns
\(N \times |C|\) ndarray with \(N\) observations across \(|C|\) classes.
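As an illustration of the inputs described above, the following sketch converts arbitrary labels into sequential numeric labels and counts, for each observation, how many of its graph neighbors fall into each class (the \(K_i\) of the class-level Notes). The function name neighbor_class_counts and the dense 0/1 adjacency are assumptions for this example; the matrix product also works when g is a SciPy sparse matrix.

```python
import numpy as np


def neighbor_class_counts(g, labels):
    """Hypothetical sketch: per-observation neighbor counts per class.

    g      : (N, N) adjacency matrix, g[n, m] != 0 if m is a neighbor of n
    labels : length-N sequence of arbitrary (e.g. string) labels
    Returns the sorted classes, the sequential numeric labels, and the
    (N, |C|) count matrix K.
    """
    # Map labels to 0, 1, ..., |C|-1 in sorted order of the unique labels.
    classes, num_seq_labs = np.unique(np.asarray(labels), return_inverse=True)
    # One-hot encode the numeric labels, shape (N, |C|).
    onehot = np.eye(classes.size, dtype=int)[num_seq_labs]
    # Row n of g selects the neighbors of n; the product sums their one-hot rows.
    K = g @ onehot
    return classes, num_seq_labs, K


adjacency = np.array(
    [
        [0, 1, 1, 0],
        [1, 0, 1, 0],
        [1, 1, 0, 0],
        [0, 1, 1, 0],
    ]
)
sample_labels = ["s1", "s2", "s1", "s2"]
classes, num_seq_labs, K = neighbor_class_counts(adjacency, sample_labels)
```

Here observation 1 has both of its neighbors in class s1, so its row of K is [2, 0]; every other observation has one neighbor in each class.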