Utilities

Classifiers

class FuzzyTextClassifier(measure_caller: collections.abc.Callable, *, is_distance: bool = True, **measure_kwargs)

Text document classification proposed by P. Intarapaiboon in the related article: “Text classification using similarity measures on intuitionistic fuzzy sets”.

Training and classification object that classifies text documents following the proposed method.

Follows the scikit-learn Estimator API.

See also

sklearn.base.BaseEstimator

Attributes

measure_caller (Callable): Measure to use during the prediction process.
measure_kwargs (dict): Passed to the measure_caller.
is_distance (bool): Whether measure_caller is a distance or not.

Methods

`fit`(X, y)	"Trains" on the X data provided.
`predict`(X)	Predict class labels for samples X.
`predict_proba`(X)	Measures of each sample.
`get_params`([deep])	Get parameters for this estimator.
`set_params`(**parameters)	Set the parameters of this estimator.

fit(X: Iterable[sets.FuzzySet], y: Iterable) → object

“Trains” on the X data provided.

Calculates the membership and non-membership values of each word for each unique class in y.

Parameters

X (Iterable[FuzzySet]): Data to train upon.
y (Iterable): Target vector relative to X.

Returns

self (object): An instance of the estimator.

See also

utils.calculate_documents_membership

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True): If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params (mapping of string to any): Parameter names mapped to their values.

predict(X: Iterable[sets.FuzzySet]) → List[numpy.int64]

Predict class labels for samples X.

Calculates the membership and non-membership values of each word for each unique class in y.

Parameters

X (Iterable[FuzzySet]): Samples.

Returns

list[np.intp]: Predicted class label per sample.

predict_proba(X: Iterable[sets.FuzzySet])

Measures of each sample.

The returned values are the results of calling the measure_caller between each sample and every class.

Parameters

X (Iterable[FuzzySet]): Samples.

Returns

list[list[float]]: The measures of each sample for each class, where the classes are those in self.classes_.

set_params(**parameters)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
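The <component>__<parameter> naming can be illustrated with a short sketch. The helper below, `split_nested_params`, is hypothetical and is not scikit-learn code; it only shows how such flat keys can be separated into the estimator's own parameters and those destined for a named component.

```python
def split_nested_params(params):
    """Group flat ``component__parameter`` keys by component name."""
    own, nested = {}, {}
    for key, value in params.items():
        # Split on the first double underscore only, so deeper levels
        # stay attached to the sub-key and can be resolved recursively.
        name, delimiter, sub_key = key.partition("__")
        if delimiter:
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


# "clf" and "measure" are made-up component names for illustration.
own, nested = split_nested_params(
    {"is_distance": False, "clf__p": 2, "clf__measure__w": 0.5}
)
```

Here `own` holds `is_distance`, while `nested["clf"]` holds `p` and the still-nested key `measure__w`.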

Parameters

**params (dict): Estimator parameters.

Returns

self (object): Estimator instance.

classify(class_patterns: Iterable[sets.FuzzySet], sample_pattern: sets.FuzzySet, measure_caller: collections.abc.Callable, *, is_distance=True, return_confidence=False, **kwargs) → numpy.int64

Simple classification method to classify a sample pattern given class patterns, using the measure provided.

For each class pattern c, calculates the measure from measure_caller between c and sample_pattern. The class is chosen by the min/max measure between the sample and class patterns, depending on is_distance. If return_confidence is true, returns the degree of confidence for the chosen class.
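The selection rule described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: plain NumPy arrays of membership values stand in for FuzzySet objects, and both `pick_class` and `mean_abs_distance` are hypothetical names.

```python
import numpy as np


def pick_class(class_patterns, sample_pattern, measure, *, is_distance=True):
    """Return the index of the best-matching class pattern and all scores."""
    scores = np.array([measure(c, sample_pattern) for c in class_patterns])
    # A distance is best when smallest; a similarity is best when largest.
    best = np.argmin(scores) if is_distance else np.argmax(scores)
    return best, scores


def mean_abs_distance(a, b):
    # Hypothetical stand-in measure: mean absolute membership difference.
    return float(np.mean(np.abs(np.asarray(a) - np.asarray(b))))


patterns = [np.array([0.9, 0.1, 0.2]), np.array([0.2, 0.8, 0.7])]
sample = np.array([0.3, 0.7, 0.6])
label, scores = pick_class(patterns, sample, mean_abs_distance)
```

With these values the second pattern is closer to the sample, so `label` is 1. Passing `is_distance=False` with the same scores would select the other pattern, which is why the flag must match the kind of measure supplied.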

Parameters

class_patterns (list[FuzzySet]): Class patterns to which the sample_pattern is classified.
sample_pattern (FuzzySet): The sample to be classified.
measure_caller (Callable): The measure function to use when measuring each class pattern against the sample pattern.
is_distance (bool): If the measure provided is a distance or a similarity. Used to pick the best measure calculated.
return_confidence (bool): Whether the confidence degree is returned. Can only be calculated if is_distance is True.
**kwargs (additional arguments): Passed to the measure_caller.

Returns

numpy.intp: The class of sample_pattern.

See also

utils.confidence_degree

Image Processing

threshold(image: numpy.ndarray, measure_caller: Callable, *, is_distance: bool = True, l: float = 0.2, **kwargs) → Tuple[numpy.ndarray, bool]

Thresholds the input image following the proposed method by T. Chaira and A.K. Ray, from the related article: “Threshold selection using fuzzy set theory”.

For each threshold value t in [0, 255], the set S1 of the ideally thresholded image and the set S2 of the image thresholded with value t are calculated. The selected threshold is the value of t with the min/max (depending on is_distance) measure between the two sets.
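The sweep over t can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the ideal set is assumed to have membership 1 everywhere, the membership of each pixel is assumed to be its closeness to the mean of its own region, the parameter l is not modelled, and `sweep_threshold` and `mean_abs_distance` are hypothetical names.

```python
import numpy as np


def sweep_threshold(image, measure, *, is_distance=True):
    """Pick the threshold whose fuzzy set best matches the ideal set."""
    image = np.asarray(image, dtype=float)
    span = image.max() - image.min()
    if span == 0:
        raise ValueError("a constant image has no meaningful threshold")
    c = 1.0 / span
    # Assumption: in the ideally thresholded image every pixel fully
    # belongs to its own region, so all membership values are 1.
    ideal = np.ones(image.size)
    scores = {}
    for t in range(256):  # assumes an 8-bit, single-channel image
        low = image <= t
        if low.all() or not low.any():
            continue  # one region is empty, nothing to compare
        region_mean = np.where(low, image[low].mean(), image[~low].mean())
        # Assumed membership: closeness of each pixel to its region's mean.
        membership = np.exp(-c * np.abs(image - region_mean)).ravel()
        scores[t] = measure(ideal, membership)
    pick = min if is_distance else max
    best_t = pick(scores, key=scores.get)
    return image > best_t, best_t


def mean_abs_distance(a, b):
    # Hypothetical stand-in measure on membership values only.
    return float(np.mean(np.abs(a - b)))


image = np.array([[10, 12, 11], [200, 205, 198]])
binary, best_t = sweep_threshold(image, mean_abs_distance)
```

For this toy image any threshold between the dark row and the bright row gives the same best score, so the bright row ends up as the foreground in `binary`.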

Parameters

image (np.ndarray): Single channel input image.
measure_caller (Callable): The measure function to use when measuring the sets of the ideally thresholded image and the current image set.
is_distance (bool): If the measure provided is a distance or a similarity. Used to pick the best measure calculated.
l (float): Used to calculate the membership values of the thresholded image with value t.
**kwargs (additional arguments): Passed to the measure_caller.

Returns

thresholded_image (np.ndarray): Thresholded image.
threshold_value (np.float64): Threshold value.

Methods

calculate_documents_membership(data: Iterable, membership_weight: float, non_membership_weight: float) → Tuple[list, numpy.ndarray, numpy.ndarray]

Calculates the Fuzzy Set of each class from token counts. Proposed by P. Intarapaiboon from the related article: “Text classification using similarity measures on intuitionistic fuzzy sets”.

Parameters

data (Iterable): Token counts of a document dataset.
membership_weight (float): Weight used to calculate each token’s membership value.
non_membership_weight (float): Weight used to calculate each token’s non-membership value.

Returns

list[FuzzySet]: List of FuzzySets for each class.
means (np.ndarray): Mean values for each token in data.
stds (np.ndarray): Standard deviation values for each token in data.

See also

sklearn.feature_extraction.text.CountVectorizer

check_similarity_conditions(measure_caller: collections.abc.Callable, **measure_kwargs)

Checks if a measure satisfies the required conditions.

Checks whether the measure provided satisfies the following conditions:

\[0 \le S(A, B) \le 1\]

\[S(A, B) = S(B, A)\]

\[S(A, A) = 1\]

The algorithm conducts two tests. The first creates a list of values from np.arange(start=0.0, stop=1.01, step=0.01) and, 10,000 times, creates two sets A and B of a random size (1-1000) whose (non-)membership values are picked at random from that list, checking whether the conditions hold.

The second test creates 4 sets with random sizes (1-1000) and random values generated with np.random.rand(random_size), and checks the conditions again.

The first two sets contain only membership and non-membership values, while the last two contain membership values and hesitation degrees. The conditions are checked with the measure applied to all combinations of the 4 sets.
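A reduced version of the first test might look like the sketch below. It is not the library's implementation: plain arrays of membership values stand in for fuzzy sets (non-membership values and hesitation degrees are left out), and `satisfies_similarity_conditions`, `mean_abs_similarity` and `not_reflexive` are hypothetical names.

```python
import numpy as np


def satisfies_similarity_conditions(measure, trials=1000, seed=0):
    """Randomly probe boundedness, symmetry and reflexivity of a measure."""
    rng = np.random.default_rng(seed)
    grid = np.arange(start=0.0, stop=1.01, step=0.01)
    tol = 1e-12  # tolerance for floating-point round-off
    for _ in range(trials):
        size = rng.integers(1, 1001)  # random size in 1..1000
        a = rng.choice(grid, size=size)
        b = rng.choice(grid, size=size)
        s_ab, s_ba, s_aa = measure(a, b), measure(b, a), measure(a, a)
        bounded = -tol <= s_ab <= 1.0 + tol
        symmetric = np.isclose(s_ab, s_ba)
        reflexive = np.isclose(s_aa, 1.0)
        if not (bounded and symmetric and reflexive):
            return False
    return True


def mean_abs_similarity(a, b):
    # Hypothetical similarity: 1 minus the mean absolute difference.
    return 1.0 - float(np.mean(np.abs(a - b)))


def not_reflexive(a, b):
    # Deliberately broken measure: S(A, A) is 0.5 instead of 1.
    return 0.5 * mean_abs_similarity(a, b)


ok = satisfies_similarity_conditions(mean_abs_similarity)
broken = satisfies_similarity_conditions(not_reflexive)
```

A random check of this kind can only find counterexamples; passing it does not prove that a measure satisfies the conditions for every pair of sets.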

Parameters

measure_caller (Callable): The measure to be tested.
**measure_kwargs (measure arguments): Passed to the measure_caller.

Returns

numpy.float64: Compactness of the image.

compactness(A: sets.FuzzySet, shape: tuple) → numpy.float64

Image geometry proposed by S.K. Pal and A. Rosenfeld in the related article: “Image enhancement and thresholding by optimization of fuzzy compactness”.

Parameters

A (FuzzySet): Fuzzy set of an image, containing a 2-d np.ndarray as an image.
shape (Tuple[float, float]): Shape of the original image.

Returns

numpy.float64: Compactness of the image.

Raises

ValueError: In case the shape provided is not a tuple of 2 values.
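As a rough illustration of fuzzy compactness, the sketch below assumes the usual definitions of area (sum of membership values) and perimeter (sum of absolute differences between horizontally and vertically adjacent pixels), with compactness taken as area divided by the squared perimeter. These definitions are an assumption about the cited article rather than something stated on this page, and `fuzzy_compactness` is a hypothetical function that takes a plain 2-d membership array instead of a FuzzySet and shape.

```python
import numpy as np


def fuzzy_compactness(membership):
    """Area over squared perimeter of a 2-d membership image."""
    mu = np.asarray(membership, dtype=float)
    if mu.ndim != 2:
        raise ValueError("membership must be a 2-d array")
    area = mu.sum()
    # Perimeter: membership changes between vertical and horizontal neighbours.
    perimeter = (
        np.abs(np.diff(mu, axis=0)).sum() + np.abs(np.diff(mu, axis=1)).sum()
    )
    if perimeter == 0:
        raise ValueError("perimeter is zero for a constant membership image")
    return area / perimeter ** 2


# A crisp 2x2 square inside a 4x4 image: area 4, perimeter 8.
mu = np.zeros((4, 4))
mu[1:3, 1:3] = 1.0
value = fuzzy_compactness(mu)
```

For the crisp square this reduces to the ordinary area over squared perimeter, 4 / 64.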

confidence_degree(predicted_class_distance: float, other_classes_distance: Iterable[float]) → numpy.float64

Degree of Confidence proposed by A.G. Hatzimichailidis, G.A. Papakostas and V.G. Kaburlasos in the related article: “A Novel Distance Measure of Intuitionistic Fuzzy Sets and Its Application to Pattern Recognition Problems”.

Parameters

predicted_class_distance (float): Distance calculated between the sample and the predicted class.
other_classes_distance (Iterable[float]): An iterable of the distances calculated between the sample and every other class.

Returns

numpy.float64: Degree of Confidence.
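A minimal sketch of such a computation is given below. It assumes the degree of confidence is the sum of absolute differences between the predicted class distance and the distance to every other class; that formula is an assumption about the cited article, not something stated on this page, and `degree_of_confidence` is a hypothetical stand-in for the library function.

```python
import numpy as np


def degree_of_confidence(predicted_class_distance, other_classes_distance):
    """Sum of gaps between the winning distance and all other distances."""
    others = np.asarray(list(other_classes_distance), dtype=float)
    # Larger gaps mean the predicted class stands out more clearly.
    return np.abs(predicted_class_distance - others).sum()


doc = degree_of_confidence(0.1, [0.5, 0.4])
```

Under this assumption a larger value indicates that the chosen class is better separated from the alternatives; here the gaps are 0.4 and 0.3.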

Measures