Utilities
Classifiers
- class FuzzyTextClassifier(measure_caller: collections.abc.Callable, *, is_distance: bool = True, **measure_kwargs)
-
Text document classifier based on the method proposed by P. Intarapaiboon in the article “Text classification using similarity measures on intuitionistic fuzzy sets”.
An object that trains on and classifies text documents following the proposed method.
Follows the scikit-learn Estimator API.
- Attributes
-
- measure_callerCallable
-
Measure to use during prediction process.
- measure_kwargsdict
-
Passed to the measure_caller.
- is_distancebool
-
If measure_caller is a distance or not.
Methods
fit(X, y)"Trains" on the X data provided.
predict(X)Predict class labels for samples X.
predict_proba(X)Measures of each sample.
get_params([deep])Get parameters for this estimator.
set_params(**parameters)Set the parameters of this estimator.
- fit(X: Iterable[sets.FuzzySet], y: Iterable) -> object
-
“Trains” on the X data provided.
Calculates the membership and non-membership values of each word for each unique class in y.
- Parameters
-
- XIterable[FuzzySet]
-
Data to train upon.
- yIterable
-
Target vector relative to X.
- Returns
-
- self: Object
-
An instance of the estimator.
- get_params(deep=True)
-
Get parameters for this estimator.
- Parameters
-
- deepbool, default=True
-
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
-
- paramsmapping of string to any
-
Parameter names mapped to their values.
- predict(X: Iterable[sets.FuzzySet]) -> List[numpy.int64]
-
Predict class labels for samples X.
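Each sample is compared with the class patterns computed in fit using measure_caller, and the class with the best measure is chosen (the smallest if is_distance is True, the largest otherwise).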
- Parameters
-
- XIterable[FuzzySet]
-
Samples.
- Returns
-
- list[np.intp]
-
Predicted class label per sample.
- predict_proba(X: Iterable[sets.FuzzySet])
-
Measures of each sample.
The returned values are the results of calling measure_caller between each sample and every class.
- Parameters
-
- XIterable[FuzzySet]
-
Samples.
- Returns
-
- list[list[float]]
-
Returns the measures of each sample for each class, where classes are in self.classes_.
- set_params(**parameters)
-
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
-
- **paramsdict
-
Estimator parameters.
- Returns
-
- selfobject
-
Estimator instance.
- classify(class_patterns: Iterable[sets.FuzzySet], sample_pattern: sets.FuzzySet, measure_caller: collections.abc.Callable, *, is_distance=True, return_confidence=False, **kwargs) -> numpy.int64
-
Simple classification method to classify a sample pattern given class patterns, using the measure provided.
For each class pattern c, calculates the measure from measure_caller between c and sample_pattern. The class is chosen by the min/max measure between the sample and class patterns, depending on is_distance. If return_confidence is true, returns the degree of confidence for the chosen class.
- Parameters
-
- class_patternslist[FuzzySet]
-
Class patterns to which the sample_pattern is classified.
- sample_patternFuzzySet
-
The sample to be classified.
- measure_callerCallable
-
The measure function used to compare each class pattern with sample_pattern.
- is_distancebool
-
If the measure provided is a distance or a similarity. Used to pick the best measure calculated.
- return_confidencebool
-
Whether the confidence degree is returned. Can only be calculated if is_distance is True.
- **kwargsadditional arguments
-
Passed to the measure_caller.
- Returns
-
- numpy.intp
-
The class of sample_pattern.
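The selection rule described above can be sketched in a few lines. This is an illustration only, not the library's implementation: `pick_class`, `hamming_distance`, and the representation of each pattern as a plain list of membership values are assumptions made for the example, and the confidence degree is left out.

```python
from typing import Callable, Sequence


def hamming_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Hypothetical measure: normalised Hamming distance between membership lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def pick_class(class_patterns, sample_pattern, measure_caller: Callable,
               *, is_distance: bool = True, **kwargs) -> int:
    """Return the index of the class pattern with the best measure."""
    measures = [measure_caller(c, sample_pattern, **kwargs) for c in class_patterns]
    # A distance is best when smallest, a similarity when largest.
    best = min(measures) if is_distance else max(measures)
    return measures.index(best)


patterns = [[0.9, 0.1, 0.8], [0.2, 0.7, 0.3]]
sample = [0.8, 0.2, 0.7]
print(pick_class(patterns, sample, hamming_distance))
```

Passing `is_distance=False` with the same measure flips the choice, which is why the flag has to match the kind of measure supplied.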
Image Processing
- threshold(image: numpy.ndarray, measure_caller: Callable, *, is_distance: bool = True, l: float = 0.2, **kwargs) -> Tuple[numpy.ndarray, bool]
-
Thresholds the input image following the method proposed by T. Chaira and A.K. Ray in the article “Threshold selection using fuzzy set theory”.
For each threshold value t in [0, 255], two sets are calculated: the set S1 of the ideally thresholded image and the set S2 of the image thresholded with value t. The selected threshold is the value of t with the min/max measure between the two sets, depending on is_distance.
- Parameters
-
- imagenp.ndarray
-
Single channel input image.
- measure_callerCallable
-
The measure function to use when measuring the sets of the ideally thresholded image and the current image set.
- is_distancebool
-
If the measure provided is a distance or a similarity. Used to pick the best measure calculated.
- lfloat
-
Used to calculate the membership values of the thresholded image with value t.
- **kwargsadditional arguments
-
Passed to the measure_caller.
- Returns
-
- thresholded_imagenp.ndarray
-
Thresholded image.
- threshold_valuenp.float64
-
Threshold value.
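The search over t can be sketched as follows. This is a rough sketch under stated assumptions, not the library's code: `select_threshold` and `mean_abs_distance` are hypothetical names, the membership function (exponential decay from the mean grey level of each region) and the all-ones ideal set are assumed forms, and the role of the `l` parameter is not reproduced.

```python
import numpy as np


def mean_abs_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Hypothetical measure: mean absolute difference of membership values."""
    return float(np.mean(np.abs(a - b)))


def select_threshold(image: np.ndarray, measure_caller, *, is_distance: bool = True):
    """Scan t over the grey levels 0..255 and keep the t with the best measure."""
    image = image.astype(np.float64)
    spread = image.max() - image.min()
    c = 1.0 / spread if spread > 0 else 1.0  # assumed scaling constant
    ideal = np.ones_like(image)  # S1: every pixel fully belongs to its region
    scores = []
    for t in range(256):
        background = image <= t
        # Mean grey level of each region; an empty region falls back to 0.
        m0 = image[background].mean() if background.any() else 0.0
        m1 = image[~background].mean() if (~background).any() else 0.0
        region_mean = np.where(background, m0, m1)
        # S2: membership decays with distance from the region mean (assumed form).
        membership = np.exp(-c * np.abs(image - region_mean))
        scores.append(measure_caller(ideal, membership))
    scores = np.asarray(scores)
    best_t = int(scores.argmin() if is_distance else scores.argmax())
    thresholded = (image > best_t).astype(np.uint8) * 255
    return thresholded, best_t


image = np.array([[50, 50, 200, 200], [50, 50, 200, 200]], dtype=np.uint8)
thresholded, t = select_threshold(image, mean_abs_distance)
print(t)
```

When several values of t give the same measure, `argmin` and `argmax` return the first one, so ties resolve to the lowest threshold in this sketch.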
Methods
- calculate_documents_membership(data: Iterable, membership_weight: float, non_membership_weight: float) -> Tuple[list, numpy.ndarray, numpy.ndarray]
-
Calculates the Fuzzy Set of each class from token counts. Proposed by P. Intarapaiboon from the related article: “Text classification using similarity measures on intuitionistic fuzzy sets”.
- Parameters
-
- dataIterable
-
Token counts of a document dataset.
- membership_weightfloat
-
Weight used to calculate each token’s membership value.
- non_membership_weightfloat
-
Weight used to calculate each token’s non-membership value.
- Returns
-
- list[FuzzySet]
-
List of FuzzySets for each class.
- meansnp.ndarray
-
Mean values for each token in data.
- stdsnp.ndarray
-
Standard deviation values for each token in data.
- check_similarity_conditions(measure_caller: collections.abc.Callable, **measure_kwargs)
-
Checks if a measure satisfies the required conditions.
Checks whether the measure provided satisfies the following conditions:
\[0 \le S(A, B) \le 1\]
\[S(A, B) = S(B, A)\]
\[S(A, A) = 1\]
The algorithm conducts two tests. The first one creates a list from np.arange(start=0.0, stop=1.01, step=0.01) and, 10,000 times, creates two sets A and B with a random size (1-1000) of membership and non-membership values picked at random from that list, checking whether the conditions hold.
The second test creates 4 sets with random sizes (1-1000), with random values generated by np.random.rand(random_size), and checks the conditions again. The first two sets contain only membership and non-membership values, while the last two contain membership values and hesitation degrees. The conditions are checked with the measure applied to every combination of the 4 sets.
- Parameters
-
- measure_callerCallable
-
The measure to be tested.
- **measure_kwargsmeasure arguments
-
Passed to the measure_caller.
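A randomised check of the three conditions might look like the sketch below. It is a simplified stand-in for the first test only and not the library's implementation: `similarity`, `random_set`, and `satisfies_conditions` are hypothetical names, sets are represented as pairs of NumPy arrays, the number of trials is reduced, and capping the non-membership values so that membership plus non-membership stays at or below 1 is an assumption of this example.

```python
import numpy as np


def similarity(mu_a, nu_a, mu_b, nu_b) -> float:
    """Hypothetical measure: 1 minus the normalised Hamming distance."""
    diff = np.abs(mu_a - mu_b) + np.abs(nu_a - nu_b)
    return 1.0 - float(np.sum(diff)) / (2 * len(mu_a))


def random_set(rng, grid, size):
    """Draw membership and non-membership values from grid (assumed mu + nu <= 1)."""
    mu = rng.choice(grid, size=size)
    nu = np.minimum(rng.choice(grid, size=size), 1.0 - mu)
    return mu, nu


def satisfies_conditions(measure, trials: int = 200, seed: int = 0) -> bool:
    """Return True if boundedness, symmetry and S(A, A) = 1 hold in every trial."""
    rng = np.random.default_rng(seed)
    grid = np.arange(start=0.0, stop=1.01, step=0.01)
    for _ in range(trials):
        size = int(rng.integers(1, 1001))  # random size in 1..1000
        a = random_set(rng, grid, size)
        b = random_set(rng, grid, size)
        s_ab, s_ba, s_aa = measure(*a, *b), measure(*b, *a), measure(*a, *a)
        bounded = -1e-12 <= s_ab <= 1 + 1e-12  # small tolerance for rounding
        symmetric = np.isclose(s_ab, s_ba)
        reflexive = np.isclose(s_aa, 1.0)
        if not (bounded and symmetric and reflexive):
            return False
    return True


print(satisfies_conditions(similarity))
```

A randomised check like this can only refute a measure; passing every trial is evidence that the conditions hold, not a proof.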
- compactness(A: sets.FuzzySet, shape: tuple) -> numpy.float64
-
Image geometry measure proposed by S.K. Pal and A. Rosenfeld in the article “Image enhancement and thresholding by optimization of fuzzy compactness”.
- Parameters
-
- AFuzzySet
-
Fuzzy set of an image, containing a 2-d np.ndarray as an image.
- shapeTuple[float, float]
-
Shape of the original image.
- Returns
-
- numpy.float64
-
Compactness of the image.
- Raises
-
- ValueError
-
In case the shape provided is not a tuple of 2 values.
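As a rough illustration, fuzzy compactness is commonly defined as the fuzzy area divided by the squared fuzzy perimeter. The sketch below assumes that definition and works on a plain array of membership values; `fuzzy_compactness` is a hypothetical name, and the library may normalise or handle edge cases differently.

```python
import numpy as np


def fuzzy_compactness(membership: np.ndarray, shape: tuple) -> np.float64:
    """Fuzzy area divided by squared fuzzy perimeter (assumed definition)."""
    if not isinstance(shape, tuple) or len(shape) != 2:
        raise ValueError("shape must be a tuple of 2 values")
    mu = np.asarray(membership, dtype=np.float64).reshape(shape)
    area = mu.sum()
    # Perimeter: absolute membership change between horizontal and vertical neighbours.
    perimeter = np.abs(np.diff(mu, axis=1)).sum() + np.abs(np.diff(mu, axis=0)).sum()
    return area / perimeter**2


# A 2x2 block of full membership inside a 4x4 image.
square = np.zeros((4, 4))
square[1:3, 1:3] = 1.0
print(fuzzy_compactness(square.ravel(), (4, 4)))
```

A constant image has zero perimeter, so this sketch would divide by zero there; a production version would need to decide what to return in that case.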
- confidence_degree(predicted_class_distance: float, other_classes_distance: Iterable[float]) -> numpy.float64
-
Degree of Confidence proposed by A.G. Hatzimichailidis, G.A. Papakostas and V.G. Kaburlasos in the article “A Novel Distance Measure of Intuitionistic Fuzzy Sets and Its Application to Pattern Recognition Problems”.
- Parameters
-
- predicted_class_distancefloat
-
Distance calculated between the sample and the predicted class.
- other_classes_distanceIterable[float]
-
An iterable consisting of the distances calculated between the sample and every other class.
- Returns
-
- numpy.float64
-
Degree of Confidence.
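A minimal sketch, assuming the degree of confidence is the sum of absolute differences between the predicted class distance and the distance to every other class. That definition is an assumption stated here for illustration and should be checked against the article; `degree_of_confidence` is a hypothetical name, not the library function.

```python
from typing import Iterable


def degree_of_confidence(predicted_class_distance: float,
                         other_classes_distance: Iterable[float]) -> float:
    """Sum of |d_predicted - d_other| over all other classes (assumed definition)."""
    return sum(abs(predicted_class_distance - d) for d in other_classes_distance)


# The predicted class sits at distance 0.1; the other two classes are farther away.
print(degree_of_confidence(0.1, [0.5, 0.4]))
```

Under this definition a larger value means the predicted class is better separated from the alternatives.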