Class |
Description |
AbstractStableMarriage<T> |
|
AbstractStableMarriage.AbstractMatcher |
|
AbstractStableMarriage.Matcher |
|
AggregatingSimilarityMeasure<T> |
Aggregates similarities with a given aggregator.
|
AggregatingSimilarityMeasure.AggregatingSimilarityMeasureBuilder<T> |
|
AnnotatedValue<T> |
A value with lineage information; in particular, the Source and the timestamp.
|
Assigner<T> |
Implements an algorithm that solves an assignment problem.
|
BipartiteMatcher<T> |
Implements an algorithm that finds a matching in a bipartite graph.
|
CachingSimilarity<T,I> |
A light-weight caching layer over SimilarityMeasure to allow implementations to repeatedly calculate
expensive similarities without implementing a cache on their own.
|
Candidate<T> |
|
CandidateSelection<T> |
Selects candidates from a static or dynamic dataset accessible through Iterables.
|
Classification |
Contains the possible classification classes.
|
ClassificationException |
An exception thrown by a Classifier whenever an error occurs during classification.
|
ClassificationResult |
The classification of a Candidate with additional information.
|
ClassificationResult.ClassificationResultBuilder |
|
ClassifiedCandidate<T> |
|
ClassifiedCandidate.ClassifiedCandidateBuilder<T> |
|
Classifier<T> |
|
Cluster<C extends java.lang.Comparable<C>,T> |
A cluster is a coherent collection of duplicate records.
|
Cluster.ClusterBuilder<C extends java.lang.Comparable<C>,T> |
|
ClusterIdGenerators |
A collection of typical cluster id generators.
|
Clustering<C extends java.lang.Comparable<C>,T> |
A clustering algorithm takes a list of ClassifiedCandidates and creates a coherent Cluster, such that
all pairs of records inside the cluster are duplicates and no record outside the cluster is a duplicate of any
record inside the cluster.
|
Clusters |
|
ClusterSplitHandler<C extends java.lang.Comparable<C>,T> |
A callback that is invoked when an already existing cluster is split up during the Clustering of (online)
deduplication.
|
CollectionSimilarityMeasure<C extends java.util.Collection<? extends E>,E> |
|
CommonConflictResolutions |
Provides factory methods for common conflict resolutions.
|
CommonSimilarityMeasures |
A utility class that offers factory methods for common similarity measures.
|
CommonTransformations |
A utility class that offers factory methods for common similarity transformations.
|
CompositeValue<T1 extends java.lang.Comparable<T1>,T2 extends java.lang.Comparable<T2>> |
Helper for composed (sorting) keys, which will perform position-wise comparison of the key elements.
|
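Position-wise comparison of a two-part key can be sketched as follows. This is a minimal illustration, not the library's CompositeValue; the class name `CompositeKeySketch` and its fields are hypothetical.

```java
// Hypothetical sketch of a two-part sorting key; not the library's CompositeValue.
final class CompositeKeySketch<A extends Comparable<A>, B extends Comparable<B>>
        implements Comparable<CompositeKeySketch<A, B>> {
    private final A first;
    private final B second;

    CompositeKeySketch(A first, B second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public int compareTo(CompositeKeySketch<A, B> other) {
        // Compare the first elements; fall back to the second only on a tie.
        int byFirst = first.compareTo(other.first);
        return byFirst != 0 ? byFirst : second.compareTo(other.second);
    }

    public static void main(String[] args) {
        CompositeKeySketch<String, Integer> left = new CompositeKeySketch<>("smith", 1980);
        CompositeKeySketch<String, Integer> right = new CompositeKeySketch<>("smith", 1985);
        // Same first element, so the second element decides the order.
        System.out.println(left.compareTo(right) < 0);
    }
}
```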
ConflictResolution<I,O> |
Solves a conflict during resolution, that is, two or more different values that need to be merged into a single
value for the fused representation.
|
ConflictResolutionFusion<R> |
A fusion approach based on conflict resolution.
|
ConflictResolutionFusion.ConflictResolutionFusionBuilder<R> |
|
ConsistentClustering<C extends java.lang.Comparable<C>,T,I extends java.lang.Comparable<? super I>> |
Wraps another clustering and keeps clusters together when the wrapped clustering would split them. Example:
consider a stable marriage-based clustering where A1-B have been previously matched and subsequently clustered.
|
ConsistentClustering.ConsistentClusteringBuilder<C extends java.lang.Comparable<C>,T,I extends java.lang.Comparable<? super I>> |
|
CosineSimilarityMeasure<C extends java.util.Collection<? extends E>,E> |
Calculates the cosine similarity measure over two bags of elements.
|
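For orientation, cosine similarity over two bags can be computed from element counts. The sketch below is an assumption about the general technique, not the library's CosineSimilarityMeasure; the class `CosineSketch` and its methods are hypothetical, and returning 0 for an empty bag is a choice made here.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the library's CosineSimilarityMeasure.
final class CosineSketch {
    /** Cosine of the angle between the count vectors of two bags. */
    static <E> double cosine(Collection<? extends E> left, Collection<? extends E> right) {
        Map<E, Integer> leftCounts = counts(left);
        Map<E, Integer> rightCounts = counts(right);
        double dot = 0;
        for (Map.Entry<E, Integer> entry : leftCounts.entrySet()) {
            dot += entry.getValue() * rightCounts.getOrDefault(entry.getKey(), 0);
        }
        double norm = Math.sqrt(sumOfSquares(leftCounts)) * Math.sqrt(sumOfSquares(rightCounts));
        // Assumption: an empty bag yields similarity 0 instead of dividing by zero.
        return norm == 0 ? 0 : dot / norm;
    }

    /** Counts how often each element occurs in the bag. */
    private static <E> Map<E, Integer> counts(Collection<? extends E> bag) {
        Map<E, Integer> result = new HashMap<>();
        for (E element : bag) {
            result.merge(element, 1, Integer::sum);
        }
        return result;
    }

    private static double sumOfSquares(Map<?, Integer> counts) {
        double sum = 0;
        for (int count : counts.values()) {
            sum += count * count;
        }
        return sum;
    }
}
```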
CutoffSimiliarityMeasure<T> |
Cuts off the underlying similarity, such that all values below the threshold result in a similarity of 0,
and all values in [threshold, 1] are left untouched.
|
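The cut-off behavior can be illustrated with a plain function wrapper. This is a minimal sketch that uses `java.util.function.ToDoubleBiFunction` as a stand-in for a similarity measure; the class `CutoffSketch` and the method `cutoff` are hypothetical and do not reflect the library's API.

```java
import java.util.function.ToDoubleBiFunction;

// Hypothetical sketch; not the library's CutoffSimiliarityMeasure.
final class CutoffSketch {
    /** Wraps a similarity function so that results below the threshold become 0. */
    static <T> ToDoubleBiFunction<T, T> cutoff(ToDoubleBiFunction<T, T> inner, double threshold) {
        return (left, right) -> {
            double similarity = inner.applyAsDouble(left, right);
            // Values in [threshold, 1] pass through unchanged.
            return similarity < threshold ? 0.0 : similarity;
        };
    }

    public static void main(String[] args) {
        // Toy similarity: 1 for equal strings, 0.4 otherwise.
        ToDoubleBiFunction<String, String> toy = (a, b) -> a.equals(b) ? 1.0 : 0.4;
        ToDoubleBiFunction<String, String> cut = cutoff(toy, 0.5);
        System.out.println(cut.applyAsDouble("x", "x"));
        System.out.println(cut.applyAsDouble("x", "y"));
    }
}
```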
Deduplication<T> |
A full deduplication process, which ensures that no duplicate record is emitted.
|
DuplicateDetection<C extends java.lang.Comparable<C>,T> |
A duplicate detection algorithm processes a dataset of records and returns the distinct Clusters.
|
ExceptionContext |
The exception context allows the safe execution of code that may throw an exception.
|
FunctionalClass<T> |
A wrapper around Class that can be used to extract callable lambdas for methods and fields.
|
FunctionalConstructor<T> |
A lambda wrapper around the no-arg constructor of a class.
|
FunctionalMethod<T> |
A lambda wrapper around a method of a class.
|
FunctionalProperty<T,F> |
A lambda wrapper around a property of a class.
|
FusedValue<T> |
A fused value with contextual information.
|
FusingOnlineDuplicateDetection<C extends java.lang.Comparable<C>,T> |
A full online deduplication process, which retrieves duplicate clusters through OnlineDeduplication and
fuses the duplicate clusters into reconciled records.
|
FusingOnlineDuplicateDetection.FusingOnlineDuplicateDetectionBuilder<C extends java.lang.Comparable<C>,T> |
|
Fusion<T> |
Fuses a cluster of duplicates to one representation.
|
FusionContext |
A fusion context captures exceptions in an ExceptionContext during resolution and provides additional context
values.
|
FusionContext.FusionContextBuilder |
|
FusionException |
|
IncompleteFusionHandler<T> |
A callback that allows incomplete fusions to be treated either with direct repair algorithms or through side-channel
means (that is, ignore for now and repair asynchronously with the help of domain experts).
|
Levenshtein<T extends java.lang.CharSequence> |
Provides the Levenshtein similarity, which is based on the number of insertions, deletions, and
replacements needed to transform one string into another.
|
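The edit distance behind this measure can be computed with the standard two-row dynamic program. The sketch below is illustrative only: the class `LevenshteinSketch` is hypothetical, and normalizing the distance by the longer string's length to obtain a similarity is an assumption, not necessarily what the library does.

```java
// Hypothetical sketch; not the library's Levenshtein class.
final class LevenshteinSketch {
    /** Number of insertions, deletions, and replacements needed to turn a into b. */
    static int distance(CharSequence a, CharSequence b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j; // transforming the empty prefix of a requires j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            current[0] = i; // transforming to the empty prefix of b requires i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(current[j - 1] + 1, previous[j] + 1);
                current[j] = Math.min(insertOrDelete, previous[j - 1] + cost);
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    /** Assumption: similarity = 1 - distance / length of the longer input. */
    static double similarity(CharSequence a, CharSequence b) {
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 1.0 : 1.0 - (double) distance(a, b) / max;
    }
}
```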
MatchingSimilarity<C extends java.util.Collection<? extends E>,E> |
A similarity measure that finds the best matches between two bags of entities and calculates an overall
similarity by summing the similarities of these matches and normalizing the sum over the (max) number of elements
per collection.
|
Merge<R> |
A nested conflict resolution for complex types.
|
Merge.AdditionalFieldMergeBuilder<F,R> |
|
Merge.FieldMergeBuilder<F,R> |
|
Merge.IllTypedFieldMergeBuilder<I,F,R> |
|
Merge.MergeBuilder<R> |
|
OfflineCandidate<T> |
|
OfflineCandidateSelection<T> |
Selects candidates from a static dataset accessible through Iterables.
|
OfflineDeduplication<T> |
A full offline deduplication process, which ensures that no duplicate record is in the results.
|
OfflineDuplicateDetection<C extends java.lang.Comparable<C>,T> |
The offline duplicate detection returns all duplicate records within a dataset.
|
OnlineCandidate<T> |
|
OnlineCandidateSelection<T> |
Selects candidates from a streaming dataset by processing the incoming elements one-at-a-time.
|
OnlineDeduplication<T> |
A full online deduplication process, which ensures that no duplicate record is emitted.
|
OnlineDuplicateDetection<C extends java.lang.Comparable<C>,T> |
An online duplicate detection algorithm processes a stream of records and returns all changed Clusters for
each record.
|
OnlinePairBasedDuplicateDetection<C extends java.lang.Comparable<C>,T> |
|
OnlinePairBasedDuplicateDetection.OnlinePairBasedDuplicateDetectionBuilder<C extends java.lang.Comparable<C>,T> |
|
OnlineSortedNeighborhoodMethod<T> |
A sorted neighborhood method (SNM) for online deduplication.
|
OnlineSortedNeighborhoodMethod.OnlineSortedNeighborhoodMethodBuilder<T> |
|
OnlineSortedNeighborhoodMethod.Pass<T,K extends java.lang.Comparable<K>> |
Represents a pass over the dataset with a specific sorting key and window size.
|
OracleClassifier<T> |
A classifier that knows the results perfectly as it receives the gold standard during creation.
|
OracleClustering<C extends java.lang.Comparable<C>,T,I> |
A clustering that knows the results perfectly as it receives the gold standard during creation.
|
PossibleDuplicateHandler<T> |
|
RefineCluster<C extends java.lang.Comparable<C>,T> |
Splits large clusters into smaller clusters when the inter-cluster similarities are sub-optimal.
|
RefineCluster.RefineClusterBuilder<C extends java.lang.Comparable<C>,T> |
|
RefinedTransitiveClosure<C extends java.lang.Comparable<C>,T,I extends java.lang.Comparable<? super I>> |
|
RefinedTransitiveClosure.RefinedTransitiveClosureBuilder<C extends java.lang.Comparable<C>,T,I extends java.lang.Comparable<? super I>> |
|
ResolutionTag<T> |
A resolution tag is a way to refer to a previously resolved value in the current FusionContext.
|
RuleBasedClassifier<T> |
|
RuleBasedClassifier.Rule<T> |
A rule has a name for lineage/debugging and the similarity measure.
|
RuleBasedClassifier.RuleBasedClassifierBuilder<T> |
A builder for RuleBasedClassifier with convenience methods to create positive and negative rules.
|
SetSimilarityMeasure<C extends java.util.Collection<? extends E>,E> |
|
SimilarityContext |
A similarity context captures exceptions in an ExceptionContext and provides additional configurations that
represent cross-cutting concerns, such as null handling.
|
SimilarityContext.SimilarityContextBuilder |
|
SimilarityMeasure<T> |
A SimilarityMeasure compares two values and calculates a similarity in [-1, 1], where 0 means no similarity and 1
means equal values in a given context.
|
SortingKey<T,K extends java.lang.Comparable<K>> |
A sorting key allows a dataset to be indexed by a specific (calculated) value of a record, such that duplicates have
a higher probability of being in close proximity and thus a CandidateSelection may prune the search space
drastically.
|
Source |
The source of a value to be fused.
|
StreamUtil |
Adds some functions that are missing in Java Streams.
|
StronglyStableMarriage<T> |
Implements a strongly stable matching based on the stable marriage with indifference.
|
TerminalConflictResolution<I,O> |
A conflict resolution function that is guaranteed to produce a single value.
|
TransformingSimilarityMeasure<R,T> |
|
TransitiveClosure<C extends java.lang.Comparable<C>,T,I extends java.lang.Comparable<? super I>> |
A transitive closure implementation that runs in amortized linear time in the number of pairs.
|
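One common way to build a transitive closure over duplicate pairs is a union-find (disjoint-set) structure. The source does not say that the library uses this technique, so the sketch below is an assumption for illustration; the class `TransitiveClosureSketch` and its methods are hypothetical. With path compression alone, as used here, the cost per pair is small but not strictly constant.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical union-find sketch; not the library's TransitiveClosure.
final class TransitiveClosureSketch {
    private final Map<String, String> parent = new HashMap<>();

    /** Returns the representative of the record's cluster, compressing the path. */
    String find(String record) {
        parent.putIfAbsent(record, record);
        String root = record;
        while (!root.equals(parent.get(root))) {
            root = parent.get(root);
        }
        // Path compression: point every visited node directly at the root.
        String node = record;
        while (!node.equals(root)) {
            String next = parent.get(node);
            parent.put(node, root);
            node = next;
        }
        return root;
    }

    /** Merges the clusters of two records that were classified as duplicates. */
    void addDuplicatePair(String left, String right) {
        String leftRoot = find(left);
        String rightRoot = find(right);
        if (!leftRoot.equals(rightRoot)) {
            parent.put(leftRoot, rightRoot);
        }
    }

    /** Groups all records seen so far by their representative. */
    Collection<Set<String>> clusters() {
        Map<String, Set<String>> byRoot = new HashMap<>();
        for (String record : new ArrayList<>(parent.keySet())) {
            byRoot.computeIfAbsent(find(record), key -> new TreeSet<>()).add(record);
        }
        return byRoot.values();
    }
}
```

Adding the pairs (a, b) and (b, c) places a, b, and c in one cluster even though a and c were never compared directly.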
TransitiveClosure.TransitiveClosureBuilder<C extends java.lang.Comparable<C>,T,I extends java.lang.Comparable<? super I>> |
|
ValueTransformation<T,R> |
Performs a transformation on a value, especially on input values for SimilarityMeasure.
|
WeaklyStableMarriage<T> |
Implements a weakly stable matching based on the stable marriage with indifference (i.e., ties).
|
WeightedAggregatingSimilarityMeasure<R> |
|
WeightedAggregatingSimilarityMeasure.WeightedAggregatingSimilarityMeasureBuilder<R> |
|
WeightedAggregatingSimilarityMeasure.WeightedSimilarity<T> |
|
WeightedAggregatingSimilarityMeasure.WeightedValue |
|
WeightedEdge<T> |
A weighted (directed) edge between two records.
|