
A

AbstractMatcher(List<? extends Queue<List<Integer>>>, List<? extends Queue<List<Integer>>>) - Constructor for class com.bakdata.dedupe.matching.AbstractStableMarriage.AbstractMatcher
 
AbstractStableMarriage<T> - Class in com.bakdata.dedupe.matching
 
AbstractStableMarriage() - Constructor for class com.bakdata.dedupe.matching.AbstractStableMarriage
 
AbstractStableMarriage.AbstractMatcher - Class in com.bakdata.dedupe.matching
 
AbstractStableMarriage.Matcher - Interface in com.bakdata.dedupe.matching
 
add(double, SimilarityMeasure<R>) - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedAggregatingSimilarityMeasureBuilder
 
add(double, Function<R, ? extends T>, SimilarityMeasure<? super T>) - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedAggregatingSimilarityMeasureBuilder
 
add(T) - Method in class com.bakdata.dedupe.clustering.Cluster
 
AdditionalFieldMergeBuilder(Merge.IllTypedFieldMergeBuilder<F, F, R>) - Constructor for class com.bakdata.dedupe.fusion.Merge.AdditionalFieldMergeBuilder
 
AggregatingSimilarityMeasure<T> - Class in com.bakdata.dedupe.similarity
Aggregates similarities with a given aggregator.
AggregatingSimilarityMeasure(ToDoubleFunction<? super DoubleStream>, SimilarityMeasure<? super T>...) - Constructor for class com.bakdata.dedupe.similarity.AggregatingSimilarityMeasure
Creates an AggregatingSimilarityMeasure with the given aggregator and the similarity measures.
AggregatingSimilarityMeasure(ToDoubleFunction<? super DoubleStream>, Iterable<? extends SimilarityMeasure<? super T>>) - Constructor for class com.bakdata.dedupe.similarity.AggregatingSimilarityMeasure
Creates an AggregatingSimilarityMeasure with the given aggregator and the similarity measures.
AggregatingSimilarityMeasure.AggregatingSimilarityMeasureBuilder<T> - Class in com.bakdata.dedupe.similarity
 
aggregator(ToDoubleFunction<? super DoubleStream>) - Method in class com.bakdata.dedupe.similarity.AggregatingSimilarityMeasure.AggregatingSimilarityMeasureBuilder
 
aggregator(ToDoubleFunction<Stream<WeightedAggregatingSimilarityMeasure.WeightedValue>>) - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedAggregatingSimilarityMeasureBuilder
 
and(T) - Method in class com.bakdata.dedupe.candidate_selection.CompositeValue
Adds another value to the composition.
andThen(ConflictResolution<O, O2>) - Method in interface com.bakdata.dedupe.fusion.ConflictResolution
Chains two conflict resolution functions, so that if some values remain unresolved after this conflict resolution, the successor will be applied on these remaining alternatives.
andThen(ValueTransformation<? super R, ? extends V>) - Method in interface com.bakdata.dedupe.similarity.ValueTransformation
Composes this and the other transformation, such that this transformation is first applied to values and the given transformation is applied afterwards.
AnnotatedValue<T> - Class in com.bakdata.dedupe.fusion
A value with lineage information; in particular, the Source and the timestamp.
AnnotatedValue(T, Source, LocalDateTime) - Constructor for class com.bakdata.dedupe.fusion.AnnotatedValue
 
apply(FusedValue<T>) - Method in interface com.bakdata.dedupe.fusion.IncompleteFusionHandler
 
assign(Collection<WeightedEdge<T>>) - Method in interface com.bakdata.dedupe.matching.Assigner
Finds the set of edges forming a matching that maximizes the sum of the edge weights.
assign(Collection<WeightedEdge<T>>) - Method in interface com.bakdata.dedupe.matching.BipartiteMatcher
 
Assigner<T> - Interface in com.bakdata.dedupe.matching
Implements an algorithm that solves an assignment problem.
assignMaterialized(Collection<WeightedEdge<T>>) - Method in interface com.bakdata.dedupe.matching.Assigner
Finds the set of edges forming a matching that maximizes the sum of the edge weights.
assumeEqualValue() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
A no-op conflict resolution that eventually leads to a FusionException if values are not resolved.
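The chaining behaviour described for andThen(ConflictResolution<O, O2>) above can be pictured with a small stand-alone sketch. It does not use the library's ConflictResolution type; the class ChainedResolutionSketch and the helpers chain, distinct, and longest are hypothetical names, and a resolution is assumed to be a plain function from a list of alternatives to a (hopefully shorter) list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

/** Hypothetical stand-in for chained conflict resolutions; not the library's API. */
final class ChainedResolutionSketch {

    /** Applies first; only if more than one alternative remains is second applied to the remainder. */
    static <T> UnaryOperator<List<T>> chain(UnaryOperator<List<T>> first, UnaryOperator<List<T>> second) {
        return values -> {
            List<T> remaining = first.apply(values);
            return remaining.size() > 1 ? second.apply(remaining) : remaining;
        };
    }

    /** Keeps each distinct value once, preserving encounter order. */
    static <T> List<T> distinct(List<T> values) {
        return values.stream().distinct().collect(Collectors.toList());
    }

    /** Keeps only the strings of maximal length. */
    static List<String> longest(List<String> values) {
        int max = values.stream().mapToInt(String::length).max().orElse(0);
        return values.stream().filter(v -> v.length() == max).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        UnaryOperator<List<String>> resolution =
                chain(ChainedResolutionSketch::distinct, ChainedResolutionSketch::longest);
        // distinct leaves [Jon, Jonathan]; longest then resolves the remaining conflict.
        System.out.println(resolution.apply(Arrays.asList("Jon", "Jonathan", "Jon"))); // prints [Jonathan]
    }
}
```

When the first step already leaves a single value, the successor is never invoked, which matches the "remaining alternatives" wording of the entry.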

B

beiderMorse() - Static method in class com.bakdata.dedupe.similarity.CommonTransformations
Creates a normalizer that turns strings into their Beider-Morse encoding.
bigram() - Static method in class com.bakdata.dedupe.similarity.CommonTransformations
Creates a tokenizer that splits a string into bigrams; that is, pairs of two consecutive characters.
binarize() - Method in interface com.bakdata.dedupe.similarity.SimilarityMeasure
Binarizes the similarity returned by this measure, such that all values > 0 result in a similarity of 1.
BipartiteMatcher<T> - Interface in com.bakdata.dedupe.matching
Implements an algorithm that finds a matching in a bipartite graph.
breakEngangement(Integer, Integer) - Method in class com.bakdata.dedupe.matching.AbstractStableMarriage.AbstractMatcher
 
build() - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod.OnlineSortedNeighborhoodMethodBuilder
 
build() - Method in class com.bakdata.dedupe.classifier.ClassificationResult.ClassificationResultBuilder
 
build() - Method in class com.bakdata.dedupe.classifier.ClassifiedCandidate.ClassifiedCandidateBuilder
 
build() - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.RuleBasedClassifierBuilder
 
build() - Method in class com.bakdata.dedupe.clustering.Cluster.ClusterBuilder
 
build() - Method in class com.bakdata.dedupe.clustering.ConsistentClustering.ConsistentClusteringBuilder
 
build() - Method in class com.bakdata.dedupe.clustering.RefineCluster.RefineClusterBuilder
 
build() - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure.RefinedTransitiveClosureBuilder
 
build() - Method in class com.bakdata.dedupe.clustering.TransitiveClosure.TransitiveClosureBuilder
 
build() - Method in class com.bakdata.dedupe.deduplication.online.FusingOnlineDuplicateDetection.FusingOnlineDuplicateDetectionBuilder
 
build() - Method in class com.bakdata.dedupe.duplicate_detection.online.OnlinePairBasedDuplicateDetection.OnlinePairBasedDuplicateDetectionBuilder
 
build() - Method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion.ConflictResolutionFusionBuilder
 
build() - Method in class com.bakdata.dedupe.fusion.FusionContext.FusionContextBuilder
 
build() - Method in class com.bakdata.dedupe.fusion.Merge.AdditionalFieldMergeBuilder
 
build() - Method in class com.bakdata.dedupe.fusion.Merge.MergeBuilder
 
build() - Method in class com.bakdata.dedupe.similarity.AggregatingSimilarityMeasure.AggregatingSimilarityMeasureBuilder
 
build() - Method in class com.bakdata.dedupe.similarity.SimilarityContext.SimilarityContextBuilder
 
build() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedAggregatingSimilarityMeasureBuilder
 
builder() - Static method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod
 
builder() - Static method in class com.bakdata.dedupe.classifier.ClassificationResult
 
builder() - Static method in class com.bakdata.dedupe.classifier.ClassifiedCandidate
 
builder() - Static method in class com.bakdata.dedupe.classifier.RuleBasedClassifier
 
builder() - Static method in class com.bakdata.dedupe.clustering.Cluster
 
builder() - Static method in class com.bakdata.dedupe.clustering.ConsistentClustering
 
builder() - Static method in class com.bakdata.dedupe.clustering.RefineCluster
 
builder() - Static method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure
 
builder() - Static method in class com.bakdata.dedupe.clustering.TransitiveClosure
 
builder() - Static method in class com.bakdata.dedupe.deduplication.online.FusingOnlineDuplicateDetection
 
builder() - Static method in class com.bakdata.dedupe.duplicate_detection.online.OnlinePairBasedDuplicateDetection
 
builder() - Static method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion
 
builder() - Static method in class com.bakdata.dedupe.fusion.FusionContext
 
builder() - Static method in class com.bakdata.dedupe.similarity.AggregatingSimilarityMeasure
 
builder() - Static method in class com.bakdata.dedupe.similarity.SimilarityContext
 
builder() - Static method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure
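As an illustration of the bigram() entry in this section, the following sketch splits a string into overlapping pairs of consecutive characters. It is not the library's implementation: the class BigramSketch and the method bigrams are hypothetical, and returning an empty list for inputs shorter than two characters is an assumption (a real tokenizer might pad instead).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical bigram tokenizer; not the library's API. */
final class BigramSketch {

    /** Returns all overlapping two-character substrings, in order of appearance. */
    static List<String> bigrams(String text) {
        List<String> result = new ArrayList<>();
        // Stop as soon as fewer than two characters remain.
        for (int i = 0; i + 2 <= text.length(); i++) {
            result.add(text.substring(i, i + 2));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(bigrams("hello")); // prints [he, el, ll, lo]
    }
}
```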
 

C

CachingSimilarity<T, I> - Class in com.bakdata.dedupe.similarity
A lightweight caching layer over SimilarityMeasure that allows implementations to repeatedly calculate expensive similarities without implementing a cache of their own.
CachingSimilarity(SimilarityMeasure<T>, Function<T, I>) - Constructor for class com.bakdata.dedupe.similarity.CachingSimilarity
 
calculated(T) - Static method in class com.bakdata.dedupe.fusion.AnnotatedValue
Wraps the given value with a calculated source and the current timestamp.
calculateNonEmptyCollectionSimilarity(C, C, SimilarityContext) - Method in interface com.bakdata.dedupe.similarity.CollectionSimilarityMeasure
Calculates the similarity ignoring the trivial cases of empty input collections.
calculateNonEmptyCollectionSimilarity(C, C, SimilarityContext) - Method in class com.bakdata.dedupe.similarity.CosineSimilarityMeasure
 
calculateNonEmptyCollectionSimilarity(C, C, SimilarityContext) - Method in class com.bakdata.dedupe.similarity.MatchingSimilarity
 
calculateNonEmptyCollectionSimilarity(C, C, SimilarityContext) - Method in interface com.bakdata.dedupe.similarity.SetSimilarityMeasure
 
calculateNonEmptySetSimilarity(Set<E>, Set<E>, SimilarityContext) - Method in interface com.bakdata.dedupe.similarity.SetSimilarityMeasure
Calculates the similarity ignoring the trivial cases of empty input collections.
candidate(Candidate<T>) - Method in class com.bakdata.dedupe.classifier.ClassifiedCandidate.ClassifiedCandidateBuilder
 
Candidate<T> - Interface in com.bakdata.dedupe.candidate_selection
Represents a candidate pair that was generated with an OfflineCandidateSelection.
candidateSelection(OnlineCandidateSelection<T>) - Method in class com.bakdata.dedupe.duplicate_detection.online.OnlinePairBasedDuplicateDetection.OnlinePairBasedDuplicateDetectionBuilder
 
CandidateSelection<T> - Interface in com.bakdata.dedupe.candidate_selection
Selects candidates from a static or dynamic dataset accessible through Iterables.
canEqual(Object) - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod.Pass
 
canEqual(Object) - Method in class com.bakdata.dedupe.clustering.Cluster
 
canEqual(Object) - Method in class com.bakdata.dedupe.matching.StronglyStableMarriage
 
canEqual(Object) - Method in class com.bakdata.dedupe.matching.WeaklyStableMarriage
 
checkSplit(Collection<? extends Cluster<C, T>>, T) - Method in interface com.bakdata.dedupe.clustering.ClusterSplitHandler
Checks if the given set of clusters contains more than one element and, if so, invokes ClusterSplitHandler.clusterSplit(Cluster, List).
classification(Classification) - Method in class com.bakdata.dedupe.classifier.ClassificationResult.ClassificationResultBuilder
 
Classification - Enum in com.bakdata.dedupe.classifier
Contains the possible classification classes.
ClassificationException - Exception in com.bakdata.dedupe.classifier
An exception thrown by a Classifier whenever an error occurs during classification.
ClassificationException() - Constructor for exception com.bakdata.dedupe.classifier.ClassificationException
 
ClassificationException(String) - Constructor for exception com.bakdata.dedupe.classifier.ClassificationException
 
ClassificationException(String, Throwable) - Constructor for exception com.bakdata.dedupe.classifier.ClassificationException
 
ClassificationException(String, Throwable, boolean, boolean) - Constructor for exception com.bakdata.dedupe.classifier.ClassificationException
 
ClassificationException(Throwable) - Constructor for exception com.bakdata.dedupe.classifier.ClassificationException
 
classificationResult(ClassificationResult) - Method in class com.bakdata.dedupe.classifier.ClassifiedCandidate.ClassifiedCandidateBuilder
 
ClassificationResult - Class in com.bakdata.dedupe.classifier
The classification of a Candidate with additional information.
ClassificationResult.ClassificationResultBuilder - Class in com.bakdata.dedupe.classifier
 
ClassifiedCandidate<T> - Class in com.bakdata.dedupe.classifier
Contains a ClassificationResult with the corresponding Candidate.
ClassifiedCandidate(Candidate<T>, ClassificationResult) - Constructor for class com.bakdata.dedupe.classifier.ClassifiedCandidate
 
ClassifiedCandidate.ClassifiedCandidateBuilder<T> - Class in com.bakdata.dedupe.classifier
 
classifier(Classifier<T>) - Method in class com.bakdata.dedupe.clustering.RefineCluster.RefineClusterBuilder
 
classifier(Classifier<T>) - Method in class com.bakdata.dedupe.duplicate_detection.online.OnlinePairBasedDuplicateDetection.OnlinePairBasedDuplicateDetectionBuilder
 
Classifier<T> - Interface in com.bakdata.dedupe.classifier
Classifies an OnlineCandidate as duplicate, non-duplicate, or other Classifications.
classify(Candidate<T>) - Method in interface com.bakdata.dedupe.classifier.Classifier
Classifies the OnlineCandidate as duplicate, non-duplicate, or other Classifications.
classify(Candidate<T>) - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier
 
classify(Candidate<T>) - Method in class com.bakdata.dedupe.classifier.OracleClassifier
 
classifyCandidate(Candidate<T>) - Method in interface com.bakdata.dedupe.classifier.Classifier
Classifies the Candidate as duplicate, non-duplicate, or other Classifications and stores the ClassificationResult together with the candidate.
clearPasses() - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod.OnlineSortedNeighborhoodMethodBuilder
 
clearRules() - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.RuleBasedClassifierBuilder
 
clearSources() - Method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion.ConflictResolutionFusionBuilder
 
clearWeightedSimilarities() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedAggregatingSimilarityMeasureBuilder
 
closure(TransitiveClosure<C, T, I>) - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure.RefinedTransitiveClosureBuilder
 
cluster(Stream<ClassifiedCandidate<T>>) - Method in interface com.bakdata.dedupe.clustering.Clustering
Creates a coherent Cluster from a list of ClassifiedCandidates.
cluster(Stream<ClassifiedCandidate<T>>) - Method in class com.bakdata.dedupe.clustering.ConsistentClustering
 
cluster(Stream<ClassifiedCandidate<T>>) - Method in class com.bakdata.dedupe.clustering.OracleClustering
 
cluster(Stream<ClassifiedCandidate<T>>) - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure
 
cluster(Stream<ClassifiedCandidate<T>>) - Method in class com.bakdata.dedupe.clustering.TransitiveClosure
 
Cluster<C extends java.lang.Comparable<C>, T> - Class in com.bakdata.dedupe.clustering
A cluster is a coherent collection of duplicate records.
Cluster(C) - Constructor for class com.bakdata.dedupe.clustering.Cluster
 
Cluster(C, List<T>) - Constructor for class com.bakdata.dedupe.clustering.Cluster
 
Cluster.ClusterBuilder<C extends java.lang.Comparable<C>, T> - Class in com.bakdata.dedupe.clustering
 
clusterDuplicates(Iterable<? extends Candidate<T>>) - Method in class com.bakdata.dedupe.clustering.TransitiveClosure
 
clusterIdGenerator(Function<? super Iterable<? extends T>, C>) - Method in class com.bakdata.dedupe.clustering.RefineCluster.RefineClusterBuilder
 
clusterIdGenerator(Function<? super Iterable<? extends T>, C>) - Method in class com.bakdata.dedupe.clustering.TransitiveClosure.TransitiveClosureBuilder
 
ClusterIdGenerators - Class in com.bakdata.dedupe.clustering
A collection of typical cluster id generators.
clusterIndex(Map<I, Cluster<C, T>>) - Method in class com.bakdata.dedupe.clustering.TransitiveClosure.TransitiveClosureBuilder
 
clustering(Clustering<C, T>) - Method in class com.bakdata.dedupe.clustering.ConsistentClustering.ConsistentClusteringBuilder
 
clustering(Clustering<C, T>) - Method in class com.bakdata.dedupe.duplicate_detection.online.OnlinePairBasedDuplicateDetection.OnlinePairBasedDuplicateDetectionBuilder
 
Clustering<C extends java.lang.Comparable<C>, T> - Interface in com.bakdata.dedupe.clustering
A clustering algorithm takes a list of ClassifiedCandidates and creates a coherent Cluster, such that all pairs of records inside the cluster are duplicate and no record outside the cluster is a duplicate with any record inside the cluster.
Clusters - Class in com.bakdata.dedupe.clustering
Utility methods for Cluster.
clusterSplit(Cluster<C, T>, List<Cluster<C, T>>) - Method in interface com.bakdata.dedupe.clustering.ClusterSplitHandler
Invoked when an already existing cluster is split up during the Clustering of (online) deduplication.
ClusterSplitHandler<C extends java.lang.Comparable<C>, T> - Interface in com.bakdata.dedupe.clustering
A callback that is invoked when an already existing cluster is split up during the Clustering of (online) deduplication.
codec(StringEncoder) - Static method in class com.bakdata.dedupe.similarity.CommonTransformations
Wraps a StringEncoder into a ValueTransformation.
CollectionSimilarityMeasure<C extends java.util.Collection<? extends E>, E> - Interface in com.bakdata.dedupe.similarity
A SimilarityMeasure that is defined over Collections.
colognePhonetic() - Static method in class com.bakdata.dedupe.similarity.CommonTransformations
Creates a normalizer that turns strings into their Cologne phonetics code.
com.bakdata.dedupe.candidate_selection - package com.bakdata.dedupe.candidate_selection
Base data structures shared by online and offline candidate selections that choose promising pairs to limit the search space for duplicates.
com.bakdata.dedupe.candidate_selection.offline - package com.bakdata.dedupe.candidate_selection.offline
Interfaces and implementations for offline candidate selections that choose promising pairs to limit search space for duplicates in a materialized dataset.
com.bakdata.dedupe.candidate_selection.online - package com.bakdata.dedupe.candidate_selection.online
Interfaces and implementations for online candidate selections that choose promising pairs to limit search space for duplicates in a streaming dataset.
com.bakdata.dedupe.classifier - package com.bakdata.dedupe.classifier
Interfaces, data structures, and implementations for the classification of Candidate pairs into duplicates and non-duplicates.
com.bakdata.dedupe.clustering - package com.bakdata.dedupe.clustering
Clusters ClassifiedCandidates into coherent clusters.
com.bakdata.dedupe.deduplication - package com.bakdata.dedupe.deduplication
Provides interfaces and implementations for a full deduplication process, which ensures that no duplicate record is emitted.
com.bakdata.dedupe.deduplication.offline - package com.bakdata.dedupe.deduplication.offline
Full offline deduplication for materialized data.
com.bakdata.dedupe.deduplication.online - package com.bakdata.dedupe.deduplication.online
Full online deduplication for streaming data.
com.bakdata.dedupe.duplicate_detection - package com.bakdata.dedupe.duplicate_detection
Provides base interfaces and implementations for finding duplicate clusters.
com.bakdata.dedupe.duplicate_detection.offline - package com.bakdata.dedupe.duplicate_detection.offline
Offline duplicate detection to find duplicate clusters in materialized data.
com.bakdata.dedupe.duplicate_detection.online - package com.bakdata.dedupe.duplicate_detection.online
Online duplicate detection to find duplicate clusters in streaming data.
com.bakdata.dedupe.fusion - package com.bakdata.dedupe.fusion
Provides means to reconcile a duplicate cluster into a consistent representation.
com.bakdata.dedupe.matching - package com.bakdata.dedupe.matching
Assigns and matches nodes of a bipartite graph.
com.bakdata.dedupe.similarity - package com.bakdata.dedupe.similarity
Data structures, interfaces, and implementations to define similarity measures that are ultimately used to detect duplicates.
com.bakdata.util - package com.bakdata.util
Utility classes that should not be deemed public API.
CommonConflictResolutions - Class in com.bakdata.dedupe.fusion
Provides factory methods for common conflict resolutions.
CommonSimilarityMeasures - Class in com.bakdata.dedupe.similarity
A utility class that offers factory methods for common similarity measures.
CommonTransformations - Class in com.bakdata.dedupe.similarity
A utility class that offers factory methods for common similarity transformations.
compareTo(CompositeValue<T1, T2>) - Method in class com.bakdata.dedupe.candidate_selection.CompositeValue
 
compose(T1, T2) - Static method in class com.bakdata.dedupe.candidate_selection.CompositeValue
Creates a composite value of the two values.
CompositeValue<T1 extends java.lang.Comparable<T1>, T2 extends java.lang.Comparable<T2>> - Class in com.bakdata.dedupe.candidate_selection
Helper for composed (sorting) keys, which will perform position-wise comparison of the key elements.
CompositeValue(T1, T2) - Constructor for class com.bakdata.dedupe.candidate_selection.CompositeValue
 
confidence(double) - Method in class com.bakdata.dedupe.classifier.ClassificationResult.ClassificationResultBuilder
 
ConflictResolution<I, O> - Interface in com.bakdata.dedupe.fusion
Resolves a conflict; that is, two or more different values that need to be merged into a single value for the fused representation.
ConflictResolutionFusion<R> - Class in com.bakdata.dedupe.fusion
A fusion approach based on conflict resolution.
ConflictResolutionFusion.ConflictResolutionFusionBuilder<R> - Class in com.bakdata.dedupe.fusion
 
ConsistentClustering<C extends java.lang.Comparable<C>, T, I extends java.lang.Comparable<? super I>> - Class in com.bakdata.dedupe.clustering
Wraps another clustering and keeps clusters together when the wrapped clustering would split them.
Example: consider a stable marriage-based clustering where A1-B have been previously matched and subsequently clustered.
ConsistentClustering.ConsistentClusteringBuilder<C extends java.lang.Comparable<C>, T, I extends java.lang.Comparable<? super I>> - Class in com.bakdata.dedupe.clustering
 
contains(T) - Method in class com.bakdata.dedupe.clustering.Cluster
 
contextSupplier(Supplier<SimilarityContext>) - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.RuleBasedClassifierBuilder
 
convertingBack(ConflictResolution<I, F>) - Method in class com.bakdata.dedupe.fusion.Merge.IllTypedFieldMergeBuilder
 
convertingWith(ConflictResolution<F, I>) - Method in class com.bakdata.dedupe.fusion.Merge.AdditionalFieldMergeBuilder
 
convertingWith(ConflictResolution<F, I>) - Method in class com.bakdata.dedupe.fusion.Merge.FieldMergeBuilder
 
corresponding(ResolutionTag<?>) - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Retains all values coming from the same source as the values resolved by the ResolutionTag.
corresponding(ResolutionTag<?>) - Method in class com.bakdata.dedupe.fusion.Merge.FieldMergeBuilder
 
correspondingToPrevious() - Method in class com.bakdata.dedupe.fusion.Merge.FieldMergeBuilder
 
cosine() - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Calculates the cosine similarity measure over two bags of elements.
CosineSimilarityMeasure<C extends java.util.Collection<? extends E>, E> - Class in com.bakdata.dedupe.similarity
Calculates the cosine similarity measure over two bags of elements.
CosineSimilarityMeasure() - Constructor for class com.bakdata.dedupe.similarity.CosineSimilarityMeasure
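The cosine similarity over two bags of elements, as named in the cosine() and CosineSimilarityMeasure entries above, can be sketched as follows. This is a textbook formulation (dot product of the element-count vectors divided by the product of their Euclidean norms), not the library's code; the class CosineSketch and its methods are hypothetical, and returning 0 for an empty bag is an assumption, since the index only states that trivial empty-collection cases are handled separately.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical cosine similarity over bags (multisets); not the library's API. */
final class CosineSketch {

    /** Counts how often each element occurs in the bag. */
    private static <E> Map<E, Integer> counts(Collection<? extends E> bag) {
        Map<E, Integer> counts = new HashMap<>();
        for (E element : bag) {
            counts.merge(element, 1, Integer::sum);
        }
        return counts;
    }

    /** Euclidean norm of the count vector. */
    private static double norm(Map<?, Integer> counts) {
        double sum = 0;
        for (int count : counts.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    /** Dot product of the two count vectors divided by the product of their norms. */
    static <E> double cosine(Collection<? extends E> left, Collection<? extends E> right) {
        Map<E, Integer> l = CosineSketch.<E>counts(left);
        Map<E, Integer> r = CosineSketch.<E>counts(right);
        double dot = 0;
        for (Map.Entry<E, Integer> entry : l.entrySet()) {
            dot += entry.getValue() * (double) r.getOrDefault(entry.getKey(), 0);
        }
        double normL = norm(l);
        double normR = norm(r);
        // Assumption: an empty bag yields similarity 0 instead of a division by zero.
        return (normL == 0 || normR == 0) ? 0.0 : dot / (normL * normR);
    }

    public static void main(String[] args) {
        // Counts are (a=2, b=1) and (a=1, b=1), so the result is 3 / sqrt(5 * 2).
        System.out.println(cosine(Arrays.asList("a", "a", "b"), Arrays.asList("a", "b")));
    }
}
```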
 
createMatcher(List<? extends Queue<List<Integer>>>, List<? extends Queue<List<Integer>>>) - Method in class com.bakdata.dedupe.matching.AbstractStableMarriage
 
createMatcher(List<? extends Queue<List<Integer>>>, List<? extends Queue<List<Integer>>>) - Method in class com.bakdata.dedupe.matching.StronglyStableMarriage
 
createMatcher(List<? extends Queue<List<Integer>>>, List<? extends Queue<List<Integer>>>) - Method in class com.bakdata.dedupe.matching.WeaklyStableMarriage
 
cutoff(double) - Method in class com.bakdata.dedupe.similarity.CutoffSimiliarityMeasure
 
cutoff(double) - Method in class com.bakdata.dedupe.similarity.Levenshtein
 
cutoff(double) - Method in interface com.bakdata.dedupe.similarity.SimilarityMeasure
Cuts the similarity returned by this measure, such that all values below the threshold result in a similarity of 0, and all values in [threshold, 1] are left untouched.
cutoff(double) - Method in class com.bakdata.dedupe.similarity.TransformingSimilarityMeasure
 
cutoff(double, double) - Static method in class com.bakdata.dedupe.similarity.CutoffSimiliarityMeasure
Cuts the similarity, such that all values below the threshold result in a similarity of 0, and all values in [threshold, 1] are left untouched.
CutoffSimiliarityMeasure<T> - Class in com.bakdata.dedupe.similarity
Cuts the similarity returned by the wrapped measure, such that all values below the threshold result in a similarity of 0, and all values in [threshold, 1] are left untouched.
CutoffSimiliarityMeasure(SimilarityMeasure<T>, double) - Constructor for class com.bakdata.dedupe.similarity.CutoffSimiliarityMeasure
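The cutoff rule stated in the cutoff entries above is small enough to write out. The sketch below is independent of the library: CutoffSketch, cutoff, and withCutoff are hypothetical names, the argument order (similarity, threshold) is an assumption, and a similarity measure is modelled as a plain ToDoubleBiFunction rather than the library's SimilarityMeasure interface.

```java
import java.util.function.ToDoubleBiFunction;

/** Hypothetical cutoff helper; not the library's API. */
final class CutoffSketch {

    /** Values below the threshold become 0; values in [threshold, 1] pass through unchanged. */
    static double cutoff(double similarity, double threshold) {
        return similarity < threshold ? 0.0 : similarity;
    }

    /** Wraps a measure so that the cutoff rule is applied to every result it returns. */
    static <T> ToDoubleBiFunction<T, T> withCutoff(ToDoubleBiFunction<T, T> measure, double threshold) {
        return (left, right) -> cutoff(measure.applyAsDouble(left, right), threshold);
    }

    public static void main(String[] args) {
        System.out.println(cutoff(0.3, 0.5)); // prints 0.0
        System.out.println(cutoff(0.8, 0.5)); // prints 0.8
    }
}
```

Note that the threshold itself is kept: a similarity exactly equal to the threshold is left untouched, in line with the closed interval [threshold, 1].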
 

D

deduplicate(Stream<? extends T>) - Method in interface com.bakdata.dedupe.deduplication.Deduplication
Deduplicates the dataset.
deduplicate(Stream<? extends T>) - Method in interface com.bakdata.dedupe.deduplication.online.OnlineDeduplication
 
deduplicate(T) - Method in class com.bakdata.dedupe.deduplication.online.FusingOnlineDuplicateDetection
 
deduplicate(T) - Method in interface com.bakdata.dedupe.deduplication.online.OnlineDeduplication
Deduplicates the record with all previously seen records.
Deduplication<T> - Interface in com.bakdata.dedupe.deduplication
A full deduplication process, which ensures that no duplicate record is emitted.
defaultClassificationResult(ClassificationResult) - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.RuleBasedClassifierBuilder
 
defaultRule(SimilarityMeasure<? super T>, double) - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.RuleBasedClassifierBuilder
Creates a threshold rule named "default" that is always applied.
defaultWindowSize(int) - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod.OnlineSortedNeighborhoodMethodBuilder
 
delete(Integer, Integer) - Method in class com.bakdata.dedupe.matching.AbstractStableMarriage.AbstractMatcher
 
demoteToNonDuplicate() - Static method in interface com.bakdata.dedupe.duplicate_detection.PossibleDuplicateHandler
Treats possible duplicates as non-duplicates.
detectDuplicates(Stream<? extends T>) - Method in interface com.bakdata.dedupe.duplicate_detection.DuplicateDetection
Finds all duplicates in the dataset.
detectDuplicates(Stream<? extends T>) - Method in interface com.bakdata.dedupe.duplicate_detection.online.OnlineDuplicateDetection
 
detectDuplicates(T) - Method in interface com.bakdata.dedupe.duplicate_detection.online.OnlineDuplicateDetection
Returns all clusters that have been affected by the new record.
detectDuplicates(T) - Method in class com.bakdata.dedupe.duplicate_detection.online.OnlinePairBasedDuplicateDetection
 
distinct() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Returns the list of values that are unique.
doesNotApply() - Static method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.Rule
Indicates that this similarity measure cannot be applied (e.g., precondition not satisfied).
dontFuse() - Static method in interface com.bakdata.dedupe.fusion.IncompleteFusionHandler
 
DUPLICATE - com.bakdata.dedupe.classifier.Classification
A sure duplicate.
duplicateDetection(OnlineDuplicateDetection<C, T>) - Method in class com.bakdata.dedupe.deduplication.online.FusingOnlineDuplicateDetection.FusingOnlineDuplicateDetectionBuilder
 
DuplicateDetection<C extends java.lang.Comparable<C>, T> - Interface in com.bakdata.dedupe.duplicate_detection
A duplicate detection algorithm processes a dataset of records and returns the distinct Clusters.
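To make the idea of "returning the distinct Clusters" concrete, here is one possible way to turn pairs already classified as duplicates into clusters: a transitive closure computed with a union-find structure. This is only a sketch of one strategy and does not claim to mirror how the implementations in this index work; the class DuplicateClusterSketch, the method clusters, and the representation of records as integer indices are all assumptions.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical transitive-closure clustering via union-find; not the library's API. */
final class DuplicateClusterSketch {

    /** Finds the representative of x, halving the path on the way up. */
    private static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    /** Groups record indices 0..recordCount-1 into clusters connected by duplicate pairs. */
    static Collection<List<Integer>> clusters(int recordCount, int[][] duplicatePairs) {
        int[] parent = new int[recordCount];
        for (int i = 0; i < recordCount; i++) {
            parent[i] = i; // every record starts in its own cluster
        }
        for (int[] pair : duplicatePairs) {
            // Merge the two clusters by linking one representative to the other.
            parent[find(parent, pair[0])] = find(parent, pair[1]);
        }
        Map<Integer, List<Integer>> byRoot = new LinkedHashMap<>();
        for (int i = 0; i < recordCount; i++) {
            byRoot.computeIfAbsent(find(parent, i), root -> new ArrayList<>()).add(i);
        }
        return byRoot.values();
    }

    public static void main(String[] args) {
        // Pairs 0-1 and 1-2 are transitively merged; 3-4 forms a second cluster.
        System.out.println(clusters(5, new int[][] {{0, 1}, {1, 2}, {3, 4}})); // prints [[0, 1, 2], [3, 4]]
    }
}
```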

E

earliest() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Selects all values with the same earliest AnnotatedValue.getDateTime().
elements(List<T>) - Method in class com.bakdata.dedupe.clustering.Cluster.ClusterBuilder
 
engagements - Variable in class com.bakdata.dedupe.matching.AbstractStableMarriage.AbstractMatcher
 
equality() - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
A similarity measure that is 1 if left.equals(right) and 0 otherwise.
equality(SimilarityMeasure<T>) - Static method in class com.bakdata.dedupe.similarity.CachingSimilarity
Creates a cache based on the object equality; that is, the cache triggers if two equal instances are compared.
equals(Object) - Method in class com.bakdata.dedupe.candidate_selection.CompositeValue
 
equals(Object) - Method in class com.bakdata.dedupe.candidate_selection.offline.OfflineCandidate
 
equals(Object) - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineCandidate
 
equals(Object) - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod
 
equals(Object) - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod.Pass
 
equals(Object) - Method in class com.bakdata.dedupe.candidate_selection.SortingKey
 
equals(Object) - Method in class com.bakdata.dedupe.classifier.ClassificationResult
 
equals(Object) - Method in class com.bakdata.dedupe.classifier.ClassifiedCandidate
 
equals(Object) - Method in class com.bakdata.dedupe.classifier.OracleClassifier
 
equals(Object) - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier
 
equals(Object) - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.Rule
 
equals(Object) - Method in class com.bakdata.dedupe.clustering.Cluster
 
equals(Object) - Method in class com.bakdata.dedupe.clustering.ConsistentClustering
 
equals(Object) - Method in class com.bakdata.dedupe.clustering.OracleClustering
 
equals(Object) - Method in class com.bakdata.dedupe.clustering.RefineCluster
 
equals(Object) - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure
 
equals(Object) - Method in class com.bakdata.dedupe.clustering.TransitiveClosure
 
equals(Object) - Method in class com.bakdata.dedupe.deduplication.online.FusingOnlineDuplicateDetection
 
equals(Object) - Method in class com.bakdata.dedupe.duplicate_detection.online.OnlinePairBasedDuplicateDetection
 
equals(Object) - Method in class com.bakdata.dedupe.fusion.AnnotatedValue
 
equals(Object) - Method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion
 
equals(Object) - Method in class com.bakdata.dedupe.fusion.FusedValue
 
equals(Object) - Method in class com.bakdata.dedupe.fusion.FusionContext
 
equals(Object) - Method in class com.bakdata.dedupe.fusion.Merge.AdditionalFieldMergeBuilder
 
equals(Object) - Method in class com.bakdata.dedupe.fusion.Merge
 
equals(Object) - Method in class com.bakdata.dedupe.fusion.Merge.FieldMergeBuilder
 
equals(Object) - Method in class com.bakdata.dedupe.fusion.Merge.IllTypedFieldMergeBuilder
 
equals(Object) - Method in class com.bakdata.dedupe.fusion.Merge.MergeBuilder
 
equals(Object) - Method in class com.bakdata.dedupe.fusion.ResolutionTag
 
equals(Object) - Method in class com.bakdata.dedupe.fusion.Source
 
equals(Object) - Method in class com.bakdata.dedupe.matching.StronglyStableMarriage
 
equals(Object) - Method in class com.bakdata.dedupe.matching.WeaklyStableMarriage
 
equals(Object) - Method in class com.bakdata.dedupe.matching.WeightedEdge
 
equals(Object) - Method in class com.bakdata.dedupe.similarity.AggregatingSimilarityMeasure
 
equals(Object) - Method in class com.bakdata.dedupe.similarity.CachingSimilarity
 
equals(Object) - Method in class com.bakdata.dedupe.similarity.CutoffSimiliarityMeasure
 
equals(Object) - Method in class com.bakdata.dedupe.similarity.Levenshtein
 
equals(Object) - Method in class com.bakdata.dedupe.similarity.MatchingSimilarity
 
equals(Object) - Method in class com.bakdata.dedupe.similarity.SimilarityContext
 
equals(Object) - Method in class com.bakdata.dedupe.similarity.TransformingSimilarityMeasure
 
equals(Object) - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure
 
equals(Object) - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedSimilarity
 
equals(Object) - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedValue
 
equals(Object) - Method in class com.bakdata.util.ExceptionContext
 
equals(Object) - Method in class com.bakdata.util.FunctionalClass
 
equals(Object) - Method in class com.bakdata.util.FunctionalConstructor
 
equals(Object) - Method in class com.bakdata.util.FunctionalMethod
 
equals(Object) - Method in class com.bakdata.util.FunctionalProperty
 
ExceptionContext - Class in com.bakdata.util
The exception context allows the safe execution of code that may throw an exception.
ExceptionContext() - Constructor for class com.bakdata.util.ExceptionContext
 
explanation(String) - Method in class com.bakdata.dedupe.classifier.ClassificationResult.ClassificationResultBuilder
 

F

field(FunctionalProperty<R, F>) - Method in class com.bakdata.dedupe.fusion.Merge.MergeBuilder
 
field(FunctionalProperty<R, F2>) - Method in class com.bakdata.dedupe.fusion.Merge.AdditionalFieldMergeBuilder
 
field(String) - Method in class com.bakdata.dedupe.fusion.Merge.AdditionalFieldMergeBuilder
 
field(String) - Method in class com.bakdata.dedupe.fusion.Merge.MergeBuilder
 
field(String) - Method in class com.bakdata.util.FunctionalClass
Returns the functional field with the given name.
field(Function<R, F>, BiConsumer<R, F>) - Method in class com.bakdata.dedupe.fusion.Merge.MergeBuilder
 
field(Function<R, F2>, BiConsumer<R, F2>) - Method in class com.bakdata.dedupe.fusion.Merge.AdditionalFieldMergeBuilder
 
FieldMergeBuilder(Merge.MergeBuilder<R>, Function<R, F>, BiConsumer<R, F>) - Constructor for class com.bakdata.dedupe.fusion.Merge.FieldMergeBuilder
 
first() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Picks the first value out of the bag of values.
first(SimilarityMeasure<? super T>...) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Returns the first similarity of the given similarity measures that is not SimilarityMeasure.unknown().
first(Iterable<? extends SimilarityMeasure<? super T>>) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Returns the first similarity of the given similarity measures that is not SimilarityMeasure.unknown().
FunctionalClass<T> - Class in com.bakdata.util
A wrapper around Class that can be used to extract callable lambdas to methods and fields.
FunctionalConstructor<T> - Class in com.bakdata.util
A lambda wrapper around the no-arg constructor of a class.
FunctionalConstructor(Constructor<T>) - Constructor for class com.bakdata.util.FunctionalConstructor
 
FunctionalMethod<T> - Class in com.bakdata.util
A lambda wrapper around a method of a class.
FunctionalMethod(Method) - Constructor for class com.bakdata.util.FunctionalMethod
 
FunctionalProperty<T,​F> - Class in com.bakdata.util
A lambda wrapper around a property of a class.
FunctionalProperty(PropertyDescriptor) - Constructor for class com.bakdata.util.FunctionalProperty
 
fuse(Cluster<?, T>) - Method in interface com.bakdata.dedupe.fusion.Fusion
Fuses a cluster of duplicates to one representation.
fuse(Cluster<?, R>) - Method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion
 
fusedValue(Cluster<?, T>, IncompleteFusionHandler<T>) - Method in interface com.bakdata.dedupe.fusion.Fusion
Returns the fused value for a cluster of duplicates.
FusedValue<T> - Class in com.bakdata.dedupe.fusion
A fused value with contextual information.
FusedValue(T, Cluster<?, T>, List<Exception>) - Constructor for class com.bakdata.dedupe.fusion.FusedValue
 
FusingOnlineDuplicateDetection<C extends java.lang.Comparable<C>,​T> - Class in com.bakdata.dedupe.deduplication.online
A full online deduplication process, which retrieves duplicate clusters through OnlineDeduplication and fuses the duplicate clusters into reconciled records.
FusingOnlineDuplicateDetection.FusingOnlineDuplicateDetectionBuilder<C extends java.lang.Comparable<C>,​T> - Class in com.bakdata.dedupe.deduplication.online
 
fusion(Fusion<T>) - Method in class com.bakdata.dedupe.deduplication.online.FusingOnlineDuplicateDetection.FusingOnlineDuplicateDetectionBuilder
 
Fusion<T> - Interface in com.bakdata.dedupe.fusion
Fuses a cluster of duplicates to one representation.
FusionContext - Class in com.bakdata.dedupe.fusion
A fusion context captures exceptions in an ExceptionContext during resolution and provides additional context values.
FusionContext.FusionContextBuilder - Class in com.bakdata.dedupe.fusion
 
FusionException - Exception in com.bakdata.dedupe.fusion
An exception thrown by a ConflictResolution whenever an exception occurs during fusion.
FusionException() - Constructor for exception com.bakdata.dedupe.fusion.FusionException
 
FusionException(String) - Constructor for exception com.bakdata.dedupe.fusion.FusionException
 
FusionException(String, Throwable) - Constructor for exception com.bakdata.dedupe.fusion.FusionException
 
FusionException(String, Throwable, boolean, boolean) - Constructor for exception com.bakdata.dedupe.fusion.FusionException
 
FusionException(Throwable) - Constructor for exception com.bakdata.dedupe.fusion.FusionException
 

G

get(int) - Method in class com.bakdata.dedupe.clustering.Cluster
 
getAggregator() - Method in class com.bakdata.dedupe.similarity.AggregatingSimilarityMeasure
The aggregator that will be applied on the similarity values.
getAggregator() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure
 
getBipartiteMatcher() - Method in class com.bakdata.dedupe.similarity.MatchingSimilarity
 
getCalculated() - Static method in class com.bakdata.dedupe.fusion.Source
A tag to indicate that the respective value has no real source as it has been created during conflict resolution.
getCandidate() - Method in class com.bakdata.dedupe.classifier.ClassifiedCandidate
The classified candidate.
getCandidateSelection() - Method in class com.bakdata.dedupe.duplicate_detection.online.OnlinePairBasedDuplicateDetection
The candidate selection which returns a list of candidates for each new record.
getClassification() - Method in class com.bakdata.dedupe.classifier.ClassificationResult
The classification.
getClassificationResult() - Method in class com.bakdata.dedupe.classifier.ClassifiedCandidate
The resulting classification.
getClassifier() - Method in class com.bakdata.dedupe.clustering.RefineCluster
The classifier used to score the edges.
getClassifier() - Method in class com.bakdata.dedupe.duplicate_detection.online.OnlinePairBasedDuplicateDetection
Classifier to label the candidates.
getClazz() - Method in class com.bakdata.dedupe.fusion.Merge.MergeBuilder
 
getClazz() - Method in class com.bakdata.util.FunctionalClass
The wrapped class.
getClosure() - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure
The underlying transitive closure implementation.
getClusterIdGenerator() - Method in interface com.bakdata.dedupe.clustering.Clustering
The cluster id generator that is used to create an id for a new cluster.
getClusterIdGenerator() - Method in class com.bakdata.dedupe.clustering.ConsistentClustering
 
getClusterIdGenerator() - Method in class com.bakdata.dedupe.clustering.OracleClustering
 
getClusterIdGenerator() - Method in class com.bakdata.dedupe.clustering.RefineCluster
A function to generate the id for newly split clusters.
getClusterIdGenerator() - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure
 
getClusterIdGenerator() - Method in class com.bakdata.dedupe.clustering.TransitiveClosure
A function to generate the id for newly formed clusters.
getClusterIndex() - Method in class com.bakdata.dedupe.clustering.TransitiveClosure
A backing map for old clusters.
getClustering() - Method in class com.bakdata.dedupe.clustering.ConsistentClustering
The wrapped clustering.
getClustering() - Method in class com.bakdata.dedupe.duplicate_detection.online.OnlinePairBasedDuplicateDetection
Clustering algorithm to form coherent clusters of labeled duplicates.
getConfidence() - Method in class com.bakdata.dedupe.classifier.ClassificationResult
Some confidence value that depends on the Classifier implementation.
getConstructor() - Method in class com.bakdata.util.FunctionalClass
Returns the no-arg constructor as a Supplier.
getContainingCluster(Iterator<? extends Cluster<C, T>>, T) - Static method in class com.bakdata.dedupe.clustering.Clusters
Finds the cluster containing a given record and ensures that there is exactly one.
getContextSupplier() - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier
Factory that creates the SimilarityContext before classifying an incoming Candidate.
getCtor() - Method in class com.bakdata.dedupe.fusion.Merge
 
getCtor() - Method in class com.bakdata.dedupe.fusion.Merge.MergeBuilder
 
getCtor() - Method in class com.bakdata.util.FunctionalConstructor
The no-arg constructor.
getDateTime() - Method in class com.bakdata.dedupe.fusion.AnnotatedValue
The time of creation/modification.
getDefaultClassificationResult() - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier
Fallback value when no rule applies.
getDefaultWindowSize() - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod
The default window size, when not explicitly given.
getDescriptor() - Method in class com.bakdata.util.FunctionalProperty
The wrapped property.
getDuplicateDetection() - Method in class com.bakdata.dedupe.deduplication.online.FusingOnlineDuplicateDetection
The duplicate detection returning duplicate clusters.
getElements() - Method in class com.bakdata.dedupe.clustering.Cluster
The list of elements.
getExceptionContext() - Method in class com.bakdata.dedupe.fusion.FusionContext
 
getExceptionContext() - Method in class com.bakdata.dedupe.similarity.SimilarityContext
The exception context.
getExceptions() - Method in class com.bakdata.dedupe.fusion.FusedValue
All exceptions that occurred during fusion.
getExceptions() - Method in class com.bakdata.dedupe.fusion.FusionContext
 
getExceptions() - Method in class com.bakdata.dedupe.similarity.SimilarityContext
 
getExceptions() - Method in class com.bakdata.util.ExceptionContext
The captured exceptions.
getExplanation() - Method in class com.bakdata.dedupe.classifier.ClassificationResult
Additional explanation for humans, such as the similarity and threshold (0.933 >= 0.9) or the name of a rule that triggered.
getFieldMergeBuilder() - Method in class com.bakdata.dedupe.fusion.Merge.IllTypedFieldMergeBuilder
 
getFieldMerges() - Method in class com.bakdata.dedupe.fusion.Merge
 
getFieldMerges() - Method in class com.bakdata.dedupe.fusion.Merge.MergeBuilder
 
getFirst() - Method in class com.bakdata.dedupe.candidate_selection.CompositeValue
The first element.
getFirst() - Method in class com.bakdata.dedupe.matching.WeightedEdge
The first vertex.
getFusion() - Method in class com.bakdata.dedupe.deduplication.online.FusingOnlineDuplicateDetection
The fusion implementation that reconciles the clusters into new records.
getGetter() - Method in class com.bakdata.dedupe.fusion.Merge.FieldMergeBuilder
 
getGetter() - Method in class com.bakdata.util.FunctionalProperty
Returns the getter as a Function that takes the instance as a parameter.
getGoldClusters() - Method in class com.bakdata.dedupe.clustering.OracleClustering
The gold clustering.
getGoldDuplicates() - Method in class com.bakdata.dedupe.classifier.OracleClassifier
The set of real duplicates.
getId() - Method in class com.bakdata.dedupe.clustering.Cluster
The identifier of the cluster.
getIdExtractor() - Method in class com.bakdata.dedupe.clustering.ConsistentClustering
A function to extract the id of a record for efficient, internal data structures.
getIdExtractor() - Method in class com.bakdata.dedupe.clustering.OracleClustering
A function to extract the id of a record for efficient, internal data structures.
getIdExtractor() - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure
Extracts the id of the record.
getIdExtractor() - Method in class com.bakdata.dedupe.clustering.TransitiveClosure
Extracts the id of the record.
getIdExtractor() - Method in class com.bakdata.dedupe.similarity.CachingSimilarity
 
getIncompleteFusionHandler() - Method in class com.bakdata.dedupe.deduplication.online.FusingOnlineDuplicateDetection
A callback for non-trivial clusters.
getInner() - Method in class com.bakdata.dedupe.fusion.Merge.AdditionalFieldMergeBuilder
 
getInner() - Method in class com.bakdata.dedupe.similarity.CutoffSimiliarityMeasure
The similarity measure that is applied first before applying the threshold.
getKeyExtractor() - Method in class com.bakdata.dedupe.candidate_selection.SortingKey
A calculation or simple value access to extract the key.
getLastModifiedExtractor() - Method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion
A function that extracts the last modification timestamp of a record.
getMaxSmallClusterSize() - Method in class com.bakdata.dedupe.clustering.RefineCluster
The maximum size (inclusive) of a cluster.
getMeasure() - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.Rule
 
getMeasure() - Method in class com.bakdata.dedupe.similarity.CachingSimilarity
 
getMeasure() - Method in class com.bakdata.dedupe.similarity.TransformingSimilarityMeasure
The similarity measure to apply on the transformed values.
getMeasure() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedSimilarity
 
getMergeBuilder() - Method in class com.bakdata.dedupe.fusion.Merge.FieldMergeBuilder
 
getMethod() - Method in class com.bakdata.util.FunctionalMethod
The wrapped method.
getName() - Method in class com.bakdata.dedupe.candidate_selection.SortingKey
The name of the sorting key.
getName() - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.Rule
 
getName() - Method in class com.bakdata.dedupe.fusion.ResolutionTag
The name of the tag for debugging.
getName() - Method in class com.bakdata.dedupe.fusion.Source
The name of the source (mostly for debugging).
getNewRecord() - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineCandidate
The new record that triggered the OnlineCandidateSelection.
getNextFreeMen() - Method in class com.bakdata.dedupe.matching.AbstractStableMarriage.AbstractMatcher
 
getNonNullSimilarity(CharSequence, CharSequence, SimilarityContext) - Method in class com.bakdata.dedupe.similarity.Levenshtein
 
getNonNullSimilarity(C, C, SimilarityContext) - Method in interface com.bakdata.dedupe.similarity.CollectionSimilarityMeasure
 
getNonNullSimilarity(R, R, SimilarityContext) - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure
 
getNonNullSimilarity(T, T, SimilarityContext) - Method in class com.bakdata.dedupe.similarity.AggregatingSimilarityMeasure
 
getNonNullSimilarity(T, T, SimilarityContext) - Method in class com.bakdata.dedupe.similarity.CachingSimilarity
 
getNonNullSimilarity(T, T, SimilarityContext) - Method in class com.bakdata.dedupe.similarity.CutoffSimiliarityMeasure
 
getNonNullSimilarity(T, T, SimilarityContext) - Method in interface com.bakdata.dedupe.similarity.SimilarityMeasure
Calculates the similarity of the left and right value.
getNonNullSimilarity(T, T, SimilarityContext) - Method in class com.bakdata.dedupe.similarity.TransformingSimilarityMeasure
 
getOldClusterIndex() - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure
A backing map for old clusters.
getOldRecord() - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineCandidate
The old record already known to the OnlineCandidateSelection.
getOriginalValues() - Method in class com.bakdata.dedupe.fusion.FusedValue
The original values.
getPairMeasure() - Method in class com.bakdata.dedupe.similarity.MatchingSimilarity
 
getPasses() - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod
The different passes used to select the candidates.
getPossibleDuplicateHandler() - Method in class com.bakdata.dedupe.duplicate_detection.online.OnlinePairBasedDuplicateDetection
getRecord1() - Method in interface com.bakdata.dedupe.candidate_selection.Candidate
Returns the first record.
getRecord1() - Method in class com.bakdata.dedupe.candidate_selection.offline.OfflineCandidate
The first record.
getRecord1() - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineCandidate
 
getRecord2() - Method in interface com.bakdata.dedupe.candidate_selection.Candidate
Returns the second record.
getRecord2() - Method in class com.bakdata.dedupe.candidate_selection.offline.OfflineCandidate
The second record.
getRecord2() - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineCandidate
 
getRefineCluster() - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure
The configured refineCluster.
getResolution() - Method in class com.bakdata.dedupe.fusion.Merge.IllTypedFieldMergeBuilder
 
getRootResolution() - Method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion
The root resolution function; usually, Merge.
getRules() - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier
The set of rules that are applied in the order of addition.
getSecond() - Method in class com.bakdata.dedupe.candidate_selection.CompositeValue
The second element.
getSecond() - Method in class com.bakdata.dedupe.matching.WeightedEdge
The second vertex.
getSetter() - Method in class com.bakdata.dedupe.fusion.Merge.FieldMergeBuilder
 
getSetter() - Method in class com.bakdata.util.FunctionalProperty
Returns the setter as a Function that takes the instance and the new value as parameters.
getSimilarity(T, T, SimilarityContext) - Method in interface com.bakdata.dedupe.similarity.SimilarityMeasure
Calculates the similarity of the left and right value.
getSimilarityForNull(T, T, SimilarityContext) - Method in class com.bakdata.dedupe.similarity.SimilarityContext
Calculates the similarity when any of the two values under comparison is null.
getSimilarityMeasureForNull() - Method in class com.bakdata.dedupe.similarity.SimilarityContext
The similarity measure to use when any of the two values under comparison is null.
getSimilarityMeasures() - Method in class com.bakdata.dedupe.similarity.AggregatingSimilarityMeasure
The similarity measures that will be successively applied to the input values.
getSortingKey() - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod.Pass
The sorting key to use in this pass.
getSource() - Method in class com.bakdata.dedupe.fusion.AnnotatedValue
The source of the value.
getSourceExtractor() - Method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion
Finds the name of the source, which can then be used to retrieve the respective source from ConflictResolutionFusion.getSources().
getSources() - Method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion
The list of possible sources.
getSplitHandler() - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure
A callback that may veto cluster splits.
getStableMatches() - Method in class com.bakdata.dedupe.matching.AbstractStableMarriage.AbstractMatcher
 
getStableMatches() - Method in interface com.bakdata.dedupe.matching.AbstractStableMarriage.Matcher
 
getStoredValues() - Method in class com.bakdata.dedupe.fusion.FusionContext
 
getStrictSuccessors(Queue<List<Integer>>, Integer) - Static method in class com.bakdata.dedupe.matching.AbstractStableMarriage.AbstractMatcher
 
getSuccessors(Queue<List<Integer>>, Integer) - Static method in class com.bakdata.dedupe.matching.AbstractStableMarriage.AbstractMatcher
 
getTail(Queue<List<Integer>>, Integer) - Static method in class com.bakdata.dedupe.matching.AbstractStableMarriage.AbstractMatcher
 
getThreshold() - Method in class com.bakdata.dedupe.similarity.CutoffSimiliarityMeasure
The threshold that divides the dissimilar and the similar values.
getThreshold() - Method in class com.bakdata.dedupe.similarity.Levenshtein
The threshold [0; 1], below which the calculation should be aborted.
getTransformation() - Method in class com.bakdata.dedupe.similarity.TransformingSimilarityMeasure
The transformation that is applied to both inputs before calculating the similarity.
getUnknown() - Static method in class com.bakdata.dedupe.fusion.Source
The unknown source is used whenever source extraction in Fusion failed.
getValue() - Method in class com.bakdata.dedupe.fusion.AnnotatedValue
The wrapped value.
getValue() - Method in class com.bakdata.dedupe.fusion.FusedValue
The resulting value.
getValue() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedValue
 
getWeight() - Method in class com.bakdata.dedupe.fusion.Source
The weight of the source, mostly used for weighted majority voting.
getWeight() - Method in class com.bakdata.dedupe.matching.WeightedEdge
The weight.
getWeight() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedSimilarity
 
getWeight() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedValue
 
getWeightedSimilarities() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure
 
getWeightedValue() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedValue
 
getWindowSize() - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod.Pass
The window size, which must be >= 2.

H

handlePartiallyFusedValue(FusedValue<T>) - Method in interface com.bakdata.dedupe.fusion.IncompleteFusionHandler
 
hashCode() - Method in class com.bakdata.dedupe.candidate_selection.CompositeValue
 
hashCode() - Method in class com.bakdata.dedupe.candidate_selection.offline.OfflineCandidate
 
hashCode() - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineCandidate
 
hashCode() - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod
 
hashCode() - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod.Pass
 
hashCode() - Method in class com.bakdata.dedupe.candidate_selection.SortingKey
 
hashCode() - Method in class com.bakdata.dedupe.classifier.ClassificationResult
 
hashCode() - Method in class com.bakdata.dedupe.classifier.ClassifiedCandidate
 
hashCode() - Method in class com.bakdata.dedupe.classifier.OracleClassifier
 
hashCode() - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier
 
hashCode() - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.Rule
 
hashCode() - Method in class com.bakdata.dedupe.clustering.Cluster
 
hashCode() - Method in class com.bakdata.dedupe.clustering.ConsistentClustering
 
hashCode() - Method in class com.bakdata.dedupe.clustering.OracleClustering
 
hashCode() - Method in class com.bakdata.dedupe.clustering.RefineCluster
 
hashCode() - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure
 
hashCode() - Method in class com.bakdata.dedupe.clustering.TransitiveClosure
 
hashCode() - Method in class com.bakdata.dedupe.deduplication.online.FusingOnlineDuplicateDetection
 
hashCode() - Method in class com.bakdata.dedupe.duplicate_detection.online.OnlinePairBasedDuplicateDetection
 
hashCode() - Method in class com.bakdata.dedupe.fusion.AnnotatedValue
 
hashCode() - Method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion
 
hashCode() - Method in class com.bakdata.dedupe.fusion.FusedValue
 
hashCode() - Method in class com.bakdata.dedupe.fusion.FusionContext
 
hashCode() - Method in class com.bakdata.dedupe.fusion.Merge.AdditionalFieldMergeBuilder
 
hashCode() - Method in class com.bakdata.dedupe.fusion.Merge.FieldMergeBuilder
 
hashCode() - Method in class com.bakdata.dedupe.fusion.Merge
 
hashCode() - Method in class com.bakdata.dedupe.fusion.Merge.IllTypedFieldMergeBuilder
 
hashCode() - Method in class com.bakdata.dedupe.fusion.Merge.MergeBuilder
 
hashCode() - Method in class com.bakdata.dedupe.fusion.ResolutionTag
 
hashCode() - Method in class com.bakdata.dedupe.fusion.Source
 
hashCode() - Method in class com.bakdata.dedupe.matching.StronglyStableMarriage
 
hashCode() - Method in class com.bakdata.dedupe.matching.WeaklyStableMarriage
 
hashCode() - Method in class com.bakdata.dedupe.matching.WeightedEdge
 
hashCode() - Method in class com.bakdata.dedupe.similarity.AggregatingSimilarityMeasure
 
hashCode() - Method in class com.bakdata.dedupe.similarity.CachingSimilarity
 
hashCode() - Method in class com.bakdata.dedupe.similarity.CutoffSimiliarityMeasure
 
hashCode() - Method in class com.bakdata.dedupe.similarity.Levenshtein
 
hashCode() - Method in class com.bakdata.dedupe.similarity.MatchingSimilarity
 
hashCode() - Method in class com.bakdata.dedupe.similarity.SimilarityContext
 
hashCode() - Method in class com.bakdata.dedupe.similarity.TransformingSimilarityMeasure
 
hashCode() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure
 
hashCode() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedSimilarity
 
hashCode() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedValue
 
hashCode() - Method in class com.bakdata.util.ExceptionContext
 
hashCode() - Method in class com.bakdata.util.FunctionalClass
 
hashCode() - Method in class com.bakdata.util.FunctionalConstructor
 
hashCode() - Method in class com.bakdata.util.FunctionalMethod
 
hashCode() - Method in class com.bakdata.util.FunctionalProperty
 

I

id(C) - Method in class com.bakdata.dedupe.clustering.Cluster.ClusterBuilder
 
identity(SimilarityMeasure<T>) - Static method in class com.bakdata.dedupe.similarity.CachingSimilarity
Creates a cache based on the object identity; that is, the cache only triggers if the exact same instances are compared.
idExtractor(Function<? super T, ? extends I>) - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure.RefinedTransitiveClosureBuilder
 
idExtractor(Function<? super T, ? extends I>) - Method in class com.bakdata.dedupe.clustering.TransitiveClosure.TransitiveClosureBuilder
 
idExtractor(Function<T, I>) - Method in class com.bakdata.dedupe.clustering.ConsistentClustering.ConsistentClusteringBuilder
 
ignore() - Static method in interface com.bakdata.dedupe.clustering.ClusterSplitHandler
Do nothing.
IllTypedFieldMergeBuilder(Merge.FieldMergeBuilder<F, R>, ConflictResolution<F, I>) - Constructor for class com.bakdata.dedupe.fusion.Merge.IllTypedFieldMergeBuilder
 
incompleteFusionHandler(IncompleteFusionHandler<T>) - Method in class com.bakdata.dedupe.deduplication.online.FusingOnlineDuplicateDetection.FusingOnlineDuplicateDetectionBuilder
 
IncompleteFusionHandler<T> - Interface in com.bakdata.dedupe.fusion
A callback that allows incomplete fusions to be treated either with direct repair algorithms or through side-channel means (that is, ignore for now and repair asynchronously with the help of domain experts).
inequality() - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
A similarity measure that is 0 if left.equals(right) or 1 otherwise.
intGenerator() - Static method in class com.bakdata.dedupe.clustering.ClusterIdGenerators
Returns an id generator that generates ints starting from 0.
invoke(T, Object...) - Method in class com.bakdata.util.FunctionalMethod
Invokes the method on an instance with the given parameters.
isAmbiguous() - Method in enum com.bakdata.dedupe.classifier.Classification
Returns whether this class can be used directly.
isNonEmpty(Object) - Method in class com.bakdata.dedupe.fusion.FusionContext
 
isSymmetric() - Method in class com.bakdata.dedupe.similarity.AggregatingSimilarityMeasure
 
isSymmetric() - Method in class com.bakdata.dedupe.similarity.Levenshtein
 
isSymmetric() - Method in interface com.bakdata.dedupe.similarity.SimilarityMeasure
Returns true if sim(a, b) = sim(b, a).
isSymmetric() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure
 
isUnknown() - Method in enum com.bakdata.dedupe.classifier.Classification
 
isUnknown(double) - Static method in interface com.bakdata.dedupe.similarity.SimilarityMeasure
Checks whether the supplied value is SimilarityMeasure.unknown().
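
The identity(SimilarityMeasure<T>) entry above describes a cache that only triggers when the exact same instances are compared. A minimal sketch of that idea, assuming a plain ToDoubleBiFunction in place of SimilarityMeasure; the class IdentityCacheSketch and its methods are hypothetical and are not the library's CachingSimilarity:

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.ToDoubleBiFunction;

/** Hypothetical illustration of an identity-keyed similarity cache. */
final class IdentityCacheSketch<T> {
    private final ToDoubleBiFunction<T, T> measure;
    // IdentityHashMap compares keys with ==, so equal but distinct instances miss the cache.
    private final Map<T, Map<T, Double>> cache = new IdentityHashMap<>();
    private int computations = 0;

    IdentityCacheSketch(ToDoubleBiFunction<T, T> measure) {
        this.measure = measure;
    }

    double similarity(T left, T right) {
        Map<T, Double> row = cache.computeIfAbsent(left, key -> new IdentityHashMap<>());
        Double cached = row.get(right);
        if (cached == null) {
            // Cache miss: delegate to the wrapped measure and remember the result.
            computations++;
            cached = measure.applyAsDouble(left, right);
            row.put(right, cached);
        }
        return cached;
    }

    /** Number of times the wrapped measure was actually invoked. */
    int computations() {
        return computations;
    }
}
```

Comparing the same instance pair twice invokes the wrapped measure once; comparing an equal but distinct instance invokes it again.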

J

jaccard() - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Treats the input collections as sets (removing duplicate elements) and calculates the size of the intersection over the size of the union.
jaroWinkler() - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Jaro-Winkler similarity counts the number of matched and transposed characters with a boost for initial characters.
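
The jaccard() description above (size of the intersection over the size of the union, after treating the inputs as sets) can be sketched in plain Java as follows. The class JaccardSketch is hypothetical and independent of CommonSimilarityMeasures; returning 1.0 for two empty collections is an assumption, since the index does not say how the library handles that case:

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-alone illustration of the Jaccard coefficient. */
final class JaccardSketch {
    static <E> double jaccard(Collection<? extends E> left, Collection<? extends E> right) {
        // Converting to sets removes duplicate elements.
        Set<E> a = new HashSet<>(left);
        Set<E> b = new HashSet<>(right);
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0; // assumption: two empty inputs count as identical
        }
        Set<E> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        // |A ∪ B| = |A| + |B| - |A ∩ B|
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }
}
```

For the inputs [a, b, b, c] and [b, c, d], the sets are {a, b, c} and {b, c, d}, so the result is 2 / 4 = 0.5.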

K

keep() - Static method in interface com.bakdata.dedupe.duplicate_detection.PossibleDuplicateHandler
Ignores possible duplicates and returns them as is, which will most likely result in dropping them.

L

last() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Picks the last value out of the bag of values.
last(Iterable<? extends SimilarityMeasure<? super T>>) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Returns the last similarity of the given similarity measures that is not SimilarityMeasure.unknown().
last(SimilarityMeasure<? super T>...) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Returns the last similarity of the given similarity measures that is not SimilarityMeasure.unknown().
lastModifiedExtractor(Function<R, LocalDateTime>) - Method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion.ConflictResolutionFusionBuilder
 
latest() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Selects all values with the same latest AnnotatedValue.getDateTime().
levenshtein() - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Returns the Levenshtein similarity measure.
Levenshtein<T extends java.lang.CharSequence> - Class in com.bakdata.dedupe.similarity
Provides the Levenshtein similarity calculation, which calculates the number of insertions, deletions, and replacements needed to transform one string into another.
Levenshtein(double) - Constructor for class com.bakdata.dedupe.similarity.Levenshtein
 
longest() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Selects all values with the same longest length.
longGenerator() - Static method in class com.bakdata.dedupe.clustering.ClusterIdGenerators
Returns an id generator that generates longs starting from 0.
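
The Levenshtein entry above counts the insertions, deletions, and replacements needed to transform one string into another. A minimal sketch of that calculation follows; the class LevenshteinSketch is hypothetical, the normalization by the longer length is an assumption about how a distance might be mapped to a similarity in [0; 1], and the threshold-based early abort mentioned under getThreshold() is not reproduced:

```java
/** Hypothetical stand-alone illustration of Levenshtein distance and a derived similarity. */
final class LevenshteinSketch {
    /** Classic dynamic-programming edit distance using two rolling rows. */
    static int distance(CharSequence left, CharSequence right) {
        int[] previous = new int[right.length() + 1];
        int[] current = new int[right.length() + 1];
        for (int j = 0; j <= right.length(); j++) {
            previous[j] = j; // transforming the empty prefix takes j insertions
        }
        for (int i = 1; i <= left.length(); i++) {
            current[0] = i; // transforming to the empty prefix takes i deletions
            for (int j = 1; j <= right.length(); j++) {
                int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                current[j] = Math.min(
                        Math.min(current[j - 1] + 1, previous[j] + 1), // insertion, deletion
                        previous[j - 1] + cost);                       // replacement or match
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[right.length()];
    }

    /** Assumed normalization: 1 - distance / length of the longer input. */
    static double similarity(CharSequence left, CharSequence right) {
        int maxLength = Math.max(left.length(), right.length());
        if (maxLength == 0) {
            return 1.0; // assumption: two empty strings are identical
        }
        return 1.0 - (double) distance(left, right) / maxLength;
    }
}
```

For "kitten" and "sitting" the distance is 3, which under the assumed normalization gives a similarity of 1 - 3/7.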

M

match() - Method in class com.bakdata.dedupe.matching.AbstractStableMarriage.AbstractMatcher
 
match(Collection<WeightedEdge<T>>, Collection<WeightedEdge<T>>) - Method in class com.bakdata.dedupe.matching.AbstractStableMarriage
 
match(Collection<WeightedEdge<T>>, Collection<WeightedEdge<T>>) - Method in interface com.bakdata.dedupe.matching.BipartiteMatcher
Finds the set of edges forming a matching that maximizes the sum of the edge weights.
matching(BipartiteMatcher<E>, SimilarityMeasure<? super E>) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Uses the given pair SimilarityMeasure to calculate a preference matrix and uses the BipartiteMatcher to find the best matching entries.
MatchingSimilarity<C extends java.util.Collection<? extends E>,​E> - Class in com.bakdata.dedupe.similarity
A similarity measure that finds the best matches between two bags of entities and calculates an overall similarity by summing the similarities of these matches and normalizing the sum over the (max) number of elements per collection.
MatchingSimilarity(BipartiteMatcher<E>, SimilarityMeasure<E>) - Constructor for class com.bakdata.dedupe.similarity.MatchingSimilarity
 
matchMaterialized(Collection<WeightedEdge<T>>, Collection<WeightedEdge<T>>) - Method in interface com.bakdata.dedupe.matching.BipartiteMatcher
Finds the set of edges forming a matching that maximizes the sum of the edge weights.
materializedDeduplicate(Iterable<? extends T>) - Method in interface com.bakdata.dedupe.deduplication.Deduplication
Deduplicates the dataset.
materializedDeduplicate(Iterable<? extends T>, Function<? super T, Object>) - Method in interface com.bakdata.dedupe.deduplication.Deduplication
Selects the candidates for the given records and materializes them.
materializeDuplicates(Iterable<? extends T>) - Method in interface com.bakdata.dedupe.duplicate_detection.DuplicateDetection
Finds all duplicates in the dataset.
max() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Selects all values that are maximal.
max(SimilarityMeasure<? super T>...) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Returns the largest similarity of the given similarity measures.
maxSmallClusterSize(int) - Method in class com.bakdata.dedupe.clustering.RefineCluster.RefineClusterBuilder
 
mean() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Calculates the mean value of a bag of conflicting numbers.
mean(SimilarityMeasure<? super T>...) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Returns the mean similarity of the given similarity measures.
median() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Calculates the median value of a bag of conflicting numbers.
mensFavoriteWomen - Variable in class com.bakdata.dedupe.matching.AbstractStableMarriage.AbstractMatcher
 
merge(Function<? super Iterable<? extends T>, ? extends C>, Cluster<C, ? extends T>) - Method in class com.bakdata.dedupe.clustering.Cluster
Merges this cluster with another cluster into one new cluster.
merge(Supplier<T>) - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Starts the creation of a Merge conflict resolution for record types.
merge(Class<T>) - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Starts the creation of a Merge conflict resolution for record types.
Merge<R> - Class in com.bakdata.dedupe.fusion
A nested conflict resolution for complex types.
Merge.AdditionalFieldMergeBuilder<F,​R> - Class in com.bakdata.dedupe.fusion
 
Merge.FieldMergeBuilder<F,​R> - Class in com.bakdata.dedupe.fusion
 
Merge.IllTypedFieldMergeBuilder<I,​F,​R> - Class in com.bakdata.dedupe.fusion
 
Merge.MergeBuilder<R> - Class in com.bakdata.dedupe.fusion
 
MergeBuilder(Supplier<R>, FunctionalClass<R>) - Constructor for class com.bakdata.dedupe.fusion.Merge.MergeBuilder
 
min() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Selects all values that are minimal.
min(SimilarityMeasure<? super T>...) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Returns the smallest similarity of the given similarity measures.
mongeElkan(SimilarityMeasure<E>) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Monge-Elkan is a simple list-based similarity measure, where elements from the left are matched with elements from the right with the highest similarity within a certain index range.
mongeElkan(SimilarityMeasure<E>, int) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Monge-Elkan is a simple list-based similarity measure, where elements from the left are matched with elements from the right with the highest similarity within a certain index range.
mostFrequent() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Selects the most frequent values.

N

negate() - Method in interface com.bakdata.dedupe.similarity.SimilarityMeasure
Swaps the lower and upper bound, such that equal pairs have a similarity of 0 and unequal pairs of 1.
negate(SimilarityMeasure<T>) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Swaps the lower and upper bound, such that equal pairs have a similarity of 0 and unequal pairs of 1.
negativeRule(String, SimilarityMeasure<? super T>) - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.RuleBasedClassifierBuilder
Creates a negative rule that is always applied.
negativeRule(String, BiPredicate<T, T>, SimilarityMeasure<? super T>) - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.RuleBasedClassifierBuilder
Creates a negative rule that is applied only when the precondition holds.
newInstance() - Method in class com.bakdata.util.FunctionalConstructor
Creates a new instance.
ngram(int) - Static method in class com.bakdata.dedupe.similarity.CommonTransformations
Creates a tokenizer for a string into ngrams; that is, two or more succeeding characters.
NON_DUPLICATE - com.bakdata.dedupe.classifier.Classification
A sure non-duplicate.

O

of(ValueTransformation<O, ? extends T>) - Method in interface com.bakdata.dedupe.similarity.SimilarityMeasure
Applies a ValueTransformation to the left and right value of a similarity comparison before applying this similarity measure.
of(Class<T>) - Static method in class com.bakdata.util.FunctionalClass
 
of(Function<O, ? extends T>) - Method in interface com.bakdata.dedupe.similarity.SimilarityMeasure
Applies a ValueTransformation to the left and right value of a similarity comparison before applying this similarity measure.
of(Class<T>) - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.RuleBasedClassifierBuilder
Fluent cast of the type parameter.
of(Class<T>) - Method in class com.bakdata.dedupe.similarity.AggregatingSimilarityMeasure.AggregatingSimilarityMeasureBuilder
Fluent cast of the type parameter.
of(Class<T>) - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedAggregatingSimilarityMeasureBuilder
Fluent cast of the type parameter.
of(T1, T2) - Static method in class com.bakdata.dedupe.candidate_selection.CompositeValue
Creates a composite value of the two values.
OfflineCandidate<T> - Class in com.bakdata.dedupe.candidate_selection.offline
Represents a candidate pair that was generated with an OfflineCandidateSelection.
OfflineCandidate(T, T) - Constructor for class com.bakdata.dedupe.candidate_selection.offline.OfflineCandidate
 
OfflineCandidateSelection<T> - Interface in com.bakdata.dedupe.candidate_selection.offline
Selects candidates from a static dataset accessible through Iterables.
OfflineDeduplication<T> - Interface in com.bakdata.dedupe.deduplication.offline
A full offline deduplication process, which ensures that no duplicate record is in the results.
OfflineDuplicateDetection<C extends java.lang.Comparable<C>,​T> - Interface in com.bakdata.dedupe.duplicate_detection.offline
The offline duplicate detection returns all duplicate records within a dataset.
oldClusterIndex(Map<I, Cluster<C, T>>) - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure.RefinedTransitiveClosureBuilder
 
OnlineCandidate<T> - Class in com.bakdata.dedupe.candidate_selection.online
Represents a candidate pair that was generated with an OnlineCandidateSelection.
OnlineCandidate(T, T) - Constructor for class com.bakdata.dedupe.candidate_selection.online.OnlineCandidate
 
OnlineCandidateSelection<T> - Interface in com.bakdata.dedupe.candidate_selection.online
Selects candidates from a streaming dataset by processing the incoming elements one-at-a-time.
OnlineDeduplication<T> - Interface in com.bakdata.dedupe.deduplication.online
A full online deduplication process, which ensures that no duplicate record is emitted.
OnlineDuplicateDetection<C extends java.lang.Comparable<C>,​T> - Interface in com.bakdata.dedupe.duplicate_detection.online
An online duplicate detection algorithm processes a stream of records and returns all changed Clusters for each record.
OnlinePairBasedDuplicateDetection<C extends java.lang.Comparable<C>,​T> - Class in com.bakdata.dedupe.duplicate_detection.online
A pair-based duplicate detection algorithm, which performs OnlineCandidateSelection, applies a Classifier to the found Candidates, and transforms the found duplicate pairs with a Clustering into Clusters.
OnlinePairBasedDuplicateDetection.OnlinePairBasedDuplicateDetectionBuilder<C extends java.lang.Comparable<C>,​T> - Class in com.bakdata.dedupe.duplicate_detection.online
 
OnlineSortedNeighborhoodMethod<T> - Class in com.bakdata.dedupe.candidate_selection.online
A sorted neighborhood method (SNM) for online deduplication.
OnlineSortedNeighborhoodMethod.OnlineSortedNeighborhoodMethodBuilder<T> - Class in com.bakdata.dedupe.candidate_selection.online
 
OnlineSortedNeighborhoodMethod.Pass<T,​K extends java.lang.Comparable<K>> - Class in com.bakdata.dedupe.candidate_selection.online
Represents a pass over the dataset with a specific sorting key and window size.
OracleClassifier<T> - Class in com.bakdata.dedupe.classifier
A classifier that knows the results perfectly as it receives the gold standard during creation.
OracleClassifier(Set<Candidate<T>>) - Constructor for class com.bakdata.dedupe.classifier.OracleClassifier
 
OracleClustering<C extends java.lang.Comparable<C>,​T,​I> - Class in com.bakdata.dedupe.clustering
A clustering that knows the results perfectly as it receives the gold standard during creation.
OracleClustering(Collection<Cluster<C, T>>, Function<T, I>) - Constructor for class com.bakdata.dedupe.clustering.OracleClustering
 

P

pass(OnlineSortedNeighborhoodMethod.Pass<T, ?>) - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod.OnlineSortedNeighborhoodMethodBuilder
 
Pass(SortingKey<? super T, ? extends K>, int) - Constructor for class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod.Pass
Creates a pass with the given sorting key and window size.
passes(Collection<? extends OnlineSortedNeighborhoodMethod.Pass<T, ?>>) - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod.OnlineSortedNeighborhoodMethodBuilder
 
positionWise(SimilarityMeasure<E>) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Performs a simple one to one comparison of elements in two lists based on their index.
positiveRule(String, SimilarityMeasure<? super T>) - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.RuleBasedClassifierBuilder
Creates a positive rule that is always applied.
positiveRule(String, BiPredicate<T, T>, SimilarityMeasure<? super T>) - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.RuleBasedClassifierBuilder
Creates a positive rule that is applied only when the precondition holds.
POSSIBLE_DUPLICATE - com.bakdata.dedupe.classifier.Classification
Possible duplicates may come from a high uncertainty of the Classifier.
possibleDuplicateFound(ClassifiedCandidate<T>) - Method in interface com.bakdata.dedupe.duplicate_detection.PossibleDuplicateHandler
Invoked when a Classification.POSSIBLE_DUPLICATE has been found during DuplicateDetection.
possibleDuplicateHandler(PossibleDuplicateHandler<T>) - Method in class com.bakdata.dedupe.duplicate_detection.online.OnlinePairBasedDuplicateDetection.OnlinePairBasedDuplicateDetectionBuilder
 
PossibleDuplicateHandler<T> - Interface in com.bakdata.dedupe.duplicate_detection
A callback that is invoked when a Classification.POSSIBLE_DUPLICATE has been found during DuplicateDetection.
preferSource(List<Source>) - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Chooses all values from a specific source.
preferSource(Source...) - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Chooses all values from a specific source.
promoteToDuplicate() - Static method in interface com.bakdata.dedupe.duplicate_detection.PossibleDuplicateHandler
Treats possible duplicates as full duplicates.
propose(Integer, Integer) - Method in class com.bakdata.dedupe.matching.AbstractStableMarriage.AbstractMatcher
 

R

random() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Picks a random value out of the bag of values.
refine(Stream<? extends Cluster<C, T>>, Stream<ClassifiedCandidate<T>>) - Method in class com.bakdata.dedupe.clustering.RefineCluster
 
refineCluster(RefineCluster<C, T>) - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure.RefinedTransitiveClosureBuilder
 
RefineCluster<C extends java.lang.Comparable<C>,​T> - Class in com.bakdata.dedupe.clustering
Splits large clusters into smaller clusters when the inter-cluster similarities are sub-optimal.
RefineCluster.RefineClusterBuilder<C extends java.lang.Comparable<C>,​T> - Class in com.bakdata.dedupe.clustering
 
refinedSoundex(@lombok.NonNull char[]) - Static method in class com.bakdata.dedupe.similarity.CommonTransformations
Creates a normalizer that turns strings into the refined soundex representation.
RefinedTransitiveClosure<C extends java.lang.Comparable<C>,​T,​I extends java.lang.Comparable<? super I>> - Class in com.bakdata.dedupe.clustering
Executes TransitiveClosure and subsequently RefineCluster.
RefinedTransitiveClosure.RefinedTransitiveClosureBuilder<C extends java.lang.Comparable<C>,​T,​I extends java.lang.Comparable<? super I>> - Class in com.bakdata.dedupe.clustering
 
removeCluster(Cluster<C, ? extends T>) - Method in class com.bakdata.dedupe.clustering.TransitiveClosure
 
ResolutionTag<T> - Class in com.bakdata.dedupe.fusion
A resolution tag is a way to refer to a previously resolved value in the current FusionContext.
ResolutionTag(String) - Constructor for class com.bakdata.dedupe.fusion.ResolutionTag
 
resolve(List<AnnotatedValue<I>>, FusionContext) - Method in interface com.bakdata.dedupe.fusion.ConflictResolution
Fully resolves the values if possible or throws a FusionException.
resolveFully(List<AnnotatedValue<I>>, FusionContext) - Method in interface com.bakdata.dedupe.fusion.TerminalConflictResolution
 
resolveNonEmptyPartially(List<AnnotatedValue<I>>, FusionContext) - Method in interface com.bakdata.dedupe.fusion.ConflictResolution
Tries its best to resolve the value but may end up with multiple concurrent values.
resolveNonEmptyPartially(List<AnnotatedValue<I>>, FusionContext) - Method in interface com.bakdata.dedupe.fusion.TerminalConflictResolution
 
resolveNonEmptyPartially(List<AnnotatedValue<R>>, FusionContext) - Method in class com.bakdata.dedupe.fusion.Merge
 
resolvePartially(List<AnnotatedValue<I>>, FusionContext) - Method in interface com.bakdata.dedupe.fusion.ConflictResolution
Tries its best to resolve the value but may end up with multiple concurrent values.
retrieveValues(ResolutionTag<T>) - Method in class com.bakdata.dedupe.fusion.FusionContext
 
reversed() - Method in class com.bakdata.dedupe.matching.WeightedEdge
Reverses the direction of the edge by swapping the vertexes.
rootResolution(ConflictResolution<R, R>) - Method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion.ConflictResolutionFusionBuilder
 
rule(RuleBasedClassifier.Rule<T>) - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.RuleBasedClassifierBuilder
 
Rule(String, SimilarityMeasure<? super T>) - Constructor for class com.bakdata.dedupe.classifier.RuleBasedClassifier.Rule
 
RuleBasedClassifier<T> - Class in com.bakdata.dedupe.classifier
Successively applies a list of rules to the record and returns the respective ClassificationResult with the following cases: If any rule classifies the pair unambiguously as Classification.DUPLICATE or Classification.NON_DUPLICATE, the classification is immediately returned. If no rule can be applied, the classification is RuleBasedClassifier.defaultClassificationResult. There are three types of rules: Negative rules are used to exclude false positives.
RuleBasedClassifier.Rule<T> - Class in com.bakdata.dedupe.classifier
A rule has a name for lineage/debugging and the similarity measure.
RuleBasedClassifier.RuleBasedClassifierBuilder<T> - Class in com.bakdata.dedupe.classifier
A builder for RuleBasedClassifier with convenience methods to create positive and negative rules.
rules(Collection<? extends RuleBasedClassifier.Rule<T>>) - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.RuleBasedClassifierBuilder
 

S

safeExecute(Runnable) - Method in class com.bakdata.util.ExceptionContext
Safely executes the given runnable.
safeExecute(Callable<? extends T>) - Method in class com.bakdata.util.ExceptionContext
Safely executes the given function.
safeExecute(Runnable) - Method in class com.bakdata.dedupe.fusion.FusionContext
 
safeExecute(Runnable) - Method in class com.bakdata.dedupe.similarity.SimilarityContext
 
safeExecute(Callable<? extends T>) - Method in class com.bakdata.dedupe.fusion.FusionContext
 
safeExecute(Callable<? extends T>) - Method in class com.bakdata.dedupe.similarity.SimilarityContext
 
saveAs(ConflictResolution<T, T>, ResolutionTag<T>) - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Creates a ResolutionTag for the currently resolved value.
scaledDifference(int, TemporalUnit) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Calculates the difference between the left and right Temporal and compares the absolute difference to maxDiff.
scaledDifference(T) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Calculates the difference between the left and right number and compares the absolute difference to maxDiff.
scaleWithThreshold(double) - Method in interface com.bakdata.dedupe.similarity.SimilarityMeasure
Scales the similarity returned by this similarity, such that all values ≤minExclusive result in a similarity of 0, and all values (minExclusive, 1] are linearly rescaled to (0, 1].
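The rescaling described for scaleWithThreshold can be sketched as a plain function. This is only an illustration of the stated mapping, not the library's implementation: the class name ScaleWithThresholdSketch is hypothetical, minExclusive is assumed to be below 1, and handling of SimilarityMeasure.unknown() values is left out.

```java
// Hypothetical helper illustrating the mapping described above; not part of com.bakdata.dedupe.
final class ScaleWithThresholdSketch {
    /**
     * Maps values <= minExclusive to 0 and linearly rescales (minExclusive, 1] to (0, 1].
     * Assumption: minExclusive < 1, otherwise the divisor would be zero or negative.
     */
    public static double scaleWithThreshold(double similarity, double minExclusive) {
        if (minExclusive >= 1.0) {
            throw new IllegalArgumentException("minExclusive must be below 1");
        }
        if (similarity <= minExclusive) {
            return 0.0;
        }
        // Linear interpolation: minExclusive maps to 0, 1 maps to 1.
        return (similarity - minExclusive) / (1.0 - minExclusive);
    }
}
```

With a threshold of 0.5, for instance, a similarity of 0.75 lies halfway through the retained range and is rescaled to 0.5.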
selectCandidates(Iterable<? extends T>) - Method in interface com.bakdata.dedupe.candidate_selection.CandidateSelection
Selects the candidates for the given records and materializes them.
selectCandidates(Stream<? extends T>) - Method in interface com.bakdata.dedupe.candidate_selection.CandidateSelection
Selects the candidates for the given records.
selectCandidates(Stream<? extends T>) - Method in interface com.bakdata.dedupe.candidate_selection.online.OnlineCandidateSelection
 
selectCandidates(T) - Method in interface com.bakdata.dedupe.candidate_selection.online.OnlineCandidateSelection
Selects the candidates for a new incoming record.
selectCandidates(T) - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod
 
setElements(List<T>) - Method in class com.bakdata.dedupe.clustering.Cluster
The list of elements.
setId(C) - Method in class com.bakdata.dedupe.clustering.Cluster
The identifier of the cluster.
SetSimilarityMeasure<C extends java.util.Collection<? extends E>,​E> - Interface in com.bakdata.dedupe.similarity
A SimilarityMeasure that is defined over Collections that are treated as sets.
shortest() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Selects all values with the same shortest length.
SimilarityContext - Class in com.bakdata.dedupe.similarity
A similarity context captures exceptions in an ExceptionContext and provides additional configurations that represent cross-cutting concerns, such as null handling.
SimilarityContext.SimilarityContextBuilder - Class in com.bakdata.dedupe.similarity
 
SimilarityMeasure<T> - Interface in com.bakdata.dedupe.similarity
A SimilarityMeasure compares two values and calculates a similarity in [-1; 1], where 0 means no similarity and 1 means equal values in a given context.
similarityMeasureForNull(SimilarityMeasure<Object>) - Method in class com.bakdata.dedupe.similarity.SimilarityContext.SimilarityContextBuilder
 
similarityMeasures(List<SimilarityMeasure<? super T>>) - Method in class com.bakdata.dedupe.similarity.AggregatingSimilarityMeasure.AggregatingSimilarityMeasureBuilder
 
similarityScore(SimilarityScore<R>) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Wraps a SimilarityScore of apache commons-text into a SimilarityMeasure.
size() - Method in class com.bakdata.dedupe.clustering.Cluster
 
sortingKey(SortingKey<T, ?>) - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod.OnlineSortedNeighborhoodMethodBuilder
sortingKey(SortingKey<T, ?>, int) - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod.OnlineSortedNeighborhoodMethodBuilder
Adds a new pass with the given sorting key.
SortingKey<T,​K extends java.lang.Comparable<K>> - Class in com.bakdata.dedupe.candidate_selection
A sorting key allows a dataset to be indexed by a specific (calculated) value of a record, such that duplicates have a higher probability of being in close proximity and thus a CandidateSelection may prune the search space drastically.
SortingKey(String, Function<T, K>) - Constructor for class com.bakdata.dedupe.candidate_selection.SortingKey
 
sortingKeys(Iterable<SortingKey<T, ?>>) - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod.OnlineSortedNeighborhoodMethodBuilder
Adds new passes with the given list of sorting keys and the OnlineSortedNeighborhoodMethod.OnlineSortedNeighborhoodMethodBuilder.defaultWindowSize(int).
sortingKeys(Iterable<SortingKey<T, ?>>, int) - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod.OnlineSortedNeighborhoodMethodBuilder
Adds new passes with the given list of sorting keys and the given window size.
soundex() - Static method in class com.bakdata.dedupe.similarity.CommonTransformations
Creates a normalizer that turns strings into the soundex representation.
source(Source) - Method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion.ConflictResolutionFusionBuilder
 
Source - Class in com.bakdata.dedupe.fusion
The source of a value to be fused.
Source(String, double) - Constructor for class com.bakdata.dedupe.fusion.Source
 
sourceExtractor(Function<R, String>) - Method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion.ConflictResolutionFusionBuilder
 
sources(Collection<? extends Source>) - Method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion.ConflictResolutionFusionBuilder
 
splitHandler(ClusterSplitHandler) - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure.RefinedTransitiveClosureBuilder
 
stableMatching(SimilarityMeasure<E>) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Will find the (weakly) stable matches between the left and the right side with a given pair similarity measure and calculate the average similarity between the stable pairs.
storeValues(ResolutionTag<T>, List<AnnotatedValue<T>>) - Method in class com.bakdata.dedupe.fusion.FusionContext
 
stream(Iterable<T>) - Static method in class com.bakdata.util.StreamUtil
 
StreamUtil - Class in com.bakdata.util
Adds some functions that are missing in Java Streams.
stringGenerator(String) - Static method in class com.bakdata.dedupe.clustering.ClusterIdGenerators
Returns an id generator that generates strings with a given prefix starting from 0.
StronglyStableMarriage<T> - Class in com.bakdata.dedupe.matching
Implements a strongly stable matching based on the stable marriage with indifference.
StronglyStableMarriage() - Constructor for class com.bakdata.dedupe.matching.StronglyStableMarriage
 
sum() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Calculates the sum of a bag of conflicting numbers.

T

takeWhileInclusive(DoubleStream, DoublePredicate) - Static method in class com.bakdata.util.StreamUtil
 
takeWhileInclusive(IntStream, IntPredicate) - Static method in class com.bakdata.util.StreamUtil
 
takeWhileInclusive(LongStream, LongPredicate) - Static method in class com.bakdata.util.StreamUtil
 
takeWhileInclusive(Stream<T>, Predicate<? super T>) - Static method in class com.bakdata.util.StreamUtil
 
TerminalConflictResolution<I,​O> - Interface in com.bakdata.dedupe.fusion
A conflict resolution function that is guaranteed to produce a single value.
then(ConflictResolution<F, F>) - Method in class com.bakdata.dedupe.fusion.Merge.AdditionalFieldMergeBuilder
 
then(ConflictResolution<I, J>) - Method in class com.bakdata.dedupe.fusion.Merge.IllTypedFieldMergeBuilder
 
thresholdRule(String, SimilarityMeasure<? super T>, double) - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.RuleBasedClassifierBuilder
Creates a threshold rule that is always applied.
thresholdRule(String, BiPredicate<T, T>, SimilarityMeasure<? super T>, double) - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.RuleBasedClassifierBuilder
Creates a threshold rule that is applied only when the precondition holds.
toBuilder() - Method in class com.bakdata.dedupe.classifier.ClassifiedCandidate
 
toSimilarity(EditDistance<R>) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Used to translate EditDistance of apache commons-text into a similarity score by using the formula 1 - dist / maxDist where maxDist is the maximum length of the input strings.
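The formula 1 - dist / maxDist can be illustrated without commons-text. The sketch below is an assumption-laden illustration rather than the library's code: the class EditDistanceSimilaritySketch and both methods are hypothetical, the distance is a textbook dynamic-programming Levenshtein, and two empty strings are assumed to have similarity 1.

```java
// Hypothetical illustration of "1 - dist / maxDist"; not part of com.bakdata.dedupe.
final class EditDistanceSimilaritySketch {
    /** Textbook Levenshtein distance using two rolling rows. */
    public static int levenshtein(CharSequence left, CharSequence right) {
        int[] previous = new int[right.length() + 1];
        int[] current = new int[right.length() + 1];
        for (int j = 0; j <= right.length(); j++) {
            previous[j] = j; // transforming the empty prefix needs j insertions
        }
        for (int i = 1; i <= left.length(); i++) {
            current[0] = i; // transforming into the empty prefix needs i deletions
            for (int j = 1; j <= right.length(); j++) {
                int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                int substitution = previous[j - 1] + cost;
                int insertion = current[j - 1] + 1;
                int deletion = previous[j] + 1;
                current[j] = Math.min(substitution, Math.min(insertion, deletion));
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[right.length()];
    }

    /** 1 - dist / maxDist, where maxDist is the length of the longer input. */
    public static double toSimilarity(CharSequence left, CharSequence right) {
        int maxDist = Math.max(left.length(), right.length());
        if (maxDist == 0) {
            return 1.0; // assumption: two empty strings count as equal
        }
        return 1.0 - (double) levenshtein(left, right) / maxDist;
    }
}
```

For "kitten" and "sitting" the distance is 3 and the longer string has 7 characters, so the similarity under this sketch is 1 - 3/7.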
toString() - Method in class com.bakdata.dedupe.candidate_selection.CompositeValue
 
toString() - Method in class com.bakdata.dedupe.candidate_selection.offline.OfflineCandidate
 
toString() - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineCandidate
 
toString() - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod.OnlineSortedNeighborhoodMethodBuilder
 
toString() - Method in class com.bakdata.dedupe.candidate_selection.online.OnlineSortedNeighborhoodMethod
 
toString() - Method in class com.bakdata.dedupe.candidate_selection.SortingKey
 
toString() - Method in class com.bakdata.dedupe.classifier.ClassificationResult.ClassificationResultBuilder
 
toString() - Method in class com.bakdata.dedupe.classifier.ClassificationResult
 
toString() - Method in class com.bakdata.dedupe.classifier.ClassifiedCandidate.ClassifiedCandidateBuilder
 
toString() - Method in class com.bakdata.dedupe.classifier.ClassifiedCandidate
 
toString() - Method in class com.bakdata.dedupe.classifier.OracleClassifier
 
toString() - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.Rule
 
toString() - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier.RuleBasedClassifierBuilder
 
toString() - Method in class com.bakdata.dedupe.classifier.RuleBasedClassifier
 
toString() - Method in class com.bakdata.dedupe.clustering.Cluster.ClusterBuilder
 
toString() - Method in class com.bakdata.dedupe.clustering.Cluster
 
toString() - Method in class com.bakdata.dedupe.clustering.ConsistentClustering.ConsistentClusteringBuilder
 
toString() - Method in class com.bakdata.dedupe.clustering.ConsistentClustering
 
toString() - Method in class com.bakdata.dedupe.clustering.OracleClustering
 
toString() - Method in class com.bakdata.dedupe.clustering.RefineCluster.RefineClusterBuilder
 
toString() - Method in class com.bakdata.dedupe.clustering.RefineCluster
 
toString() - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure.RefinedTransitiveClosureBuilder
 
toString() - Method in class com.bakdata.dedupe.clustering.RefinedTransitiveClosure
 
toString() - Method in class com.bakdata.dedupe.clustering.TransitiveClosure
 
toString() - Method in class com.bakdata.dedupe.clustering.TransitiveClosure.TransitiveClosureBuilder
 
toString() - Method in class com.bakdata.dedupe.deduplication.online.FusingOnlineDuplicateDetection.FusingOnlineDuplicateDetectionBuilder
 
toString() - Method in class com.bakdata.dedupe.deduplication.online.FusingOnlineDuplicateDetection
 
toString() - Method in class com.bakdata.dedupe.duplicate_detection.online.OnlinePairBasedDuplicateDetection.OnlinePairBasedDuplicateDetectionBuilder
 
toString() - Method in class com.bakdata.dedupe.duplicate_detection.online.OnlinePairBasedDuplicateDetection
 
toString() - Method in class com.bakdata.dedupe.fusion.AnnotatedValue
 
toString() - Method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion.ConflictResolutionFusionBuilder
 
toString() - Method in class com.bakdata.dedupe.fusion.ConflictResolutionFusion
 
toString() - Method in class com.bakdata.dedupe.fusion.FusedValue
 
toString() - Method in class com.bakdata.dedupe.fusion.FusionContext.FusionContextBuilder
 
toString() - Method in class com.bakdata.dedupe.fusion.FusionContext
 
toString() - Method in class com.bakdata.dedupe.fusion.Merge.AdditionalFieldMergeBuilder
 
toString() - Method in class com.bakdata.dedupe.fusion.Merge.FieldMergeBuilder
 
toString() - Method in class com.bakdata.dedupe.fusion.Merge.IllTypedFieldMergeBuilder
 
toString() - Method in class com.bakdata.dedupe.fusion.Merge.MergeBuilder
 
toString() - Method in class com.bakdata.dedupe.fusion.Merge
 
toString() - Method in class com.bakdata.dedupe.fusion.ResolutionTag
 
toString() - Method in class com.bakdata.dedupe.fusion.Source
 
toString() - Method in class com.bakdata.dedupe.matching.StronglyStableMarriage
 
toString() - Method in class com.bakdata.dedupe.matching.WeaklyStableMarriage
 
toString() - Method in class com.bakdata.dedupe.matching.WeightedEdge
 
toString() - Method in class com.bakdata.dedupe.similarity.AggregatingSimilarityMeasure.AggregatingSimilarityMeasureBuilder
 
toString() - Method in class com.bakdata.dedupe.similarity.AggregatingSimilarityMeasure
 
toString() - Method in class com.bakdata.dedupe.similarity.CachingSimilarity
 
toString() - Method in class com.bakdata.dedupe.similarity.CutoffSimiliarityMeasure
 
toString() - Method in class com.bakdata.dedupe.similarity.Levenshtein
 
toString() - Method in class com.bakdata.dedupe.similarity.MatchingSimilarity
 
toString() - Method in class com.bakdata.dedupe.similarity.SimilarityContext.SimilarityContextBuilder
 
toString() - Method in class com.bakdata.dedupe.similarity.SimilarityContext
 
toString() - Method in class com.bakdata.dedupe.similarity.TransformingSimilarityMeasure
 
toString() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure
 
toString() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedAggregatingSimilarityMeasureBuilder
 
toString() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedSimilarity
 
toString() - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedValue
 
toString() - Method in class com.bakdata.util.ExceptionContext
 
toString() - Method in class com.bakdata.util.FunctionalClass
 
toString() - Method in class com.bakdata.util.FunctionalConstructor
 
toString() - Method in class com.bakdata.util.FunctionalMethod
 
toString() - Method in class com.bakdata.util.FunctionalProperty
 
transform(Function<? super T, ? extends R>) - Static method in class com.bakdata.dedupe.similarity.CommonTransformations
Wraps any function into a ValueTransformation.
transform(Function<? super T, R>) - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Transforms the values within an AnnotatedValue.
transform(T, SimilarityContext) - Method in interface com.bakdata.dedupe.similarity.ValueTransformation
Transforms the value to the expected output type.
TransformingSimilarityMeasure<R,​T> - Class in com.bakdata.dedupe.similarity
Applies a SimilarityMeasure to transformed input values.
TransformingSimilarityMeasure(ValueTransformation<T, ? extends R>, SimilarityMeasure<R>) - Constructor for class com.bakdata.dedupe.similarity.TransformingSimilarityMeasure
 
TransitiveClosure<C extends java.lang.Comparable<C>,T,I extends java.lang.Comparable<? super I>> - Class in com.bakdata.dedupe.clustering
A transitive closure implementation that runs in amortized linear time in the number of pairs.
TransitiveClosure.TransitiveClosureBuilder<C extends java.lang.Comparable<C>,T,I extends java.lang.Comparable<? super I>> - Class in com.bakdata.dedupe.clustering
 
trigram() - Static method in class com.bakdata.dedupe.similarity.CommonTransformations
Creates a tokenizer that splits a string into trigrams, that is, sequences of three consecutive characters.
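
As an illustration of the trigram tokenization described for trigram(), the following is a minimal, self-contained sketch. The class TrigramSketch and its method are hypothetical names and are not part of com.bakdata.dedupe; the library's tokenizer may treat short strings or padding differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not the library's implementation.
final class TrigramSketch {
    private TrigramSketch() {
    }

    // Returns every window of three consecutive characters, in order.
    // Assumption: strings shorter than three characters yield no trigrams.
    static List<String> trigrams(String s) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i + 3 <= s.length(); i++) {
            result.add(s.substring(i, i + 3));
        }
        return result;
    }
}
```

For the input "dedupe", this sketch yields the trigrams "ded", "edu", "dup", and "upe".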

U

union() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Retains all distinct values.
unionAll() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Retains all values.
unionAll(Supplier<? extends R>) - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Adds all values to the given collection type.
unknown() - Static method in interface com.bakdata.dedupe.duplicate_detection.PossibleDuplicateHandler
Classifies possible duplicates as unknown.
unknown() - Static method in interface com.bakdata.dedupe.similarity.SimilarityMeasure
Returns a value indicating that the similarity value is unknown.
UNKNOWN - com.bakdata.dedupe.classifier.Classification
Unknown classifications are primarily caused by a lack of information.
unknownIf(DoublePredicate) - Method in interface com.bakdata.dedupe.similarity.SimilarityMeasure
Replaces the similarity returned by this measure such that all values for which the given predicate evaluates to true result in a SimilarityMeasure.unknown() similarity.
unknownIfZero() - Method in interface com.bakdata.dedupe.similarity.SimilarityMeasure
Replaces the similarity returned by this measure such that all values equal to 0 result in a SimilarityMeasure.unknown() similarity.
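
The wrapping behavior described for unknownIf(DoublePredicate) and unknownIfZero() can be sketched roughly as follows. SketchMeasure is a hypothetical stand-in and not the library's SimilarityMeasure interface; in particular, using NaN as the unknown marker is an assumption made for this sketch only.

```java
import java.util.function.DoublePredicate;

// Hypothetical stand-in for a similarity measure; not the library's interface.
interface SketchMeasure<T> {
    // Assumption: NaN serves as the "unknown" marker in this sketch.
    double UNKNOWN = Double.NaN;

    double similarity(T left, T right);

    // Wraps this measure: values matching the predicate become UNKNOWN,
    // all other values pass through unchanged.
    default SketchMeasure<T> unknownIf(DoublePredicate predicate) {
        return (left, right) -> {
            double sim = this.similarity(left, right);
            return predicate.test(sim) ? UNKNOWN : sim;
        };
    }

    // Special case of unknownIf for values equal to 0.
    default SketchMeasure<T> unknownIfZero() {
        return unknownIf(sim -> sim == 0);
    }
}
```

With such a wrapper, a measure that returns 0 for two unequal strings would report an unknown similarity instead, which lets a downstream aggregation tell "no evidence" apart from "evidence of dissimilarity".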

V

valueOf(String) - Static method in enum com.bakdata.dedupe.classifier.Classification
Returns the enum constant of this type with the specified name.
values() - Static method in enum com.bakdata.dedupe.classifier.Classification
Returns an array containing the constants of this enum type, in the order they are declared.
ValueTransformation<T,R> - Interface in com.bakdata.dedupe.similarity
Performs a transformation on a value, especially on input values for SimilarityMeasure.
vote() - Static method in class com.bakdata.dedupe.fusion.CommonConflictResolutions
Selects the highest weighted values.

W

WeaklyStableMarriage<T> - Class in com.bakdata.dedupe.matching
Implements a weakly stable matching based on the stable marriage with indifference (i.e., ties).
WeaklyStableMarriage() - Constructor for class com.bakdata.dedupe.matching.WeaklyStableMarriage
 
WeightedAggregatingSimilarityMeasure<R> - Class in com.bakdata.dedupe.similarity
 
WeightedAggregatingSimilarityMeasure.WeightedAggregatingSimilarityMeasureBuilder<R> - Class in com.bakdata.dedupe.similarity
 
WeightedAggregatingSimilarityMeasure.WeightedSimilarity<T> - Class in com.bakdata.dedupe.similarity
 
WeightedAggregatingSimilarityMeasure.WeightedValue - Class in com.bakdata.dedupe.similarity
 
weightedAggregation(ToDoubleFunction<Stream<WeightedAggregatingSimilarityMeasure.WeightedValue>>) - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Starts the creation of a weighted aggregation of multiple SimilarityMeasures.
weightedAverage() - Static method in class com.bakdata.dedupe.similarity.CommonSimilarityMeasures
Starts the creation of a weighted average of multiple SimilarityMeasures.
WeightedEdge<T> - Class in com.bakdata.dedupe.matching
A weighted (directed) edge between two records.
WeightedEdge(T, T, double) - Constructor for class com.bakdata.dedupe.matching.WeightedEdge
 
weightedSimilarities(Collection<? extends WeightedAggregatingSimilarityMeasure.WeightedSimilarity<R>>) - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedAggregatingSimilarityMeasureBuilder
 
weightedSimilarity(WeightedAggregatingSimilarityMeasure.WeightedSimilarity<R>) - Method in class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedAggregatingSimilarityMeasureBuilder
 
WeightedSimilarity(double, SimilarityMeasure<T>) - Constructor for class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedSimilarity
 
WeightedValue(double, double) - Constructor for class com.bakdata.dedupe.similarity.WeightedAggregatingSimilarityMeasure.WeightedValue
 
with(ConflictResolution<F, F>) - Method in class com.bakdata.dedupe.fusion.Merge.FieldMergeBuilder
 
with(ConflictResolution<F, F>, ConflictResolution<F, F>...) - Method in class com.bakdata.dedupe.fusion.Merge.FieldMergeBuilder
 
withValue(S) - Method in class com.bakdata.dedupe.fusion.AnnotatedValue
Creates a new instance with changed value and with potentially different type.
womensFavoriteMen - Variable in class com.bakdata.dedupe.matching.AbstractStableMarriage.AbstractMatcher
 
words() - Static method in class com.bakdata.dedupe.similarity.CommonTransformations
Splits a string on all whitespace characters into words.
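
To illustrate what a weighted average of several similarity values computes, as named by weightedAverage(), here is a minimal arithmetic sketch. WeightedAverageSketch is a hypothetical helper and does not use the library's builder API; how the library handles unknown similarities or zero total weight is not stated in this index, so the NaN result below is an assumption.

```java
// Hypothetical helper for illustration only; not the library's implementation.
final class WeightedAverageSketch {
    private WeightedAverageSketch() {
    }

    // weights[i] belongs to similarities[i]; both arrays must have equal length.
    static double weightedAverage(double[] weights, double[] similarities) {
        if (weights.length != similarities.length) {
            throw new IllegalArgumentException("weights and similarities differ in length");
        }
        double weightedSum = 0;
        double totalWeight = 0;
        for (int i = 0; i < weights.length; i++) {
            weightedSum += weights[i] * similarities[i];
            totalWeight += weights[i];
        }
        // Assumption: an empty or zero-weight input has no defined average.
        return totalWeight == 0 ? Double.NaN : weightedSum / totalWeight;
    }
}
```

For example, weights 2 and 1 applied to similarities 1.0 and 0.25 give (2 * 1.0 + 1 * 0.25) / 3 = 0.75.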