Packages 
Package Description
com.bakdata.dedupe.candidate_selection
Base data structures shared by online and offline candidate selections that choose promising pairs to limit the search space for duplicates.
com.bakdata.dedupe.candidate_selection.offline
Interfaces and implementations for offline candidate selections that choose promising pairs to limit the search space for duplicates in a materialized dataset.
com.bakdata.dedupe.candidate_selection.online
Interfaces and implementations for online candidate selections that choose promising pairs to limit the search space for duplicates in a streaming dataset.
com.bakdata.dedupe.classifier
Interfaces, data structures, and implementations for the classification of Candidate pairs into duplicates and non-duplicates.
com.bakdata.dedupe.clustering
Groups ClassifiedCandidates into coherent clusters.
com.bakdata.dedupe.deduplication
Provides interfaces and implementations for a full deduplication process, which ensures that no duplicate record is emitted.
com.bakdata.dedupe.deduplication.offline
Full offline deduplication for materialized data.
com.bakdata.dedupe.deduplication.online
Full online deduplication for streaming data.
com.bakdata.dedupe.duplicate_detection
Provides base interfaces and implementations for finding duplicate clusters.
com.bakdata.dedupe.duplicate_detection.offline
Offline duplicate detection to find duplicate clusters in materialized data.
com.bakdata.dedupe.duplicate_detection.online
Online duplicate detection to find duplicate clusters in streaming data.
com.bakdata.dedupe.fusion
Provides means to reconcile a duplicate cluster into a consistent representation.
com.bakdata.dedupe.matching
Assigns and matches nodes of a bipartite graph.
com.bakdata.dedupe.similarity
Data structures, interfaces, and implementations to define similarity measures that are ultimately used to detect duplicates.
com.bakdata.util
Utility classes that should not be deemed public API.
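
The packages above describe a pipeline: candidate selection, similarity, classification, and clustering. The following is a minimal, self-contained sketch of how those stages could fit together. It does not use the library's API; the class `DedupeSketch`, all of its method names, the first-letter blocking key, the bigram Jaccard measure, and the threshold are hypothetical choices made purely for illustration. Records are assumed to be non-empty strings.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names come from com.bakdata.dedupe.
public class DedupeSketch {

    // Candidate selection: pair up only records that share a blocking key
    // (here, the lowercased first character) to limit the search space.
    static List<String[]> selectCandidates(List<String> records) {
        Map<Character, List<String>> blocks = new TreeMap<>();
        for (String r : records) {
            blocks.computeIfAbsent(Character.toLowerCase(r.charAt(0)),
                    k -> new ArrayList<>()).add(r);
        }
        List<String[]> pairs = new ArrayList<>();
        for (List<String> block : blocks.values()) {
            for (int i = 0; i < block.size(); i++) {
                for (int j = i + 1; j < block.size(); j++) {
                    pairs.add(new String[] {block.get(i), block.get(j)});
                }
            }
        }
        return pairs;
    }

    // Similarity: Jaccard coefficient over lowercased character bigrams.
    static double similarity(String a, String b) {
        Set<String> x = bigrams(a);
        Set<String> y = bigrams(b);
        if (x.isEmpty() && y.isEmpty()) {
            return 1.0;
        }
        Set<String> intersection = new HashSet<>(x);
        intersection.retainAll(y);
        Set<String> union = new HashSet<>(x);
        union.addAll(y);
        return (double) intersection.size() / union.size();
    }

    static Set<String> bigrams(String s) {
        String t = s.toLowerCase();
        Set<String> out = new HashSet<>();
        for (int i = 0; i + 1 < t.length(); i++) {
            out.add(t.substring(i, i + 2));
        }
        return out;
    }

    // Classification and clustering: a candidate pair counts as a duplicate when
    // its similarity reaches the threshold; duplicates are merged transitively
    // with a simple union-find structure.
    static Collection<Set<String>> cluster(List<String> records, double threshold) {
        Map<String, String> parent = new HashMap<>();
        for (String r : records) {
            parent.put(r, r);
        }
        for (String[] pair : selectCandidates(records)) {
            if (similarity(pair[0], pair[1]) >= threshold) {
                parent.put(find(parent, pair[0]), find(parent, pair[1]));
            }
        }
        Map<String, Set<String>> clusters = new TreeMap<>();
        for (String r : records) {
            clusters.computeIfAbsent(find(parent, r), k -> new TreeSet<>()).add(r);
        }
        return clusters.values();
    }

    // Follows parent links until it reaches the representative of x's cluster.
    static String find(Map<String, String> parent, String x) {
        while (!parent.get(x).equals(x)) {
            x = parent.get(x);
        }
        return x;
    }
}
```

Calling `DedupeSketch.cluster(records, 0.6)` on a small list of names groups near-identical spellings together while leaving unrelated records in singleton clusters. A fusion step, which the sketch omits, would then reconcile each cluster into one representative record.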