Package com.bakdata.dedupe.duplicate_detection.online
Online duplicate detection to find duplicate clusters in streaming data.
Interface Summary

Interface: OnlineDuplicateDetection<C extends java.lang.Comparable<C>,T>
Description: An online duplicate detection algorithm processes a stream of records and returns all changed Clusters for each record.
Class Summary

Class: OnlinePairBasedDuplicateDetection<C extends java.lang.Comparable<C>,T>
Description: A pair-based duplicate detection algorithm, which
  1. Performs OnlineCandidateSelection
  2. Applies a Classifier to the found Candidates
  3. Transforms the found duplicate pairs with a Clustering into Clusters

Class: OnlinePairBasedDuplicateDetection.OnlinePairBasedDuplicateDetectionBuilder<C extends java.lang.Comparable<C>,T>
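
The three steps of a pair-based online algorithm can be illustrated with a minimal, self-contained sketch. It does not use this package's classes: the class name OnlinePairBasedSketch, the methods isDuplicate and detect, the use of plain strings as records, the exhaustive candidate selection, and the merge-on-match clustering are all assumptions made for illustration. It also assumes that every incoming record is a distinct string.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch only; none of these names belong to the library. */
final class OnlinePairBasedSketch {
    /** All records processed so far, in arrival order. */
    private final List<String> seen = new ArrayList<>();
    /** Maps each record to the cluster it currently belongs to. */
    private final Map<String, Set<String>> clusterOf = new HashMap<>();

    /** Hypothetical classifier: duplicates are equal after trimming, ignoring case. */
    static boolean isDuplicate(String a, String b) {
        return a.trim().equalsIgnoreCase(b.trim());
    }

    /**
     * Processes one record of the stream and returns the changed clusters.
     * In this sketch only the cluster containing the new record can change,
     * so the returned list always has exactly one element.
     */
    List<Set<String>> detect(String newRecord) {
        // The new record starts in a cluster of its own.
        Set<String> own = new TreeSet<>();
        own.add(newRecord);
        clusterOf.put(newRecord, own);

        // Step 1: candidate selection (assumption: pair with every earlier record).
        for (String earlier : seen) {
            // Step 2: classify the candidate pair.
            if (isDuplicate(earlier, newRecord)) {
                // Step 3: clustering (assumption: merge the two clusters).
                Set<String> target = clusterOf.get(earlier);
                Set<String> current = clusterOf.get(newRecord);
                if (target != current) {
                    target.addAll(current);
                    for (String member : current) {
                        clusterOf.put(member, target);
                    }
                }
            }
        }
        seen.add(newRecord);

        // Return a copy so callers cannot modify the internal state.
        Set<String> changed = new TreeSet<>(clusterOf.get(newRecord));
        return Collections.singletonList(changed);
    }
}
```

Calling detect("Anna Smith"), then detect("Bob Jones"), then detect("anna smith") yields a single-record cluster for each of the first two calls and a two-record cluster containing both spellings of the name for the third. A real candidate selection would typically compare the new record with only a subset of earlier records rather than all of them.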