Interface OfflineDuplicateDetection<C extends java.lang.Comparable<C>,T>

  • All Superinterfaces:
    DuplicateDetection<C,T>
    Functional Interface:
    This is a functional interface and can therefore be used as the assignment target for a lambda expression or method reference.

    @FunctionalInterface
    public interface OfflineDuplicateDetection<C extends java.lang.Comparable<C>,T>
    extends DuplicateDetection<C,T>
    The offline duplicate detection returns all duplicate records within a dataset.

    Consider a dataset with the records A, B, and A', where A and A' are duplicates. The final result will be [(A, A'), (B)].

    An implementation may use any means necessary to find duplicates and to ensure proper transitivity: if (A, B) are duplicates and (B, C) are duplicates, then (A, C) must also be duplicates.
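
    One possible way to obtain transitive clusters is sketched below. It is only an illustration and not part of this API: the class DuplicateClusters, its cluster method, and the isDuplicate predicate are hypothetical names, and the approach (naive pairwise comparison combined with a union-find structure) is an assumption about how an implementation might work, not a description of any existing one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical helper for illustration only; not part of the documented API.
final class DuplicateClusters {
    private DuplicateClusters() {}

    /**
     * Groups records into clusters of duplicates. Every record appears in
     * exactly one cluster; records without duplicates form singleton clusters.
     * Transitivity is enforced by merging clusters with union-find.
     */
    static <T> List<List<T>> cluster(List<T> records, BiPredicate<T, T> isDuplicate) {
        int n = records.size();
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i; // each record starts in its own cluster
        }
        // Naive pairwise comparison (quadratic); real implementations
        // would typically restrict the candidate pairs first.
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                if (isDuplicate.test(records.get(i), records.get(j))) {
                    union(parent, i, j);
                }
            }
        }
        // Collect records by cluster root, in order of first appearance.
        Map<Integer, List<T>> byRoot = new LinkedHashMap<>();
        for (int i = 0; i < n; i++) {
            byRoot.computeIfAbsent(find(parent, i), k -> new ArrayList<>())
                  .add(records.get(i));
        }
        return new ArrayList<>(byRoot.values());
    }

    private static int find(int[] parent, int i) {
        while (parent[i] != i) {
            parent[i] = parent[parent[i]]; // path halving
            i = parent[i];
        }
        return i;
    }

    private static void union(int[] parent, int a, int b) {
        int ra = find(parent, a);
        int rb = find(parent, b);
        if (ra != rb) {
            // The smaller index becomes the root, so a cluster is always
            // identified by its first record.
            parent[Math.max(ra, rb)] = Math.min(ra, rb);
        }
    }
}
```

    Under these assumptions, calling DuplicateClusters.cluster on the records A, B, A' with a predicate that treats A and A' as duplicates yields the clusters (A, A') and (B). Because clusters are merged rather than pairs being reported, a record that matches only one member of a cluster still ends up in the same cluster as all other members.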

    Implementation Requirements:
    Offline algorithms are generally assumed to be stateless. Deviations from this assumption need to be documented.