3.6 How to Check Duplicates Between Tabular Datasets

    Note: This operation supports only tabular datasets; it does not work on image datasets.

    When you have two similar datasets, it is important to know whether they contain duplicate records. Data Governance provides an operation on the Dataset page for checking duplicates between them.

    Click the icon under the "Match" heading to open a pop-up window that asks which field and dataset to compare. Complete the menu and click the Match button to start the comparison, or click the Cancel button to abort. In the following example, we compare the two datasets for duplicate data using their ID fields:

    When the comparison is done, a Report button appears under the Match heading on the Dataset page.

    Click the Report button to see the difference between the two datasets. The system also lets you download the duplicate field content in the dataset.
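
    To make clear what a match on a single field means, the comparison can be approximated outside the tool. The following Python sketch is only an illustration and not how Data Governance implements the Match operation: the function name find_duplicates, the sample rows, and the exact-string matching rule are all assumptions. It treats a row in the second dataset as a duplicate when its ID value also appears in the first dataset.

```python
import csv
import io


def find_duplicates(rows_a, rows_b, key):
    """Return the rows of rows_b whose `key` value also appears in rows_a.

    Hypothetical helper for illustration; values are compared as exact strings.
    """
    keys_a = {row[key] for row in rows_a}
    return [row for row in rows_b if row[key] in keys_a]


# Two small in-memory CSV files standing in for the two tabular datasets.
dataset_a = io.StringIO("ID,name\n1,Ann\n2,Bob\n3,Cy\n")
dataset_b = io.StringIO("ID,name\n2,Bob\n3,Cyrus\n4,Dee\n")

rows_a = list(csv.DictReader(dataset_a))
rows_b = list(csv.DictReader(dataset_b))

# Compare on the ID field, as in the example above.
duplicates = find_duplicates(rows_a, rows_b, key="ID")
duplicate_ids = [row["ID"] for row in duplicates]
print(duplicate_ids)  # ['2', '3']
```

    Because only the chosen field is compared, ID 3 counts as a duplicate even though the name differs between the two datasets. Choosing a different field can therefore change which rows are reported.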

    Taiwan AI Labs (AILabs.tw) Copyright © 2023