There are several ways to remove duplicate records in Informatica.

If the source is a DBMS, you can enable the Select Distinct property in the Source Qualifier to select only distinct records, or use a SQL Override to do the same.

You can use an Aggregator and select all the ports as the group-by key to get distinct values. After you pass all the required ports to the Aggregator, select every port that should participate in de-duplication; if you want to find duplicates based on all of the columns, select all the ports as the group-by key.

You can use a Sorter with the Distinct property enabled to get distinct values. Configure the Sorter's properties to enable this.

If your data is sorted, you can use an Expression and a Filter transformation to identify and remove duplicates. If your data is not sorted, first use a Sorter to sort it and then apply the following logic:

Bring the source into the Mapping Designer. Sort the data using Employee_ID as the sort key. Use one Expression transformation to flag the duplicates: variable ports compare each row's Employee_ID with the previous row's and set a flag, IS_DUP, which is 0 for the first (unique) occurrence of a key and greater than 0 for subsequent occurrences. Because the Expression transformation attaches IS_DUP = 0 only to unique records, any row with IS_DUP > 0 is a duplicate entry. Then use a Filter transformation that passes only rows with IS_DUP = 0. The entire mapping flows Source → Sorter → Expression → Filter → Target.

Another option is a Lookup transformation with a Dynamic Cache. When you change the Lookup transformation's properties to use the Dynamic Cache, a new port is added to the transformation. The Dynamic Cache can update the cache as it reads the data, so rows already present in the cache can be recognized as duplicates.
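The Sorter → Expression → Filter logic described above can be sketched as a small Python simulation. The port names (Employee_ID, IS_DUP) follow the article; the function and sample rows are illustrative, not Informatica expression code.

```python
def deduplicate(rows):
    """Flag and filter duplicates from rows already sorted by Employee_ID."""
    prev_id = None  # plays the role of the variable port holding the previous row's key
    out = []
    for row in rows:
        # Expression transformation: IS_DUP = 0 for the first occurrence of a
        # key, 1 when the key matches the previous row's key.
        is_dup = 0 if row["Employee_ID"] != prev_id else 1
        prev_id = row["Employee_ID"]
        # Filter transformation: pass only rows with IS_DUP = 0.
        if is_dup == 0:
            out.append(row)
    return out

# Sorter: sort the source rows on Employee_ID before applying the logic.
rows = sorted(
    [{"Employee_ID": 101}, {"Employee_ID": 100}, {"Employee_ID": 101}],
    key=lambda r: r["Employee_ID"],
)
print(deduplicate(rows))  # only the unique Employee_IDs 100 and 101 survive
```

Sorting first matters because the comparison only looks at the immediately preceding row, exactly as the variable-port approach does in the mapping.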