I created a data rule that is a concatenation of three columns to compare it to the fourth column and see if the data is accurate
District + boro + num = dbn. The data rule seems to work except I keep getting duplicate dbn. How do I remove the duplicates please
if district exists AND len(trim(district)) 0 AND boro exists AND len(trim(boro)) 0 AND num exists AND len(trim(num)) THEN dbn matches_format '99A999'
Pinned topic Removing Duplicate from the result of Data Rule?
Answered question This question has been answered.
Unanswered question This question has not been answered yet.
Updated on 2013-03-14T13:44:32Z at 2013-03-14T13:44:32Z by SystemAdmin
smithha 110000PAKN161 Posts
Re: Removing Duplicate from the result of Data Rule?2013-03-11T13:51:43ZThis is the accepted answer. This is the accepted answer.I'm presuming that by 'duplicates' you are getting multiple records that either meet or fail your rule?
And also that you just want to see the invalid combinations or possibly get a count of those, but don't care about the specific records involved?
Assuming yes to both the above points, then removing duplicates depends on what version you are on.
In 9.1, there is a specific Output option to only produce distinct values.
In 8.5 and 8.7, you need to explicitly include an aggregation function in the output with a groupBy clause to trigger discrete records. For example, add a Custom Expression to your output such as: Count(recordID, dbn) where dbn is the grouping value. You may need to include the count in the rule definition as well.