and more in a single search tool across platforms. Read the announcement here. |
03/20/2024 05:46 AM
Hi Team,
I'm working on implementing a coarse matching filter for the DIM, where I need to compare users based on 3 attributes: firstname, lastname and phonenumber.
Comparison should follow the below rules:
1. If both users have values for all 3 attributes, compare all attributes. If all attributes match, identify them as duplicates.
2. If any attribute for either user is null, compare only the non-null attributes. If all attributes are null, ignore them.
I have developed a filter query for this purpose, but with that query duplicate identity detection job failed with an error "Maximum statement execution time exceed". Unfortunately, I'm unable to validate this in the data analyzer due to query execution timeouts exceeding 10 seconds. Even in analytics, I'm unable to retrieve any records with this query.
Coarse Matching Filter:
((t1.firstname = t2.firstname OR t1.firstname IS NULL OR t2.firstname IS NULL) AND
(t1.lastname = t2.lastname OR t1.lastname IS NULL OR t2.lastname IS NULL) AND
(t1.phonenumber = t2.phonenumber OR t1.phonenumber IS NULL OR t2.phonenumber IS NULL)) AND
((t1.firstname IS NOT NULL AND t2.firstname IS NOT NULL) OR
(t1.lastname IS NOT NULL AND t2.lastname IS NOT NULL) OR
(t1.phonenumber IS NOT NULL AND t2.phonenumber IS NOT NULL)) AND
(t1.USERKEY <> t2.USERKEY) AND (t1.STATUSKEY=1) AND (t2.STATUSKEY=1)
I would appreciate any assistance in achieving this use case.
03/21/2024 08:22 PM
Hello @gokul ,
Did you try with the only one rule and check if it is working?
Thanks
03/21/2024 08:29 PM
Hi @sudeshjaiswal
Yes, with one rule it is working fine but that doesn't satisfy our usecase.
Regards!
03/21/2024 09:48 PM
Hello @gokul,
So, till how many rules its is working fine ? and what do you see in the logs when its get timeout?.
Thanks
03/21/2024 09:55 PM
@sudeshjaiswal
I slightly modified the query with 2 rules it is working fine
((t1.firstname = t2.firstname AND t1.lastname = t2.lastname AND t1.phonenumber = t2.phonenumber) OR
(t1.phonenumber IS NULL AND t2.phonenumber IS NULL AND t1.firstname = t2.firstname AND t1.lastname = t2.lastname)) AND
(t1.USERKEY <> t2.USERKEY) AND (t1.STATUSKEY=1) AND (t2.STATUSKEY=1)
But when I add the 3rd rule that job is failed with the error stats 'maximum statement execution time exceeded'
((t1.firstname = t2.firstname AND t1.lastname = t2.lastname AND t1.phonenumber = t2.phonenumber) OR
(t1.phonenumber IS NULL AND t2.phonenumber IS NULL AND t1.firstname = t2.firstname AND t1.lastname = t2.lastname) OR
((t1.lastname IS NULL OR t2.lastname IS NULL) AND t1.firstname = t2.firstname AND t1.phonenumber = t2.phonenumber)) AND
(t1.USERKEY <> t2.USERKEY) AND (t1.STATUSKEY=1) AND (t2.STATUSKEY=1)
03/25/2024 10:19 PM
Hello @gokul,
What is the max time you have configured.
Thanks.
03/25/2024 11:21 PM
Hi @sudeshjaiswal
I removed my timeout configuration, so it will revert to the default setting of 2 hours.