Pseudodarwinist

Suggest a SOQL query to find duplicates in Account objects

R1_ACC_TXT_Id_Golden_record__c is a primary key on our Account object. I am using the query below to find all instances of a duplicated Account in our org, but the query is timing out.
SELECT count(Id) FROM Account GROUP BY R1_ACC_TXT_Id_Golden_record__c HAVING count(Id)>1 
PINKY REGHU
Hi, 
Try this code:
List<AggregateResult> acc = [SELECT Name, COUNT(Id) FROM Account GROUP BY Name HAVING COUNT(Id) > 1];
for(AggregateResult result : acc)
{
    System.debug('Finding duplicate names: ' + result);
}



Please let me know if that helps you. Thanks.
 
Pseudodarwinist
Hi Pinky,
I tried executing that from the Developer Console, but it always showed "unpacking result" and never completed. Just to update: we have more than 10 million Accounts in our org. Also, I want to group them by the R1_ACC_TXT_Id_Golden_record__c field so that I can delete all the duplicates.
Regards,
Chetan
Raj Vakati
If you have more than 10 million Accounts in your org, use a third-party deduplication app,
or write a batch job that marks records as duplicates based on the match,
or use Salesforce deduplication Apex.
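
A minimal sketch of the batch-job option, assuming a hypothetical checkbox field Is_Duplicate__c on Account (not mentioned in this thread) and assuming R1_ACC_TXT_Id_Golden_record__c is indexed so the per-chunk aggregate query stays selective. Each chunk counts org-wide occurrences of only the keys it contains, which avoids the single GROUP BY over all 10 million rows. The class name is illustrative only.

```apex
// Sketch only: Is_Duplicate__c is a hypothetical checkbox field.
public class AccountDuplicateMarkerBatch implements Database.Batchable<SObject> {

    public Database.QueryLocator start(Database.BatchableContext bc) {
        // The batch framework feeds these records to execute() in chunks.
        return Database.getQueryLocator(
            'SELECT Id, R1_ACC_TXT_Id_Golden_record__c FROM Account ' +
            'WHERE R1_ACC_TXT_Id_Golden_record__c != null'
        );
    }

    public void execute(Database.BatchableContext bc, List<Account> scope) {
        // Collect the golden-record keys present in this chunk.
        Set<String> keys = new Set<String>();
        for (Account acc : scope) {
            keys.add(acc.R1_ACC_TXT_Id_Golden_record__c);
        }

        // Count occurrences of just these keys across the whole org.
        // Assumes key values are stored with consistent letter case.
        Set<String> duplicatedKeys = new Set<String>();
        for (AggregateResult ar : [
            SELECT R1_ACC_TXT_Id_Golden_record__c, COUNT(Id)
            FROM Account
            WHERE R1_ACC_TXT_Id_Golden_record__c IN :keys
            GROUP BY R1_ACC_TXT_Id_Golden_record__c
            HAVING COUNT(Id) > 1
        ]) {
            duplicatedKeys.add((String) ar.get('R1_ACC_TXT_Id_Golden_record__c'));
        }

        // Flag every record in this chunk whose key occurs more than once.
        List<Account> toMark = new List<Account>();
        for (Account acc : scope) {
            if (duplicatedKeys.contains(acc.R1_ACC_TXT_Id_Golden_record__c)) {
                acc.Is_Duplicate__c = true;
                toMark.add(acc);
            }
        }
        update toMark;
    }

    public void finish(Database.BatchableContext bc) {}
}
```

It could be started from anonymous Apex with Database.executeBatch(new AccountDuplicateMarkerBatch(), 200); the sketch flags every member of a duplicate group for review rather than deciding which record to keep.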
Sibasish Panigrahi
Hi Chetan,

Try the code below. I created it to delete duplicate records.
Adapt it to your needs.
List<Logistic__c> epudList= [Select Id From Logistic__c];
Set<String> epudId = new Set<String>();

for(Logistic__c log : epudList)
{
    epudId.add(log.Id);
}
 
List<Zero_Quantity_Product__c> zeroProdDeleteList = new List<Zero_Quantity_Product__c>();
List<Zero_Quantity_Product__c> zeroProdUpdateList = new List<Zero_Quantity_Product__c>();
List<Zero_Quantity_Product__c> zeroProdList = [Select Id,EPUD__c, Product__c , Product_Name__c 
                                               From Zero_Quantity_Product__c 
                                               Where EPUD__c IN : epudId AND Epud_Temp_Check__c = false 
                                               Order By EPUD__c limit 800];
 
for(Zero_Quantity_Product__c zero1 : zeroProdList)
{
    Integer count = 0;
    for(Zero_Quantity_Product__c zero2 : zeroProdList )
    {
        if(zero1.EPUD__c != null && zero1.Product__c == zero2.Product__c && zero1.EPUD__c == zero2.EPUD__c)
        {
            count++;
            if(count > 1 && zero1.Id == zero2.Id )
            {
                zeroProdDeleteList.add(zero2);
            }
        }
    }
    zero1.Epud_Temp_Check__c = true;
    zeroProdUpdateList.add(zero1);
}

if(zeroProdUpdateList != null)
{
    update zeroProdUpdateList;
}

if(zeroProdDeleteList != null)
{
    //system.debug('zeroProdDeleteList '+zeroProdDeleteList);
    delete zeroProdDeleteList;
}

Thanks.
Pseudodarwinist
Thanks Raj Vakati,
Can you please elaborate a little on your suggestions? Do you have any specific third-party dedup app in mind? And what is Salesforce deduplication Apex?
Pseudodarwinist
Thanks Sibasish Panigrahi,
I am a newbie to Salesforce, dipping my toes into its waters before I dive in. It would be very helpful if you could explain it a bit and suggest how to apply it to my Account object. An ETL job inserts new Accounts into Salesforce in bulk, and more often than not it ends up inserting duplicates.
Sibasish Panigrahi
Hi Chetan,

Let me try my best to explain with comments on each line of code.
I tried to write it according to your needs, but it may need small changes as your conditions vary.
// Query Accounts; I have used a limit of 100
List<Account> accList = new List<Account>();
accList = [Select Id, Name From Account Where {{put condition if required}} LIMIT 100];

// List that collects the duplicate Accounts to delete
List<Account> accDupList = new List<Account>();

// Iterate over the account list
for(Account acc1 : accList)
{
    Integer count = 0; // counts how many accounts matching acc1 have been seen so far

    // Iterate over the same account list again to check for duplicates
    for(Account acc2 : accList)
    {
        if(acc1.Name == acc2.Name) // replace with a stricter condition that reliably proves duplication
        {
            count++;
            // when the inner loop reaches acc1 itself and an earlier match exists, acc1 is a duplicate
            if(count > 1 && acc1.Id == acc2.Id)
            {
                accDupList.add(acc2);
            }
        }
    }
}

system.debug('duplicate accounts list: '+accDupList);

It's kind of messy, but I have used it in many projects to delete duplicate records and it has worked well for me. As I'm also a newbie in Salesforce, I love to experiment more with coding :-)

Hope this method of mine helps you too :-)


Thanks.
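
The nested loop above compares every record with every other record, so its cost grows with the square of the list size. A single-pass variant using a Set might look like the sketch below. It assumes the same 100-record sample with the WHERE placeholder left out, adds an ORDER BY CreatedDate (my assumption) so that the oldest record of each name is the one kept, and, like the original, only detects duplicates within the queried list.

```apex
// Sketch only: single pass over the list instead of a nested loop.
List<Account> accList = [SELECT Id, Name FROM Account ORDER BY CreatedDate LIMIT 100];

Set<String> seenNames = new Set<String>();
List<Account> accDupList = new List<Account>();

for (Account acc : accList) {
    // Apex == on strings ignores case, while Set lookups do not,
    // so lower-case the name to mirror the original comparison.
    String key = acc.Name.toLowerCase();
    if (seenNames.contains(key)) {
        // A record with this name was already seen: treat this one as the duplicate.
        accDupList.add(acc);
    } else {
        seenNames.add(key);
    }
}

System.debug('duplicate accounts list: ' + accDupList);
```

For the Account case in this thread, the key would presumably be R1_ACC_TXT_Id_Golden_record__c rather than Name.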
Pseudodarwinist
Thanks Sibasish,
I will try to implement it for my org. But is it going to work on the already existing duplicates, which number more than 7 lakh (700,000) right now, or will I need to write a trigger that fires on future Account insertions?
Sibasish Panigrahi
Hi Chetan,

Yes, it works for existing duplicate records. Try it on the first few hundred records, but before that please take a backup of the records so that if anything goes wrong your records will be safe.

Thanks.
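
For the future-insertion part of the question, a before-insert trigger is one option. The sketch below is an assumption about how it could look, not something tested against this org: it rejects an incoming Account when its R1_ACC_TXT_Id_Golden_record__c value already exists in the org or appears twice in the same insert. The trigger name and error messages are illustrative. Marking the field as unique would be a declarative alternative.

```apex
// Sketch only: blocks new Accounts whose golden-record key is already in use.
trigger AccountGoldenRecordDedup on Account (before insert) {
    // First incoming Account seen for each key.
    Map<String, Account> incoming = new Map<String, Account>();

    for (Account acc : Trigger.new) {
        String key = acc.R1_ACC_TXT_Id_Golden_record__c;
        if (String.isBlank(key)) {
            continue; // nothing to compare against
        }
        if (incoming.containsKey(key)) {
            // Same key appears twice within this insert.
            acc.addError('Duplicate golden record ID in this insert: ' + key);
        } else {
            incoming.put(key, acc);
        }
    }

    if (incoming.isEmpty()) {
        return;
    }

    // One query for the whole chunk, so the trigger stays bulk-safe.
    for (Account existing : [
        SELECT R1_ACC_TXT_Id_Golden_record__c
        FROM Account
        WHERE R1_ACC_TXT_Id_Golden_record__c IN :incoming.keySet()
    ]) {
        Account dup = incoming.get(existing.R1_ACC_TXT_Id_Golden_record__c);
        if (dup != null) {
            dup.addError('An Account with this golden record ID already exists.');
        }
    }
}
```

With a bulk ETL load, whether the rejected rows fail individually or fail the whole batch depends on how the load tool calls the API, so this is worth trying in a sandbox first.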
Dhanasekar K
Good morning all,

Can someone suggest how to find duplicate records in a custom object (which has more than 15 million records)? I tried the query below, but it takes too long and fails with a CPU time limit exception.

List<AggregateResult> acc=[SELECT Name, count(Id) FROM Account GROUP BY Name HAVING count(Id)>1];

Now I am planning to create a batch job to find and update those records, but I guess it has to be done on subsets of records.

Can someone advise whether there is any other way to achieve this?

Thanks in advance!