Import Campaign Members
Hi all,
I have a large number of campaign members (~55M records) that I'm trying to import in a timely manner, but without success. I initially tried with my default bulk settings (parallel mode) and got locking errors for ~30% of the records. This is the key quote from the Salesforce documentation that lists all the possible workarounds for my scenario, none of which I can use: "For optional lookup fields, you can avoid the locks by setting the "Clear the value of this field" option, which does more than just tell Salesforce what to do if your lookup record is deleted. When you set this option, whenever a record that has a lookup field to this lookup record is inserted or updated, Salesforce doesn't lock the lookup records; instead it only validates that the lookup values exist. To avoid unnecessary locks, it’s best to set this option for your lookup fields whenever possible. If it isn’t possible to configure your lookup fields with this option, use a workaround similar to the one you would use for master-detail relationships: Ensure that no lookup values span multiple batches or order your loads by their lookup values to minimize how often the records being inserted in multiple, concurrent batches reference a single lookup value."
For the lookups I use CampaignId and ContactId, and neither field has the "Clear the value of this field" option, I assume because they're standard fields. I can't really adjust my files either, because with ~7,000 campaigns, 10M contacts, and 55M members that's no small task. Therefore, the only working solution I currently have is to use serial concurrency mode, but the speed is really not acceptable: it will take ~2 weeks to import all the members.
Please let me know if you have run into this issue before and/or if you know an alternative. Thank you!
Changing the sort order is not absurd: I found that someone solved a problem of parallel bulk inserts (lock contention) just by changing the sort order of the input file.
In parallel mode, if you insert thousands of contacts for the same campaign, split into groups loaded at the same time, more than one thread will try to lock the same campaign record for a long time while inserting a lot of contacts (lock contention).
A CSV file sorted on the contact (external) id will perhaps have fewer lock contentions during the loading of the campaign members, but there will be more individual locks on the campaign object (which may not be great for performance either, but at least it remains a parallel bulk load).
There is also the problem of lookup skew, but there is no simple solution for that:
https://developer.salesforce.com/blogs/engineering/2013/04/managing-lookup-skew-to-avoid-record-lock-exceptions.html
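To illustrate the documentation's advice about not letting lookup values span batches, here is a minimal sketch in plain Python. It assumes rows are dicts carrying a lookup field such as CampaignId (the field name and batch size are illustrative, not from the thread): it sorts the rows by the lookup value, then packs whole groups into batches so that a single campaign rarely appears in two concurrently loaded batches.

```python
from itertools import groupby

def batches_by_lookup(rows, key, batch_size=10000):
    """Pack rows into batches so that a single lookup value
    (e.g. one CampaignId) rarely spans two batches."""
    rows = sorted(rows, key=lambda r: r[key])
    batch = []
    for _, group in groupby(rows, key=lambda r: r[key]):
        group = list(group)
        # Start a new batch rather than splitting this group across two.
        if batch and len(batch) + len(group) > batch_size:
            yield batch
            batch = []
        batch.extend(group)
        # An oversized group (lookup skew) still has to be split.
        while len(batch) >= batch_size:
            yield batch[:batch_size]
            batch = batch[batch_size:]
    if batch:
        yield batch
```

Each yielded batch would then become one Bulk API batch. Note that this only reduces contention; a heavily skewed campaign (millions of members) still ends up in many batches, which is exactly the lookup-skew problem the blog post above describes.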
All Answers
Note the field "Contacts in Campaign" (NumberOfContacts, int) on Campaign: it is a roll-up summary field, so the relationship behaves like master-detail for locking purposes.
https://developer.salesforce.com/docs/atlas.en-us.sfFieldRef.meta/sfFieldRef/salesforce_field_reference_Campaign.htm
Master-detail: on insert or delete of a detail record, the master record is always locked. On update, the master is only locked if roll-up summary fields exist on the master for that child type.
Lookups: On update of children, parent records are only locked when the Lookup value on the child changes. On insert or update, parent records are only locked "if [the] lookup relationship is not configured to clear the value of this field if the lookup record is deleted," which is the default option (clearing the value, that is).
@Alain,
thanks for the reply. On your questions:
1. I'm using the Talend ETL tool.
2. Yes, I do have problems with the parents: I believe the locking errors I was getting came only from the Campaign object. For Contacts, the only error I initially encountered was that I can't update ContactId (I was using UPSERT). I found that ContactId is not actually editable, hence the error, so I changed the job to INSERT.
3. The file is not really sorted by anything, and I haven't tested that, because it's a huge file; I'll have to try it and see what I can get out of it. And no, I have not tried the CSV file for the campaign members sorted on the contact id (or external contact id field). I will spend some time testing this and reply back.
Thanks for the details. I spent some more time reading everything you shared, plus some other material, and I tested some possibilities, but I couldn't make it run without locks. I was able to reduce the number of locks when I sorted the CSV file by customer external id, but I was still getting some errors, so I don't think this will work for me. I tried sorting by CampaignId as well, but I was getting a pretty big number of locks. So it seems I'll have to live with the current solution of serial mode and the expected one week for the import. Thank you very much for your answers; I've learned something new here. I hope this will be of help to others as well.
By definition, it is very difficult (practically impossible) to reproduce the problems that occur on very large volumes of data.
I have used several ETL tools extensively (Informatica, Talend, and Pentaho), but never with such volumes (dozens of millions of records), which is why your question is very interesting (a problem for an architect or data expert).
There is one last alternative: coding a solution in native Java or C# for better control of the lock contention, with the dilemma of spending "some" days writing multi-threaded code for a "one-shot" initialization, without any certainty of doing better than Talend in the end (possibly a waste of time, a failed POC).
Changing the sort order is not enough, and lookup skew lies in ambush (the "silent killer") without a clear solution except reducing the scope (a basic change like the sort order is always tried first, but it is not sufficient on its own).
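To sketch what such custom multi-threaded code could buy you, here is one possible approach in Python (the same idea carries over to Java or C#), under stated assumptions: rows are dicts with a CampaignId field, and load_batch stands in for whatever actually calls the Bulk API (not shown). Each campaign is routed to a fixed worker by hashing its id, so two workers never touch the same parent record and therefore never contend for the same campaign lock, while the workers themselves still run in parallel.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def partition_by_parent(rows, key, n_workers=4):
    """Route every row to a fixed shard based on its lookup value,
    so two workers never touch the same parent record."""
    shards = defaultdict(list)
    for row in rows:
        shards[hash(row[key]) % n_workers].append(row)
    return list(shards.values())

def load_in_parallel(rows, key, load_batch, n_workers=4):
    """Run one worker per shard; within a shard, rows for a given
    parent are loaded serially, eliminating cross-thread lock
    contention on that parent."""
    shards = partition_by_parent(rows, key, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(load_batch, shards))
```

This does not solve lookup skew either: one enormous campaign still makes its shard (and its worker) a bottleneck, which is the trade-off the paragraph above describes.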
Asking the same question on other forums could bring new feedback
(sfdcfox https://salesforce.stackexchange.com/users/2984/sfdcfox answers very pointed questions).
Best regards
Alain
Thanks for your engagement with the topic; it was very interesting for me as well to get some outputs, and I'm sure this will help someone else in the future, so there's additional value here. The alternative you're talking about might be worth it if the numbers are really huge, say hundreds of millions or more, but in my case it's not really that bad: 54M records in serial mode will probably finish in about 5 days, so it's not a big deal, and it's just historical data anyway, with no rush to use it.