knichols

command line data loader

Has anyone ever loaded data, using the dataloader, from a table with 40 million rows?  I run out of memory each time.  I have

<entry key="dataAccess.readBatchSize" value="200"/>

 

defined, but I'm guessing Data Loader isn't actually using it to read 200 rows at a time from the table and add them to the batch. We are also trying to use the Bulk API for this. Any help or ideas, please?
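For illustration, this is roughly what reading in batches of 200 looks like at the database-cursor level. It is only a minimal Python sketch that uses the standard-library sqlite3 module as a stand-in for the real database. It is not Data Loader's implementation, and the `accounts` table and the `read_in_batches` function are hypothetical names.

```python
import sqlite3

def read_in_batches(conn, query, batch_size=200):
    """Yield lists of at most batch_size rows instead of fetching everything at once."""
    cursor = conn.cursor()
    cursor.execute(query)
    while True:
        rows = cursor.fetchmany(batch_size)  # asks the cursor for the next batch only
        if not rows:
            break
        yield rows

# Stand-in data: an in-memory table with 1,000 rows (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO accounts (name) VALUES (?)",
    [(f"acct-{i}",) for i in range(1000)],
)

# Each batch would be handed to the loader before the next one is read.
batch_sizes = [len(batch) for batch in read_in_batches(conn, "SELECT id, name FROM accounts")]
```

Whether a given driver really streams rows this way, rather than buffering the whole result set on the client first, depends on the driver and its settings, which may be what is happening with the out-of-memory error.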

 

Dodi

You will run into problems if you load that many records into one object. Objects typically start to lose performance after about 6 million records.

 

With the Bulk API, you can load 10k records per call. However, be careful, because you cannot delete them via the Bulk API, so you may get stuck deleting 200 per call.
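To put those per-call sizes in perspective for a 40 million row table, here is a quick back-of-the-envelope calculation in Python. The constants come straight from the figures quoted in this thread; nothing here calls Salesforce.

```python
TOTAL_RECORDS = 40_000_000  # table size from the original question
BULK_BATCH = 10_000         # records per Bulk API call, as quoted above
SMALL_BATCH = 200           # records per call for the non-bulk path, as quoted above

def calls_needed(total, per_call):
    """Ceiling division: number of calls needed to cover every record."""
    return (total + per_call - 1) // per_call

bulk_calls = calls_needed(TOTAL_RECORDS, BULK_BATCH)
small_calls = calls_needed(TOTAL_RECORDS, SMALL_BATCH)
print(bulk_calls, small_calls)  # prints: 4000 200000
```

So the same volume takes 50 times as many calls at 200 records per call.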

 

Regards

knichols

I'm aware of the problems once we get the data to SFDC, but we can't even select the data from our database to send. Each time, we get an out-of-memory error.

Dodi

What tool are you using to query the database? 40 million records is a lot. Another approach is to break the load into 40 CSV files. I have loaded up to a million records per file. Using the Bulk API, each file loads in an hour or less.
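The file-splitting approach can be sketched as follows. This is a minimal Python example under stated assumptions: `split_into_csv_files`, the `load_NNN.csv` naming scheme, and the two-column layout are all hypothetical, and the rows are assumed to arrive as an iterator so that the whole table never has to sit in memory.

```python
import csv
import os
import tempfile

def split_into_csv_files(rows, header, out_dir, rows_per_file=1_000_000):
    """Write rows to numbered CSV files, each with the header and at most rows_per_file data rows."""
    paths = []
    out = None
    writer = None
    count = 0
    for row in rows:
        # Start a new file at the beginning and whenever the current one is full.
        if writer is None or count == rows_per_file:
            if out is not None:
                out.close()
            path = os.path.join(out_dir, f"load_{len(paths) + 1:03d}.csv")
            out = open(path, "w", newline="")
            writer = csv.writer(out)
            writer.writerow(header)
            paths.append(path)
            count = 0
        writer.writerow(row)
        count += 1
    if out is not None:
        out.close()
    return paths

# Small demonstration: 25 generated rows split into files of 10 rows each.
demo_dir = tempfile.mkdtemp()
demo_rows = ((i, f"name-{i}") for i in range(25))
demo_paths = split_into_csv_files(demo_rows, ["id", "name"], demo_dir, rows_per_file=10)
```

With the default of one million rows per file, a 40 million row source would produce 40 files, each of which could then be loaded as its own job.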

knichols

We are using Data Loader to query the database. We thought about breaking it into separate jobs based on some filter in the SQL, but we are still trying to figure this one out.

Dodi

It does not sound like you are using the Apex Data Loader, since it does not connect to native SQL databases. Are you using the Informatica cloud-based data loader? I'm not sure it has the capacity to read 40 million rows. I would break the query into smaller chunks.
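One common way to break a large query into smaller chunks is to filter on an indexed key and resume from the last key seen. The sketch below shows that idea in Python with the standard-library sqlite3 module standing in for the real database. The `source_rows` table, its positive integer `id` column, and the `iter_chunks_by_id` function are assumptions for illustration, not anything taken from Data Loader.

```python
import sqlite3

def iter_chunks_by_id(conn, chunk_size):
    """Yield chunks of rows ordered by id, using the last id seen as the next filter."""
    last_id = 0  # assumes ids are positive integers
    while True:
        rows = conn.execute(
            "SELECT id, name FROM source_rows WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        if not rows:
            break
        yield rows
        last_id = rows[-1][0]  # resume after the highest id in this chunk

# Stand-in data: 2,500 rows in an in-memory table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source_rows (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO source_rows (name) VALUES (?)",
    [(f"row-{i}",) for i in range(2500)],
)

chunk_sizes = [len(chunk) for chunk in iter_chunks_by_id(conn, 1000)]
```

Each chunk only ever holds `chunk_size` rows, and each id range could just as well be run as a separate job with its own filter in the SQL.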

knichols

Read the docs on the command-line interface for Data Loader. I assure you, you can connect to SQL databases.

Dodi

Yep, you are right. I assumed you were using the thick-client GUI tool and was not aware you were using the CLI.