Chris Walters 9

Bulk API 2.0 query returns truncated results when LIMIT not used

version 47.0
Bulk API 2.0

Built up a Python script using the requests and requests_oauthlib modules against test.salesforce.com/services/oauth2/token to generate access tokens, and a function that concatenates chunks of results until the last chunk is detected, then creates a JSON string from the results.
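
The token call itself is just a POST of the OAuth credentials; roughly like this (username-password flow shown here with plain requests, credentials are placeholders):

import requests

# Placeholders - fill in the connected app and sandbox user credentials.
CLIENT_ID = "..."
CLIENT_SECRET = "..."
USERNAME = "..."
PASSWORD = "..."

# Username-password OAuth flow against the sandbox token endpoint.
resp = requests.post(
    "https://test.salesforce.com/services/oauth2/token",
    data={
        "grant_type": "password",
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "username": USERNAME,
        "password": PASSWORD,
    },
)
resp.raise_for_status()
token = resp.json()["access_token"]
instance_url = resp.json()["instance_url"]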

We know we have 1479552 rows in Contact.
Problem:
SELECT id,Name,Email,Phone FROM Contact LIMIT 1850000
returns all 1479552 rows, but
SELECT id,Name,Email,Phone FROM Contact
returns only 492196!
What is up with that?!

And I noticed that when using the LIMIT clause, only two chunks are returned: the first contains all or nearly all of the data and the second only about 110 bytes, while without the LIMIT, three chunks of roughly equal size are returned.

Can anyone verify and/or explain this behaviour? Better still, can you tell me how to get around it?

First post, YAY!

TIA,

Chris
 
Best Answer chosen by Swetha (Salesforce Developers) 

All Answers

Swetha (Salesforce Developers)
Hi Chris,

Have you checked the state of the Bulk API 2.0 job when LIMIT is not used? Does it show JobComplete or Open?

Related article: https://developer.salesforce.com/docs/atlas.en-us.216.0.api_bulk_v2.meta/api_bulk_v2/how_requests_are_processed.htm
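
A quick way to check is to poll the job info endpoint, roughly like this (token and instance_url from your auth call; job_id is the id of your query job):

import requests

# Poll the Bulk API 2.0 query job and print its current state.
# token and instance_url come from the OAuth call; job_id is the id
# returned when the query job was created.
resp = requests.get(
    "{0}/services/data/v47.0/jobs/query/{1}".format(instance_url, job_id),
    headers={"Authorization": "Bearer " + token},
)
resp.raise_for_status()
print(resp.json()["state"])  # e.g. InProgress or JobComplete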

Thanks
Chris Walters 9
Hello Swetha,

I found the problem was not with SF but with the way my code was requesting sequential chunks of output: it was appending locator values onto the submitted URL on each loop iteration, so of course it only worked correctly on the first pass. Changing the appending to simply rebuilding the URL with
         
url_full = "{0}/jobs/query/{1}/results?locator={2}".format( url_to_use, operation_id, header_dict['Sforce-Locator'])

did the trick.
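
For anyone hitting the same thing, the whole download loop now looks roughly like this (url_to_use and operation_id are the same values the rest of the script builds, i.e. the REST base URL and the query job id; the API reports the last page by returning Sforce-Locator: null):

import requests

# token comes from the OAuth call; url_to_use and operation_id are the
# REST base URL and the query job id used elsewhere in the script.
headers_auth = {"Authorization": "Bearer " + token}

chunks = []
locator = None
while locator != "null":  # the last page comes back with Sforce-Locator: null
    # Rebuild the URL from scratch on every pass instead of appending to it.
    url_full = "{0}/jobs/query/{1}/results".format(url_to_use, operation_id)
    if locator:
        url_full += "?locator=" + locator
    resp = requests.get(url_full, headers=headers_auth)
    resp.raise_for_status()
    chunks.append(resp.text)  # each page is a CSV document
    locator = resp.headers.get("Sforce-Locator", "null")

# The script then joins the chunks and converts them to a JSON string.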

Thx,

Chris
This was selected as the best answer