Ariel Berkman 5

Error during BULK API Query using PKChunking and "ORDER BY" keyword

Hi All,

We believe we've hit a bug with the Bulk API. When we try to perform a large query using PK Chunking together with an ORDER BY clause, the query fails.

Specifically, it appears that the Bulk API is trying to submit multiple batches on different Id ranges (as it should), but because it appends the range filter after the ORDER BY clause, the batch fails with the following error:

(msg=InvalidBatch : Failed to process query: MALFORMED_QUERY: Field_name FROM Account ORDER BY Id where Id >= '001150000yyyyyy' and ^ ERROR at Row:1:Column:xxxx unexpected token: 'where'), aborting

A sample SOQL query that fails when the Sforce-Enable-PKChunking header is set to True:

'SELECT Id,Name from Account ORDER BY Id'

Any suggestions/thoughts would be highly appreciated!

Thanks in advance,
Ariel.
Ariel Berkman 5
Hi,

Any thoughts/suggestions on this?

Thanks!
Ariel.
jhurst
Ariel,

So this is intended. PK Chunking is designed to automatically order and offset the queries in a way that works with the primary keys. Adding an ORDER BY is not needed in this case, as the query is already ordered by the IDs.
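
For anyone hitting the same error, one possible workaround is to drop the ORDER BY clause before submitting the query with the Sforce-Enable-PKChunking header. Here is a minimal Python sketch. Note that `strip_order_by` is a hypothetical helper, not part of any Salesforce library, and it assumes a simple query with a single trailing ORDER BY clause (no subqueries, and nothing such as LIMIT after it):

```python
import re

# Matches a trailing ORDER BY clause, case-insensitively.
# Assumption: the query is simple (no subqueries, no string literals
# containing "order by", and no clause following ORDER BY).
_ORDER_BY = re.compile(r"\s+ORDER\s+BY\s+.+$", re.IGNORECASE | re.DOTALL)


def strip_order_by(soql: str) -> str:
    """Return the query with its trailing ORDER BY clause removed, if any."""
    return _ORDER_BY.sub("", soql.strip())


if __name__ == "__main__":
    # The failing sample query from the original post.
    print(strip_order_by("SELECT Id,Name from Account ORDER BY Id"))
```

The stripped query can then be submitted as usual. If a particular order is needed, it would have to be applied on the client after the batch results are downloaded.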

I will log a bug to have the documentation reflect what filtering conditions are allowed for use with PK Chunking.

Thanks
Jay
Ariel Berkman 5
Thanks Jay, that's really helpful!

Ariel.
Christopher Currie
Jay,

This doesn't appear to always be true. We have been working with batches using PK Chunking where the results are not ordered. It appears to happen in cases where the results of a query all fit within a single batch.

Thanks,
Christopher
jhurst
Christopher,

PK Chunking works by performing offsets on the Primary Key and then constructing queries in the format:

select <fields> from <entity> where Id >= OffsetId1 and Id < OffsetId2
select <fields> from <entity> where Id >= OffsetId2 and Id < OffsetId3
select <fields> from <entity> where Id >= OffsetId3 and Id < OffsetId4
....

So it is possible that the results within an individual batch will not be ordered, but the results from query 2 will have higher IDs than those from query 1.
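
The query pattern above can be sketched in Python to show how a range filter is attached to each chunk. This is only an illustration: `chunk_queries` and the boundary values are hypothetical names, and the real offsets are chosen by the Bulk API on the server side.

```python
def chunk_queries(fields, entity, boundaries):
    """Build one range query per adjacent pair of boundary Ids.

    `boundaries` is a hypothetical, already-sorted list of offset Ids;
    the real values are computed by the server, not by the client.
    """
    select = f"select {', '.join(fields)} from {entity}"
    return [
        f"{select} where Id >= '{lo}' and Id < '{hi}'"
        for lo, hi in zip(boundaries, boundaries[1:])
    ]


if __name__ == "__main__":
    offsets = ["OffsetId1", "OffsetId2", "OffsetId3"]
    for query in chunk_queries(["Id", "Name"], "Account", offsets):
        print(query)
```

Because the range filter is appended to the end of the query text, a query that already ends in ORDER BY would come out as `... ORDER BY Id where Id >= ...`, which matches the MALFORMED_QUERY error reported at the top of the thread.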