New Bulk API from Winter '10 is not streaming
Hello,
I'm reviewing the new Bulk API being prepared for the Winter '10 release, and I see one major limitation: it expects the number of batches to be specified in advance. This implies you must have all the data you want to load up front, so you can determine how many batches you will need to upload it. This requirement makes the Bulk API unusable when you want to stream data without knowing in advance how much data there is in total.
My question is to the developers of the Bulk API. Is there time left to extend the API to allow streaming of data? Is it possible to remove this requirement to submit the number of batches in advance?
You are correct. I guess this may actually work.
Still, the question of how versatile the new Bulk API is remains open.
Hi,
What made you think it needs to know the number of batches in advance? It doesn't. You should definitely be able to implement a fully streaming client that doesn't know the number of records in advance.
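As a rough illustration of what such a streaming client could look like, here is a minimal Python sketch. It only shows the batching logic: records are read lazily and grouped into batches as they arrive, so the total count is never needed up front. The names `iter_batches`, `stream_job`, `send_batch` and `close_job` are hypothetical placeholders, not part of the Bulk API; in a real client `send_batch` would wrap the HTTP POST for one batch, and `close_job` stands in for whatever end-of-job step the client needs (an assumption here).

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def iter_batches(records: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield lists of up to batch_size records, reading the source lazily."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(records)
    while True:
        # islice pulls at most batch_size items; an empty list means the source is exhausted.
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def stream_job(
    records: Iterable[str],
    batch_size: int,
    send_batch: Callable[[List[str]], None],  # hypothetical: would POST one batch
    close_job: Callable[[], None],            # hypothetical: assumed end-of-job step
) -> int:
    """Send batches as they fill up and return how many were sent."""
    sent = 0
    for batch in iter_batches(records, batch_size):
        send_batch(batch)
        sent += 1
    # The number of batches is only known here, after the stream has ended.
    close_job()
    return sent
```

The source can be any iterable, including a generator that reads from a socket or a database cursor, so nothing has to be held in memory beyond the current batch.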
This is great news! I guess I misunderstood the documentation. I will give it a try.
Thanks,
Ivan
Jesper,
This is the paragraph that confused me (api_bulk.pdf):
What You Can Do with the Bulk API
The REST Bulk API lets you insert, update, or upsert a large number of records asynchronously. That is, you first send a number of batches to the server using an HTTP POST call and then the server processes the batches in the background. While batches are being processed, you can track progress by checking the status of the job using an HTTP GET call. All operations use HTTP GET or POST methods to send and receive XML or CSV data.
----
The text "That is, you first send a number of batches to the server" is not very clear.