function readOnly(count){ }
Starting November 20, the site will be set to read-only. On December 4, 2023,
forum discussions will move to the Trailblazer Community.
+ Start a Discussion
AlwaysConfusedAlwaysConfused 

Syncronisation

Hey guys,

 

I have a problem, pretty big one as it happens.

One of our customers has about 10,000 campaigns and about 78 MILLION campaign member items in the SF db.

 

They have asked if we can produce weekly complex reports on them that we can't build through SF and since we control the data we have built a copy of the SF data on a local SQL server.

 

I'm trying to build a syncronisation tool / process using SSIS and a bit of C# code that regularly looks up (using the SF api) whats changed and updates our copy of the data.

 

Of course this leads to a problem, due to the sheer number of records in the DB I often hit timeout issues and when I don't hit timeout issues I often need to sync millions of rows by downloading them.

 

Has anyone had any experience with dealing with huge row counts that they could lend some thoughts on this?

 

Does anyone know how I might "partition" the data so I can still somehow query the entire dataset but work mostly on subsets (eg split down per campaign)?

RichAintRichRichAintRich

A few questions, Paul. What is it about the weekly complex reports that is solved by migrating the records from Salesforce to a different database platform? In other words what can you do in SQL Server or some other database that you can't do in Salesforce?

 

Rich

AlwaysConfusedAlwaysConfused

As it happens I managed to figure out my syncronisation issues in the end, i now havea sync package that runs daily pulling any updated / new records.

 

The reports could not be produced on SF because of a few things :

1. query complexity (SOQL doesn't support Query chains like ones I'm using).

2. The reports rely on data outside of SF as well as data within it.

3. Performance, i'm not sure why (althoug i could hazzard a guess) but raw SQL and C# seems faster than APEX + SOQL

 

It's a clever solution actually, the report data is relying on roughly 100 million rows of data all in, I'm joining data from the campaigns ect described above to data across several complex objects with their own relationships to other things, I think I prefer it this way too because the customer often changes their mind.

 

Slesforce is great, but its not that great, and I would feel guilty about putting this much load on someone elses datacenter when mine is perfectly capable of handling this and because I have less demand for the resource pool I can do what I want without worrying about "API call limits".

 

It sounds daft but given time and enough of these complex reporting requirements I could probably hit the API limits of the unlimited edition (assuming the cloud was fast enough to handle that number of calls in the 24 hours of course).

 

also...

I do like Salesforce, I just don't seem to click well with Java so that accounted for some of my keen interest in bringing the data in house.

 

It's just a shame I can't open a record stream / similar using an API call instead of this "QueryMore" stuff they have going on or I could stream directly from the cloud in to SQL without any code at all.

 

RichAintRichRichAintRich

Makes sense to me. Good luck. Are they using Excel for any of their reporting requirements?

AlwaysConfusedAlwaysConfused

I might use some excel but it seems that I can handle pretty much everything with pivot tables as a SQL Reporting services data source.

 

Seem to be able to produce some pretty complex stuff and gain some real data mining insight with it too :)