icemft1976

Outbound Messages not sending, '0 attempts' & queue backlog

Has anyone else had their Outbound Message queue back up, with no delivery attempts being made on the queued messages?

Our workflow rules are triggering the right outbound messages, and when they get sent they process fine, but we're seeing them sit in the queue for hours (up to 24). It also happens to some outbound messages that were attempted but failed the first time (i.e. we have messages sitting in the queue that say "next delivery attempt" and list a time well in the past).

I've tried support but they have not been able to find a cause.

Today our queue was working fine in the morning, but after noon it stopped attempting all messages after it got this error (which looks like a server-side issue to my ignorant eyes): "Request timed out waiting for connection to ConnPool_2, Num waiting=10, Num Active RIs=45"

Are there any limits I might be overlooking? We have over 120 users on unlimited accounts.

Any anecdote or insight is appreciated! Especially considering that the new website we just launched relies on this integration.

Thanks,
AJD
prb
I'm also having this problem.  Did you ever figure out why your messages were not being sent?

icemft1976
I never got an answer as to the cause - but I kicked up a fuss with our account rep (after submitting a ticket and working with tech support for a few days) and it went away.

I assumed it was something I was doing, but we didn't make any adjustments on our side and the problem has never come back. I'd be interested to know how many other folks had the issue, but I can only assume they found a problem and resolved it (but didn't pass the explanation along).
Harmpie
Same issue for me. Messages were nicely being sent by Salesforce and received by my service last Friday. I came back from the weekend (I haven't changed a thing since the successful attempts) and now it seems SFDC refuses to send the messages out. Nothing is reaching my service anyway, I am sure of that. What is wrong here?
 
(edit: Screenshot was taken 8-9-2008 14:05)


Message Edited by Harmpie on 09-08-2008 05:08 AM
icemft1976
Harmpie,

Did you ever get a resolution on this issue?

Ours just resolved itself one day while tier 3 was investigating the problem. It could be a coincidence or not; they said they didn't do anything specific to resolve it, which left me uneasy about the solution.

I would never rule out a problem on the web service end (I wrote that code, after all :) ), but at the time I couldn't identify any changes in the environment of the web service endpoint the messages were being sent to (no domain issues, the same Apache server, same scripts, etc).

I'm just taking a shot in the dark: maybe there was a peering issue between Salesforce and our hosting provider? But if that was the case, I'd assume there would be more info on the Salesforce queue side of things.

Now we have logging on the receiving end, so I can run a daily report and verify that the number of records changed in Salesforce matches the number of records updated in our website (where the web service lives). That only catches dropped messages, though, not severely delayed ones (unless they get delayed by a day or more).
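
A count comparison misses delays, but comparing timestamps per record can surface them. Below is a minimal sketch of that idea, not the actual reporting code: the function name `reconcile`, the two dictionaries (record ID to timestamp), and the one-hour threshold are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def reconcile(changed_in_salesforce, received_by_site, max_delay=timedelta(hours=1)):
    """Compare records changed in Salesforce with messages logged by the endpoint.

    Both arguments are hypothetical inputs mapping a record ID to a datetime:
    when the record changed (Salesforce side) and when the endpoint logged
    the corresponding message (website side).
    """
    # Changed in Salesforce but never logged by the endpoint.
    dropped = sorted(
        rid for rid in changed_in_salesforce if rid not in received_by_site
    )
    # Logged by the endpoint, but later than the allowed delay.
    delayed = sorted(
        rid
        for rid, changed_at in changed_in_salesforce.items()
        if rid in received_by_site
        and received_by_site[rid] - changed_at > max_delay
    )
    return dropped, delayed
```

Run from a daily job, the `dropped` list covers what the count comparison already catches, and the `delayed` list adds the messages that arrived but sat in the queue for longer than the chosen threshold.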

A.J. Dellicicchi

PS - We moved from NA1 to NA6, but I think that was after the problem was resolved. For kicks, what servers are you on?


Message Edited by icemft1976 on 11-11-2008 07:39 AM
prb
I do remember that when I had this problem the 'Next Attempt' value was set in the past.
From this screenshot it looks like the 'Next Attempt' is set to the same value as 'Last Attempt'.
I think it's possible for some unchecked exception behind the scenes of Salesforce (either from Apex code execution or perhaps related to an upgrade) to generate corrupt timestamp values for outbound messages that never send.
It's not reproducible though so this is a guess on my part based on what I've observed.



Harmpie

Hi,

I guess my story is the same: Support was working on it, but in the meantime the messages were already being sent correctly again. They told me they had changed nothing, and since my problem had solved itself, they closed the case. I tried getting more info from support about the cause of the problem, which they promised to give me if they came up with anything, but I haven't heard anything since.

At the time of my message, I was already building service-side logging, which is why I was 100% sure that nothing was reaching it. Of course, looking at the monitoring screen in Salesforce (see my screenshot), it was quite probable that the problem was on the SFDC side.
 
I have to admit that I am still not very happy with the situation. As far as I know, the messages can stop being sent at any time, for no apparent reason, and are purged from the queue after 24 hours without any warning. That can of course lead to data loss that is impossible to track unless you post someone at the monitoring screen 24x7.
 
I sure hope SFDC will come up with some assurance that this can never happen again, but I doubt it... :smileymad:
Superfell
Messages are never expired from the queue without a delivery attempt, so even under adverse conditions there's always an attempt to deliver the message before it's expired. I think the other issues are just an expectations mismatch. The delivery service is asynchronous: in general you can expect a delivery attempt soon (e.g. < 30 seconds) after your message has been queued, but this is not guaranteed, and under various adverse circumstances the delivery attempt can be delayed from a few minutes to a few hours. Finally, the delivery process is modulated based on your endpoint's performance. To get the best performance out of the delivery process, your endpoint should acknowledge every message within a short period of time (this may entail having the listener manage a local transactional queue itself).
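
To illustrate the "acknowledge quickly, process later" idea, here is a minimal sketch of a listener-side transactional queue. It is only one possible approach: SQLite as the local store and the helper names `open_queue`, `receive`, and `process_pending` are assumptions for illustration, and parsing the incoming message and building the acknowledgement response are left out.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Open (or create) the local queue. The table layout is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS inbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "body TEXT NOT NULL, "
        "done INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn


def receive(conn, raw_body):
    """Persist the raw message, then report that it is safe to acknowledge."""
    with conn:  # commits on success, rolls back if the insert raises
        conn.execute("INSERT INTO inbox (body) VALUES (?)", (raw_body,))
    return True


def process_pending(conn, handler):
    """Run the slow work later, outside the request path."""
    rows = conn.execute(
        "SELECT id, body FROM inbox WHERE done = 0 ORDER BY id"
    ).fetchall()
    processed = 0
    for row_id, body in rows:
        handler(body)  # if this raises, the row stays pending for a retry
        with conn:
            conn.execute("UPDATE inbox SET done = 1 WHERE id = ?", (row_id,))
        processed += 1
    return processed
```

The request handler would call `receive` and acknowledge as soon as it returns, while a separate worker calls `process_pending` on its own schedule, so slow business logic never holds up the acknowledgement.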

Networking problems, exceptions and other reasons for delivery failures will show in the log. If the attempts count says 0, we haven't tried to deliver the message yet, regardless of what the schedule says.
icemft1976
Thanks Simon, I want to be clear on the specific case we're all looking at:

The Next Attempted Delivery time was in the past while the number of attempts said 0?

I don't know how to reconcile that specific status we saw with the other (totally reasonable) explanations that have been posited for why there might be message delays up to 24 hours long.

It's been a while since I had this problem, but I was seeing this exact status (0 attempted deliveries and a Next Attempted Delivery time in the past), which falls into none of the categories you described (as far as I can tell).

It might be a miscommunication problem (no pun intended) on my part: I just haven't heard tech support acknowledge that this particular state was even possible, which worried me. Maybe it's simply an error in the "Next Attempted Delivery Time" that's causing this confusion (as I think prb suggested earlier).

I might be misinterpreting this state, does it actually mean: "We scheduled a delivery attempt but didn't have the resources, so it will be tried again later but we can't say when at this time" ?




Message Edited by icemft1976 on 11-11-2008 09:24 AM
Superfell
Right, a next delivery attempt in the past means the message is scheduled for delivery, but we haven't actually tried to deliver it yet (for any number of reasons).
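
Put together, the two status fields can be read roughly as follows. This is only a sketch of the interpretation described in this thread, not Salesforce's scheduler logic; `describe_queue_entry`, its inputs (values read off the monitoring screen), and the wording of the returned labels are all hypothetical.

```python
from datetime import datetime


def describe_queue_entry(attempts, next_attempt, now):
    """Interpret a queue entry from its attempt count and scheduled time.

    attempts: number of delivery attempts shown for the message.
    next_attempt: the scheduled next delivery time (datetime).
    now: the current time (datetime), passed in so the check is testable.
    """
    if attempts == 0 and next_attempt <= now:
        # The state discussed above: scheduled, but nothing tried so far.
        return "scheduled, not yet attempted"
    if attempts == 0:
        return "scheduled for a future first attempt"
    if next_attempt <= now:
        # At least one attempt was made and the retry is overdue.
        return "retry due, not yet attempted"
    return "waiting for next retry"
```

Read this way, a scheduled time in the past is a symptom of the delay rather than its cause.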
icemft1976
I guess it never hurts to be really explicit when you think you have a "problem."

My assumption was that a Next Attempt Time in the past was an error condition which was causing the delay - not the other way around.

I would expect a note in the docs saying "Note: Although Next Attempt Time is usually in the future, it can be set in the past if resources do not allow the initial attempt. A new delivery will be attempted when resources allow it." So I assumed there was something wrong with my particular instance.

Thanks for the confirmation!
Fangzi Wang
Hi all,

I am seeing the exact same problem as icemft1976. I am surprised that this problem has not been solved for six years. Any ideas on how to avoid it?

Thanks,