Robert Watson, Senior Software Developer at Full Circle Insights (member since 2016)

There seems to be an issue with batch jobs failing silently when they exceed the CPU limit, except under certain circumstances (detailed in the tests below). We have confirmed that this isn't happening only in my org; one or two other orgs reproduce it as well.

This is concerning because there is no way for us to know that our processes are actually working properly when installed in client orgs with many triggers, workflows, and processes on an object. In addition, any updates that happen prior to the failure do NOT get rolled back.

Here is an example class that illustrates the issue:
 
public class TestCpuTimeout implements Database.Batchable<sObject> {
    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id FROM Account LIMIT 200');
    }

    public void execute(Database.BatchableContext bc, List<sObject> scope) {
        scope[0].put('NumberOfEmployees', 100);
        update scope[0];
        System.debug('Account updated: ' + scope[0].Id);

        List<String> largeData = new List<String>();
        for (Integer i = 0; i < 10000; i++) largeData.add('Test');
        for (Integer i = 0; i < 500; i++) {
            // Not calling Limits.getCpuTime() seems to cause this batch job to silently fail.
            // When there is a silent failure, any record updates do not get rolled back.
            //System.debug(Limits.getCpuTime()); // This causes a CPU timeout.
            //Long l = Limits.getCpuTime(); // This causes a CPU timeout.
            //System.debug('test'); // This does not cause a CPU timeout. This silently aborts. Record updates are not rolled back.
            String data = JSON.serialize(largeData);
        }
    }

    public void finish(Database.BatchableContext bc) { }
}



The class simply updates a single account and then repeatedly serializes a large list in order to burn enough CPU time to force a CPU timeout in the asynchronous batch context.

Can anybody else replicate this issue, and/or explain what's going on? And yes, this may seem better suited to a support case, but Salesforce closed my case (#15549753) because developer support is not offered to non-premier partners.

Test Setup:

Set up debugging for the current user, with levels set to Debug for Apex Code and Fine for Apex Profiling.

Test #1:

- Run Database.executeBatch(new TestCpuTimeout());

After a minute, the Debug Logs and Apex Jobs will show nothing out of the ordinary. It looks successful, but it actually doesn't finish its entire processing. The account IS updated with NumberOfEmployees = 100. (This can be confirmed by grabbing the Id from the debug log.)
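
To check each run, I execute something like this in anonymous Apex afterward. This is just a sketch; accountId stands in for the Id printed by the debug log, so substitute your own:

// Find the most recent run of the batch and inspect its status.
AsyncApexJob job = [
    SELECT Status, NumberOfErrors, ExtendedStatus
    FROM AsyncApexJob
    WHERE ApexClass.Name = 'TestCpuTimeout'
    ORDER BY CreatedDate DESC
    LIMIT 1
];
System.debug(job.Status + ' / errors: ' + job.NumberOfErrors + ' / ' + job.ExtendedStatus);

// Check whether the account update stuck. accountId is the Id from the debug log.
Account a = [SELECT NumberOfEmployees, LastModifiedDate FROM Account WHERE Id = :accountId];
System.debug(a.NumberOfEmployees + ' / ' + a.LastModifiedDate);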

Test #2:

- Update the TestCpuTimeout class and uncomment line 16 - System.debug(Limits.getCpuTime());

- Run Database.executeBatch(new TestCpuTimeout());

After a minute, the Debug Logs and Apex Jobs will show that there was a CPU timeout exception. The account is NOT updated with NumberOfEmployees = 100. (It may be the same account from Test #1 - confirm that the Last Modified Date does not change.)

Test #3:

- Update the TestCpuTimeout class, comment out line 16, and uncomment line 17 - Long l = Limits.getCpuTime();

- Run Database.executeBatch(new TestCpuTimeout());

After a minute, the Debug Logs and Apex Jobs will show that there was a CPU timeout exception. The account is NOT updated with NumberOfEmployees = 100. (It may be the same account from Test #1 - confirm that the Last Modified Date does not change.)

Test #4:

- Update the TestCpuTimeout class, comment out line 17, and uncomment line 18 - System.debug('test');

- Run Database.executeBatch(new TestCpuTimeout());

After a minute, the Debug Logs and Apex Jobs will show nothing out of the ordinary. It looks successful, but it actually doesn't finish its entire processing. The account IS updated with NumberOfEmployees = 100.

Tests #1 and #4 do not work as expected. This leads me to believe that, for some unknown reason, having a Limits.getCpuTime() call in the code causes the CPU timeout exception to surface, while nothing else does. This is extremely concerning and confusing.
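
In the meantime, the only defensive measure I can think of is to poll the CPU limit inside expensive loops and bail out before the platform kills the transaction. A sketch of what I mean, dropped into the execute() loop above (the 90% threshold is an arbitrary margin I picked):

for (Integer i = 0; i < 500; i++) {
    // Stop well before the limit instead of letting the platform abort us.
    // The 0.9 margin is arbitrary.
    if (Limits.getCpuTime() > Limits.getLimitCpuTime() * 0.9) {
        System.debug('Approaching the CPU limit; stopping at iteration ' + i);
        break;
    }
    String data = JSON.serialize(largeData);
}

Ironically, this seems to work precisely because of the behavior above: calling Limits.getCpuTime() appears to be what makes the limit enforcement visible in the first place.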

This may be related to https://success.salesforce.com/issues_view?id=a1p300000008YP2AAM, but if so, this issue has gone unfixed for over three months.

(Also posted on Salesforce Stack Exchange: http://salesforce.stackexchange.com/questions/156902/has-anybody-seen-this-behavior-batch-jobs-failing-silently-on-cpu-timeout-excep)

I'm pretty sure this is a Salesforce bug, but when I try to log a case, it just directs me here. So let's see if I can start a discussion and get some traction on this.

I have a trigger built on the DuplicateRecordItem object. In order to meet the requirements of the product we're building, I had to do some funkiness to get around SOQL and DML limit issues. Why? Well, when the Automated Process user finds duplicate records, the DuplicateRecordItems don't seem to be inserted in bulk.

If I'm inserting 5 leads and 5 duplicates are found, I'd expect to see a debug log where 5 DuplicateRecordItems are inserted in bulk by the Automated Process user. The inserts do happen, but one-by-one within the same execution context.
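
For anyone trying to reproduce this: DuplicateRecordItem records are only created when a duplicate rule is set to report duplicates, so a repro needs something along these lines (a sketch; the field values and rule configuration are assumptions, not my actual setup):

// Sketch: insert leads while allowing flagged duplicates to save, which is
// what causes the Automated Process user to create DuplicateRecordSet and
// DuplicateRecordItem records. Field values are made up for illustration.
List<Lead> leads = new List<Lead>();
for (Integer i = 0; i < 5; i++) {
    leads.add(new Lead(LastName = 'Duplicate' + i, Company = 'Acme'));
}

Database.DMLOptions dml = new Database.DMLOptions();
dml.DuplicateRuleHeader.AllowSave = true; // save even if the rule flags duplicates

List<Database.SaveResult> results = Database.insert(leads, dml);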

My debug log looks something like this (truncated to remove business logic):
17:25:54.0 (672964)|CODE_UNIT_STARTED|[EXTERNAL]|TRIGGERS
17:25:54.0 (1855848)|CODE_UNIT_STARTED|[EXTERNAL]|01q36000001GnM4|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event BeforeInsert for [new]
17:25:54.0 (201951200)|CODE_UNIT_FINISHED|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event BeforeInsert for [new]
17:25:54.0 (278330109)|CODE_UNIT_STARTED|[EXTERNAL]|01q36000001GnM4|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event AfterInsert for [0GL36000001Atx8]
17:25:54.0 (595800979)|CODE_UNIT_FINISHED|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event AfterInsert for [0GL36000001Atx8]
17:25:54.0 (656795028)|CODE_UNIT_STARTED|[EXTERNAL]|01q36000001GnM4|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event BeforeInsert for [new]
17:25:54.0 (666779292)|CODE_UNIT_FINISHED|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event BeforeInsert for [new]
17:25:54.0 (696109353)|CODE_UNIT_STARTED|[EXTERNAL]|01q36000001GnM4|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event AfterInsert for [0GL36000001Atx9]
17:25:54.0 (720758546)|CODE_UNIT_FINISHED|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event AfterInsert for [0GL36000001Atx9]
17:25:54.0 (748951344)|CODE_UNIT_STARTED|[EXTERNAL]|01q36000001GnM4|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event BeforeInsert for [new]
17:25:54.0 (753501365)|CODE_UNIT_FINISHED|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event BeforeInsert for [new]
17:25:54.0 (789856188)|CODE_UNIT_STARTED|[EXTERNAL]|01q36000001GnM4|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event AfterInsert for [0GL36000001AtxA]
17:25:54.0 (812664269)|CODE_UNIT_FINISHED|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event AfterInsert for [0GL36000001AtxA]
17:25:54.0 (837110798)|CODE_UNIT_STARTED|[EXTERNAL]|01q36000001GnM4|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event BeforeInsert for [new]
17:25:54.0 (840939576)|CODE_UNIT_FINISHED|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event BeforeInsert for [new]
17:25:54.0 (874777343)|CODE_UNIT_STARTED|[EXTERNAL]|01q36000001GnM4|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event AfterInsert for [0GL36000001AtxB]
17:25:54.0 (895501197)|CODE_UNIT_FINISHED|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event AfterInsert for [0GL36000001AtxB]
17:25:54.0 (921262665)|CODE_UNIT_STARTED|[EXTERNAL]|01q36000001GnM4|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event BeforeInsert for [new]
17:25:54.0 (925065733)|CODE_UNIT_FINISHED|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event BeforeInsert for [new]
17:25:54.0 (950455245)|CODE_UNIT_STARTED|[EXTERNAL]|01q36000001GnM4|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event AfterInsert for [0GL36000001AtxC]
17:25:54.0 (971136910)|CODE_UNIT_FINISHED|LAM_DuplicateRecordItemTrigger on DuplicateRecordItem trigger event AfterInsert for [0GL36000001AtxC]
17:25:54.0 (991983991)|CODE_UNIT_FINISHED|TRIGGERS

Salesforce always advocates never doing DML or SOQL in for loops... but here we have a loop that is beyond our control. If I need to do any queries or DML in this trigger, I have to be very careful: if 151 records are being inserted, then a single DML statement per invocation blows through the 150-DML-statement governor limit (see the sketch below).
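
The best mitigation I've found is to lean on the fact that these one-record invocations all share a single execution context, so static state survives between them. A minimal sketch, assuming the trigger delegates to a handler class (the class and method names here are hypothetical):

// Hypothetical handler for LAM_DuplicateRecordItemTrigger.
// Because every one-record trigger invocation runs in the same execution
// context, this static cache is populated by one SOQL query for the whole
// context instead of one query per invocation.
public class LAM_DuplicateRecordItemHandler {
    private static Map<Id, DuplicateRecordSet> cachedSets;

    public static Map<Id, DuplicateRecordSet> getDuplicateSets() {
        if (cachedSets == null) {
            cachedSets = new Map<Id, DuplicateRecordSet>(
                [SELECT Id, RecordCount FROM DuplicateRecordSet]
            );
        }
        return cachedSets;
    }
}

The same idea can buffer DML in a static list and defer the actual insert to as late as possible, though knowing when you're on the final invocation is the hard part.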

Has anybody else experienced this?