Another DBMS_Scheduler Chain Rule Issue

Following on from a recent adventure with a problem validating DBMS_Scheduler chain rules, I hit another issue today.

A rule was defined with a step name that does not exist. This happened because there is an odd limit of chain step name lengths (27 bytes I think), and the name of the step in the define_step call was modified from STP_EXP_SEND_FILES_TO_PROBES to STP_EXP_SEND_FILES_TO_PR after the rule code was written to reference STP_EXP_SEND_FILES_TO_PROBES.

When the job was executed the scheduler made repeated efforts to start a job_subname of STP_EXP_SEND_FILES_TO_PROBES_0 (that’s a zero on the end) approximately every 0.005 second, with the job run details reporting a status of FAILED and additional_info of:

CHAIN_LOG_ID=”576839″,STEP_NAME=”STP_EXP_SEND_FILES_TO_PROBES”, ORA-01403: no data found.

The job had to be stopped using FORCE=TRUE, dropped and recreated.

When the rule was corrected the job ran correctly.

Of course this has left us with rather large SYS.SCHEDULER$_JOB_RUN_DETAILS, SYS.SCHEDULER$_EVENT_LOG and SYS.SCHEDULER$_INSTANCE_PK segments, which is not ideal. We’ll see if support will let us run a shrink space on them.

So, to the lessons …

  1. Do not rely on Oracle validating rule and step modifications.
  2. Watch the first execution of a job on which you have modified the rules like a hawk in case you forgot point 1.

Solving ORA-24172: rule set %s.%s has errors

DBMS_Scheduler chains are a joy to use, until they stop being a joy and start being a real pain.

I modified the logic of a process, dropping one stored procedure and replacing it with another (instead of writing out a list of files to a control file so that a cron job can scp them to a set of other servers, we’re now executing scp as a DBMS_Scheduler executable-type job).

Calling the new procedure instead of the old one required dropping the obsolete DBMS_Scheduler program, the chain step, and the associated rules, and creating a new program, step and rules.

The old rules, step and program were dropped, and the new program and step were created.

However when creating the first of the new rules the following error message was raised:

begin
 DBMS_SCHEDULER.DEFINE_CHAIN_RULE (
 chain_name              => 'EXPORT',
 condition               => 'STP_EXP_SEND_FILES_TO_PR SUCCEEDED',
 action                  => 'START STP_RM_OLD_PROBE_FILES',
 rule_name               => 'START_RM_OLD_PROBE_FILES'
 );
end;
/

ORA-24172: rule set NCBI_ETL.SCHED_RULESET$4 has errors
ORA-06512: at "SYS.DBMS_ISCHED", line 1286
ORA-06512: at "SYS.DBMS_SCHEDULER", line 1257
ORA-06512: at line 2
24172. 00000 -  "rule set %s.%s has errors"
*Cause:    The rule references variables not in the evaluation context.
*Action:   Modify the rule condition to remove illegal reference.

The error is another of those where the majority of Google hits are dumb listings of the error code and message, and the “cause” and “action”.

The syntax and names in the call to DBMS_Scheduler.Define_Chain_Rule were all validated.

I exported the rule set associated with the chain:

select dbms_metadata.get_ddl('PROCOBJ','SCHED_RULESET$4')
from dual;

This told me the name of the evaluation context, which I then exported with:

select dbms_metadata.get_ddl('PROCOBJ','SCHED_EV_CTX$6')
from dual;

On checking the evaluation context I found that the newly-added chain step was present:

vt(10) := sys.re$variable_type('"STP_EXP_SEND_FILES_TO_PR"','sys.scheduler$_job_step_type','dbms_scheduler.get_job_step_cf',NULL);

Mysterious stuff!

I flirted with the idea of dropping the chain entirely, after exporting the chain and steps with:

select dbms_metadata.get_ddl('PROCOBJ','EXPORT')
from dual;

… and the rules (why are they not included in the chain definition?) with:

select 'exec DBMS_SCHEDULER.DEFINE_CHAIN_RULE ('''||chain_name||''','''||condition||''','''||action||''','''||rule_name||''','''||comments||''');'
from user_scheduler_chain_rules
where chain_name = 'EXPORT';

However, when I exported the rule set I noticed that there was still a rule defined with the name of the step I had previously dropped — it was the “what to do if anything fails” rule, and listed all the steps of the chain except the one that sends the chain failure email.

I dropped and created that rule with reference to failure of the new rule instead of the old one, and the new replacement rule that had previously failed now created correctly.

What I Learned

It appears that a chain step can be dropped in 10.2.0.4 even if it is referenced by a rule, but this may leave an “orphan” rule that will cause an ORA-24172 to be raised when rules are defined later on. Well, there may be other or more complete explanations, but the error that is raised may not be complaining about the rule you are trying to create — there may be another at fault.