Friday, January 23, 2009

Data Guard Lag Time

We use the Broker to administer our Data Guard configurations, and we use OEM to monitor our databases. We have found that every time we change the state of the Physical Standby from ONLINE to READ-ONLY and back to ONLINE again, we start getting OEM alarms for reaching our set limits for Lag Time (the time in seconds that the Physical Standby is behind the Primary database). This Lag Time can also be found in v$dataguard_stats (in fact, that is where OEM gets it from).

What we found is that we need to bounce the Physical Standby to force v$dataguard_stats to be updated with the correct values. The good news is that the Physical Standby does not really have a lag time; v$dataguard_stats simply does not update itself when the DG states change. The bad news is that OEM shows the faulty lag time and generates an alarm, and since our alarms are visible to the whole organization, we have to explain that it is not really a problem.

To confirm that the problem lies in v$dataguard_stats and that the reported lag is not real, compare TIME_COMPUTED and the standby's current SCN with the primary as follows:


On Standby:

SQL>select TIME_COMPUTED from v$dataguard_stats;

TIME_COMPUTED
------------------------------
09-JAN-2009 09:45:13
09-JAN-2009 09:45:13
09-JAN-2009 09:45:13
09-JAN-2009 09:45:13
09-JAN-2009 09:45:13

SQL>select current_scn from v$database;

CURRENT_SCN
--------------------------
9661858803219

On Primary:

SQL>select scn_to_timestamp(9661857384219) from dual;

SCN_TO_TIMESTAMP(9661857384219)
---------------------------------------------------------
12-JAN-09 12.26.57.000000000 PM

SQL>select SCN_TO_TIMESTAMP(current_scn) from v$database;

SCN_TO_TIMESTAMP(CURRENT_SCN)
---------------------------------------------------------
12-JAN-09 12.30.00.000000000 PM

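So the standby is at most about three minutes behind the primary (12:26:57 versus 12:30:00), while TIME_COMPUTED in v$dataguard_stats is stuck at 09-JAN-2009, three days earlier, and OEM shows a lag accordingly. Bouncing the Physical Standby database fixed this.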
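For anyone who wants to repeat the check, here is a rough sketch of the same comparison as two queries. It assumes SQL*Plus, and standby_scn is just a substitution variable name I made up: enter the CURRENT_SCN value returned on the standby when prompted. The filter on 'apply lag' and 'transport lag' assumes those statistic names exist in your version of v$dataguard_stats.

```sql
-- On the standby: show each lag statistic next to the time it was computed.
-- A TIME_COMPUTED that is hours or days old suggests the view is stale.
select name, value, time_computed
from   v$dataguard_stats
where  name in ('apply lag', 'transport lag');

-- On the primary: how far the standby's SCN is behind the primary's
-- current SCN, expressed as an interval.
-- &standby_scn is a hypothetical SQL*Plus substitution variable.
select scn_to_timestamp(current_scn) - scn_to_timestamp(&standby_scn) as real_lag
from   v$database;
```

If real_lag is small while the VALUE in v$dataguard_stats is large and TIME_COMPUTED is old, you are most likely looking at the stale view rather than a real lag.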

1 comment:

Anonymous said...

Martin
Good find. This was supposed to be solved in 11.1.0.7, but as usual it has slipped through the net.
I plan to log an SR when I get time, as this isn't a good enough situation, especially if you are relying on DG stats for monitoring. There is nothing worse than an event that gets ignored because the software cries wolf.
Thanks
Martin II