Monday, September 17, 2018

ORA-16047 - DGID mismatch between destination setting and target database

A few days after a DG planned switchover I got a call from the DBAs. OEM and v$dataguard_stats shows no lag, but archives have not shipped from the Primary DB to the Physical Standby DB since the switchover.


After confirming their suspicions by querying V$archived_log and v$dataguard_stats I went to the 3 primary sources for DG troubleshooting.

1.) DGMGRL

the command "show configuration" indicated that the Standby database was disabled.

2.) Primary and Standby Alert.log files

The Primary DB had this peculiar error in its Alert.log file:
Fri Sep 14 13:55:31 2018
ORA-16047: DGID mismatch between destination setting and target database

using the oerr utility I found the full descrition of the error:

Error:  ORA-16047 DGID mismatch between destination setting and target database 
---------------------------------------------------------------------------
Cause:  The DB_UNIQUE_NAME specified for the destination did not match the 
 DB_UNIQUE_NAME at the target database. 
Action: Make sure the DB_UNIQUE_NAME specified in the LOG_ARCHIVE_DEST_n 
 parameter matches the DB_UNIQUE_NAME parameter defined at the 
 destination. 

But the DB_UNIQUE_NAME was correct in the Primary's LOG_ARCHIVE_DEST_2 parameter.

Next I looked at the Standby DBs Alert.log file, especially around the timestamp of the ORA-16047 error in the Primary DBs Alert.log. It seems that the Broker was not happy and decided to deconfigure the LOG_ARCHIVE_CONFIG parameter!

Fri Sep 14 13:54:23 2018
ALTER SYSTEM SET log_archive_config='nodg_config' SCOPE=BOTH;
...
Completed: Data Guard Broker shutdown
Fri Sep 14 13:54:59 2018
ALTER SYSTEM SET dg_broker_start=FALSE SCOPE=BOTH;
ALTER SYSTEM SET dg_broker_start=TRUE SCOPE=BOTH;


3.) Primary and Standby DG Alert log files

Lets look at the DG Alert.log files around the same timestamp.
The Standby DG Alert.log had hundreds of repeating ORA-16501 error messages until the timestamp in question when we got:

...
Failed to send message to site pc3dfmw2tc. Error code is ORA-16501.
...
09/14/2018 13:54:06
REMOVE CONFIGURATION [PRESEVERE DESTINATIONS]

So I was correct, the Broker was not happy (as can be seen by the repeated ORA-16501 errors) and eventually deconfigured the LOG_ARCHIVE_CONFIG parameter!

Error:  ORA-16501 the Data Guard broker operation failed
---------------------------------------------------------------------------
Cause:  the Data Guard broker operation failed
Action: See accompanying messages for details.

Not a lot of info for the ORA-16501 error, it seems to be a generic error. Investigating further back in the Standby's DG Alert.log file I found that the switchover completed with errors. The DBA was satisfied that the switchover worked, but did not do due diligence in investigating the accompanied error, and a few hours later DG stopped shipping Archivelogs altogether.
The solution was to correct the log_archive_config setting to what it was before the switchover, then to re-enable the Standby database in the DG Broker config. That fixed the lag, archives are now transported and applied to the Standby DB. That also corrected the Apply Lag reported in v$dataguard_stats.The Standby DB is now in sync and is usable in case of a DR, but further troubleshooting is necessary to determine why the Broker configuration was not 100% happy in the first place. 



No comments: