
Monday, February 4, 2019

UNLOCK user in ADG


In our production databases, all users have FAILED_LOGIN_ATTEMPTS set to 5 on their PROFILES. After 5 incorrect passwords their account will LOCK. When that happens in the Primary Database the value for DBA_USERS.ACCOUNT_STATUS will be set to LOCKED.
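
To check or set that limit, something like the following can be used (a sketch; DEFAULT is just an example profile name):

SQL> select profile, limit from dba_profiles where resource_name = 'FAILED_LOGIN_ATTEMPTS';
SQL> alter profile DEFAULT limit FAILED_LOGIN_ATTEMPTS 5;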

But in an Active Data Guard (ADG) environment it works a bit differently. If the user LOCKED his account on the ADG instance, he will still get the error “ORA-28000: the account is locked” when trying to connect, but DBA_USERS.ACCOUNT_STATUS will have a value of OPEN, because that value is replicated from the Primary and is not a true reflection of the account status on the Standby. Oracle created the V$RO_USER_ACCOUNT view to help with troubleshooting the account status on an ADG instance.

See the output of V$RO_USER_ACCOUNT below for a user that LOCKED his account on an ADG instance after 6 failed logins:

@>conn meyermar/x
ERROR:
ORA-28000: the account is locked

When logged in as SYS:
SQL>select DATABASE_ROLE, OPEN_MODE from v$database;

DATABASE_ROLE    OPEN_MODE
---------------- --------------------
PHYSICAL STANDBY READ ONLY WITH APPLY

SQL> select ro.* from v$ro_user_account ro, dba_users u where username='MEYERMAR' and USER_ID=USERID;

    USERID PASSW_EXPIRED PASSW_IN_GRACE PASSW_LOCKED PASSW_LOCK_UNLIM FAILED_LOGINS EXPIRATION PASSW_LOCK
---------- ------------- -------------- ------------ ---------------- ------------- ---------- ----------
       131             0              0            0                1             6


Users LOCKED in ADG can be unlocked again with a simple ALTER USER X ACCOUNT UNLOCK command. But this week we found that it did not work in one of our ADG databases.

STANDBY DB> alter user puleo account unlock;
alter user puleo account unlock
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-16000: database open for read-only access
ORA-06512: at line 1282
ORA-16000: database open for read-only access

I had not seen this error in an 11.2.0.4 ADG DB before; the UNLOCK USER normally works in ADG, so I traced my session doing the UNLOCK:

STANDBY DB>alter session set max_dump_file_size=UNLIMITED;
Session altered.
STANDBY DB>alter session set tracefile_identifier=SQLERROR_10046;
Session altered.
STANDBY DB>alter session set events '10046 trace name context forever, level 12';
STANDBY DB> alter user puleo account unlock;
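
To locate the trace file written for this session, v$diag_info can be queried (a sketch; the view exists from 11g onward, which covers this 11.2.0.4 DB):

STANDBY DB> select value from v$diag_info where name = 'Default Trace File';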

The trace file showed an “update to seg$” for the object GGS_DDL_SEQ, which is a GoldenGate sequence. The Primary DB is using GoldenGate with DDL replication enabled. That is the difference between this ADG instance and all other ADG instances where the USER UNLOCK works.

I found 2 solutions to temporarily disable the GoldenGate DDL replication trigger in order to UNLOCK the USER in ADG.

Either:
PRIMARY DB> alter trigger sys.GGS_DDL_TRIGGER_BEFORE disable ;
STANDBY DB> alter user puleo account unlock;
PRIMARY DB> alter trigger sys.GGS_DDL_TRIGGER_BEFORE enable ;
Or:
STANDBY DB> alter system set "_system_trig_enabled"=FALSE;
STANDBY DB> alter user puleo account unlock;
STANDBY DB> alter system set "_system_trig_enabled"=TRUE;

I decided to log an SR with Oracle for the issue, with test cases and the solution, as it would be great to have ADG behave the same in all circumstances. GoldenGate with DDL replication enabled should not influence the Standby in this way. Feedback before the engineer closed the call was that the following GoldenGate enhancement had been logged:
Enh 22198588 - DDL CHANGES IN DG BLOCKED BY GG GGS_DDL_TRIGGER_BEFORE

Wednesday, October 3, 2018

Data Guard Failover surprise

Ran into an interesting scenario after reinstating the previous primary DB following an emergency failover.

A Data Guard failover is done during emergencies (an actual DR situation) and the old standby gets activated as a Primary database with RESETLOGS after an incomplete recovery. 
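
In a Broker-managed configuration the failover itself is a single DGMGRL command; a sketch (connected to the standby being promoted, here TMSBY):

DGMGRL> failover to TMSBY;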

Here's the alert.log from the new Primary database:

Tue Oct 02 16:07:34 2018
 Data Guard Broker: A failover will now be attempted
Tue Oct 02 16:07:34 2018
ALTER DATABASE FAILOVER TO TMSBY
...
Tue Oct 02 16:07:40 2018
ALTER DATABASE SWITCHOVER TO PRIMARY (TMSBY1)
RESETLOGS after incomplete recovery UNTIL CHANGE 14297961467966 time 10/02/2018 16:02:10
...
Tue Oct 02 16:07:40 2018
Setting recovery target incarnation to 3
Tue Oct 02 16:07:41 2018
Switchover: Complete - Database mounted as primary

A few hours later the old Primary DB (TMPRD) was fixed, started in MOUNT mode and reinstated via the Broker:

DGMGRL> show configuration
Configuration - tmdg
  Protection Mode: MaxPerformance
  Members:
  TMSBY  - Primary database
    TMPRD - Physical standby database (disabled)
      ORA-16661: the standby database needs to be reinstated
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS   (status updated 26 seconds ago)
DGMGRL> reinstate database TMPRD
Reinstating database "TMPRD", please wait...
Reinstatement of database "TMPRD" succeeded

Reinstatement completed successfully, but the new standby (TMPRD) was not applying logs; the alert.log confirmed that it was waiting for an archivelog that had only been created on the previous standby site.

Media Recovery Waiting for thread 2 sequence 358 branch(resetlogs_id) 987409798
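
A missing range like this can also be confirmed from the standby with v$archive_gap (a sketch; the view returns no rows once the gap is resolved):

SQL> select thread#, low_sequence#, high_sequence# from v$archive_gap;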

But that archivelog had already been backed up and deleted via RMAN. A straightforward restore on TMSBY should have worked, but it failed with:

RMAN> restore archivelog sequence 358 thread 2;
Starting restore at 03-OCT-18 08:40
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=232 instance=TMSBY1 device type=DISK
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 10/03/2018 08:40:15
RMAN-20242: specification does not match any archived log in the repository

Remember that the failover forced a RESETLOGS, so the incarnation of the DB changed. The trick is:

RMAN> list incarnation;
List of Database Incarnations
DB Key  Inc Key DB Name  DB ID            STATUS  Reset SCN  Reset Time
------- ------- -------- ---------------- --- ---------- ----------
1       1       TMPRD   645377540        PARENT  1594143    31-AUG-18 10:37
2       2       TMPRD   645377540        PARENT  5368114    21-SEP-18 08:29
3       3       TMPRD   645377540        CURRENT 14297961467967 02-OCT-18 16:07

RMAN> restore archivelog sequence 358 thread 2 Incarnation 2;

Starting restore at 03-OCT-18 08:44
using channel ORA_DISK_1
channel ORA_DISK_1: starting archived log restore to default destination
channel ORA_DISK_1: restoring archived log
archived log thread=2 sequence=358
channel ORA_DISK_1: reading from backup piece +DG_FRA/TMSBY/BACKUPSET/2018_10_02/annnf0_TMSBY1_arch_0.337.988488123
channel ORA_DISK_1: piece handle=+DG_FRA/TMSBY/BACKUPSET/2018_10_02/annnf0_TMSBY1_arch_0.337.988488123 tag=TMSBY1_ARCH
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:15
Finished restore at 03-OCT-18 08:45

And within a few minutes the restored archivelog gets picked up by the FAL process, transferred and applied on the new standby (TMPRD). Alert.log contents:

RFS[3]: Opened log for thread 2 sequence 358 dbid 645377540 branch 987409798
...
Archived Log entry 1627 added for thread 2 sequence 358 rlc 987409798 ID 0x0 dest 2:
...
Media Recovery Log +DG_ARCHIVE/TMPRD/ARCHIVELOG/2018_10_03/thread_2_seq_358.49304.988533935

All in all I was impressed with the smooth failover and reinstatement via the Broker, but even then, not everything can be automated and there's always some hiccup where the DBA has to roll up their sleeves and dig in to get things back to normal again.

Monday, September 17, 2018

ORA-16047 - DGID mismatch between destination setting and target database

A few days after a planned DG switchover I got a call from the DBAs. OEM and v$dataguard_stats showed no lag, but archives had not shipped from the Primary DB to the Physical Standby DB since the switchover.


After confirming their suspicions by querying v$archived_log and v$dataguard_stats, I went to the 3 primary sources for DG troubleshooting.
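
The confirmation queries were along these lines (a sketch; run the first on the Primary and the second on the Standby):

SQL> select dest_id, thread#, max(sequence#) from v$archived_log group by dest_id, thread#;
SQL> select name, value from v$dataguard_stats where name in ('transport lag', 'apply lag');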

1.) DGMGRL

the command "show configuration" indicated that the Standby database was disabled.

2.) Primary and Standby Alert.log files

The Primary DB had this peculiar error in its Alert.log file:
Fri Sep 14 13:55:31 2018
ORA-16047: DGID mismatch between destination setting and target database

Using the oerr utility I found the full description of the error:

Error:  ORA-16047 DGID mismatch between destination setting and target database 
---------------------------------------------------------------------------
Cause:  The DB_UNIQUE_NAME specified for the destination did not match the 
 DB_UNIQUE_NAME at the target database. 
Action: Make sure the DB_UNIQUE_NAME specified in the LOG_ARCHIVE_DEST_n 
 parameter matches the DB_UNIQUE_NAME parameter defined at the 
 destination. 

But the DB_UNIQUE_NAME was correct in the Primary's LOG_ARCHIVE_DEST_2 parameter.
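
Both values can be eyeballed directly on the Primary with show parameter (a sketch):

SQL> show parameter log_archive_dest_2
SQL> show parameter log_archive_config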

Next I looked at the Standby DB's Alert.log file, especially around the timestamp of the ORA-16047 error in the Primary DB's Alert.log. It seems that the Broker was not happy and decided to deconfigure the LOG_ARCHIVE_CONFIG parameter!

Fri Sep 14 13:54:23 2018
ALTER SYSTEM SET log_archive_config='nodg_config' SCOPE=BOTH;
...
Completed: Data Guard Broker shutdown
Fri Sep 14 13:54:59 2018
ALTER SYSTEM SET dg_broker_start=FALSE SCOPE=BOTH;
ALTER SYSTEM SET dg_broker_start=TRUE SCOPE=BOTH;


3.) Primary and Standby DG Alert log files

Let's look at the DG Alert.log files around the same timestamp.
The Standby DG Alert.log had hundreds of repeating ORA-16501 error messages until the timestamp in question when we got:

...
Failed to send message to site pc3dfmw2tc. Error code is ORA-16501.
...
09/14/2018 13:54:06
REMOVE CONFIGURATION [PRESERVE DESTINATIONS]

So I was correct, the Broker was not happy (as can be seen by the repeated ORA-16501 errors) and eventually deconfigured the LOG_ARCHIVE_CONFIG parameter!

Error:  ORA-16501 the Data Guard broker operation failed
---------------------------------------------------------------------------
Cause:  the Data Guard broker operation failed
Action: See accompanying messages for details.

Not a lot of info for the ORA-16501 error; it seems to be a generic error. Investigating further back in the Standby's DG Alert.log file, I found that the switchover had completed with errors. The DBA was satisfied that the switchover worked, but did not do due diligence in investigating the accompanying error, and a few hours later DG stopped shipping Archivelogs altogether.
The solution was to correct the log_archive_config setting to what it was before the switchover and then to re-enable the Standby database in the DG Broker config. That fixed the lag; archives are now transported and applied to the Standby DB. That also corrected the apply lag reported in v$dataguard_stats. The Standby DB is now in sync and is usable in case of a DR, but further troubleshooting is necessary to determine why the Broker configuration was not 100% happy in the first place.
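
The fix looked roughly like this (a sketch; prim_un and stby_un are placeholders for the real DB_UNIQUE_NAMEs, which are not shown in this post). The setting is corrected on the database where the Broker had changed it to nodg_config (the Standby here), then the Standby is re-enabled in the Broker:

SQL> alter system set log_archive_config='dg_config=(prim_un,stby_un)' scope=both;
DGMGRL> enable database 'stby_un';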



Tuesday, November 26, 2013

Recover after a failed switchover


A failed switchover attempt left both databases with a DATABASE_ROLE of PHYSICAL STANDBY.
The steps I followed to recover from the failure and to change the old PRIMARY back to PRIMARY were as follows:

SQL>shutdown immediate
ORA-01092: ORACLE instance terminated. Disconnection forced

Login to sqlplus again as sysdba and do a startup mount

SQL>startup mount
ORACLE instance started.

Total System Global Area 6.8947E+10 bytes
Fixed Size                  2189520 bytes
Variable Size            2.0670E+10 bytes
Database Buffers         4.8050E+10 bytes
Redo Buffers              225488896 bytes
Database mounted.

SQL>SELECT DATABASE_ROLE, status,OPEN_MODE,DB_UNIQUE_NAME, PROTECTION_MODE, PROTECTION_LEVEL, SWITCHOVER_STATUS
FROM V$DATABASE, V$INSTANCE
/


DATABASE_ROLE    STATUS       OPEN_MODE            DB_UNIQUE PROTECTION_MODE      PROTECTION_LEVEL     SWITCHOVER_STATUS
---------------- ------------ -------------------- --------- -------------------- -------------------- --------------------
PHYSICAL STANDBY MOUNTED      MOUNTED              casprd    MAXIMUM PERFORMANCE  MAXIMUM PERFORMANCE  RECOVERY NEEDED


Cancel the MRP process. You may receive “ORA-16136: Managed Standby Recovery not active” if the MRP process is not running.
SQL>alter database recover managed standby database cancel;
alter database recover managed standby database cancel
*
ERROR at line 1:
ORA-16136: Managed Standby Recovery not active

Terminate the current switchover to Standby that never completed fully.  
SQL>alter database recover managed standby database finish;

Now switchover to Primary.
SQL>alter database commit to switchover to primary with session shutdown;

Open the database.
SQL>alter database open;

 Confirm DATABASE_ROLE
SQL>SELECT DATABASE_ROLE, status,OPEN_MODE,DB_UNIQUE_NAME, PROTECTION_MODE, PROTECTION_LEVEL, SWITCHOVER_STATUS
FROM V$DATABASE, V$INSTANCE
/


DATABASE_ROLE    STATUS       OPEN_MODE            DB_UNIQUE PROTECTION_MODE      PROTECTION_LEVEL     SWITCHOVER_STATUS
---------------- ------------ -------------------- --------- -------------------- -------------------- --------------------
PRIMARY          OPEN         READ WRITE           casprd    MAXIMUM PERFORMANCE  MAXIMUM PERFORMANCE  TO STANDBY

Wednesday, September 18, 2013

Drop Primary logfiles on Physical Standby Database

So we migrated a Physical Standby Database from RAW devices to ASM. After the move I still had to move the Online Redo log files:

SQL>select * from v$logfile;

    GROUP# STATUS  TYPE    MEMBER                             IS_
---------- ------- ------- ----------------------------------- ---
         1         ONLINE  /dev/vx/rdsk/redovmdg/casprd_REDO1  NO
         1         ONLINE  /dev/vx/rdsk/redovmdg/casprd_REDO1b NO
         2         ONLINE  /dev/vx/rdsk/redovmdg/casprd_REDO3  NO
         2         ONLINE  /dev/vx/rdsk/redovmdg/casprd_REDO3b NO
         3         ONLINE  /dev/vx/rdsk/redovmdg/casprd_REDO5  NO
         3         ONLINE  /dev/vx/rdsk/redovmdg/casprd_REDO5b NO

First step was to change STANDBY_FILE_MANAGEMENT from AUTO to MANUAL:

SQL>alter system set standby_file_management='MANUAL';

System altered.

But simply dropping a logfile group was not so straightforward:

SQL>ALTER DATABASE DROP logfile group 1;
ALTER DATABASE DROP logfile group 1
*
ERROR at line 1:
ORA-01624: log 1 needed for crash recovery of instance castc (thread 1)
ORA-00312: online log 1 thread 1: '/dev/vx/rdsk/redovmdg/casprd_REDO1'
ORA-00312: online log 1 thread 1: '/dev/vx/rdsk/redovmdg/casprd_REDO1b'

Check the STATUS of the Online Redo log Group:

SQL>SELECT GROUP#, STATUS FROM V$LOG;

    GROUP# STATUS
---------- ----------------
         1 CLEARING
         2 CLEARING
         3 CURRENT

If the status is CLEARING or CURRENT then you cannot drop the Online Redo log Group. The STATUS needs to be CLEARED, UNUSED or INACTIVE.

SQL>ALTER DATABASE CLEAR LOGFILE GROUP 1;

Database altered.

SQL>SELECT GROUP#, STATUS FROM V$LOG;

    GROUP# STATUS
---------- ----------------
         1 UNUSED
         2 CLEARING
         3 CURRENT

SQL>ALTER DATABASE DROP LOGFILE GROUP 1; 
SQL>alter database add logfile group 1 ('+PCASDG1','+PCASDG1');

Repeat for all groups and check:

SQL>select * from v$logfile;

    GROUP# STATUS  TYPE    MEMBER                                IS_
---------- ------- ------- -------------------------------------- ---
         1         ONLINE  +PCASDG1/castc/onlinelog/group_1.1148.826447293 NO
         1         ONLINE  +PCASDG1/castc/onlinelog/group_1.1149.826447293 NO
         2         ONLINE  +PCASDG1/castc/onlinelog/group_2.1153.826447427 NO
         2         ONLINE  +PCASDG1/castc/onlinelog/group_2.1152.826447431 NO
         3         ONLINE  +PCASDG1/castc/onlinelog/group_3.1180.826447635 NO
         3         ONLINE  +PCASDG1/castc/onlinelog/group_3.1181.826447641 NO



Wednesday, September 5, 2012

ORA-16714 in a RAC environment

I had to set up a physical standby in a RAC environment. The primary database is a two node RAC cluster, but the standby environment is a single instance database.
During setup and configuration of the Data Guard Broker, we got the following error: "ORA-16608: one or more sites have warnings".

1. In DGMGRL, the command line interface of the Broker, you check the configuration status using "show configuration". The Broker gives you the overall status at the end of the show configuration output; to produce it, the Broker queries the current status from all servers involved, which includes all instances in the RAC configuration on the primary as well as the physical standby.

DGMGRL> show configuration;
Configuration - ssodg
  Protection Mode: MaxPerformance
  Databases:
    ssoprd - Primary database
      Warning: ORA-16792: configurable property value is inconsistent with database setting
    ssodrp - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
WARNING


2. So the problem is with an init.ora parameter on one (or both) RAC nodes. We need to drill down, so "show database ssoprd" will tell us which database has this error and also show which property (or init.ora parameter) it is complaining about.


DGMGRL> show database ssoprd;

Database - ssoprd

  Role:            PRIMARY
  Intended State:  TRANSPORT-ON
  Instance(s):
    ssoprd1
    ssoprd2
      Warning: ORA-16714: the value of property StandbyFileManagement is inconsistent with the database setting
      Warning: ORA-16714: the value of property ArchiveLagTarget is inconsistent with the database setting
      Warning: ORA-16714: the value of property LogArchiveMaxProcesses is inconsistent with the database setting
      Warning: ORA-16714: the value of property LogArchiveMinSucceedDest is inconsistent with the database setting
      Warning: ORA-16714: the value of property LogArchiveTrace is inconsistent with the database setting
      Warning: ORA-16714: the value of property LogArchiveFormat is inconsistent with the database setting

Database Status:
WARNING


3. So just one node in the RAC cluster has a problem and that is ssoprd2. I logged into that node and did a "show parameter standby_file_management" and it showed a value of AUTO. Back in DGMGRL I did "show database verbose ssoprd". The verbose option shows you the DG configured properties for a DB. The StandbyFileManagement property was also set to AUTO, so what was the problem? The command "SHOW DATABASE ssoprd 'InconsistentProperties';" gave me more info.
It showed me the 3 different places where a property/parameter can be set: MEMORY_VALUE, SPFILE_VALUE and BROKER_VALUE. As confirmed earlier, the database MEMORY_VALUE had standby_file_management=AUTO and the BROKER_VALUE had StandbyFileManagement=AUTO; the problem was that the SPFILE_VALUE for node ssoprd2 had a value of NULL.
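
The per-instance SPFILE value can also be checked straight from SQL on the instance in question (a sketch):

SQL> select sid, name, value, isspecified from v$spparameter where name = 'standby_file_management';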

4. Now why would one node in a RAC cluster have a different SPFILE value when the SPFILE is kept in a shared location on ASM? I looked in the $ORACLE_HOME/dbs directory of node ssoprd2 and found a spfilessoprd2.ora file! Even though its contents were *.SPFILE='+DG_DATA/ssoprd/spfilessoprd.ora', which is the shared location, it is still a non-shared spfile in the local $ORACLE_HOME! Just to confirm, I looked into $ORACLE_HOME/dbs on the other node and did not find an SPFILE there. So the solution was simply to shut down node 2, remove the spfile from $ORACLE_HOME/dbs and then start the database on node 2. Note that each node had an initssoprd?.ora file in the $ORACLE_HOME with the contents *.SPFILE='+DG_DATA/ssoprd/spfilessoprd.ora'.
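
The cleanup on node 2 was roughly the following (a sketch of the commands; DB and instance names as used above):

$ srvctl stop instance -d ssoprd -i ssoprd2
$ mv $ORACLE_HOME/dbs/spfilessoprd2.ora $ORACLE_HOME/dbs/spfilessoprd2.ora.bak
$ srvctl start instance -d ssoprd -i ssoprd2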

5. This fixed the issue; it was the DGB not being able to synchronize the value of these various parameters/properties between the various instances in the cluster. In a RAC environment it's a best practice to maintain the SPFILE in a shared location such as ASM storage.
The SHOW CONFIGURATION command from the DGMGRL returned error free.

DGMGRL> show configuration;
Configuration - ssodg
  Protection Mode: MaxPerformance
  Databases:
    ssoprd - Primary database
    ssodrp - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS

DGMGRL> show database ssoprd;
Database - ssoprd
  Role:            PRIMARY
  Intended State:  TRANSPORT-ON
  Instance(s):
    ssoprd1
    ssoprd2
Database Status:
SUCCESS

Thursday, January 26, 2012

Resolving Gaps in Data Guard Apply Using Incremental RMAN Backup

I found quite a few blog entries on the web and of course the Oracle Documentation also helped. The reason I want to create a post is that I encountered 2 problems that I first had to overcome before the Physical Standby was in sync again. A short overview of the problem and the steps to overcome them:

On the standby, get the current SCN, which will be used in your RMAN backup script as a starting point.

SQL>select current_scn FROM V$DATABASE;

     CURRENT_SCN
----------------
11133157117269


On the primary, run the RMAN script with the current_scn value supplied from the standby:

run {
allocate channel c1 type disk format '/dump/abcprd/%U.rmb';
backup as compressed backupset skip readonly incremental from scn 11133157117269 database;
}

And this is where I got my first unexpected problem. The RMAN backup gave the following warning message:

RMAN-06755: WARNING: datafile 408: incremental-start SCN is too recent; using checkpoint SCN 9733080640801 instead

File 408 belongs to a Read Only Tablespace with a much older SCN. V$SESSION_LONGOPS showed that the backup would take longer than 24 hours to complete, so I immediately stopped it and reset the Read Only Tablespace's SCN, with the aim of shortening the duration of the RMAN backup.
SQL> alter tablespace SA_ORD_RO read write ;
SQL> alter tablespace SA_ORD_RO read only ;

I restarted the RMAN backup; it did not give the warning again, but it still took 16 hours to complete. All files created by the backup were copied to the standby server. I also created a new standby controlfile:
RMAN> backup current controlfile for standby format '/usr/users/oracle/ForStandbyCTRL.bck';

On the standby server I cataloged and then restored the controlfile:
RMAN> catalog start with '/dump/backup/';
RMAN> restore standby controlfile from '/dump/backup/ForStandbyCTRL.bck';


And the recover script:
RMAN> catalog start with '/dump/abcprd/';
RMAN> recover database noredo;

But applying the incremental backup (the recover database noredo) failed almost immediately with the error:

RMAN-06094: datafile 569 must be restored

It turns out that datafile 569 was created on the primary after the gap occurred, but before the RMAN incremental was run. So the controlfile was aware of the datafile, but the file was not on the standby server. I also needed to do a datafile backup for the newly created datafile. So back to the Primary:
RMAN> backup datafile 569;
scp the file to the standby server and restore it there:
RMAN> restore datafile 569;

This time I could run RMAN> recover database noredo; again and it completed successfully.
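
A typical follow-up (not covered in the original steps, so treat it as a sketch) is to restart managed recovery on the standby and confirm the lag; the USING CURRENT LOGFILE clause assumes standby redo logs are in place and can be dropped otherwise:

SQL> alter database recover managed standby database disconnect from session using current logfile;
SQL> select name, value from v$dataguard_stats where name in ('transport lag', 'apply lag');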


Friday, September 16, 2011

Logon to Active Data Guard fails with ORA-16000

We have a reporting user used by executives for real time reporting on one of our OLTP databases. Some of the queries were getting resource intensive so we decided to move those queries to our Active Data Guard instance.

But then the reports started failing with
ORA-00604: error occurred at recursive SQL level 1
ORA-16000: database open for read-only access
on database logon. 

I found an old bug in MOS that pointed me in the right direction. After a failed login on Production, the same userid will get this error on the Active Data Guard instance, until the “logon counter” gets reset by a successful logon on production.

On Active Data Guard:
SQL >select * from v$version;

BANNER
------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
...

SQL> select OPEN_MODE, DATABASE_ROLE from v$database;

OPEN_MODE  DATABASE_ROLE
---------- ----------------
READ ONLY  PHYSICAL STANDBY

SQL> conn infoweb/correctpwd
Connected.

On Production:

SQL >select * from v$version;

BANNER
------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
...

SYS@casprd>select OPEN_MODE, DATABASE_ROLE from v$database;

OPEN_MODE  DATABASE_ROLE
---------- ----------------
READ WRITE PRIMARY

now login as infoweb with a wrong password on Production:

SQL>connect infoweb/wrongpwd
ERROR:
ORA-01017: invalid username/password; logon denied

Warning: You are no longer connected to ORACLE.

Back to Active Data Guard, logon as infoweb with the correct password:

SQL> conn infoweb/correctpwd
ERROR:
ORA-00604: error occurred at recursive SQL level 1
ORA-16000: database open for read-only access

Warning: You are no longer connected to ORACLE.

Back to Production, logon as infoweb with the correct password to reset the logon counter:

SQL> conn infoweb/correctpwd
Connected.

Back to Active Data Guard, logon as infoweb with the correct password:

SQL> conn infoweb/correctpwd
Connected.

This is similar to a 10.2.0.4 bug, Bug 5847453  Failed logon counter with read only DB throws ORA-16000 on failed logon [ID 5847453.8]. The note states that the issue was fixed in 10.2.0.5 and 11.1.0.6.

Tuesday, June 22, 2010

Manually Add Datafile

I have a database that runs on RAW devices. Whenever I add a datafile to a tablespace, our UNIX Sys Admins must first create the RAW device on the Primary & Standby servers. The other day they created the RAW device on the standby server but forgot to change the ownership of the RAW device to oracle:dba.
The datafile was successfully added to the primary database, but I found the following error in the alert.log of the standby:

File #323 added to control file as 'UNNAMED00323'.
Originally created as:
'/dev/vx/rdsk/oradg1/casprd_SA_I_CMD10'
Recovery was unable to create the file as:
'/dev/vx/rdsk/oradg1_dg/casprd_SA_I_CMD10'
MRP0: Background Media Recovery terminated with error 1119
Mon Mar 9 23:41:44 2010
Errors in file /opt/apps/oracle/admin/caspp/bdump/caspp_mrp0_15261.trc:
ORA-01119: error in creating database file '/dev/vx/rdsk/oradg1_dg/casprd_SA_I_CMD10'
ORA-27040: file create error, unable to create file
HPUX-ia64 Error: 13: Permission denied
Managed Standby Recovery not using Real Time Apply
Some recovered datafiles maybe left media fuzzy
Media recovery may continue but open resetlogs may fail
Mon Mar 9 23:41:45 2010
Errors in file /opt/apps/oracle/admin/caspp/bdump/caspp_mrp0_15261.trc:
ORA-01119: error in creating database file '/dev/vx/rdsk/oradg1_dg/casprd_SA_I_CMD10'
ORA-27040: file create error, unable to create file
HPUX-ia64 Error: 13: Permission denied
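
That HPUX "Permission denied" points back at the device ownership; the OS-side check and fix on the standby host would be something like this (a sketch, with the device path taken from the error above):

$ ls -lL /dev/vx/rdsk/oradg1_dg/casprd_SA_I_CMD10
$ chown oracle:dba /dev/vx/rdsk/oradg1_dg/casprd_SA_I_CMD10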

Only some v$ views can be queried when a standby database is mounted. So to fix the database side of the problem I did the following.
On production, get the relevant details:

SQL>select ts# from v$tablespace where name = 'SA_I_CMD';
TS#
----------
17
SQL> col name for a80;
SQL> select FILE#, NAME, bytes/1024/1024 MB from v$datafile where ts#=17;
FILE# NAME MB
---------- ----------------------------------------------------- ----------
34 /dev/vx/rdsk/oradg/casprd_SA_I_CMD03 32500
35 /dev/vx/rdsk/oradg/casprd_SA_I_CMD 31144
36 /dev/vx/rdsk/oradg/casprd_SA_I_CMD02 30712
37 /dev/vx/rdsk/oradg/casprd_SA_I_CMD01 32360
38 /dev/vx/rdsk/oradg/casprd_SA_I_CMD04 32500
39 /dev/vx/rdsk/oradg/casprd_SA_I_CMD05 30000
201 /dev/vx/rdsk/oradg/casprd_SA_I_CMD06 20000
266 /dev/vx/rdsk/oradg1/casprd_SA_I_CMD07 32500
307 /dev/vx/rdsk/oradg1/casprd_SA_I_CMD08 32500
311 /dev/vx/rdsk/oradg1/casprd_SA_I_CMD09 32500
323 /dev/vx/rdsk/oradg1/casprd_SA_I_CMD10 32500

Let's look at the same query on the standby side:

SQL> col name for a80
SQL> select FILE#, NAME, bytes/1024/1024 MB from v$datafile where ts#=17;
FILE# NAME MB
---------- ----------------------------------------------- ----------
34 /dev/vx/rdsk/oradg_dg/casprd_SA_I_CMD03 32500
35 /dev/vx/rdsk/oradg_dg/casprd_SA_I_CMD 31144
36 /dev/vx/rdsk/oradg_dg/casprd_SA_I_CMD02 30712
37 /dev/vx/rdsk/oradg_dg/casprd_SA_I_CMD01 32360
38 /dev/vx/rdsk/oradg_dg/casprd_SA_I_CMD04 32500
39 /dev/vx/rdsk/oradg_dg/casprd_SA_I_CMD05 30000
201 /dev/vx/rdsk/oradg_dg/casprd_SA_I_CMD06 20000
266 /dev/vx/rdsk/oradg1_dg/casprd_SA_I_CMD07 32500
307 /dev/vx/rdsk/oradg1_dg/casprd_SA_I_CMD08 32500
311 /dev/vx/rdsk/oradg1_dg/casprd_SA_I_CMD09 32500
323 /opt/apps/oracle/product/10.2.0.4.0/dbs/UNNAMED00323 0

After the UNIX admin corrected the file permissions of the RAW device I had to manually add it as a datafile to the standby database.

SQL>alter system set STANDBY_FILE_MANAGEMENT=MANUAL;
System altered.
SQL>ALTER DATABASE CREATE DATAFILE '/opt/apps/oracle/product/10.2.0.4.0/dbs/UNNAMED00323' as '/dev/vx/rdsk/oradg1_dg/casprd_SA_I_CMD10';
Database altered.
SQL>select FILE#, NAME, bytes/1024/1024 MB from v$datafile where ts#=17;
FILE# NAME MB
---------- ----------------------------------------------------- ----------
34 /dev/vx/rdsk/oradg_dg/casprd_SA_I_CMD03 32500
35 /dev/vx/rdsk/oradg_dg/casprd_SA_I_CMD 31144
36 /dev/vx/rdsk/oradg_dg/casprd_SA_I_CMD02 30712
37 /dev/vx/rdsk/oradg_dg/casprd_SA_I_CMD01 32360
38 /dev/vx/rdsk/oradg_dg/casprd_SA_I_CMD04 32500
39 /dev/vx/rdsk/oradg_dg/casprd_SA_I_CMD05 30000
201 /dev/vx/rdsk/oradg_dg/casprd_SA_I_CMD06 20000
266 /dev/vx/rdsk/oradg1_dg/casprd_SA_I_CMD07 32500
307 /dev/vx/rdsk/oradg1_dg/casprd_SA_I_CMD08 32500
311 /dev/vx/rdsk/oradg1_dg/casprd_SA_I_CMD09 32500
323 /dev/vx/rdsk/oradg1_dg/casprd_SA_I_CMD10 32500
11 rows selected.
SQL>alter system set STANDBY_FILE_MANAGEMENT=AUTO;
System altered.
SQL>ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE;
Database altered.