Thursday, January 26, 2012

Resolving Gaps in Data Guard Apply Using Incremental RMAN Backup

I found quite a few blog entries on the web and off course the Oracle Documentation also helped. The reason I want to create a post is that I encountered 2 problems that I first had to overcome before the Physical Standby was in sync again. A sort overview of the problem and the steps to overcome them are:

On the standby get the current scn which will be used in your RMAN backup script as a starting point. 

SQL>select current_scn FROM V$DATABASE;

     CURRENT_SCN
----------------
11133157117269


 On the primary run the RMAN script with the supplied current_scn number from the standby 

run {
allocate channel c1 type disk format '/dump/abcprd/%U.rmb';
backup as compressed backupset skip readonly incremental from scn 11133157117269 database;

And this is were I got my first unexpected problem. The RMAN backup gave the following warning message:

RMAN-06755: WARNING: datafile 408: incremental-start SCN is too recent; using checkpoint SCN 9733080640801 instead

File 408 is a Read Only Tablespace with a much older scn number. V$SESSION_LONGOPS showed that the backup will take longer than 24 hours to complete, so I immediately stopped it and reset the Read Only Tablespace's scn number with the aim that it will shorten the duration of the RMAN backup. 
SQL> alter tablespace SA_ORD_RO read write ;
SQL> alter tablespace SA_ORD_RO read only ;

I restarted the RMAN backup, it did not give the warning again, but still took 16 hours to complete. All files created by the backup were copied to the standby server. I also created a new standby controlfile:
RMAN> backup current controlfile for standby format '/usr/users/oracle/ForStandbyCTRL.bck';

On the standby server I cataloged and then restored the controlfile 
RMAN> catalog start with ' /dump/backup/';
RMAN> restore standby controlfile from '/dump/backup/ForStandbyCTRL.bck';


And the recover script:
RMAN> catalog start with ' /dump/abcprd/';
RMAN> recover database noredo;

But the Incremental backup failed almost immediately with the error

RMAN-06094: datafile 569 must be restored

It turns out that datafile 569 was created on the primary after the gap occured, but before the rman incremental was run. So the controlfile was aware of the datafile, but the file was not on the standby server. I also need to do a datafile backup for the newly created datafile. So back to Primary:
RMAN> backup datafile 569;
scp the file to the standby server and restore it on the standby server:
RMAN> restore datafile 569;

This time I could start the RMAN> recover database noredo; again and it completed successfully.


DB link from Oracle to SQL Server

We had a need for a DB link from an Oracle database to a SQL Server database. I initially feared that this might be a huge task, but it turned out to be quite a easy setup. Thanks Oracle for making it almost seamless!

There are basically 4 steps involved 

1.) Install & configure the Gateway

The Oracle Gateway can be found in the Oracle Database 11g Release 2 Enterprise Edition software. Installation is straight forward. You basically have to make 2 choices during installation. What Gateway you want to install (SQL Server in our case) and which ORACLE_HOME you want to use. You can create a separate Gateway home or just use the current ORACLE_HOME, which is what we did.

After installation you will find a new directory in ORACLE_HOME called dg4msql. Configure the Gateway by creating a agent init.ora file in $ORACLE_HOME/dg4msql/admin. The name of the file is important, as the name you use will be used in your listener as well. We kept it to the default initdg4msql.ora. The information you need to supply in the agent init.ora file are SQL Server Database Server Host Name, SQL Server Database Server Port number and SQL Server Database. The information all goes into one parameter called HS_FDS_CONNECT_INFO.

Ex:
# This is a customized agent init file that contains the HS parameters
# that are needed for the Database Gateway for Microsoft SQL Server

#
# HS init parameters
#
# HS_FDS_CONNECT_INFO=[server_name]:port//mssql_db
HS_FDS_CONNECT_INFO=[ABC02.company.co.za]:1433//MSCIM
# alternate connect format is hostname/serverinstance/databasename
HS_FDS_TRACE_LEVEL=OFF
HS_FDS_RECOVERY_ACCOUNT=RECOVER
HS_FDS_RECOVERY_PWD=RECOVER

2.) Listener & names lookup
We used port 1524 to setup the gateway

LISTENER =
  (ADDRESS_LIST =
    (ADDRESS = (PROTOCOL = tcp)(HOST = abc01.company.co.za)(PORT = 1521))
    (ADDRESS = (PROTOCOL = tcp)(HOST = localhost)(PORT = 1524))
  )

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = dg4msql)
      (ORACLE_HOME = /opt/apps/oracle/product/11.2.0/dbhome_1)
      (PROGRAM = dg4msql)
    )
  )

And Local Naming

MSCIM =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = tcp)(HOST = localhost)(PORT = 1524))
    (CONNECT_DATA =
      (SID = dg4msql)
    )
    (HS = OK)
  )

HS=OK - indicates this connect descriptor is using heterogeneous Service.

3.) Database Link
Once the SQL Server team gave us a username and password we could just create a DB link

CREATE PUBLIC DATABASE LINK MSCIM_DBLINK CONNECT TO 'mssql_username' IDENTIFIED BY 'mssql_password' USING 'MSCIM'

4.) Data dictionary translation support
Oracle has also supplied a script in $ORACLE_HOME/ dg4msql/admin/ called dg4msql_cvw.sql that must be run on the SQL Server side. The script creates views on SQL Server for usage by the Oracle Data Dictionary. 4 views called ALL_IND_COLUMNS, USER_IND_COLUMNS, ALL_CONS_COLUMNS, and USER_CONS_COLUMNS will be created on the SQL Server db making it easier for someone familiar with Oracle Data Dictionary views to query the SQL Server database.