Friday, December 12, 2008

ORA-17505 using RMAN with ASM

I had to create a Data Guard instance on test so that we could evaluate Transparent Application Failover (TAF) for the app. I wanted to test the same procedure for building Data Guard that I will use in Production, which meant that I had to do an RMAN backup to ASM. Our Storage team loaned me LUNs of different sizes for my temporary ASM on test.

But my backup command (Backup filesperset 10 database include current controlfile for standby) failed with:

RMAN-03009: failure of backup command on ORA_DISK_13 channel at 10/30/2008 20:21:27
ORA-19510: failed to set size of 5779210 blocks for file "+PCASDGF" (blocksize=8192)
ORA-17505: ksfdrsz:1 Failed to resize file to size 5779210 blocks

This was a surprise because the test db is 7TB and the FRA is 9TB.


SQL> SELECT sum(space), sum(bytes) FROM v$asm_file;
and
SQL> SELECT * FROM V$FLASH_RECOVERY_AREA_USAGE;


confirmed that I had more than enough space available. In fact, the backup failed after using only 47% of the available space.

It turns out that the smallest disk in the diskgroup was the bottleneck.


SQL> select group_number, disk_number, total_mb, free_mb from v$asm_disk order by 4;

GROUP_NUMBER DISK_NUMBER   TOTAL_MB    FREE_MB
------------ ----------- ---------- ----------
           1          13      86315         70
           1          16      17263      17080
           1          14      17263      17080
           1         129      34522      34168
           1         130      34522      34168
           1         131      34522      34168
           1          19      34522      69052
...

As you can see, disk 13 had only 70 MB of free space. I removed all the disks of varying sizes and kept only the disks of 69052 MB. The total size of the FRA came down to 8493396 MB, but the RMAN backup completed successfully.
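The post does not record how the odd-sized disks were removed. The following is only a sketch of one way to do it from the ASM instance. The disk group name PCASDGF comes from the error message above; the disk name PCASDGF_0013 and the rebalance power of 4 are placeholders, so look up the real disk names before running anything like this.

```sql
-- List the disks in the group with their names and sizes
-- (group_number 1 is the group shown in the listing above).
SELECT disk_number, name, total_mb, free_mb
  FROM v$asm_disk
 WHERE group_number = 1
 ORDER BY total_mb;

-- Drop one odd-sized disk. PCASDGF_0013 is a hypothetical disk name.
-- ASM moves its extents onto the remaining disks before releasing it.
ALTER DISKGROUP PCASDGF DROP DISK PCASDGF_0013 REBALANCE POWER 4;

-- The rebalance is still in progress while a row is listed here.
SELECT group_number, operation, state, power, est_minutes
  FROM v$asm_operation;
```

The drop can only finish if the remaining disks have room for the extents being moved off the dropped disk, so it is worth checking free space before removing several disks at once.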

Lesson Learned:
ASM spreads file extents evenly across all the disks in a diskgroup. An ORA-17505 error can therefore be encountered even when the diskgroup as a whole has plenty of free space, if that free space is unevenly distributed between disks. Because every file must be allocated evenly across all disks, a single disk without sufficient free space makes further allocation in the disk group impossible.
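A quick way to spot this kind of imbalance before starting a large backup is to summarize free space per disk group rather than relying on the group totals. This is a minimal sketch; the column aliases are my own, and what counts as "too low" for min_free_mb is a judgment call.

```sql
-- One row per disk group: how many disks, how many different disk sizes,
-- and the smallest and largest free space on any single disk.
SELECT group_number,
       COUNT(*)                 AS disks,
       COUNT(DISTINCT total_mb) AS distinct_sizes,
       MIN(free_mb)             AS min_free_mb,
       MAX(free_mb)             AS max_free_mb
  FROM v$asm_disk
 WHERE group_number > 0          -- skip disks that are not in any disk group
 GROUP BY group_number
 ORDER BY group_number;
```

A distinct_sizes value greater than 1, or a min_free_mb far below max_free_mb, points to the situation described here.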

