Basha's Tech World: Block Corruption

In Oracle E-Business Suite 11i/R12, very rarely we get block corruption error after cloning with hot backup or in activated Standby. Re-backup or Re-cloning may solve this issue but in some case where we cannot afford to redo the activity then, we have to address the root cause and fix the issue in the cloned instance. Here i have come across the same issue and found the root cause for the block corruption. Will see how to resolve this cloned instance block corruption error in this blog.

After cloning we got this error while accessing login page,

500 Internal Server Error
java.lang.NoClassDefFoundError
at oracle.apps.fnd.sso.AppsLoginRedirect.doGet(AppsLoginRedirect.java:623)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:743)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:856)
at com.evermind[Oracle Containers for J2EE 10g (10.1.3.0.0) ].server.http.ResourceFilterChain.doFilter(ResourceFilterChain.java:64)
at oracle.apps.jtf.base.session.ReleaseResFilter.doFilter(ReleaseResFilter.java:26)

To our surprise, in database alert log we noticed below error,

Errors in file <oracle_home>/admin/<context_name>/diag/rdbms/<db_name>/<sid>/trace/<sid>_ora_30460.trc (incident=193):
ORA-01578: ORACLE data block corrupted (file # <file id>, block # <block id>)
ORA-01110: data file 401: '<data_top>/a_txn_data02.dbf'
ORA-26040: Data block was loaded using the NOLOGGING option
Incident details in: <oracle_home>/admin/<context_name>/diag/rdbms/<db_name>/<sid>/incident/incdir_193/<sid>_ora_30460_i193.trc

Above incident report file mentions that accessing a sql in package APPS.FND_SECURITY_PKG threw this block corruption.

When we tried dbv on the affected datafile noticed that few blocks are corrupted. Then we traced segments in affected block using below query and found that it is APPLSYS.WF_LOCAL_USER_ROLES object which has corrupted block after cloning with hot backup.

select owner, segment_name, segment_type
from   dba_extents
where  file_id = <file id>
and    <block id> between block_id and block_id + blocks - 1;

: file id from ORA-01578 error msg

: block id from ORA-01578 error msg

Nologging may be used in a concurrent program to increase the performance. In most cases, by end of the Concurrent Manager job, the objects affected by nologging have been dropped. For these cases, you have no further actions to take - dropping the objects cleans up the invalidated blocks. If the backup was taken before it drops the temporary then we hit the above error.

The concurrent job which caused a problem in our case was "Synchronize WF LOCAL tables". Performed a nologgin operations on APPLSYS.WF_LOCAL_USER_ROLES objects and invalidated the blocks in restored database. To fix this we have to re-run "Synchronize WF LOCAL tables" in restored database using pl/sql block,

sqlplus apps/<apps pwd>

set serveroutput on size 100000;
declare
begin
WF_LOCAL_SYNCH.BULKSYNCHRONIZATION(
P_ORIG_SYSTEM => 'ALL',
P_PARALLEL_PROCESSES => null,
P_LOGGING => null,
P_RAISEERRORS => TRUE);
exception
when others then
dbms_output.put_line('sqlerrm = ' || sqlerrm);
end;
/

Also it is found that even if we run "Synchronize WF LOCAL tables" with logging=YES, partitions of the following table are switched to logging=NO.

- WF_LOCAL_ROLES

- WF_LOCAL_USER_ROLES

- WF_USER_ROLE_ASSIGNMENTS

The cause of this issue is the stage table. For example if WF_LOCAL_USER_ROLES_STAGE table is set to logging=NO and concurrent request "Synchronize WF LOCAL tables" with loggin=YES then object WF_LOCAL_USER_ROLES will be set to logging=NO. To fix this we have to manually set logging to yes on all the stage tables by following these steps.

1) For any partition that is set to NOLOGGING, manually alter it to LOGGING:

1.1) SQL to run to retrieve partitions that are set to NOLOGGING:

select table_name, partition_name

from dba_tab_partitions

where table_name in

('WF_LOCAL_ROLES' ,

'WF_LOCAL_USER_ROLES' ,

'WF_USER_ROLE_ASSIGNMENTS')

and logging = 'NO' ;

1.2) switch all partitions retrieved in step 1.1 to LOGGING = 'YES' using

the following syntax:

alter table applsys. modify partition logging;

example:

alter table applsys.WF_LOCAL_USER_ROLES modify partition ENG_LIST logging;

2) For any stage table that is set to NOLOGGING, manually alter them to LOGGING.

2.1) SQL to run to retrieve stage tables set to NOLOGGING:

select table_name

from dba_tables

where table_name in

('WF_LOCAL_ROLES_STAGE' ,

'WF_LOCAL_USER_ROLES_STAGE' ,

'WF_UR_ASSIGNMENTS_STAGE')

and logging = 'NO' ;

2.2) Switch all the tables retrieved by statement above to LOGGING = 'YES' using the

following syntax:

alter table applsys. logging;

example:

alter table applsys.WF_LOCAL_ROLES_STAGE logging;

3) Then run the synchronize WF local tables concurrent program with logging mode = 'LOGGING'.

4) After completion, monitor tables and partitions: they should remain to LOGGING = 'YES'.

The best practice is to enable force logging on database level. First have to test the performance with forced logging before implementing it in production.

Reference

*********

1. Http 500 : Internal Server Error When Login To A New Cloned Environment [ID 781413.1]

2. ORA-1578 ORA-26040 in a LOB segment - Script to solve the errors [ID 293515.1]

3. Partitions in Workflow Local Tables are Automatically Switched to NOLOGGING [ID 433280.1]

4. Nologging in the E-Business Suite [ID 216211.1]

Fresh installed R12.1.1 with rapidwizard version (StartCD) 12.1.1.9 or below on Linux X86_64 contains corrupted SYSTEM09.dbf file. Thankfully those corruption are not data corruption (does not belong to any extents/segments) hence will not affect your installation and database startup. Might be this was the reason that Oracle could not capture this issue before it releases to the customer.

User will identify this corruption only when they starts to backup using RMAN or validates datafile using RMAN. RMAN will fail with below error message,

ORA-19566: exceeded limit of 0 corrupt blocks for file /DBF_LOC/system09.dbf

On verifying the datafile using dbv, there will be around 20000+ corrupted blocks in system09.dbf,

...
Page 192127 is marked corrupt
Corrupt block relative dba: 0x5802ee7f (file 352, block 192127)
Completely zero block found during dbv:

Page 192128 is marked corrupt
Corrupt block relative dba: 0x5802ee80 (file 352, block 192128)
Completely zero block found during dbv:

DBVERIFY - Verification complete

Total Pages Examined : 192128
Total Pages Processed (Data) : 54272
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 26750
Total Pages Failing (Index): 0
Total Pages Processed (Other): 2063
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 82570
Total Pages Marked Corrupt : 26473
Total Pages Influx : 0
Total Pages Encrypted : 0
Highest block SCN : 3755938521 (1388.3755938521)
$

Since the affected blocks are empty blocks we can resize the datafile to fix this corruption issue. Run this command as sysdba,

alter database datafile '/DBF_LOC/system09.dbf' resize 1000M;

Reference
=======
1. Oracle E-Business Suite Installation and Upgrade Notes Release 12 (12.1.1) for Linux x86-64 [ID 761566.1]
2. More than 20000 Blocks Corrupted in system09.dbf After Fresh Installation of E-Business Suite R12.1.1 on Linux x86_64 Platform [ID 840963.1]

Basha's Tech World

Friday, March 4, 2011

Block Corruption after Hot backup

Saturday, February 12, 2011

SYSTEM09.dbf file corrupted in R12.1.1 fresh install on Linux x86_64

Popular Posts