ORA-01148 followed with ASM communication failure

This morning one of SYSAUX tablespace datafile went offine all of a sudden. We saw the following messages in the DB alert log.


Thu Apr 05 15:06:04 2012
Errors in file /u02/diag/rdbms/orcl/orcl/trace/orcl_dbw0_13147.trc:
ORA-01148: cannot refresh file size for datafile 7
ORA-01110: data file 7: '+DG1/orcl/datafile/sysaux.273.690200715'
ORA-15032: not all alterations performed
ORA-29702: error occurred in Cluster Group Service operation
ORA-29702: error occurred in Cluster Group Service operation
Automatic datafile offline due to media error on
file 7: +DG1/orcl/datafile/sysaux.273.690200715

The above error happens because of “

ORA-15082: ASM failed to communicate with database instance

This is an ASM problem on On 11.1.0.7. The ASM instance failed to communicate with the DB instance in a timely fashion causing DBWR process to offline the datafile. The fix is the bug 7334426.

Database reporting a lot of ORA-12012 lately with no apparent changes

Database reporting a lot of ORA-12012 lately with no apparent changes to it in the last month or so.

Errors in file /home/oracle/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_j000_7094.trc:
ORA-12012: error on auto execute of job 63
ORA-08102: index key not found, obj# 289, file 1, block 2025 (2)

From the above it’s pretty clear that my scheduled jobs are failing for something. (all traces had *j00*some number in the file name)
Just to confirm the hypothesis, I did this.

[oracle@localhost trace]$ grep ORA-08102 *trc
orcl_j000_3442.trc:ORA-08102: index key not found, obj# 289, file 1, block 2025 (2)
orcl_j000_3442.trc:ORA-08102: index key not found, obj# 289, file 1, block 2025 (2)
orcl_j000_3442.trc:ORA-08102: index key not found, obj# 289, file 1, block 2025 (2)
orcl_j000_3442.trc:ORA-08102: index key not found, obj# 289, file 1, block 2025 (2)
orcl_j000_3480.trc:ORA-08102: index key not found, obj# 289, file 1, block 2025 (2)
orcl_j000_3480.trc:ORA-08102: index key not found, obj# 289, file 1, block 2025 (2)
orcl_j000_3480.trc:ORA-08102: index key not found, obj# 289, file 1, block 2025 (2)
orcl_j000_3480.trc:ORA-08102: index key not found, obj# 289, file 1, block 2025 (2)
orcl_j000_3490.trc:ORA-08102: index key not found, obj# 289, file 1, block 2025 (2)
orcl_j000_3490.trc:ORA-08102: index key not found, obj# 289, file 1, block 2025 (2)
orcl_j000_3490.trc:ORA-08102: index key not found, obj# 289, file 1, block 2025 (2)
orcl_j000_3490.trc:ORA-08102: index key not found, obj# 289, file 1, block 2025 (2)


select * from dba_objects where object_id=289;
SYS    I_JOB_NEXT        289    289    INDEX

Now the index I_JOB_NEXT seems to be corrupted at block# 2025. This belongs to the job$ table.
Let’s try to run an ANALYZE on the table.

SYS@orcl SQL>ANALYZE TABLE JOB$ VALIDATE STRUCTURE CASCADE;
ANALYZE TABLE JOB$ VALIDATE STRUCTURE CASCADE
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file

By now, we know that index i_job_next is created on the next_date column of job$ table.
So, let’s try to do an explicit full table scan on job$.

SELECT /*+ FULL(j1) */ next_date
FROM job$ j1
MINUS
SELECT /*+ index(j i_job_next) */ next_date
FROM job$ j;

Surprisingly, doing the above FTS followed with an ANALYZE table command did the trick for me
and the index is reusable now.

SYSDBA login not working ORA-01031

We have been receiving ORA-01031 since last week with the dba claiming that no changes have been done whatsoever to the database account of SYS.

This turned out to be more strange, as when we tried logging to database with complete tns entry – it succeeded. Now what could be the reason ?

The cause turned out to be the new entry in the sqlnet.ora about the AUTHENTICATION_SERVICES.

We had inserted the following line into sqlnet.ora

sqlnet.AUTHENTICATION_SERVICES=(none)

We hashed this parameter out of the server side sqlnet.ora and it all started working normal again.

Now, why? Why this parameter caused the pain?

That happened because, this parameter is to be used only when you have any authentication service installed for your database. In other words this will bypass the OS authentication and the password file authentication. This parameter should be either pointing to a value of none or to the authentication service configured.

Resume your suspended jobs

Today morning, we were running a Statistics collection job under a RESUMABLE session.For those who don’t know what resumable session means – it helps us in resuming the execution of a large operation in the event of any failures because of storage allocation. We can turn it on by this command and follow it with our large operation.


SYS@orcl SQL>ALTER SESSION ENABLE RESUMABLE;

SYS@orcl SQL>ALTER SESSION ENABLE RESUMABLE TIMEOUT 3600;

A resumable statement is suspended only when one of the following conditions occur.

1. Out of space allocation :: ORA-1653 and ORA-1654

2. Maximum extents reached condition :: ORA-1631 and ORA-1654

3. Space quota exceeded condition :: ORA-1536

So, I started my stats collection job as resumable – forgot about it – only to be reminded by my subordinate that it we have received the following error in the alert.log:
</h5>
<pre>Mon Jan 30 13:57:01 2012
statement in resumable session 'User SYS(0), Session 155, Instance 1' was suspended due to
ORA-01652: unable to extend temp segment by 8 in tablespace TEST

Ok, time to start troubleshooting.

select * from dba_resumable;

select * from v$session_wait where sid = &sid_of_suspended_session;

So, two things are clear till now, SID 155 is currently in suspended state and is just waiting for the error to be cleared and requires a DBA intervention.

As we know the reason for the suspension from the alert.log, we added more space to the TEST tablespace and the status of the job in dba_resumable changed to NORMAL in some time.

For those, who don’t see the NORMAL status code in DBA_RESUMABLE and it still says SUSPENDED, don’t panic and check the event column of v$session_wait. If it still says statement suspended, wait error to be cleared then you need to dig in further but if it says anything else, you are good.

SQL*Net tracing

For configuring sqlnet tracing (server) side, we need to add the following lines to the server side sqlnet.ora


trace_level_client=16

trace_file_client=client

trace_directory_client=/tmp # Or a known directory

trace_unique_client=true # Add '_pid' to trace filename

trace_timestamp_client=ON # Only in Oracle8i onwards


For configuring listener tracing setup the following lines
trace_level_listener=16
trace_file_listener=listener
trace_directory_listener=/tmp # Or a known directory

The above will setup complete listener tracing to the file /tmp/listener.trc.
Following this we should bounce/reload the listener.

+Abhishek