http://www.freelists.org/webpage/oracle-l
Hi,

>> According to Oracle Support, we were "unlucky".

Wow, that is very scientific ;-)

What type of file system are you using for your redo log files? Are any special mount options in use? I agree that point-in-time recovery is the solution for online redo log corruption, but an out-of-space error does not fit as a cause for redo log files.

My guess at this point is that *somehow* the file system or the mount options you are using are not appropriate for redo log files.

--
Thanks

Riyaj "Re-yas" Shamsudeen
Certified Oracle DBA (ver 7.0 - 9i)
Allocation & Assortment planning systems
JCPenney

Allen, Brandon wrote:

> -----Original Message-----
> From: Binley Lim [mailto:Binley.Lim@(protected)]
>
> Care to explain why running out of disk space crashes a database, requiring recovery?
Good question. I don't really know the answer. According to Oracle Support, we were "unlucky". I've had filesystems fill up before and never had this problem, and this same filesystem even filled up again later in the day on Sunday after I finished recovery (yes, I'm an idiot for not taking preventative measures after I got the database back up, but I was exhausted by that point and not thinking clearly) but did not have corruption the 2nd time. Here are the errors from the logs:
When the file system first filled up:
Sat Apr 15 06:32:13 2006
ARC1: Beginning to archive log 1 thread 1 sequence 11021
Creating archive destination LOG_ARCHIVE_DEST_1: '/baan4/oraarc/log_-1917883320_11021_1.arc'
ARC1: I/O error 19502 archiving log 1 to '/baan4/oraarc/log_-1917883320_11021_1.arc'
Sat Apr 15 06:32:18 2006
Errors in file /baan4/admin/bdump/baan4_arc1_8405102.trc:
ORA-19502: write error on file "/baan4/oraarc/log_-1917883320_11021_1.arc", blockno 192513 (blocksize=512)
ORA-27063: skgfospo: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
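The archiver failure above comes down to error 28 (no space left on device) on the archive destination. As a rough illustration of the kind of preventative check mentioned earlier, here is a minimal Python sketch that reports how full a filesystem is. The function names and the 10% threshold are assumptions made for this example, not anything used on the system in this thread.

```python
import shutil

def free_fraction(path):
    """Return the fraction of the filesystem holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo filesystems report a zero size; treat them as full.
        return 0.0
    return usage.free / usage.total

def is_low_on_space(path, min_free_fraction=0.10):
    """True when free space on the filesystem holding `path` is below the threshold.

    The 10% default is an arbitrary, hypothetical threshold.
    """
    return free_fraction(path) < min_free_fraction
```

A scheduled job could call `is_low_on_space()` on the archive destination directory and raise an alert before the archiver starts failing.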
Then the errors when the instance was terminated by LGWR a few minutes later:
Sat Apr 15 06:35:03 2006
Errors in file /baan4/admin/bdump/baan4_lgwr_1671218.trc:
ORA-00340: IO error processing online log 2 of thread 1
ORA-00345: redo log write error block 158327 count 323
ORA-00312: online log 2 thread 1: '/baan4/oralog/redo02/redo02b.log'
ORA-27063: skgfospo: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 165376
ORA-00345: redo log write error block 158327 count 323
ORA-00312: online log 2 thread 1: '/baan4/oralog/redo02/redo02a.log'
ORA-27063: skgfospo: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 165376
Sat Apr 15 06:35:03 2006
LGWR: terminating instance due to error 340
Instance terminated by LGWR, pid = 1671218
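Both members of log group 2 in the errors above (redo02a.log and redo02b.log) are under the same directory, so a problem on that one filesystem reaches both copies at once. The following Python sketch shows one way to check whether a set of files shares a filesystem by comparing device ids. The function names are made up for this example, and the paths to check are whatever the caller supplies.

```python
import os
from collections import defaultdict

def group_by_filesystem(paths):
    """Map each device id (st_dev) to the paths stored on that device."""
    groups = defaultdict(list)
    for path in paths:
        groups[os.stat(path).st_dev].append(path)
    return dict(groups)

def members_share_filesystem(paths):
    """True when every path is on the same device, so one full disk affects them all."""
    return len(group_by_filesystem(paths)) <= 1
```

If `members_share_filesystem()` returns True for all members of a log group, multiplexing protects against a damaged file but not against that filesystem filling up or failing.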
And then the errors that occurred after restarting the database:
Sat Apr 15 07:51:11 2006
Errors in file /baan4/admin/bdump/baan4_smon_7880942.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4193], [2015], [2205], [], [], [], [], []
According to Metalink # 39282.1, the ORA-600 [4193] error means "A mismatch has been detected between Redo records and Rollback (Undo) records", and the solution is point-in-time recovery to before the error occurred.
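The first bracketed argument of an ORA-600 line (4193 here) is the value the support note is looked up by. As a small convenience sketch, and not an Oracle tool, the Python below pulls the arguments out of an alert log line. The function name and regular expressions are assumptions for this example and are based only on the line format shown above.

```python
import re

# Matches ORA-600 or ORA-00600 and captures everything after "arguments:".
ORA600_PATTERN = re.compile(r"ORA-0*600\b.*?arguments:\s*(.*)")

def ora600_arguments(line):
    """Return the bracketed arguments of an ORA-600 line, or None for any other line."""
    match = ORA600_PATTERN.search(line)
    if match is None:
        return None
    # Each argument is wrapped in square brackets; empty brackets yield "".
    return re.findall(r"\[([^\]]*)\]", match.group(1))
```

Calling `ora600_arguments()` on the SMON error line above gives a list whose first element is the string "4193".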
Why did this happen? I don't know. Any ideas?
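One idea to test, following the file system question in the reply at the top of this thread: an out-of-space error can reach a file that already exists at its full size if the file is sparse, because writes into unallocated regions still need new blocks. Nothing in the logs confirms that this is what happened here. The Python sketch below only shows how such a check could look on a POSIX system, where `st_blocks` is counted in 512-byte units; the function names are made up for this example.

```python
import os

def is_sparse(size_bytes, blocks_512):
    """True when fewer bytes are allocated on disk than the apparent file size."""
    return blocks_512 * 512 < size_bytes

def looks_sparse(path):
    """Apply the sparse test to a real file using os.stat (POSIX only: needs st_blocks)."""
    info = os.stat(path)
    return is_sparse(info.st_size, info.st_blocks)
```

A result of True for a redo log member would be worth investigating, though file system features such as compression can also make a file occupy less space than its apparent size.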
Privileged/Confidential Information may be contained in this message or attachments hereto. Please advise immediately if you or your employer do not consent to Internet email for messages of this kind. Opinions, conclusions and other information in this message that do not relate to the official business of this company shall be understood as neither given nor endorsed by it.
The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. If the reader of this message is not the intended recipient, you are hereby notified that your access is unauthorized, and any review, dissemination, distribution or copying of this message including any attachments is strictly prohibited. If you are not the intended recipient, please contact the sender and delete the material from any computer.