linux-kernel.vger.kernel.org archive mirror
* Re: [Re: Possible Bug in 2.4.24???]
@ 2004-01-17  1:04 Brad Tilley
  2004-01-17  5:11 ` Oleg Drokin
  0 siblings, 1 reply; 3+ messages in thread
From: Brad Tilley @ 2004-01-17  1:04 UTC (permalink / raw)
  To: Marcelo Tosatti, Brad Tilley; +Cc: linux-kernel, Oleg Drokin



Marcelo Tosatti <marcelo.tosatti@cyclades.com> wrote:

> 
> 
> On Fri, 16 Jan 2004, Brad Tilley wrote:
> 
> > While running a script that recursively changes permissions on a ftp
> > directory, I received an error in the term window where the script was
> > running. I then checked /var/log/messages and saw the kernel errors
> > below. The machine was generally unresponsive and had to be physically
> > rebooted at the power switch. It worked fine upon reboot and fsck ran
> > w/o producing any error... the script ran fine too. This is a HP XW4100
> > with a P4, 1.5GB DDR RAM and two very fast (15,000 RPM), very large
> > (140GB) SCSI HDDs. It had been up for 9 days (since compiling and
> > installing 2.4.24) and has worked fine until this point. Could someone
> > tell me if this is or isn't a kernel bug?
> >
> >
> > Jan 16 11:50:43 athop1 kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 8000002
> > Jan 16 11:50:43 athop1 kernel: Info fld=0x2cd1bd9, Current sd08:15: sense key Hardware Error
> > Jan 16 11:50:43 athop1 kernel: Additional sense indicates Internal target failure
> > Jan 16 11:50:43 athop1 kernel:  I/O error: dev 08:15, sector 54128
> > Jan 16 11:50:43 athop1 kernel: journal-601, buffer write failed
> > Jan 16 11:50:43 athop1 kernel:  (device sd(8,21))
> > Jan 16 11:50:43 athop1 kernel: kernel BUG at prints.c:341!
> > Jan 16 11:50:43 athop1 kernel: invalid operand: 0000
> > Jan 16 11:50:43 athop1 kernel: CPU:    0
> > Jan 16 11:50:43 athop1 kernel: EIP:    0010:[<c0189878>]    Tainted: P
> 
> Brad,
> 
> A device error happened (you see the "SCSI disk error : " message and
> "Additional sense indicates Internal target failure") which reiserfs
> could not handle.
> 
> kernel BUG at prints.c:341 == reiserfs_panic().
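
For anyone reading the log by hand, the "return code = 8000002" value can be
split into its component bytes. The following is only a minimal sketch. It
assumes the value is printed in hexadecimal and uses the usual Linux SCSI
mid-layer layout (driver byte, host byte, message byte, status byte, from
high to low). The function name decode_scsi_result is made up for
illustration and is not a kernel interface.

```python
# Sketch: split a Linux 2.4 SCSI "return code" into its four bytes.
# Assumptions: the logged value is hexadecimal, and the layout is
# driver << 24 | host << 16 | msg << 8 | status.

DRIVER_SENSE = 0x08      # driver byte: sense data accompanies the result
CHECK_CONDITION = 0x02   # raw SCSI status byte: the target reported an error


def decode_scsi_result(text):
    """Return the driver, host, message and status bytes of a result word."""
    value = int(text, 16)
    return {
        "driver": (value >> 24) & 0xFF,
        "host": (value >> 16) & 0xFF,
        "msg": (value >> 8) & 0xFF,
        "status": value & 0xFF,
    }


fields = decode_scsi_result("8000002")
print(fields)
print("sense data available:", fields["driver"] == DRIVER_SENSE)
print("check condition:", fields["status"] == CHECK_CONDITION)
```

Under those assumptions the host and message bytes are zero, so the error was
reported by the drive itself along with sense data, which matches the
"Hardware Error" sense key in the log.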

Thanks for the reply Marcelo,

Does this mean that there is a physical or mechanical problem with the drive
itself? I do use reiserfs as it's the best fs available for my purposes.
Could the drive attempt to write outside its physical bounds? Move the arm
right when it was instructed to go left? I don't understand how the drive
could have an error w/o affecting the filesystem.





* Re: [Re: Possible Bug in 2.4.24???]
  2004-01-17  1:04 [Re: Possible Bug in 2.4.24???] Brad Tilley
@ 2004-01-17  5:11 ` Oleg Drokin
  2004-01-17  6:23   ` Mike Fedyk
  0 siblings, 1 reply; 3+ messages in thread
From: Oleg Drokin @ 2004-01-17  5:11 UTC (permalink / raw)
  To: Brad Tilley; +Cc: Marcelo Tosatti, linux-kernel

Hello!

On Fri, Jan 16, 2004 at 08:04:28PM -0500, Brad Tilley wrote:
> > > Jan 16 11:50:43 athop1 kernel:  I/O error: dev 08:15, sector 54128
> > > Jan 16 11:50:43 athop1 kernel: journal-601, buffer write failed
> > > Jan 16 11:50:43 athop1 kernel:  (device sd(8,21))
> > A device error happened (you see the "SCSI disk error : " message and
> > "Additional sense indicates Internal target failure") which reiserfs
> > could not handle.
> > kernel BUG at prints.c:341 == reiserfs_panic().
> Does this mean that there is a physical or mechanical problem with the drive
> itself? I do use 

Yes it does.

> reiserfs as it's the best fs available for my purposes. Could the drive
> attempt to write outside its physical bounds? Move the arm right when it
> was instructed to go

The sector for the I/O error is 54128, which is somewhere within the journal
(at the beginning of the disk). Exactly what went wrong inside the drive is
not very clear, as modern drives are sort of black boxes.
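
The device and sector numbers in the log can be checked with a little
arithmetic. The sketch below rests on several assumptions that are not stated
in this thread: that "dev 08:15" is major:minor in hexadecimal, that the sd
driver uses 16 minor numbers per disk, that the logged sector is a 512-byte
unit relative to the partition, and that the filesystem uses the common
reiserfs defaults of 4096-byte blocks with a standard journal of 8192 blocks
starting at block 18. All function names are illustrative only.

```python
# Sketch: decode "dev 08:15" and locate sector 54128 relative to the journal.
# Every constant below is an assumption (typical defaults), not a value
# taken from this thread.

SECTOR_SIZE = 512          # bytes per sector in the I/O error message
BLOCK_SIZE = 4096          # reiserfs block size
JOURNAL_FIRST_BLOCK = 18   # first block of a standard reiserfs journal
JOURNAL_BLOCKS = 8192      # journal length, header block not counted
MINORS_PER_DISK = 16       # sd driver: whole disk plus up to 15 partitions


def decode_dev(text):
    """Parse a hexadecimal 'major:minor' pair such as '08:15'."""
    major, minor = (int(part, 16) for part in text.split(":"))
    return major, minor


def sd_name(minor):
    """Map an sd minor number (major 8) to a name such as 'sdb5'."""
    disk = minor // MINORS_PER_DISK
    partition = minor % MINORS_PER_DISK
    return "sd" + chr(ord("a") + disk) + (str(partition) if partition else "")


def sector_to_block(sector):
    """Convert a 512-byte sector number to a filesystem block number."""
    return sector * SECTOR_SIZE // BLOCK_SIZE


major, minor = decode_dev("08:15")
block = sector_to_block(54128)
in_journal = JOURNAL_FIRST_BLOCK <= block < JOURNAL_FIRST_BLOCK + JOURNAL_BLOCKS
print(major, minor, sd_name(minor), block, in_journal)
```

With these assumptions the script prints "8 21 sdb5 6766 True": the decimal
pair matches the "sd(8,21)" shown in the log, and block 6766 falls inside the
assumed journal range of blocks 18 through 8209.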

> left? I don't 
> understand how the drive could have an error w/o affecting the filesystem.

Well, there was an effect on the filesystem: the write failed.
The block may have been remapped later, or it may have been a failure in the
drive's internal logic, or something else like that.
This journal block won't be used on a subsequent mount (because the
transaction was not closed); it will just be overwritten. So even if its
contents were corrupted, reiserfs does not care.

Bye,
    Oleg


* Re: [Re: Possible Bug in 2.4.24???]
  2004-01-17  5:11 ` Oleg Drokin
@ 2004-01-17  6:23   ` Mike Fedyk
  0 siblings, 0 replies; 3+ messages in thread
From: Mike Fedyk @ 2004-01-17  6:23 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: Brad Tilley, Marcelo Tosatti, linux-kernel

On Sat, Jan 17, 2004 at 07:11:07AM +0200, Oleg Drokin wrote:
> Well, there was an effect on the filesystem: the write failed.
> The block may have been remapped later, or it may have been a failure in the
> drive's internal logic, or something else like that.
> This journal block won't be used on a subsequent mount (because the
> transaction was not closed); it will just be overwritten. So even if its
> contents were corrupted, reiserfs does not care.

I'd also suggest to Brad that he replace the drive ASAP.

Mike
