* Re: [parisc-linux] w-a to what [Was: D380 and ext[23] fs grave pb]
       [not found] <45018CEC.4040104@scarlet.be>
@ 2006-09-11 13:29 ` Michael S. Zick
  2006-09-12 11:09 ` [parisc-linux] [RFC]: rename ncr53c8xx to 53c720? " Joel Soete
  1 sibling, 0 replies; 4+ messages in thread
From: Michael S. Zick @ 2006-09-11 13:29 UTC (permalink / raw)
  To: parisc-linux

On Fri September 8 2006 10:31, Joel Soete wrote:
> Hello James, Matthew,
>
> I come back to you with this pb because, as explained in a previous mail
> <http://lists.parisc-linux.org/pipermail/parisc-linux/2006-September/030152.html>,
> it seems to be related to the ncr53c720 driver.
>

That's the second hit on that driver this week:
http://lists.parisc-linux.org/pipermail/parisc-linux/2006-August/030054.html

In the above mail, the driver was reporting:

ncr53c720-0: rev 0xf irq 66
ncr53c720-0: ID 7, Fast-10, Parity Checking, Differential
scsi0 : ncr53c8xx-3.4.3g

   Vendor: SEAGATE   Model: ST15150W          Rev: HP07
   Type:   Direct-Access                      ANSI SCSI revision: 02

The reported model number, ST15150W, is the single-ended variant, not
ST15150WD (differential), and the problem was that the drive could not
be written.

Joel, what does that part of your dmesg have to say?
Do the messages match what you have installed?

Dave pointed out that some drives' firmware does not report
the correct model number, which would make this a non-problem.

Mike
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux

* [parisc-linux] [RFC]: rename ncr53c8xx to 53c720? [Was: D380 and ext[23] fs grave pb]
       [not found] <45018CEC.4040104@scarlet.be>
  2006-09-11 13:29 ` [parisc-linux] w-a to what [Was: D380 and ext[23] fs grave pb] Michael S. Zick
@ 2006-09-12 11:09 ` Joel Soete
       [not found]   ` <200609120702.18615.mszick@morethan.org>
  1 sibling, 1 reply; 4+ messages in thread
From: Joel Soete @ 2006-09-12 11:09 UTC (permalink / raw)
  To: Parisc List; +Cc: Matthew Wilcox

Hello Matthew,

Now that the parisc-linux tree is managed with git, and since that driver
is in fact only used for the NCR 53c720 HBA, isn't it time to rename
things like ncr53c8xx to 53c720?

While awaiting a better fix, may I also suggest this small change:
--- drivers/scsi/ncr53c8xx.c.Orig       2006-09-11 11:26:36.000000000 +0200
+++ drivers/scsi/ncr53c8xx.c    2006-09-12 12:40:01.000000000 +0200
@@ -8362,7 +8362,12 @@
         tpnt->this_id           = 7;
         tpnt->sg_tablesize      = SCSI_NCR_SG_TABLESIZE;
         tpnt->cmd_per_lun       = SCSI_NCR_CMD_PER_LUN;
+#if defined(__hppa__)
+       /* FIXME: ENABLE_CLUSTERING cause ext fs corruptions */
+       tpnt->use_clustering    = DISABLE_CLUSTERING;
+#else
         tpnt->use_clustering    = ENABLE_CLUSTERING;
+#endif

         if (device->differential)
                 driver_setup.diff_support = device->differential;
=== <> ===
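
For context, here is a rough user-space sketch of what the clustering flag changes, as far as I understand it: with ENABLE_CLUSTERING the block layer may merge physically contiguous buffers into a single scatter-gather entry, so one entry can span more than a page, while with DISABLE_CLUSTERING each buffer keeps its own entry. The struct and function names below are made up for illustration and are not the kernel's, and the sketch ignores the segment size and boundary limits a real queue would apply.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one scatter-gather entry (not the kernel's struct). */
struct sg_entry {
    unsigned long addr; /* physical start address */
    unsigned long len;  /* length in bytes */
};

/*
 * Build a scatter-gather list from n input buffers.
 * With clustering != 0, a buffer that starts exactly where the previous
 * entry ends is folded into that entry; otherwise every buffer gets its
 * own entry. Returns the number of entries written to out, which must
 * have room for n entries.
 */
size_t build_sg_list(const struct sg_entry *in, size_t n,
                     struct sg_entry *out, int clustering)
{
    size_t used = 0;

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (clustering && used > 0 &&
            out[used - 1].addr + out[used - 1].len == in[i].addr) {
            out[used - 1].len += in[i].len; /* merge contiguous buffer */
        } else {
            out[used++] = in[i];            /* start a new entry */
        }
    }
    return used;
}
```

If the chip's script or the driver assumes that an entry never exceeds one page, only the merged case would trip over it, which would fit the workaround above. That is a guess, not something I have verified in the driver.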

TIA for help,
	Joel


Joel Soete wrote:
> Hello James, Matthew,
> 
> I come back to you with this pb because, as explained in a previous mail
> <http://lists.parisc-linux.org/pipermail/parisc-linux/2006-September/030152.html>,
> it seems to be related to the ncr53c720 driver.
> 
> I encountered the same kind of pb on my C110:
> shortly after the reboot, I just did a tar/untar and already got:
> EXT3-fs error (device md2): ext3_readdir: bad entry in 
> directory4Remounting filesystem read-only
> 
> That said, it seems that I found a workaround: I just reverted this
> <http://cvs.parisc-linux.org/linux-2.6/drivers/scsi/ncr53c8xx.c?r1=1.18&r2=1.19&makepatch=1&diff_format=h> 
> 
> ===================================================================
> RCS file: /var/lib/cvs/linux-2.6/drivers/scsi/ncr53c8xx.c,v
> retrieving revision 1.18
> retrieving revision 1.19
> diff -u -r1.18 -r1.19
> --- linux-2.6/drivers/scsi/ncr53c8xx.c    2004/07/20 22:06:43    1.18
> +++ linux-2.6/drivers/scsi/ncr53c8xx.c    2004/07/27 21:19:51    1.19
> @@ -8633,7 +8633,7 @@
>       tpnt->this_id        = 7;
>       tpnt->sg_tablesize    = SCSI_NCR_SG_TABLESIZE;
>       tpnt->cmd_per_lun    = SCSI_NCR_CMD_PER_LUN;
> -    tpnt->use_clustering    = DISABLE_CLUSTERING;
> +    tpnt->use_clustering    = ENABLE_CLUSTERING;
> 
>       if (device->differential)
>           driver_setup.diff_support = device->differential;
> === <> ===
> 
> And a loop like:
> # while true; do nice -n -5 tar -xslpf linux-2.6.12-rc1-pa1.tar ; nice 
> -n -5 rm -rf linux-2.6.12-rc1-pa 1-050319; date; done
> 
> has already run at least 85 times without any pb on this same C110
> (same kernel 2.6.18-rc6-pa1 + jejb's patch + your
> timer_interrupt patch).
> 
> As you are a SCSI driver expert, and a Symbios/NCR one too, maybe you have
> an idea where the NCR code could be broken by enabling 'clustering'?
> 
> TIA,
>     Joel
> 
> Joel Soete wrote:
>  > Hello *pa,
>  >
>  > While running some stress tests to try to identify an SMP pb on my D380,
>  > I found this relatively serious ext[23] pb. During this test loop:
>  > # while true ; do nice -n -3 tar -xspf linux-2.6.11-rc3-pa3.tar ; nice -n -3 rm -rf linux-2.6.11-rc3-pa3 ; date ; done
>  >
>  > after about 30 iterations, I got a filesystem corruption:
>  > EXT3-fs error (device sda10): ext3_readdir: bad entry in directory #84801: rec_len % 4 != 0 - offset=0, inode=175234
>  > Aborting journal on device sda10.
>  >
>  > ext3_abort called.
>  >
>  > EXT3-fs error (device sda10): ext3_journal_start_sb: Detected aborted journal
>  >
>  > Remounting filesystem read-only
>  >
>  > EXT3-fs error (device sda10): ext3_readdir: bad entry in directory #84801: rec_len % 4 != 0 - offset=0, inode=1752397164, rec_len=24
>  > EXT3-fs error (device sda10): ext3_readdir: bad entry in directory #84801: rec_len % 4 != 0 - offset=0, inode=1752397164, rec_len=24
>  > [snip]
>  >
>  > and during fsck I noticed:
>  > __journal_remove_journal_head: freeing b_committed_data
>  >
>  > Pass 2: Checking directory structure
>  > Directory inode 84801, block 0, offset 0: directory corrupted
>  > Salvage? yes
>  >
>  > Missing '.' in directory inode 84801.
>  > Fix? yes
>  >
>  > Setting filetype for entry '.' in ??? (84801) to 2.
>  > Missing '..' in directory inode 84801.
>  > Fix? yes
>  >
>  > [snip]
>  >
>  > As the first occurrence was with an SMP kernel, I re-tested with UP kernels only.
>  >
>  > This pb occurred with different (UP) kernels (2.6.17-rc6, 2.6.14, 2.6.17 and
>  > even 2.6.8.1), on different filesystems located on different disks and
>  > different HBAs, and finally on the 2 different CPUs?
>  >
>  > That said, I also got a filesystem corruption during a simple apt-get dist-upgrade
>  > (sdb6 being /var):
>  > EXT3-fs error (device sdb6): ext3_free_blocks: Freeing blocks not in datazone - block = 745764206, count = 1
>  > Aborting journal on device sdb6.
>  > ext3_abort called.
>  > EXT3-fs error (device sdb6): ext3_journal_start_sb: Detected aborted journal
>  > Remounting filesystem read-only
>  > EXT3-fs error (device sdb6): ext3_free_blocks: Freeing blocks not in datazone - block = 1869488138, count = 1
>  > EXT3-fs error (device sdb6): ext3_free_blocks: Freeing blocks not in datazone - block = 1953459744, count = 1
>  > [snip]
>  > EXT3-fs error (device sdb6): ext3_free_blocks: Freeing blocks not in datazone - block = 2003788910, count = 1
>  > EXT3-fs error (device sdb6): ext3_free_blocks: Freeing blocks not in datazone - block = 1851878701, count = 1
>  > EXT3-fs error (device sdb6): ext3_free_blocks: Freeing blocks not in datazone - block = 1919248225, count = 1
>  > <3>BUG: soft lockup detected on CPU#0!
>  > Backtrace:
>  >  [<1013222c>] update_process_times+0x34/0x80
>  >  [<1010748c>] timer_interrupt+0x9c/0x134
>  >  [<10145f44>] handle_IRQ_event+0x5c/0xa4
>  >  [<10146004>] __do_IRQ+0x78/0x18c
>  >  [<10107bc4>] do_cpu_irq_mask+0x6c/0xc8
>  >  [<1010a068>] intr_return+0x0/0xc
>  >
>  >
>  > EXT3-fs error (device sdb6): ext3_free_blocks: Freeing blocks not in datazone - block = 1632635402, count = 1
>  > EXT3-fs error (device sdb6): ext3_free_blocks: Freeing blocks not in datazone - block = 1734437731, count = 1
>  > [snip]
>  > EXT3-fs error (device sdb6): ext3_free_blocks: Freeing blocks not in datazone - block = 1313423904, count = 1
>  > EXT3-fs error (device sdb6): ext3_free_blocks: Freeing blocks not in datazone - block = 171515972, count = 1
>  > EXT3-fs error (device sdb6) in ext3_reserve_inode_write: Journal has aborted
>  > EXT3-fs error (device sdb6) in ext3_truncate: Journal has aborted
>  > EXT3-fs error (device sdb6) in ext3_reserve_inode_write: Journal has aborted
>  > EXT3-fs error (device sdb6) in ext3_orphan_del: Journal has aborted
>  > EXT3-fs error (device sdb6) in ext3_reserve_inode_write: Journal has aborted
>  > __journal_remove_journal_head: freeing b_committed_data
>  >
>  > It's the only system of this model on which I can reproduce this, but it is
>  > also the only one I can test (and as I am not HP, I don't have the passwd
>  > diagnostic CD to test the hw ;-( ), so it's very hard for me to tell
>  > whether it's a driver pb (e.g. NCR HBA) or a hw pb.
>  >
>  > Any ideas or advice are welcome.
>  >
>  > Thanks in advance,
>  >     Joel
>  >

* Re: [parisc-linux] [RFC]: rename ncr53c8xx to 53c720? [Was: D380 and ext[23] fs grave pb]
       [not found]   ` <200609120702.18615.mszick@morethan.org>
@ 2006-09-12 12:26     ` Michael S. Zick
  0 siblings, 0 replies; 4+ messages in thread
From: Michael S. Zick @ 2006-09-12 12:26 UTC (permalink / raw)
  To: parisc-linux

On Tue September 12 2006 07:02, Michael S. Zick wrote:
> On Tue September 12 2006 06:09, Joel Soete wrote:
> > Hello Matthew,
> > 
> > May be now that parisc-linux tree is managed with git and that driver is 
> > in fact only for ncr 53c720 hba, isn't it time to rename those stuff 
> > like ncr53c8xx to 53c720?
> >
> 
> Joel and group,
> 
> I was browsing that code and noticed that it tries to support a half
> dozen different chips.
> 
> That leads into some really strange looking code for the single-ended/
> differential setup.
> 

Oops, sorry, getting old...

Noticed one more thing - the driver allocates kernel memory -

Which raised the question in my mind of:
*) Driver allocates memory
*) Memory request generates swap activity
*) Swap file is on same driver

Can the above sequence happen and deadlock the driver?
Could that be the "soft lockup detected" in your message listing?

That driver also uses both spinlocks and timers, so
it may be affected by the recent changes.

Mike

* Re: [parisc-linux] w-a to what [Was: D380 and ext[23] fs grave pb]
       [not found] <4506B61C.2000204@scarlet.be>
@ 2006-09-12 14:14 ` Michael S. Zick
  0 siblings, 0 replies; 4+ messages in thread
From: Michael S. Zick @ 2006-09-12 14:14 UTC (permalink / raw)
  To: parisc-linux

On Tue September 12 2006 08:29, Joel wrote:
> Hello Mike,
> 
> Apologies for such a late answer, but it seems that I never received this
> mail, so I am trying to compose a new one from the m-l (sorry to break the
> thread so badly):
> 
>  >On Fri September 8 2006 10:31, Joel Soete wrote:
>  >> Hello James, Matthew,
>  >>
>  >> I come back to you with this pb because, as explained in a previous mail
>  >> <http://lists.parisc-linux.org/pipermail/parisc-linux/2006-September/030152.html>,
>  >> it seems to be related to the ncr53c720 driver.
>  >>
>  >
>  >That's the second hit on that driver this week:
>  >http://lists.parisc-linux.org/pipermail/parisc-linux/2006-August/030054.html
>  >
>  >In the above mail, the driver was reporting:
>  >
>  >ncr53c720-0: rev 0xf irq 66
>  >ncr53c720-0: ID 7, Fast-10, Parity Checking, Differential
>  >scsi0 : ncr53c8xx-3.4.3g
>  >
>  >   Vendor: SEAGATE   Model: ST15150W          Rev: HP07
>  >   Type:   Direct-Access                      ANSI SCSI revision: 02
>  >
>  >Where the model number was single-ended, not: ST15150WD (Differential)
>  >and the problem was that the drive could not be written.
>  >
>  >Joel, what does that part of your dmesg have to say?
> That said on my C110 I got:
> --- snip ---
> ncr53c720-0: rev 0xf irq 66
>

I could not find any code that identifies the actual chip; the "ncr53c720"
seems to be hardcoded in the message.

> ncr53c720-0: ID 7, Fast-10, Parity Checking, Differential
>

This message is part of the strange SE/DIFF code.

> scsi0 : ncr53c8xx-3.4.3g
>    Vendor: SEAGATE   Model: ST34371W          Rev: HP03
>    Type:   Direct-Access                      ANSI SCSI revision: 02
>

And even though this is a different model drive than Joe's problem,
it is still claiming to be SE:
<http://www.seagate.com/support/disc/scsi/st34371w.html>

Hmm... I forgot to check if the string is long enough for 9 characters,
perhaps just the "D" is getting clipped in the message.
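
On the string-length question: as far as I know, standard SCSI INQUIRY data carries an 8-byte vendor field at bytes 8-15, a 16-byte product field at bytes 16-31 and a 4-byte revision field at bytes 32-35, so a 9-character model such as ST15150WD should fit unless the printing code truncates it. Below is a minimal sketch of pulling those fields out of a raw 36-byte buffer. The helper name inquiry_field and the enum names are made up for illustration and are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Field offsets and widths in standard INQUIRY data. */
enum {
    INQ_VENDOR_OFF = 8,  INQ_VENDOR_LEN = 8,
    INQ_MODEL_OFF  = 16, INQ_MODEL_LEN  = 16,
    INQ_REV_OFF    = 32, INQ_REV_LEN    = 4,
    INQ_MIN_LEN    = 36
};

/*
 * Copy one space-padded INQUIRY field into out as a NUL-terminated
 * string with trailing spaces removed. out must hold len + 1 bytes.
 * (Hypothetical helper, for illustration only.)
 */
void inquiry_field(const unsigned char *inq, size_t off, size_t len,
                   char *out, size_t out_size)
{
    assert(off + len <= (size_t)INQ_MIN_LEN && out_size > len);
    memcpy(out, inq + off, len);
    while (len > 0 && out[len - 1] == ' ')
        len--;              /* strip the space padding */
    out[len] = '\0';
}
```

Calling inquiry_field(buf, INQ_MODEL_OFF, INQ_MODEL_LEN, model, sizeof model) with a 17-byte model array returns the whole product string, so comparing that against what dmesg prints would show whether the "D" is lost by the drive or by the message formatting.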

>   target0:0:5: Beginning Domain Validation
>   target0:0:5: asynchronous
>   target0:0:5: wide asynchronous
>   target0:0:5: FAST-10 WIDE SCSI 20.0 MB/s ST (100 ns, offset 8)
>

And that is just wrong - the drive is a SCSI-3 (Ultra SCSI) (Fast-20)
but this drive can fall back to FAST-10, which may be what is happening.

Hmm... Maybe Joe's drives can't fall back to match the controller.

See page 15 of:
<http://www.seagate.com/support/disc/manuals/scsi/67491d.pdf>

Just more reasons to /dev/null this driver.

Mike
 
