linux-kernel.vger.kernel.org archive mirror
* Re: Probably 2.4 kernel or AIC7xxx module trouble
       [not found]             ` <5lnu.2l7.13@gated-at.bofh.it>
@ 2003-07-04 17:36               ` Roberto Slepetys Ferreira
  2003-07-05 20:27                 ` Justin T. Gibbs
  0 siblings, 1 reply; 14+ messages in thread
From: Roberto Slepetys Ferreira @ 2003-07-04 17:36 UTC (permalink / raw)
  To: linux-kernel

Hi again,

I passed the parameter nmi_watchdog=1 to the kernel at boot.

After about 2 hours it froze again, but on the console I found a lot of
messages like this:

smb_proc_readdir_long: name=\....(some directory)....\*, result=-2, rcls=1,
err=2

In the log, I found the same message:

Jul  4 12:35:21 filitico kernel: smb_proc_readdir_long:
name=\Renato(19)\Data Base Nao Utilizados\*, result=-2, rcls=1, err=2

and

Jul  4 12:30:07 filitico kernel: smb_proc_readdir_long:
name=\Renato(19)\Data Base Nao Utilizados\*, result=-2, rcls=1, err=2
Jul  4 12:30:54 filitico kernel: smb_proc_readdir_long:
name=\Renato(19)\Data Base Nao Utilizados\*, result=-2, rcls=1, err=2
Jul  4 12:33:59 filitico last message repeated 3 times
Jul  4 12:34:24 filitico last message repeated 2 times
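
The fields in these log lines can be read mechanically. The sketch below is an illustration, not part of the original report: it parses one such line with a regular expression and interprets `result` as a negated errno value, which is the usual kernel convention. The function name `decode_smb_log` is hypothetical, and `rcls` and `err` are only extracted, not interpreted.

```python
import errno
import re

# Pattern for the key=value tail of an smb_proc_readdir_long message.
LOG_RE = re.compile(
    r"smb_proc_readdir_long:\s*name=(?P<name>.*?),\s*"
    r"result=(?P<result>-?\d+),\s*rcls=(?P<rcls>\d+),\s*err=(?P<err>\d+)",
    re.DOTALL,
)

def decode_smb_log(text):
    """Return the parsed fields of one log message, or None if it does not match."""
    match = LOG_RE.search(text)
    if match is None:
        return None
    result = int(match.group("result"))
    return {
        "name": match.group("name"),
        "result": result,
        # Assumption: result is a negated errno value.
        "errno_name": errno.errorcode.get(-result, "unknown"),
        "rcls": int(match.group("rcls")),
        "err": int(match.group("err")),
    }

sample = (
    "Jul  4 12:35:21 filitico kernel: smb_proc_readdir_long: "
    "name=\\Renato(19)\\Data Base Nao Utilizados\\*, result=-2, rcls=1, err=2"
)
fields = decode_smb_log(sample)
print(fields["errno_name"], fields["rcls"], fields["err"])
```

Under that assumption, result=-2 maps to ENOENT ("no such file or directory"), which would point to a lookup of a path that does not exist rather than to a storage fault.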

Do you know what this could be?

Thanks
Slepetys


----- Original Message ----- 
From: "Jim Gifford" <maillist@jg555.com>
Newsgroups: linux.kernel
Sent: Thursday, July 03, 2003 8:10 PM
Subject: Re: Probably 2.4 kernel or AIC7xxx module trouble


> Justin, I just tried to enable the nmi watch dog. It doesn't seem to work on
> my system I tried both
>
> append="nmi_watchdog=1"
> and
> append="nmi_watchdog=2"
>
> ----- Original Message ----- 
> From: "Justin T. Gibbs" <gibbs@scsiguy.com>
> To: "Roberto Slepetys Ferreira" <slepetys@homeworks.com.br>; "Jim Gifford"
> <jim@jg555.com>; <linux-kernel@vger.kernel.org>
> Sent: Thursday, July 03, 2003 2:20 PM
> Subject: Re: Probably 2.4 kernel or AIC7xxx module trouble
>
>
> > > I have no clue for what kind of tests I can do to generate the trouble, or
> > > for what logs, or files to look for.
> >
> > Have you tried running with the NMI watchdog enabled?
> >
> > --
> > Justin
> >
> >
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Probably 2.4 kernel or AIC7xxx module trouble
       [not found]       ` <5ldB.2cK.1@gated-at.bofh.it>
@ 2003-07-04 17:38         ` Roberto Slepetys Ferreira
  0 siblings, 0 replies; 14+ messages in thread
From: Roberto Slepetys Ferreira @ 2003-07-04 17:38 UTC (permalink / raw)
  To: Matthias Andree; +Cc: linux-kernel

Hi again,

I did a ps ax|grep -w D, and I got:

>ps ax|grep -w D
 2205 ?        S      0:00 smbd -D
 2209 ?        S      0:00 nmbd -D
 2337 pts/0    S      0:00 grep -w D

And the load average is still inconsistent with the CPU usage:

14:34:53  up 11 min,  1 user,  load average: 1.35, 1.38, 0.85
86 processes: 85 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states:   0.1% user   1.0% system    0.0% nice   0.0% iowait  97.0% idle
CPU1 states:   0.0% user   1.0% system    0.0% nice   0.0% iowait  98.0% idle
Mem:   513172k av,  340916k used,  172256k free,       0k shrd,   11904k buff
                    242200k actv,   24552k in_d,   15120k in_c
Swap: 1060088k av,      40k used, 1060048k free                  254744k cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
   29 root      15   0     0    0     0 SW    0.9  0.0   0:03   1 raid1syncd

[]s
Slepetys


----- Original Message ----- 
From: "Matthias Andree" <matthias.andree@gmx.de>
Newsgroups: linux.kernel
Sent: Thursday, July 03, 2003 8:00 PM
Subject: Re: Probably 2.4 kernel or AIC7xxx module trouble


> On Thu, 03 Jul 2003, Roberto Slepetys Ferreira wrote:
>
> > Meaning that the load average is inconsistent with the CPU usage.
>
> To find the stuck process that pushes your LA up, try: ps ax | grep -w D



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Probably 2.4 kernel or AIC7xxx module trouble
  2003-07-04 17:36               ` Probably 2.4 kernel or AIC7xxx module trouble Roberto Slepetys Ferreira
@ 2003-07-05 20:27                 ` Justin T. Gibbs
  0 siblings, 0 replies; 14+ messages in thread
From: Justin T. Gibbs @ 2003-07-05 20:27 UTC (permalink / raw)
  To: Roberto Slepetys Ferreira, linux-kernel

> Hi again,
> 
> I passed the parameter nmi_watchdog=1 to the kernel at boot.
> 
> After about 2 hours it froze again, but on the console I found a lot of
> messages like this:
> 
> smb_proc_readdir_long: name=\....(some directory)....\*, result=-2, rcls=1,
> err=2

Looks like your samba server is upset about some requests it's getting.
These probably have nothing to do with your hang.

Did you verify that the NMI watchdog was functioning properly on your
system, as outlined by the NMI watchdog FAQ in the kernel source Documentation
directory?
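
One way to check that the watchdog is ticking is to read the `NMI:` row of `/proc/interrupts` twice and see whether the per-CPU counts grow. The following is a minimal sketch under that assumption. The helper names and the sample text are hypothetical, the exact row layout may vary between kernel versions, and requiring growth on every CPU is a simplification.

```python
def nmi_counts(interrupts_text):
    """Return the per-CPU NMI counts from /proc/interrupts-style text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            # Keep only the leading numeric columns (one per CPU);
            # some kernels append a textual description.
            counts = []
            for field in fields[1:]:
                if not field.isdigit():
                    break
                counts.append(int(field))
            return counts
    return []

def watchdog_is_ticking(before_text, after_text):
    """True if every CPU's NMI count increased between the two samples."""
    before = nmi_counts(before_text)
    after = nmi_counts(after_text)
    return bool(before) and len(before) == len(after) and all(
        b < a for b, a in zip(before, after)
    )

# Hypothetical samples for a two-CPU machine, taken a few seconds apart.
sample_before = (
    "           CPU0       CPU1\n"
    "NMI:        120        118\n"
    "LOC:      50012      50010\n"
)
sample_after = (
    "           CPU0       CPU1\n"
    "NMI:        131        129\n"
    "LOC:      51100      51098\n"
)

print(watchdog_is_ticking(sample_before, sample_after))
```

On a live system the two samples would come from reading `/proc/interrupts` a few seconds apart.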

--
Justin


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Probably 2.4 kernel or AIC7xxx module trouble
  2003-07-05 20:15               ` Justin T. Gibbs
@ 2003-07-05 20:28                 ` Jim Gifford
  0 siblings, 0 replies; 14+ messages in thread
From: Jim Gifford @ 2003-07-05 20:28 UTC (permalink / raw)
  To: Justin T. Gibbs, Roberto Slepetys Ferreira, linux-kernel

I think the problem is elsewhere, please take a look at this message I sent
earlier.

http://marc.theaimsgroup.com/?l=linux-kernel&m=105742280413809&w=2

----- Original Message ----- 
From: "Justin T. Gibbs" <gibbs@scsiguy.com>
To: "Jim Gifford" <maillist@jg555.com>; "Roberto Slepetys Ferreira"
<slepetys@homeworks.com.br>; <linux-kernel@vger.kernel.org>
Sent: Saturday, July 05, 2003 1:15 PM
Subject: Re: Probably 2.4 kernel or AIC7xxx module trouble


> > Justin, I just tried to enable the nmi watch dog. It doesn't seem to work on
> > my system I tried both
> >
> > append="nmi_watchdog=1"
> > and
> > append="nmi_watchdog=2"
>
> Is the watchdog enabled in your kernel?  The command line only works
> if you have compiled in support for the watchdog.
>
> --
> Justin
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Probably 2.4 kernel or AIC7xxx module trouble
  2003-07-03 23:04             ` Jim Gifford
@ 2003-07-05 20:15               ` Justin T. Gibbs
  2003-07-05 20:28                 ` Jim Gifford
  0 siblings, 1 reply; 14+ messages in thread
From: Justin T. Gibbs @ 2003-07-05 20:15 UTC (permalink / raw)
  To: Jim Gifford, Roberto Slepetys Ferreira, linux-kernel

> Justin, I just tried to enable the nmi watch dog. It doesn't seem to work on
> my system I tried both
> 
> append="nmi_watchdog=1"
> and
> append="nmi_watchdog=2"

Is the watchdog enabled in your kernel?  The command line only works
if you have compiled in support for the watchdog.

--
Justin


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Probably 2.4 kernel or AIC7xxx module trouble
  2003-07-03 22:49       ` Matthias Andree
@ 2003-07-03 23:09         ` Jim Gifford
  0 siblings, 0 replies; 14+ messages in thread
From: Jim Gifford @ 2003-07-03 23:09 UTC (permalink / raw)
  To: Matthias Andree, linux-kernel

Tried that before. Before I thought it was the kswapd problem (see list).
But a few hours after I thought it was fixed, bamm it did it again.

I have monitored ps via this script, but I never see anything out of the
ordinary. I will try again and send a copy to the other guy who is having
the problem to see what results we get.


----- Original Message ----- 
From: "Matthias Andree" <matthias.andree@gmx.de>
To: <linux-kernel@vger.kernel.org>
Sent: Thursday, July 03, 2003 3:49 PM
Subject: Re: Probably 2.4 kernel or AIC7xxx module trouble


> On Thu, 03 Jul 2003, Roberto Slepetys Ferreira wrote:
>
> > Meaning that the load average is inconsistent with the CPU usage.
>
> To find the stuck process that pushes your LA up, try: ps ax | grep -w D


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Probably 2.4 kernel or AIC7xxx module trouble
  2003-07-03 21:20           ` Justin T. Gibbs
@ 2003-07-03 23:04             ` Jim Gifford
  2003-07-05 20:15               ` Justin T. Gibbs
  0 siblings, 1 reply; 14+ messages in thread
From: Jim Gifford @ 2003-07-03 23:04 UTC (permalink / raw)
  To: Justin T. Gibbs, Roberto Slepetys Ferreira, linux-kernel

Justin, I just tried to enable the NMI watchdog. It doesn't seem to work on
my system. I tried both

append="nmi_watchdog=1"
and
append="nmi_watchdog=2"
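
Whether an `append=` option actually reached the kernel can be checked by looking at the booted command line, which Linux exposes in `/proc/cmdline`. The sketch below parses such a string. The function name and the sample command line are hypothetical.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict of parameters.

    Bare flags (no '=') map to None. Quoted values with spaces are not handled.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params

# Hypothetical command line; on a running system read /proc/cmdline instead.
sample = "auto BOOT_IMAGE=linux ro root=/dev/md0 nmi_watchdog=1"
params = parse_cmdline(sample)
print(params.get("nmi_watchdog"))
```

Seeing the parameter here only shows that the boot loader passed it on. It does not show that the watchdog is active.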

----- Original Message ----- 
From: "Justin T. Gibbs" <gibbs@scsiguy.com>
To: "Roberto Slepetys Ferreira" <slepetys@homeworks.com.br>; "Jim Gifford"
<jim@jg555.com>; <linux-kernel@vger.kernel.org>
Sent: Thursday, July 03, 2003 2:20 PM
Subject: Re: Probably 2.4 kernel or AIC7xxx module trouble


> > I have no clue for what kind of tests I can do to generate the trouble, or
> > for what logs, or files to look for.
>
> Have you tried running with the NMI watchdog enabled?
>
> --
> Justin
>
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Probably 2.4 kernel or AIC7xxx module trouble
  2003-07-03 18:29     ` Roberto Slepetys Ferreira
       [not found]       ` <13e101c3419d$f62f9410$3400a8c0@W2RZ8L4S02>
@ 2003-07-03 22:49       ` Matthias Andree
  2003-07-03 23:09         ` Jim Gifford
  1 sibling, 1 reply; 14+ messages in thread
From: Matthias Andree @ 2003-07-03 22:49 UTC (permalink / raw)
  To: linux-kernel

On Thu, 03 Jul 2003, Roberto Slepetys Ferreira wrote:

> Meaning that the load average is inconsistent with the CPU usage.

To find the stuck process that pushes your LA up, try: ps ax | grep -w D
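
Note that `grep -w D` matches any line containing a standalone `D`, including command lines such as `smbd -D`. A stricter filter looks only at the process state column. The sketch below assumes the `ps ax` column order PID, TTY, STAT, TIME, COMMAND. The function name is hypothetical and the `raid1syncd` row in the sample is illustrative.

```python
def uninterruptible(ps_output):
    """Return (pid, command) for processes whose STAT field starts with 'D'."""
    stuck = []
    for line in ps_output.splitlines():
        # PID, TTY, STAT, TIME, then the command with its arguments.
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # skip the header and malformed lines
        pid, _tty, stat, _time, command = fields
        if stat.startswith("D"):
            stuck.append((int(pid), command))
    return stuck

# Sample in `ps ax` layout; the first data row is illustrative.
sample = """\
  PID TTY      STAT   TIME COMMAND
   31 ?        DW     0:05 [raid1syncd]
 2205 ?        S      0:00 smbd -D
 2209 ?        S      0:00 nmbd -D
 2337 pts/0    S      0:00 grep -w D
"""
print(uninterruptible(sample))
```

With this filter the `-D` arguments of `smbd`, `nmbd`, and `grep` itself no longer produce matches.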

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Probably 2.4 kernel or AIC7xxx module trouble
  2003-07-03 20:15         ` Roberto Slepetys Ferreira
@ 2003-07-03 21:20           ` Justin T. Gibbs
  2003-07-03 23:04             ` Jim Gifford
  0 siblings, 1 reply; 14+ messages in thread
From: Justin T. Gibbs @ 2003-07-03 21:20 UTC (permalink / raw)
  To: Roberto Slepetys Ferreira, Jim Gifford, linux-kernel

> I have no clue for what kind of tests I can do to generate the trouble, or
> for what logs, or files to look for.

Have you tried running with the NMI watchdog enabled?

--
Justin


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Probably 2.4 kernel or AIC7xxx module trouble
       [not found]       ` <13e101c3419d$f62f9410$3400a8c0@W2RZ8L4S02>
@ 2003-07-03 20:15         ` Roberto Slepetys Ferreira
  2003-07-03 21:20           ` Justin T. Gibbs
  0 siblings, 1 reply; 14+ messages in thread
From: Roberto Slepetys Ferreira @ 2003-07-03 20:15 UTC (permalink / raw)
  To: Jim Gifford, linux-kernel

Hi Jim,
It's exactly the same problem: after about 12-24 hours everything locks up,
but I can still ping it (sometimes).

The syslog stops, and I can only reboot by hardware, because CTRL+ALT+DEL
doesn't work, and neither does the terminal.

It's an SMP machine (dual Pentium III) too, but after some tests with a single
CPU and the NOAPIC parameter passed to the kernel, the trouble continues.

I have no clue what kind of tests I can do to reproduce the trouble, or
which logs or files to look at.

[]s
Slepetys



> Roberto,
>     How does this problem manifest itself? I think it's the same problem
> that I'm having. Let me know what you think. I'm using the megaraid driver
> and aic7xxx driver. After about a 12-20 hour period, everything locks up,
> but there is no error message. The kernel SysRq information does work and
> I'm able to reboot.
>
> top - 12:59:05 up  3:02,  2 users,  load average: 4.17, 4.35, 4.38
> Tasks: 122 total,   5 running, 117 sleeping,   0 stopped,   0 zombie
>  Cpu0 :  25.0% user,  50.0% system,  25.0% nice,   0.0% idle
>  Cpu1 :  62.5% user,  31.2% system,   0.0% nice,   6.2% idle
> Mem:   1033896k total,   823636k used,   210260k free,   160412k buffers
> Swap:  1060280k total,        0k used,  1060280k free,   352848k cached
>
>
> ----- Original Message ----- 
> From: "Roberto Slepetys Ferreira" <slepetys@homeworks.com.br>
> To: "Roberto Slepetys Ferreira" <slepetys@homeworks.com.br>; "Justin T.
> Gibbs" <gibbs@scsiguy.com>; <linux-kernel@vger.kernel.org>
> Sent: Thursday, July 03, 2003 11:29 AM
> Subject: Re: Probably 2.4 kernel or AIC7xxx module trouble
>
>
> > Oops....
> > The Linux box halted again, after 12 hours of operating normally.
> >
> > The strangest part is that there isn't any message in /var/log/messages;
> > the system simply stops responding, and another strange thing is that the
> > top command gave me:
> >
> > 15:10:15     up   33 min, 1 user, load average: 1.06, 1.17, 1.14
> > 91 processes: 90 sleeping, 1 running, 0 zombie, 0 stopped
> > CPU0 states:   0.0% user   3.1% system    0.0% nice   0.0% iowait  96.2% idle
> > CPU1 states:   0.0% user   0.1% system    0.0% nice   0.0% iowait  99.2% idle
> > Mem:   513172k av,  437740k used,   75432k free,       0k shrd,   17500k buff
> >                     258436k actv,   41792k in_d,   76644k in_c
> > Swap: 1060088k av,      44k used, 1060044k free                  339860k cached
> >
> >   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
> >     9 root      16   0     0    0     0 SW    0.8  0.0   0:02   0 kscand/Normal
> >    31 root      15   0     0    0     0 DW    0.5  0.0   0:05   0 raid1syncd
> >  2425 root      15   0  1088 1088   864 R     0.2  0.2   0:00   1 top
> >     1 root      15   0   396  396   344 S     0.0  0.0   0:03   1 init
> >     2 root      RT   0     0    0     0 SW    0.0  0.0   0:00   0 migration/0
> > ... others...
> >
> > Meaning that the load average is inconsistent with the CPU usage.
> >
> > I really have no idea where to find some clue about what is going on.
> >
> > Thanks
> > Roberto Slepetys
> >
> >
> >
> > ----- Original Message ----- 
> > From: "Roberto Slepetys Ferreira" <slepetys@homeworks.com.br>
> > To: "Justin T. Gibbs" <gibbs@scsiguy.com>; <linux-kernel@vger.kernel.org>
> > Sent: Wednesday, July 02, 2003 6:00 PM
> > Subject: Re: Probably 2.4 kernel or AIC7xxx module trouble
> >
> >
> > > Hi,
> > >
> > > I upgraded it to 6.2.36, using RPM, and I am running some heavy tests.
> > >
> > > So far it's OK, and with this kind of test the old configuration gave
> > > some trouble.
> > >
> > > Thanks
> > > Slepetys
> > >
> > > ----- Original Message ----- 
> > > From: "Justin T. Gibbs" <gibbs@scsiguy.com>
> > > To: "Roberto Slepetys Ferreira" <slepetys@homeworks.com.br>;
> > > <linux-kernel@vger.kernel.org>
> > > Sent: Wednesday, July 02, 2003 12:14 PM
> > > Subject: Re: Probably 2.4 kernel or AIC7xxx module trouble
> > >
> > >
> > > > > The system halts easily if I do a large I/O, like reindexing a database,
> > > > > giving me some messages like: (scsi0:A:1:0): Locking max tag count at 128...
> > > >
> > > > The "Locking max tag count" messages are normal.  It means the SCSI
> > > > driver was able to determine the maximum queue depth of your drive.
> > > >
> > > > 6.2.8 is rather old.  I don't know that upgrading the aic7xxx driver
> > > > will solve your problem, but it might be worth a shot.  The latest
> > > > is available here:
> > > >
> > > > http://people.FreeBSD.org/~gibbs/linux/SRC/
> > > >
> > > > After upgrading, you should be at 6.2.36.
> > > >
> > > > --
> > > > Justin
> >
> >
>
>



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Probably 2.4 kernel or AIC7xxx module trouble
  2003-07-02 21:00   ` Roberto Slepetys Ferreira
@ 2003-07-03 18:29     ` Roberto Slepetys Ferreira
       [not found]       ` <13e101c3419d$f62f9410$3400a8c0@W2RZ8L4S02>
  2003-07-03 22:49       ` Matthias Andree
  0 siblings, 2 replies; 14+ messages in thread
From: Roberto Slepetys Ferreira @ 2003-07-03 18:29 UTC (permalink / raw)
  To: Roberto Slepetys Ferreira, Justin T. Gibbs, linux-kernel

Oops....
The Linux box halted again, after 12 hours of operating normally.

The strangest part is that there isn't any message in /var/log/messages;
the system simply stops responding, and another strange thing is that the
top command gave me:

15:10:15     up   33 min, 1 user, load average: 1.06, 1.17, 1.14
91 processes: 90 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states:   0.0% user   3.1% system    0.0% nice   0.0% iowait  96.2% idle
CPU1 states:   0.0% user   0.1% system    0.0% nice   0.0% iowait  99.2% idle
Mem:   513172k av,  437740k used,   75432k free,       0k shrd,   17500k buff
                    258436k actv,   41792k in_d,   76644k in_c
Swap: 1060088k av,      44k used, 1060044k free                  339860k cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
    9 root      16   0     0    0     0 SW    0.8  0.0   0:02   0 kscand/Normal
   31 root      15   0     0    0     0 DW    0.5  0.0   0:05   0 raid1syncd
 2425 root      15   0  1088 1088   864 R     0.2  0.2   0:00   1 top
    1 root      15   0   396  396   344 S     0.0  0.0   0:03   1 init
    2 root      RT   0     0    0     0 SW    0.0  0.0   0:00   0 migration/0
... others...

Meaning that the load average is inconsistent with the CPU usage.

I really have no idea where to find some clue about what is going on.
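
For context, the Linux load average counts tasks in uninterruptible sleep (state `D`) as well as runnable ones, so a load near 1 with idle CPUs can occur when a single task sits in `D` state, as `raid1syncd` does in the listing above. The sketch below parses the `/proc/loadavg` format. The sample string is hypothetical, assembled from the numbers in the `top` output, and the names are illustrative.

```python
from typing import NamedTuple

class LoadAvg(NamedTuple):
    one: float
    five: float
    fifteen: float
    runnable: int
    total: int
    last_pid: int

def parse_loadavg(text):
    """Parse the five whitespace-separated fields of /proc/loadavg."""
    one, five, fifteen, tasks, last_pid = text.split()
    runnable, total = tasks.split("/")
    return LoadAvg(float(one), float(five), float(fifteen),
                   int(runnable), int(total), int(last_pid))

# Hypothetical contents; on a running system read /proc/loadavg instead.
load = parse_loadavg("1.06 1.17 1.14 1/91 2425\n")
print(load.one, load.runnable, load.total)
```

Comparing the one-minute value with the number of `D`-state tasks is a quick way to see whether blocked I/O, rather than CPU work, accounts for the load.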

Thanks
Roberto Slepetys



----- Original Message ----- 
From: "Roberto Slepetys Ferreira" <slepetys@homeworks.com.br>
To: "Justin T. Gibbs" <gibbs@scsiguy.com>; <linux-kernel@vger.kernel.org>
Sent: Wednesday, July 02, 2003 6:00 PM
Subject: Re: Probably 2.4 kernel or AIC7xxx module trouble


> Hi,
>
> I upgraded it to 6.2.36, using RPM, and I am running some heavy tests.
>
> So far it's OK, and with this kind of test the old configuration gave
> some trouble.
>
> Thanks
> Slepetys
>
> ----- Original Message ----- 
> From: "Justin T. Gibbs" <gibbs@scsiguy.com>
> To: "Roberto Slepetys Ferreira" <slepetys@homeworks.com.br>;
> <linux-kernel@vger.kernel.org>
> Sent: Wednesday, July 02, 2003 12:14 PM
> Subject: Re: Probably 2.4 kernel or AIC7xxx module trouble
>
>
> > > The system halts easily if I do a large I/O, like reindexing a database,
> > > giving me some messages like: (scsi0:A:1:0): Locking max tag count at 128...
> >
> > The "Locking max tag count" messages are normal.  It means the SCSI
> > driver was able to determine the maximum queue depth of your drive.
> >
> > 6.2.8 is rather old.  I don't know that upgrading the aic7xxx driver
> > will solve your problem, but it might be worth a shot.  The latest
> > is available here:
> >
> > http://people.FreeBSD.org/~gibbs/linux/SRC/
> >
> > After upgrading, you should be at 6.2.36.
> >
> > --
> > Justin



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Probably 2.4 kernel or AIC7xxx module trouble
  2003-07-02 15:14 ` Justin T. Gibbs
@ 2003-07-02 21:00   ` Roberto Slepetys Ferreira
  2003-07-03 18:29     ` Roberto Slepetys Ferreira
  0 siblings, 1 reply; 14+ messages in thread
From: Roberto Slepetys Ferreira @ 2003-07-02 21:00 UTC (permalink / raw)
  To: Justin T. Gibbs, linux-kernel

Hi,

I upgraded it to 6.2.36, using RPM, and I am running some heavy tests.

So far it's OK, and with this kind of test the old configuration gave
some trouble.

Thanks
Slepetys

----- Original Message ----- 
From: "Justin T. Gibbs" <gibbs@scsiguy.com>
To: "Roberto Slepetys Ferreira" <slepetys@homeworks.com.br>;
<linux-kernel@vger.kernel.org>
Sent: Wednesday, July 02, 2003 12:14 PM
Subject: Re: Probably 2.4 kernel or AIC7xxx module trouble


> > The system halts easily if I do a large I/O, like reindexing a database,
> > giving me some messages like: (scsi0:A:1:0): Locking max tag count at 128...
>
> The "Locking max tag count" messages are normal.  It means the SCSI
> driver was able to determine the maximum queue depth of your drive.
>
> 6.2.8 is rather old.  I don't know that upgrading the aic7xxx driver
> will solve your problem, but it might be worth a shot.  The latest
> is available here:
>
> http://people.FreeBSD.org/~gibbs/linux/SRC/
>
> After upgrading, you should be at 6.2.36.
>
> --
> Justin
>



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Probably 2.4 kernel or AIC7xxx module trouble
  2003-07-02 14:44 Roberto Slepetys Ferreira
@ 2003-07-02 15:14 ` Justin T. Gibbs
  2003-07-02 21:00   ` Roberto Slepetys Ferreira
  0 siblings, 1 reply; 14+ messages in thread
From: Justin T. Gibbs @ 2003-07-02 15:14 UTC (permalink / raw)
  To: Roberto Slepetys Ferreira, linux-kernel

> The system halts easily if I do a large I/O, like reindexing a database,
> giving me some messages like: (scsi0:A:1:0): Locking max tag count at 128...

The "Locking max tag count" messages are normal.  It means the SCSI
driver was able to determine the maximum queue depth of your drive.

6.2.8 is rather old.  I don't know that upgrading the aic7xxx driver
will solve your problem, but it might be worth a shot.  The latest
is available here:

http://people.FreeBSD.org/~gibbs/linux/SRC/

After upgrading, you should be at 6.2.36.

--
Justin


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Probably 2.4 kernel or AIC7xxx module trouble
@ 2003-07-02 14:44 Roberto Slepetys Ferreira
  2003-07-02 15:14 ` Justin T. Gibbs
  0 siblings, 1 reply; 14+ messages in thread
From: Roberto Slepetys Ferreira @ 2003-07-02 14:44 UTC (permalink / raw)
  To: linux-kernel; +Cc: Slepetys

Hi linuxers,

I am having strange halt trouble on my Linux box since I installed the
2.4.20 kernel; it didn't happen when I was running the 2.2 kernel.

The system works fine for hours or days, and then it halts. The keyboard
halts, remote access halts, but I can still ping it!

I found some messages in this newsgroup describing the same characteristics,
but the problem was unsolved.

The system halts easily if I do a large I/O, like reindexing a database,
giving me some messages like: (scsi0:A:1:0): Locking max tag count at 128...

I have a Red Hat 9 distribution installed on an Intel STL2 server board, with
2 Pentium III 933 MHz CPUs and 512 MB RAM, 2 SCSI disks (18 GB) and 2 IDE disks
(40 GB), running in multiple RAID 1 and RAID 0 configurations.

The dmesg gives me the following data about the SCSI:

SCSI subsystem driver Revision: 1.00
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.8
        <Adaptec aic7899 Ultra160 SCSI adapter>
        aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.8
        <Adaptec aic7899 Ultra160 SCSI adapter>
        aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs

blk: queue c24fd618, I/O limit 4095Mb (mask 0xffffffff)
  Vendor: FUJITSU   Model: MAJ3182MC         Rev: 0114
  Type:   Direct-Access                      ANSI SCSI revision: 04
blk: queue c24fd418, I/O limit 4095Mb (mask 0xffffffff)
  Vendor: IBM       Model: DDYS-T18350M      Rev: SA2A
  Type:   Direct-Access                      ANSI SCSI revision: 03
blk: queue c24fd218, I/O limit 4095Mb (mask 0xffffffff)
  Vendor: ESG-SHV   Model: SCA HSBP M14      Rev: 0.03
  Type:   Processor                          ANSI SCSI revision: 02
blk: queue dfc58e18, I/O limit 4095Mb (mask 0xffffffff)
scsi0:A:0:0: Tagged Queuing enabled.  Depth 253
scsi0:A:1:0: Tagged Queuing enabled.  Depth 253
  Vendor: NEC       Model: CD-ROM DRIVE:466  Rev: 1.26
  Type:   CD-ROM                             ANSI SCSI revision: 02
blk: queue dfc58a18, I/O limit 4095Mb (mask 0xffffffff)
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi disk sdb at scsi0, channel 0, id 1, lun 0
(scsi0:A:0): 160.000MB/s transfers (80.000MHz DT, offset 127, 16bit)
SCSI device sda: 35694904 512-byte hdwr sectors (18276 MB)
 sda: sda1 sda2 sda3
(scsi0:A:1): 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
SCSI device sdb: 35843670 512-byte hdwr sectors (18352 MB)
 sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 >
md: raid0 personality registered as nr 2
md: raid1 personality registered as nr 3
Journalled Block Device driver loaded
md: Autodetecting RAID arrays.
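
The capacities in parentheses in the log above follow from the sector counts: each sector is 512 bytes and the figures are in decimal megabytes. A quick check, assuming rounding to the nearest megabyte (the kernel's exact rounding is not shown in the log, and the function name is hypothetical):

```python
SECTOR_BYTES = 512

def sectors_to_mb(sectors):
    """Convert a count of 512-byte sectors to decimal megabytes (10**6 bytes)."""
    return round(sectors * SECTOR_BYTES / 10**6)

# Sector counts taken from the "SCSI device" lines in the dmesg output.
for device, sectors in (("sda", 35694904), ("sdb", 35843670)):
    print(device, sectors_to_mb(sectors), "MB")
```

Both results match the logged values of 18276 MB and 18352 MB, so each SCSI disk holds roughly 18 GB.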

The /proc/scsi/aic7xxx/0 gives me:

Adaptec AIC7xxx driver version: 6.2.8
aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

Serial EEPROM:
0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a
0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a
0x58e4 0x5c5e 0x2807 0x0010 0xffff 0xffff 0xffff 0xffff
0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0x0250 0x133f

Channel A Target 0 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
        Goal: 160.000MB/s transfers (80.000MHz DT, offset 127, 16bit)
        Curr: 160.000MB/s transfers (80.000MHz DT, offset 127, 16bit)
        Channel A Target 0 Lun 0 Settings
                Commands Queued 488876
                Commands Active 0
                Command Openings 125
                Max Tagged Openings 253
                Device Queue Frozen Count 0
Channel A Target 1 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
        Goal: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
        Curr: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
        Channel A Target 1 Lun 0 Settings
                Commands Queued 465642
                Commands Active 0
                Command Openings 128
                Max Tagged Openings 128
                Device Queue Frozen Count 0
Channel A Target 2 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 3 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 4 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 5 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 6 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
        Goal: 3.300MB/s transfers
        Curr: 3.300MB/s transfers
        Channel A Target 6 Lun 0 Settings
                Commands Queued 1
                Commands Active 0
                Command Openings 1
                Max Tagged Openings 0
                Device Queue Frozen Count 0
Channel A Target 7 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 8 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 9 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 10 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 11 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 12 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 13 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 14 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 15 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
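
The "Locking max tag count at 128" message quoted earlier in this report appears to correspond to the `Max Tagged Openings 128` line for target 1 in this dump. As an illustration, the sketch below pulls that value per target out of text in this layout. The function name is hypothetical, the parsing assumes the exact labels shown above, and a target with several LUNs would keep only the last value seen.

```python
import re

def max_tagged_openings(proc_text):
    """Map (channel, target) to its 'Max Tagged Openings' value."""
    depths = {}
    current = None
    for line in proc_text.splitlines():
        header = re.match(r"\s*Channel (\w) Target (\d+) Lun \d+ Settings", line)
        if header:
            current = (header.group(1), int(header.group(2)))
            continue
        value = re.match(r"\s*Max Tagged Openings (\d+)", line)
        if value and current is not None:
            depths[current] = int(value.group(1))
    return depths

# Excerpt in the layout of the /proc/scsi/aic7xxx/0 dump above.
sample = """\
Channel A Target 0 Negotiation Settings
        Channel A Target 0 Lun 0 Settings
                Commands Queued 488876
                Max Tagged Openings 253
Channel A Target 1 Negotiation Settings
        Channel A Target 1 Lun 0 Settings
                Commands Queued 465642
                Max Tagged Openings 128
"""
print(max_tagged_openings(sample))
```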

And /proc/scsi/aic7xxx/1 is:

Adaptec AIC7xxx driver version: 6.2.8
aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs

Serial EEPROM:
0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a
0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a
0x58e4 0x5c5e 0x2807 0x0010 0xffff 0xffff 0xffff 0xffff
0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0x0250 0x133f

Channel A Target 0 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 1 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 2 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 3 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 4 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 5 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
        Goal: 20.000MB/s transfers (20.000MHz, offset 16)
        Curr: 20.000MB/s transfers (20.000MHz, offset 16)
        Channel A Target 5 Lun 0 Settings
                Commands Queued 193
                Commands Active 0
                Command Openings 1
                Max Tagged Openings 0
                Device Queue Frozen Count 0
Channel A Target 6 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 7 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 8 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 9 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 10 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 11 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 12 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 13 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 14 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)
Channel A Target 15 Negotiation Settings
        User: 160.000MB/s transfers (80.000MHz DT, offset 255, 16bit)

How can I tell whether this is a hardware problem or a kernel + module problem?

Thanks
Slepetys



^ permalink raw reply	[flat|nested] 14+ messages in thread

