linux-kernel.vger.kernel.org archive mirror
* Antwort: Re: Kernel Hangs 2.4.16 on heay io Oracle and Tivolie TSM
@ 2002-03-04  7:35 Oliver.Schersand
  2002-03-04 15:07 ` Hans Reiser
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Oliver.Schersand @ 2002-03-04  7:35 UTC (permalink / raw)
  To: Alessandro Suardi, use-oracle, suse-linux-e, mason, linux-kernel

Hi,

On Saturday I had a nice day: 16 hours spent looking for a workaround to
make Linux stable.
I moved the server from reiserfs to ext2 for all datafile areas. The move
with tar ran without any crash. I saw about 60 to 75 MB/second transfer
(read + write) during the move of the Oracle datafiles.

After starting Oracle and backing up the open datafiles (I know this is
nonsense, but it's a good stress test) I got a crash. On reiserfs this
would crash immediately. On ext2 the crash happened after about 2.5 hours
of backup (about 80 GB of datafiles).
After this I switched back to kernel version 2.2.19. ---> The system now
runs without a crash.
On other servers without Oracle, but which do have TSM backup, we have had
no problems with 2.4.16 (at the moment only about 15 servers).

It seems that you are right and we have a serious VM bug. This bug is only
visible if you use Oracle and TSM (Tivoli Storage Manager) .... Strange.

Kind regards

Oliver Schersand


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Antwort: Re: Kernel Hangs 2.4.16 on heay io Oracle and Tivolie TSM
  2002-03-04  7:35 Antwort: Re: Kernel Hangs 2.4.16 on heay io Oracle and Tivolie TSM Oliver.Schersand
@ 2002-03-04 15:07 ` Hans Reiser
  2002-03-05 17:06 ` Chris Mason
  2002-03-07  7:13 ` Petro
  2 siblings, 0 replies; 7+ messages in thread
From: Hans Reiser @ 2002-03-04 15:07 UTC (permalink / raw)
  To: Oliver.Schersand
  Cc: Alessandro Suardi, use-oracle, suse-linux-e, mason, linux-kernel

Oliver.Schersand@BASF-IT-Services.com wrote:

>Hi,
>
>On Saturday I had a nice day: 16 hours spent looking for a workaround to
>make Linux stable.
>I moved the server from reiserfs to ext2 for all datafile areas. The move
>with tar ran without any crash. I saw about 60 to 75 MB/second transfer
>(read + write) during the move of the Oracle datafiles.
>
>After starting Oracle and backing up the open datafiles (I know this is
>nonsense, but it's a good stress test) I got a crash. On reiserfs this
>would crash immediately. On ext2 the crash happened after about 2.5 hours
>of backup (about 80 GB of datafiles).
>After this I switched back to kernel version 2.2.19. ---> The system now
>runs without a crash.
>On other servers without Oracle, but which do have TSM backup, we have had
>no problems with 2.4.16 (at the moment only about 15 servers).
>
>It seems that you are right and we have a serious VM bug. This bug is only
>visible if you use Oracle and TSM (Tivoli Storage Manager) .... Strange.
>
>Kind regards
>
>Oliver Schersand
>
>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at  http://www.tux.org/lkml/
>
>
Wasn't 2.4.16 the known unstable vm release of 2.4?  Why do you go to 
such effort to stick with a bad kernel?  Go to 2.4.18.

Hans



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Antwort: Re: Kernel Hangs 2.4.16 on heay io Oracle and Tivolie TSM
  2002-03-04  7:35 Antwort: Re: Kernel Hangs 2.4.16 on heay io Oracle and Tivolie TSM Oliver.Schersand
  2002-03-04 15:07 ` Hans Reiser
@ 2002-03-05 17:06 ` Chris Mason
  2002-03-07  7:14   ` Petro
  2002-03-07  7:13 ` Petro
  2 siblings, 1 reply; 7+ messages in thread
From: Chris Mason @ 2002-03-05 17:06 UTC (permalink / raw)
  To: Hans Reiser, Oliver.Schersand
  Cc: Alessandro Suardi, use-oracle, suse-linux-e, linux-kernel



On Monday, March 04, 2002 06:07:19 PM +0300 Hans Reiser <reiser@namesys.com> wrote:


> Wasn't 2.4.16 the known unstable vm release of 2.4?  Why do you go to 
> such effort to stick with a bad kernel?  Go to 2.4.18.

I'm not sure exactly which VM problems you mean, but he's running the
SuSE 2.4.16, which is heavily patched. When you're running big production
databases, upgrading to the kernel of the week isn't an option.

I think we've found the bug; it looks like a race in the proc code.

Oliver, someone will contact you a little later with instructions on
getting a kernel with the fix.  If you only see this oops during backups,
make sure you aren't trying to back up /proc.

-chris
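
One way to act on the advice about not backing up /proc is to filter the
backup path list against the mounted pseudo-filesystems before handing it
to the backup client. The following is only a sketch under stated
assumptions: the function names, the sample mount table, the device names
and the file paths are all hypothetical, and it does not use any TSM
interface. It relies only on the /proc/mounts field layout (device, mount
point, filesystem type, options).

```python
import os

# Filesystem types treated as "do not back up" in this sketch (assumed list).
PSEUDO_FS_TYPES = {"proc", "devpts"}

# Hypothetical mount table in /proc/mounts layout; device names are made up.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext2 rw 0 0
proc /proc proc rw 0 0
/dev/sdb1 /oradata ext2 rw 0 0
"""


def pseudo_mount_points(mounts_text):
    """Return the mount points whose filesystem type is in PSEUDO_FS_TYPES."""
    points = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] in PSEUDO_FS_TYPES:
            points.add(fields[1])
    return points


def is_excluded(path, excluded_points):
    """True if path is an excluded mount point or lies below one."""
    path = os.path.normpath(path)
    for point in excluded_points:
        point = os.path.normpath(point)
        if path == point or path.startswith(point.rstrip("/") + "/"):
            return True
    return False


def filter_backup_paths(paths, mounts_text):
    """Drop every path that lives on a pseudo-filesystem."""
    excluded = pseudo_mount_points(mounts_text)
    return [p for p in paths if not is_excluded(p, excluded)]


if __name__ == "__main__":
    candidates = ["/oradata/system01.dbf", "/proc/1/stat", "/proc", "/etc/fstab"]
    for kept in filter_backup_paths(candidates, SAMPLE_MOUNTS):
        print(kept)
```

On a live system the mount table would be read from /proc/mounts instead
of the sample string. Most backup clients also have their own exclude
lists, which would be the more natural place for this rule.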


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Antwort: Re: Kernel Hangs 2.4.16 on heay io Oracle and Tivolie TSM
  2002-03-04  7:35 Antwort: Re: Kernel Hangs 2.4.16 on heay io Oracle and Tivolie TSM Oliver.Schersand
  2002-03-04 15:07 ` Hans Reiser
  2002-03-05 17:06 ` Chris Mason
@ 2002-03-07  7:13 ` Petro
  2 siblings, 0 replies; 7+ messages in thread
From: Petro @ 2002-03-07  7:13 UTC (permalink / raw)
  To: Oliver.Schersand
  Cc: Alessandro Suardi, use-oracle, suse-linux-e, mason, linux-kernel

On Mon, Mar 04, 2002 at 08:35:36AM +0100, Oliver.Schersand@BASF-IT-Services.com wrote:
> happened after about 2.5 hours of backup (about 80 GB of datafiles).
> After this I switched back to kernel version 2.2.19. ---> The system now
> runs without a crash.
> On other servers without Oracle, but which do have TSM backup, we have had
> no problems with 2.4.16 (at the moment only about 15 servers).
> 
> It seems that you are right and we have a serious VM bug. This bug is only
> visible if you use Oracle and TSM (Tivoli Storage Manager) .... Strange.

    Are you getting a complete OS crash, or just Oracle going bang?

-- 
Share and Enjoy. 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Antwort: Re: Kernel Hangs 2.4.16 on heay io Oracle and Tivolie TSM
  2002-03-05 17:06 ` Chris Mason
@ 2002-03-07  7:14   ` Petro
  0 siblings, 0 replies; 7+ messages in thread
From: Petro @ 2002-03-07  7:14 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-kernel

On Tue, Mar 05, 2002 at 12:06:43PM -0500, Chris Mason wrote:
> On Monday, March 04, 2002 06:07:19 PM +0300 Hans Reiser <reiser@namesys.com> wrote:
> > Wasn't 2.4.16 the known unstable vm release of 2.4?  Why do you go to 
> > such effort to stick with a bad kernel?  Go to 2.4.18.
> I'm not sure exactly which VM problems you mean, but he's running the
> SuSE 2.4.16, which is heavily patched. When you're running big production
> databases, upgrading to the kernel of the week isn't an option.
> I think we've found the bug; it looks like a race in the proc code.
> Oliver, someone will contact you a little later with instructions on
> getting a kernel with the fix.  If you only see this oops during backups,
> make sure you aren't trying to back up /proc.

    Is this in the generic kernel, or the patches? 

-- 
Share and Enjoy. 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Antwort: Re: Kernel Hangs 2.4.16 on heay io Oracle and Tivolie TSM
@ 2002-03-11  8:16 Oliver.Schersand
  0 siblings, 0 replies; 7+ messages in thread
From: Oliver.Schersand @ 2002-03-11  8:16 UTC (permalink / raw)
  To: mason
  Cc: reiser, alessandro.suardi, suse-oracle, linux-kernel, gehringg,
	Bjoern.Macdonald

Hi,

I have switched to kernel 2.2.19. But after about 5 days (Friday) I had
the same hang on the system, which leads me to the diagnosis that we have
a possible hardware problem, or a problem in the Compaq Smart Array or
Compaq 5300 RAID array controller driver. On 2.2.19 I have cpqarray 1.0.12
and cciss 1.0.4. On the 2.4.16 kernel I have cpqarray 2.4.5 and cciss
2.4.6.

Kind regards

Oliver Schersand

---------------------- Forwarded by Oliver Schersand/BCS/BASF on
11.03.2002 08:43 ---------------------------


"James Washer" <washer@us.ibm.com> on 09.03.2002 00:07:07

To:      Chris Mason <mason@suse.com>
Cc:      Hans Reiser <reiser@namesys.com>, Oliver Schersand/BCS/BASF@EUROPE,
         Alessandro Suardi <alessandro.suardi@oracle.com>,
         use-oracle@suse.com, suse-linux-e@suse.com,
         linux-kernel@vger.kernel.org
Subject: Re: Antwort: Re: Kernel Hangs 2.4.16 on heay io Oracle and Tivolie
         TSM




Chris,

I just took a look at what little information I have available on this
situation, namely the 'block-o-oops' from many ps processes.

I'm not sure I agree with you that it is a race in proc code. There are
several ps processes that oops'd over a period of 58 seconds. My guess is
that there is (was) a process out there that has a corrupt p->sig (==
0x00003296). Hence, each time the user runs ps, the new ps trips over the
same corrupt task.

What really confuses me is what any of this has to do with the original
complaint about the system hanging. Has that behaviour gone away?

 - jim

Chris Mason <mason@suse.com>@vger.kernel.org on 03/05/2002 09:06:43 AM

Sent by:    linux-kernel-owner@vger.kernel.org


To:    Hans Reiser <reiser@namesys.com>,
       Oliver.Schersand@BASF-IT-Services.com
cc:    Alessandro Suardi <alessandro.suardi@oracle.com>,
       use-oracle@suse.com, suse-linux-e@suse.com,
       linux-kernel@vger.kernel.org
Subject:    Re: Antwort: Re: Kernel Hangs 2.4.16 on heay io Oracle and
       Tivolie TSM





On Monday, March 04, 2002 06:07:19 PM +0300 Hans Reiser
<reiser@namesys.com> wrote:


> Wasn't 2.4.16 the known unstable vm release of 2.4?  Why do you go to
> such effort to stick with a bad kernel?  Go to 2.4.18.

I'm not sure exactly which VM problems you mean, but he's running the
SuSE 2.4.16, which is heavily patched. When you're running big production
databases, upgrading to the kernel of the week isn't an option.

I think we've found the bug; it looks like a race in the proc code.

Oliver, someone will contact you a little later with instructions on
getting a kernel with the fix.  If you only see this oops during backups,
make sure you aren't trying to back up /proc.

-chris


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Antwort: Re: Kernel Hangs 2.4.16 on heay io Oracle and Tivolie TSM
@ 2002-03-08 23:07 James Washer
  0 siblings, 0 replies; 7+ messages in thread
From: James Washer @ 2002-03-08 23:07 UTC (permalink / raw)
  To: Chris Mason
  Cc: Hans Reiser, Oliver.Schersand, Alessandro Suardi, use-oracle,
	suse-linux-e, linux-kernel


Chris,

I just took a look at what little information I have available on this
situation, namely the 'block-o-oops' from many ps processes.

I'm not sure I agree with you that it is a race in proc code. There are
several ps processes that oops'd over a period of 58 seconds. My guess is
that there is (was) a process out there that has a corrupt p->sig (==
0x00003296). Hence, each time the user runs ps, the new ps trips over the
same corrupt task.

What really confuses me is what any of this has to do with the original
complaint about the system hanging. Has that behaviour gone away?

 - jim
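
The reasoning about the corrupt p->sig value can be illustrated with a
small plausibility check. This is only a sketch under loud assumptions:
the thread never states the architecture, so the code assumes 32-bit x86
with the default 3G/1G split, where kernel virtual addresses start at
0xC0000000. It also assumes p->sig is meant to hold a pointer into kernel
memory. The function and constant names are hypothetical.

```python
# Assumed: 32-bit x86, default split, kernel addresses start here.
PAGE_OFFSET = 0xC0000000
# One past the highest 32-bit address.
ADDRESS_SPACE_END = 1 << 32
# Assumed page size of 4 KiB (shift of 12 bits).
PAGE_SHIFT = 12

# The value reported in the oops output discussed above.
SUSPECT_SIG = 0x00003296


def looks_like_kernel_pointer(value, page_offset=PAGE_OFFSET):
    """True if value falls inside the assumed kernel address range."""
    return page_offset <= value < ADDRESS_SPACE_END


def page_number(value):
    """Index of the page that an address falls into."""
    return value >> PAGE_SHIFT


if __name__ == "__main__":
    print(hex(SUSPECT_SIG), "plausible kernel pointer:",
          looks_like_kernel_pointer(SUSPECT_SIG))
    print("page number:", page_number(SUSPECT_SIG))
```

Under these assumptions 0x00003296 lies far below the kernel range, so any
code that dereferences it while walking the task list would fault at the
same task each time. That would be consistent with several ps runs oopsing
in a row rather than with a timing-dependent race, though it does not rule
a race out as the original source of the corruption.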

Chris Mason <mason@suse.com>@vger.kernel.org on 03/05/2002 09:06:43 AM

Sent by:    linux-kernel-owner@vger.kernel.org


To:    Hans Reiser <reiser@namesys.com>,
       Oliver.Schersand@BASF-IT-Services.com
cc:    Alessandro Suardi <alessandro.suardi@oracle.com>,
       use-oracle@suse.com, suse-linux-e@suse.com,
       linux-kernel@vger.kernel.org
Subject:    Re: Antwort: Re: Kernel Hangs 2.4.16 on heay io Oracle and
       Tivolie TSM





On Monday, March 04, 2002 06:07:19 PM +0300 Hans Reiser
<reiser@namesys.com> wrote:


> Wasn't 2.4.16 the known unstable vm release of 2.4?  Why do you go to
> such effort to stick with a bad kernel?  Go to 2.4.18.

I'm not sure exactly which VM problems you mean, but he's running the
SuSE 2.4.16, which is heavily patched. When you're running big production
databases, upgrading to the kernel of the week isn't an option.

I think we've found the bug; it looks like a race in the proc code.

Oliver, someone will contact you a little later with instructions on
getting a kernel with the fix.  If you only see this oops during backups,
make sure you aren't trying to back up /proc.

-chris


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2002-03-11  8:14 UTC | newest]

Thread overview: 7+ messages
2002-03-04  7:35 Antwort: Re: Kernel Hangs 2.4.16 on heay io Oracle and Tivolie TSM Oliver.Schersand
2002-03-04 15:07 ` Hans Reiser
2002-03-05 17:06 ` Chris Mason
2002-03-07  7:14   ` Petro
2002-03-07  7:13 ` Petro
2002-03-08 23:07 James Washer
2002-03-11  8:16 Oliver.Schersand
