linux-kernel.vger.kernel.org archive mirror
* 2.4.9-ac1 RAID-5 resync causes PPP connection to be unusable
@ 2001-08-30 17:21 Kevin P. Fleming
  2001-08-30 17:59 ` Doug Ledford
  0 siblings, 1 reply; 7+ messages in thread
From: Kevin P. Fleming @ 2001-08-30 17:21 UTC (permalink / raw)
  To: linux-kernel

I ran into a very strange problem yesterday... my server here (a 700 MHz
Celeron with 256MiB RAM and four ~40G disks) has two RAID-5 arrays (using
the standard kernel MD driver) configured across those four drives. For some
reason definitely related to operator error, the machine crashed and needed
to resync the arrays after being rebooted.

Everything was working fine; interactive response was just fine even though
the drives were just cranking away doing their resync. I then brought up my
PPP Internet connection, which came up just fine. However, I was _not_ able
to actually communicate with any 'Net hosts. Watching the modem lights, it
appeared that my packets were going out, and responses were coming back, but
the responses never made it up to the userspace applications.

I even dropped and reestablished the PPP connection twice, thinking I got a
bad connection to the ISP, but there was no improvement. When the RAID-5
resync was complete, suddenly things began working just fine. While the
resync was happening, top showed "raid5syncd" with PRI 19 and NI 19 using
about 25-30% CPU, and "raid5" with PRI -1 and NI -20 using about 60-65% CPU.

I can probably reproduce this pretty easily, if anyone is interested and can
give me some idea where to look for the cause...



* Re: 2.4.9-ac1 RAID-5 resync causes PPP connection to be unusable
  2001-08-30 17:21 2.4.9-ac1 RAID-5 resync causes PPP connection to be unusable Kevin P. Fleming
@ 2001-08-30 17:59 ` Doug Ledford
  2001-08-31  4:23   ` Kevin P. Fleming
  0 siblings, 1 reply; 7+ messages in thread
From: Doug Ledford @ 2001-08-30 17:59 UTC (permalink / raw)
  To: Kevin P. Fleming; +Cc: linux-kernel

Kevin P. Fleming wrote:

> I ran into a very strange problem yesterday... my server here, which is a
> 700 MHz Celeron, 256MiB RAM, four ~40G disks has two RAID-5 arrays (using
> the standard kernel MD driver) configured across those four drives. For some
> reason definitely related to operator error, the machine crashed and needed
> to resync the arrays after being rebooted.
> 
> Everything was working fine, interactive response was just fine even though
> the drives were just cranking away doing their resync. I then brought up my
> PPP Internet connection, which came up just fine. However, I was _not_ able
> to actually communicate with any 'Net hosts.


[ snip ]


> I can probably reproduce this pretty easily, if anyone is interested and can
> give me some idea where to look for the cause...


Don't bother.  The problem is that your disks are IDE disks and you 
don't have IRQ unmasking enabled on some/all of them.  As long as that's 
the case, heavy disk activity (whether it's a RAID5 resync or a bonnie 
run or untar'ing a kernel archive) will always cause your PPP connection 
to quit working due to dropped serial data and therefore corrupted PPP 
packets.
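
One way to check whether serial data really is being dropped is to look at
the receiver overrun counters the Linux serial driver reports in
/proc/tty/driver/serial (the "oe:" field, which is only printed when it is
nonzero). The sketch below is a minimal parser for that text. The function
name and the sample input are hypothetical, made up for illustration, and
the exact line layout may vary between kernel versions.

```python
import re
from typing import Dict


def overrun_counts(serial_info: str) -> Dict[int, int]:
    """Map serial line number -> receiver overrun count ("oe:" field).

    Lines without an "oe:" field are reported as 0 overruns.
    """
    counts = {}
    for line in serial_info.splitlines():
        port = re.match(r"\s*(\d+):\s+uart:", line)
        if not port:
            continue  # header line or anything else we do not recognise
        oe = re.search(r"\boe:(\d+)", line)
        counts[int(port.group(1))] = int(oe.group(1)) if oe else 0
    return counts


# Hypothetical sample text, NOT taken from the machine in this thread.
SAMPLE = """serinfo:1.0 driver revision:
0: uart:16550A port:000003F8 irq:4 tx:5120 rx:20480 oe:37 RTS|DTR
1: uart:unknown port:000002F8 irq:3
"""

if __name__ == "__main__":
    print(overrun_counts(SAMPLE))
```

If the overrun count for the modem's port climbs while the disks are busy,
that would be consistent with the explanation above.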


-- 

  Doug Ledford <dledford@redhat.com>  http://people.redhat.com/dledford
       Please check my web site for aic7xxx updates/answers before
                       e-mailing me about problems



* Re: 2.4.9-ac1 RAID-5 resync causes PPP connection to be unusable
  2001-08-30 17:59 ` Doug Ledford
@ 2001-08-31  4:23   ` Kevin P. Fleming
  2001-08-31 15:31     ` Andreas Dilger
  0 siblings, 1 reply; 7+ messages in thread
From: Kevin P. Fleming @ 2001-08-31  4:23 UTC (permalink / raw)
  To: Doug Ledford; +Cc: linux-kernel

OK, I see that now... and it looks like the risks associated with setting
the unmaskirq flag on my drives (none of the four drives have it set now)
are too great to be worth playing with. I'll just not use my PPP
connection during these particularly heavy disk activity moments. Thanks for
the quick response.



* Re: 2.4.9-ac1 RAID-5 resync causes PPP connection to be unusable
  2001-08-31  4:23   ` Kevin P. Fleming
@ 2001-08-31 15:31     ` Andreas Dilger
  2001-08-31 15:43       ` Martin Josefsson
  0 siblings, 1 reply; 7+ messages in thread
From: Andreas Dilger @ 2001-08-31 15:31 UTC (permalink / raw)
  To: Kevin P. Fleming; +Cc: Doug Ledford, linux-kernel

On Aug 30, 2001  21:23 -0700, Kevin P. Fleming wrote:
> OK, I see that now... and it looks like the risks associated with setting
> the unmaskirq flags on my drives (none of the four drives have it set now)
> are too great to be worth playing with it. I'll just not use my PPP
> connection during these particularly heavy disk activity moments. Thanks for
> the quick response.

There was a kernel patch (or possibly a user-space tool) which allowed
one to change the "priority" of IRQs and their handlers.  This was back
in the 1.2 or 2.0 days, when _any_ disk or other interrupt activity might
be enough to cause problems for serial connections (especially if you
only had a 16450 UART (1 byte buffer) instead of a 16550 (16 byte buffer)).
You could make your serial interrupt (handler) take priority over disk
interrupts.
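
To put rough numbers on the buffer sizes, the sketch below estimates how
long the serial interrupt can go unserviced before received bytes are lost.
It assumes 8N1 framing (10 bits on the wire per byte) and a 115200 bps port
speed, neither of which is stated in this thread, and it treats the whole
buffer as available slack, which is optimistic for a real 16550. The
function name is hypothetical.

```python
def latency_budget_us(baud: int, slack_bytes: int, bits_per_byte: int = 10) -> float:
    """Microseconds of incoming data that `slack_bytes` of buffer can absorb.

    bits_per_byte=10 assumes 8N1 framing: start bit + 8 data bits + stop bit.
    """
    return slack_bytes * bits_per_byte * 1_000_000 / baud


if __name__ == "__main__":
    for name, slack in (("16450, 1 byte", 1), ("16550, 16 bytes", 16)):
        print(f"{name}: about {latency_budget_us(115200, slack):.0f} us")
```

Under those assumptions the 1 byte buffer tolerates well under a tenth of a
millisecond of interrupt latency, and the 16 byte buffer a little over a
millisecond.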

Maybe Ted Ts'o or other long-time Linux folks will know what it was actually
called, and whether it is still applicable to modern hardware/kernels.

Cheers, Andreas
-- 
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert



* Re: 2.4.9-ac1 RAID-5 resync causes PPP connection to be unusable
  2001-08-31 15:31     ` Andreas Dilger
@ 2001-08-31 15:43       ` Martin Josefsson
  0 siblings, 0 replies; 7+ messages in thread
From: Martin Josefsson @ 2001-08-31 15:43 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Kevin P. Fleming, Doug Ledford, linux-kernel

On Fri, 31 Aug 2001, Andreas Dilger wrote:

> On Aug 30, 2001  21:23 -0700, Kevin P. Fleming wrote:
> > OK, I see that now... and it looks like the risks associated with setting
> > the unmaskirq flags on my drives (none of the four drives have it set now)
> > are too great to be worth playing with it. I'll just not use my PPP
> > connection during these particularly heavy disk activity moments. Thanks for
> > the quick response.
> 
> There was a kernel patch (or possibly a user-space tool) which allowed
> one to change the "priority" of IRQs and their handlers.  This was back
> in the 1.2 or 2.0 days, when _any_ disk or other interrupt activity might
> be enough to cause problems for serial connections (especially if you
> only had a 16450 UART (1 byte buffer) instead of a 16550 (16 byte buffer).
> You could make your serial interrupt (handler) take priority over disk
> interrupts.
> 
> Maybe Ted Ts'o or other long-time Linux folks will know what was actually
> called, and whether it is still applicable to modern hardware/kernel.

It was called irqtune, http://www.best.com/~cae/irqtune/
But I don't know if it still works with newer hardware/kernels.

/Martin



* Re: 2.4.9-ac1 RAID-5 resync causes PPP connection to be unusable
@ 2001-08-31 16:02 david
  0 siblings, 0 replies; 7+ messages in thread
From: david @ 2001-08-31 16:02 UTC (permalink / raw)
  To: Kevin P. Fleming; +Cc: linux-kernel


"Kevin P. Fleming" <kevin@labsysgrp.com>  wrote:
> OK, I see that now... and it looks like the risks associated with
> setting the unmaskirq flags on my drives (none of the four drives have
> it set now) are too great to be worth playing with it. I'll just not
> use my PPP connection during these particularly heavy disk activity
> moments. Thanks for the quick response.

I don't think that the unmask irq thing is really a problem for any modern
system.  Since the days of 1.2 I've run every system with -u 1.  It's not
a case of '-u 1' giving a .01% chance of corruption on any system; rather,
'-u 1' gives a 100% chance of corruption on certain systems.  See the
difference?

In short, try the -u 1 cautiously (maybe on a r/o fs, or have backups) if
you're paranoid, but if your system is modern, have no fears.

DISCLAIMER: *if* your system does eat itself, it wasn't me that told you
it wouldn't.

David



-- 
David Mansfield                                           (718) 963-2020
david@ultramaster.com
Ultramaster Group, LLC                               www.ultramaster.com



* 2.4.9-ac1 RAID-5 resync causes PPP connection to be unusable
@ 2001-08-31  2:12 Samium Gromoff
  0 siblings, 0 replies; 7+ messages in thread
From: Samium Gromoff @ 2001-08-31  2:12 UTC (permalink / raw)
  To: linux-kernel; +Cc: kevin

> I can probably reproduce this pretty easily, if anyone is interested and can
> give me some idea where to look for the cause...
      So did "hdparm -u1 /dev/yourdrive(s)" fix the problem?
   I have seen something quite like that, though that was an Am5x86 with
   an IBM Deskstar 75GXP...
      In my case unmaskirq didn't help,
   and the ppp interface error count was certainly increasing.

cheers,
 Sam


