From: Steven Whitehouse <steve@gw.chygwyn.com>
To: ptb@it.uc3m.es
Cc: alan@lxorguk.ukuu.org.uk, chen_xiangping@emc.com,
	kumbera@yahoo.com, linux-kernel@vger.kernel.org (linux kernel)
Subject: Re: Kernel deadlock using nbd over acenic driver
Date: Thu, 16 May 2002 09:33:42 +0100 (BST)
Message-ID: <200205160833.JAA24899@gw.chygwyn.com>
In-Reply-To: <200205152143.g4FLhLs17344@oboe.it.uc3m.es> from "Peter T. Breuer" at May 15, 2002 11:43:21 PM

Hi,

> 
> (Addresses got munged locally, but as I'm postmaster, I get the mail
> after 26 bounces, so no hassle ...)
> 
Ok. I was wondering after the bounce message that I got :-)

> Let's see if I follow ...
> 
> > thanks for the info. I'm starting to form some ideas of what the problem
> > with nbd might be. Here is my initial idea of what might be going on:
> > 
> >  1. Something happens which starts lots of I/O (e.g. the ext3/xfs journal
> >     flush that Xiangping says usually triggers the problem)
> 
> Is this any kind of i/o? Such as swapping? You mean something which
> takes the i/o lock, or which generally exercises the i/o system .. And
> are there any particular characteristics to the "a lot" that you have
> in mind, such as maybe running us out of requests on that device (no), or 
> running us low on available free buffers (yes?).
> 
Probably swapping would trigger it too, though that's the "difficult" case,
so I've been ignoring that one up till now :-)

> >  2. One of the threads doing the writes blocks running the device I/O
> 
> If a thread blocks running its own device i/o queue, that would be 
> a fatal error for most of the kernel. The i/o lock is held. And -
> correct me on this - interrupts are disabled?
> 
> So I assume you are talking about "a bug in something, somewhere".
> 
No. The kernel nbd drops the io request lock before it does anything
which might block, so it's ok from that point of view. I suspect that
we'll find that it's not a bug in one particular bit of code, but two
subsystems making assumptions about how the other works which, whilst
perfectly reasonable on their own, conflict in a way that causes the
deadlock we see.
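
As a rough illustration of that locking pattern, here is a user-space
sketch. It is not the nbd driver source: the names (req_queue,
service_queue, send_if_unlocked) are hypothetical, and a C11 atomic_flag
stands in for the io request lock. The point is only the ordering: take a
request off the queue under the lock, release the lock, do the work that
might block, then re-take the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical request and queue types, for illustration only. */
struct request {
    int id;
    struct request *next;
};

struct req_queue {
    atomic_flag lock;       /* stand-in for the io request lock */
    struct request *head;   /* singly linked list of pending requests */
};

static void queue_lock(struct req_queue *q)
{
    while (atomic_flag_test_and_set(&q->lock))
        ;                   /* spin until the flag was previously clear */
}

static void queue_unlock(struct req_queue *q)
{
    atomic_flag_clear(&q->lock);
}

/* A sender may block (e.g. on a socket); it must run without the lock. */
typedef int (*send_fn)(struct req_queue *q, struct request *r);

/* Drain the queue, dropping the lock around each potentially blocking send.
 * Returns the number of requests the sender reported as successful. */
static int service_queue(struct req_queue *q, send_fn send)
{
    int sent = 0;

    assert(q != NULL && send != NULL);
    queue_lock(q);
    while (q->head != NULL) {
        struct request *r = q->head;

        q->head = r->next;  /* dequeue while the lock is held */
        r->next = NULL;
        queue_unlock(q);    /* drop the lock before anything that may block */
        if (send(q, r) == 0)
            sent++;
        queue_lock(q);      /* re-take it before touching the queue again */
    }
    queue_unlock(q);
    return sent;
}

/* Example sender: succeeds only if the lock is free while it runs. */
static int send_if_unlocked(struct req_queue *q, struct request *r)
{
    (void)r;
    if (atomic_flag_test_and_set(&q->lock))
        return -1;          /* lock was still held by the caller */
    atomic_flag_clear(&q->lock);
    return 0;
}
```

Calling service_queue(&q, send_if_unlocked) on a queue of two requests
should report both as sent, since the lock is never held across the send.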

[snip]
> 
> >     only need to have each memory zone's free pages just below pages_min
> >     at the right time to trigger this.
> 
> I don't understand the specific allusion, but I gather you are talking
> about low free pages. Yes, being run out of memory matches the reports.
> Particularly the people who are swapping over nbd under memory pressure
> are in that situation.
> 
> So - is that situation handled differently in the old VM?
> 
I'm not enough of an expert on the changes that have gone in to answer
that one; the VM isn't really my area of the kernel.

I think I've answered the other points that you make in my other reply,
which I sent a few moments ago. Let me know if I missed something.

It would be nice to have a per-device "max dirty pages" limit. Also useful
would be a per-queue priority, so that if the max dirty pages limit is
reached for that device, the driver gets higher priority on memory
allocations until the number of dirty pages has dropped below an acceptable
level. I don't know how easy or desirable it would be to implement such a
scheme generally, though.
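
To make the idea concrete, here is a minimal user-space sketch of the
bookkeeping such a scheme might need. It is not kernel code and not a
proposed patch: every name in it (dev_dirty_state, max_dirty, resume_level,
and the functions) is hypothetical, and the separate "resume" threshold is
an assumption about what "an acceptable level" could mean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device accounting; no such structure exists in the kernel. */
struct dev_dirty_state {
    size_t dirty_pages;   /* pages currently dirty against this device */
    size_t max_dirty;     /* the per-device "max dirty pages" limit */
    size_t resume_level;  /* assumed "acceptable level" that ends the boost */
    bool boosted;         /* driver currently has allocation priority */
};

/* Re-evaluate the boost after any change in the dirty page count. */
static void dev_dirty_update(struct dev_dirty_state *d)
{
    assert(d->resume_level <= d->max_dirty);
    if (d->dirty_pages >= d->max_dirty)
        d->boosted = true;            /* limit reached: raise priority */
    else if (d->dirty_pages < d->resume_level)
        d->boosted = false;           /* back below the acceptable level */
    /* in between the two thresholds, keep the current state */
}

static void dev_page_dirtied(struct dev_dirty_state *d)
{
    d->dirty_pages++;
    dev_dirty_update(d);
}

static void dev_page_cleaned(struct dev_dirty_state *d)
{
    if (d->dirty_pages > 0)
        d->dirty_pages--;
    dev_dirty_update(d);
}

/* An allocator could consult this when the driver asks for memory. */
static bool dev_has_alloc_priority(const struct dev_dirty_state *d)
{
    return d->boosted;
}
```

Using two thresholds rather than one keeps the priority from flapping when
the dirty page count hovers around the limit; whether that is the right
policy is an open question.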

Steve.
