All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: Jeff Moyer <jmoyer@redhat.com>, Avi Kivity <avi@redhat.com>,
	linux-aio <linux-aio@kvack.org>,
	zach.brown@oracle.com, bcrl@kvack.org,
	linux-kernel@vger.kernel.org,
	Davide Libenzi <davidel@xmailserver.org>,
	Christoph Lameter <cl@linux-foundation.org>
Subject: Re: [PATCH] fs: fput() can be called from interrupt context
Date: Wed, 11 Mar 2009 22:47:12 -0700	[thread overview]
Message-ID: <20090311224712.fb8db075.akpm@linux-foundation.org> (raw)
In-Reply-To: <49B89B22.7080303@cosmosbay.com>

On Thu, 12 Mar 2009 06:18:26 +0100 Eric Dumazet <dada1@cosmosbay.com> wrote:

> Eric Dumazet a __crit :
> > Eric Dumazet a __crit :
> >> Eric Dumazet a __crit :
> >>> Jeff Moyer a __crit :
> >>>> Avi Kivity <avi@redhat.com> writes:
> >>>>
> >>>>> Jeff Moyer wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> Believe it or not, I get numerous questions from customers about the
> >>>>>> suggested tuning value of aio-max-nr.  aio-max-nr limits the total
> >>>>>> number of io events that can be reserved, system wide, for aio
> >>>>>> completions.  Each time io_setup is called, a ring buffer is allocated
> >>>>>> that can hold nr_events I/O completions.  That ring buffer is then
> >>>>>> mapped into the process' address space, and the pages are pinned in
> >>>>>> memory.  So, the reason for this upper limit (I believe) is to keep a
> >>>>>> malicious user from pinning all of kernel memory.  Now, this sounds like
> >>>>>> a much better job for the memlock rlimit to me, hence the following
> >>>>>> patch.
> >>>>>>   
> >>>>> Is it not possible to get rid of the pinning entirely?  Pinning
> >>>>> interferes with page migration which is important for NUMA, among
> >>>>> other issues.
> >>>> aio_complete is called from interrupt handlers, so can't block faulting
> >>>> in a page.  Zach mentions there is a possibility of handing completions
> >>>> off to a kernel thread, with all of the performance worries and extra
> >>>> bookkeeping that go along with such a scheme (to help frame my concerns,
> >>>> I often get lambasted over .5% performance regressions).
> >>> This aio_completion from interrupt handlers keep us from using SLAB_DESTROY_BY_RCU
> >>> instead of call_rcu() for "struct file" freeing.
> >>>
> >>> http://lkml.org/lkml/2008/12/17/364
> >>>
> >>> I would love if we could get rid of this mess...
> >> Speaking of that, I tried to take a look at this aio stuff and have one question.
> >>
> >> Assuming that __fput() cannot be called from interrupt context.
> >> -> fput() should not be called from interrupt context as well.
> >>
> >> How comes we call fput(req->ki_eventfd) from really_put_req()
> >> from interrupt context ?
> >>
> >> If user program closes eventfd, then inflight AIO requests can trigger
> >> a bug.
> >>
> > 
> > Path could be :
> > 
> > 1) fput() changes so that calling it from interrupt context is possible
> >    (Using a working queue to make sure __fput() is called from process context)
> > 
> > 2) Changes aio to use fput() as is (and zap its internal work_queue and aio_fput_routine() stuff)
> > 
> > 3) Once atomic_long_dec_and_test(&filp->f_count) only performed in fput(),
> >    SLAB_DESTROY_BY_RCU for "struct file" get back :)
> > 
> 
> Please find first patch against linux-2.6
> 
> Next patch (2) can cleanup aio code, but it probably can wait linux-2.6.30
> 
> Thank you
> 
> [PATCH] fs: fput() can be called from interrupt context
> 
> Current aio/eventfd code can call fput() from interrupt context, which is
> not allowed.

The changelog forgot to tell us where this happens, and under what
circumstances.

See, there might be other ways of fixing the bug,

> In order to fix the problem and prepare SLAB_DESTROY_BY_RCU use for "struct file"
> allocation/freeing in 2.6.30, we might extend existing workqueue infrastructure and
> allow fput() to be called from interrupt context.
> 
> This unfortunalty adds a pointer to 'struct file'.
> 
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> ---
>  fs/file.c               |   55 ++++++++++++++++++++++++++------------
>  fs/file_table.c         |   10 +++++-
>  include/linux/fdtable.h |    1
>  include/linux/fs.h      |    1
>  4 files changed, 49 insertions(+), 18 deletions(-)

which might not have some or all of the above problems.


I assume you're referring to really_put_req(), and commit
9c3060bedd84144653a2ad7bea32389f65598d40.

>From the above email straggle I extract "If user program closes
eventfd, then inflight AIO requests can trigger a bug" and I don't
immediately see anything in there which would prevent this.

Did you reproduce the bug, and confirm that the patch fixes it?

Are there simpler ways of fixing it?  Maybe sneak a call to
wait_for_all_aios() into the right place?  I doubt if it's performance
critical, as nobody seems to have ever hit the bug.

Bear in mind that if the bug _is_ real then it's now out there, and
we would like a fix which is usable by 2.6.<two-years-worth>.

etcetera..

  parent reply	other threads:[~2009-03-12  5:51 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-09 15:49 [patch] aio: remove aio-max-nr and instead use the memlock rlimit to limit the number of pages pinned for the aio completion ring Jeff Moyer
2009-03-09 15:54 ` [patch] factor out checks against the memlock rlimit Jeff Moyer
2009-03-09 15:59 ` [patch] man-pages: add documentation about the memlock implications of io_setup Jeff Moyer
2009-03-09 16:45   ` Michael Kerrisk
2009-03-09 16:48   ` Michael Kerrisk
2009-03-09 20:44     ` Jeff Moyer
2009-03-09 16:18 ` [patch] aio: remove aio-max-nr and instead use the memlock rlimit to limit the number of pages pinned for the aio completion ring Avi Kivity
2009-03-09 17:57   ` Jeff Moyer
2009-03-09 19:45     ` Avi Kivity
2009-03-09 20:36       ` Jamie Lokier
2009-03-10  8:36         ` Avi Kivity
2009-03-09 20:31     ` Eric Dumazet
2009-03-12  2:39       ` Eric Dumazet
2009-03-12  2:44         ` Benjamin LaHaise
2009-03-12  3:24           ` Eric Dumazet
2009-03-12  3:29             ` Benjamin LaHaise
2009-03-12  3:33               ` Eric Dumazet
2009-03-12  3:36                 ` Benjamin LaHaise
2009-03-12  3:40                   ` Eric Dumazet
2009-03-12  3:09         ` Eric Dumazet
2009-03-12  5:18           ` [PATCH] fs: fput() can be called from interrupt context Eric Dumazet
2009-03-12  5:42             ` [PATCH] aio: " Eric Dumazet
2009-03-12  5:47             ` Andrew Morton [this message]
2009-03-12  6:10               ` [PATCH] fs: " Eric Dumazet
2009-03-12  6:39                 ` Andrew Morton
2009-03-12 13:39                   ` Davide Libenzi
2009-03-13 22:34                     ` Davide Libenzi
2009-03-13 22:43                       ` Eric Dumazet
2009-03-13 23:28                     ` Trond Myklebust
2009-03-14  1:40                       ` Davide Libenzi
2009-03-14  4:02                         ` Trond Myklebust
2009-03-14 14:32                           ` Davide Libenzi
2009-03-15  1:36                             ` [patch] eventfd - remove fput() call from possible IRQ context Davide Libenzi
2009-03-15 17:44                               ` Benjamin LaHaise
2009-03-15 20:08                                 ` [patch] eventfd - remove fput() call from possible IRQ context (2nd rev) Davide Libenzi
2009-03-16 17:25                                   ` Jamie Lokier
2009-03-16 18:36                                     ` Davide Libenzi
2009-03-18 14:22                                   ` Jeff Moyer
2009-03-18 14:46                                     ` Davide Libenzi
2009-03-18 14:55                                     ` Eric Dumazet
2009-03-18 15:25                                       ` Jeff Moyer
2009-03-18 15:43                                         ` Eric Dumazet
2009-03-18 16:13                                           ` Jeff Moyer
2009-03-18 17:25                                     ` [patch] eventfd - remove fput() call from possible IRQ context (3rd rev) Davide Libenzi
2009-03-18 17:34                                       ` Jeff Moyer
2009-03-12 19:22                   ` [PATCH] fs: fput() can be called from interrupt context Eric Dumazet
2009-03-12 20:21                     ` Andrew Morton
2009-03-09 22:36 ` [patch] aio: remove aio-max-nr and instead use the memlock rlimit to limit the number of pages pinned for the aio completion ring Andrew Morton
2009-03-10 13:43   ` Jeff Moyer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090311224712.fb8db075.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=avi@redhat.com \
    --cc=bcrl@kvack.org \
    --cc=cl@linux-foundation.org \
    --cc=dada1@cosmosbay.com \
    --cc=davidel@xmailserver.org \
    --cc=jmoyer@redhat.com \
    --cc=linux-aio@kvack.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=zach.brown@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.