Re: [PATCH v3 0/2] iov_iter: allow iov_iter_get_pages_alloc to allocate more pages per call

From: Al Viro <viro@ZenIV.linux.org.uk>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Linux NFS list <linux-nfs@vger.kernel.org>,
	ceph-devel@vger.kernel.org, lustre-devel@lists.lustre.org,
	v9fs-developer@lists.sourceforge.net,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Jan Kara <jack@suse.cz>, Chris Wilson <chris@chris-wilson.co.uk>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Jeff Layton <jlayton@redhat.com>
Subject: Re: [PATCH v3 0/2] iov_iter: allow iov_iter_get_pages_alloc to allocate more pages per call
Date: Sun, 5 Feb 2017 01:51:49 +0000
Message-ID: <20170205015145.GB13195@ZenIV.linux.org.uk>
In-Reply-To: <CAJfpegtVb8PKNnKe5wGMd0u0WzgLpjpVtVpqDScbrBJShLAfGw@mail.gmail.com>

On Sat, Feb 04, 2017 at 11:11:27PM +0100, Miklos Szeredi wrote:

> Well, it's not historical; at least not yet.  The deadlock is there
> alright: mmap fuse file to addr; read byte from mapped page -> page
> locked; this triggers read request served in same process but
> separate thread; write addr-headerlen to fuse dev; trying to lock same
> page -> deadlock.

Let me see if I got it straight - you have the same fuse file mmapped
in two processes, one of them being fuse server (either sharing the
entire address space, or the same area mapped in both).  Another process
faults the sucker in; filemap_fault() locks the page and goes
fuse_readpage() -> fuse_do_readpage() -> fuse_send_read() ->
fuse_request_send() -> __fuse_request_send() which puts request into
queue and goes to sleep in request_wait_answer().  Eventually, read()
on /dev/fuse (or splice(), whatever) by server picks that request and reply
is formed and fed back into /dev/fuse.  There we (in fuse_do_dev_write())
call copy_out_args(), which tries to copy into our (still locked) page
a piece of data coming from server-supplied iovec.  As it is, you
are calling get_user_pages_fast(), triggering handle_mm_fault().  Since that
malicious FPOS of a server tried to feed you the _same_ mmapped file, you
hit a deadlock.  In server's context.  Correct?
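
To make the sequence above concrete, here is a toy user-space model of it, in Python rather than kernel C.  It is only a sketch: `page_lock`, `server_reply()` and `simulate()` are made-up names, a plain lock stands in for the page lock, and a timeout stands in for what would be an indefinite sleep in the kernel.

```python
import threading

# Stand-in for the page lock that filemap_fault() takes (assumption: a plain
# non-reentrant lock is a good enough model of "page is locked").
page_lock = threading.Lock()
result = {}

def server_reply(source_is_locked_page):
    # Models the server writing its reply to /dev/fuse.  The kernel has to
    # read the server-supplied buffer; if that buffer lives in the same
    # mmapped page, reading it faults and needs the page lock.
    if source_is_locked_page:
        # The real thing would sleep here forever; the timeout exists only
        # so that the model can report the hang instead of reproducing it.
        if not page_lock.acquire(timeout=0.2):
            result["outcome"] = "stuck waiting for page lock"
            return
        page_lock.release()
    result["outcome"] = "reply delivered"

def simulate(source_is_locked_page):
    result.clear()
    with page_lock:  # faulting process holds the page lock...
        t = threading.Thread(target=server_reply, args=(source_is_locked_page,))
        t.start()
        t.join()     # ...and sleeps until the server has answered
    return result["outcome"]

print(simulate(False))
print(simulate(True))
```

In this model the hang shows up in the server thread, and it does not depend on which kernel helper happens to take the fault.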

Convoluted, but possible.  But.  Why the hell do we care whether that deadlock
hits in get_user_pages_fast() or in copy_from_user()?  Put it another way,
what difference does it make whether we take that fault with or without
FR_LOCKED in req->flags?

> The deadlock can be broken by aborting or force unmounting: return
> error for original read request; page unlocked; device write can get
> page lock and return.
> 
> The reason we need to prohibit pagefault while copying is that when
> request is aborted and the caller returns, the memory in the request
> may become invalid (e.g. data from stack).

???

IDGI.  Your request is marked aborted and should presumably fail, so
that when request_wait_answer() wakes up and finds it screwed, fuse_readpage()
would just return an error and filemap_fault() will return VM_FAULT_SIGBUS,
with page left not uptodate and _not_ inserted into page tables.  What's
leaking where?
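
The abort path as described above can be sketched as a small state model.  Again this is Python, not kernel code, and everything in it is hypothetical: the `Page` class, the `readpage()` and `fault()` helpers, and the placeholder error value are illustration only, and the constants are labels rather than the kernel's definitions.

```python
VM_FAULT_OK = 0                      # label for "fault handled" (placeholder)
VM_FAULT_SIGBUS = "VM_FAULT_SIGBUS"  # label only, not the kernel's value

class Page:
    def __init__(self):
        self.locked = False
        self.uptodate = False
        self.mapped = False   # "inserted into page tables"

def readpage(page, aborted):
    # Rough analogue of fuse_readpage(): an aborted request fails and the
    # page is unlocked without ever being marked uptodate.
    if aborted:
        page.locked = False
        return -1             # placeholder error value (assumption)
    page.uptodate = True
    page.locked = False
    return 0

def fault(page, aborted):
    # Rough analogue of filemap_fault(): lock the page, read it in, and
    # map it only if the read succeeded.
    page.locked = True
    if readpage(page, aborted) != 0:
        return VM_FAULT_SIGBUS
    page.mapped = True
    return VM_FAULT_OK

p = Page()
print(fault(p, aborted=True), p.uptodate, p.mapped)
```

Under these assumptions an aborted request leaves the page neither uptodate nor mapped, which is the state the question above is about.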
