From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand <david@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Darrick J . Wong" <djwong@kernel.org>,
	John Hubbard <jhubbard@nvidia.com>,
	Jason Gunthorpe <jgg@nvidia.com>, Hugh Dickins <hughd@google.com>
Subject: [PATCH v1 0/2] mm/madvise: make MADV_POPULATE_(READ|WRITE) handle VM_FAULT_RETRY properly
Date: Thu, 14 Mar 2024 17:12:58 +0100
Message-ID: <20240314161300.382526-1-david@redhat.com>

Darrick reports that in some cases where pread() would fail with -EIO and
mmap()+access would generate a SIGBUS signal, MADV_POPULATE_READ /
MADV_POPULATE_WRITE will keep retrying forever and not fail with -EFAULT.

It all boils down to missing VM_FAULT_RETRY handling. Let's try to handle
that in a better way, similar to how ordinary GUP handles it.

Details in patch #1. In short, move special MADV_POPULATE_(READ|WRITE)
VMA handling into __get_user_pages(), and make faultin_page_range()
call __get_user_pages_locked(), which handles VM_FAULT_RETRY. Further,
avoid the now-useless madvise VMA walk, because __get_user_pages() will
perform the VMA lookup either way.
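
As a rough illustration of why the retry handling matters, the following is
a toy userspace model, not kernel code. All toy_* names are hypothetical,
and the model rests on the assumption that a fault handler keeps asking for
a retry until the caller marks the attempt as already tried, at which point
it reports the hard error.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are kernel definitions. */
enum toy_fault { TOY_FAULT_OK, TOY_FAULT_RETRY, TOY_FAULT_SIGBUS };

/*
 * Hypothetical fault handler for a page whose backing I/O always fails:
 * an unmarked attempt asks the caller to retry, an attempt marked as
 * "tried" reports the hard error.
 */
static enum toy_fault toy_handle_fault(bool tried)
{
	return tried ? TOY_FAULT_SIGBUS : TOY_FAULT_RETRY;
}

/* Caller that never marks the retry; bounded so the demo terminates. */
static int toy_populate_without_tried(int max_attempts)
{
	for (int i = 0; i < max_attempts; i++) {
		enum toy_fault r = toy_handle_fault(false);

		if (r == TOY_FAULT_OK)
			return 0;
		if (r == TOY_FAULT_SIGBUS)
			return -EFAULT;
		/* TOY_FAULT_RETRY: loop again with the same flags */
	}
	return -EAGAIN;	/* still retrying when the demo gave up */
}

/* Caller that marks the retry and turns the hard error into -EFAULT. */
static int toy_populate_with_tried(void)
{
	bool tried = false;

	for (;;) {
		switch (toy_handle_fault(tried)) {
		case TOY_FAULT_OK:
			return 0;
		case TOY_FAULT_RETRY:
			tried = true;	/* "retake the lock", retry as tried */
			continue;
		case TOY_FAULT_SIGBUS:
			return -EFAULT;
		}
	}
}
```

In this model, toy_populate_without_tried() never gets past the retry
result however many attempts it is given, while toy_populate_with_tried()
fails with -EFAULT on its second attempt. The real logic lives in
__get_user_pages_locked() and is considerably more involved.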

I briefly played with handling the FOLL_MADV_POPULATE checks in
__get_user_pages() a bit differently, integrating them with existing
handling, but it ended up looking worse. So I decided to keep it simple.

Likely, we need better selftests, but the reproducer from Darrick might
be a bit hard to convert into a simple selftest.

Note that using mlock() in Darrick's reproducer results in a similar
endless retry. Likely, that is not what we want, and we should handle
VM_FAULT_RETRY in populate_vma_page_range() / __mm_populate() as well.
However, using __get_user_pages_locked() there in the same way might be
more complicated, because of the advanced VMA handling in
populate_vma_page_range().

Further, most populate_vma_page_range() callers simply ignore the return
values, so it's unclear in which cases we expect to just silently fail, or
where we'd want to retry+fail or endlessly retry instead.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Hugh Dickins <hughd@google.com>

David Hildenbrand (2):
  mm/madvise: make MADV_POPULATE_(READ|WRITE) handle VM_FAULT_RETRY
    properly
  mm/madvise: don't perform madvise VMA walk for
    MADV_POPULATE_(READ|WRITE)

 mm/gup.c      | 54 ++++++++++++++++++++++++++++++---------------------
 mm/internal.h | 10 ++++++----
 mm/madvise.c  | 43 +++++++++++++---------------------------
 3 files changed, 52 insertions(+), 55 deletions(-)


base-commit: f48159f866f422371bb1aad10eb4d05b29ca4d8c
-- 
2.43.2


Thread overview: 10+ messages
2024-03-14 16:12 David Hildenbrand [this message]
2024-03-14 16:12 ` [PATCH v1 1/2] mm/madvise: make MADV_POPULATE_(READ|WRITE) handle VM_FAULT_RETRY properly David Hildenbrand
2024-03-14 16:13 ` [PATCH v1 2/2] mm/madvise: don't perform madvise VMA walk for MADV_POPULATE_(READ|WRITE) David Hildenbrand
2024-03-15  2:25 ` [PATCH v1 0/2] mm/madvise: make MADV_POPULATE_(READ|WRITE) handle VM_FAULT_RETRY properly Darrick J. Wong
2024-03-17 16:50 ` Darrick J. Wong
2024-03-17 16:51 ` [RFC PATCH] xfs_io: add linux madvise advice codes Darrick J. Wong
2024-03-17 16:53   ` [RFC PATCH] fstests: test MADV_POPULATE_READ with IO errors Darrick J. Wong
2024-03-17 21:14     ` Christoph Hellwig
2024-03-19  8:59     ` David Hildenbrand
2024-03-17 21:14   ` [RFC PATCH] xfs_io: add linux madvise advice codes Christoph Hellwig
