linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: John Hubbard <jhubbard@nvidia.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Al Viro" <viro@zeniv.linux.org.uk>,
	"Christoph Hellwig" <hch@infradead.org>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Dave Chinner" <david@fromorbit.com>,
	"Ira Weiny" <ira.weiny@intel.com>, "Jan Kara" <jack@suse.cz>,
	"Jason Gunthorpe" <jgg@ziepe.ca>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	"Michal Hocko" <mhocko@suse.com>,
	"Mike Kravetz" <mike.kravetz@oracle.com>,
	"Shuah Khan" <shuah@kernel.org>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Matthew Wilcox" <willy@infradead.org>,
	linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kselftest@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-mm@kvack.org,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [regression] Re: [PATCH v6 06/12] mm/gup: track FOLL_PIN pages
Date: Fri, 24 Apr 2020 15:58:29 -0700	[thread overview]
Message-ID: <665ffb48-d498-90f4-f945-997a922fc370@nvidia.com> (raw)
In-Reply-To: <20200424141548.5afdd2bb@w520.home>

On 2020-04-24 13:15, Alex Williamson wrote:
> On Fri, 24 Apr 2020 12:20:03 -0700
> John Hubbard <jhubbard@nvidia.com> wrote:
> 
>> On 2020-04-24 11:18, Alex Williamson wrote:
>> ...
>>> Hi John,
>>>
>>> I'm seeing a regression bisected back to this commit (3faa52c03f44
>>> mm/gup: track FOLL_PIN pages).  I've attached some vfio-pci test code
>>> that reproduces this by mmap'ing a page of MMIO space of a device and
>>> then tries to map that through the IOMMU, so this should be attempting
>>> a gup/pin of a PFNMAP page.  Previously this failed gracefully (-EFAULT),
>>> but now results in:
>>
>>
>> Hi Alex,
>>
>> Thanks for this report, and especially for source code to test it,
>> seeing as how I can't immediately spot the problem just from the crash
>> data so far.  I'll get set up and attempt a repro.
>>
>> Actually this looks like it should be relatively easier than the usual
>> sort of "oops, we leaked a pin_user_pages() or unpin_user_pages() call,
>> good luck finding which one" report that I fear the most. :) This one
>> looks more like a crash that happens directly, when calling into the
>> pin_user_pages_remote() code. Which should be a lot easier to solve...
>>
>> btw, if you are set up for it, it would be nice to know what source file
>> and line number corresponds to the RIP (get_pfnblock_flags_mask+0x22)
>> below. But if not, no problem, because I've likely got to do the repro
>> in any case.
> 
> Hey John,
> 
> TBH I'm feeling a lot less confident about this bisect.  This was
> readily reproducible to me on a clean tree a bit ago, but now it
> eludes me.  Let me go back and figure out what's going on before you
> spend any more time on it.  Thanks,
> 

OK. But I'm keeping the repro program! :)  It made it quick and easy to
set up a vfio test, so it was worth doing in any case.

Anyway, I wanted to double check this just out of paranoia, and so
now I have a data point for you: your test program runs and passes for
me using today's linux.git kernel, with an NVIDIA GPU as the vfio
device, and the kernel log is clean. No hint of any problems.

I traced it a little bit:

# sudo bpftrace -e kprobe:__get_user_pages { @[kstack()] = count(); }
Attaching 1 probe...
^C
...
@[
     __get_user_pages+1
     __gup_longterm_locked+176
     vaddr_get_pfn+104
     vfio_pin_pages_remote+113
     vfio_dma_do_map+760
     vfio_iommu_type1_ioctl+761
     ksys_ioctl+135
     __x64_sys_ioctl+22
     do_syscall_64+90
     entry_SYSCALL_64_after_hwframe+73
]: 1

...and also verified that it's not actually pinning any pages with that
path:

$ grep foll_pin /proc/vmstat
nr_foll_pin_acquired 0
nr_foll_pin_released 0


Good luck and let me know if it starts pointing to FOLL_PIN or gup, etc.

thanks,
-- 
John Hubbard
NVIDIA

  reply	other threads:[~2020-04-24 22:58 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-11  0:15 [PATCH v6 00/12] mm/gup: track FOLL_PIN pages John Hubbard
2020-02-11  0:15 ` [PATCH v6 01/12] mm/gup: split get_user_pages_remote() into two routines John Hubbard
2020-02-11  0:15 ` [PATCH v6 02/12] mm/gup: pass a flags arg to __gup_device_* functions John Hubbard
2020-02-11  0:15 ` [PATCH v6 03/12] mm: introduce page_ref_sub_return() John Hubbard
2020-02-11  0:15 ` [PATCH v6 04/12] mm/gup: pass gup flags to two more routines John Hubbard
2020-02-11  0:15 ` [PATCH v6 05/12] mm/gup: require FOLL_GET for get_user_pages_fast() John Hubbard
2020-02-11  0:15 ` [PATCH v6 06/12] mm/gup: track FOLL_PIN pages John Hubbard
2020-04-24 18:18   ` [regression] " Alex Williamson
2020-04-24 19:20     ` John Hubbard
2020-04-24 20:15       ` Alex Williamson
2020-04-24 22:58         ` John Hubbard [this message]
2020-04-28 16:54           ` [regression?] " Alex Williamson
2020-04-28 17:49             ` Jason Gunthorpe
2020-04-28 19:07               ` Alex Williamson
2020-04-28 19:22                 ` Jason Gunthorpe
2020-04-28 20:12                   ` Alex Williamson
2020-04-29  0:29                     ` Jason Gunthorpe
2020-04-29 19:56                       ` Alex Williamson
2020-04-29 23:03                         ` Jason Gunthorpe
2020-02-11  0:15 ` [PATCH v6 07/12] mm/gup: page->hpage_pinned_refcount: exact pin counts for huge pages John Hubbard
2020-02-11  0:15 ` [PATCH v6 08/12] mm/gup: /proc/vmstat: pin_user_pages (FOLL_PIN) reporting John Hubbard
2020-02-12  9:17   ` Jan Kara
2020-02-11  0:15 ` [PATCH v6 09/12] mm/gup_benchmark: support pin_user_pages() and related calls John Hubbard
2020-02-11  0:15 ` [PATCH v6 10/12] selftests/vm: run_vmtests: invoke gup_benchmark with basic FOLL_PIN coverage John Hubbard
2020-02-11  0:15 ` [PATCH v6 11/12] mm: Improve dump_page() for compound pages John Hubbard
2020-02-11  0:15 ` [PATCH v6 12/12] mm: dump_page(): additional diagnostics for huge pinned pages John Hubbard
2020-02-11 13:21   ` Kirill A. Shutemov
2020-02-12  2:10     ` John Hubbard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=665ffb48-d498-90f4-f945-997a922fc370@nvidia.com \
    --to=jhubbard@nvidia.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.williamson@redhat.com \
    --cc=corbet@lwn.net \
    --cc=dan.j.williams@intel.com \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=ira.weiny@intel.com \
    --cc=jack@suse.cz \
    --cc=jgg@ziepe.ca \
    --cc=jglisse@redhat.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=shuah@kernel.org \
    --cc=vbabka@suse.cz \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).