From: Dan Williams <dan.j.williams@intel.com>
To: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: "David Hildenbrand" <david@redhat.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Linux MM" <linux-mm@kvack.org>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	kvm-ppc@vger.kernel.org,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	"KVM list" <kvm@vger.kernel.org>,
	linux-hyperv@vger.kernel.org, devel@driverdev.osuosl.org,
	xen-devel <xen-devel@lists.xenproject.org>,
	"X86 ML" <x86@kernel.org>,
	"Alexander Duyck" <alexander.duyck@gmail.com>,
	"Alexander Duyck" <alexander.h.duyck@linux.intel.com>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"Allison Randal" <allison@lohutok.net>,
	"Andy Lutomirski" <luto@kernel.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	"Anshuman Khandual" <anshuman.khandual@arm.com>,
	"Anthony Yznaga" <anthony.yznaga@oracle.com>,
	"Benjamin Herrenschmidt" <benh@kernel.crashing.org>,
	"Borislav Petkov" <bp@alien8.de>,
	"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
	"Christophe Leroy" <christophe.leroy@c-s.fr>,
	"Cornelia Huck" <cohuck@redhat.com>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"Haiyang Zhang" <haiyangz@microsoft.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Isaac J. Manjarres" <isaacm@codeaurora.org>,
	"Jim Mattson" <jmattson@google.com>,
	"Joerg Roedel" <joro@8bytes.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Juergen Gross" <jgross@suse.com>,
	"KarimAllah Ahmed" <karahmed@amazon.de>,
	"Kees Cook" <keescook@chromium.org>,
	"K. Y. Srinivasan" <kys@microsoft.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	"Matt Sickler" <Matt.Sickler@daktronics.com>,
	"Mel Gorman" <mgorman@techsingularity.net>,
	"Michael Ellerman" <mpe@ellerman.id.au>,
	"Michal Hocko" <mhocko@suse.com>,
	"Mike Rapoport" <rppt@linux.ibm.com>,
	"Mike Rapoport" <rppt@linux.vnet.ibm.com>,
	"Nicholas Piggin" <npiggin@gmail.com>,
	"Oscar Salvador" <osalvador@suse.de>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Paul Mackerras" <paulus@ozlabs.org>,
	"Paul Mackerras" <paulus@samba.org>,
	"Pavel Tatashin" <pasha.tatashin@soleen.com>,
	"Pavel Tatashin" <pavel.tatashin@microsoft.com>,
	"Peter Zijlstra" <peterz@infradead.org>, "Qian Cai" <cai@lca.pw>,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	"Sasha Levin" <sashal@kernel.org>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	"Stephen Hemminger" <sthemmin@microsoft.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Vitaly Kuznetsov" <vkuznets@redhat.com>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Wanpeng Li" <wanpengli@tencent.com>,
	YueHaibing <yuehaibing@huawei.com>,
	"Adam Borowski" <kilobyte@angband.pl>
Subject: Re: [PATCH v1 03/10] KVM: Prepare kvm_is_reserved_pfn() for PG_reserved changes
Date: Tue, 5 Nov 2019 15:43:29 -0800
Message-ID: <CAPcyv4i7tnjyghYhSjK8fxUu8Qkdc2RuD9kUwJcKEMDzOf51ng@mail.gmail.com>
In-Reply-To: <CAPcyv4iRP0Sz=mcT+iuoVaD4-o2q1nCH2Hixc5OkfWu+SBQmkg@mail.gmail.com>

On Tue, Nov 5, 2019 at 3:30 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Tue, Nov 5, 2019 at 3:13 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > On Tue, Nov 05, 2019 at 03:02:40PM -0800, Dan Williams wrote:
> > > On Tue, Nov 5, 2019 at 12:31 PM David Hildenbrand <david@redhat.com> wrote:
> > > > > The scarier code (for me) is transparent_hugepage_adjust() and
> > > > > kvm_mmu_zap_collapsible_spte(), as I don't at all understand the
> > > > > interaction between THP and _PAGE_DEVMAP.
> > > >
> > > > The x86 KVM MMU code is one of the ugliest code I know (sorry, but it
> > > > had to be said :/ ). Luckily, this should be independent of the
> > > > PG_reserved thingy AFAIKs.
> > >
> > > Both transparent_hugepage_adjust() and kvm_mmu_zap_collapsible_spte()
> > > are honoring kvm_is_reserved_pfn(), so again I'm missing where the
> > > page count gets mismanaged and leads to the reported hang.
> >
> > When mapping pages into the guest, KVM gets the page via gup(), which
> > increments the page count for ZONE_DEVICE pages.  But KVM puts the page
> > using kvm_release_pfn_clean(), which skips put_page() if PageReserved()
> > and so never puts its reference to ZONE_DEVICE pages.
>
> Oh, yeah, that's busted.

Ugh, it's extra busted because every other gup user in the kernel
tracks the pages resulting from gup and puts them (put_page()) when
they are done. KVM wants to forget about whether it did a gup to get
the page and optionally trigger put_page() based purely on the pfn.
Outside of VFIO device assignment that needs pages pinned for DMA, why
does KVM itself need to pin pages? If pages are pinned over a return
to userspace that needs to be a FOLL_LONGTERM gup.


Thread overview:
2019-10-24 12:09 [PATCH v1 00/10] mm: Don't mark hotplugged pages PG_reserved (including ZONE_DEVICE) David Hildenbrand
2019-10-24 12:09 ` [PATCH v1 01/10] mm/memory_hotplug: Don't allow to online/offline memory blocks with holes David Hildenbrand
2019-11-05  1:30   ` Dan Williams
2019-11-05  9:31     ` David Hildenbrand
2019-10-24 12:09 ` [PATCH v1 02/10] KVM: x86/mmu: Prepare kvm_is_mmio_pfn() for PG_reserved changes David Hildenbrand
2019-11-05  1:37   ` Dan Williams
2019-11-05 11:09     ` David Hildenbrand
2019-10-24 12:09 ` [PATCH v1 03/10] KVM: Prepare kvm_is_reserved_pfn() " David Hildenbrand
2019-11-05  4:38   ` Dan Williams
2019-11-05  9:17     ` David Hildenbrand
2019-11-05  9:49       ` David Hildenbrand
2019-11-05 10:02         ` David Hildenbrand
2019-11-05 16:00           ` Sean Christopherson
2019-11-05 20:30             ` David Hildenbrand
2019-11-05 22:22               ` Sean Christopherson
2019-11-05 23:02               ` Dan Williams
2019-11-05 23:13                 ` Sean Christopherson
2019-11-05 23:30                   ` Dan Williams
2019-11-05 23:42                     ` Sean Christopherson
2019-11-05 23:43                     ` Dan Williams [this message]
2019-11-06  0:03                       ` Sean Christopherson
2019-11-06  0:08                         ` Dan Williams
2019-11-06  6:56                           ` David Hildenbrand
2019-11-06 16:09                             ` Sean Christopherson
2019-10-24 12:09 ` [PATCH v1 04/10] vfio/type1: Prepare is_invalid_reserved_pfn() " David Hildenbrand
2019-11-07 15:40   ` Dan Williams
2019-11-07 18:22     ` David Hildenbrand
2019-11-07 22:07       ` David Hildenbrand
2019-11-08  5:09         ` Dan Williams
2019-11-08  7:14           ` David Hildenbrand
2019-11-08 10:21             ` David Hildenbrand
2019-11-08 18:29               ` Dan Williams
2019-11-08 23:01                 ` David Hildenbrand
2019-10-24 12:09 ` [PATCH v1 05/10] powerpc/book3s: Prepare kvmppc_book3s_instantiate_page() " David Hildenbrand
2019-10-24 12:09 ` [PATCH v1 06/10] powerpc/64s: Prepare hash_page_do_lazy_icache() " David Hildenbrand
2019-10-24 12:09 ` [PATCH v1 07/10] powerpc/mm: Prepare maybe_pte_to_page() " David Hildenbrand
2019-10-24 12:09 ` [PATCH v1 08/10] x86/mm: Prepare __ioremap_check_ram() " David Hildenbrand
2019-10-24 12:09 ` [PATCH v1 09/10] mm/memory_hotplug: Don't mark pages PG_reserved when initializing the memmap David Hildenbrand
2019-11-04 22:44   ` Boris Ostrovsky
2019-11-05 10:18     ` David Hildenbrand
2019-11-05 16:06       ` Boris Ostrovsky
2019-10-24 12:09 ` [PATCH v1 10/10] mm/usercopy.c: Update comment in check_page_span() regarding ZONE_DEVICE David Hildenbrand
2019-11-01 19:24 ` [PATCH v1 00/10] mm: Don't mark hotplugged pages PG_reserved (including ZONE_DEVICE) David Hildenbrand
