From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: "Justin He (Arm Technology China)" <Justin.He@arm.com>,
	"Will Deacon" <will@kernel.org>,
	"Mark Rutland" <Mark.Rutland@arm.com>,
	"James Morse" <James.Morse@arm.com>,
	"Marc Zyngier" <maz@kernel.org>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"Suzuki Poulose" <Suzuki.Poulose@arm.com>,
	"Punit Agrawal" <punitagrawal@gmail.com>,
	"Anshuman Khandual" <Anshuman.Khandual@arm.com>,
	"Alex Van Brunt" <avanbrunt@nvidia.com>,
	"Robin Murphy" <Robin.Murphy@arm.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Ralph Campbell" <rcampbell@nvidia.com>,
	"hejianet@gmail.com" <hejianet@gmail.com>,
	"Kaly Xin (Arm Technology China)" <Kaly.Xin@arm.com>,
	nd <nd@arm.com>
Subject: Re: [PATCH v8 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared
Date: Tue, 24 Sep 2019 14:59:13 +0300
Message-ID: <20190924115913.ju67nr4gcdbzbeva@box>
In-Reply-To: <20190924103324.GB41214@arrakis.emea.arm.com>

On Tue, Sep 24, 2019 at 11:33:25AM +0100, Catalin Marinas wrote:
> On Tue, Sep 24, 2019 at 06:43:06AM +0000, Justin He (Arm Technology China) wrote:
> > Catalin Marinas wrote:
> > > On Sat, Sep 21, 2019 at 09:50:54PM +0800, Jia He wrote:
> > > > @@ -2151,21 +2163,53 @@ static inline void cow_user_page(struct page *dst, struct page *src, unsigned lo
> > > >  	 * fails, we just zero-fill it. Live with it.
> > > >  	 */
> > > >  	if (unlikely(!src)) {
> > > > -		void *kaddr = kmap_atomic(dst);
> > > > -		void __user *uaddr = (void __user *)(va & PAGE_MASK);
> > > > +		void *kaddr;
> > > > +		pte_t entry;
> > > > +		void __user *uaddr = (void __user *)(addr & PAGE_MASK);
> > > >
> > > > +		/* On architectures with software "accessed" bits, we would
> > > > +		 * take a double page fault, so mark it accessed here.
> > > > +		 */
> [...]
> > > > +		if (arch_faults_on_old_pte() && !pte_young(vmf->orig_pte)) {
> > > > +			vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr,
> > > > +						       &vmf->ptl);
> > > > +			if (likely(pte_same(*vmf->pte, vmf->orig_pte))) {
> > > > +				entry = pte_mkyoung(vmf->orig_pte);
> > > > +				if (ptep_set_access_flags(vma, addr,
> > > > +							  vmf->pte, entry, 0))
> > > > +					update_mmu_cache(vma, addr, vmf->pte);
> > > > +			} else {
> > > > +				/* Other thread has already handled the fault
> > > > +				 * and we don't need to do anything. If it's
> > > > +				 * not the case, the fault will be triggered
> > > > +				 * again on the same address.
> > > > +				 */
> > > > +				pte_unmap_unlock(vmf->pte, vmf->ptl);
> > > > +				return false;
> > > > +			}
> > > > +			pte_unmap_unlock(vmf->pte, vmf->ptl);
> > > > +		}
> [...]
> > > > +
> > > > +		kaddr = kmap_atomic(dst);
> > > 
> > > Since you moved the kmap_atomic() here, could the above
> > > arch_faults_on_old_pte() run in a preemptible context? I suggested to
> > > add a WARN_ON in patch 2 to be sure.
> > 
> > Should I move kmap_atomic back to the original line? Thus, we can make sure
> > that arch_faults_on_old_pte() is in the context of preempt_disabled?
> > Otherwise, arch_faults_on_old_pte() may cause plenty of warning if I add
> > a WARN_ON in arch_faults_on_old_pte.  I tested it when I enable the PREEMPT=y
> > on a ThunderX2 qemu guest.
> 
> So we have two options here:
> 
> 1. Change arch_faults_on_old_pte() scope to the whole system rather than
>    just the current CPU. You'd have to wire up a new arm64 capability
>    for the access flag but this way we don't care whether it's
>    preemptible or not.
> 
> 2. Keep the arch_faults_on_old_pte() per-CPU but make sure we are not
>    preempted here. The kmap_atomic() move would do but you'd have to
>    kunmap_atomic() before the return.
> 
> I think the answer to my question below also has some implication on
> which option to pick:
> 
> > > >  		/*
> > > >  		 * This really shouldn't fail, because the page is there
> > > >  		 * in the page tables. But it might just be unreadable,
> > > >  		 * in which case we just give up and fill the result with
> > > >  		 * zeroes.
> > > >  		 */
> > > > -		if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE))
> > > > +		if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
> > > > +			/* Give a warn in case there can be some obscure
> > > > +			 * use-case
> > > > +			 */
> > > > +			WARN_ON_ONCE(1);
> > > 
> > > That's more of a question for the mm guys: at this point we do the
> > > copying with the ptl released; is there anything else that could have
> > > made the pte old in the meantime? I think unuse_pte() is only called on
> > > anonymous vmas, so it shouldn't be the case here.
> 
> If we need to hold the ptl here, you could as well have an enclosing
> kmap/kunmap_atomic (option 2) with some goto instead of "return false".

Yeah, it looks like we need to hold the ptl for longer. Otherwise I don't
see anything that would prevent the young bit from being cleared under us.
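
To make the window concrete, here is a small userspace toy model. It is
not kernel code and not a proposed patch: every model_* name is made up
for illustration, the "ptl" is a plain flag rather than a real spinlock,
and the racing path that clears the young bit is simulated by a direct
call placed between the mkyoung and the copy. The hold_ptl argument
switches between the v8 ordering (unlock, then copy) and holding the lock
across the copy with a goto instead of an early return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64u
#define MODEL_PTE_YOUNG 0x1u

/* Hypothetical stand-in for one PTE, its lock, and the page it maps. */
struct model_pte {
	bool ptl_held;
	unsigned int val;
	unsigned char page[MODEL_PAGE_SIZE];
};

/* Models a concurrent path that ages the PTE; it needs the ptl to do so. */
static bool model_try_clear_young(struct model_pte *pte)
{
	if (pte->ptl_held)
		return false;	/* a real contender would wait for the lock */
	pte->val &= ~MODEL_PTE_YOUNG;
	return true;
}

/*
 * Models __copy_from_user_inatomic(): returns the number of bytes NOT
 * copied. In this model an old PTE faults, so nothing is copied.
 */
static size_t model_copy_inatomic(void *dst, const struct model_pte *pte)
{
	if (!(pte->val & MODEL_PTE_YOUNG))
		return MODEL_PAGE_SIZE;
	memcpy(dst, pte->page, MODEL_PAGE_SIZE);
	return 0;
}

/*
 * Mark the PTE young and copy the page. hold_ptl selects whether the
 * lock stays held across the copy. Returns false if the PTE changed
 * since orig_val was sampled (the fault would be retried), else true.
 */
static bool model_cow_copy(void *dst, struct model_pte *pte,
			   unsigned int orig_val, bool hold_ptl)
{
	bool ret = true;

	pte->ptl_held = true;			/* pte_offset_map_lock() */
	if (pte->val != orig_val) {		/* !pte_same() */
		ret = false;
		goto unlock;
	}
	pte->val |= MODEL_PTE_YOUNG;		/* pte_mkyoung() */

	if (!hold_ptl)
		pte->ptl_held = false;		/* v8 ordering: unlock first */

	model_try_clear_young(pte);		/* the racing ager runs here */

	if (model_copy_inatomic(dst, pte))
		memset(dst, 0, MODEL_PAGE_SIZE);	/* zero-fill fallback */
unlock:
	pte->ptl_held = false;			/* pte_unmap_unlock() */
	return ret;
}
```

With hold_ptl false the simulated ager wins and the destination ends up
zero-filled; with hold_ptl true the ager is kept out and the copy goes
through. In the real function the kmap_atomic()/kunmap_atomic() pair would
also have to enclose this region, as Catalin describes in option 2.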

-- 
 Kirill A. Shutemov

Thread overview: 32+ messages
2019-09-21 13:50 [PATCH v8 0/3] fix double page fault on arm64 Jia He
2019-09-21 13:50 ` [PATCH v8 1/3] arm64: cpufeature: introduce helper cpu_has_hw_af() Jia He
2019-09-23 16:07   ` Catalin Marinas
2019-09-24  1:50     ` Justin He (Arm Technology China)
2019-09-21 13:50 ` [PATCH v8 2/3] arm64: mm: implement arch_faults_on_old_pte() on arm64 Jia He
2019-09-23 16:18   ` Catalin Marinas
2019-09-24  2:17     ` Justin He (Arm Technology China)
2019-09-21 13:50 ` [PATCH v8 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared Jia He
2019-09-21 15:31   ` Matthew Wilcox
2019-09-23  8:28   ` Kirill A. Shutemov
2019-09-23 17:04   ` Catalin Marinas
2019-09-24  6:43     ` Justin He (Arm Technology China)
2019-09-24 10:33       ` Catalin Marinas
2019-09-24 11:59         ` Kirill A. Shutemov [this message]
2019-09-24 15:29         ` Jia He
2019-09-24 16:35           ` Catalin Marinas
