From: Catalin Marinas <catalin.marinas@arm.com>
To: Jia He <hejianet@gmail.com>
Cc: "Justin He (Arm Technology China)" <Justin.He@arm.com>,
	"Will Deacon" <will@kernel.org>,
	"Mark Rutland" <Mark.Rutland@arm.com>,
	"James Morse" <James.Morse@arm.com>,
	"Marc Zyngier" <maz@kernel.org>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"Suzuki Poulose" <Suzuki.Poulose@arm.com>,
	"Punit Agrawal" <punitagrawal@gmail.com>,
	"Anshuman Khandual" <Anshuman.Khandual@arm.com>,
	"Alex Van Brunt" <avanbrunt@nvidia.com>,
	"Robin Murphy" <Robin.Murphy@arm.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Ralph Campbell" <rcampbell@nvidia.com>,
	"Kaly Xin (Arm Technology China)" <Kaly.Xin@arm.com>,
	nd <nd@arm.com>
Subject: Re: [PATCH v8 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared
Date: Tue, 24 Sep 2019 17:35:42 +0100
Message-ID: <20190924163542.GI41214@arrakis.emea.arm.com>
In-Reply-To: <6267b685-5162-85ac-087f-112303bb7035@gmail.com>

On Tue, Sep 24, 2019 at 11:29:07PM +0800, Jia He wrote:
> On 2019/9/24 18:33, Catalin Marinas wrote:
> > On Tue, Sep 24, 2019 at 06:43:06AM +0000, Justin He (Arm Technology China) wrote:
> > > Catalin Marinas wrote:
> > > > On Sat, Sep 21, 2019 at 09:50:54PM +0800, Jia He wrote:
> > > > >   		/*
> > > > >   		 * This really shouldn't fail, because the page is there
> > > > >   		 * in the page tables. But it might just be unreadable,
> > > > >   		 * in which case we just give up and fill the result with
> > > > >   		 * zeroes.
> > > > >   		 */
> > > > > -		if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE))
> > > > > +		if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
> > > > > +			/* Give a warn in case there can be some obscure
> > > > > +			 * use-case
> > > > > +			 */
> > > > > +			WARN_ON_ONCE(1);
> > > > That's more of a question for the mm guys: at this point we do the
> > > > copying with the ptl released; is there anything else that could have
> > > > made the pte old in the meantime? I think unuse_pte() is only called on
> > > > anonymous vmas, so it shouldn't be the case here.
> >
> > If we need to hold the ptl here, you could as well have an enclosing
> > kmap/kunmap_atomic (option 2) with some goto instead of "return false".
> 
> I am not 100% sure that I understand your suggestion well, so I
> drafted the patch

Well, however you think the code is cleaner really.

The copy/paste didn't work well: the tabs disappeared (or rather, the
Exchange server is corrupting outgoing emails), but I'll try to comment
below:

> -static inline void cow_user_page(struct page *dst, struct page *src,
>   unsigned long va, struct vm_area_struct *vma)
> +static inline bool cow_user_page(struct page *dst, struct page *src,
> +                 struct vm_fault *vmf)
>  {
> +    struct vm_area_struct *vma = vmf->vma;
> +    struct mm_struct *mm = vma->vm_mm;
> +    unsigned long addr = vmf->address;
> +    bool ret;
> +    pte_t entry;
> +    void *kaddr;
> +    void __user *uaddr;
> +
>      debug_dma_assert_idle(src);
> 
> +    if (likely(src)) {
> +        copy_user_highpage(dst, src, addr, vma);
> +        return true;
> +    }
> +
>      /*
>       * If the source page was a PFN mapping, we don't have
>       * a "struct page" for it. We do a best-effort copy by
>       * just copying from the original user address. If that
>       * fails, we just zero-fill it. Live with it.
>       */
> -    if (unlikely(!src)) {
> -        void *kaddr = kmap_atomic(dst);
> -        void __user *uaddr = (void __user *)(va & PAGE_MASK);
> +    kaddr = kmap_atomic(dst);
> +    uaddr = (void __user *)(addr & PAGE_MASK);
> +
> +    /*
> +     * On architectures with software "accessed" bits, we would
> +     * take a double page fault, so mark it accessed here.
> +     */
> +    vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
> +    if (arch_faults_on_old_pte() && !pte_young(vmf->orig_pte)) {

I'd move the pte_offset_map_lock() inside the 'if' block as we don't
want to affect architectures that handle old ptes automatically.

> +        if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
> +            /*
> +             * Other thread has already handled the fault
> +             * and we don't need to do anything. If it's
> +             * not the case, the fault will be triggered
> +             * again on the same address.
> +             */
> +            ret = false;
> +            goto pte_unlock;
> +        }
> +
> +        entry = pte_mkyoung(vmf->orig_pte);
> +        if (ptep_set_access_flags(vma, addr, vmf->pte, entry, 0))
> +            update_mmu_cache(vma, addr, vmf->pte);
> +    }
> 
> +    /*
> +     * This really shouldn't fail, because the page is there
> +     * in the page tables. But it might just be unreadable,
> +     * in which case we just give up and fill the result with
> +     * zeroes.
> +     */
> +    if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
>          /*
> -         * This really shouldn't fail, because the page is there
> -         * in the page tables. But it might just be unreadable,
> -         * in which case we just give up and fill the result with
> -         * zeroes.
> +         * Give a warn in case there can be some obscure
> +         * use-case
>           */
> -        if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE))
> -            clear_page(kaddr);
> -        kunmap_atomic(kaddr);
> -        flush_dcache_page(dst);
> -    } else
> -        copy_user_highpage(dst, src, va, vma);
> +        WARN_ON_ONCE(1);
> +        clear_page(kaddr);
> +    }
> +
> +    ret = true;
> +
> +pte_unlock:
> +    pte_unmap_unlock(vmf->pte, vmf->ptl);

Since the locking would move into the 'if' block above, we need
another check here before unlocking:

	if (arch_faults_on_old_pte() && !pte_young(vmf->orig_pte))
		pte_unmap_unlock(vmf->pte, vmf->ptl);

You could probably replace the two calls to arch_faults_on_old_pte()
with a single bool variable initialisation, something like:

	force_mkyoung = arch_faults_on_old_pte() &&
		!pte_young(vmf->orig_pte);

and only check for "if (force_mkyoung)" in both cases.
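
To make the lock/unlock pairing concrete, here is a minimal userspace
sketch of the control flow only. It is not kernel code and not the final
patch: struct mock_pte, struct mock_fault, mock_lock(), mock_unlock(),
mock_pte_same() and mock_cow_copy() are hypothetical stand-ins for the
pte, the vm_fault, the ptl and cow_user_page(). memcpy() stands in for
__copy_from_user_inatomic(), and the arch_faults_on_old_pte() result is
passed in as a plain bool.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct mock_pte {
	bool young;		/* models the "accessed" flag */
	unsigned long val;	/* models the remaining pte bits */
};

struct mock_fault {
	struct mock_pte *live_pte;	/* entry currently in the "page table" */
	struct mock_pte orig_pte;	/* snapshot taken when the fault began */
	int lock_depth;			/* models the ptl: 1 while held */
	int locks_taken;		/* how many times the lock was acquired */
};

static void mock_lock(struct mock_fault *vmf)
{
	vmf->lock_depth++;
	vmf->locks_taken++;
}

static void mock_unlock(struct mock_fault *vmf)
{
	vmf->lock_depth--;
}

static bool mock_pte_same(const struct mock_pte *a, const struct mock_pte *b)
{
	return a->young == b->young && a->val == b->val;
}

/* Returns false if the entry changed under us and the copy was skipped. */
static bool mock_cow_copy(struct mock_fault *vmf, bool arch_faults_on_old_pte,
			  char *dst, const char *src, size_t len)
{
	bool ret = true;
	/* Evaluate the condition once so lock and unlock always pair up. */
	bool force_mkyoung = arch_faults_on_old_pte && !vmf->orig_pte.young;

	if (force_mkyoung) {
		mock_lock(vmf);
		if (!mock_pte_same(vmf->live_pte, &vmf->orig_pte)) {
			/* Someone else changed the entry; let the fault retry. */
			ret = false;
			goto unlock;
		}
		vmf->live_pte->young = true;
	}

	memcpy(dst, src, len);

unlock:
	if (force_mkyoung)
		mock_unlock(vmf);

	return ret;
}
```

With the flag computed once, the path for architectures that update the
accessed bit in hardware never touches the lock, and the early exit
through the label cannot unlock something that was never locked.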

> +    kunmap_atomic(kaddr);
> +    flush_dcache_page(dst);
> +
> +    return ret;
>  }

-- 
Catalin
