All of lore.kernel.org
 help / color / mirror / Atom feed
From: Naresh Kamboju <naresh.kamboju@linaro.org>
To: Peter Xu <peterx@redhat.com>, linux-mm <linux-mm@kvack.org>,
	 open list <linux-kernel@vger.kernel.org>
Cc: Jan Kara <jack@suse.cz>, John Hubbard <jhubbard@nvidia.com>,
	 Linus Torvalds <torvalds@linux-foundation.org>,
	Michal Hocko <mhocko@suse.com>,
	 Kirill Tkhai <ktkhai@virtuozzo.com>,
	Kirill Shutemov <kirill@shutemov.name>,
	 Oleg Nesterov <oleg@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Jann Horn <jannh@google.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Jason Gunthorpe <jgg@nvidia.com>,
	 Matthew Wilcox <willy@infradead.org>,
	Hugh Dickins <hughd@google.com>,
	 Linux-Next Mailing List <linux-next@vger.kernel.org>,
	lkft-triage@lists.linaro.org,  regressions@lists.linux.dev
Subject: Re: [PATCH v2 3/3] mm: gup: pack has_pinned in MMF_HAS_PINNED
Date: Wed, 12 May 2021 15:19:55 +0530	[thread overview]
Message-ID: <CA+G9fYswoAtwh=m1pakV2fB7w3Aq-emvLSCU+b_OhxyqV0Ybsg@mail.gmail.com> (raw)
In-Reply-To: <20210507150553.208763-4-peterx@redhat.com>

On Fri, 7 May 2021 at 20:36, Peter Xu <peterx@redhat.com> wrote:
>
> From: Andrea Arcangeli <aarcange@redhat.com>
>
> has_pinned 32bit can be packed in the MMF_HAS_PINNED bit as a noop
> cleanup.
>
> Any atomic_inc/dec to the mm cacheline shared by all threads in
> pin-fast would reintroduce a loss of SMP scalability to pin-fast, so
> there's no future potential usefulness to keep an atomic in the mm for
> this.
>
> set_bit(MMF_HAS_PINNED) will be theoretically a bit slower than
> WRITE_ONCE (atomic_set is equivalent to WRITE_ONCE), but the set_bit
> (just like atomic_set after this commit) has to be still issued only
> once per "mm", so the difference between the two will be lost in the
> noise.
>
> will-it-scale "mmap2" shows no change in performance with enterprise
> config as expected.
>
> will-it-scale "pin_fast" retains the > 4000% SMP scalability
> performance improvement against upstream as expected.
>
> This is a noop as far as overall performance and SMP scalability are
> concerned.
>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> [peterx: Fix build for task_mmu.c, introduce mm_set_has_pinned_flag, fix
>  comment here and there]
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  fs/proc/task_mmu.c             |  2 +-
>  include/linux/mm.h             |  2 +-
>  include/linux/mm_types.h       | 10 ----------
>  include/linux/sched/coredump.h |  8 ++++++++
>  kernel/fork.c                  |  1 -
>  mm/gup.c                       | 19 +++++++++++++++----
>  6 files changed, 25 insertions(+), 17 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 4c95cc57a66a8..6144571942db9 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -1049,7 +1049,7 @@ static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr,
>                 return false;
>         if (!is_cow_mapping(vma->vm_flags))
>                 return false;
> -       if (likely(!atomic_read(&vma->vm_mm->has_pinned)))
> +       if (likely(!test_bit(MMF_HAS_PINNED, &vma->vm_mm->flags)))
>                 return false;
>         page = vm_normal_page(vma, addr, pte);
>         if (!page)
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index d6790ab0cf575..94dc84f6d8658 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1331,7 +1331,7 @@ static inline bool page_needs_cow_for_dma(struct vm_area_struct *vma,
>         if (!is_cow_mapping(vma->vm_flags))
>                 return false;
>
> -       if (!atomic_read(&vma->vm_mm->has_pinned))
> +       if (!test_bit(MMF_HAS_PINNED, &vma->vm_mm->flags))
>                 return false;
>
>         return page_maybe_dma_pinned(page);
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 6613b26a88946..15d79858fadbd 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -435,16 +435,6 @@ struct mm_struct {
>                  */
>                 atomic_t mm_count;
>
> -               /**
> -                * @has_pinned: Whether this mm has pinned any pages.  This can
> -                * be either replaced in the future by @pinned_vm when it
> -                * becomes stable, or grow into a counter on its own. We're
> -                * aggresive on this bit now - even if the pinned pages were
> -                * unpinned later on, we'll still keep this bit set for the
> -                * lifecycle of this mm just for simplicity.
> -                */
> -               atomic_t has_pinned;
> -
>                 /**
>                  * @write_protect_seq: Locked when any thread is write
>                  * protecting pages mapped by this mm to enforce a later COW,
> diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
> index dfd82eab29025..4d9e3a6568758 100644
> --- a/include/linux/sched/coredump.h
> +++ b/include/linux/sched/coredump.h
> @@ -73,6 +73,14 @@ static inline int get_dumpable(struct mm_struct *mm)
>  #define MMF_OOM_VICTIM         25      /* mm is the oom victim */
>  #define MMF_OOM_REAP_QUEUED    26      /* mm was queued for oom_reaper */
>  #define MMF_MULTIPROCESS       27      /* mm is shared between processes */
> +/*
> + * MMF_HAS_PINNED: Whether this mm has pinned any pages.  This can be either
> + * replaced in the future by mm.pinned_vm when it becomes stable, or grow into
> + * a counter on its own. We're aggresive on this bit for now: even if the
> + * pinned pages were unpinned later on, we'll still keep this bit set for the
> + * lifecycle of this mm, just for simplicity.
> + */
> +#define MMF_HAS_PINNED         28      /* FOLL_PIN has run, never cleared */
>  #define MMF_DISABLE_THP_MASK   (1 << MMF_DISABLE_THP)
>
>  #define MMF_INIT_MASK          (MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK |\
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 502dc046fbc62..a71e73707ef59 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1026,7 +1026,6 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
>         mm_pgtables_bytes_init(mm);
>         mm->map_count = 0;
>         mm->locked_vm = 0;
> -       atomic_set(&mm->has_pinned, 0);
>         atomic64_set(&mm->pinned_vm, 0);
>         memset(&mm->rss_stat, 0, sizeof(mm->rss_stat));
>         spin_lock_init(&mm->page_table_lock);
> diff --git a/mm/gup.c b/mm/gup.c
> index 9933bc5c2eff2..bb130723a6717 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1270,6 +1270,17 @@ int fixup_user_fault(struct mm_struct *mm,
>  }
>  EXPORT_SYMBOL_GPL(fixup_user_fault);
>
> +/*
> + * Set the MMF_HAS_PINNED if not set yet; after set it'll be there for the mm's
> + * lifecycle.  Avoid setting the bit unless necessary, or it might cause write
> + * cache bouncing on large SMP machines for concurrent pinned gups.
> + */
> +static inline void mm_set_has_pinned_flag(unsigned long *mm_flags)
> +{
> +       if (!test_bit(MMF_HAS_PINNED, mm_flags))
> +               set_bit(MMF_HAS_PINNED, mm_flags);
> +}
> +
>  /*
>   * Please note that this function, unlike __get_user_pages will not
>   * return 0 for nr_pages > 0 without FOLL_NOWAIT
> @@ -1292,8 +1303,8 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
>                 BUG_ON(*locked != 1);
>         }
>
> -       if ((flags & FOLL_PIN) && !atomic_read(&mm->has_pinned))
> -               atomic_set(&mm->has_pinned, 1);
> +       if (flags & FOLL_PIN)
> +               mm_set_has_pinned_flag(&mm->flags);
>
>         /*
>          * FOLL_PIN and FOLL_GET are mutually exclusive. Traditional behavior
> @@ -2617,8 +2628,8 @@ static int internal_get_user_pages_fast(unsigned long start,
>                                        FOLL_FAST_ONLY)))
>                 return -EINVAL;
>
> -       if ((gup_flags & FOLL_PIN) && !atomic_read(&current->mm->has_pinned))
> -               atomic_set(&current->mm->has_pinned, 1);
> +       if (gup_flags & FOLL_PIN)
> +               mm_set_has_pinned_flag(&current->mm->flags);

Linux next tag next-20210512 builds failed on arm, riscv, mips and sh
for the tinyconfig and allnoconfig due this patch.

 arm, mips, riscv and sh (tinyconfig) with gcc-8
 arm, mips, riscv and sh (allnoconfig) with gcc-8
 arm, mips, riscv and sh (tinyconfig) with gcc-9
 arm, mips, riscv and sh (allnoconfig) with gcc-9
 arm, mips, riscv and sh (tinyconfig) with gcc-10
 arm, mips, riscv and sh (allnoconfig) with gcc-10

mm/gup.c: In function 'internal_get_user_pages_fast':
mm/gup.c:2698:3: error: implicit declaration of function
'mm_set_has_pinned_flag' [-Werror=implicit-function-declaration]
 2698 |   mm_set_has_pinned_flag(&current->mm->flags);
      |   ^~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
make[2]: *** [/builds/linux/scripts/Makefile.build:273: mm/gup.o] Error 1

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>

#regzb introduced: 354a2e3604e2 ("mm: gup: pack has_pinned in MMF_HAS_PINNED")

Build url:
https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/jobs/1255567072#L315

--
Linaro LKFT
https://lkft.linaro.org

  parent reply	other threads:[~2021-05-12  9:50 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-07 15:05 [PATCH v2 0/3] mm/gup: Fix pin page write cache bouncing on has_pinned Peter Xu
2021-05-07 15:05 ` [PATCH v2 1/3] mm/gup_benchmark: Support threading Peter Xu
2021-05-07 15:05 ` [PATCH v2 2/3] mm: gup: allow FOLL_PIN to scale in SMP Peter Xu
2021-05-07 15:05 ` [PATCH v2 3/3] mm: gup: pack has_pinned in MMF_HAS_PINNED Peter Xu
2021-05-08  1:12   ` John Hubbard
2021-05-12  9:49   ` Geert Uytterhoeven
2021-05-12  9:49     ` Geert Uytterhoeven
2021-05-12 12:34     ` Peter Xu
2021-05-12  9:49   ` Naresh Kamboju [this message]
2021-05-12  9:49     ` Naresh Kamboju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+G9fYswoAtwh=m1pakV2fB7w3Aq-emvLSCU+b_OhxyqV0Ybsg@mail.gmail.com' \
    --to=naresh.kamboju@linaro.org \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=jannh@google.com \
    --cc=jgg@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=kirill@shutemov.name \
    --cc=ktkhai@virtuozzo.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-next@vger.kernel.org \
    --cc=lkft-triage@lists.linaro.org \
    --cc=mhocko@suse.com \
    --cc=oleg@redhat.com \
    --cc=peterx@redhat.com \
    --cc=regressions@lists.linux.dev \
    --cc=torvalds@linux-foundation.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.