linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vlastimil Babka <vbabka@suse.cz>
To: Dave Hansen <dave@sr71.net>, linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, dave.hansen@linux.intel.com,
	akpm@linux-foundation.org, kirill.shutemov@linux.intel.com,
	aarcange@redhat.com, n-horiguchi@ah.jp.nec.com, jack@suse.cz
Subject: Re: [PATCH] mm, gup: introduce concept of "foreign" get_user_pages()
Date: Mon, 18 Jan 2016 16:20:15 +0100	[thread overview]
Message-ID: <569D02AF.8050903@suse.cz> (raw)
In-Reply-To: <20160115181114.A50C25D1@viggo.jf.intel.com>

On 01/15/2016 07:11 PM, Dave Hansen wrote:
> Just sending an update to this one patch instead of resending
> the entire series.
>
> Jan Kara suggested that we just change get_user_pages()'s
> prototype to remove tsk/mm instead of introducing a separate
> get_user_pages_current().  Also, we moved the "_foreign" in
> get_user_pages_foreign() to the end to be more consistent with
> the "_unlocked" version.

Thanks, and you could have blamed me too, not just Jan ;)

> This approach will break any new users of get_user_pages()
> which try to pass a tsk/mm, but Jan doesn't think these are
> frequent enough to be a concern.  This passes an allyesconfig
> on 4.4, at least.
>
> As always, any acks on this approach would be much appreciated.
> This is the largest swath of non-x86 code that protection keys
> touches, and I'm sure the x86 maintainers would appreciate
> seeing some acks from folks on it.

This is finally a thorough review attempt, sorry I didn't catch some of 
the stuff below earlier.

> ---
>
> From: Dave Hansen <dave.hansen@linux.intel.com>
>
> For protection keys, we need to understand whether protections
> should be enforced in software or not.  In general, we enforce
> protections when working on our own task, but not when on others.
> We call these "current" and "foreign" operations.
>
> This patch introduces a new get_user_pages() variant:
>
> 	get_user_pages_foreign()
>
> The plain get_user_pages() can no longer be used on mm/tasks
> other than 'current/current->mm', which is by far the most common
> way it is called.  Using it makes a few of the call sites look a
> bit nicer.
>
> get_user_pages_foreign() is a replacement for when
> get_user_pages() is called on non-current tsk/mm.
>
> This also switches get_user_pages_unlocked() over to be like
> get_user_pages() and not take a tsk/mm.  If someone wants the
> get_user_pages_unlocked() behavior with a non-current tsk/mm,
> they just have to use __get_user_pages_unlocked() directly.

Hmm, but your patch actually changes __get_user_pages_unlocked() to also 
not include the task and mm params and assume current and current->mm? 
Wouldn't it be more consistent if __get_user_unlocked() stayed as it is?
What you say above is true for {__}get_user_pages_locked.

> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: vbabka@suse.cz
> Cc: jack@suse.cz
> ---
>
>   b/arch/cris/arch-v32/drivers/cryptocop.c        |    8 ---
>   b/arch/ia64/kernel/err_inject.c                 |    3 -
>   b/arch/mips/mm/gup.c                            |    3 -
>   b/arch/s390/mm/gup.c                            |    4 -
>   b/arch/sh/mm/gup.c                              |    2
>   b/arch/sparc/mm/gup.c                           |    2
>   b/arch/x86/mm/gup.c                             |    2
>   b/arch/x86/mm/mpx.c                             |    4 -
>   b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |    3 -
>   b/drivers/gpu/drm/i915/i915_gem_userptr.c       |    2
>   b/drivers/gpu/drm/radeon/radeon_ttm.c           |    3 -
>   b/drivers/gpu/drm/via/via_dmablit.c             |    3 -
>   b/drivers/infiniband/core/umem.c                |    2
>   b/drivers/infiniband/core/umem_odp.c            |    8 +--
>   b/drivers/infiniband/hw/mthca/mthca_memfree.c   |    3 -
>   b/drivers/infiniband/hw/qib/qib_user_pages.c    |    3 -
>   b/drivers/infiniband/hw/usnic/usnic_uiom.c      |    2
>   b/drivers/media/pci/ivtv/ivtv-udma.c            |    4 -
>   b/drivers/media/pci/ivtv/ivtv-yuv.c             |   10 +---
>   b/drivers/media/v4l2-core/videobuf-dma-sg.c     |    3 -
>   b/drivers/misc/mic/scif/scif_rma.c              |    2
>   b/drivers/misc/sgi-gru/grufault.c               |    3 -
>   b/drivers/scsi/st.c                             |    2
>   b/drivers/staging/rdma/hfi1/user_pages.c        |    3 -
>   b/drivers/staging/rdma/ipath/ipath_user_pages.c |    3 -
>   b/drivers/video/fbdev/pvr2fb.c                  |    4 -
>   b/drivers/virt/fsl_hypervisor.c                 |    5 --
>   b/fs/exec.c                                     |    8 ++-
>   b/include/linux/mm.h                            |   23 +++++-----
>   b/kernel/events/uprobes.c                       |    4 -
>   b/mm/frame_vector.c                             |    2
>   b/mm/gup.c                                      |   51 +++++++++++++++---------
>   b/mm/ksm.c                                      |    2
>   b/mm/memory.c                                   |    2
>   b/mm/mempolicy.c                                |    6 +-
>   b/mm/nommu.c                                    |   35 +++++++++-------
>   b/mm/process_vm_access.c                        |    6 +-
>   b/mm/util.c                                     |    4 -
>   b/net/ceph/pagevec.c                            |    2
>   b/security/tomoyo/domain.c                      |    9 +++-
>   b/virt/kvm/async_pf.c                           |    2
>   b/virt/kvm/kvm_main.c                           |   13 ++----
>   42 files changed, 135 insertions(+), 130 deletions(-)

[...]

> --- a/kernel/events/uprobes.c~get_current_user_pages	2016-01-15 09:45:42.110046066 -0800
> +++ b/kernel/events/uprobes.c	2016-01-15 09:45:42.152047953 -0800
> @@ -298,7 +298,7 @@ int uprobe_write_opcode(struct mm_struct
>
>   retry:
>   	/* Read the page with vaddr into memory */
> -	ret = get_user_pages(NULL, mm, vaddr, 1, 0, 1, &old_page, &vma);
> +	ret = get_user_pages_foreign(NULL, mm, vaddr, 1, 0, 1, &old_page, &vma);
>   	if (ret <= 0)
>   		return ret;
>
> @@ -1699,7 +1699,7 @@ static int is_trap_at_addr(struct mm_str
>   	if (likely(result == 0))
>   		goto out;
>
> -	result = get_user_pages(NULL, mm, vaddr, 1, 0, 1, &page, NULL);
> +	result = get_user_pages(vaddr, 1, 0, 1, &page, NULL);

Yeah it seems that mm here is current->mm, and using current task 
instead of NULL affects AFAICS just the min/maj fault counting, but 
isn't it still a subtle and unintended functional change?

[...]

> -__always_inline long __get_user_pages_unlocked(struct task_struct *tsk, struct mm_struct *mm,
> -					       unsigned long start, unsigned long nr_pages,
> +__always_inline long __get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
>   					       int write, int force, struct page **pages,
>   					       unsigned int gup_flags)

This is the IMHO unneeded inconsistency what I mentioned above...

> diff -puN mm/process_vm_access.c~get_current_user_pages mm/process_vm_access.c
> --- a/mm/process_vm_access.c~get_current_user_pages	2016-01-15 09:45:42.120046515 -0800
> +++ b/mm/process_vm_access.c	2016-01-15 09:45:42.157048177 -0800
> @@ -99,8 +99,10 @@ static int process_vm_rw_single_vec(unsi
>   		size_t bytes;
>
>   		/* Get the pages we're interested in */
> -		pages = get_user_pages_unlocked(task, mm, pa, pages,
> -						vm_write, 0, process_pages);
> +		down_read(&mm->mmap_sem);
> +		pages = get_user_pages_foreign(task, mm, pa, pages, vm_write,
> +						0, process_pages, NULL);
> +		up_read(&mm->mmap_sem);

You could have simply used __get_user_pages_unlocked() if it wasn't changed.

> @@ -80,7 +80,7 @@ static void async_pf_execute(struct work
>
>   	might_sleep();
>
> -	get_user_pages_unlocked(NULL, mm, addr, 1, 1, 0, NULL);
> +	get_user_pages_unlocked(addr, 1, 1, 0, NULL);

This seems to get mm from some structure where struct work is embedded, 
are you sure it's current->mm? I think it's another place for 
__get_user_pages_unlocked().

>   static inline int check_user_page_hwpoison(unsigned long addr)
> @@ -1344,12 +1345,10 @@ static int hva_to_pfn_slow(unsigned long
>
>   	if (async) {
>   		down_read(&current->mm->mmap_sem);
> -		npages = get_user_page_nowait(current, current->mm,
> -					      addr, write_fault, page);
> +		npages = get_user_page_nowait(addr, write_fault, page);
>   		up_read(&current->mm->mmap_sem);
>   	} else
> -		npages = __get_user_pages_unlocked(current, current->mm, addr, 1,
> -						   write_fault, 0, page,
> +		npages = __get_user_pages_unlocked(addr, 1, write_fault, 0, page,
>   						   FOLL_TOUCH|FOLL_HWPOISON);
>   	if (npages != 1)
>   		return npages;

If you change __get_user_pages_unlocked() as I suggested, you could use 
get_user_pages_unlocked() here?

  reply	other threads:[~2016-01-18 15:20 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-15 18:11 [PATCH] mm, gup: introduce concept of "foreign" get_user_pages() Dave Hansen
2016-01-18 15:20 ` Vlastimil Babka [this message]
2016-01-20 17:35 Dave Hansen
2016-01-20 17:56 ` Vlastimil Babka
2016-01-20 18:48   ` Dave Hansen
2016-01-20 19:30 ` Vlastimil Babka
2016-01-22 18:02 Dave Hansen
2016-01-22 18:16 ` kbuild test robot
2016-01-22 21:31   ` Dave Hansen
2016-01-25 13:17 ` Srikar Dronamraju
2016-01-25 18:18   ` Oleg Nesterov
2016-01-27 11:30 ` Vlastimil Babka
2016-01-27 22:59   ` Dave Hansen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=569D02AF.8050903@suse.cz \
    --to=vbabka@suse.cz \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=dave@sr71.net \
    --cc=jack@suse.cz \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).