From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_PASS,USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 844DEC43219 for ; Tue, 30 Apr 2019 13:25:33 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 14E8621707 for ; Tue, 30 Apr 2019 13:25:33 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="EMsH7fAm" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 14E8621707 Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id B29D06B000D; Tue, 30 Apr 2019 09:25:32 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id AD9316B000E; Tue, 30 Apr 2019 09:25:32 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9A2846B0010; Tue, 30 Apr 2019 09:25:32 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from mail-qk1-f200.google.com (mail-qk1-f200.google.com [209.85.222.200]) by kanga.kvack.org (Postfix) with ESMTP id 6E1D76B000D for ; Tue, 30 Apr 2019 09:25:32 -0400 (EDT) Received: by mail-qk1-f200.google.com with SMTP id i124so11781364qkf.14 for ; Tue, 30 Apr 2019 06:25:32 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:dkim-signature:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=UuoCrbzv5b8zt4OTCxiSxJOR/ubA99fvcSSBJXRKnzE=; b=c2tVSqphl9hxF3MX4vKIHjKATPdLp2WBx5WrLVFXWJcIHtmzmSRilEaZsu28SapOwH NQDmGgSME2auZLuJNQzzMmc6t8gncvbUli/lXrbVygNsnY6bHHIAdhqFcNIMlvil/KLg GH4i7C75wipJfVxDMSS03++a2fEkB2Hc13BLOrcZJ9xOWvPGMCy5mf7QuiMcMj9mYTQw 0zKBUk5IA5ijlgZEgs0b/m4Don8p2EJupL6OIfFGdMgY9KtetFw0jtSqS3ZJ47h7zdNs C6CNczo9rY+p+WbC7pdp0xhmY9W6yAud17DHY7nJcJBn3E1XIeXy/y4gom92TgrIBezr AGNw== X-Gm-Message-State: APjAAAU2PDrQ5Ge0PFfi/Keect9U3JffyyeIwJJUkS6v0L1tgMEaTgOq CmMNs/UqaCUxOICLEnNV/97FMDbaw+pNVFtqQq6Bocj5zbgIpftmxmRnk5WBhgWIqqCVAJemeDB V6A8FQckBHcpbUJaGtzabOn9gJjUfLqdKmJNw11mS4dqLN4gdI1ec3YizwvA81bYwRg== X-Received: by 2002:ac8:37cf:: with SMTP id e15mr45227987qtc.161.1556630732029; Tue, 30 Apr 2019 06:25:32 -0700 (PDT) X-Received: by 2002:ac8:37cf:: with SMTP id e15mr45227860qtc.161.1556630730308; Tue, 30 Apr 2019 06:25:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1556630730; cv=none; d=google.com; s=arc-20160816; b=c49P/VnsOc5+TZnFKPL/wJC9flaIuXxrzZlbnvPy90x/syBNpLHPHFVn8NkClhH4Gy eVGlT3hTTu72aPY7D9w0U1SLALRmfNYbds8WfClyNplGFDDjZWFbCZb73QALtLsyP8av 2dL9xBE5VXPMyQzLXtuc4ZmeWgNl/JwblVLc2qwUlYhLiVtGpXns4dohejpI83Sz2Mus iQS+UYudA6Jc7zx2oYh3NulKOky3AOQJs7X1GWbzTzmpmkkFW/6dvM5vFHQuzy1ls+6S o46TkjazXYkzCK0W6CNN4CxOIZtrRW6jdJp+9FnoyA/59ClQlhSyP2T8fOIfDFskH8xr wl5A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:dkim-signature; bh=UuoCrbzv5b8zt4OTCxiSxJOR/ubA99fvcSSBJXRKnzE=; b=iffaFKr57b8mNnUIYyXDPcfSJeyuzynZ+nBbmn2z2IyhiEKjGlXTGhZ4yuRYkW7hQG 7R7FumMhgG56Q2mZy3Oc/3A4fOwpbaxqHCpDZchUXI5mDqRBmM/XEWncJaVpTLAN4hyC bjMR9Wrv5g/AWsbH9R4zThnU1newTyxhHenz+k091T8R6m16uDo+6b80mf+w/jZ96E95 jnTRBjA51k/95J0yftfV10GJfYNIhP7M1aYaf67/oBv4DpiNHMXRzaypSIrb5zDyv91k mQWW27Em4AdI6KS6pG/4yXbTTHwf31CUGplOWRuAlphh6p4w6C8Zz0mPdplax/cFOJcd TBEw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=EMsH7fAm; spf=pass (google.com: domain of 3yuzixaokchaobrfsmybjzuccuzs.qcazwbil-aayjoqy.cfu@flex--andreyknvl.bounces.google.com designates 209.85.220.73 as permitted sender) smtp.mailfrom=3yUzIXAoKCHAObRfSmYbjZUccUZS.QcaZWbil-aaYjOQY.cfU@flex--andreyknvl.bounces.google.com; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from mail-sor-f73.google.com (mail-sor-f73.google.com. [209.85.220.73]) by mx.google.com with SMTPS id x30sor25928580qtx.32.2019.04.30.06.25.30 for (Google Transport Security); Tue, 30 Apr 2019 06:25:30 -0700 (PDT) Received-SPF: pass (google.com: domain of 3yuzixaokchaobrfsmybjzuccuzs.qcazwbil-aayjoqy.cfu@flex--andreyknvl.bounces.google.com designates 209.85.220.73 as permitted sender) client-ip=209.85.220.73; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=EMsH7fAm; spf=pass (google.com: domain of 3yuzixaokchaobrfsmybjzuccuzs.qcazwbil-aayjoqy.cfu@flex--andreyknvl.bounces.google.com designates 209.85.220.73 as permitted sender) smtp.mailfrom=3yUzIXAoKCHAObRfSmYbjZUccUZS.QcaZWbil-aaYjOQY.cfU@flex--andreyknvl.bounces.google.com; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=UuoCrbzv5b8zt4OTCxiSxJOR/ubA99fvcSSBJXRKnzE=; b=EMsH7fAmK7kssVH435xmw0fz8das3WJgqzcDhNDYcyNxT5R0lqyHJSCTVHjvHDoeX+ Cza3A/NVV6ocMFLWoamQ+mfGRm49tfzKF3MhBWvOqAgPt6yw9J0OfI4XW2LYU061p6xE lsYpAPjHTbjIvJQMGITeBpzWOzD+bLOP4JeuNLZ4Y59nTzhZYrryO2y1QHRZe5PsF4Kq 1k7nYocm+I8OJrfpYBCzNhUlHWXyxOA7qA54JGzMMPes5NhT/VZxDAcGpY1YtXIQ5fGB 49LUw4hoVyjUL2HjXkaJRjmlKQjWZ4aEw3gNEcGNBKR7XbkrglHE0mKoQOjQ1gwI2nTz ddeQ== X-Google-Smtp-Source: APXvYqyzLKXo6t2omzvNodpOthN7YasDWFmvmmzG+JTX/M7Y9x0ihUYQpCMV5H3WwRxhB44c+mxJSKrcqNn1q5HN X-Received: by 2002:ac8:21c7:: with SMTP id 7mr53156593qtz.66.1556630729885; Tue, 30 Apr 2019 06:25:29 -0700 (PDT) Date: Tue, 30 Apr 2019 15:25:00 +0200 In-Reply-To: Message-Id: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.21.0.593.g511ec345e18-goog Subject: [PATCH v14 04/17] mm: add ksys_ wrappers to memory syscalls From: Andrey Konovalov To: linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-rdma@vger.kernel.org, linux-media@vger.kernel.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org Cc: Catalin Marinas , Vincenzo Frascino , Will Deacon , Mark Rutland , Andrew Morton , Greg Kroah-Hartman , Kees Cook , Yishai Hadas , Kuehling@google.com, Felix , Deucher@google.com, Alexander , Koenig@google.com, Christian , Mauro Carvalho Chehab , Jens Wiklander , Alex Williamson , Leon Romanovsky , Dmitry Vyukov , Kostya Serebryany , Evgeniy Stepanov , Lee Smith , Ramana Radhakrishnan , Jacob Bramley , Ruben Ayrapetyan , Robin Murphy , Chintan Pandya , Luc Van Oostenryck , Dave Martin , Kevin Brodsky , Szabolcs Nagy , Andrey Konovalov Content-Type: text/plain; charset="UTF-8" X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This patch is a part of a series that extends arm64 kernel ABI to allow to pass tagged user pointers (with the top byte set to something else other than 0x00) as syscall arguments. This patch adds ksys_ wrappers to the following memory syscalls: brk, get_mempolicy (renamed kernel_get_mempolicy -> ksys_get_mempolicy), madvise, mbind (renamed kernel_mbind -> ksys_mbind), mincore, mlock (renamed do_mlock -> ksys_mlock), mlock2, mmap_pgoff, mprotect (renamed do_mprotect_pkey -> ksys_mprotect_pkey), mremap, msync, munlock, munmap, remap_file_pages, shmat, shmdt. The next patch in this series will add a custom implementation for these syscalls that makes them accept tagged pointers on arm64. Signed-off-by: Andrey Konovalov --- include/linux/syscalls.h | 22 +++++++ ipc/shm.c | 7 ++- mm/madvise.c | 129 ++++++++++++++++++++------------------- mm/mempolicy.c | 21 +++---- mm/mincore.c | 57 +++++++++-------- mm/mlock.c | 20 ++++-- mm/mmap.c | 30 ++++++--- mm/mprotect.c | 6 +- mm/mremap.c | 27 +++++--- mm/msync.c | 35 ++++++----- 10 files changed, 213 insertions(+), 141 deletions(-) diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index e446806a561f..70008f5ed84f 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -1260,6 +1260,28 @@ int ksys_ipc(unsigned int call, int first, unsigned long second, unsigned long third, void __user * ptr, long fifth); int compat_ksys_ipc(u32 call, int first, int second, u32 third, u32 ptr, u32 fifth); +unsigned long ksys_mremap(unsigned long addr, unsigned long old_len, + unsigned long new_len, unsigned long flags, + unsigned long new_addr); +int ksys_munmap(unsigned long addr, size_t len); +unsigned long ksys_brk(unsigned long brk); +int ksys_get_mempolicy(int __user *policy, unsigned long __user *nmask, + unsigned long maxnode, unsigned long addr, unsigned long flags); +int ksys_madvise(unsigned long start, size_t len_in, int behavior); +long ksys_mbind(unsigned long start, unsigned long len, + unsigned long mode, const unsigned long __user *nmask, + unsigned long maxnode, unsigned int flags); +__must_check int ksys_mlock(unsigned long start, size_t len, vm_flags_t flags); +__must_check int ksys_mlock2(unsigned long start, size_t len, vm_flags_t flags); +int ksys_munlock(unsigned long start, size_t len); +int ksys_mprotect_pkey(unsigned long start, size_t len, + unsigned long prot, int pkey); +int ksys_msync(unsigned long start, size_t len, int flags); +long ksys_mincore(unsigned long start, size_t len, unsigned char __user *vec); +unsigned long ksys_remap_file_pages(unsigned long start, unsigned long size, + unsigned long prot, unsigned long pgoff, unsigned long flags); +long ksys_shmat(int shmid, char __user *shmaddr, int shmflg); +long ksys_shmdt(char __user *shmaddr); /* * The following kernel syscall equivalents are just wrappers to fs-internal diff --git a/ipc/shm.c b/ipc/shm.c index ce1ca9f7c6e9..557b43968c0e 100644 --- a/ipc/shm.c +++ b/ipc/shm.c @@ -1588,7 +1588,7 @@ long do_shmat(int shmid, char __user *shmaddr, int shmflg, return err; } -SYSCALL_DEFINE3(shmat, int, shmid, char __user *, shmaddr, int, shmflg) +long ksys_shmat(int shmid, char __user *shmaddr, int shmflg) { unsigned long ret; long err; @@ -1600,6 +1600,11 @@ SYSCALL_DEFINE3(shmat, int, shmid, char __user *, shmaddr, int, shmflg) return (long)ret; } +SYSCALL_DEFINE3(shmat, int, shmid, char __user *, shmaddr, int, shmflg) +{ + return ksys_shmat(shmid, shmaddr, shmflg); +} + #ifdef CONFIG_COMPAT #ifndef COMPAT_SHMLBA diff --git a/mm/madvise.c b/mm/madvise.c index 21a7881a2db4..c27f5f14e2ee 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -738,68 +738,7 @@ madvise_behavior_valid(int behavior) } } -/* - * The madvise(2) system call. - * - * Applications can use madvise() to advise the kernel how it should - * handle paging I/O in this VM area. The idea is to help the kernel - * use appropriate read-ahead and caching techniques. The information - * provided is advisory only, and can be safely disregarded by the - * kernel without affecting the correct operation of the application. - * - * behavior values: - * MADV_NORMAL - the default behavior is to read clusters. This - * results in some read-ahead and read-behind. - * MADV_RANDOM - the system should read the minimum amount of data - * on any access, since it is unlikely that the appli- - * cation will need more than what it asks for. - * MADV_SEQUENTIAL - pages in the given range will probably be accessed - * once, so they can be aggressively read ahead, and - * can be freed soon after they are accessed. - * MADV_WILLNEED - the application is notifying the system to read - * some pages ahead. - * MADV_DONTNEED - the application is finished with the given range, - * so the kernel can free resources associated with it. - * MADV_FREE - the application marks pages in the given range as lazy free, - * where actual purges are postponed until memory pressure happens. - * MADV_REMOVE - the application wants to free up the given range of - * pages and associated backing store. - * MADV_DONTFORK - omit this area from child's address space when forking: - * typically, to avoid COWing pages pinned by get_user_pages(). - * MADV_DOFORK - cancel MADV_DONTFORK: no longer omit this area when forking. - * MADV_WIPEONFORK - present the child process with zero-filled memory in this - * range after a fork. - * MADV_KEEPONFORK - undo the effect of MADV_WIPEONFORK - * MADV_HWPOISON - trigger memory error handler as if the given memory range - * were corrupted by unrecoverable hardware memory failure. - * MADV_SOFT_OFFLINE - try to soft-offline the given range of memory. - * MADV_MERGEABLE - the application recommends that KSM try to merge pages in - * this area with pages of identical content from other such areas. - * MADV_UNMERGEABLE- cancel MADV_MERGEABLE: no longer merge pages with others. - * MADV_HUGEPAGE - the application wants to back the given range by transparent - * huge pages in the future. Existing pages might be coalesced and - * new pages might be allocated as THP. - * MADV_NOHUGEPAGE - mark the given range as not worth being backed by - * transparent huge pages so the existing pages will not be - * coalesced into THP and new pages will not be allocated as THP. - * MADV_DONTDUMP - the application wants to prevent pages in the given range - * from being included in its core dump. - * MADV_DODUMP - cancel MADV_DONTDUMP: no longer exclude from core dump. - * - * return values: - * zero - success - * -EINVAL - start + len < 0, start is not page-aligned, - * "behavior" is not a valid value, or application - * is attempting to release locked or shared pages, - * or the specified address range includes file, Huge TLB, - * MAP_SHARED or VMPFNMAP range. - * -ENOMEM - addresses in the specified range are not currently - * mapped, or are outside the AS of the process. - * -EIO - an I/O error occurred while paging in data. - * -EBADF - map exists, but area maps something that isn't a file. - * -EAGAIN - a kernel resource was temporarily unavailable. - */ -SYSCALL_DEFINE3(madvise, unsigned long, start, size_t, len_in, int, behavior) +int ksys_madvise(unsigned long start, size_t len_in, int behavior) { unsigned long end, tmp; struct vm_area_struct *vma, *prev; @@ -894,3 +833,69 @@ SYSCALL_DEFINE3(madvise, unsigned long, start, size_t, len_in, int, behavior) return error; } + +/* + * The madvise(2) system call. + * + * Applications can use madvise() to advise the kernel how it should + * handle paging I/O in this VM area. The idea is to help the kernel + * use appropriate read-ahead and caching techniques. The information + * provided is advisory only, and can be safely disregarded by the + * kernel without affecting the correct operation of the application. + * + * behavior values: + * MADV_NORMAL - the default behavior is to read clusters. This + * results in some read-ahead and read-behind. + * MADV_RANDOM - the system should read the minimum amount of data + * on any access, since it is unlikely that the appli- + * cation will need more than what it asks for. + * MADV_SEQUENTIAL - pages in the given range will probably be accessed + * once, so they can be aggressively read ahead, and + * can be freed soon after they are accessed. + * MADV_WILLNEED - the application is notifying the system to read + * some pages ahead. + * MADV_DONTNEED - the application is finished with the given range, + * so the kernel can free resources associated with it. + * MADV_FREE - the application marks pages in the given range as lazy free, + * where actual purges are postponed until memory pressure happens. + * MADV_REMOVE - the application wants to free up the given range of + * pages and associated backing store. + * MADV_DONTFORK - omit this area from child's address space when forking: + * typically, to avoid COWing pages pinned by get_user_pages(). + * MADV_DOFORK - cancel MADV_DONTFORK: no longer omit this area when forking. + * MADV_WIPEONFORK - present the child process with zero-filled memory in this + * range after a fork. + * MADV_KEEPONFORK - undo the effect of MADV_WIPEONFORK + * MADV_HWPOISON - trigger memory error handler as if the given memory range + * were corrupted by unrecoverable hardware memory failure. + * MADV_SOFT_OFFLINE - try to soft-offline the given range of memory. + * MADV_MERGEABLE - the application recommends that KSM try to merge pages in + * this area with pages of identical content from other such areas. + * MADV_UNMERGEABLE- cancel MADV_MERGEABLE: no longer merge pages with others. + * MADV_HUGEPAGE - the application wants to back the given range by transparent + * huge pages in the future. Existing pages might be coalesced and + * new pages might be allocated as THP. + * MADV_NOHUGEPAGE - mark the given range as not worth being backed by + * transparent huge pages so the existing pages will not be + * coalesced into THP and new pages will not be allocated as THP. + * MADV_DONTDUMP - the application wants to prevent pages in the given range + * from being included in its core dump. + * MADV_DODUMP - cancel MADV_DONTDUMP: no longer exclude from core dump. + * + * return values: + * zero - success + * -EINVAL - start + len < 0, start is not page-aligned, + * "behavior" is not a valid value, or application + * is attempting to release locked or shared pages, + * or the specified address range includes file, Huge TLB, + * MAP_SHARED or VMPFNMAP range. + * -ENOMEM - addresses in the specified range are not currently + * mapped, or are outside the AS of the process. + * -EIO - an I/O error occurred while paging in data. + * -EBADF - map exists, but area maps something that isn't a file. + * -EAGAIN - a kernel resource was temporarily unavailable. + */ +SYSCALL_DEFINE3(madvise, unsigned long, start, size_t, len_in, int, behavior) +{ + return ksys_madvise(start, len_in, behavior); +} diff --git a/mm/mempolicy.c b/mm/mempolicy.c index 2219e747df49..c2f82a045ceb 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -1352,9 +1352,9 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode, return copy_to_user(mask, nodes_addr(*nodes), copy) ? -EFAULT : 0; } -static long kernel_mbind(unsigned long start, unsigned long len, - unsigned long mode, const unsigned long __user *nmask, - unsigned long maxnode, unsigned int flags) +long ksys_mbind(unsigned long start, unsigned long len, + unsigned long mode, const unsigned long __user *nmask, + unsigned long maxnode, unsigned int flags) { nodemask_t nodes; int err; @@ -1377,7 +1377,7 @@ SYSCALL_DEFINE6(mbind, unsigned long, start, unsigned long, len, unsigned long, mode, const unsigned long __user *, nmask, unsigned long, maxnode, unsigned int, flags) { - return kernel_mbind(start, len, mode, nmask, maxnode, flags); + return ksys_mbind(start, len, mode, nmask, maxnode, flags); } /* Set the process memory policy */ @@ -1507,11 +1507,8 @@ SYSCALL_DEFINE4(migrate_pages, pid_t, pid, unsigned long, maxnode, /* Retrieve NUMA policy */ -static int kernel_get_mempolicy(int __user *policy, - unsigned long __user *nmask, - unsigned long maxnode, - unsigned long addr, - unsigned long flags) +int ksys_get_mempolicy(int __user *policy, unsigned long __user *nmask, + unsigned long maxnode, unsigned long addr, unsigned long flags) { int err; int uninitialized_var(pval); @@ -1538,7 +1535,7 @@ SYSCALL_DEFINE5(get_mempolicy, int __user *, policy, unsigned long __user *, nmask, unsigned long, maxnode, unsigned long, addr, unsigned long, flags) { - return kernel_get_mempolicy(policy, nmask, maxnode, addr, flags); + return ksys_get_mempolicy(policy, nmask, maxnode, addr, flags); } #ifdef CONFIG_COMPAT @@ -1559,7 +1556,7 @@ COMPAT_SYSCALL_DEFINE5(get_mempolicy, int __user *, policy, if (nmask) nm = compat_alloc_user_space(alloc_size); - err = kernel_get_mempolicy(policy, nm, nr_bits+1, addr, flags); + err = ksys_get_mempolicy(policy, nm, nr_bits+1, addr, flags); if (!err && nmask) { unsigned long copy_size; @@ -1613,7 +1610,7 @@ COMPAT_SYSCALL_DEFINE6(mbind, compat_ulong_t, start, compat_ulong_t, len, return -EFAULT; } - return kernel_mbind(start, len, mode, nm, nr_bits+1, flags); + return ksys_mbind(start, len, mode, nm, nr_bits+1, flags); } COMPAT_SYSCALL_DEFINE4(migrate_pages, compat_pid_t, pid, diff --git a/mm/mincore.c b/mm/mincore.c index 218099b5ed31..a609bd8128da 100644 --- a/mm/mincore.c +++ b/mm/mincore.c @@ -197,32 +197,7 @@ static long do_mincore(unsigned long addr, unsigned long pages, unsigned char *v return (end - addr) >> PAGE_SHIFT; } -/* - * The mincore(2) system call. - * - * mincore() returns the memory residency status of the pages in the - * current process's address space specified by [addr, addr + len). - * The status is returned in a vector of bytes. The least significant - * bit of each byte is 1 if the referenced page is in memory, otherwise - * it is zero. - * - * Because the status of a page can change after mincore() checks it - * but before it returns to the application, the returned vector may - * contain stale information. Only locked pages are guaranteed to - * remain in memory. - * - * return values: - * zero - success - * -EFAULT - vec points to an illegal address - * -EINVAL - addr is not a multiple of PAGE_SIZE - * -ENOMEM - Addresses in the range [addr, addr + len] are - * invalid for the address space of this process, or - * specify one or more pages which are not currently - * mapped - * -EAGAIN - A kernel resource was temporarily unavailable. - */ -SYSCALL_DEFINE3(mincore, unsigned long, start, size_t, len, - unsigned char __user *, vec) +long ksys_mincore(unsigned long start, size_t len, unsigned char __user *vec) { long retval; unsigned long pages; @@ -271,3 +246,33 @@ SYSCALL_DEFINE3(mincore, unsigned long, start, size_t, len, free_page((unsigned long) tmp); return retval; } + +/* + * The mincore(2) system call. + * + * mincore() returns the memory residency status of the pages in the + * current process's address space specified by [addr, addr + len). + * The status is returned in a vector of bytes. The least significant + * bit of each byte is 1 if the referenced page is in memory, otherwise + * it is zero. + * + * Because the status of a page can change after mincore() checks it + * but before it returns to the application, the returned vector may + * contain stale information. Only locked pages are guaranteed to + * remain in memory. + * + * return values: + * zero - success + * -EFAULT - vec points to an illegal address + * -EINVAL - addr is not a multiple of PAGE_SIZE + * -ENOMEM - Addresses in the range [addr, addr + len] are + * invalid for the address space of this process, or + * specify one or more pages which are not currently + * mapped + * -EAGAIN - A kernel resource was temporarily unavailable. + */ +SYSCALL_DEFINE3(mincore, unsigned long, start, size_t, len, + unsigned char __user *, vec) +{ + return ksys_mincore(start, len, vec); +} diff --git a/mm/mlock.c b/mm/mlock.c index 080f3b36415b..09e449447539 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -668,7 +668,7 @@ static int count_mm_mlocked_page_nr(struct mm_struct *mm, return count >> PAGE_SHIFT; } -static __must_check int do_mlock(unsigned long start, size_t len, vm_flags_t flags) +__must_check int ksys_mlock(unsigned long start, size_t len, vm_flags_t flags) { unsigned long locked; unsigned long lock_limit; @@ -715,10 +715,10 @@ static __must_check int do_mlock(unsigned long start, size_t len, vm_flags_t fla SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len) { - return do_mlock(start, len, VM_LOCKED); + return ksys_mlock(start, len, VM_LOCKED); } -SYSCALL_DEFINE3(mlock2, unsigned long, start, size_t, len, int, flags) +__must_check int ksys_mlock2(unsigned long start, size_t len, vm_flags_t flags) { vm_flags_t vm_flags = VM_LOCKED; @@ -728,10 +728,15 @@ SYSCALL_DEFINE3(mlock2, unsigned long, start, size_t, len, int, flags) if (flags & MLOCK_ONFAULT) vm_flags |= VM_LOCKONFAULT; - return do_mlock(start, len, vm_flags); + return ksys_mlock(start, len, vm_flags); } -SYSCALL_DEFINE2(munlock, unsigned long, start, size_t, len) +SYSCALL_DEFINE3(mlock2, unsigned long, start, size_t, len, int, flags) +{ + return ksys_mlock2(start, len, flags); +} + +int ksys_munlock(unsigned long start, size_t len) { int ret; @@ -746,6 +751,11 @@ SYSCALL_DEFINE2(munlock, unsigned long, start, size_t, len) return ret; } +SYSCALL_DEFINE2(munlock, unsigned long, start, size_t, len) +{ + return ksys_munlock(start, len); +} + /* * Take the MCL_* flags passed into mlockall (or 0 if called from munlockall) * and translate into the appropriate modifications to mm->def_flags and/or the diff --git a/mm/mmap.c b/mm/mmap.c index bd7b9f293b39..09bfaf36b961 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -189,7 +189,8 @@ static struct vm_area_struct *remove_vma(struct vm_area_struct *vma) static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long flags, struct list_head *uf); -SYSCALL_DEFINE1(brk, unsigned long, brk) + +unsigned long ksys_brk(unsigned long brk) { unsigned long retval; unsigned long newbrk, oldbrk, origbrk; @@ -288,6 +289,11 @@ SYSCALL_DEFINE1(brk, unsigned long, brk) return retval; } +SYSCALL_DEFINE1(brk, unsigned long, brk) +{ + return ksys_brk(brk); +} + static long vma_compute_subtree_gap(struct vm_area_struct *vma) { unsigned long max, prev_end, subtree_gap; @@ -2870,18 +2876,19 @@ int vm_munmap(unsigned long start, size_t len) } EXPORT_SYMBOL(vm_munmap); -SYSCALL_DEFINE2(munmap, unsigned long, addr, size_t, len) +int ksys_munmap(unsigned long addr, size_t len) { profile_munmap(addr); return __vm_munmap(addr, len, true); } +SYSCALL_DEFINE2(munmap, unsigned long, addr, size_t, len) +{ + return ksys_munmap(addr, len); +} -/* - * Emulation of deprecated remap_file_pages() syscall. - */ -SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, - unsigned long, prot, unsigned long, pgoff, unsigned long, flags) +unsigned long ksys_remap_file_pages(unsigned long start, unsigned long size, + unsigned long prot, unsigned long pgoff, unsigned long flags) { struct mm_struct *mm = current->mm; @@ -2976,6 +2983,15 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, return ret; } +/* + * Emulation of deprecated remap_file_pages() syscall. + */ +SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, + unsigned long, prot, unsigned long, pgoff, unsigned long, flags) +{ + return ksys_remap_file_pages(start, size, prot, pgoff, flags); +} + /* * this is really a simplified "do_mmap". it only handles * anonymous maps. eventually we may be able to do some diff --git a/mm/mprotect.c b/mm/mprotect.c index 028c724dcb1a..07344bdd7a04 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -454,7 +454,7 @@ mprotect_fixup(struct vm_area_struct *vma, struct vm_area_struct **pprev, /* * pkey==-1 when doing a legacy mprotect() */ -static int do_mprotect_pkey(unsigned long start, size_t len, +int ksys_mprotect_pkey(unsigned long start, size_t len, unsigned long prot, int pkey) { unsigned long nstart, end, tmp, reqprot; @@ -578,7 +578,7 @@ static int do_mprotect_pkey(unsigned long start, size_t len, SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len, unsigned long, prot) { - return do_mprotect_pkey(start, len, prot, -1); + return ksys_mprotect_pkey(start, len, prot, -1); } #ifdef CONFIG_ARCH_HAS_PKEYS @@ -586,7 +586,7 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len, SYSCALL_DEFINE4(pkey_mprotect, unsigned long, start, size_t, len, unsigned long, prot, int, pkey) { - return do_mprotect_pkey(start, len, prot, pkey); + return ksys_mprotect_pkey(start, len, prot, pkey); } SYSCALL_DEFINE2(pkey_alloc, unsigned long, flags, unsigned long, init_val) diff --git a/mm/mremap.c b/mm/mremap.c index e3edef6b7a12..fec1f9911388 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -584,16 +584,9 @@ static int vma_expandable(struct vm_area_struct *vma, unsigned long delta) return 1; } -/* - * Expand (or shrink) an existing mapping, potentially moving it at the - * same time (controlled by the MREMAP_MAYMOVE flag and available VM space) - * - * MREMAP_FIXED option added 5-Dec-1999 by Benjamin LaHaise - * This option implies MREMAP_MAYMOVE. - */ -SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, - unsigned long, new_len, unsigned long, flags, - unsigned long, new_addr) +unsigned long ksys_mremap(unsigned long addr, unsigned long old_len, + unsigned long new_len, unsigned long flags, + unsigned long new_addr) { struct mm_struct *mm = current->mm; struct vm_area_struct *vma; @@ -726,3 +719,17 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, userfaultfd_unmap_complete(mm, &uf_unmap); return ret; } + +/* + * Expand (or shrink) an existing mapping, potentially moving it at the + * same time (controlled by the MREMAP_MAYMOVE flag and available VM space) + * + * MREMAP_FIXED option added 5-Dec-1999 by Benjamin LaHaise + * This option implies MREMAP_MAYMOVE. + */ +SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, + unsigned long, new_len, unsigned long, flags, + unsigned long, new_addr) +{ + return ksys_mremap(addr, old_len, new_len, flags, new_addr); +} diff --git a/mm/msync.c b/mm/msync.c index ef30a429623a..b5a013549626 100644 --- a/mm/msync.c +++ b/mm/msync.c @@ -15,21 +15,7 @@ #include #include -/* - * MS_SYNC syncs the entire file - including mappings. - * - * MS_ASYNC does not start I/O (it used to, up to 2.5.67). - * Nor does it marks the relevant pages dirty (it used to up to 2.6.17). - * Now it doesn't do anything, since dirty pages are properly tracked. - * - * The application may now run fsync() to - * write out the dirty pages and wait on the writeout and check the result. - * Or the application may run fadvise(FADV_DONTNEED) against the fd to start - * async writeout immediately. - * So by _not_ starting I/O in MS_ASYNC we provide complete flexibility to - * applications. - */ -SYSCALL_DEFINE3(msync, unsigned long, start, size_t, len, int, flags) +int ksys_msync(unsigned long start, size_t len, int flags) { unsigned long end; struct mm_struct *mm = current->mm; @@ -106,3 +92,22 @@ SYSCALL_DEFINE3(msync, unsigned long, start, size_t, len, int, flags) out: return error ? : unmapped_error; } + +/* + * MS_SYNC syncs the entire file - including mappings. + * + * MS_ASYNC does not start I/O (it used to, up to 2.5.67). + * Nor does it marks the relevant pages dirty (it used to up to 2.6.17). + * Now it doesn't do anything, since dirty pages are properly tracked. + * + * The application may now run fsync() to + * write out the dirty pages and wait on the writeout and check the result. + * Or the application may run fadvise(FADV_DONTNEED) against the fd to start + * async writeout immediately. + * So by _not_ starting I/O in MS_ASYNC we provide complete flexibility to + * applications. + */ +SYSCALL_DEFINE3(msync, unsigned long, start, size_t, len, int, flags) +{ + return ksys_msync(start, len, flags); +} -- 2.21.0.593.g511ec345e18-goog