From: Pengfei Li
To: akpm@linux-foundation.org
Cc: bmt@zurich.ibm.com, dledford@redhat.com, willy@infradead.org,
    vbabka@suse.cz, kirill.shutemov@linux.intel.com, jgg@ziepe.ca,
    alex.williamson@redhat.com, cohuck@redhat.com, kvm@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, Pengfei Li
Subject: [PATCH 1/2] mm: make mm->locked_vm an atomic64 counter
Date: Sun, 26 Jul 2020 16:02:23 +0800
Message-Id: <20200726080224.205470-1-fly@kernel.page>
X-Mailer: git-send-email 2.26.2

Like commit 70f8a3ca68d3 ("mm: make mm->pinned_vm an atomic64 counter"),
make mm->locked_vm an atomic64 counter so that it can be safely modified
without holding mmap_lock.

atomic64 is used rather than atomic_long to stay consistent with
mm->pinned_vm, and with a 64-bit counter there is no need to worry about
overflow.

Signed-off-by: Pengfei Li
---
 drivers/infiniband/sw/siw/siw_verbs.c | 12 +++++++-----
 drivers/vfio/vfio_iommu_type1.c       |  6 ++++--
 fs/io_uring.c                         |  4 ++--
 fs/proc/task_mmu.c                    |  2 +-
 include/linux/mm_types.h              |  4 ++--
 kernel/fork.c                         |  2 +-
 mm/debug.c                            |  5 +++--
 mm/mlock.c                            |  4 ++--
 mm/mmap.c                             | 18 +++++++++---------
 mm/mremap.c                           |  6 +++---
 mm/util.c                             |  6 +++---
 11 files changed, 37 insertions(+), 32 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
index adafa1b8bebe..bf78d7988442 100644
--- a/drivers/infiniband/sw/siw/siw_verbs.c
+++ b/drivers/infiniband/sw/siw/siw_verbs.c
@@ -1293,14 +1293,16 @@ struct ib_mr *siw_reg_user_mr(struct ib_pd *pd, u64 start, u64 len,
 		goto err_out;
 	}
 	if (mem_limit != RLIM_INFINITY) {
-		unsigned long num_pages =
-			(PAGE_ALIGN(len + (start & ~PAGE_MASK))) >> PAGE_SHIFT;
+		unsigned long num_pages, locked_pages;
+
+		num_pages = (PAGE_ALIGN(len + (start & ~PAGE_MASK)))
+			>> PAGE_SHIFT;
+		locked_pages = atomic64_read(&current->mm->locked_vm);
 		mem_limit >>= PAGE_SHIFT;
 
-		if (num_pages > mem_limit - current->mm->locked_vm) {
+		if (num_pages > mem_limit - locked_pages) {
 			siw_dbg_pd(pd, "pages req %lu, max %lu, lock %lu\n",
-				   num_pages, mem_limit,
-				   current->mm->locked_vm);
+				   num_pages, mem_limit, locked_pages);
 			rv = -ENOMEM;
 			goto err_out;
 		}
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 9d41105bfd01..78013be07fe7 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -509,7 +509,8 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 	 * pages are already counted against the user.
 	 */
 	if (!rsvd && !vfio_find_vpfn(dma, iova)) {
-		if (!dma->lock_cap && current->mm->locked_vm + 1 > limit) {
+		if (!dma->lock_cap &&
+		    atomic64_read(&current->mm->locked_vm) + 1 > limit) {
 			put_pfn(*pfn_base, dma->prot);
 			pr_warn("%s: RLIMIT_MEMLOCK (%ld) exceeded\n", __func__,
 					limit << PAGE_SHIFT);
@@ -536,7 +537,8 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 
 		if (!rsvd && !vfio_find_vpfn(dma, iova)) {
 			if (!dma->lock_cap &&
-			    current->mm->locked_vm + lock_acct + 1 > limit) {
+			    atomic64_read(&current->mm->locked_vm) +
+			    lock_acct + 1 > limit) {
 				put_pfn(pfn, dma->prot);
 				pr_warn("%s: RLIMIT_MEMLOCK (%ld) exceeded\n",
 					__func__, limit << PAGE_SHIFT);
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 7cf2f295fba7..f1241c6314e6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7371,7 +7371,7 @@ static void io_unaccount_mem(struct io_ring_ctx *ctx, unsigned long nr_pages,
 
 	if (ctx->sqo_mm) {
 		if (acct == ACCT_LOCKED)
-			ctx->sqo_mm->locked_vm -= nr_pages;
+			atomic64_sub(nr_pages, &ctx->sqo_mm->locked_vm);
 		else if (acct == ACCT_PINNED)
 			atomic64_sub(nr_pages, &ctx->sqo_mm->pinned_vm);
 	}
@@ -7390,7 +7390,7 @@ static int io_account_mem(struct io_ring_ctx *ctx, unsigned long nr_pages,
 
 	if (ctx->sqo_mm) {
 		if (acct == ACCT_LOCKED)
-			ctx->sqo_mm->locked_vm += nr_pages;
+			atomic64_add(nr_pages, &ctx->sqo_mm->locked_vm);
 		else if (acct == ACCT_PINNED)
 			atomic64_add(nr_pages, &ctx->sqo_mm->pinned_vm);
 	}
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index df2f0f05f5ba..2af56e68766e 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -58,7 +58,7 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
 	swap = get_mm_counter(mm, MM_SWAPENTS);
 	SEQ_PUT_DEC("VmPeak:\t", hiwater_vm);
 	SEQ_PUT_DEC(" kB\nVmSize:\t", total_vm);
-	SEQ_PUT_DEC(" kB\nVmLck:\t", mm->locked_vm);
+	SEQ_PUT_DEC(" kB\nVmLck:\t", atomic64_read(&mm->locked_vm));
 	SEQ_PUT_DEC(" kB\nVmPin:\t", atomic64_read(&mm->pinned_vm));
 	SEQ_PUT_DEC(" kB\nVmHWM:\t", hiwater_rss);
 	SEQ_PUT_DEC(" kB\nVmRSS:\t", total_rss);
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 496c3ff97cce..3f0ad38c534d 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -457,8 +457,8 @@ struct mm_struct {
 		unsigned long hiwater_vm;  /* High-water virtual memory usage */
 
 		unsigned long total_vm;	   /* Total pages mapped */
-		unsigned long locked_vm;   /* Pages that have PG_mlocked set */
-		atomic64_t    pinned_vm;   /* Refcount permanently increased */
+		atomic64_t locked_vm;      /* Pages that have PG_mlocked set */
+		atomic64_t pinned_vm;      /* Refcount permanently increased */
 		unsigned long data_vm;	   /* VM_WRITE & ~VM_SHARED & ~VM_STACK */
 		unsigned long exec_vm;	   /* VM_EXEC & ~VM_WRITE & ~VM_STACK */
 		unsigned long stack_vm;	   /* VM_STACK */
diff --git a/kernel/fork.c b/kernel/fork.c
index 45cdf724a2d4..8ed0d0574621 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1009,7 +1009,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	mm->core_state = NULL;
 	mm_pgtables_bytes_init(mm);
 	mm->map_count = 0;
-	mm->locked_vm = 0;
+	atomic64_set(&mm->locked_vm, 0);
 	atomic64_set(&mm->pinned_vm, 0);
 	memset(&mm->rss_stat, 0, sizeof(mm->rss_stat));
 	spin_lock_init(&mm->page_table_lock);
diff --git a/mm/debug.c b/mm/debug.c
index 8f569db9a514..c27fff1e3ca8 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -218,7 +218,7 @@ void dump_mm(const struct mm_struct *mm)
 #endif
 		"mmap_base %lu mmap_legacy_base %lu highest_vm_end %lu\n"
 		"pgd %px mm_users %d mm_count %d pgtables_bytes %lu map_count %d\n"
-		"hiwater_rss %lx hiwater_vm %lx total_vm %lx locked_vm %lx\n"
+		"hiwater_rss %lx hiwater_vm %lx total_vm %lx locked_vm %llx\n"
 		"pinned_vm %llx data_vm %lx exec_vm %lx stack_vm %lx\n"
 		"start_code %lx end_code %lx start_data %lx end_data %lx\n"
 		"start_brk %lx brk %lx start_stack %lx\n"
@@ -249,7 +249,8 @@ void dump_mm(const struct mm_struct *mm)
 		atomic_read(&mm->mm_count),
 		mm_pgtables_bytes(mm),
 		mm->map_count,
-		mm->hiwater_rss, mm->hiwater_vm, mm->total_vm, mm->locked_vm,
+		mm->hiwater_rss, mm->hiwater_vm, mm->total_vm,
+		(u64)atomic64_read(&mm->locked_vm),
 		(u64)atomic64_read(&mm->pinned_vm),
 		mm->data_vm, mm->exec_vm, mm->stack_vm,
 		mm->start_code, mm->end_code, mm->start_data, mm->end_data,
diff --git a/mm/mlock.c b/mm/mlock.c
index 93ca2bf30b4f..ec8c563ce233 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -561,7 +561,7 @@ static int mlock_fixup(struct vm_area_struct *vma, struct vm_area_struct **prev,
 		nr_pages = -nr_pages;
 	else if (old_flags & VM_LOCKED)
 		nr_pages = 0;
-	mm->locked_vm += nr_pages;
+	atomic64_add(nr_pages, &mm->locked_vm);
 
 	/*
 	 * vm_flags is protected by the mmap_lock held in write mode.
@@ -688,7 +688,7 @@ static __must_check int do_mlock(unsigned long start, size_t len, vm_flags_t fla
 	if (mmap_write_lock_killable(current->mm))
 		return -EINTR;
 
-	locked += current->mm->locked_vm;
+	locked += atomic64_read(&current->mm->locked_vm);
 	if ((locked > lock_limit) && (!capable(CAP_IPC_LOCK))) {
 		/*
 		 * It is possible that the regions requested intersect with
diff --git a/mm/mmap.c b/mm/mmap.c
index c65bd5a7f80b..17bd229f820b 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1319,7 +1319,7 @@ static inline int mlock_future_check(struct mm_struct *mm,
 	/* mlock MCL_FUTURE? */
 	if (flags & VM_LOCKED) {
 		locked = len >> PAGE_SHIFT;
-		locked += mm->locked_vm;
+		locked += atomic64_read(&mm->locked_vm);
 		lock_limit = rlimit(RLIMIT_MEMLOCK);
 		lock_limit >>= PAGE_SHIFT;
 		if (locked > lock_limit && !capable(CAP_IPC_LOCK))
@@ -1812,7 +1812,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 					vma == get_gate_vma(current->mm))
 			vma->vm_flags &= VM_LOCKED_CLEAR_MASK;
 		else
-			mm->locked_vm += (len >> PAGE_SHIFT);
+			atomic64_add(len >> PAGE_SHIFT, &mm->locked_vm);
 	}
 
 	if (file)
@@ -2323,7 +2323,7 @@ static int acct_stack_growth(struct vm_area_struct *vma,
 	if (vma->vm_flags & VM_LOCKED) {
 		unsigned long locked;
 		unsigned long limit;
-		locked = mm->locked_vm + grow;
+		locked = atomic64_read(&mm->locked_vm) + grow;
 		limit = rlimit(RLIMIT_MEMLOCK);
 		limit >>= PAGE_SHIFT;
 		if (locked > limit && !capable(CAP_IPC_LOCK))
@@ -2416,7 +2416,7 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
 				 */
 				spin_lock(&mm->page_table_lock);
 				if (vma->vm_flags & VM_LOCKED)
-					mm->locked_vm += grow;
+					atomic64_add(grow, &mm->locked_vm);
 				vm_stat_account(mm, vma->vm_flags, grow);
 				anon_vma_interval_tree_pre_update_vma(vma);
 				vma->vm_end = address;
@@ -2496,7 +2496,7 @@ int expand_downwards(struct vm_area_struct *vma,
 				 */
 				spin_lock(&mm->page_table_lock);
 				if (vma->vm_flags & VM_LOCKED)
-					mm->locked_vm += grow;
+					atomic64_add(grow, &mm->locked_vm);
 				vm_stat_account(mm, vma->vm_flags, grow);
 				anon_vma_interval_tree_pre_update_vma(vma);
 				vma->vm_start = address;
@@ -2839,11 +2839,11 @@ int __do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
 	/*
 	 * unlock any mlock()ed ranges before detaching vmas
 	 */
-	if (mm->locked_vm) {
+	if (atomic64_read(&mm->locked_vm)) {
 		struct vm_area_struct *tmp = vma;
 		while (tmp && tmp->vm_start < end) {
 			if (tmp->vm_flags & VM_LOCKED) {
-				mm->locked_vm -= vma_pages(tmp);
+				atomic64_sub(vma_pages(tmp), &mm->locked_vm);
 				munlock_vma_pages_all(tmp);
 			}
 
@@ -3083,7 +3083,7 @@ static int do_brk_flags(unsigned long addr, unsigned long len, unsigned long fla
 	mm->total_vm += len >> PAGE_SHIFT;
 	mm->data_vm += len >> PAGE_SHIFT;
 	if (flags & VM_LOCKED)
-		mm->locked_vm += (len >> PAGE_SHIFT);
+		atomic64_add(len >> PAGE_SHIFT, &mm->locked_vm);
 	vma->vm_flags |= VM_SOFTDIRTY;
 	return 0;
 }
@@ -3155,7 +3155,7 @@ void exit_mmap(struct mm_struct *mm)
 		mmap_write_unlock(mm);
 	}
 
-	if (mm->locked_vm) {
+	if (atomic64_read(&mm->locked_vm)) {
 		vma = mm->mmap;
 		while (vma) {
 			if (vma->vm_flags & VM_LOCKED)
diff --git a/mm/mremap.c b/mm/mremap.c
index 138abbae4f75..451a5a77f82a 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -455,7 +455,7 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 	}
 
 	if (vm_flags & VM_LOCKED) {
-		mm->locked_vm += new_len >> PAGE_SHIFT;
+		atomic64_add(new_len >> PAGE_SHIFT, &mm->locked_vm);
 		*locked = true;
 	}
 out:
@@ -520,7 +520,7 @@ static struct vm_area_struct *vma_to_resize(unsigned long addr,
 
 	if (vma->vm_flags & VM_LOCKED) {
 		unsigned long locked, lock_limit;
-		locked = mm->locked_vm << PAGE_SHIFT;
+		locked = atomic64_read(&mm->locked_vm) << PAGE_SHIFT;
 		lock_limit = rlimit(RLIMIT_MEMLOCK);
 		locked += new_len - old_len;
 		if (locked > lock_limit && !capable(CAP_IPC_LOCK))
@@ -765,7 +765,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
 
 		vm_stat_account(mm, vma->vm_flags, pages);
 		if (vma->vm_flags & VM_LOCKED) {
-			mm->locked_vm += pages;
+			atomic64_add(pages, &mm->locked_vm);
 			locked = true;
 			new_addr = addr;
 		}
diff --git a/mm/util.c b/mm/util.c
index 8d6280c05238..473add0dc275 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -439,7 +439,7 @@ int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
 
 	mmap_assert_write_locked(mm);
 
-	locked_vm = mm->locked_vm;
+	locked_vm = atomic64_read(&mm->locked_vm);
 	if (inc) {
 		if (!bypass_rlim) {
 			limit = task_rlimit(task, RLIMIT_MEMLOCK) >> PAGE_SHIFT;
@@ -447,10 +447,10 @@ int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
 				ret = -ENOMEM;
 		}
 		if (!ret)
-			mm->locked_vm = locked_vm + pages;
+			atomic64_add(pages, &mm->locked_vm);
 	} else {
 		WARN_ON_ONCE(pages > locked_vm);
-		mm->locked_vm = locked_vm - pages;
+		atomic64_sub(pages, &mm->locked_vm);
 	}
 
 	pr_debug("%s: [%d] caller %ps %c%lu %lu/%lu%s\n", __func__, task->pid,
-- 
2.26.2