From: Boqun Feng <boqun.feng@gmail.com>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
        "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
        Andy Lutomirski <luto@amacapital.net>,
        Dave Watson <davejwatson@fb.com>,
        linux-kernel@vger.kernel.org,
        linux-api@vger.kernel.org,
        Paul Turner <pjt@google.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        Russell King <linux@arm.linux.org.uk>,
        Thomas Gleixner <tglx@linutronix.de>,
        Ingo Molnar <mingo@redhat.com>,
        "H. Peter Anvin" <hpa@zytor.com>,
        Andrew Hunter <ahh@google.com>,
        Andi Kleen <andi@firstfloor.org>,
        Chris Lameter <cl@linux.com>,
        Ben Maurer <bmaurer@fb.com>,
        Steven Rostedt <rostedt@goodmis.org>,
        Josh Triplett <josh@joshtriplett.org>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Catalin Marinas <catalin.marinas@arm.com>,
        Will Deacon <will.deacon@arm.com>,
        Michael Kerrisk <mtk.manpages@gmail.com>
Subject: Re: [RFC PATCH v2 for 4.15 08/14] Provide cpu_opv system call
Date: Tue, 7 Nov 2017 10:07:11 +0800
Message-ID: <20171107020711.GA6095@tardis>
In-Reply-To: <20171106205644.29386-9-mathieu.desnoyers@efficios.com>

On Mon, Nov 06, 2017 at 03:56:38PM -0500, Mathieu Desnoyers wrote:
[...]
> +static int cpu_op_pin_pages(unsigned long addr, unsigned long len,
> +		struct page ***pinned_pages_ptr, size_t *nr_pinned,
> +		int write)
> +{
> +	struct page *pages[2];
> +	int ret, nr_pages;
> +
> +	if (!len)
> +		return 0;
> +	nr_pages = cpu_op_range_nr_pages(addr, len);
> +	BUG_ON(nr_pages > 2);
> +	if (*nr_pinned + nr_pages > NR_PINNED_PAGES_ON_STACK) {

Is this a bug? It seems you will kzalloc() a fresh array on every call
once *nr_pinned grows past NR_PINNED_PAGES_ON_STACK, overwriting
*pinned_pages_ptr without freeing the array allocated by the previous
call, which results in a memory leak.

I think the logic here is complex enough for us to introduce a
structure, like:

	struct cpu_opv_page_pinner {
		int nr_pinned;
		bool is_kmalloc;
		struct page **pinned_pages;
	};

Thoughts?
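To make that concrete, here is a minimal, untested sketch of how such
helpers could look (the helper names are hypothetical, not part of the
patch):

	/*
	 * Release pinned pages; free the backing array only when it was
	 * switched from the caller's on-stack buffer to a kzalloc()'ed
	 * one.
	 */
	static void cpu_opv_page_pinner_release(struct cpu_opv_page_pinner *p)
	{
		int i;

		for (i = 0; i < p->nr_pinned; i++)
			put_page(p->pinned_pages[i]);
		p->nr_pinned = 0;
		if (p->is_kmalloc) {
			kfree(p->pinned_pages);
			p->is_kmalloc = false;
		}
	}

	/*
	 * Switch from the on-stack array to a kzalloc()'ed one exactly
	 * once, so a later grow cannot leak an earlier allocation.
	 */
	static int cpu_opv_page_pinner_grow(struct cpu_opv_page_pinner *p)
	{
		struct page **pages;

		if (p->is_kmalloc)
			return 0;
		pages = kzalloc(CPU_OP_VEC_LEN_MAX * CPU_OP_MAX_PAGES *
				sizeof(struct page *), GFP_KERNEL);
		if (!pages)
			return -ENOMEM;
		memcpy(pages, p->pinned_pages,
		       p->nr_pinned * sizeof(struct page *));
		p->pinned_pages = pages;
		p->is_kmalloc = true;
		return 0;
	}

The is_kmalloc flag is what guards against both the repeated kzalloc()
and freeing the on-stack buffer by mistake.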
Regards,
Boqun

> +		struct page **pinned_pages =
> +			kzalloc(CPU_OP_VEC_LEN_MAX * CPU_OP_MAX_PAGES
> +				* sizeof(struct page *), GFP_KERNEL);
> +		if (!pinned_pages)
> +			return -ENOMEM;
> +		memcpy(pinned_pages, *pinned_pages_ptr,
> +			*nr_pinned * sizeof(struct page *));
> +		*pinned_pages_ptr = pinned_pages;
> +	}
> +again:
> +	ret = get_user_pages_fast(addr, nr_pages, write, pages);
> +	if (ret < nr_pages) {
> +		if (ret > 0)
> +			put_page(pages[0]);
> +		return -EFAULT;
> +	}
> +	/*
> +	 * Refuse device pages, the zero page, pages in the gate area,
> +	 * and special mappings.
> +	 */
> +	ret = cpu_op_check_pages(pages, nr_pages);
> +	if (ret == -EAGAIN) {
> +		put_page(pages[0]);
> +		if (nr_pages > 1)
> +			put_page(pages[1]);
> +		goto again;
> +	}
> +	if (ret)
> +		goto error;
> +	(*pinned_pages_ptr)[(*nr_pinned)++] = pages[0];
> +	if (nr_pages > 1)
> +		(*pinned_pages_ptr)[(*nr_pinned)++] = pages[1];
> +	return 0;
> +
> +error:
> +	put_page(pages[0]);
> +	if (nr_pages > 1)
> +		put_page(pages[1]);
> +	return -EFAULT;
> +}
> +
> +static int cpu_opv_pin_pages(struct cpu_op *cpuop, int cpuopcnt,
> +		struct page ***pinned_pages_ptr, size_t *nr_pinned)
> +{
> +	int ret, i;
> +	bool expect_fault = false;
> +
> +	/* Check access, pin pages. */
> +	for (i = 0; i < cpuopcnt; i++) {
> +		struct cpu_op *op = &cpuop[i];
> +
> +		switch (op->op) {
> +		case CPU_COMPARE_EQ_OP:
> +		case CPU_COMPARE_NE_OP:
> +			ret = -EFAULT;
> +			expect_fault = op->u.compare_op.expect_fault_a;
> +			if (!access_ok(VERIFY_READ, op->u.compare_op.a,
> +					op->len))
> +				goto error;
> +			ret = cpu_op_pin_pages(
> +					(unsigned long)op->u.compare_op.a,
> +					op->len, pinned_pages_ptr, nr_pinned, 0);
> +			if (ret)
> +				goto error;
> +			ret = -EFAULT;
> +			expect_fault = op->u.compare_op.expect_fault_b;
> +			if (!access_ok(VERIFY_READ, op->u.compare_op.b,
> +					op->len))
> +				goto error;
> +			ret = cpu_op_pin_pages(
> +					(unsigned long)op->u.compare_op.b,
> +					op->len, pinned_pages_ptr, nr_pinned, 0);
> +			if (ret)
> +				goto error;
> +			break;
> +		case CPU_MEMCPY_OP:
> +			ret = -EFAULT;
> +			expect_fault = op->u.memcpy_op.expect_fault_dst;
> +			if (!access_ok(VERIFY_WRITE, op->u.memcpy_op.dst,
> +					op->len))
> +				goto error;
> +			ret = cpu_op_pin_pages(
> +					(unsigned long)op->u.memcpy_op.dst,
> +					op->len, pinned_pages_ptr, nr_pinned, 1);
> +			if (ret)
> +				goto error;
> +			ret = -EFAULT;
> +			expect_fault = op->u.memcpy_op.expect_fault_src;
> +			if (!access_ok(VERIFY_READ, op->u.memcpy_op.src,
> +					op->len))
> +				goto error;
> +			ret = cpu_op_pin_pages(
> +					(unsigned long)op->u.memcpy_op.src,
> +					op->len, pinned_pages_ptr, nr_pinned, 0);
> +			if (ret)
> +				goto error;
> +			break;
> +		case CPU_ADD_OP:
> +			ret = -EFAULT;
> +			expect_fault = op->u.arithmetic_op.expect_fault_p;
> +			if (!access_ok(VERIFY_WRITE, op->u.arithmetic_op.p,
> +					op->len))
> +				goto error;
> +			ret = cpu_op_pin_pages(
> +					(unsigned long)op->u.arithmetic_op.p,
> +					op->len, pinned_pages_ptr, nr_pinned, 1);
> +			if (ret)
> +				goto error;
> +			break;
> +		case CPU_OR_OP:
> +		case CPU_AND_OP:
> +		case CPU_XOR_OP:
> +			ret = -EFAULT;
> +			expect_fault = op->u.bitwise_op.expect_fault_p;
> +			if (!access_ok(VERIFY_WRITE, op->u.bitwise_op.p,
> +					op->len))
> +				goto error;
> +			ret = cpu_op_pin_pages(
> +					(unsigned long)op->u.bitwise_op.p,
> +					op->len, pinned_pages_ptr, nr_pinned, 1);
> +			if (ret)
> +				goto error;
> +			break;
> +		case CPU_LSHIFT_OP:
> +		case CPU_RSHIFT_OP:
> +			ret = -EFAULT;
> +			expect_fault = op->u.shift_op.expect_fault_p;
> +			if (!access_ok(VERIFY_WRITE, op->u.shift_op.p,
> +					op->len))
> +				goto error;
> +			ret = cpu_op_pin_pages(
> +					(unsigned long)op->u.shift_op.p,
> +					op->len, pinned_pages_ptr, nr_pinned, 1);
> +			if (ret)
> +				goto error;
> +			break;
> +		case CPU_MB_OP:
> +			break;
> +		default:
> +			return -EINVAL;
> +		}
> +	}
> +	return 0;
> +
> +error:
> +	for (i = 0; i < *nr_pinned; i++)
> +		put_page((*pinned_pages_ptr)[i]);
> +	*nr_pinned = 0;
> +	/*
> +	 * If faulting access is expected, return EAGAIN to user-space.
> +	 * It allows user-space to distinguish between a fault caused by
> +	 * an access which is expected to fault (e.g. due to concurrent
> +	 * unmapping of underlying memory) from an unexpected fault from
> +	 * which a retry would not recover.
> +	 */
> +	if (ret == -EFAULT && expect_fault)
> +		return -EAGAIN;
> +	return ret;
> +}
[...]
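As a side note, the EAGAIN semantics described in that last comment
suggest user-space would wrap the call in a simple retry loop, along
these lines (a hypothetical sketch: cpu_opv() stands for a raw
syscall(2) wrapper around the proposed system call, which is not wired
up anywhere yet, and <errno.h> is assumed):

	static int do_cpu_opv(struct cpu_op *ops, int cpuopcnt, int cpu)
	{
		int ret;

		do {
			/*
			 * EAGAIN: an access marked expect_fault_*
			 * faulted (e.g. the mapping was concurrently
			 * unmapped), so the whole vector is simply
			 * retried.  Any other failure would not be
			 * fixed by a retry.
			 */
			ret = cpu_opv(ops, cpuopcnt, cpu, 0);
		} while (ret < 0 && errno == EAGAIN);
		return ret;
	}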