* [PATCH 0/2] sched: Introduce rcuwait
From: Davidlohr Bueso @ 2016-12-22 17:01 UTC
To: mingo, peterz, oleg; +Cc: linux-kernel, dave

Hi,

Here's an updated version of the pcpu rwsem writer wait/wake changes
with the abstractions wanted by Oleg. Patch 1 adds rcuwait (for lack
of a better name), and patch 2 trivially makes use of it.

Has survived torture testing, which is actually very handy in this
case, particularly when dealing with equal numbers of reader and writer
threads.

Thanks.

Davidlohr Bueso (2):
  sched: Introduce rcuwait machinery
  locking/percpu-rwsem: Replace waitqueue with rcuwait

 include/linux/percpu-rwsem.h  |  8 +++---
 include/linux/rcuwait.h       | 63 +++++++++++++++++++++++++++++++++++++++++++
 kernel/exit.c                 | 29 ++++++++++++++++++++
 kernel/locking/percpu-rwsem.c |  7 +++--
 4 files changed, 99 insertions(+), 8 deletions(-)
 create mode 100644 include/linux/rcuwait.h

-- 
2.6.6

* [PATCH 1/2] sched: Introduce rcuwait machinery
From: Davidlohr Bueso @ 2016-12-22 17:01 UTC
To: mingo, peterz, oleg; +Cc: linux-kernel, dave, Davidlohr Bueso

rcuwait provides support for (single) rcu-safe task wait/wake functionality,
with the caveat that it must not be called after exit_notify(), such that
we avoid racing with rcu delayed_put_task_struct callbacks, task_struct
being rcu unaware in this context -- for which we similarly have
task_rcu_dereference() magic, but with different return semantics, which
can conflict with the wakeup side.

The interfaces are quite straightforward:

  rcuwait_wait_event()
  rcuwait_trywake()

More details are in the comments, but it's perhaps worth mentioning at least,
that users must provide proper serialization when waiting on a condition, and
avoid corrupting a concurrent waiter. Also care must be taken between the task
and the condition for when calling the wakeup -- we cannot miss wakeups.

When porting users, this is for example, a given when using waitqueues in that
everything is done under the q->lock.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 include/linux/rcuwait.h | 63 +++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/exit.c           | 29 +++++++++++++++++++++++
 2 files changed, 92 insertions(+)
 create mode 100644 include/linux/rcuwait.h

diff --git a/include/linux/rcuwait.h b/include/linux/rcuwait.h
new file mode 100644
index 000000000000..3e07beb14c1f
--- /dev/null
+++ b/include/linux/rcuwait.h
@@ -0,0 +1,63 @@
+#ifndef _LINUX_RCUWAIT_H_
+#define _LINUX_RCUWAIT_H_
+
+#include <linux/rcupdate.h>
+
+/*
+ * rcuwait provides a way of blocking and waking up a single
+ * task in an rcu-safe manner; where it is forbidden to use
+ * after exit_notify(). task_struct is not properly rcu protected,
+ * unless dealing with rcu-aware lists, ie: find_task_by_*().
+ *
+ * Alternatively we have task_rcu_dereference(), but the return
+ * semantics have different implications which would break the
+ * wakeup side. The only time @task is non-nil is when a user is
+ * blocked (or checking if it needs to) on a condition, and reset
+ * as soon as we know that the condition has succeeded and are
+ * awoken.
+ */
+struct rcuwait {
+	struct task_struct *task;
+};
+
+#define __RCUWAIT_INITIALIZER(name)		\
+	{ .task = NULL, }
+
+static inline void rcuwait_init(struct rcuwait *w)
+{
+	w->task = NULL;
+}
+
+extern void rcuwait_trywake(struct rcuwait *w);
+
+/*
+ * The caller is responsible for locking around rcuwait_wait_event(),
+ * such that writes to @task are properly serialized.
+ */
+#define rcuwait_wait_event(w, condition)				\
+({									\
+	/*								\
+	 * Complain if we are called after do_exit()/exit_notify(),	\
+	 * as we cannot rely on the rcu critical region for the		\
+	 * wakeup side.							\
+	 */								\
+	WARN_ON(current->exit_state);					\
+									\
+	rcu_assign_pointer((w)->task, current);				\
+	for (;;) {							\
+		/*							\
+		 * Implicit barrier (A) pairs with (B) in		\
+		 * rcuwait_trywake().					\
+		 */							\
+		set_current_state(TASK_UNINTERRUPTIBLE);		\
+		if (condition)						\
+			break;						\
+									\
+		schedule();						\
+	}								\
+									\
+	WRITE_ONCE((w)->task, NULL);					\
+	__set_current_state(TASK_RUNNING);				\
+})
+
+#endif /* _LINUX_RCUWAIT_H_ */
diff --git a/kernel/exit.c b/kernel/exit.c
index aacff8e2aec0..6862884179a8 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -282,6 +282,35 @@ struct task_struct *task_rcu_dereference(struct task_struct **ptask)
 	return task;
 }
 
+void rcuwait_trywake(struct rcuwait *w)
+{
+	struct task_struct *task;
+
+	rcu_read_lock();
+
+	/*
+	 * Order condition vs @task, such that everything prior to the load
+	 * of @task is visible. This is the condition as to why the user called
+	 * rcuwait_trywake() in the first place. Pairs with set_current_state()
+	 * barrier (A) in rcuwait_wait_event().
+	 *
+	 *    WAIT                WAKE
+	 *    [S] tsk = current   [S] cond = true
+	 *        MB (A)              MB (B)
+	 *    [L] cond            [L] tsk
+	 */
+	smp_rmb(); /* (B) */
+
+	/*
+	 * Avoid using task_rcu_dereference() magic as long as we are careful,
+	 * see comment in rcuwait_wait_event() regarding ->exit_state.
+	 */
+	task = rcu_dereference(w->task);
+	if (task)
+		wake_up_process(task);
+	rcu_read_unlock();
+}
+
 struct task_struct *try_get_task_struct(struct task_struct **ptask)
 {
 	struct task_struct *task;
-- 
2.6.6

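The WAIT/WAKE ordering diagram in the patch can be tried out in userspace. What follows is only a sketch under stated assumptions, not the kernel implementation: `struct parker`, `struct rcuwait_sketch`, `sketch_wait_event()` and `sketch_wake_up()` are hypothetical names, C11 sequentially consistent atomics stand in for barriers (A) and (B) exactly as the diagram labels them (a full barrier on both sides), a token counter with `sched_yield()` stands in for `schedule()`/`wake_up_process()`, and the parker object is assumed to outlive every waker, which is the lifetime guarantee the patch gets from RCU plus the exit_notify() restriction.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct: a long-lived per-thread object. */
struct parker {
	atomic_int tokens;	/* pending wakeups left by wakers */
};

/* Mirrors struct rcuwait: at most one waiter is published at a time. */
struct rcuwait_sketch {
	_Atomic(struct parker *) task;
};

/* Waiter side; callers must serialize among themselves, as in the patch. */
static void sketch_wait_event(struct rcuwait_sketch *w, struct parker *self,
			      atomic_bool *cond)
{
	assert(self != NULL);

	/* [S] tsk = current; the seq_cst store plays the role of (A) */
	atomic_store(&w->task, self);
	for (;;) {
		if (atomic_load(cond))		/* [L] cond */
			break;
		/* stand-in for schedule(): wait until a waker leaves a token */
		while (atomic_load(&self->tokens) == 0)
			sched_yield();
		atomic_fetch_sub(&self->tokens, 1);
	}
	atomic_store(&w->task, NULL);
}

/* Waker side; the caller has already done "[S] cond = true". */
static void sketch_wake_up(struct rcuwait_sketch *w)
{
	struct parker *p;

	/* (B): order the earlier store to cond before the load of task */
	atomic_thread_fence(memory_order_seq_cst);

	p = atomic_load(&w->task);		/* [L] tsk */
	if (p)
		atomic_fetch_add(&p->tokens, 1);	/* wake_up_process() stand-in */
}
```

With both sides fully ordered, either the waiter observes the condition or the waker observes the published waiter, so a wakeup cannot be lost. A stale token only costs the waiter one extra recheck of the condition.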
* Re: [PATCH 1/2] sched: Introduce rcuwait machinery
From: kbuild test robot @ 2016-12-22 19:27 UTC
To: Davidlohr Bueso
Cc: kbuild-all, mingo, peterz, oleg, linux-kernel, dave, Davidlohr Bueso

[-- Attachment #1: Type: text/plain, Size: 3251 bytes --]

Hi Davidlohr,

[auto build test ERROR on tip/auto-latest]
[also build test ERROR on v4.9 next-20161222]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Davidlohr-Bueso/sched-Introduce-rcuwait/20161223-020109
config: i386-randconfig-s1-201651 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386

Note: the linux-review/Davidlohr-Bueso/sched-Introduce-rcuwait/20161223-020109 HEAD 9e9d238f94d5aa8e348e7e70585533fe0dbd373b builds fine.
      It only hurts bisectibility.

All error/warnings (new ones prefixed by >>):

>> kernel/exit.c:285:29: warning: 'struct rcuwait' declared inside parameter list will not be visible outside of this definition or declaration
    void rcuwait_trywake(struct rcuwait *w)
                                ^~~~~~~
   In file included from include/linux/srcu.h:33:0,
                    from include/linux/notifier.h:15,
                    from include/linux/memory_hotplug.h:6,
                    from include/linux/mmzone.h:751,
                    from include/linux/gfp.h:5,
                    from include/linux/mm.h:9,
                    from kernel/exit.c:7:
   kernel/exit.c: In function 'rcuwait_trywake':
>> kernel/exit.c:308:26: error: dereferencing pointer to incomplete type 'struct rcuwait'
     task = rcu_dereference(w->task);
                             ^
   include/linux/rcupdate.h:606:10: note: in definition of macro '__rcu_dereference_check'
     typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p); \
             ^
   include/linux/rcupdate.h:786:28: note: in expansion of macro 'rcu_dereference_check'
    #define rcu_dereference(p) rcu_dereference_check(p, 0)
                               ^~~~~~~~~~~~~~~~~~~~~
>> kernel/exit.c:308:9: note: in expansion of macro 'rcu_dereference'
     task = rcu_dereference(w->task);
            ^~~~~~~~~~~~~~~

vim +308 kernel/exit.c

   279		if (!sighand)
   280			return NULL;
   281	
   282		return task;
   283	}
   284	
 > 285	void rcuwait_trywake(struct rcuwait *w)
   286	{
   287		struct task_struct *task;
   288	
   289		rcu_read_lock();
   290	
   291		/*
   292		 * Order condition vs @task, such that everything prior to the load
   293		 * of @task is visible. This is the condition as to why the user called
   294		 * rcuwait_trywake() in the first place. Pairs with set_current_state()
   295		 * barrier (A) in rcuwait_wait_event().
   296		 *
   297		 *    WAIT                WAKE
   298		 *    [S] tsk = current   [S] cond = true
   299		 *        MB (A)              MB (B)
   300		 *    [L] cond            [L] tsk
   301		 */
   302		smp_rmb(); /* (B) */
   303	
   304		/*
   305		 * Avoid using task_rcu_dereference() magic as long as we are careful,
   306		 * see comment in rcuwait_wait_event() regarding ->exit_state.
   307		 */
 > 308		task = rcu_dereference(w->task);
   309		if (task)
   310			wake_up_process(task);
   311		rcu_read_unlock();

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 33607 bytes --]

* Re: [PATCH 1/2] sched: Introduce rcuwait machinery
From: Davidlohr Bueso @ 2017-01-03 23:20 UTC
To: kbuild test robot
Cc: kbuild-all, mingo, peterz, oleg, linux-kernel, Davidlohr Bueso

On Fri, 23 Dec 2016, kbuild test robot wrote:

>>> kernel/exit.c:285:29: warning: 'struct rcuwait' declared inside parameter list will not be visible outside of this definition or declaration
>    void rcuwait_trywake(struct rcuwait *w)
>                                ^~~~~~~

Ah, I'm missing a linux/rcuwait.h include there. Here's v2, thanks.

-----8<--------------------------------------------
From: Davidlohr Bueso <dave@stgolabs.net>
Subject: [PATCH v2 1/2] sched: Introduce rcuwait machinery

rcuwait provides support for (single) rcu-safe task wait/wake functionality,
with the caveat that it must not be called after exit_notify(), such that
we avoid racing with rcu delayed_put_task_struct callbacks, task_struct
being rcu unaware in this context -- for which we similarly have
task_rcu_dereference() magic, but with different return semantics, which
can conflict with the wakeup side.

The interfaces are quite straightforward:

  rcuwait_wait_event()
  rcuwait_trywake()

More details are in the comments, but it's perhaps worth mentioning at least,
that users must provide proper serialization when waiting on a condition, and
avoid corrupting a concurrent waiter. Also care must be taken between the task
and the condition for when calling the wakeup -- we cannot miss wakeups.

When porting users, this is for example, a given when using waitqueues in that
everything is done under the q->lock.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 include/linux/rcuwait.h | 63 +++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/exit.c           | 30 +++++++++++++++++++++++
 2 files changed, 93 insertions(+)
 create mode 100644 include/linux/rcuwait.h

diff --git a/include/linux/rcuwait.h b/include/linux/rcuwait.h
new file mode 100644
index 000000000000..3e07beb14c1f
--- /dev/null
+++ b/include/linux/rcuwait.h
@@ -0,0 +1,63 @@
+#ifndef _LINUX_RCUWAIT_H_
+#define _LINUX_RCUWAIT_H_
+
+#include <linux/rcupdate.h>
+
+/*
+ * rcuwait provides a way of blocking and waking up a single
+ * task in an rcu-safe manner; where it is forbidden to use
+ * after exit_notify(). task_struct is not properly rcu protected,
+ * unless dealing with rcu-aware lists, ie: find_task_by_*().
+ *
+ * Alternatively we have task_rcu_dereference(), but the return
+ * semantics have different implications which would break the
+ * wakeup side. The only time @task is non-nil is when a user is
+ * blocked (or checking if it needs to) on a condition, and reset
+ * as soon as we know that the condition has succeeded and are
+ * awoken.
+ */
+struct rcuwait {
+	struct task_struct *task;
+};
+
+#define __RCUWAIT_INITIALIZER(name)		\
+	{ .task = NULL, }
+
+static inline void rcuwait_init(struct rcuwait *w)
+{
+	w->task = NULL;
+}
+
+extern void rcuwait_trywake(struct rcuwait *w);
+
+/*
+ * The caller is responsible for locking around rcuwait_wait_event(),
+ * such that writes to @task are properly serialized.
+ */
+#define rcuwait_wait_event(w, condition)				\
+({									\
+	/*								\
+	 * Complain if we are called after do_exit()/exit_notify(),	\
+	 * as we cannot rely on the rcu critical region for the		\
+	 * wakeup side.							\
+	 */								\
+	WARN_ON(current->exit_state);					\
+									\
+	rcu_assign_pointer((w)->task, current);				\
+	for (;;) {							\
+		/*							\
+		 * Implicit barrier (A) pairs with (B) in		\
+		 * rcuwait_trywake().					\
+		 */							\
+		set_current_state(TASK_UNINTERRUPTIBLE);		\
+		if (condition)						\
+			break;						\
+									\
+		schedule();						\
+	}								\
+									\
+	WRITE_ONCE((w)->task, NULL);					\
+	__set_current_state(TASK_RUNNING);				\
+})
+
+#endif /* _LINUX_RCUWAIT_H_ */
diff --git a/kernel/exit.c b/kernel/exit.c
index 8f14b866f9f6..e579b30a35a7 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -55,6 +55,7 @@
 #include <linux/shm.h>
 #include <linux/kcov.h>
 #include <linux/random.h>
+#include <linux/rcuwait.h>
 
 #include <linux/uaccess.h>
 #include <asm/unistd.h>
@@ -282,6 +283,35 @@ struct task_struct *task_rcu_dereference(struct task_struct **ptask)
 	return task;
 }
 
+void rcuwait_trywake(struct rcuwait *w)
+{
+	struct task_struct *task;
+
+	rcu_read_lock();
+
+	/*
+	 * Order condition vs @task, such that everything prior to the load
+	 * of @task is visible. This is the condition as to why the user called
+	 * rcuwait_trywake() in the first place. Pairs with set_current_state()
+	 * barrier (A) in rcuwait_wait_event().
+	 *
+	 *    WAIT                WAKE
+	 *    [S] tsk = current   [S] cond = true
+	 *        MB (A)              MB (B)
+	 *    [L] cond            [L] tsk
+	 */
+	smp_rmb(); /* (B) */
+
+	/*
+	 * Avoid using task_rcu_dereference() magic as long as we are careful,
+	 * see comment in rcuwait_wait_event() regarding ->exit_state.
+	 */
+	task = rcu_dereference(w->task);
+	if (task)
+		wake_up_process(task);
+	rcu_read_unlock();
+}
+
 struct task_struct *try_get_task_struct(struct task_struct **ptask)
 {
 	struct task_struct *task;
-- 
2.6.6

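The build failure reported against v1 is plain C scoping: without the header, `struct rcuwait` in the parameter list of rcuwait_trywake() declares a fresh incomplete type, so `w->task` cannot be compiled. The sketch below shows the working arrangement outside the kernel. It is only an illustration; `struct task_stub`, `struct rcuwait_stub` and `stub_trywake()` are hypothetical names, and the struct definition placed ahead of the function plays the role of the added `#include <linux/rcuwait.h>`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct. */
struct task_stub {
	int woken;
};

/*
 * This definition plays the role of the missing include: it has to be
 * visible before any function dereferences a struct rcuwait_stub pointer.
 * If it were absent, "struct rcuwait_stub" in a parameter list would only
 * declare an incomplete type scoped to that declaration, and w->task
 * would be rejected by the compiler.
 */
struct rcuwait_stub {
	struct task_stub *task;
};

/* Returns 1 if a waiter was published and marked woken, 0 otherwise. */
static int stub_trywake(struct rcuwait_stub *w)
{
	assert(w != NULL);

	if (w->task) {
		w->task->woken = 1;
		return 1;
	}
	return 0;
}
```

Moving the `struct rcuwait_stub` definition below `stub_trywake()` reproduces both diagnostics from the report: the "declared inside parameter list" warning and the "incomplete type" error.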
* Re: [PATCH 1/2] sched: Introduce rcuwait machinery
From: kbuild test robot @ 2016-12-22 19:55 UTC
To: Davidlohr Bueso
Cc: kbuild-all, mingo, peterz, oleg, linux-kernel, dave, Davidlohr Bueso

[-- Attachment #1: Type: text/plain, Size: 11470 bytes --]

Hi Davidlohr,

[auto build test ERROR on tip/auto-latest]
[also build test ERROR on v4.9 next-20161222]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Davidlohr-Bueso/sched-Introduce-rcuwait/20161223-020109
config: m68k-sun3_defconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 4.9.0
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=m68k

Note: the linux-review/Davidlohr-Bueso/sched-Introduce-rcuwait/20161223-020109 HEAD 9e9d238f94d5aa8e348e7e70585533fe0dbd373b builds fine.
      It only hurts bisectibility.

All error/warnings (new ones prefixed by >>):

>> kernel/exit.c:285:29: warning: 'struct rcuwait' declared inside parameter list
    void rcuwait_trywake(struct rcuwait *w)
                                ^
>> kernel/exit.c:285:29: warning: its scope is only this definition or declaration, which is probably not what you want
   In file included from include/linux/srcu.h:33:0,
                    from include/linux/notifier.h:15,
                    from include/linux/memory_hotplug.h:6,
                    from include/linux/mmzone.h:751,
                    from include/linux/gfp.h:5,
                    from include/linux/mm.h:9,
                    from kernel/exit.c:7:
   kernel/exit.c: In function 'rcuwait_trywake':
>> kernel/exit.c:308:26: error: dereferencing pointer to incomplete type
     task = rcu_dereference(w->task);
                             ^
   include/linux/rcupdate.h:606:10: note: in definition of macro '__rcu_dereference_check'
     typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p); \
             ^
   include/linux/rcupdate.h:786:28: note: in expansion of macro 'rcu_dereference_check'
    #define rcu_dereference(p) rcu_dereference_check(p, 0)
                               ^
   kernel/exit.c:308:9: note: in expansion of macro 'rcu_dereference'
     task = rcu_dereference(w->task);
            ^
>> kernel/exit.c:308:26: error: dereferencing pointer to incomplete type
     task = rcu_dereference(w->task);
                             ^
   include/linux/rcupdate.h:606:36: note: in definition of macro '__rcu_dereference_check'
     typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p); \
                                       ^
   include/linux/rcupdate.h:786:28: note: in expansion of macro 'rcu_dereference_check'
    #define rcu_dereference(p) rcu_dereference_check(p, 0)
                               ^
   kernel/exit.c:308:9: note: in expansion of macro 'rcu_dereference'
     task = rcu_dereference(w->task);
            ^
   In file included from include/asm-generic/bug.h:4:0,
                    from arch/m68k/include/asm/bug.h:28,
                    from include/linux/bug.h:4,
                    from include/linux/mmdebug.h:4,
                    from include/linux/mm.h:8,
                    from kernel/exit.c:7:
>> kernel/exit.c:308:26: error: dereferencing pointer to incomplete type
     task = rcu_dereference(w->task);
                             ^
   include/linux/compiler.h:563:9: note: in definition of macro 'lockless_dereference'
     typeof(p) _________p1 = READ_ONCE(p); \
            ^
   include/linux/rcupdate.h:727:2: note: in expansion of macro '__rcu_dereference_check'
     __rcu_dereference_check((p), (c) || rcu_read_lock_held(), __rcu)
     ^
   include/linux/rcupdate.h:786:28: note: in expansion of macro 'rcu_dereference_check'
    #define rcu_dereference(p) rcu_dereference_check(p, 0)
                               ^
   kernel/exit.c:308:9: note: in expansion of macro 'rcu_dereference'
     task = rcu_dereference(w->task);
            ^
   In file included from include/asm-generic/bug.h:4:0,
                    from arch/m68k/include/asm/bug.h:28,
                    from include/linux/bug.h:4,
                    from include/linux/mmdebug.h:4,
                    from include/linux/mm.h:8,
                    from kernel/exit.c:7:
>> kernel/exit.c:308:26: error: dereferencing pointer to incomplete type
     task = rcu_dereference(w->task);
                             ^
   include/linux/compiler.h:305:17: note: in definition of macro '__READ_ONCE'
     union { typeof(x) __val; char __c[1]; } __u; \
                    ^
>> include/linux/compiler.h:563:26: note: in expansion of macro 'READ_ONCE'
     typeof(p) _________p1 = READ_ONCE(p); \
                             ^
>> include/linux/rcupdate.h:606:48: note: in expansion of macro 'lockless_dereference'
     typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p); \
                                                   ^
   include/linux/rcupdate.h:727:2: note: in expansion of macro '__rcu_dereference_check'
     __rcu_dereference_check((p), (c) || rcu_read_lock_held(), __rcu)
     ^
   include/linux/rcupdate.h:786:28: note: in expansion of macro 'rcu_dereference_check'
    #define rcu_dereference(p) rcu_dereference_check(p, 0)
                               ^
   kernel/exit.c:308:9: note: in expansion of macro 'rcu_dereference'
     task = rcu_dereference(w->task);
            ^
>> kernel/exit.c:308:26: error: dereferencing pointer to incomplete type
     task = rcu_dereference(w->task);
                             ^
   include/linux/compiler.h:307:22: note: in definition of macro '__READ_ONCE'
      __read_once_size(&(x), __u.__c, sizeof(x)); \
                         ^
>> include/linux/compiler.h:563:26: note: in expansion of macro 'READ_ONCE'
     typeof(p) _________p1 = READ_ONCE(p); \
                             ^
>> include/linux/rcupdate.h:606:48: note: in expansion of macro 'lockless_dereference'
     typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p); \
                                                   ^
   include/linux/rcupdate.h:727:2: note: in expansion of macro '__rcu_dereference_check'
     __rcu_dereference_check((p), (c) || rcu_read_lock_held(), __rcu)
     ^
   include/linux/rcupdate.h:786:28: note: in expansion of macro 'rcu_dereference_check'
    #define rcu_dereference(p) rcu_dereference_check(p, 0)
                               ^
   kernel/exit.c:308:9: note: in expansion of macro 'rcu_dereference'
     task = rcu_dereference(w->task);
            ^
>> kernel/exit.c:308:26: error: dereferencing pointer to incomplete type
     task = rcu_dereference(w->task);
                             ^
   include/linux/compiler.h:307:42: note: in definition of macro '__READ_ONCE'
      __read_once_size(&(x), __u.__c, sizeof(x)); \
                                             ^
>> include/linux/compiler.h:563:26: note: in expansion of macro 'READ_ONCE'
     typeof(p) _________p1 = READ_ONCE(p); \
                             ^
>> include/linux/rcupdate.h:606:48: note: in expansion of macro 'lockless_dereference'
     typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p); \
                                                   ^
   include/linux/rcupdate.h:727:2: note: in expansion of macro '__rcu_dereference_check'
     __rcu_dereference_check((p), (c) || rcu_read_lock_held(), __rcu)
     ^
   include/linux/rcupdate.h:786:28: note: in expansion of macro 'rcu_dereference_check'
    #define rcu_dereference(p) rcu_dereference_check(p, 0)
                               ^
   kernel/exit.c:308:9: note: in expansion of macro 'rcu_dereference'
     task = rcu_dereference(w->task);
            ^
>> kernel/exit.c:308:26: error: dereferencing pointer to incomplete type
     task = rcu_dereference(w->task);
                             ^
   include/linux/compiler.h:309:30: note: in definition of macro '__READ_ONCE'
      __read_once_size_nocheck(&(x), __u.__c, sizeof(x)); \
                                 ^
>> include/linux/compiler.h:563:26: note: in expansion of macro 'READ_ONCE'
     typeof(p) _________p1 = READ_ONCE(p); \
                             ^
>> include/linux/rcupdate.h:606:48: note: in expansion of macro 'lockless_dereference'
     typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p); \
                                                   ^
   include/linux/rcupdate.h:727:2: note: in expansion of macro '__rcu_dereference_check'
     __rcu_dereference_check((p), (c) || rcu_read_lock_held(), __rcu)
     ^
   include/linux/rcupdate.h:786:28: note: in expansion of macro 'rcu_dereference_check'
    #define rcu_dereference(p) rcu_dereference_check(p, 0)
                               ^
   kernel/exit.c:308:9: note: in expansion of macro 'rcu_dereference'
     task = rcu_dereference(w->task);
            ^
>> kernel/exit.c:308:26: error: dereferencing pointer to incomplete type
     task = rcu_dereference(w->task);
                             ^
   include/linux/compiler.h:309:50: note: in definition of macro '__READ_ONCE'
      __read_once_size_nocheck(&(x), __u.__c, sizeof(x)); \
                                                     ^
>> include/linux/compiler.h:563:26: note: in expansion of macro 'READ_ONCE'
     typeof(p) _________p1 = READ_ONCE(p); \
                             ^
>> include/linux/rcupdate.h:606:48: note: in expansion of macro 'lockless_dereference'
     typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p); \
                                                   ^
   include/linux/rcupdate.h:727:2: note: in expansion of macro '__rcu_dereference_check'
     __rcu_dereference_check((p), (c) || rcu_read_lock_held(), __rcu)
     ^
   include/linux/rcupdate.h:786:28: note: in expansion of macro 'rcu_dereference_check'
    #define rcu_dereference(p) rcu_dereference_check(p, 0)
                               ^
   kernel/exit.c:308:9: note: in expansion of macro 'rcu_dereference'
     task = rcu_dereference(w->task);
            ^
   In file included from include/asm-generic/bug.h:4:0,
                    from arch/m68k/include/asm/bug.h:28,
                    from include/linux/bug.h:4,
                    from include/linux/mmdebug.h:4,
                    from include/linux/mm.h:8,
                    from kernel/exit.c:7:

vim +308 kernel/exit.c

   279		if (!sighand)
   280			return NULL;
   281	
   282		return task;
   283	}
   284	
 > 285	void rcuwait_trywake(struct rcuwait *w)
   286	{
   287		struct task_struct *task;
   288	
   289		rcu_read_lock();
   290	
   291		/*
   292		 * Order condition vs @task, such that everything prior to the load
   293		 * of @task is visible. This is the condition as to why the user called
   294		 * rcuwait_trywake() in the first place. Pairs with set_current_state()
   295		 * barrier (A) in rcuwait_wait_event().
   296		 *
   297		 *    WAIT                WAKE
   298		 *    [S] tsk = current   [S] cond = true
   299		 *        MB (A)              MB (B)
   300		 *    [L] cond            [L] tsk
   301		 */
   302		smp_rmb(); /* (B) */
   303	
   304		/*
   305		 * Avoid using task_rcu_dereference() magic as long as we are careful,
   306		 * see comment in rcuwait_wait_event() regarding ->exit_state.
   307		 */
 > 308		task = rcu_dereference(w->task);
   309		if (task)
   310			wake_up_process(task);
   311		rcu_read_unlock();

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 11676 bytes --]

* Re: [PATCH 1/2] sched: Introduce rcuwait machinery
From: Davidlohr Bueso @ 2017-01-16 1:32 UTC
To: mingo, peterz, oleg; +Cc: linux-kernel, Davidlohr Bueso

On Thu, 22 Dec 2016, Bueso wrote:

>+	WARN_ON(current->exit_state);					\

While not related to this patch, per 3245d6acab9 (exit: fix race between
wait_consider_task() and wait_task_zombie()), should we not *_ONCE() all
things ->exit_state? I'm not really referring to a specific bug (much less
here, where that race would not matter obviously), but if nothing else,
for documentation -- and I doubt it would make any difference performance
wise.

Thanks,
Davidlohr

* Re: [PATCH 1/2] sched: Introduce rcuwait machinery
From: Oleg Nesterov @ 2017-01-17 17:41 UTC
To: Davidlohr Bueso; +Cc: mingo, peterz, linux-kernel, Davidlohr Bueso

On 01/15, Davidlohr Bueso wrote:
>
> On Thu, 22 Dec 2016, Bueso wrote:
>
>> +	WARN_ON(current->exit_state);					\
>
> While not related to this patch, per 3245d6acab9 (exit: fix race
> between wait_consider_task() and wait_task_zombie()), should we not
> *_ONCE() all things ->exit_state?

current->exit_state != 0 is stable. I mean, only current can change it
from zero to non-zero, and once it is non-zero it can't be zero again.

> I'm not really referring to a specific
> bug (much less here, where that race would not matter obviously), but
> if nothing else, for documentation

Oh, I won't argue but I do not agree. To me, READ_ONCE() often adds some
confusion because I can almost never understand if it is actually needed
for correctness or it was added "just in case".

Oleg.

* [PATCH 2/2] locking/percpu-rwsem: Replace waitqueue with rcuwait
From: Davidlohr Bueso @ 2016-12-22 17:01 UTC
To: mingo, peterz, oleg; +Cc: linux-kernel, dave, Davidlohr Bueso

The use of any kind of wait queue is an overkill for pcpu-rwsems.
While one option would be to use the less heavy simple (swait)
flavor, this is still too much for what pcpu-rwsems needs. For one,
we do not care about any sort of queuing in that the only (rare) time
writers (and readers, for that matter) are queued is when trying to
acquire the regular contended rw_sem. There cannot be any further
queuing as writers are serialized by the rw_sem in the first place.

Given that percpu_down_write() must not be called after exit_notify(),
we can replace the bulky waitqueue with rcuwait such that a writer
can wait for its turn to take the lock. As such, we can avoid the
queue handling and locking overhead.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 include/linux/percpu-rwsem.h  | 8 ++++----
 kernel/locking/percpu-rwsem.c | 7 +++----
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h
index 5b2e6159b744..93664f022ecf 100644
--- a/include/linux/percpu-rwsem.h
+++ b/include/linux/percpu-rwsem.h
@@ -4,15 +4,15 @@
 #include <linux/atomic.h>
 #include <linux/rwsem.h>
 #include <linux/percpu.h>
-#include <linux/wait.h>
+#include <linux/rcuwait.h>
 #include <linux/rcu_sync.h>
 #include <linux/lockdep.h>
 
 struct percpu_rw_semaphore {
 	struct rcu_sync		rss;
 	unsigned int __percpu	*read_count;
-	struct rw_semaphore	rw_sem;
-	wait_queue_head_t	writer;
+	struct rw_semaphore	rw_sem; /* slowpath */
+	struct rcuwait		writer; /* blocked writer */
 	int			readers_block;
 };
 
@@ -22,7 +22,7 @@ static struct percpu_rw_semaphore name = {				\
 	.rss = __RCU_SYNC_INITIALIZER(name.rss, RCU_SCHED_SYNC),	\
 	.read_count = &__percpu_rwsem_rc_##name,			\
 	.rw_sem = __RWSEM_INITIALIZER(name.rw_sem),			\
-	.writer = __WAIT_QUEUE_HEAD_INITIALIZER(name.writer),		\
+	.writer = __RCUWAIT_INITIALIZER(name.writer),			\
 }
 
 extern int __percpu_down_read(struct percpu_rw_semaphore *, int);
diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
index ce182599cf2e..e2502d6ec82f 100644
--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -1,7 +1,6 @@
 #include <linux/atomic.h>
 #include <linux/rwsem.h>
 #include <linux/percpu.h>
-#include <linux/wait.h>
 #include <linux/lockdep.h>
 #include <linux/percpu-rwsem.h>
 #include <linux/rcupdate.h>
@@ -18,7 +17,7 @@ int __percpu_init_rwsem(struct percpu_rw_semaphore *sem,
 	/* ->rw_sem represents the whole percpu_rw_semaphore for lockdep */
 	rcu_sync_init(&sem->rss, RCU_SCHED_SYNC);
 	__init_rwsem(&sem->rw_sem, name, rwsem_key);
-	init_waitqueue_head(&sem->writer);
+	rcuwait_init(&sem->writer);
 	sem->readers_block = 0;
 	return 0;
 }
@@ -103,7 +102,7 @@ void __percpu_up_read(struct percpu_rw_semaphore *sem)
 	__this_cpu_dec(*sem->read_count);
 
 	/* Prod writer to recheck readers_active */
-	wake_up(&sem->writer);
+	rcuwait_trywake(&sem->writer);
 }
 EXPORT_SYMBOL_GPL(__percpu_up_read);
 
@@ -160,7 +159,7 @@ void percpu_down_write(struct percpu_rw_semaphore *sem)
 	 */
 
 	/* Wait for all now active readers to complete. */
-	wait_event(sem->writer, readers_active_check(sem));
+	rcuwait_wait_event(&sem->writer, readers_active_check(sem));
 }
 EXPORT_SYMBOL_GPL(percpu_down_write);
 
-- 
2.6.6

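The condition the single writer waits on, readers_active_check(), is not part of this diff. As a rough userspace illustration of what such a condition looks like, the sketch below assumes it sums the per-CPU read counts and reports true once the sum reaches zero. All names (`NR_SLOTS`, `struct pcpu_count_sketch`, the `sketch_*` functions) are hypothetical, a fixed array stands in for per-CPU data, and the sketch further assumes a reader may leave on a different slot than it entered on, so only the sum is meaningful.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_SLOTS 4	/* hypothetical stand-in for the number of CPUs */

struct pcpu_count_sketch {
	atomic_int read_count[NR_SLOTS];	/* one counter per "CPU" */
};

/* Reader entry: bump the counter of the slot the reader runs on. */
static void sketch_read_enter(struct pcpu_count_sketch *s, int slot)
{
	assert(slot >= 0 && slot < NR_SLOTS);
	atomic_fetch_add(&s->read_count[slot], 1);
}

/*
 * Reader exit: drop the counter of the current slot. In the patch this is
 * the point where __percpu_up_read() calls rcuwait_trywake(&sem->writer).
 */
static void sketch_read_exit(struct pcpu_count_sketch *s, int slot)
{
	assert(slot >= 0 && slot < NR_SLOTS);
	atomic_fetch_sub(&s->read_count[slot], 1);
}

/* True when no reader is active; individual slots may be non-zero. */
static bool sketch_readers_active_check(struct pcpu_count_sketch *s)
{
	int sum = 0;

	for (int i = 0; i < NR_SLOTS; i++)
		sum += atomic_load(&s->read_count[i]);
	return sum == 0;
}
```

Because at most one writer can be waiting on this condition at a time (writers are serialized by rw_sem), a single published task pointer is enough, which is the point of replacing the waitqueue.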
* Re: [PATCH 0/2] sched: Introduce rcuwait
From: Davidlohr Bueso @ 2017-01-09 18:26 UTC
To: mingo, peterz, oleg; +Cc: linux-kernel

Gents, any further thoughts on this?

Thanks,
Davidlohr

* Re: [PATCH 0/2] sched: Introduce rcuwait
From: Oleg Nesterov @ 2017-01-10 18:35 UTC
To: Davidlohr Bueso; +Cc: mingo, peterz, linux-kernel

On 01/09, Davidlohr Bueso wrote:
>
> Gents, any further thoughts on this?

Both look correct to me, and I think this allows us to make more
optimizations in percpu-rwsem.c.

I am not sure about the naming... Yes, it relies on rcu but this is
just implementation detail. But this is cosmetic and I can't suggest
something better than rcuwait.

Well, speaking of naming, rcuwait_trywake() doesn't look good to me,
rcuwait_wake_up() looks better, "try" is misleading imo. But this is
cosmetic/subjective too.

Reviewed-by: Oleg Nesterov <oleg@redhat.com>

* Re: [PATCH 0/2] sched: Introduce rcuwait
From: Davidlohr Bueso @ 2017-01-10 19:37 UTC
To: Oleg Nesterov; +Cc: mingo, peterz, linux-kernel

On Tue, 10 Jan 2017, Oleg Nesterov wrote:

>Well, speaking of naming, rcuwait_trywake() doesn't look good to me,
>rcuwait_wake_up() looks better, "try" is misleading imo. But this is
>cosmetic/subjective too.

I actually added the 'try' on second thought -- in that for the particular
pcpu-rwsem user, obviously most of the time the wakeup will not actually
occur. But yeah, I'd have no problem just naming it rcuwait_wake_up().

>Reviewed-by: Oleg Nesterov <oleg@redhat.com>

Thanks!

* [PATCH v2 0/2] sched: Introduce rcuwait
From: Davidlohr Bueso @ 2017-01-11 15:22 UTC
To: mingo, peterz; +Cc: oleg, dave, linux-kernel

Changes from v1:
 - Renamed trywake to wake_up.
 - Added Oleg's review tags.

Hi,

Here's an updated version of the pcpu rwsem writer wait/wake changes
with the abstractions wanted by Oleg. Patch 1 adds rcuwait (for a lack
of better name), and patch 2 trivially makes use of it.

Has survived torture testing, which is actually very handy in this
case particularly dealing with equal amount of reader and writer
threads.

Applies on top of Linus' tree (4.10-rc3).

Thanks.

Davidlohr Bueso (2):
  sched: Introduce rcuwait machinery
  locking/percpu-rwsem: Replace waitqueue with rcuwait

 include/linux/percpu-rwsem.h  |  8 +++---
 include/linux/rcuwait.h       | 63 +++++++++++++++++++++++++++++++++++++++++++
 kernel/exit.c                 | 30 +++++++++++++++++++++
 kernel/locking/percpu-rwsem.c |  7 +++--
 4 files changed, 100 insertions(+), 8 deletions(-)
 create mode 100644 include/linux/rcuwait.h

--
2.6.6
* [PATCH 2/2] locking/percpu-rwsem: Replace waitqueue with rcuwait
From: Davidlohr Bueso @ 2017-01-11 15:22 UTC
To: mingo, peterz; +Cc: oleg, dave, linux-kernel, Davidlohr Bueso

The use of any kind of wait queue is an overkill for pcpu-rwsems.
While one option would be to use the less heavy simple (swait)
flavor, this is still too much for what pcpu-rwsems needs. For one,
we do not care about any sort of queuing in that the only (rare) time
writers (and readers, for that matter) are queued is when trying to
acquire the regular contended rw_sem. There cannot be any further
queuing as writers are serialized by the rw_sem in the first place.

Given that percpu_down_write() must not be called after exit_notify(),
we can replace the bulky waitqueue with rcuwait such that a writer
can wait for its turn to take the lock. As such, we can avoid the
queue handling and locking overhead.

Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 include/linux/percpu-rwsem.h  | 8 ++++----
 kernel/locking/percpu-rwsem.c | 7 +++----
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h
index 5b2e6159b744..93664f022ecf 100644
--- a/include/linux/percpu-rwsem.h
+++ b/include/linux/percpu-rwsem.h
@@ -4,15 +4,15 @@
 #include <linux/atomic.h>
 #include <linux/rwsem.h>
 #include <linux/percpu.h>
-#include <linux/wait.h>
+#include <linux/rcuwait.h>
 #include <linux/rcu_sync.h>
 #include <linux/lockdep.h>
 
 struct percpu_rw_semaphore {
 	struct rcu_sync		rss;
 	unsigned int __percpu	*read_count;
-	struct rw_semaphore	rw_sem;
-	wait_queue_head_t	writer;
+	struct rw_semaphore	rw_sem; /* slowpath */
+	struct rcuwait		writer; /* blocked writer */
 	int			readers_block;
 };
 
@@ -22,7 +22,7 @@ static struct percpu_rw_semaphore name = {				\
 	.rss = __RCU_SYNC_INITIALIZER(name.rss, RCU_SCHED_SYNC),	\
 	.read_count = &__percpu_rwsem_rc_##name,			\
 	.rw_sem = __RWSEM_INITIALIZER(name.rw_sem),			\
-	.writer = __WAIT_QUEUE_HEAD_INITIALIZER(name.writer),		\
+	.writer = __RCUWAIT_INITIALIZER(name.writer),			\
 }
 
 extern int __percpu_down_read(struct percpu_rw_semaphore *, int);
diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
index ce182599cf2e..883cf1b92d90 100644
--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -1,7 +1,6 @@
 #include <linux/atomic.h>
 #include <linux/rwsem.h>
 #include <linux/percpu.h>
-#include <linux/wait.h>
 #include <linux/lockdep.h>
 #include <linux/percpu-rwsem.h>
 #include <linux/rcupdate.h>
@@ -18,7 +17,7 @@ int __percpu_init_rwsem(struct percpu_rw_semaphore *sem,
 	/* ->rw_sem represents the whole percpu_rw_semaphore for lockdep */
 	rcu_sync_init(&sem->rss, RCU_SCHED_SYNC);
 	__init_rwsem(&sem->rw_sem, name, rwsem_key);
-	init_waitqueue_head(&sem->writer);
+	rcuwait_init(&sem->writer);
 	sem->readers_block = 0;
 	return 0;
 }
@@ -103,7 +102,7 @@ void __percpu_up_read(struct percpu_rw_semaphore *sem)
 	__this_cpu_dec(*sem->read_count);
 
 	/* Prod writer to recheck readers_active */
-	wake_up(&sem->writer);
+	rcuwait_wake_up(&sem->writer);
 }
 EXPORT_SYMBOL_GPL(__percpu_up_read);
 
@@ -160,7 +159,7 @@ void percpu_down_write(struct percpu_rw_semaphore *sem)
 	 */
 
 	/* Wait for all now active readers to complete. */
-	wait_event(sem->writer, readers_active_check(sem));
+	rcuwait_wait_event(&sem->writer, readers_active_check(sem));
 }
 EXPORT_SYMBOL_GPL(percpu_down_write);
 
--
2.6.6
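For readers without the rcuwait.h patch at hand, the single-waiter wait/wake
pattern that this patch switches to can be sketched in userspace. This is
only an illustrative analogue and not the kernel code: struct single_wait,
single_wait_event(), single_wait_wake_up() and run_demo() are hypothetical
names, a POSIX unnamed semaphore stands in for the sleeping task, and the
task_struct lifetime problem that rcuwait solves with RCU is sidestepped by
keeping the semaphore alive until every waker has been joined.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_READERS 4

/* Hypothetical userspace stand-in for struct rcuwait: one waiter slot. */
struct single_wait {
	_Atomic(sem_t *) waiter;	/* NULL when nobody is waiting */
};

static void single_wait_init(struct single_wait *w)
{
	atomic_store(&w->waiter, NULL);
}

/* Wake the registered waiter, if any; with no waiter this is a no-op. */
static void single_wait_wake_up(struct single_wait *w)
{
	sem_t *s = atomic_load(&w->waiter);

	if (s)
		sem_post(s);
}

/* Block until cond(arg) is true. The caller must be the only waiter. */
static void single_wait_event(struct single_wait *w, sem_t *self,
			      int (*cond)(void *), void *arg)
{
	/* Publish ourselves before checking, so a wakeup cannot be missed. */
	atomic_store(&w->waiter, self);
	while (!cond(arg))
		sem_wait(self);	/* stale or interrupted wakeups just recheck */
	atomic_store(&w->waiter, NULL);
}

static struct single_wait writer_wait;
static atomic_int active_readers;

static int readers_gone(void *unused)
{
	(void)unused;
	return atomic_load(&active_readers) == 0;
}

static void *reader(void *unused)
{
	(void)unused;
	atomic_fetch_sub(&active_readers, 1);	/* leave the read side */
	single_wait_wake_up(&writer_wait);	/* prod writer to recheck */
	return NULL;
}

/* Returns the reader count left after the wait (0 expected), -1 on error. */
static int run_demo(void)
{
	pthread_t tid[NR_READERS];
	sem_t self;
	int i, started = 0;

	single_wait_init(&writer_wait);
	atomic_store(&active_readers, NR_READERS);
	if (sem_init(&self, 0, 0) != 0)
		return -1;

	for (i = 0; i < NR_READERS; i++) {
		if (pthread_create(&tid[i], NULL, reader, NULL) != 0)
			break;
		started++;
	}
	/* Account for readers that failed to start so the wait terminates. */
	atomic_fetch_sub(&active_readers, NR_READERS - started);

	single_wait_event(&writer_wait, &self, readers_gone, NULL);
	assert(readers_gone(NULL));

	/* Join before destroying: a late waker may still post the semaphore. */
	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);
	sem_destroy(&self);
	return atomic_load(&active_readers);
}
```

Calling run_demo() from a small main() and building with -pthread should be
enough to try it. The point of the sketch is that the wake side is a single
load plus a conditional post, with no queue and no lock, which is roughly
the overhead argument the changelog makes against the waitqueue.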