* [PATCH 0/1] rcu/sync: simplify the state machine
From: Oleg Nesterov @ 2019-04-25 16:40 UTC
To: Paul E. McKenney, Peter Zijlstra
Cc: Joel Fernandes, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers, Steven Rostedt, linux-kernel

Let me finally try to close the gestalt ;)

This version doesn't add the new features yet, and it doesn't remove the
"must die" rcu_sync_enter_start(). But with this patch we are ready; I just
think that this should come as a separate change.

To simplify the review, see the most important parts of the code with the
patch applied below.

Oleg.

-------------------------------------------------------------------------------
struct rcu_sync {
	int			gp_state;
	int			gp_count;
	wait_queue_head_t	gp_wait;

	struct rcu_head		cb_head;
};

enum { GP_IDLE = 0, GP_ENTER, GP_PASSED, GP_EXIT, GP_REPLAY };

#define	rss_lock	gp_wait.lock

static void rcu_sync_call(struct rcu_sync *rsp)
{
	call_rcu(&rsp->cb_head, rcu_sync_func);
}

static void rcu_sync_func(struct rcu_head *rcu)
{
	struct rcu_sync *rsp = container_of(rcu, struct rcu_sync, cb_head);
	unsigned long flags;

	WARN_ON_ONCE(rsp->gp_state == GP_IDLE);
	WARN_ON_ONCE(rsp->gp_state == GP_PASSED);

	spin_lock_irqsave(&rsp->rss_lock, flags);
	if (rsp->gp_count) {
		/*
		 * We're at least a GP after the GP_IDLE->GP_ENTER transition.
		 */
		rsp->gp_state = GP_PASSED;
		wake_up_locked(&rsp->gp_wait);
	} else if (rsp->gp_state == GP_REPLAY) {
		/*
		 * A new rcu_sync_exit() has happened; requeue the callback to
		 * catch a later GP.
		 */
		rsp->gp_state = GP_EXIT;
		rcu_sync_call(rsp);
	} else {
		/*
		 * We're at least a GP after the last rcu_sync_exit(); everybody
		 * will now have observed the write side critical section.
		 * Let 'em rip!
		 */
		rsp->gp_state = GP_IDLE;
	}
	spin_unlock_irqrestore(&rsp->rss_lock, flags);
}

void rcu_sync_enter(struct rcu_sync *rsp)
{
	int gp_state;

	spin_lock_irq(&rsp->rss_lock);
	gp_state = rsp->gp_state;
	if (gp_state == GP_IDLE) {
		rsp->gp_state = GP_ENTER;
		WARN_ON_ONCE(rsp->gp_count);
		/*
		 * Note that we could simply do rcu_sync_call(rsp) here and
		 * avoid the "if (gp_state == GP_IDLE)" block below.
		 *
		 * However, synchronize_rcu() can be faster if rcu_expedited
		 * or rcu_blocking_is_gp() is true.
		 *
		 * Another reason is that we can't wait for the rcu callback
		 * if we are called at early boot time, but this shouldn't
		 * happen.
		 */
	}
	rsp->gp_count++;
	spin_unlock_irq(&rsp->rss_lock);

	if (gp_state == GP_IDLE) {
		/*
		 * See the comment above, this simply does the "synchronous"
		 * call_rcu(rcu_sync_func) which does GP_ENTER -> GP_PASSED.
		 */
		synchronize_rcu();
		rcu_sync_func(&rsp->cb_head);
		/* Not really needed, wait_event() would see GP_PASSED. */
		return;
	}

	wait_event(rsp->gp_wait, rsp->gp_state >= GP_PASSED);
}

void rcu_sync_exit(struct rcu_sync *rsp)
{
	WARN_ON_ONCE(rsp->gp_state == GP_IDLE);
	WARN_ON_ONCE(rsp->gp_count == 0);

	spin_lock_irq(&rsp->rss_lock);
	if (!--rsp->gp_count) {
		if (rsp->gp_state == GP_PASSED) {
			rsp->gp_state = GP_EXIT;
			rcu_sync_call(rsp);
		} else if (rsp->gp_state == GP_EXIT) {
			rsp->gp_state = GP_REPLAY;
		}
	}
	spin_unlock_irq(&rsp->rss_lock);
}
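For context, a minimal usage sketch of this API as a writer/reader pair, in
the shape percpu_rw_semaphore drives it; the my_rss variable and the two
wrapper functions below are illustrative, not part of the patch:

	/* Illustrative only: force readers onto the slow path around an update. */
	static struct rcu_sync my_rss;		/* set up once with rcu_sync_init(&my_rss) */

	void writer_side_update(void)
	{
		rcu_sync_enter(&my_rss);	/* may wait for a GP; readers now see "not idle" */
		/* ... perform the update; every reader takes its slow path ... */
		rcu_sync_exit(&my_rss);		/* fast path is re-enabled one GP later */
	}

	bool reader_may_use_fastpath(void)
	{
		/* must run inside an RCU read-side critical section */
		return rcu_sync_is_idle(&my_rss);
	}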
* [PATCH 1/1] rcu/sync: simplify the state machine
From: Oleg Nesterov @ 2019-04-25 16:50 UTC
To: Paul E. McKenney, Peter Zijlstra
Cc: Joel Fernandes, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers, Steven Rostedt, linux-kernel

With this patch rcu_sync has a single state variable and the transition rules
become really simple:

	GP_IDLE   - owned by the first rcu_sync_enter() which moves it to

	GP_ENTER  - owned by rcu-callback which moves it to

	GP_PASSED - owned by the last rcu_sync_exit() which moves it to

	GP_EXIT   - and this is the only "nontrivial" state.

		    rcu-callback moves it back to GP_IDLE unless another
		    enter() comes before a GP pass.

		    If rcu-callback is invoked before the next rcu_sync_exit()
		    it must see gp_count incremented by that enter() and set
		    GP_PASSED.

		    Otherwise, if the next rcu_sync_exit() wins the race, it
		    will move it to

	GP_REPLAY - owned by rcu-callback which moves it to GP_EXIT

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/rcu_sync.h |   2 -
 kernel/rcu/sync.c        | 165 +++++++++++++++++++++++++++--------------------
 2 files changed, 95 insertions(+), 72 deletions(-)

diff --git a/include/linux/rcu_sync.h b/include/linux/rcu_sync.h
index e7ae221..3156a14 100644
--- a/include/linux/rcu_sync.h
+++ b/include/linux/rcu_sync.h
@@ -19,7 +19,6 @@ struct rcu_sync {
 	int			gp_count;
 	wait_queue_head_t	gp_wait;
 
-	int			cb_state;
 	struct rcu_head		cb_head;
 };
 
@@ -47,7 +46,6 @@ extern void rcu_sync_dtor(struct rcu_sync *);
 		.gp_state = 0,						\
 		.gp_count = 0,						\
 		.gp_wait = __WAIT_QUEUE_HEAD_INITIALIZER(name.gp_wait),	\
-		.cb_state = 0,						\
 	}
 
 #define	DEFINE_RCU_SYNC(name)	\
diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index ee427e1..d9f80fc 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -10,15 +10,13 @@
 #include <linux/rcu_sync.h>
 #include <linux/sched.h>
 
-enum { GP_IDLE = 0, GP_PENDING, GP_PASSED };
-enum { CB_IDLE = 0, CB_PENDING, CB_REPLAY };
+enum { GP_IDLE = 0, GP_ENTER, GP_PASSED, GP_EXIT, GP_REPLAY };
 
 #define	rss_lock	gp_wait.lock
 
 /**
  * rcu_sync_init() - Initialize an rcu_sync structure
  * @rsp: Pointer to rcu_sync structure to be initialized
- * @type: Flavor of RCU with which to synchronize rcu_sync structure
  */
 void rcu_sync_init(struct rcu_sync *rsp)
 {
@@ -41,56 +39,26 @@ void rcu_sync_enter_start(struct rcu_sync *rsp)
 	rsp->gp_state = GP_PASSED;
 }
 
-/**
- * rcu_sync_enter() - Force readers onto slowpath
- * @rsp: Pointer to rcu_sync structure to use for synchronization
- *
- * This function is used by updaters who need readers to make use of
- * a slowpath during the update.  After this function returns, all
- * subsequent calls to rcu_sync_is_idle() will return false, which
- * tells readers to stay off their fastpaths.  A later call to
- * rcu_sync_exit() re-enables reader slowpaths.
- *
- * When called in isolation, rcu_sync_enter() must wait for a grace
- * period, however, closely spaced calls to rcu_sync_enter() can
- * optimize away the grace-period wait via a state machine implemented
- * by rcu_sync_enter(), rcu_sync_exit(), and rcu_sync_func().
- */
-void rcu_sync_enter(struct rcu_sync *rsp)
-{
-	bool need_wait, need_sync;
 
-	spin_lock_irq(&rsp->rss_lock);
-	need_wait = rsp->gp_count++;
-	need_sync = rsp->gp_state == GP_IDLE;
-	if (need_sync)
-		rsp->gp_state = GP_PENDING;
-	spin_unlock_irq(&rsp->rss_lock);
+static void rcu_sync_func(struct rcu_head *rcu);
 
-	WARN_ON_ONCE(need_wait && need_sync);
-	if (need_sync) {
-		synchronize_rcu();
-		rsp->gp_state = GP_PASSED;
-		wake_up_all(&rsp->gp_wait);
-	} else if (need_wait) {
-		wait_event(rsp->gp_wait, rsp->gp_state == GP_PASSED);
-	} else {
-		/*
-		 * Possible when there's a pending CB from a rcu_sync_exit().
-		 * Nobody has yet been allowed the 'fast' path and thus we can
-		 * avoid doing any sync(). The callback will get 'dropped'.
-		 */
-		WARN_ON_ONCE(rsp->gp_state != GP_PASSED);
-	}
+static void rcu_sync_call(struct rcu_sync *rsp)
+{
+	call_rcu(&rsp->cb_head, rcu_sync_func);
 }
 
 /**
  * rcu_sync_func() - Callback function managing reader access to fastpath
  * @rhp: Pointer to rcu_head in rcu_sync structure to use for synchronization
  *
- * This function is passed to one of the call_rcu() functions by
+ * This function is passed to call_rcu() by rcu_sync_enter() and
  * rcu_sync_exit(), so that it is invoked after a grace period following the
- * that invocation of rcu_sync_exit().  It takes action based on events that
+ * invocation of enter/exit.
+ *
+ * If it is called by rcu_sync_enter() it signals that all the readers were
+ * switched onto the slow path.
+ *
+ * If it is called by rcu_sync_exit() it takes action based on events that
  * have taken place in the meantime, so that closely spaced rcu_sync_enter()
  * and rcu_sync_exit() pairs need not wait for a grace period.
  *
@@ -102,40 +70,93 @@ void rcu_sync_enter(struct rcu_sync *rsp)
  * rcu_sync_exit(). Otherwise, set all state back to idle so that readers
  * can again use their fastpaths.
  */
-static void rcu_sync_func(struct rcu_head *rhp)
+static void rcu_sync_func(struct rcu_head *rcu)
 {
-	struct rcu_sync *rsp = container_of(rhp, struct rcu_sync, cb_head);
+	struct rcu_sync *rsp = container_of(rcu, struct rcu_sync, cb_head);
 	unsigned long flags;
 
-	WARN_ON_ONCE(rsp->gp_state != GP_PASSED);
-	WARN_ON_ONCE(rsp->cb_state == CB_IDLE);
+	WARN_ON_ONCE(rsp->gp_state == GP_IDLE);
+	WARN_ON_ONCE(rsp->gp_state == GP_PASSED);
 
 	spin_lock_irqsave(&rsp->rss_lock, flags);
 	if (rsp->gp_count) {
 		/*
-		 * A new rcu_sync_begin() has happened; drop the callback.
+		 * We're at least a GP after the GP_IDLE->GP_ENTER transition.
 		 */
-		rsp->cb_state = CB_IDLE;
-	} else if (rsp->cb_state == CB_REPLAY) {
+		rsp->gp_state = GP_PASSED;
+		wake_up_locked(&rsp->gp_wait);
+	} else if (rsp->gp_state == GP_REPLAY) {
 		/*
-		 * A new rcu_sync_exit() has happened; requeue the callback
-		 * to catch a later GP.
+		 * A new rcu_sync_exit() has happened; requeue the callback to
+		 * catch a later GP.
 		 */
-		rsp->cb_state = CB_PENDING;
-		call_rcu(&rsp->cb_head, rcu_sync_func);
+		rsp->gp_state = GP_EXIT;
+		rcu_sync_call(rsp);
 	} else {
 		/*
-		 * We're at least a GP after rcu_sync_exit(); eveybody will now
-		 * have observed the write side critical section. Let 'em rip!.
+		 * We're at least a GP after the last rcu_sync_exit(); everybody
+		 * will now have observed the write side critical section.
+		 * Let 'em rip!
 		 */
-		rsp->cb_state = CB_IDLE;
 		rsp->gp_state = GP_IDLE;
 	}
 	spin_unlock_irqrestore(&rsp->rss_lock, flags);
 }
 
 /**
- * rcu_sync_exit() - Allow readers back onto fast patch after grace period
+ * rcu_sync_enter() - Force readers onto slowpath
+ * @rsp: Pointer to rcu_sync structure to use for synchronization
+ *
+ * This function is used by updaters who need readers to make use of
+ * a slowpath during the update.  After this function returns, all
+ * subsequent calls to rcu_sync_is_idle() will return false, which
+ * tells readers to stay off their fastpaths.  A later call to
+ * rcu_sync_exit() re-enables reader slowpaths.
+ *
+ * When called in isolation, rcu_sync_enter() must wait for a grace
+ * period, however, closely spaced calls to rcu_sync_enter() can
+ * optimize away the grace-period wait via a state machine implemented
+ * by rcu_sync_enter(), rcu_sync_exit(), and rcu_sync_func().
+ */
+void rcu_sync_enter(struct rcu_sync *rsp)
+{
+	int gp_state;
+
+	spin_lock_irq(&rsp->rss_lock);
+	gp_state = rsp->gp_state;
+	if (gp_state == GP_IDLE) {
+		rsp->gp_state = GP_ENTER;
+		WARN_ON_ONCE(rsp->gp_count);
+		/*
+		 * Note that we could simply do rcu_sync_call(rsp) here and
+		 * avoid the "if (gp_state == GP_IDLE)" block below.
+		 *
+		 * However, synchronize_rcu() can be faster if rcu_expedited
+		 * or rcu_blocking_is_gp() is true.
+		 *
+		 * Another reason is that we can't wait for the rcu callback
+		 * if we are called at early boot time, but this shouldn't
+		 * happen.
+		 */
+	}
+	rsp->gp_count++;
+	spin_unlock_irq(&rsp->rss_lock);
+
+	if (gp_state == GP_IDLE) {
+		/*
+		 * See the comment above, this simply does the "synchronous"
+		 * call_rcu(rcu_sync_func) which does GP_ENTER -> GP_PASSED.
+		 */
+		synchronize_rcu();
+		rcu_sync_func(&rsp->cb_head);
+		/* Not really needed, wait_event() would see GP_PASSED. */
+		return;
+	}
+
+	wait_event(rsp->gp_wait, rsp->gp_state >= GP_PASSED);
+}
+
+/**
+ * rcu_sync_exit() - Allow readers back onto fast path after grace period
  * @rsp: Pointer to rcu_sync structure to use for synchronization
  *
  * This function is used by updaters who have completed, and can therefore
@@ -146,13 +167,16 @@ static void rcu_sync_func(struct rcu_head *rhp)
  */
 void rcu_sync_exit(struct rcu_sync *rsp)
 {
+	WARN_ON_ONCE(rsp->gp_state == GP_IDLE);
+	WARN_ON_ONCE(rsp->gp_count == 0);
+
 	spin_lock_irq(&rsp->rss_lock);
 	if (!--rsp->gp_count) {
-		if (rsp->cb_state == CB_IDLE) {
-			rsp->cb_state = CB_PENDING;
-			call_rcu(&rsp->cb_head, rcu_sync_func);
-		} else if (rsp->cb_state == CB_PENDING) {
-			rsp->cb_state = CB_REPLAY;
+		if (rsp->gp_state == GP_PASSED) {
+			rsp->gp_state = GP_EXIT;
+			rcu_sync_call(rsp);
+		} else if (rsp->gp_state == GP_EXIT) {
+			rsp->gp_state = GP_REPLAY;
 		}
 	}
 	spin_unlock_irq(&rsp->rss_lock);
@@ -164,18 +188,19 @@ void rcu_sync_exit(struct rcu_sync *rsp)
  */
 void rcu_sync_dtor(struct rcu_sync *rsp)
 {
-	int cb_state;
+	int gp_state;
 
 	WARN_ON_ONCE(rsp->gp_count);
+	WARN_ON_ONCE(rsp->gp_state == GP_PASSED);
 
 	spin_lock_irq(&rsp->rss_lock);
-	if (rsp->cb_state == CB_REPLAY)
-		rsp->cb_state = CB_PENDING;
-	cb_state = rsp->cb_state;
+	if (rsp->gp_state == GP_REPLAY)
+		rsp->gp_state = GP_EXIT;
+	gp_state = rsp->gp_state;
 	spin_unlock_irq(&rsp->rss_lock);
 
-	if (cb_state != CB_IDLE) {
+	if (gp_state != GP_IDLE) {
 		rcu_barrier();
-		WARN_ON_ONCE(rsp->cb_state != CB_IDLE);
+		WARN_ON_ONCE(rsp->gp_state != GP_IDLE);
 	}
 }
-- 
2.5.0
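Condensed from the changelog above into a transition table (the labels
paraphrase the code; GP_EXIT is the only state with more than one successor):

	GP_IDLE    -- first rcu_sync_enter() -------------------------> GP_ENTER
	GP_ENTER   -- rcu callback, at least one GP has passed -------> GP_PASSED
	GP_PASSED  -- last rcu_sync_exit() drops gp_count to zero ----> GP_EXIT
	GP_EXIT    -- rcu callback, no enter()/exit() raced with it --> GP_IDLE
	GP_EXIT    -- a new enter()+exit() pair beats the callback ---> GP_REPLAY
	GP_REPLAY  -- rcu callback requeues itself -------------------> GP_EXIT
	GP_EXIT or GP_REPLAY -- callback sees gp_count > 0 -----------> GP_PASSED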
* Re: [PATCH 1/1] rcu/sync: simplify the state machine
From: Paul E. McKenney @ 2019-04-27 21:02 UTC
To: Oleg Nesterov
Cc: Peter Zijlstra, Joel Fernandes, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers, Steven Rostedt, linux-kernel

On Thu, Apr 25, 2019 at 06:50:55PM +0200, Oleg Nesterov wrote:
> With this patch rcu_sync has a single state variable and the transition rules
> become really simple:
>
> 	GP_IDLE   - owned by the first rcu_sync_enter() which moves it to
>
> 	GP_ENTER  - owned by rcu-callback which moves it to
>
> 	GP_PASSED - owned by the last rcu_sync_exit() which moves it to
>
> 	GP_EXIT   - and this is the only "nontrivial" state.
>
> 		    rcu-callback moves it back to GP_IDLE unless another
> 		    enter() comes before a GP pass.
>
> 		    If rcu-callback is invoked before the next rcu_sync_exit()
> 		    it must see gp_count incremented by that enter() and set
> 		    GP_PASSED.
>
> 		    Otherwise, if the next rcu_sync_exit() wins the race, it
> 		    will move it to
>
> 	GP_REPLAY - owned by rcu-callback which moves it to GP_EXIT
>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Queued and passing initial tests, thank you!  It may be a day or two
before it shows up on -rcu, but it will get there!

							Thanx, Paul

> [...]
* Re: [PATCH 1/1] rcu/sync: simplify the state machine
From: Paul E. McKenney @ 2019-04-28 22:26 UTC
To: Oleg Nesterov
Cc: Peter Zijlstra, Joel Fernandes, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers, Steven Rostedt, linux-kernel

On Sat, Apr 27, 2019 at 02:02:30PM -0700, Paul E. McKenney wrote:
> On Thu, Apr 25, 2019 at 06:50:55PM +0200, Oleg Nesterov wrote:
> > With this patch rcu_sync has a single state variable and the transition
> > rules become really simple:
> >
> > [...]
> >
> > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
>
> Queued and passing initial tests, thank you!  It may be a day or two
> before it shows up on -rcu, but it will get there!

And it still looks good after review, so I have pushed it.  I did add
READ_ONCE() and WRITE_ONCE() to unprotected uses of ->gp_state, but
please let me know if I messed anything up.

							Thanx, Paul

> > [...]
* Re: [PATCH 1/1] rcu/sync: simplify the state machine
From: Oleg Nesterov @ 2019-04-29 16:06 UTC
To: Paul E. McKenney
Cc: Peter Zijlstra, Joel Fernandes, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers, Steven Rostedt, linux-kernel

On 04/28, Paul E. McKenney wrote:
>
> And it still looks good after review, so I have pushed it.

Thanks!

> I did add
> READ_ONCE() and WRITE_ONCE() to unprotected uses of ->gp_state, but
> please let me know if I messed anything up.

Well, at least the WRITE_ONCE()s certainly look unneeded to me; gp_state
is protected by rss_lock.

WARN_ON_ONCE(gp_state) can read gp_state locklessly, but even in this case
I do not understand what READ_ONCE() tries to prevent...

Nevermind, this won't hurt, and as I already said I don't understand the
_ONCE() magic anyway ;)

Thanks,

Oleg.
* Re: [PATCH 1/1] rcu/sync: simplify the state machine
From: Paul E. McKenney @ 2019-04-29 20:40 UTC
To: Oleg Nesterov
Cc: Peter Zijlstra, Joel Fernandes, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers, Steven Rostedt, linux-kernel

On Mon, Apr 29, 2019 at 06:06:04PM +0200, Oleg Nesterov wrote:
> On 04/28, Paul E. McKenney wrote:
> >
> > And it still looks good after review, so I have pushed it.
>
> Thanks!
>
> > I did add
> > READ_ONCE() and WRITE_ONCE() to unprotected uses of ->gp_state, but
> > please let me know if I messed anything up.
>
> Well, at least the WRITE_ONCE()s certainly look unneeded to me; gp_state
> is protected by rss_lock.
>
> WARN_ON_ONCE(gp_state) can read gp_state locklessly, but even in this case
> I do not understand what READ_ONCE() tries to prevent...
>
> Nevermind, this won't hurt, and as I already said I don't understand the
> _ONCE() magic anyway ;)

If I understand correctly, rcu_sync_is_idle() can be inline and returns
->gp_state.  Without the READ_ONCE(), the compiler might fuse reads from
consecutive calls to rcu_sync_is_idle() or (under register pressure)
re-read from it, getting inconsistent results.  For example, this:

	tmp = rcu_sync_is_idle(rsp);
	do_something(tmp);
	do_something_else(tmp);

Might become this:

	do_something(rcu_sync_is_idle(rsp));
	do_something_else(rcu_sync_is_idle(rsp));

This might actually be harmless given current calls, but it would be at
best an accident waiting to happen.  Or am I missing something here?

							Thanx, Paul
* Re: [PATCH 1/1] rcu/sync: simplify the state machine
From: Oleg Nesterov @ 2019-04-30 11:27 UTC
To: Paul E. McKenney
Cc: Peter Zijlstra, Joel Fernandes, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers, Steven Rostedt, linux-kernel

On 04/29, Paul E. McKenney wrote:
>
> On Mon, Apr 29, 2019 at 06:06:04PM +0200, Oleg Nesterov wrote:
> >
> > Well, at least the WRITE_ONCE()s certainly look unneeded to me; gp_state
> > is protected by rss_lock.
> >
> > WARN_ON_ONCE(gp_state) can read gp_state locklessly, but even in this case
> > I do not understand what READ_ONCE() tries to prevent...
> >
> > Nevermind, this won't hurt, and as I already said I don't understand the
> > _ONCE() magic anyway ;)
>
> If I understand correctly, rcu_sync_is_idle() can be inline and returns
> ->gp_state.

Ah, sorry! I didn't mean rcu_sync_is_idle(). To be honest, I didn't even
notice this change, but it looks obviously fine to me, with or without
this patch.

And yes,

> Without the READ_ONCE(), the compiler might fuse reads from
> consecutive calls to rcu_sync_is_idle() or (under register pressure)
> re-read from it, getting inconsistent results.  For example, this:
>
> 	tmp = rcu_sync_is_idle(rsp);
> 	do_something(tmp);
> 	do_something_else(tmp);
>
> Might become this:
>
> 	do_something(rcu_sync_is_idle(rsp));
> 	do_something_else(rcu_sync_is_idle(rsp));

this is very clear. Even for me ;)

Thanks,

Oleg.
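For reference, a minimal sketch of the helper this READ_ONCE() discussion is
about, the post-patch rcu_sync_is_idle(); it assumes GP_IDLE == 0 and may
differ in detail from the exact code in the -rcu tree:

	static inline bool rcu_sync_is_idle(struct rcu_sync *rsp)
	{
		/*
		 * Lockless fastpath read; READ_ONCE() keeps the compiler
		 * from fusing or re-reading this load across calls.
		 */
		return !READ_ONCE(rsp->gp_state);	/* GP_IDLE */
	}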