From: Petr Mladek <pmladek@suse.com> To: John Ogness <john.ogness@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>, Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>, Sergey Senozhatsky <sergey.senozhatsky@gmail.com>, Steven Rostedt <rostedt@goodmis.org>, Linus Torvalds <torvalds@linux-foundation.org>, Greg Kroah-Hartman <gregkh@linuxfoundation.org>, Andrea Parri <parri.andrea@gmail.com>, Thomas Gleixner <tglx@linutronix.de>, kexec@lists.infradead.org, linux-kernel@vger.kernel.org Subject: Full barrier in data_push_tail(): was [PATCH v2 2/3] printk: add lockless buffer Date: Tue, 9 Jun 2020 11:48:20 +0200 [thread overview] Message-ID: <20200609094820.GC23752@linux-b0ei> (raw) In-Reply-To: <20200501094010.17694-3-john.ogness@linutronix.de> On Fri 2020-05-01 11:46:09, John Ogness wrote: > Introduce a multi-reader multi-writer lockless ringbuffer for storing > the kernel log messages. Readers and writers may use their API from > any context (including scheduler and NMI). This ringbuffer will make > it possible to decouple printk() callers from any context, locking, > or console constraints. It also makes it possible for readers to have > full access to the ringbuffer contents at any time and context (for > example from any panic situation). > > diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c > new file mode 100644 > index 000000000000..e0a66468d4f3 > --- /dev/null > +++ b/kernel/printk/printk_ringbuffer.c > +/* > + * Advance the data ring tail to at least @lpos. This function puts > + * descriptors into the reusable state if the tail is pushed beyond > + * their associated data block. > + */ > +static bool data_push_tail(struct printk_ringbuffer *rb, > + struct prb_data_ring *data_ring, > + unsigned long lpos) > +{ > + unsigned long tail_lpos; > + unsigned long next_lpos; > + > + /* If @lpos is not valid, there is nothing to do. */ > + if (lpos == INVALID_LPOS) > + return true; > + > + tail_lpos = atomic_long_read(&data_ring->tail_lpos); > + > + do { > + /* Done, if the tail lpos is already at or beyond @lpos. */ > + if ((lpos - tail_lpos) - 1 >= DATA_SIZE(data_ring)) > + break; > + > + /* > + * Make all descriptors reusable that are associated with > + * data blocks before @lpos. > + */ > + if (!data_make_reusable(rb, data_ring, tail_lpos, lpos, > + &next_lpos)) { > + /* > + * Guarantee the descriptor state loaded in > + * data_make_reusable() is performed before reloading > + * the tail lpos. The failed data_make_reusable() may > + * be due to a newly recycled descriptor causing > + * the tail lpos to have been previously pushed. This > + * pairs with desc_reserve:D. > + * > + * Memory barrier involvement: > + * > + * If data_make_reusable:D reads from desc_reserve:G, > + * then data_push_tail:B reads from data_push_tail:D. > + * > + * Relies on: > + * > + * MB from data_push_tail:D to desc_reserve:G > + * matching > + * RMB from data_make_reusable:D to data_push_tail:B > + * > + * Note: data_push_tail:D and desc_reserve:G can be > + * different CPUs. However, the desc_reserve:G > + * CPU (which performs the full memory barrier) > + * must have previously seen data_push_tail:D. > + */ > + smp_rmb(); /* LMM(data_push_tail:A) */ > + > + next_lpos = atomic_long_read(&data_ring->tail_lpos > + ); /* LMM(data_push_tail:B) */ > + if (next_lpos == tail_lpos) > + return false; > + > + /* Another task pushed the tail. Try again. */ > + tail_lpos = next_lpos; > + continue; > + } > + > + /* > + * Guarantee any descriptor states that have transitioned to > + * reusable are stored before pushing the tail lpos. This > + * allows readers to identify if data has expired while > + * reading the descriptor. This pairs with desc_read:D. > + */ > + smp_mb(); /* LMM(data_push_tail:C) */ The comment does not explain why we need a full barrier here. It talks about writing descriptor states. It suggests that write barrier should be enough. I guess that this is related to the discussion that we had last time, and the litmus test mentioned in see https://lore.kernel.org/r/87h7zcjkxy.fsf@linutronix.de I would add something like: * Full barrier is necessary because the descriptors * might have been made reusable also by other CPUs. For people like me, it would be great to add also link to a more detailed explanation, for example, the litmus tests, or something even more human readable ;-) I think that it is a "rather" common problem. I wonder whether it is already documented somewhere. > + } while (!atomic_long_try_cmpxchg_relaxed(&data_ring->tail_lpos, > + &tail_lpos, next_lpos)); /* LMM(data_push_tail:D) */ > + > + return true; > +} > + Best Regards, Petr
WARNING: multiple messages have this Message-ID (diff)
From: Petr Mladek <pmladek@suse.com> To: John Ogness <john.ogness@linutronix.de> Cc: Andrea Parri <parri.andrea@gmail.com>, Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>, Peter Zijlstra <peterz@infradead.org>, Greg Kroah-Hartman <gregkh@linuxfoundation.org>, kexec@lists.infradead.org, linux-kernel@vger.kernel.org, Steven Rostedt <rostedt@goodmis.org>, Sergey Senozhatsky <sergey.senozhatsky@gmail.com>, Thomas Gleixner <tglx@linutronix.de>, Linus Torvalds <torvalds@linux-foundation.org> Subject: Full barrier in data_push_tail(): was [PATCH v2 2/3] printk: add lockless buffer Date: Tue, 9 Jun 2020 11:48:20 +0200 [thread overview] Message-ID: <20200609094820.GC23752@linux-b0ei> (raw) In-Reply-To: <20200501094010.17694-3-john.ogness@linutronix.de> On Fri 2020-05-01 11:46:09, John Ogness wrote: > Introduce a multi-reader multi-writer lockless ringbuffer for storing > the kernel log messages. Readers and writers may use their API from > any context (including scheduler and NMI). This ringbuffer will make > it possible to decouple printk() callers from any context, locking, > or console constraints. It also makes it possible for readers to have > full access to the ringbuffer contents at any time and context (for > example from any panic situation). > > diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c > new file mode 100644 > index 000000000000..e0a66468d4f3 > --- /dev/null > +++ b/kernel/printk/printk_ringbuffer.c > +/* > + * Advance the data ring tail to at least @lpos. This function puts > + * descriptors into the reusable state if the tail is pushed beyond > + * their associated data block. > + */ > +static bool data_push_tail(struct printk_ringbuffer *rb, > + struct prb_data_ring *data_ring, > + unsigned long lpos) > +{ > + unsigned long tail_lpos; > + unsigned long next_lpos; > + > + /* If @lpos is not valid, there is nothing to do. */ > + if (lpos == INVALID_LPOS) > + return true; > + > + tail_lpos = atomic_long_read(&data_ring->tail_lpos); > + > + do { > + /* Done, if the tail lpos is already at or beyond @lpos. */ > + if ((lpos - tail_lpos) - 1 >= DATA_SIZE(data_ring)) > + break; > + > + /* > + * Make all descriptors reusable that are associated with > + * data blocks before @lpos. > + */ > + if (!data_make_reusable(rb, data_ring, tail_lpos, lpos, > + &next_lpos)) { > + /* > + * Guarantee the descriptor state loaded in > + * data_make_reusable() is performed before reloading > + * the tail lpos. The failed data_make_reusable() may > + * be due to a newly recycled descriptor causing > + * the tail lpos to have been previously pushed. This > + * pairs with desc_reserve:D. > + * > + * Memory barrier involvement: > + * > + * If data_make_reusable:D reads from desc_reserve:G, > + * then data_push_tail:B reads from data_push_tail:D. > + * > + * Relies on: > + * > + * MB from data_push_tail:D to desc_reserve:G > + * matching > + * RMB from data_make_reusable:D to data_push_tail:B > + * > + * Note: data_push_tail:D and desc_reserve:G can be > + * different CPUs. However, the desc_reserve:G > + * CPU (which performs the full memory barrier) > + * must have previously seen data_push_tail:D. > + */ > + smp_rmb(); /* LMM(data_push_tail:A) */ > + > + next_lpos = atomic_long_read(&data_ring->tail_lpos > + ); /* LMM(data_push_tail:B) */ > + if (next_lpos == tail_lpos) > + return false; > + > + /* Another task pushed the tail. Try again. */ > + tail_lpos = next_lpos; > + continue; > + } > + > + /* > + * Guarantee any descriptor states that have transitioned to > + * reusable are stored before pushing the tail lpos. This > + * allows readers to identify if data has expired while > + * reading the descriptor. This pairs with desc_read:D. > + */ > + smp_mb(); /* LMM(data_push_tail:C) */ The comment does not explain why we need a full barrier here. It talks about writing descriptor states. It suggests that write barrier should be enough. I guess that this is related to the discussion that we had last time, and the litmus test mentioned in see https://lore.kernel.org/r/87h7zcjkxy.fsf@linutronix.de I would add something like: * Full barrier is necessary because the descriptors * might have been made reusable also by other CPUs. For people like me, it would be great to add also link to a more detailed explanation, for example, the litmus tests, or something even more human readable ;-) I think that it is a "rather" common problem. I wonder whether it is already documented somewhere. > + } while (!atomic_long_try_cmpxchg_relaxed(&data_ring->tail_lpos, > + &tail_lpos, next_lpos)); /* LMM(data_push_tail:D) */ > + > + return true; > +} > + Best Regards, Petr _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
next prev parent reply other threads:[~2020-06-09 9:48 UTC|newest] Thread overview: 61+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-05-01 9:40 [PATCH v2 0/3] printk: replace ringbuffer John Ogness 2020-05-01 9:40 ` John Ogness 2020-05-01 9:40 ` [PATCH v2 1/3] crash: add VMCOREINFO macro for anonymous structs John Ogness 2020-05-01 9:40 ` John Ogness 2020-06-03 10:16 ` Petr Mladek 2020-06-03 10:16 ` Petr Mladek 2020-05-01 9:40 ` [PATCH v2 2/3] printk: add lockless buffer John Ogness 2020-05-01 9:40 ` John Ogness 2020-05-18 13:03 ` John Ogness 2020-05-18 17:22 ` Linus Torvalds 2020-05-18 17:22 ` Linus Torvalds 2020-05-19 20:34 ` John Ogness 2020-05-19 20:34 ` John Ogness 2020-06-09 7:10 ` blk->id read race: was: " Petr Mladek 2020-06-09 7:10 ` Petr Mladek 2020-06-09 14:18 ` John Ogness 2020-06-09 14:18 ` John Ogness 2020-06-10 8:42 ` Petr Mladek 2020-06-10 8:42 ` Petr Mladek 2020-06-10 13:55 ` John Ogness 2020-06-10 13:55 ` John Ogness 2020-06-09 9:31 ` redundant check in make_data_reusable(): was " Petr Mladek 2020-06-09 9:31 ` Petr Mladek 2020-06-09 14:48 ` John Ogness 2020-06-09 14:48 ` John Ogness 2020-06-10 9:38 ` Petr Mladek 2020-06-10 9:38 ` Petr Mladek 2020-06-10 10:24 ` John Ogness 2020-06-10 10:24 ` John Ogness 2020-06-10 14:56 ` John Ogness 2020-06-10 14:56 ` John Ogness 2020-06-11 19:51 ` John Ogness 2020-06-11 19:51 ` John Ogness 2020-06-11 13:55 ` Petr Mladek 2020-06-11 13:55 ` Petr Mladek 2020-06-11 20:25 ` John Ogness 2020-06-11 20:25 ` John Ogness 2020-06-09 9:48 ` Petr Mladek [this message] 2020-06-09 9:48 ` Full barrier in data_push_tail(): " Petr Mladek 2020-06-09 15:03 ` John Ogness 2020-06-09 15:03 ` John Ogness 2020-06-09 11:37 ` Barrier before pushing desc_ring tail: " Petr Mladek 2020-06-09 11:37 ` Petr Mladek 2020-06-09 15:56 ` John Ogness 2020-06-09 15:56 ` John Ogness 2020-06-11 12:01 ` Petr Mladek 2020-06-11 12:01 ` Petr Mladek 2020-06-11 23:06 ` John Ogness 2020-06-11 23:06 ` John Ogness 2020-06-09 14:38 ` data_ring head_lpos and tail_lpos synchronization: " Petr Mladek 2020-06-09 14:38 ` Petr Mladek 2020-06-10 7:53 ` John Ogness 2020-06-10 7:53 ` John Ogness 2020-05-01 9:40 ` [PATCH v2 3/3] printk: use the lockless ringbuffer John Ogness 2020-05-01 9:40 ` John Ogness 2020-05-06 14:50 ` John Ogness 2020-05-06 14:50 ` John Ogness 2020-05-13 12:05 ` [PATCH v2 0/3] printk: replace ringbuffer Prarit Bhargava 2020-05-13 12:05 ` Prarit Bhargava 2020-05-15 10:24 ` Sergey Senozhatsky 2020-05-15 10:24 ` Sergey Senozhatsky
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20200609094820.GC23752@linux-b0ei \ --to=pmladek@suse.com \ --cc=gregkh@linuxfoundation.org \ --cc=john.ogness@linutronix.de \ --cc=kexec@lists.infradead.org \ --cc=linux-kernel@vger.kernel.org \ --cc=parri.andrea@gmail.com \ --cc=peterz@infradead.org \ --cc=rostedt@goodmis.org \ --cc=sergey.senozhatsky.work@gmail.com \ --cc=sergey.senozhatsky@gmail.com \ --cc=tglx@linutronix.de \ --cc=torvalds@linux-foundation.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.