From: Petr Mladek <pmladek@suse.com>
To: John Ogness <john.ogness@linutronix.de>
Cc: linux-kernel@vger.kernel.org,
Andrea Parri <andrea.parri@amarulasolutions.com>,
Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
Steven Rostedt <rostedt@goodmis.org>,
Brendan Higgins <brendanhiggins@google.com>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
Linus Torvalds <torvalds@linux-foundation.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: numlist API Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation
Date: Fri, 23 Aug 2019 19:18:02 +0200 [thread overview]
Message-ID: <20190823171802.eo2chwyktibeub7a@pathway.suse.cz> (raw)
In-Reply-To: <20190807222634.1723-2-john.ogness@linutronix.de>
On Thu 2019-08-08 00:32:26, John Ogness wrote:
> --- /dev/null
> +++ b/kernel/printk/numlist.c
> @@ -0,0 +1,375 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/sched.h>
> +#include "numlist.h"
struct numlist is really special variant of a list. Let me to
do a short summary:
+ FIFO queue interface
+ nodes sequentially numbered
+ nodes referenced by ID instead pointers to avoid ABA problems
+ requires custom node() callback to get pointer for given ID
+ lockless access:
+ pushed nodes must not longer get modified by push() caller
+ pop() caller gets exclusive write access, except that they
must modify ID first and do smp_wmb() later
+ pop() does not work:
+ tail node is "busy"
+ needs a custom callback that defines when a node is busy
+ tail is the last node
+ needed for lockless sequential numbering
I will start with one inevitable question ;-) Is it realistic to find
another user for this API, please?
I am not sure that all the indirections, caused by the generic API,
are worth the gain.
Well, the separate API makes sense anyway. I have some ideas that
might make it cleaner.
The barriers are because of validating the ID. Now we have:
struct nl_node {
unsigned long seq;
unsigned long next_id;
};
that is used in:
struct prb_desc {
/* private */
atomic_long_t id;
struct dr_desc desc;
struct nl_node list;
};
What will happen when we move id from struct prb_desc into struct nl_node?
struct nl_node {
unsigned long seq;
atomic_long_t id;
unsigned long next_id;
};
struct prb_desc {
struct dr_desc desc;
struct nl_node list;
};
Then the "node" callback might just return the structure. It makes
perfect sense. struct nl_node is always static for a given id.
For the printk ringbuffer it would look like:
struct nl_node *prb_nl_get_node(unsigned long id, void *nl_user)
{
struct printk_ringbuffer *rb = (struct printk_ringbuffer *)nl_user;
struct prb_desc *d = to_desc(rb, id);
return &d->list;
}
I would also hide the callback behind a generic wrapper:
struct nl_node *numlist_get_node(struct numlist *nl, unsigned long id)
{
return nl->get_node(id, nl->user_data);
}
Then we could have nicely symetric and self contained barriers
in numlist_read():
bool numlist_read(struct numlist *nl, unsigned long id, unsigned long *seq,
unsigned long *next_id)
{
struct nl_node *n;
unsigned long cur_id;
n = numlist_get_node(nl, id);
if (!n)
return false;
/*
* Make sure that seq and next_id values will be read
* for the expected id.
*/
cur_id = atomic_long_read_acquire(&n->id);
if (cur_id != id)
return false;
if (seq) {
*seq = n->seq;
if (next_id)
*next_id = n->next_id;
}
/*
* Make sure that seq and next_id values were read for
* the expected ID.
*/
cur_id = atomic_long_read_release(&n->id);
return cur_id == id;
}
numlist_push() might be the same, except the I would
remove several WRITE_ONCE as discussed in another mail:
void numlist_push(struct numlist *nl, struct nl_node *n)
{
unsigned long head_id;
unsigned long seq;
unsigned long r;
/* Setup the node to be a list terminator: next_id == id. */
n->next_id = n->id;
do {
do {
head_id = atomic_long_read(&nl->head_id);
} while (!numlist_read(nl, head_id, &seq, NULL));
n->seq = seq + 1;
/*
* This store_release() guarantees that @seq and @next are
* stored before the node with @id is visible to any popping
* writers.
*
* It pairs with the acquire() when tail_id gets updated
* in headlist_pop();
*/
} while (atomic_long_cmpxchg_release(&nl->head_id, head_id, id) !=
head_id);
n = nl->get_node(nl, head_id);
/*
* This barrier makes sure that nl->head_id already points to
* the newly pushed node.
*
* It pairs with acquire when new id is written in numlist_pop().
* It allows to pop() and reuse this node. It can not longer
* be the last one.
*/
smp_store_release(&n->next_id, id);
}
Then I would add a symetric callback that would generate ID for
a newly popped struct. It will allow to set new ID in the numlist
API and have the barriers symetric. Something like:
unsined long prb_new_node_id(unsigned long old_id, , void *nl_user)
{
struct printk_ringbuffer *rb = (struct printk_ringbuffer *)nl_user;
return id + DESCS_COUNT(rb);
}
Then we could hide it in
unsigned long numlist_get_new_id(struct numlist *nl, unsigned long id)
{
return nl->get_new_id(id, nl->user_data);
}
and do
struct nl_node *numlist_pop(struct numlist *nl)
{
struct nl_node *n;
unsigned long tail_id;
unsigned long next_id;
unsigned long r;
tail_id = atomic_long_read(&nl->tail_id);
do {
do {
tail_id = atomic_long_read(&nl->tail_id);
} while (!numlist_read(nl, tail_id, NULL, &next_id));
/* Make sure the node is not the only node on the list. */
if (next_id == tail_id)
return NULL;
/* Make sure the node is not busy. */
if (nl->busy(tail_id, nl->busy_arg))
return NULL;
/*
* Make sure that nl->tail_id is update before
* we start modyfying the popped node.
*
* It pairs with release() when head_id is
* pushed in numlist_push().
*/
} while (atomic_long_cmpxchg_acquire(&nl->tail_id,
tail_id, next_id) !=
tail_id);
/* Got exclusive write access to the node. */
n = numlist_get_node(nl, tail_id);
tail_id = numlist_get_new_id(tail_id, nl);
/*
* Make sure that we set new ID before we allow
* more changes in user structure handled by this node.
*
* It pairs with release() barrier when the node is
* pushed into the numlist again, gets linked to
* the previous node and can't be modified anymore.
* See numlist_push().
*/
atomic_long_set_acquire(&d->id, atomic_long_read(&d->id) +
DESCS_COUNT(rb));
return n;
}
I hope that it makes some sense. I feel exhausted. It is Friday
evening here. I just wanted to send it because it looked like the most
constructive idea that I had this week. And I wanted to send something
more positive ;-)
Best Regards,
Petr
next prev parent reply other threads:[~2019-08-23 17:18 UTC|newest]
Thread overview: 131+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-07 22:26 [RFC PATCH v4 0/9] printk: new ringbuffer implementation John Ogness
2019-08-07 22:26 ` [RFC PATCH v4 1/9] printk-rb: add a new printk " John Ogness
2019-08-20 8:15 ` numlist_pop(): " Petr Mladek
2019-08-21 5:41 ` John Ogness
2019-09-04 12:19 ` Peter Zijlstra
2019-08-20 8:22 ` assign_desc() barriers: " Petr Mladek
2019-08-20 14:14 ` Petr Mladek
2019-08-21 5:52 ` John Ogness
2019-08-22 11:53 ` Petr Mladek
2019-08-25 2:06 ` John Ogness
2019-08-26 8:21 ` John Ogness
2019-08-20 8:55 ` comments style: " Petr Mladek
2019-08-20 9:27 ` Sergey Senozhatsky
2019-08-21 5:46 ` John Ogness
2019-08-22 13:50 ` Petr Mladek
2019-08-22 17:38 ` Andrea Parri
2019-08-23 10:47 ` Petr Mladek
2019-08-23 14:27 ` Andrea Parri
2019-08-23 9:49 ` Sergey Senozhatsky
2019-08-23 5:54 ` Sergey Senozhatsky
2019-08-23 10:29 ` Petr Mladek
2019-08-21 5:42 ` John Ogness
2019-08-22 12:44 ` Petr Mladek
2019-08-20 13:50 ` dataring_push() barriers " Petr Mladek
2019-08-25 2:42 ` John Ogness
2019-08-27 14:36 ` Petr Mladek
2019-08-28 13:43 ` John Ogness
2019-08-20 15:12 ` datablock reuse races " Petr Mladek
2019-08-23 9:21 ` numlist_push() barriers " Petr Mladek
2019-08-26 8:34 ` Andrea Parri
2019-08-26 8:43 ` Andrea Parri
2019-08-26 14:10 ` Petr Mladek
2019-08-26 16:01 ` Andrea Parri
2019-08-26 22:36 ` John Ogness
2019-08-27 7:40 ` Petr Mladek
2019-08-27 14:28 ` John Ogness
2019-08-27 15:07 ` Petr Mladek
2019-08-28 10:24 ` John Ogness
2019-08-23 17:18 ` Petr Mladek [this message]
2019-08-26 23:57 ` numlist API " John Ogness
2019-08-27 13:03 ` Petr Mladek
2019-08-28 7:13 ` John Ogness
2019-08-28 8:58 ` Petr Mladek
2019-08-28 14:03 ` John Ogness
2019-08-29 11:28 ` Petr Mladek
2019-09-03 7:58 ` Sergey Senozhatsky
2019-08-30 14:48 ` dataring " Petr Mladek
2019-08-07 22:26 ` [RFC PATCH v4 2/9] printk-rb: add test module John Ogness
2019-08-07 22:26 ` [RFC PATCH v4 3/9] printk-rb: fix missing includes/exports John Ogness
2019-08-07 22:26 ` [RFC PATCH v4 4/9] printk-rb: initialize new descriptors as invalid John Ogness
2019-08-20 9:23 ` Petr Mladek
2019-08-20 10:16 ` Sergey Senozhatsky
2019-08-21 5:56 ` John Ogness
2019-08-07 22:26 ` [RFC PATCH v4 5/9] printk-rb: remove extra data buffer size allocation John Ogness
2019-08-07 22:26 ` [RFC PATCH v4 6/9] printk-rb: adjust test module ringbuffer sizes John Ogness
2019-08-19 21:29 ` [PATCH] printk-rb: fix test module macro usage John Ogness
2019-08-07 22:26 ` [RFC PATCH v4 7/9] printk-rb: increase size of seq and size variables John Ogness
2019-08-07 22:26 ` [RFC PATCH v4 8/9] printk-rb: new functionality to support printk John Ogness
2019-08-20 9:59 ` Sergey Senozhatsky
2019-08-21 5:47 ` John Ogness
2019-08-07 22:26 ` [RFC PATCH v4 9/9] printk: use a new ringbuffer implementation John Ogness
2019-08-08 19:07 ` Linus Torvalds
2019-08-08 22:55 ` John Ogness
2019-08-08 23:33 ` Linus Torvalds
2019-08-08 23:45 ` Steven Rostedt
2019-08-09 0:21 ` Linus Torvalds
2019-08-09 0:48 ` Steven Rostedt
2019-08-09 1:15 ` Linus Torvalds
2019-08-09 11:15 ` Thomas Gleixner
2019-08-09 16:00 ` Linus Torvalds
2019-08-09 20:07 ` Thomas Gleixner
2019-08-09 20:20 ` Linus Torvalds
2019-08-09 6:14 ` Peter Zijlstra
2019-08-09 7:08 ` John Ogness
2019-08-09 15:57 ` Linus Torvalds
2019-08-10 5:53 ` Thomas Gleixner
2019-09-10 3:19 ` Sergey Senozhatsky
2019-08-12 9:54 ` Geert Uytterhoeven
2019-08-16 5:46 ` Dave Young
2019-08-16 5:54 ` Dave Young
2019-08-16 9:40 ` John Ogness
2019-09-04 12:35 ` [RFC PATCH v4 0/9] printk: " Peter Zijlstra
2019-09-05 13:05 ` Petr Mladek
2019-09-05 14:31 ` Peter Zijlstra
2019-09-05 15:38 ` Thomas Gleixner
2019-09-05 16:11 ` Steven Rostedt
2019-09-05 21:10 ` John Ogness
2019-09-06 9:39 ` Petr Mladek
2019-09-09 14:11 ` printk meeting at LPC Thomas Gleixner
2019-09-13 13:26 ` John Ogness
2019-09-13 14:48 ` Daniel Vetter
2019-09-15 13:47 ` John Ogness
2019-09-16 8:44 ` Daniel Vetter
2019-09-16 4:30 ` Tetsuo Handa
2019-09-16 10:46 ` Petr Mladek
2019-09-16 13:43 ` Steven Rostedt
2019-09-16 14:28 ` John Ogness
2019-09-17 8:11 ` Petr Mladek
2019-09-17 7:52 ` Petr Mladek
2019-09-17 13:02 ` Steven Rostedt
2019-09-17 13:12 ` Greg Kroah-Hartman
2019-09-17 13:37 ` Steven Rostedt
2019-09-17 14:08 ` Tetsuo Handa
2019-09-17 7:51 ` Sergey Senozhatsky
2019-09-18 1:25 ` Sergey Senozhatsky
2019-09-18 2:08 ` Steven Rostedt
2019-09-18 2:36 ` Sergey Senozhatsky
2019-09-18 5:19 ` Sergey Senozhatsky
2019-09-18 7:42 ` John Ogness
2019-09-18 8:10 ` Sergey Senozhatsky
2019-09-18 9:05 ` John Ogness
2019-09-18 9:11 ` Sergey Senozhatsky
2019-09-18 16:41 ` Petr Mladek
2019-09-18 16:48 ` Steven Rostedt
2019-09-24 14:24 ` Petr Mladek
2019-09-19 8:06 ` Daniel Vetter
2019-09-18 7:33 ` John Ogness
2019-09-18 8:08 ` Sergey Senozhatsky
2019-10-04 14:48 ` Tony Asleson
2019-10-07 12:01 ` Petr Mladek
2019-09-06 9:06 ` [RFC PATCH v4 0/9] printk: new ringbuffer implementation Peter Zijlstra
2019-09-06 10:09 ` Sergey Senozhatsky
2019-09-06 10:49 ` Peter Zijlstra
2019-09-06 13:44 ` Sergey Senozhatsky
2019-09-06 12:42 ` Petr Mladek
2019-09-06 14:01 ` Peter Zijlstra
2019-09-06 14:22 ` Peter Zijlstra
2019-09-06 19:53 ` Sergey Senozhatsky
2019-09-06 22:47 ` John Ogness
2019-09-08 22:18 ` Peter Zijlstra
2019-09-10 3:22 ` Sergey Senozhatsky
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190823171802.eo2chwyktibeub7a@pathway.suse.cz \
--to=pmladek@suse.com \
--cc=andrea.parri@amarulasolutions.com \
--cc=brendanhiggins@google.com \
--cc=gregkh@linuxfoundation.org \
--cc=john.ogness@linutronix.de \
--cc=linux-kernel@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=sergey.senozhatsky.work@gmail.com \
--cc=sergey.senozhatsky@gmail.com \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).