From: Petr Mladek <pmladek@suse.com>
To: John Ogness <john.ogness@linutronix.de>
Cc: linux-kernel@vger.kernel.org,
	Andrea Parri <andrea.parri@amarulasolutions.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Brendan Higgins <brendanhiggins@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: numlist API Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation
Date: Fri, 23 Aug 2019 19:18:02 +0200
Message-ID: <20190823171802.eo2chwyktibeub7a@pathway.suse.cz>
In-Reply-To: <20190807222634.1723-2-john.ogness@linutronix.de>

On Thu 2019-08-08 00:32:26, John Ogness wrote:
> --- /dev/null
> +++ b/kernel/printk/numlist.c
> @@ -0,0 +1,375 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/sched.h>
> +#include "numlist.h"

struct numlist is a really special variant of a list. Let me
do a short summary:

   + FIFO queue interface

   + nodes sequentially numbered

   + nodes referenced by ID instead of pointers to avoid ABA problems
     + requires custom node() callback to get pointer for given ID

   + lockless access:
     + pushed nodes must no longer get modified by the push() caller
     + pop() caller gets exclusive write access, except that they
       must modify ID first and do smp_wmb() later

   + pop() does not work when:
     + tail node is "busy"
       + needs a custom callback that defines when a node is busy
     + tail is the last node
       + needed for lockless sequential numbering
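
To make the pop() rules in the summary concrete, here is a minimal
single-threaded sketch. It is only a model under stated assumptions:
the model_* names, MODEL_COUNT, and the "busy" flag (standing in for
the custom busy callback) are hypothetical and not part of numlist.c,
and all barriers are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model; not the numlist.c code. */
#define MODEL_COUNT 4UL

struct model_node {
	unsigned long seq;	/* sequence number assigned on push */
	unsigned long next_id;	/* next_id == own ID marks the last node */
	bool busy;		/* stands in for the busy() callback */
};

struct model_list {
	struct model_node nodes[MODEL_COUNT];
	unsigned long head_id;	/* newest node */
	unsigned long tail_id;	/* oldest node */
};

/* Plays the role of the node() callback: ID -> node. */
struct model_node *model_node(struct model_list *l, unsigned long id)
{
	return &l->nodes[id % MODEL_COUNT];
}

/* Start with a single node (ID 0, seq 0) that terminates the list. */
void model_init(struct model_list *l)
{
	struct model_node *n = model_node(l, 0);

	n->seq = 0;
	n->next_id = 0;
	n->busy = false;
	l->head_id = 0;
	l->tail_id = 0;
}

/* Append node @id after the current head; numbering stays sequential. */
void model_push(struct model_list *l, unsigned long id)
{
	struct model_node *head = model_node(l, l->head_id);
	struct model_node *n = model_node(l, id);

	/* The pushed node must not share a slot with the head. */
	assert(n != head);

	n->next_id = id;		/* list terminator */
	n->seq = head->seq + 1;
	n->busy = false;
	head->next_id = id;
	l->head_id = id;
}

/* Returns false when the tail is the last node or is busy. */
bool model_pop(struct model_list *l, unsigned long *id)
{
	struct model_node *tail = model_node(l, l->tail_id);

	if (tail->next_id == l->tail_id)
		return false;
	if (tail->busy)
		return false;

	*id = l->tail_id;
	l->tail_id = tail->next_id;
	return true;
}
```

In this model a list with a single node can never be popped, which is
what keeps a predecessor available for computing the next seq.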

I will start with one inevitable question ;-) Is it realistic to find
another user for this API, please?

I am not sure that all the indirections, caused by the generic API,
are worth the gain.


Well, the separate API makes sense anyway. I have some ideas that
might make it cleaner.

The barriers are needed because the ID has to be validated. Now we have:

	struct nl_node {
		unsigned long	seq;
		unsigned long	next_id;
	};

that is used in:

	struct prb_desc {
		/* private */
		atomic_long_t		id;
		struct dr_desc		desc;
		struct nl_node		list;
	};

What will happen when we move id from struct prb_desc into struct nl_node?

	struct nl_node {
		unsigned long	seq;
		atomic_long_t	id;
		unsigned long	next_id;
	};

	struct prb_desc {
		struct dr_desc		desc;
		struct nl_node		list;
	};


Then the "node" callback might just return the structure. It makes
perfect sense. struct nl_node is always static for a given id.

For the printk ringbuffer it would look like:

struct nl_node *prb_nl_get_node(unsigned long id, void *nl_user)
{
	struct printk_ringbuffer *rb = (struct printk_ringbuffer *)nl_user;
	struct prb_desc *d = to_desc(rb, id);

	return &d->list;
}

I would also hide the callback behind a generic wrapper:

struct nl_node *numlist_get_node(struct numlist *nl, unsigned long id)
{
	return nl->get_node(id, nl->user_data);
}


Then we could have nicely symmetric and self-contained barriers
in numlist_read():

bool numlist_read(struct numlist *nl, unsigned long id, unsigned long *seq,
		  unsigned long *next_id)
{
	struct nl_node *n;
	unsigned long cur_id;

	n = numlist_get_node(nl, id);
	if (!n)
		return false;

	/*
	 * Make sure that seq and next_id values will be read
	 * for the expected id.
	 */
	cur_id = atomic_long_read_acquire(&n->id);
	if (cur_id != id)
		return false;

	if (seq)
		*seq = n->seq;

	if (next_id)
		*next_id = n->next_id;

	/*
	 * Make sure that seq and next_id values were read for
	 * the expected ID. There is no atomic read with release
	 * semantics, so the loads above are ordered with smp_rmb()
	 * before the ID is read again.
	 */
	smp_rmb();
	cur_id = atomic_long_read(&n->id);

	return cur_id == id;
}
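
For readers who want to try the check/read/re-check pattern outside
the kernel, here is a rough userspace sketch using C11 atomics. It is
an illustration under assumptions, not the proposed kernel code: the
rd_* names are made up, and seq and next_id are made atomic only so
that the example has no data race in the C11 model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct nl_node. */
struct rd_node {
	atomic_ulong id;
	atomic_ulong seq;
	atomic_ulong next_id;
};

/*
 * Copy seq and next_id only when the node holds @id both before
 * and after the copies.
 */
bool rd_read(struct rd_node *n, unsigned long id,
	     unsigned long *seq, unsigned long *next_id)
{
	unsigned long s, nx;

	/* Acquire: the data loads below cannot move before this load. */
	if (atomic_load_explicit(&n->id, memory_order_acquire) != id)
		return false;

	s = atomic_load_explicit(&n->seq, memory_order_relaxed);
	nx = atomic_load_explicit(&n->next_id, memory_order_relaxed);

	/* Keep the data loads above ahead of the ID re-check below. */
	atomic_thread_fence(memory_order_acquire);
	if (atomic_load_explicit(&n->id, memory_order_relaxed) != id)
		return false;

	if (seq)
		*seq = s;
	if (next_id)
		*next_id = nx;
	return true;
}

/* Usage: a matching ID reads the fields, a stale ID is rejected. */
void rd_usage(void)
{
	struct rd_node n;
	unsigned long seq = 0, next_id = 0;

	atomic_init(&n.id, 7);
	atomic_init(&n.seq, 42);
	atomic_init(&n.next_id, 8);

	assert(rd_read(&n, 7, &seq, &next_id));
	assert(seq == 42 && next_id == 8);
	assert(!rd_read(&n, 3, &seq, NULL));
}
```

The acquire fence followed by a relaxed load plays the role that
smp_rmb() followed by a plain re-read of the ID would play in the
kernel.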

numlist_push() might be the same, except that I would
remove several WRITE_ONCE() calls, as discussed in another mail:

void numlist_push(struct numlist *nl, struct nl_node *n)
{
	unsigned long id = atomic_long_read(&n->id);
	unsigned long head_id;
	unsigned long seq;

	/* Setup the node to be a list terminator: next_id == id. */
	n->next_id = id;

	do {
		do {
			head_id = atomic_long_read(&nl->head_id);
		} while (!numlist_read(nl, head_id, &seq, NULL));

		n->seq = seq + 1;

		/*
		 * This cmpxchg_release() guarantees that @seq and @next_id
		 * are stored before the node with @id is visible to any
		 * popping writers.
		 *
		 * It pairs with the acquire() when tail_id gets updated
		 * in numlist_pop().
		 */
	} while (atomic_long_cmpxchg_release(&nl->head_id, head_id, id) !=
			head_id);

	/* Link the previous head to the newly pushed node. */
	n = numlist_get_node(nl, head_id);

	/*
	 * This barrier makes sure that nl->head_id already points to
	 * the newly pushed node.
	 *
	 * It pairs with the barrier used when the new ID is written
	 * in numlist_pop(). It allows the previous head to be popped
	 * and reused. It can no longer be the last node.
	 */
	smp_store_release(&n->next_id, id);
}

Then I would add a symmetric callback that would generate the ID for
a newly popped struct. It would allow setting the new ID inside the
numlist API and keeping the barriers symmetric. Something like:

unsigned long prb_new_node_id(unsigned long old_id, void *nl_user)
{
	struct printk_ringbuffer *rb = (struct printk_ringbuffer *)nl_user;

	return old_id + DESCS_COUNT(rb);
}

Then we could hide it in

unsigned long numlist_get_new_id(struct numlist *nl, unsigned long id)
{
	return nl->get_new_id(id, nl->user_data);
}

and do

struct nl_node *numlist_pop(struct numlist *nl)
{
	struct nl_node *n;
	unsigned long tail_id;
	unsigned long next_id;
	unsigned long new_id;

	do {
		do {
			tail_id = atomic_long_read(&nl->tail_id);
		} while (!numlist_read(nl, tail_id, NULL, &next_id));

		/* Make sure the node is not the only node on the list. */
		if (next_id == tail_id)
			return NULL;

		/* Make sure the node is not busy. */
		if (nl->busy(tail_id, nl->busy_arg))
			return NULL;

		/*
		 * Make sure that nl->tail_id is updated before
		 * we start modifying the popped node.
		 *
		 * It pairs with release() when head_id is
		 * pushed in numlist_push().
		 */
	} while (atomic_long_cmpxchg_acquire(&nl->tail_id,
					     tail_id, next_id) !=
		 tail_id);

	/* Got exclusive write access to the node. */
	n = numlist_get_node(nl, tail_id);

	new_id = numlist_get_new_id(nl, tail_id);

	/*
	 * Make sure that we set the new ID before we allow
	 * more changes in the user structure handled by this node.
	 * There is no atomic set with acquire semantics, so the
	 * store is followed by smp_wmb().
	 *
	 * It pairs with the release() barrier when the node is
	 * pushed into the numlist again, gets linked to
	 * the previous node and can't be modified anymore.
	 * See numlist_push().
	 */
	atomic_long_set(&n->id, new_id);
	smp_wmb();

	return n;
}
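
The ID handling behind the new-ID callback can be illustrated with a
small standalone sketch. It assumes that an ID maps to an array slot
by masking with a power-of-two count, which is my reading of
to_desc() and DESCS_COUNT() and not something shown in this mail.
SLOT_COUNT and the slot_* helpers are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical slot count; assumed to be a power of two. */
#define SLOT_COUNT 8UL

_Static_assert((SLOT_COUNT & (SLOT_COUNT - 1)) == 0,
	       "SLOT_COUNT must be a power of two");

/* Map an ID to its slot in the backing array. */
unsigned long slot_index(unsigned long id)
{
	return id & (SLOT_COUNT - 1);
}

/* New ID for a reused slot: same slot, different ID value. */
unsigned long slot_new_id(unsigned long old_id)
{
	return old_id + SLOT_COUNT;
}

/* A reader holding @id is stale once the slot got a new ID. */
bool slot_id_is_current(unsigned long stored_id, unsigned long id)
{
	return stored_id == id;
}

/* Usage: a popped slot keeps its position but invalidates old IDs. */
void slot_usage(void)
{
	unsigned long old_id = 3;
	unsigned long new_id = slot_new_id(old_id);

	assert(new_id == 11);
	assert(slot_index(new_id) == slot_index(old_id));
	assert(!slot_id_is_current(new_id, old_id));
}
```

Because the count is a power of two, the slot stays the same even
when the unsigned long ID wraps around.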

I hope that it makes some sense. I feel exhausted. It is Friday
evening here. I just wanted to send it because it looked like the most
constructive idea that I had this week. And I wanted to send something
more positive ;-)

Best Regards,
Petr
