All of lore.kernel.org
 help / color / mirror / Atom feed
From: Petr Mladek <pmladek@suse.com>
To: John Ogness <john.ogness@linutronix.de>
Cc: linux-kernel@vger.kernel.org,
	Andrea Parri <andrea.parri@amarulasolutions.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Brendan Higgins <brendanhiggins@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: numlist API Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation
Date: Fri, 23 Aug 2019 19:18:02 +0200	[thread overview]
Message-ID: <20190823171802.eo2chwyktibeub7a@pathway.suse.cz> (raw)
In-Reply-To: <20190807222634.1723-2-john.ogness@linutronix.de>

On Thu 2019-08-08 00:32:26, John Ogness wrote:
> --- /dev/null
> +++ b/kernel/printk/numlist.c
> @@ -0,0 +1,375 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/sched.h>
> +#include "numlist.h"

struct numlist is really special variant of a list. Let me to
do a short summary:

   + FIFO queue interface

   + nodes sequentially numbered

   + nodes referenced by ID instead pointers to avoid ABA problems
     + requires custom node() callback to get pointer for given ID

   + lockless access:
     + pushed nodes must not longer get modified by push() caller
     + pop() caller gets exclusive write access, except that they
       must modify ID first and do smp_wmb() later

   + pop() does not work:
     + tail node is "busy"
	+ needs a custom callback that defines when a node is busy
     + tail is the last node
	+ needed for lockless sequential numbering

I will start with one inevitable question ;-) Is it realistic to find
another user for this API, please?

I am not sure that all the indirections, caused by the generic API,
are worth the gain.


Well, the separate API makes sense anyway. I have some ideas that
might make it cleaner.

The barriers are because of validating the ID. Now we have:

	struct nl_node {
		unsigned long	seq;
		unsigned long	next_id;
	};

that is used in:

	struct prb_desc {
		/* private */
		atomic_long_t		id;
		struct dr_desc		desc;
		struct nl_node		list;
	};

What will happen when we move id from struct prb_desc into struct nl_node?

	struct nl_node {
		unsigned long	seq;
		atomic_long_t	id;
		unsigned long	next_id;
	};

	struct prb_desc {
		struct dr_desc		desc;
		struct nl_node		list;
	};


Then the "node" callback might just return the structure. It makes
perfect sense. struct nl_node is always static for a given id.

For the printk ringbuffer it would look like:

struct nl_node *prb_nl_get_node(unsigned long id, void *nl_user)
{
	struct printk_ringbuffer *rb = (struct printk_ringbuffer *)nl_user;
	struct prb_desc *d = to_desc(rb, id);

	return &d->list;
}

I would also hide the callback behind a generic wrapper:

struct nl_node *numlist_get_node(struct numlist *nl, unsigned long id)
{
	return nl->get_node(id, nl->user_data);
}


Then we could have nicely symetric and self contained barriers
in numlist_read():

bool numlist_read(struct numlist *nl, unsigned long id, unsigned long *seq,
		  unsigned long *next_id)
{
	struct nl_node *n;
	unsigned long cur_id;

	n = numlist_get_node(nl, id);
	if (!n)
		return false;

	/*
	 * Make sure that seq and next_id values will be read
	 * for the expected id.
	 */
	cur_id = atomic_long_read_acquire(&n->id);
	if (cur_id != id)
		return false;

	if (seq) {
		*seq = n->seq;

	if (next_id)
		*next_id = n->next_id;
	}

	/*
	 * Make sure that seq and next_id values were read for
	 * the expected ID.
	 */
	cur_id = atomic_long_read_release(&n->id);

	return cur_id == id;
}

numlist_push() might be the same, except the I would
remove several WRITE_ONCE as discussed in another mail:

void numlist_push(struct numlist *nl, struct nl_node *n)
{
	unsigned long head_id;
	unsigned long seq;
	unsigned long r;

	/* Setup the node to be a list terminator: next_id == id. */
	n->next_id = n->id;

	do {
		do {
			head_id = atomic_long_read(&nl->head_id);
		} while (!numlist_read(nl, head_id, &seq, NULL));

		n->seq = seq + 1;

		/*
		 * This store_release() guarantees that @seq and @next are
		 * stored before the node with @id is visible to any popping
		 * writers.
		 * 
		 * It pairs with the acquire() when tail_id gets updated
		 * in headlist_pop();
		 */
	} while (atomic_long_cmpxchg_release(&nl->head_id, head_id, id) !=
			head_id);

	n = nl->get_node(nl, head_id);

	/*
	 * This barrier makes sure that nl->head_id already points to
	 * the newly pushed node.
	 *
	 * It pairs with acquire when new id is written in numlist_pop().
	 * It allows to pop() and reuse this node. It can not longer
	 * be the last one.
	 */
	smp_store_release(&n->next_id, id);
}

Then I would add a symetric callback that would generate ID for
a newly popped struct. It will allow to set new ID in the numlist
API and have the barriers symetric. Something like:

unsined long prb_new_node_id(unsigned long old_id, , void *nl_user)
{
	struct printk_ringbuffer *rb = (struct printk_ringbuffer *)nl_user;

	return id + DESCS_COUNT(rb);
}

Then we could hide it in

unsigned long numlist_get_new_id(struct numlist *nl, unsigned long id)
{
	return nl->get_new_id(id, nl->user_data);
}

and do

struct nl_node *numlist_pop(struct numlist *nl)
{
	struct nl_node *n;
	unsigned long tail_id;
	unsigned long next_id;
	unsigned long r;

	tail_id = atomic_long_read(&nl->tail_id);

	do {
		do {
			tail_id = atomic_long_read(&nl->tail_id);
		} while (!numlist_read(nl, tail_id, NULL, &next_id));

		/* Make sure the node is not the only node on the list. */
		if (next_id == tail_id)
			return NULL;

		/* Make sure the node is not busy. */
		if (nl->busy(tail_id, nl->busy_arg))
			return NULL;

		/*
		 * Make sure that nl->tail_id is update before
		 * we start modyfying the popped node.
		 *
		 * It pairs with release() when head_id is
		 * pushed in numlist_push().
		 */
	} while (atomic_long_cmpxchg_acquire(&nl->tail_id,
					tail_id, next_id) !=
		  tail_id);

	/* Got exclusive write access to the node. */
	n = numlist_get_node(nl, tail_id);

	tail_id = numlist_get_new_id(tail_id, nl);
	/*
	 * Make sure that we set new ID before we allow
	 * more changes in user structure handled by this node.
	 *
	 * It pairs with release() barrier when the node is
	 * pushed into the numlist again, gets linked to
	 * the previous node and can't be modified anymore.
	 * See numlist_push().
	 */
	atomic_long_set_acquire(&d->id, atomic_long_read(&d->id) +
				DESCS_COUNT(rb));

	return n;
}

I hope that it makes some sense. I feel exhausted. It is Friday
evening here. I just wanted to send it because it looked like the most
constructive idea that I had this week. And I wanted to send something
more positive ;-)

Best Regards,
Petr

  parent reply	other threads:[~2019-08-23 17:18 UTC|newest]

Thread overview: 134+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-07 22:26 [RFC PATCH v4 0/9] printk: new ringbuffer implementation John Ogness
2019-08-07 22:26 ` [RFC PATCH v4 1/9] printk-rb: add a new printk " John Ogness
2019-08-20  8:15   ` numlist_pop(): " Petr Mladek
2019-08-21  5:41     ` John Ogness
2019-09-04 12:19     ` Peter Zijlstra
2019-08-20  8:22   ` assign_desc() barriers: " Petr Mladek
2019-08-20 14:14     ` Petr Mladek
2019-08-21  5:52       ` John Ogness
2019-08-22 11:53         ` Petr Mladek
2019-08-25  2:06           ` John Ogness
2019-08-26  8:21             ` John Ogness
2019-08-20  8:55   ` comments style: " Petr Mladek
2019-08-20  9:27     ` Sergey Senozhatsky
2019-08-21  5:46       ` John Ogness
2019-08-22 13:50         ` Petr Mladek
2019-08-22 17:38           ` Andrea Parri
2019-08-23 10:47             ` Petr Mladek
2019-08-23 14:27               ` Andrea Parri
2019-08-23  9:49           ` Sergey Senozhatsky
2019-08-23  5:54         ` Sergey Senozhatsky
2019-08-23 10:29           ` Petr Mladek
2019-08-21  5:42     ` John Ogness
2019-08-22 12:44       ` Petr Mladek
2019-08-20 13:50   ` dataring_push() barriers " Petr Mladek
2019-08-25  2:42     ` John Ogness
2019-08-27 14:36       ` Petr Mladek
2019-08-28 13:43         ` John Ogness
2019-08-20 15:12   ` datablock reuse races " Petr Mladek
2019-08-23  9:21   ` numlist_push() barriers " Petr Mladek
2019-08-26  8:34     ` Andrea Parri
2019-08-26  8:43       ` Andrea Parri
2019-08-26 14:10       ` Petr Mladek
2019-08-26 16:01         ` Andrea Parri
2019-08-26 22:36     ` John Ogness
2019-08-27  7:40       ` Petr Mladek
2019-08-27 14:28         ` John Ogness
2019-08-27 15:07           ` Petr Mladek
2019-08-28 10:24             ` John Ogness
2019-08-23 17:18   ` Petr Mladek [this message]
2019-08-26 23:57     ` numlist API " John Ogness
2019-08-27 13:03       ` Petr Mladek
2019-08-28  7:13         ` John Ogness
2019-08-28  8:58           ` Petr Mladek
2019-08-28 14:03             ` John Ogness
2019-08-29 11:28               ` Petr Mladek
2019-09-03  7:58         ` Sergey Senozhatsky
2019-08-30 14:48   ` dataring " Petr Mladek
2019-08-07 22:26 ` [RFC PATCH v4 2/9] printk-rb: add test module John Ogness
2019-08-07 22:26 ` [RFC PATCH v4 3/9] printk-rb: fix missing includes/exports John Ogness
2019-08-07 22:26 ` [RFC PATCH v4 4/9] printk-rb: initialize new descriptors as invalid John Ogness
2019-08-20  9:23   ` Petr Mladek
2019-08-20 10:16     ` Sergey Senozhatsky
2019-08-21  5:56     ` John Ogness
2019-08-07 22:26 ` [RFC PATCH v4 5/9] printk-rb: remove extra data buffer size allocation John Ogness
2019-08-07 22:26 ` [RFC PATCH v4 6/9] printk-rb: adjust test module ringbuffer sizes John Ogness
2019-08-19 21:29   ` [PATCH] printk-rb: fix test module macro usage John Ogness
2019-08-07 22:26 ` [RFC PATCH v4 7/9] printk-rb: increase size of seq and size variables John Ogness
2019-08-07 22:26 ` [RFC PATCH v4 8/9] printk-rb: new functionality to support printk John Ogness
2019-08-20  9:59   ` Sergey Senozhatsky
2019-08-21  5:47     ` John Ogness
2019-08-07 22:26 ` [RFC PATCH v4 9/9] printk: use a new ringbuffer implementation John Ogness
2019-08-08 19:07   ` Linus Torvalds
2019-08-08 22:55     ` John Ogness
2019-08-08 23:33       ` Linus Torvalds
2019-08-08 23:45         ` Steven Rostedt
2019-08-09  0:21           ` Linus Torvalds
2019-08-09  0:48             ` Steven Rostedt
2019-08-09  1:15               ` Linus Torvalds
2019-08-09 11:15                 ` Thomas Gleixner
2019-08-09 16:00                   ` Linus Torvalds
2019-08-09 20:07                     ` Thomas Gleixner
2019-08-09 20:20                       ` Linus Torvalds
2019-08-09  6:14     ` Peter Zijlstra
2019-08-09  7:08       ` John Ogness
2019-08-09 15:57       ` Linus Torvalds
2019-08-10  5:53         ` Thomas Gleixner
2019-09-10  3:19           ` Sergey Senozhatsky
2019-08-12  9:54       ` Geert Uytterhoeven
2019-08-16  5:46   ` Dave Young
2019-08-16  5:46     ` Dave Young
2019-08-16  5:54     ` Dave Young
2019-08-16  5:54       ` Dave Young
2019-08-16  9:40     ` John Ogness
2019-08-16  9:40       ` John Ogness
2019-09-04 12:35 ` [RFC PATCH v4 0/9] printk: " Peter Zijlstra
2019-09-05 13:05   ` Petr Mladek
2019-09-05 14:31     ` Peter Zijlstra
2019-09-05 15:38       ` Thomas Gleixner
2019-09-05 16:11         ` Steven Rostedt
2019-09-05 21:10           ` John Ogness
2019-09-06  9:39           ` Petr Mladek
2019-09-09 14:11           ` printk meeting at LPC Thomas Gleixner
2019-09-13 13:26             ` John Ogness
2019-09-13 14:48               ` Daniel Vetter
2019-09-15 13:47                 ` John Ogness
2019-09-16  8:44                   ` Daniel Vetter
2019-09-16  4:30               ` Tetsuo Handa
2019-09-16 10:46                 ` Petr Mladek
2019-09-16 13:43                   ` Steven Rostedt
2019-09-16 14:28                     ` John Ogness
2019-09-17  8:11                       ` Petr Mladek
2019-09-17  7:52                     ` Petr Mladek
2019-09-17 13:02                       ` Steven Rostedt
2019-09-17 13:12                         ` Greg Kroah-Hartman
2019-09-17 13:37                           ` Steven Rostedt
2019-09-17 14:08                             ` Tetsuo Handa
2019-09-17  7:51                   ` Sergey Senozhatsky
2019-09-18  1:25               ` Sergey Senozhatsky
2019-09-18  2:08                 ` Steven Rostedt
2019-09-18  2:36                   ` Sergey Senozhatsky
2019-09-18  5:19                     ` Sergey Senozhatsky
2019-09-18  7:42                       ` John Ogness
2019-09-18  8:10                         ` Sergey Senozhatsky
2019-09-18  9:05                           ` John Ogness
2019-09-18  9:11                             ` Sergey Senozhatsky
2019-09-18 16:41                             ` Petr Mladek
2019-09-18 16:48                               ` Steven Rostedt
2019-09-24 14:24                                 ` Petr Mladek
2019-09-19  8:06                         ` Daniel Vetter
2019-09-18  7:33                     ` John Ogness
2019-09-18  8:08                       ` Sergey Senozhatsky
2019-10-04 14:48               ` Tony Asleson
2019-10-07 12:01                 ` Petr Mladek
2019-09-06  9:06       ` [RFC PATCH v4 0/9] printk: new ringbuffer implementation Peter Zijlstra
2019-09-06 10:09         ` Sergey Senozhatsky
2019-09-06 10:49           ` Peter Zijlstra
2019-09-06 13:44             ` Sergey Senozhatsky
2019-09-06 12:42         ` Petr Mladek
2019-09-06 14:01           ` Peter Zijlstra
2019-09-06 14:22             ` Peter Zijlstra
2019-09-06 19:53             ` Sergey Senozhatsky
2019-09-06 22:47             ` John Ogness
2019-09-08 22:18             ` Peter Zijlstra
2019-09-10  3:22             ` Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190823171802.eo2chwyktibeub7a@pathway.suse.cz \
    --to=pmladek@suse.com \
    --cc=andrea.parri@amarulasolutions.com \
    --cc=brendanhiggins@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=john.ogness@linutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=sergey.senozhatsky@gmail.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.