linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: mingo@kernel.org, rusty@rustcorp.com.au,
	mathieu.desnoyers@efficios.com, oleg@redhat.com,
	paulmck@linux.vnet.ibm.com, torvalds@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, andi@firstfloor.org,
	rostedt@goodmis.org, tglx@linutronix.de, peterz@infradead.org,
	Andrea Arcangeli <aarcange@redhat.com>,
	David Woodhouse <David.Woodhouse@intel.com>,
	Rik van Riel <riel@redhat.com>,
	Michel Lespinasse <walken@google.com>
Subject: [PATCH v4 5/9] seqlock: Better document raw_write_seqcount_latch()
Date: Wed, 08 Apr 2015 18:48:18 +0200	[thread overview]
Message-ID: <20150408170044.056800430@infradead.org> (raw)
In-Reply-To: 20150408164813.810874878@infradead.org

[-- Attachment #1: peterz-latch-comment.patch --]
[-- Type: text/plain, Size: 4960 bytes --]

Improve the documentation of the latch technique as used in the
current timekeeping code, such that it can be readily employed
elsewhere.

Borrow from the comments in timekeeping and replace those with a
reference to this more generic comment.

Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/seqlock.h   |   76 +++++++++++++++++++++++++++++++++++++++++++++-
 kernel/time/timekeeping.c |   27 ----------------
 2 files changed, 76 insertions(+), 27 deletions(-)

--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -233,9 +233,83 @@ static inline void raw_write_seqcount_en
 	s->sequence++;
 }
 
-/*
+/**
  * raw_write_seqcount_latch - redirect readers to even/odd copy
  * @s: pointer to seqcount_t
+ *
+ * The latch technique is a multiversion concurrency control method that allows
+ * queries during non atomic modifications. If you can guarantee queries never
+ * interrupt the modification -- e.g. the concurrency is strictly between CPUs
+ * -- you most likely do not need this.
+ *
+ * Where the traditional RCU/lockless data structures rely on atomic
+ * modifications to ensure queries observe either the old or the new state the
+ * latch allows the same for non atomic updates. The trade-off is doubling the
+ * cost of storage; we have to maintain two copies of the entire data
+ * structure.
+ *
+ * Very simply put: we first modify one copy and then the other. This ensures
+ * there is always one copy in a stable state, ready to give us an answer.
+ *
+ * The basic form is a data structure like:
+ *
+ * struct latch_struct {
+ *	seqcount_t		seq;
+ *	struct data_struct	data[2];
+ * };
+ *
+ * Where a modification, which is assumed to be externally serialized, does the
+ * following:
+ *
+ * void latch_modify(struct latch_struct *latch, ...)
+ * {
+ *	smp_wmb();	<- Ensure that the last data[1] update is visible
+ *	latch->seq++;
+ *	smp_wmb();	<- Ensure that the seqcount update is visible
+ *
+ *	modify(latch->data[0], ...);
+ *
+ *	smp_wmb();	<- Ensure that the data[0] update is visible
+ *	latch->seq++;
+ *	smp_wmb();	<- Ensure that the seqcount update is visible
+ *
+ *	modify(latch->data[1], ...);
+ * }
+ *
+ * The query will have a form like:
+ *
+ * struct entry *latch_query(struct latch_struct *latch, ...)
+ * {
+ *	struct entry *entry;
+ *	unsigned seq, idx;
+ *
+ *	do {
+ *		seq = latch->seq;
+ *		smp_rmb();
+ *
+ *		idx = seq & 0x01;
+ *		entry = data_query(latch->data[idx], ...);
+ *
+ *		smp_rmb();
+ *	} while (seq != latch->seq);
+ *
+ *	return entry;
+ * }
+ *
+ * So during the modification, queries are first redirected to data[1]. Then we
+ * modify data[0]. When that is complete, we redirect queries back to data[0]
+ * and we can modify data[1].
+ *
+ * NOTE: The non-requirement for atomic modifications does _NOT_ include
+ *       the publishing of new entries in the case where data is a dynamic
+ *       data structure.
+ *
+ *       An iteration might start in data[0] and get suspended long enough
+ *       to miss an entire modification sequence, once it resumes it might
+ *       observe the new entry.
+ *
+ * NOTE: When data is a dynamic data structure; one should use regular RCU
+ *       patterns to manage the lifetimes of the objects within.
  */
 static inline void raw_write_seqcount_latch(seqcount_t *s)
 {
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -339,32 +339,7 @@ static inline s64 timekeeping_get_ns_raw
  * We want to use this from any context including NMI and tracing /
  * instrumenting the timekeeping code itself.
  *
- * So we handle this differently than the other timekeeping accessor
- * functions which retry when the sequence count has changed. The
- * update side does:
- *
- * smp_wmb();	<- Ensure that the last base[1] update is visible
- * tkf->seq++;
- * smp_wmb();	<- Ensure that the seqcount update is visible
- * update(tkf->base[0], tkr);
- * smp_wmb();	<- Ensure that the base[0] update is visible
- * tkf->seq++;
- * smp_wmb();	<- Ensure that the seqcount update is visible
- * update(tkf->base[1], tkr);
- *
- * The reader side does:
- *
- * do {
- *	seq = tkf->seq;
- *	smp_rmb();
- *	idx = seq & 0x01;
- *	now = now(tkf->base[idx]);
- *	smp_rmb();
- * } while (seq != tkf->seq)
- *
- * As long as we update base[0] readers are forced off to
- * base[1]. Once base[0] is updated readers are redirected to base[0]
- * and the base[1] update takes place.
+ * Employ the latch technique; see @raw_write_seqcount_latch.
  *
  * So if a NMI hits the update of base[0] then it will use base[1]
  * which is still consistent. In the worst case this can result is a



  parent reply	other threads:[~2015-04-08 17:11 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-08 16:48 [PATCH v4 0/9] latched RB-trees and __module_address() Peter Zijlstra
2015-04-08 16:48 ` [PATCH v4 1/9] module: Sanitize RCU usage and locking Peter Zijlstra
2015-04-09  4:21   ` Lai Jiangshan
2015-04-09 12:36     ` Peter Zijlstra
2015-04-08 16:48 ` [PATCH v4 2/9] module: Annotate module version magic Peter Zijlstra
2015-04-08 16:48 ` [PATCH v4 3/9] module, jump_label: Fix module locking Peter Zijlstra
2015-04-08 16:48 ` [PATCH v4 4/9] rbtree: Make lockless searches non-fatal Peter Zijlstra
2015-04-08 16:48 ` Peter Zijlstra [this message]
2015-04-08 16:48 ` [PATCH v4 6/9] rbtree: Implement generic latch_tree Peter Zijlstra
2015-04-09  8:09   ` Lai Jiangshan
2015-04-09  8:14     ` Peter Zijlstra
2015-04-09  8:55       ` Lai Jiangshan
2015-04-09 12:13         ` Peter Zijlstra
2015-04-09 12:28           ` Peter Zijlstra
2015-04-09 16:31           ` Linus Torvalds
2015-04-09 16:59             ` Peter Zijlstra
2015-04-09 17:12               ` Linus Torvalds
2015-04-09 17:17                 ` Linus Torvalds
2015-04-08 16:48 ` [PATCH v4 7/9] module: Optimize __module_address() using a latched RB-tree Peter Zijlstra
2015-04-08 16:48 ` [PATCH v4 8/9] module: Make the mod_tree stuff conditional on PERF_EVENTS || TRACING Peter Zijlstra
2015-04-08 16:48 ` [PATCH v4 9/9] module: Use __module_address() for module_address_lookup() Peter Zijlstra
2015-04-08 22:08 ` [PATCH v4 0/9] latched RB-trees and __module_address() Andi Kleen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150408170044.056800430@infradead.org \
    --to=peterz@infradead.org \
    --cc=David.Woodhouse@intel.com \
    --cc=aarcange@redhat.com \
    --cc=andi@firstfloor.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=riel@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=rusty@rustcorp.com.au \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=walken@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).