All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ahmed S. Darwish" <a.darwish@linutronix.de>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: John Hubbard <jhubbard@nvidia.com>,
	Jason Gunthorpe <jgg@nvidia.com>, Peter Xu <peterx@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Christoph Hellwig <hch@lst.de>, Hugh Dickins <hughd@google.com>,
	Jan Kara <jack@suse.cz>, Jann Horn <jannh@google.com>,
	Kirill Shutemov <kirill@shutemov.name>,
	Kirill Tkhai <ktkhai@virtuozzo.com>,
	Leon Romanovsky <leonro@nvidia.com>,
	Linux-MM <linux-mm@kvack.org>, Michal Hocko <mhocko@suse.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Sebastian Siewior <bigeasy@linutronix.de>
Subject: Re: [PATCH v2 2/2] mm: prevent gup_fast from racing with COW during fork
Date: Wed, 4 Nov 2020 02:32:12 +0100	[thread overview]
Message-ID: <20201104013212.GA82153@lx-t490> (raw)
In-Reply-To: <CAHk-=wgB8nyOQufpn0o6a5BpJCJPnXvH+kRxApujhsgG+7qAwQ@mail.gmail.com>

On Tue, Nov 03, 2020 at 09:40:22AM -0800, Linus Torvalds wrote:
> On Mon, Nov 2, 2020 at 10:52 PM Ahmed S. Darwish
> <a.darwish@linutronix.de> wrote:
> >
> > The problem is, I've already documented seqlock.h to death.... There are
> > more comments than code in there, and there is "seqlock.rst" under
> > Documentation/ to further describe the big picture.
>
> Well, honestly, I think the correct thing to do is to get rid of the
> *_seqcount_t_*() functions entirely.
>
> They add nothing but confusion, and they are entirely misnamed. That's
> not the pattern we use for "internal use only" functions, and they are
> *very* confusing.
>

I see. Would the enclosed patch #1 be OK? It basically uses the "__do_"
prefix instead, with some rationale.

>
> They have other issues too: like raw_write_seqcount_end() not being
> usable on its own when preemptibility isn't an issue like here. You
> basically _have_ to use raw_write_seqcount_t_end(), because otherwise
> it tries to re-enable preemption that was never there.
>

Hmmm, raw_write_seqcount_{begin,end}() *never* disable/enable preemption
for plain seqcount_t. This is why I kept recommending those for this
patch series instead of internal raw_write_seqcount_*t*_{begin,end}().

But..... given that multiple people made the exact same remark by now, I
guess that's due to:

#define raw_write_seqcount_begin(s)		\
do {						\
	if (__seqcount_lock_preemptible(s))	\
		preempt_disable();		\
						\
	...					\
} while (0);

#define raw_write_seqcount_end(s)		\
do {						\
	...					\
						\
	if (__seqcount_lock_preemptible(s))	\
		preempt_enable();		\
} while (0);

The tricky part is that __seqcount_lock_preemptible() is always false
for plain "seqcount_t".  With that data type, the _Generic() selection
makes it resolve to __seqprop_preemptible(), which just returns false.

Originally, __seqcount_lock_preemptible() was called:

  __seqcount_associated_lock_exists_and_is_preemptible()

but it got transformed to its current short form in the process of some
pre-mainline refactorings. Looking at it now after all the dust has
settled, maybe the verbose form was much more clear.

Please see the enclosed patch #2... Would that be OK too?

(I will submit these two patches in their own thread after some common
 ground is reached.)

Patches
-------

====>
====> patch #1:
====>

Subject: [RFC][PATCH 1/2] seqlock: Use __do_ prefix instead of non-standed
 _seqcount_t_ marker

The use of "*_seqcount_t_*" as a marker to denote internal seqlock.h
functions taking only plain seqcount_t instead of the whole
seqcount_LOCKNAME_t family is confusing users, as it's also not the
standard kernel pattern for denoting header file internal functions.

Use the __do_ prefix instead.

Note, a plain "__" prefix is not used since seqlock.h already uses it
for some of its exported functions; e.g. __read_seqcount_begin() and
__read_seqcount_retry().

Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Reported-by: John Hubbard <jhubbard@nvidia.com>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/CAHk-=wgB8nyOQufpn0o6a5BpJCJPnXvH+kRxApujhsgG+7qAwQ@mail.gmail.com
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 include/linux/seqlock.h | 62 ++++++++++++++++++++---------------------
 1 file changed, 31 insertions(+), 31 deletions(-)

diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index cbfc78b92b65..5de043841d33 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -425,9 +425,9 @@ SEQCOUNT_LOCKNAME(ww_mutex,     struct ww_mutex, true,     &s->lock->base, ww_mu
  * Return: true if a read section retry is required, else false
  */
 #define __read_seqcount_retry(s, start)					\
-	__read_seqcount_t_retry(__seqcount_ptr(s), start)
+	__do___read_seqcount_retry(__seqcount_ptr(s), start)

-static inline int __read_seqcount_t_retry(const seqcount_t *s, unsigned start)
+static inline int __do___read_seqcount_retry(const seqcount_t *s, unsigned start)
 {
 	kcsan_atomic_next(0);
 	return unlikely(READ_ONCE(s->sequence) != start);
@@ -445,12 +445,12 @@ static inline int __read_seqcount_t_retry(const seqcount_t *s, unsigned start)
  * Return: true if a read section retry is required, else false
  */
 #define read_seqcount_retry(s, start)					\
-	read_seqcount_t_retry(__seqcount_ptr(s), start)
+	__do_read_seqcount_retry(__seqcount_ptr(s), start)

-static inline int read_seqcount_t_retry(const seqcount_t *s, unsigned start)
+static inline int __do_read_seqcount_retry(const seqcount_t *s, unsigned start)
 {
 	smp_rmb();
-	return __read_seqcount_t_retry(s, start);
+	return __do___read_seqcount_retry(s, start);
 }

 /**
@@ -462,10 +462,10 @@ do {									\
 	if (__seqcount_lock_preemptible(s))				\
 		preempt_disable();					\
 									\
-	raw_write_seqcount_t_begin(__seqcount_ptr(s));			\
+	__do_raw_write_seqcount_begin(__seqcount_ptr(s));		\
 } while (0)

-static inline void raw_write_seqcount_t_begin(seqcount_t *s)
+static inline void __do_raw_write_seqcount_begin(seqcount_t *s)
 {
 	kcsan_nestable_atomic_begin();
 	s->sequence++;
@@ -478,13 +478,13 @@ static inline void raw_write_seqcount_t_begin(seqcount_t *s)
  */
 #define raw_write_seqcount_end(s)					\
 do {									\
-	raw_write_seqcount_t_end(__seqcount_ptr(s));			\
+	__do_raw_write_seqcount_end(__seqcount_ptr(s));			\
 									\
 	if (__seqcount_lock_preemptible(s))				\
 		preempt_enable();					\
 } while (0)

-static inline void raw_write_seqcount_t_end(seqcount_t *s)
+static inline void __do_raw_write_seqcount_end(seqcount_t *s)
 {
 	smp_wmb();
 	s->sequence++;
@@ -506,12 +506,12 @@ do {									\
 	if (__seqcount_lock_preemptible(s))				\
 		preempt_disable();					\
 									\
-	write_seqcount_t_begin_nested(__seqcount_ptr(s), subclass);	\
+	__do_write_seqcount_begin_nested(__seqcount_ptr(s), subclass);	\
 } while (0)

-static inline void write_seqcount_t_begin_nested(seqcount_t *s, int subclass)
+static inline void __do_write_seqcount_begin_nested(seqcount_t *s, int subclass)
 {
-	raw_write_seqcount_t_begin(s);
+	__do_raw_write_seqcount_begin(s);
 	seqcount_acquire(&s->dep_map, subclass, 0, _RET_IP_);
 }

@@ -533,12 +533,12 @@ do {									\
 	if (__seqcount_lock_preemptible(s))				\
 		preempt_disable();					\
 									\
-	write_seqcount_t_begin(__seqcount_ptr(s));			\
+	__do_write_seqcount_begin(__seqcount_ptr(s));			\
 } while (0)

-static inline void write_seqcount_t_begin(seqcount_t *s)
+static inline void __do_write_seqcount_begin(seqcount_t *s)
 {
-	write_seqcount_t_begin_nested(s, 0);
+	__do_write_seqcount_begin_nested(s, 0);
 }

 /**
@@ -549,16 +549,16 @@ static inline void write_seqcount_t_begin(seqcount_t *s)
  */
 #define write_seqcount_end(s)						\
 do {									\
-	write_seqcount_t_end(__seqcount_ptr(s));			\
+	__do_write_seqcount_end(__seqcount_ptr(s));			\
 									\
 	if (__seqcount_lock_preemptible(s))				\
 		preempt_enable();					\
 } while (0)

-static inline void write_seqcount_t_end(seqcount_t *s)
+static inline void __do_write_seqcount_end(seqcount_t *s)
 {
 	seqcount_release(&s->dep_map, _RET_IP_);
-	raw_write_seqcount_t_end(s);
+	__do_raw_write_seqcount_end(s);
 }

 /**
@@ -603,9 +603,9 @@ static inline void write_seqcount_t_end(seqcount_t *s)
  *      }
  */
 #define raw_write_seqcount_barrier(s)					\
-	raw_write_seqcount_t_barrier(__seqcount_ptr(s))
+	__do_raw_write_seqcount_barrier(__seqcount_ptr(s))

-static inline void raw_write_seqcount_t_barrier(seqcount_t *s)
+static inline void __do_raw_write_seqcount_barrier(seqcount_t *s)
 {
 	kcsan_nestable_atomic_begin();
 	s->sequence++;
@@ -623,9 +623,9 @@ static inline void raw_write_seqcount_t_barrier(seqcount_t *s)
  * will complete successfully and see data older than this.
  */
 #define write_seqcount_invalidate(s)					\
-	write_seqcount_t_invalidate(__seqcount_ptr(s))
+	__do_write_seqcount_invalidate(__seqcount_ptr(s))

-static inline void write_seqcount_t_invalidate(seqcount_t *s)
+static inline void __do_write_seqcount_invalidate(seqcount_t *s)
 {
 	smp_wmb();
 	kcsan_nestable_atomic_begin();
@@ -865,7 +865,7 @@ static inline unsigned read_seqretry(const seqlock_t *sl, unsigned start)
 }

 /*
- * For all seqlock_t write side functions, use write_seqcount_*t*_begin()
+ * For all seqlock_t write side functions, use __do_write_seqcount_begin()
  * instead of the generic write_seqcount_begin(). This way, no redundant
  * lockdep_assert_held() checks are added.
  */
@@ -886,7 +886,7 @@ static inline unsigned read_seqretry(const seqlock_t *sl, unsigned start)
 static inline void write_seqlock(seqlock_t *sl)
 {
 	spin_lock(&sl->lock);
-	write_seqcount_t_begin(&sl->seqcount.seqcount);
+	__do_write_seqcount_begin(&sl->seqcount.seqcount);
 }

 /**
@@ -898,7 +898,7 @@ static inline void write_seqlock(seqlock_t *sl)
  */
 static inline void write_sequnlock(seqlock_t *sl)
 {
-	write_seqcount_t_end(&sl->seqcount.seqcount);
+	__do_write_seqcount_end(&sl->seqcount.seqcount);
 	spin_unlock(&sl->lock);
 }

@@ -912,7 +912,7 @@ static inline void write_sequnlock(seqlock_t *sl)
 static inline void write_seqlock_bh(seqlock_t *sl)
 {
 	spin_lock_bh(&sl->lock);
-	write_seqcount_t_begin(&sl->seqcount.seqcount);
+	__do_write_seqcount_begin(&sl->seqcount.seqcount);
 }

 /**
@@ -925,7 +925,7 @@ static inline void write_seqlock_bh(seqlock_t *sl)
  */
 static inline void write_sequnlock_bh(seqlock_t *sl)
 {
-	write_seqcount_t_end(&sl->seqcount.seqcount);
+	__do_write_seqcount_end(&sl->seqcount.seqcount);
 	spin_unlock_bh(&sl->lock);
 }

@@ -939,7 +939,7 @@ static inline void write_sequnlock_bh(seqlock_t *sl)
 static inline void write_seqlock_irq(seqlock_t *sl)
 {
 	spin_lock_irq(&sl->lock);
-	write_seqcount_t_begin(&sl->seqcount.seqcount);
+	__do_write_seqcount_begin(&sl->seqcount.seqcount);
 }

 /**
@@ -951,7 +951,7 @@ static inline void write_seqlock_irq(seqlock_t *sl)
  */
 static inline void write_sequnlock_irq(seqlock_t *sl)
 {
-	write_seqcount_t_end(&sl->seqcount.seqcount);
+	__do_write_seqcount_end(&sl->seqcount.seqcount);
 	spin_unlock_irq(&sl->lock);
 }

@@ -960,7 +960,7 @@ static inline unsigned long __write_seqlock_irqsave(seqlock_t *sl)
 	unsigned long flags;

 	spin_lock_irqsave(&sl->lock, flags);
-	write_seqcount_t_begin(&sl->seqcount.seqcount);
+	__do_write_seqcount_begin(&sl->seqcount.seqcount);
 	return flags;
 }

@@ -989,7 +989,7 @@ static inline unsigned long __write_seqlock_irqsave(seqlock_t *sl)
 static inline void
 write_sequnlock_irqrestore(seqlock_t *sl, unsigned long flags)
 {
-	write_seqcount_t_end(&sl->seqcount.seqcount);
+	__do_write_seqcount_end(&sl->seqcount.seqcount);
 	spin_unlock_irqrestore(&sl->lock, flags);
 }

====>
====> patch #2:
====>

Subject: [PATCH 2/2] seqlock: seqcount_LOCKAME_t: Use more verbose macro names

As evidenced by multiple discussions over LKML so far, it's not clear
that __seqcount_lock_preemptible() is always false for plain seqcount_t.
For that data type, the _Generic() selection resolves to
__seqprop_preemptible(), which just returns false.

Use __seqcount_associated_lock_exists_and_is_preemptible() instead,
which hints that "preemptibility" is for the associated write
serialization lock (if any), not for the seqcount itself.

Similarly, rename __seqcount_assert_lock_held() to
__seqcount_assert_associated_lock_held().

Link: https://lkml.kernel.org/r/CAHk-=wgB8nyOQufpn0o6a5BpJCJPnXvH+kRxApujhsgG+7qAwQ@mail.gmail.com
Link: https://lkml.kernel.org/r/20201030235121.GQ2620339@nvidia.com
Link: https://lkml.kernel.org/r/20201103170327.GJ20600@xz-x1
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 include/linux/seqlock.h | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index 5de043841d33..eb1e5a822e44 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -307,10 +307,10 @@ SEQCOUNT_LOCKNAME(ww_mutex,     struct ww_mutex, true,     &s->lock->base, ww_mu
 	__seqprop_case((s),	mutex,		prop),			\
 	__seqprop_case((s),	ww_mutex,	prop))

-#define __seqcount_ptr(s)		__seqprop(s, ptr)
-#define __seqcount_sequence(s)		__seqprop(s, sequence)
-#define __seqcount_lock_preemptible(s)	__seqprop(s, preemptible)
-#define __seqcount_assert_lock_held(s)	__seqprop(s, assert)
+#define __seqcount_ptr(s)					__seqprop(s, ptr)
+#define __seqcount_sequence(s)					__seqprop(s, sequence)
+#define __seqcount_associated_lock_exists_and_is_preemptible(s)	__seqprop(s, preemptible)
+#define __seqcount_assert_associated_lock_held(s)		__seqprop(s, assert)

 /**
  * __read_seqcount_begin() - begin a seqcount_t read section w/o barrier
@@ -459,7 +459,7 @@ static inline int __do_read_seqcount_retry(const seqcount_t *s, unsigned start)
  */
 #define raw_write_seqcount_begin(s)					\
 do {									\
-	if (__seqcount_lock_preemptible(s))				\
+	if (__seqcount_associated_lock_exists_and_is_preemptible(s))	\
 		preempt_disable();					\
 									\
 	__do_raw_write_seqcount_begin(__seqcount_ptr(s));		\
@@ -480,7 +480,7 @@ static inline void __do_raw_write_seqcount_begin(seqcount_t *s)
 do {									\
 	__do_raw_write_seqcount_end(__seqcount_ptr(s));			\
 									\
-	if (__seqcount_lock_preemptible(s))				\
+	if (__seqcount_associated_lock_exists_and_is_preemptible(s))	\
 		preempt_enable();					\
 } while (0)

@@ -501,9 +501,9 @@ static inline void __do_raw_write_seqcount_end(seqcount_t *s)
  */
 #define write_seqcount_begin_nested(s, subclass)			\
 do {									\
-	__seqcount_assert_lock_held(s);					\
+	__seqcount_assert_associated_lock_held(s);			\
 									\
-	if (__seqcount_lock_preemptible(s))				\
+	if (__seqcount_associated_lock_exists_and_is_preemptible(s))	\
 		preempt_disable();					\
 									\
 	__do_write_seqcount_begin_nested(__seqcount_ptr(s), subclass);	\
@@ -528,9 +528,9 @@ static inline void __do_write_seqcount_begin_nested(seqcount_t *s, int subclass)
  */
 #define write_seqcount_begin(s)						\
 do {									\
-	__seqcount_assert_lock_held(s);					\
+	__seqcount_assert_associated_lock_held(s);			\
 									\
-	if (__seqcount_lock_preemptible(s))				\
+	if (__seqcount_associated_lock_exists_and_is_preemptible(s))	\
 		preempt_disable();					\
 									\
 	__do_write_seqcount_begin(__seqcount_ptr(s));			\
@@ -551,7 +551,7 @@ static inline void __do_write_seqcount_begin(seqcount_t *s)
 do {									\
 	__do_write_seqcount_end(__seqcount_ptr(s));			\
 									\
-	if (__seqcount_lock_preemptible(s))				\
+	if (__seqcount_associated_lock_exists_and_is_preemptible(s))	\
 		preempt_enable();					\
 } while (0)

>                    Linus

Thanks,

--
Ahmed S. Darwish
Linutronix GmbH

  reply	other threads:[~2020-11-04  1:32 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-30 14:46 [PATCH v2 0/2] Add a seqcount between gup_fast and copy_page_range() Jason Gunthorpe
2020-10-30 14:46 ` [PATCH v2 1/2] mm: reorganize internal_get_user_pages_fast() Jason Gunthorpe
2020-10-30 16:29   ` Jan Kara
2020-10-30 21:31   ` John Hubbard
2020-10-30 22:36   ` Peter Xu
2020-10-30 14:46 ` [PATCH v2 2/2] mm: prevent gup_fast from racing with COW during fork Jason Gunthorpe
2020-10-30 16:51   ` Jan Kara
2020-10-30 17:02     ` Jason Gunthorpe
2020-11-02  8:31       ` Jan Kara
2020-10-30 21:20   ` John Hubbard
2020-10-30 22:52   ` Peter Xu
2020-10-30 23:51     ` Jason Gunthorpe
2020-10-31 15:26       ` Peter Xu
2020-11-03  0:33         ` Ahmed S. Darwish
2020-11-03  0:17       ` Ahmed S. Darwish
2020-11-03  0:25         ` Jason Gunthorpe
2020-11-03  0:41           ` Ahmed S. Darwish
2020-11-03  2:20             ` John Hubbard
2020-11-03  6:52               ` Ahmed S. Darwish
2020-11-03  7:05                 ` John Hubbard
2020-11-03 17:40                 ` Linus Torvalds
2020-11-03 17:40                   ` Linus Torvalds
2020-11-04  1:32                   ` Ahmed S. Darwish [this message]
2020-11-04  2:01                     ` John Hubbard
2020-11-04  3:17                       ` Ahmed S. Darwish
2020-11-04 18:38                         ` Linus Torvalds
2020-11-04 18:38                           ` Linus Torvalds
2020-11-04 19:54                           ` Ahmed S. Darwish
2020-12-09 18:38                           ` [tip: locking/core] seqlock: Prefix internal seqcount_t-only macros with a "do_" tip-bot2 for Ahmed S. Darwish
2020-12-09 18:38                           ` [tip: locking/core] seqlock: kernel-doc: Specify when preemption is automatically altered tip-bot2 for Ahmed S. Darwish
2020-11-10 11:53                         ` [PATCH v2 2/2] mm: prevent gup_fast from racing with COW during fork Peter Zijlstra
2020-12-03 10:35                           ` [tip: locking/core] seqlock: Rename __seqprop() users tip-bot2 for Peter Zijlstra
2020-11-03 17:03         ` [PATCH v2 2/2] mm: prevent gup_fast from racing with COW during fork Peter Xu
2020-11-02 23:58   ` Ahmed S. Darwish
2020-11-02 22:19 ` [PATCH v2 0/2] Add a seqcount between gup_fast and copy_page_range() Ahmed S. Darwish
2020-11-02 22:39   ` Linus Torvalds
2020-11-02 22:39     ` Linus Torvalds
2020-11-02 23:18     ` Ahmed S. Darwish

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201104013212.GA82153@lx-t490 \
    --to=a.darwish@linutronix.de \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=bigeasy@linutronix.de \
    --cc=hch@lst.de \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=jannh@google.com \
    --cc=jgg@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=kirill@shutemov.name \
    --cc=ktkhai@virtuozzo.com \
    --cc=leonro@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=mingo@redhat.com \
    --cc=oleg@redhat.com \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.