All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: David Howells <dhowells@redhat.com>
Cc: Jan Kara <jack@suse.cz>, linux-nvdimm <linux-nvdimm@lists.01.org>,
	david <david@fromorbit.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ralf Baechle <ralf@linux-mips.org>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	Ingo Molnar <mingo@redhat.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [RFC][PATCH] sched/wait_bit: Introduce wait_var_event()/wake_up_var()
Date: Thu, 15 Mar 2018 12:51:46 +0100	[thread overview]
Message-ID: <20180315115146.GX4064@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <3396.1521107922@warthog.procyon.org.uk>

On Thu, Mar 15, 2018 at 09:58:42AM +0000, David Howells wrote:
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > > > Argh, no no no.. That whole wait_for_atomic_t thing is a giant
> > > > trainwreck already and now you're making it worse still.
> 
> Your patch description needs to say why this isn't a trainwreck when you
> consider wait_for_atomic_t() to be one since it does things in a very similar
> way.

Does the below address things sufficiently clear?

---

Subject: sched/wait: Introduce wait_var_event()
From: Peter Zijlstra <peterz@infradead.org>
Date: Thu Mar 15 11:40:33 CET 2018

As a replacement for the wait_on_atomic_t() API provide the
wait_var_event() API.

The wait_var_event() API is based on the very same hashed-waitqueue
idea, but doesn't care about the type (atomic_t) or the specific
condition (atomic_read() == 0). IOW. it's much more widely
applicable/flexible.

It shares all the benefits/disadvantages of a hashed-waitqueue
approach with the existing wait_on_atomic_t/wait_on_bit() APIs.

The API is modeled after the existing wait_event() API, but instead of
taking a wait_queue_head, it takes an address. This addresses is
hashed to obtain a wait_queue_head from the bit_wait_table.

Similar to the wait_event() API, it takes a condition expression as
second argument and will wait until this expression becomes true.

The following are (mostly) identical replacements:

	wait_on_atomic_t(&my_atomic, atomic_t_wait, TASK_UNINTERRUPTIBLE);
	wake_up_atomic_t(&my_atomic);

	wait_var_event(&my_atomic, !atomic_read(&my_atomic));
	wake_up_var(&my_atomic);

The only difference is that wake_up_var() is an unconditional wakeup
and doesn't check the previously hard-coded (atomic_read() == 0)
condition here. This is of little concequence, since most callers are
already conditional on atomic_dec_and_test() and the ones that are
not, are trivial to make so.

Cc: David Howells <dhowells@redhat.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/wait_bit.h |   70 +++++++++++++++++++++++++++++++++++++++++++++++
 kernel/sched/wait_bit.c  |   48 ++++++++++++++++++++++++++++++++
 2 files changed, 118 insertions(+)

--- a/include/linux/wait_bit.h
+++ b/include/linux/wait_bit.h
@@ -262,4 +262,74 @@ int wait_on_atomic_t(atomic_t *val, wait
 	return out_of_line_wait_on_atomic_t(val, action, mode);
 }
 
+extern void init_wait_var_entry(struct wait_bit_queue_entry *wbq_entry, void *var, int flags);
+extern void wake_up_var(void *var);
+extern wait_queue_head_t *__var_waitqueue(void *p);
+
+#define ___wait_var_event(var, condition, state, exclusive, ret, cmd)	\
+({									\
+	__label__ __out;						\
+	struct wait_queue_head *__wq_head = __var_waitqueue(var);	\
+	struct wait_bit_queue_entry __wbq_entry;			\
+	long __ret = ret; /* explicit shadow */				\
+									\
+	init_wait_var_entry(&__wbq_entry, var,				\
+			    exclusive ? WQ_FLAG_EXCLUSIVE : 0);		\
+	for (;;) {							\
+		long __int = prepare_to_wait_event(__wq_head,		\
+						   &__wbq_entry.wq_entry, \
+						   state);		\
+		if (condition)						\
+			break;						\
+									\
+		if (___wait_is_interruptible(state) && __int) {		\
+			__ret = __int;					\
+			goto __out;					\
+		}							\
+									\
+		cmd;							\
+	}								\
+	finish_wait(__wq_head, &__wbq_entry.wq_entry);			\
+__out:	__ret;								\
+})
+
+#define __wait_var_event(var, condition)				\
+	___wait_var_event(var, condition, TASK_UNINTERRUPTIBLE, 0, 0,	\
+			  schedule())
+
+#define wait_var_event(var, condition)					\
+do {									\
+	might_sleep();							\
+	if (condition)							\
+		break;							\
+	__wait_var_event(var, condition);				\
+} while (0)
+
+#define __wait_var_event_killable(var, condition)			\
+	___wait_var_event(var, condition, TASK_KILLABLE, 0, 0,		\
+			  schedule())
+
+#define wait_var_event_killable(var, condition)				\
+({									\
+	int __ret = 0;							\
+	might_sleep();							\
+	if (!(condition))						\
+		__ret = __wait_var_event_killable(var, condition);	\
+	__ret;								\
+})
+
+#define __wait_var_event_timeout(var, condition, timeout)		\
+	___wait_var_event(var, ___wait_cond_timeout(condition),		\
+			  TASK_UNINTERRUPTIBLE, 0, timeout,		\
+			  __ret = schedule_timeout(__ret))
+
+#define wait_var_event_timeout(var, condition, timeout)			\
+({									\
+	long __ret = timeout;						\
+	might_sleep();							\
+	if (!___wait_cond_timeout(condition))				\
+		__ret = __wait_var_event_timeout(var, condition, timeout); \
+	__ret;								\
+})
+
 #endif /* _LINUX_WAIT_BIT_H */
--- a/kernel/sched/wait_bit.c
+++ b/kernel/sched/wait_bit.c
@@ -149,6 +149,54 @@ void wake_up_bit(void *word, int bit)
 }
 EXPORT_SYMBOL(wake_up_bit);
 
+wait_queue_head_t *__var_waitqueue(void *p)
+{
+	if (BITS_PER_LONG == 64) {
+		unsigned long q = (unsigned long)p;
+
+		return bit_waitqueue((void *)(q & ~1), q & 1);
+	}
+	return bit_waitqueue(p, 0);
+}
+EXPORT_SYMBOL(__var_waitqueue);
+
+static int
+var_wake_function(struct wait_queue_entry *wq_entry, unsigned int mode,
+		  int sync, void *arg)
+{
+	struct wait_bit_key *key = arg;
+	struct wait_bit_queue_entry *wbq_entry =
+		container_of(wq_entry, struct wait_bit_queue_entry, wq_entry);
+
+	if (wbq_entry->key.flags != key->flags ||
+	    wbq_entry->key.bit_nr != key->bit_nr)
+		return 0;
+
+	return autoremove_wake_function(wq_entry, mode, sync, key);
+}
+
+void init_wait_var_entry(struct wait_bit_queue_entry *wbq_entry, void *var, int flags)
+{
+	*wbq_entry = (struct wait_bit_queue_entry){
+		.key = {
+			.flags	= (var),
+			.bit_nr = -1,
+		},
+		.wq_entry = {
+			.private = current,
+			.func	 = var_wake_function,
+			.entry	 = LIST_HEAD_INIT(wbq_entry->wq_entry.entry),
+		},
+	};
+}
+EXPORT_SYMBOL(init_wait_var_entry);
+
+void wake_up_var(void *var)
+{
+	__wake_up_bit(__var_waitqueue(var), var, -1);
+}
+EXPORT_SYMBOL(wake_up_var);
+
 /*
  * Manipulate the atomic_t address to produce a better bit waitqueue table hash
  * index (we're keying off bit -1, but that would produce a horrible hash
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <peterz@infradead.org>
To: David Howells <dhowells@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	Ingo Molnar <mingo@redhat.com>, Christoph Hellwig <hch@lst.de>,
	david <david@fromorbit.com>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Jan Kara <jack@suse.cz>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ralf Baechle <ralf@linux-mips.org>
Subject: Re: [RFC][PATCH] sched/wait_bit: Introduce wait_var_event()/wake_up_var()
Date: Thu, 15 Mar 2018 12:51:46 +0100	[thread overview]
Message-ID: <20180315115146.GX4064@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <3396.1521107922@warthog.procyon.org.uk>

On Thu, Mar 15, 2018 at 09:58:42AM +0000, David Howells wrote:
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > > > Argh, no no no.. That whole wait_for_atomic_t thing is a giant
> > > > trainwreck already and now you're making it worse still.
> 
> Your patch description needs to say why this isn't a trainwreck when you
> consider wait_for_atomic_t() to be one since it does things in a very similar
> way.

Does the below address things sufficiently clear?

---

Subject: sched/wait: Introduce wait_var_event()
From: Peter Zijlstra <peterz@infradead.org>
Date: Thu Mar 15 11:40:33 CET 2018

As a replacement for the wait_on_atomic_t() API provide the
wait_var_event() API.

The wait_var_event() API is based on the very same hashed-waitqueue
idea, but doesn't care about the type (atomic_t) or the specific
condition (atomic_read() == 0). IOW. it's much more widely
applicable/flexible.

It shares all the benefits/disadvantages of a hashed-waitqueue
approach with the existing wait_on_atomic_t/wait_on_bit() APIs.

The API is modeled after the existing wait_event() API, but instead of
taking a wait_queue_head, it takes an address. This addresses is
hashed to obtain a wait_queue_head from the bit_wait_table.

Similar to the wait_event() API, it takes a condition expression as
second argument and will wait until this expression becomes true.

The following are (mostly) identical replacements:

	wait_on_atomic_t(&my_atomic, atomic_t_wait, TASK_UNINTERRUPTIBLE);
	wake_up_atomic_t(&my_atomic);

	wait_var_event(&my_atomic, !atomic_read(&my_atomic));
	wake_up_var(&my_atomic);

The only difference is that wake_up_var() is an unconditional wakeup
and doesn't check the previously hard-coded (atomic_read() == 0)
condition here. This is of little concequence, since most callers are
already conditional on atomic_dec_and_test() and the ones that are
not, are trivial to make so.

Cc: David Howells <dhowells@redhat.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/wait_bit.h |   70 +++++++++++++++++++++++++++++++++++++++++++++++
 kernel/sched/wait_bit.c  |   48 ++++++++++++++++++++++++++++++++
 2 files changed, 118 insertions(+)

--- a/include/linux/wait_bit.h
+++ b/include/linux/wait_bit.h
@@ -262,4 +262,74 @@ int wait_on_atomic_t(atomic_t *val, wait
 	return out_of_line_wait_on_atomic_t(val, action, mode);
 }
 
+extern void init_wait_var_entry(struct wait_bit_queue_entry *wbq_entry, void *var, int flags);
+extern void wake_up_var(void *var);
+extern wait_queue_head_t *__var_waitqueue(void *p);
+
+#define ___wait_var_event(var, condition, state, exclusive, ret, cmd)	\
+({									\
+	__label__ __out;						\
+	struct wait_queue_head *__wq_head = __var_waitqueue(var);	\
+	struct wait_bit_queue_entry __wbq_entry;			\
+	long __ret = ret; /* explicit shadow */				\
+									\
+	init_wait_var_entry(&__wbq_entry, var,				\
+			    exclusive ? WQ_FLAG_EXCLUSIVE : 0);		\
+	for (;;) {							\
+		long __int = prepare_to_wait_event(__wq_head,		\
+						   &__wbq_entry.wq_entry, \
+						   state);		\
+		if (condition)						\
+			break;						\
+									\
+		if (___wait_is_interruptible(state) && __int) {		\
+			__ret = __int;					\
+			goto __out;					\
+		}							\
+									\
+		cmd;							\
+	}								\
+	finish_wait(__wq_head, &__wbq_entry.wq_entry);			\
+__out:	__ret;								\
+})
+
+#define __wait_var_event(var, condition)				\
+	___wait_var_event(var, condition, TASK_UNINTERRUPTIBLE, 0, 0,	\
+			  schedule())
+
+#define wait_var_event(var, condition)					\
+do {									\
+	might_sleep();							\
+	if (condition)							\
+		break;							\
+	__wait_var_event(var, condition);				\
+} while (0)
+
+#define __wait_var_event_killable(var, condition)			\
+	___wait_var_event(var, condition, TASK_KILLABLE, 0, 0,		\
+			  schedule())
+
+#define wait_var_event_killable(var, condition)				\
+({									\
+	int __ret = 0;							\
+	might_sleep();							\
+	if (!(condition))						\
+		__ret = __wait_var_event_killable(var, condition);	\
+	__ret;								\
+})
+
+#define __wait_var_event_timeout(var, condition, timeout)		\
+	___wait_var_event(var, ___wait_cond_timeout(condition),		\
+			  TASK_UNINTERRUPTIBLE, 0, timeout,		\
+			  __ret = schedule_timeout(__ret))
+
+#define wait_var_event_timeout(var, condition, timeout)			\
+({									\
+	long __ret = timeout;						\
+	might_sleep();							\
+	if (!___wait_cond_timeout(condition))				\
+		__ret = __wait_var_event_timeout(var, condition, timeout); \
+	__ret;								\
+})
+
 #endif /* _LINUX_WAIT_BIT_H */
--- a/kernel/sched/wait_bit.c
+++ b/kernel/sched/wait_bit.c
@@ -149,6 +149,54 @@ void wake_up_bit(void *word, int bit)
 }
 EXPORT_SYMBOL(wake_up_bit);
 
+wait_queue_head_t *__var_waitqueue(void *p)
+{
+	if (BITS_PER_LONG == 64) {
+		unsigned long q = (unsigned long)p;
+
+		return bit_waitqueue((void *)(q & ~1), q & 1);
+	}
+	return bit_waitqueue(p, 0);
+}
+EXPORT_SYMBOL(__var_waitqueue);
+
+static int
+var_wake_function(struct wait_queue_entry *wq_entry, unsigned int mode,
+		  int sync, void *arg)
+{
+	struct wait_bit_key *key = arg;
+	struct wait_bit_queue_entry *wbq_entry =
+		container_of(wq_entry, struct wait_bit_queue_entry, wq_entry);
+
+	if (wbq_entry->key.flags != key->flags ||
+	    wbq_entry->key.bit_nr != key->bit_nr)
+		return 0;
+
+	return autoremove_wake_function(wq_entry, mode, sync, key);
+}
+
+void init_wait_var_entry(struct wait_bit_queue_entry *wbq_entry, void *var, int flags)
+{
+	*wbq_entry = (struct wait_bit_queue_entry){
+		.key = {
+			.flags	= (var),
+			.bit_nr = -1,
+		},
+		.wq_entry = {
+			.private = current,
+			.func	 = var_wake_function,
+			.entry	 = LIST_HEAD_INIT(wbq_entry->wq_entry.entry),
+		},
+	};
+}
+EXPORT_SYMBOL(init_wait_var_entry);
+
+void wake_up_var(void *var)
+{
+	__wake_up_bit(__var_waitqueue(var), var, -1);
+}
+EXPORT_SYMBOL(wake_up_var);
+
 /*
  * Manipulate the atomic_t address to produce a better bit waitqueue table hash
  * index (we're keying off bit -1, but that would produce a horrible hash

  parent reply	other threads:[~2018-03-15 11:45 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-10  6:54 [PATCH v5 00/11] dax: fix dma vs truncate/hole-punch Dan Williams
2018-03-10  6:54 ` Dan Williams
2018-03-10  6:54 ` Dan Williams
2018-03-10  6:54 ` Dan Williams
2018-03-10  6:54 ` [PATCH v5 01/11] dax: store pfns in the radix Dan Williams
2018-03-10  6:54   ` Dan Williams
2018-03-10  6:54   ` Dan Williams
2018-03-10  6:54 ` [PATCH v5 02/11] xfs, dax: introduce xfs_dax_aops Dan Williams
2018-03-10  6:54   ` Dan Williams
2018-03-10  6:54   ` Dan Williams
2018-03-10  9:46   ` Christoph Hellwig
2018-03-10  9:46     ` Christoph Hellwig
2018-03-10 17:40     ` Dan Williams
2018-03-10 17:40       ` Dan Williams
2018-03-11 19:16       ` Dan Williams
2018-03-11 19:16         ` Dan Williams
2018-03-12  7:51         ` Christoph Hellwig
2018-03-12  7:51           ` Christoph Hellwig
2018-03-10  6:55 ` [PATCH v5 03/11] ext4, dax: introduce ext4_dax_aops Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-10  6:55 ` [PATCH v5 04/11] ext2, dax: introduce ext2_dax_aops Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-10  6:55 ` [PATCH v5 05/11] fs, dax: use page->mapping to warn if truncate collides with a busy page Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-10  6:55 ` [PATCH v5 06/11] mm, dax: enable filesystems to trigger dev_pagemap ->page_free callbacks Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-12 14:09   ` Jerome Glisse
2018-03-12 14:09     ` Jerome Glisse
2018-03-12 14:09     ` Jerome Glisse
2018-03-10  6:55 ` [PATCH v5 07/11] mm, dev_pagemap: introduce CONFIG_DEV_PAGEMAP_OPS Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-12 14:17   ` Jerome Glisse
2018-03-12 14:17     ` Jerome Glisse
2018-03-12 14:17     ` Jerome Glisse
2018-03-12 18:17     ` Dan Williams
2018-03-12 18:17       ` Dan Williams
2018-03-10  6:55 ` [PATCH v5 08/11] wait_bit: introduce {wait_on,wake_up}_atomic_one Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-11 11:27   ` [PATCH v5 08/11] wait_bit: introduce {wait_on, wake_up}_atomic_one Peter Zijlstra
2018-03-11 11:27     ` [PATCH v5 08/11] wait_bit: introduce {wait_on,wake_up}_atomic_one Peter Zijlstra
2018-03-11 17:15     ` [PATCH v5 08/11] wait_bit: introduce {wait_on, wake_up}_atomic_one Dan Williams
2018-03-11 17:15       ` [PATCH v5 08/11] wait_bit: introduce {wait_on,wake_up}_atomic_one Dan Williams
2018-03-12 19:32       ` [PATCH v5 08/11] wait_bit: introduce {wait_on, wake_up}_atomic_one Dan Williams
2018-03-12 19:32         ` [PATCH v5 08/11] wait_bit: introduce {wait_on,wake_up}_atomic_one Dan Williams
2018-03-13 10:20       ` [RFC][PATCH] sched/wait_bit: Introduce wait_var_event()/wake_up_var() Peter Zijlstra
2018-03-13 10:20         ` Peter Zijlstra
2018-03-14  4:12         ` Dan Williams
2018-03-14  4:12           ` Dan Williams
2018-03-15  5:46         ` Dan Williams
2018-03-15  5:46           ` Dan Williams
2018-03-15  9:58       ` David Howells
2018-03-15  9:58         ` David Howells
2018-03-15 11:19         ` Peter Zijlstra
2018-03-15 11:19           ` Peter Zijlstra
2018-03-15 11:51         ` Peter Zijlstra [this message]
2018-03-15 11:51           ` Peter Zijlstra
2018-03-15 14:45         ` David Howells
2018-03-15 14:45           ` David Howells
2018-03-15 14:53           ` Peter Zijlstra
2018-03-15 14:53             ` Peter Zijlstra
2018-03-10  6:55 ` [PATCH v5 09/11] mm, fs, dax: handle layout changes to pinned dax mappings Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-10  6:55 ` [PATCH v5 10/11] xfs: prepare xfs_break_layouts() for another layout type Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-10  9:51   ` Christoph Hellwig
2018-03-10  9:51     ` Christoph Hellwig
2018-03-10  6:55 ` [PATCH v5 11/11] xfs, dax: introduce xfs_break_dax_layouts() Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-10  6:55   ` Dan Williams
2018-03-10  9:55   ` Christoph Hellwig
2018-03-10  9:55     ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180315115146.GX4064@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=david@fromorbit.com \
    --cc=dhowells@redhat.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=ralf@linux-mips.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.