* [PATCH RFC] sched: introduce add_wait_queue_exclusive_head
@ 2014-03-18 13:10 Peng Tao
  2014-03-18 13:33 ` Peter Zijlstra
  0 siblings, 1 reply; 21+ messages in thread
From: Peng Tao @ 2014-03-18 13:10 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peng Tao, Ingo Molnar, Peter Zijlstra, Oleg Drokin, Andreas Dilger

Normally wait_queue_t is a FIFO list for exclusive waiting tasks.
As a side effect, if there are many threads waiting on the same
condition (which is common for data servers like Lustre), all
threads will be woken up again and again, causing unnecessary cache
line pollution. Instead of FIFO lists, we can use LIFO lists to always
wake up the most recently active threads.

Lustre implements this add_wait_queue_exclusive_head() privately, but we
think it might be useful as a generic function. With it moved to the
generic layer, the rest of Lustre's private wait queue wrappers can all
be removed.

Of course there is an alternative approach of just open-coding it, but we'd
like to ask first whether there is any objection to making it generic.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/include/linux/libcfs/libcfs_prim.h      |    1 -
 .../lustre/lustre/libcfs/linux/linux-prim.c        |   24 --------------------
 include/linux/wait.h                               |    2 +
 kernel/sched/wait.c                                |   23 +++++++++++++++++++
 4 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_prim.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_prim.h
index e6e417a..c23b78c 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_prim.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_prim.h
@@ -54,7 +54,6 @@ void schedule_timeout_and_set_state(long, int64_t);
 void init_waitqueue_entry_current(wait_queue_t *link);
 int64_t waitq_timedwait(wait_queue_t *, long, int64_t);
 void waitq_wait(wait_queue_t *, long);
-void add_wait_queue_exclusive_head(wait_queue_head_t *, wait_queue_t *);
 
 void cfs_init_timer(struct timer_list *t);
 void cfs_timer_init(struct timer_list *t, cfs_timer_func_t *func, void *arg);
diff --git a/drivers/staging/lustre/lustre/libcfs/linux/linux-prim.c b/drivers/staging/lustre/lustre/libcfs/linux/linux-prim.c
index c7bc7fc..13b4a80 100644
--- a/drivers/staging/lustre/lustre/libcfs/linux/linux-prim.c
+++ b/drivers/staging/lustre/lustre/libcfs/linux/linux-prim.c
@@ -53,30 +53,6 @@ init_waitqueue_entry_current(wait_queue_t *link)
 }
 EXPORT_SYMBOL(init_waitqueue_entry_current);
 
-/**
- * wait_queue_t of Linux (version < 2.6.34) is a FIFO list for exclusively
- * waiting threads, which is not always desirable because all threads will
- * be waken up again and again, even user only needs a few of them to be
- * active most time. This is not good for performance because cache can
- * be polluted by different threads.
- *
- * LIFO list can resolve this problem because we always wakeup the most
- * recent active thread by default.
- *
- * NB: please don't call non-exclusive & exclusive wait on the same
- * waitq if add_wait_queue_exclusive_head is used.
- */
-void
-add_wait_queue_exclusive_head(wait_queue_head_t *waitq, wait_queue_t *link)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&waitq->lock, flags);
-	__add_wait_queue_exclusive(waitq, link);
-	spin_unlock_irqrestore(&waitq->lock, flags);
-}
-EXPORT_SYMBOL(add_wait_queue_exclusive_head);
-
 void
 waitq_wait(wait_queue_t *link, long state)
 {
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 559044c..634a49c 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -105,6 +105,8 @@ static inline int waitqueue_active(wait_queue_head_t *q)
 
 extern void add_wait_queue(wait_queue_head_t *q, wait_queue_t *wait);
 extern void add_wait_queue_exclusive(wait_queue_head_t *q, wait_queue_t *wait);
+extern void add_wait_queue_exclusive_head(wait_queue_head_t *q,
+					  wait_queue_t *wait);
 extern void remove_wait_queue(wait_queue_head_t *q, wait_queue_t *wait);
 
 static inline void __add_wait_queue(wait_queue_head_t *head, wait_queue_t *new)
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 7d50f79..69925c3 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -30,6 +30,29 @@ void add_wait_queue(wait_queue_head_t *q, wait_queue_t *wait)
 }
 EXPORT_SYMBOL(add_wait_queue);
 
+/**
+ * wait_queue_t is a FIFO list for exclusively waiting threads, which is
+ * not always desirable because all threads will be waken up again and
+ * again, even user only needs a few of them to be active most time. This
+ * is not good for performance because cache can be polluted by different
+ * threads.
+ *
+ * LIFO list can resolve this problem because we always wakeup the most
+ * recent active thread by default.
+ *
+ * NB: please don't call non-exclusive & exclusive wait on the same
+ * waitq if add_wait_queue_exclusive_head is used.
+ */
+void add_wait_queue_exclusive_head(wait_queue_head_t *q, wait_queue_t *wait)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&q->lock, flags);
+	__add_wait_queue_exclusive(q, wait);
+	spin_unlock_irqrestore(&q->lock, flags);
+}
+EXPORT_SYMBOL(add_wait_queue_exclusive_head);
+
 void add_wait_queue_exclusive(wait_queue_head_t *q, wait_queue_t *wait)
 {
 	unsigned long flags;
-- 
1.7.7.6
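
For context, a minimal sketch of how a pool of service threads might use the
helper proposed above; req_waitq and work_available() are hypothetical
placeholders, the rest is the standard open-coded wait pattern:

	/* sketch only: each service thread parks itself as a LIFO exclusive
	 * waiter; req_waitq and work_available() are illustrative placeholders */
	DECLARE_WAITQUEUE(wait, current);

	add_wait_queue_exclusive_head(&req_waitq, &wait);
	for (;;) {
		set_current_state(TASK_INTERRUPTIBLE);
		if (work_available())
			break;
		schedule();
	}
	__set_current_state(TASK_RUNNING);
	remove_wait_queue(&req_waitq, &wait);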



* Re: [PATCH RFC] sched: introduce add_wait_queue_exclusive_head
  2014-03-18 13:10 [PATCH RFC] sched: introduce add_wait_queue_exclusive_head Peng Tao
@ 2014-03-18 13:33 ` Peter Zijlstra
  2014-03-18 13:51   ` Peng Tao
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Zijlstra @ 2014-03-18 13:33 UTC (permalink / raw)
  To: Peng Tao
  Cc: linux-kernel, Ingo Molnar, Oleg Drokin, Andreas Dilger, Oleg Nesterov

On Tue, Mar 18, 2014 at 09:10:08PM +0800, Peng Tao wrote:
> Normally wait_queue_t is a FIFO list for exclusive waiting tasks.
> As a side effect, if there are many threads waiting on the same
> condition (which is common for data servers like Lustre), all
> threads will be woken up again and again, causing unnecessary cache
> line pollution. Instead of FIFO lists, we can use LIFO lists to always
> wake up the most recently active threads.
> 
> Lustre implements this add_wait_queue_exclusive_head() privately, but we
> think it might be useful as a generic function. With it moved to the
> generic layer, the rest of Lustre's private wait queue wrappers can all
> be removed.
> 
> Of course there is an alternative approach of just open-coding it, but we'd
> like to ask first whether there is any objection to making it generic.

OK, so I don't particularly mind LIFO, but there are a few problems with
the patch.

Firstly I think the _head postfix for LIFO is a bad name, and secondly,
and most important, this breaks __wake_up_common().

So waitqueue wakeups are specified as waking all !exclusive tasks and @n
exclusive tasks. The way this works is that !exclusive tasks are added
to the head (LIFO) and exclusive tasks are added to the tail
(FIFO).

We can then iterate the list until @n exclusive tasks have been
observed.
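
For reference, the wakeup loop in question looks roughly like this (a
paraphrase of __wake_up_common() from kernel/sched/wait.c of that era,
trimmed for clarity):

	static void __wake_up_common(wait_queue_head_t *q, unsigned int mode,
				     int nr_exclusive, int wake_flags, void *key)
	{
		wait_queue_t *curr, *next;

		list_for_each_entry_safe(curr, next, &q->task_list, task_list) {
			unsigned flags = curr->flags;

			/* stop once @nr_exclusive exclusive waiters have been woken */
			if (curr->func(curr, mode, wake_flags, key) &&
			    (flags & WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
				break;
		}
	}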

However if you start adding exclusive tasks to the head this all comes
apart.

If you don't mix exclusive and !exclusive tasks on the same waitqueue
this isn't a problem, but I'm sure people will eventually do this and
get a nasty surprise.

I'm not sure what the best way around this would be; but I can see two
options:

 - add enough debugging bits to detect this fail case.
 - extend wait_queue_head_t to keep a pointer to the first !exclusive
   element and insert exclusive LIFO tasks there -- thereby keeping
   !exclusive tasks at the front.

Anybody?


* Re: [PATCH RFC] sched: introduce add_wait_queue_exclusive_head
  2014-03-18 13:33 ` Peter Zijlstra
@ 2014-03-18 13:51   ` Peng Tao
  2014-03-18 14:05     ` Peter Zijlstra
  0 siblings, 1 reply; 21+ messages in thread
From: Peng Tao @ 2014-03-18 13:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Linux Kernel Mailing List, Ingo Molnar, Oleg Drokin,
	Andreas Dilger, Oleg Nesterov

On Tue, Mar 18, 2014 at 9:33 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Mar 18, 2014 at 09:10:08PM +0800, Peng Tao wrote:
>> Normally wait_queue_t is a FIFO list for exclusive waiting tasks.
>> As a side effect, if there are many threads waiting on the same
>> condition (which is common for data servers like Lustre), all
>> threads will be woken up again and again, causing unnecessary cache
>> line pollution. Instead of FIFO lists, we can use LIFO lists to always
>> wake up the most recently active threads.
>>
>> Lustre implements this add_wait_queue_exclusive_head() privately, but we
>> think it might be useful as a generic function. With it moved to the
>> generic layer, the rest of Lustre's private wait queue wrappers can all
>> be removed.
>>
>> Of course there is an alternative approach of just open-coding it, but we'd
>> like to ask first whether there is any objection to making it generic.
>
> OK, so I don't particularly mind LIFO, but there are a few problems with
> the patch.
>
> Firstly I think the _head postfix for LIFO is a bad name,
Do you have any preference on the name? add_wait_queue_exclusive_lifo()?

> and secondly,
> and most important, this breaks __wake_up_common().
>
> So waitqueue wakeups are specified as waking all !exclusive tasks and @n
> exclusive tasks. The way this works is that !exclusive tasks are added
> to the head (LIFO) and exclusive tasks are added to the tail
> (FIFO).
>
> We can then iterate the list until @n exclusive tasks have been
> observed.
>
> However if you start adding exclusive tasks to the head this all comes
> apart.
>
> If you don't mix exclusive and !exclusive tasks on the same waitqueue
> this isn't a problem, but I'm sure people will eventually do this and
> get a nasty surprise.
>
Yes, Lustre takes care not to mix exclusive and !exclusive tasks in this case.

> I'm not sure what the best way around this would be; but I can see two
> options:
>
>  - add enough debugging bits to detect this fail case.
>  - extend wait_queue_head_t to keep a pointer to the first !exclusive
>    element and insert exclusive LIFO tasks there -- thereby keeping
>    !exclusive tasks at the front.
>
Thank you for the suggestions. Personally I am in favor of the second
one, but I'll wait for others to comment first.

Thanks,
Tao


* Re: [PATCH RFC] sched: introduce add_wait_queue_exclusive_head
  2014-03-18 13:51   ` Peng Tao
@ 2014-03-18 14:05     ` Peter Zijlstra
  2014-03-18 14:44       ` Peng Tao
  2014-03-18 15:47       ` Oleg Nesterov
  0 siblings, 2 replies; 21+ messages in thread
From: Peter Zijlstra @ 2014-03-18 14:05 UTC (permalink / raw)
  To: Peng Tao
  Cc: Linux Kernel Mailing List, Ingo Molnar, Oleg Drokin,
	Andreas Dilger, Oleg Nesterov

On Tue, Mar 18, 2014 at 09:51:04PM +0800, Peng Tao wrote:
> > Firstly I think the _head postfix for LIFO is a bad name,
> Do you have any preference on the name? add_wait_queue_exclusive_lifo()?

I think we can avoid the entire function if we add
WQ_FLAG_LIFO and make prepare_to_wait_event() DTRT.

Then we only need to teach ___wait() about this; and I suppose we could
make .exclusive=-1 be the LIFO flag.

Unless you cannot use ___wait() and really need to open-code the
wait_event() stuff.

> > If you don't mix exclusive and !exclusive tasks on the same waitqueue
> > this isn't a problem, but I'm sure people will eventually do this and
> > get a nasty surprise.
> >
> Yes, Lustre takes care not to mix exclusive and !exclusive tasks in this case.

Right; I saw you had a comment to that effect after I wrote this email.

> > I'm not sure what the best way around this would be; but I can see two
> > options:
> >
> >  - add enough debugging bits to detect this fail case.
> >  - extend wait_queue_head_t to keep a pointer to the first !exclusive

s/!//

> >    element and insert exclusive LIFO tasks there -- thereby keeping
> >    !exclusive tasks at the front.
> >
>> Thank you for the suggestions. Personally I am in favor of the second
>> one, but I'll wait for others to comment first.

Oleg, Ingo, any preferences?


* Re: [PATCH RFC] sched: introduce add_wait_queue_exclusive_head
  2014-03-18 14:05     ` Peter Zijlstra
@ 2014-03-18 14:44       ` Peng Tao
  2014-03-18 16:23         ` Oleg Nesterov
  2014-03-18 15:47       ` Oleg Nesterov
  1 sibling, 1 reply; 21+ messages in thread
From: Peng Tao @ 2014-03-18 14:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Linux Kernel Mailing List, Ingo Molnar, Oleg Drokin,
	Andreas Dilger, Oleg Nesterov

On Tue, Mar 18, 2014 at 10:05 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Mar 18, 2014 at 09:51:04PM +0800, Peng Tao wrote:
>> > Firstly I think the _head postfix for LIFO is a bad name,
>> Do you have any preference on the name? add_wait_queue_exclusive_lifo()?
>
> I think we can avoid the entire function if we add
> WQ_FLAG_LIFO and make prepare_to_wait_event() DTRT.
>
> Then we only need to teach ___wait() about this; and I suppose we could
> make .exclusive=-1 be the LIFO flag.
>
> Unless you cannot use ___wait() and really need to open-code the
> wait_event() stuff.
>
Lustre's private l_wait_event() stuff takes care to (un)mask
LUSTRE_FATAL_SIGS and always wait in TASK_INTERRUPTIBLE state.

It looks to me that we can at least wrap l_wait_event() on top of
wait_event_interruptible/wait_event_timeout_interruptible.

Andreas, any opinions? Was l_wait_event added just for cross platform
support or is there some deeper reason that I missed?

Thanks,
Tao

>> > If you don't mix exclusive and !exclusive tasks on the same waitqueue
>> > this isn't a problem, but I'm sure people will eventually do this and
>> > get a nasty surprise.
>> >
>> Yes, Lustre takes care not to mix exclusive and !exclusive tasks in this case.
>
> Right; I saw you had a comment to that effect after I wrote this email.
>
>> > I'm not sure what the best way around this would be; but I can see two
>> > options:
>> >
>> >  - add enough debugging bits to detect this fail case.
>> >  - extend wait_queue_head_t to keep a pointer to the first !exclusive
>
> s/!//
>
>> >    element and insert exclusive LIFO tasks there -- thereby keeping
>> >    !exclusive tasks at the front.
>> >
>> Thank you for the suggestions. Personally I am in favor of the second
>> one, but I'll wait for others to comment first.
>
> Oleg, Ingo, any preferences?


* Re: [PATCH RFC] sched: introduce add_wait_queue_exclusive_head
  2014-03-18 14:05     ` Peter Zijlstra
  2014-03-18 14:44       ` Peng Tao
@ 2014-03-18 15:47       ` Oleg Nesterov
  2014-03-19  2:17         ` Peng Tao
  1 sibling, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2014-03-18 15:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Peng Tao, Linux Kernel Mailing List, Ingo Molnar, Oleg Drokin,
	Andreas Dilger

On 03/18, Peter Zijlstra wrote:
>
> I think we can avoid the entire function if we add
> WQ_FLAG_LIFO and make prepare_to_wait_event() DTRT.

Agreed, this looks very natural.

prepare_to_wait_event(WQ_FLAG_LIFO) should probably skip all "flags == 0"
entries before list_add(). Given that users are not supposed to mix
exclusive and !exclusive, perhaps the additional list_for_each() won't
hurt?

> Then we only need to teach ___wait() about this; and I suppose we could
> make .exclusive=-1 be the LIFO flag.

Or we can change ___wait_event()

	-	if (exclusive)							\
	-		__wait.flags = WQ_FLAG_EXCLUSIVE;			\
	-	else								\
	-		__wait.flags = 0;					\
	+	__wait.flags = exclusive;					\

and obviously change the callers. Perhaps this argument should be renamed
then.

But I am fine either way, just I like the idea to extend the current
interface.

Oleg.



* Re: [PATCH RFC] sched: introduce add_wait_queue_exclusive_head
  2014-03-18 14:44       ` Peng Tao
@ 2014-03-18 16:23         ` Oleg Nesterov
  2014-03-19  2:22           ` Peng Tao
  0 siblings, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2014-03-18 16:23 UTC (permalink / raw)
  To: Peng Tao
  Cc: Peter Zijlstra, Linux Kernel Mailing List, Ingo Molnar,
	Oleg Drokin, Andreas Dilger

On 03/18, Peng Tao wrote:
>
> On Tue, Mar 18, 2014 at 10:05 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > Unless you cannot use ___wait() and really need to open-code the
> > wait_event() stuff.
> >
> Lustre's private l_wait_event() stuff takes care to (un)mask
> LUSTRE_FATAL_SIGS

Hmm. This is off-topic but after the quick grep LUSTRE_FATAL_SIGS/etc
looks suspicious.

Firstly, cfs_block_sigs/cfs_block_sigsinv/etc are not exactly right,
they need set_current_blocked(). And you can read "old" lockless.

And note that cfs_block_sigsinv(0) (which should block all signals)
can't actually protect from SIGKILL (or in fact from another fatal
signal) or SIGSTOP if the caller is multithreaded. Or ptrace, or
freezer.
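
A minimal sketch of what a corrected helper could look like, assuming the
usual <linux/signal.h> primitives (illustrative only, not the actual libcfs
code):

	/* block every signal except those in @sigs and return the old mask;
	 * a sketch of the fix, not the current cfs_block_sigsinv() */
	static sigset_t block_sigsinv_sketch(unsigned long sigs)
	{
		sigset_t old, blocked;

		old = current->blocked;		/* the old mask can be read locklessly */
		siginitsetinv(&blocked, sigs);
		set_current_blocked(&blocked);	/* updates ->blocked and recalculates
						 * the pending state properly */
		return old;
	}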

> and always wait in TASK_INTERRUPTIBLE state.

and it seems that __wstate passed to waitq_wait/waitq_timedwait is
simply ignored.

> It looks to me that we can at least wrap l_wait_event() on top of
> wait_event_interruptible/wait_event_timeout_interruptible.

l_wait_event looks really complicated ;) but perhaps you can rewrite
it on top of ___wait_event(), note that condition/cmd can do anything
you want.
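
To illustrate that point, a rough sketch of such a rewrite, folding the
signal handling into the @cmd argument; l_wait_event_sketch() is hypothetical
and cfs_block_sigsinv()/cfs_restore_sigs() are assumed to be the existing
libcfs signal helpers:

	#define l_wait_event_sketch(wq, condition, sigs)			\
		___wait_event(wq, condition, TASK_INTERRUPTIBLE, 0, 0,		\
		({								\
			/* block everything but @sigs around the sleep */	\
			sigset_t __blocked = cfs_block_sigsinv(sigs);		\
			schedule();						\
			cfs_restore_sigs(__blocked);				\
		}))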

Oleg.



* Re: [PATCH RFC] sched: introduce add_wait_queue_exclusive_head
  2014-03-18 15:47       ` Oleg Nesterov
@ 2014-03-19  2:17         ` Peng Tao
       [not found]           ` <20140319164907.GA10113@redhat.com>
  0 siblings, 1 reply; 21+ messages in thread
From: Peng Tao @ 2014-03-19  2:17 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, Linux Kernel Mailing List, Ingo Molnar,
	Oleg Drokin, Andreas Dilger

On Tue, Mar 18, 2014 at 11:47 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/18, Peter Zijlstra wrote:
>>
>> I think we can avoid the entire function if we add
>> WQ_FLAG_LIFO and make prepare_to_wait_event() DTRT.
>
> Agreed, this looks very natural.
>
> prepare_to_wait_event(WQ_FLAG_LIFO) should probably skip all "flags == 0"
> entries before list_add(). Given that users are not supposed to mix
> exclusive and !exclusive, perhaps the additional list_for_each() won't
> hurt?
>
So please allow me to summarize and define the semantics of wait/wake_up
w.r.t. this.

prepare_to_wait_event(0): task is added at the head of the queue.

prepare_to_wait_event(WQ_FLAG_LIFO): task is added at the head of the
exclusive portion of the queue (i.e. before the first exclusive entry)

prepare_to_wait_event(WQ_FLAG_EXCLUSIVE): task is added at the tail of the queue

Maybe we should rename the flags to WQ_FLAG_EXCLUSIVE_FIFO and
WQ_FLAG_EXCLUSIVE_LIFO?

>> Then we only need to teach ___wait() about this; and I suppose we could
>> make .exclusive=-1 be the LIFO flag.
>
> Or we can change ___wait_event()
>
>         -       if (exclusive)                                                  \
>         -               __wait.flags = WQ_FLAG_EXCLUSIVE;                       \
>         -       else                                                            \
>         -               __wait.flags = 0;                                       \
>         +       __wait.flags = exclusive;                                       \
>
> and obviously change the callers. Perhaps this argument should be renamed
> then.
>
Current __wait.flags allows possible extension to the existing
interface. If we change it to __wait.flags = exclusive, we drop the
future extensibility. Is it acceptable?

Thanks,
Tao

> But I am fine either way, just I like the idea to extend the current
> interface.
>
> Oleg.
>


* Re: [PATCH RFC] sched: introduce add_wait_queue_exclusive_head
  2014-03-18 16:23         ` Oleg Nesterov
@ 2014-03-19  2:22           ` Peng Tao
  2014-03-19 17:33             ` Oleg Nesterov
  0 siblings, 1 reply; 21+ messages in thread
From: Peng Tao @ 2014-03-19  2:22 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, Linux Kernel Mailing List, Ingo Molnar,
	Oleg Drokin, Andreas Dilger

On Wed, Mar 19, 2014 at 12:23 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/18, Peng Tao wrote:
>>
>> On Tue, Mar 18, 2014 at 10:05 PM, Peter Zijlstra <peterz@infradead.org> wrote:
>> >
>> > Unless you cannot use ___wait() and really need to open-code the
>> > wait_event() stuff.
>> >
>> Lustre's private l_wait_event() stuff takes care to (un)mask
>> LUSTRE_FATAL_SIGS
>
> Hmm. This is off-topic but after the quick grep LUSTRE_FATAL_SIGS/etc
> looks suspicious.
>
> Firstly, cfs_block_sigs/cfs_block_sigsinv/etc are not exactly right,
> they need set_current_blocked(). And you can read "old" lockless.
>
It seems that set_current_blocked() is not exported. Can we ask to export it?

And looking at other similar places like coda_block_signals(), it
doesn't call set_current_blocked() either. So it needs
set_current_blocked() as well.

> And note that cfs_block_sigsinv(0) (which should block all signals)
> can't actually protect from SIGKILL (or in fact from another fatal
> signal) or SIGSTOP if the caller is multithreaded. Or ptrace, or
> freezer.
>
>> and always wait in TASK_INTERRUPTIBLE state.
>
> and it seems that __wstate passed to waitq_wait/waitq_timedwait is
> simply ignored.
>
Yes. That needs to be dropped.

>> It looks to me that we can at least wrap l_wait_event() on top of
>> wait_event_interruptible/wait_event_timeout_interruptible.
>
> l_wait_event looks really complicated ;) but perhaps you can rewrite
> it on top of ___wait_event(), note that condition/cmd can do anything
> you want.
>
Yeah, I meant to say __wait_event(). Thanks for correcting me.

Thanks,
Tao


* Re: [PATCH RFC] sched: introduce add_wait_queue_exclusive_head
       [not found]           ` <20140319164907.GA10113@redhat.com>
@ 2014-03-19 16:57             ` Peter Zijlstra
  2014-03-19 17:19               ` Oleg Nesterov
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Zijlstra @ 2014-03-19 16:57 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peng Tao, Linux Kernel Mailing List, Ingo Molnar, Oleg Drokin,
	Andreas Dilger

On Wed, Mar 19, 2014 at 05:49:07PM +0100, Oleg Nesterov wrote:
> +static void add_wait_queue_flag(wait_queue_head_t *q, wait_queue_t *wait)
> +{
> +	struct list_head *head = &q->task_list;
> +	wait_queue_t *excl;
> +
> +	if (wait->flags & WQ_FLAG_EXCLUSIVE) {
> +		if (wait->flags & WQ_FLAG_EXCLUSIVE_HEAD) {
> +			list_for_each_entry(excl, head, task_list)
> +				if (excl->flags & WQ_FLAG_EXCLUSIVE) {
> +					head = &excl->task_list;
> +					break;
> +				}

I prefer an extra pair of { } here, but the main concern would be the
cost of that iteration.

> +		}
> +		/* turn list_add() below into list_add_tail() */
> +		head = head->prev;
> +	}
> +
> +	list_add(&wait->task_list, head);
> +}

Other than that, yes something like that would do I suppose.


* Re: [PATCH RFC] sched: introduce add_wait_queue_exclusive_head
  2014-03-19 16:57             ` Peter Zijlstra
@ 2014-03-19 17:19               ` Oleg Nesterov
  2014-03-20 17:51                 ` [PATCH 0/2] wait: introduce WQ_FLAG_EXCLUSIVE_HEAD Oleg Nesterov
  0 siblings, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2014-03-19 17:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Peng Tao, Linux Kernel Mailing List, Ingo Molnar, Oleg Drokin,
	Andreas Dilger

On 03/19, Peter Zijlstra wrote:
>
> On Wed, Mar 19, 2014 at 05:49:07PM +0100, Oleg Nesterov wrote:
> > +static void add_wait_queue_flag(wait_queue_head_t *q, wait_queue_t *wait)
> > +{
> > +	struct list_head *head = &q->task_list;
> > +	wait_queue_t *excl;
> > +
> > +	if (wait->flags & WQ_FLAG_EXCLUSIVE) {
> > +		if (wait->flags & WQ_FLAG_EXCLUSIVE_HEAD) {
> > +			list_for_each_entry(excl, head, task_list)
> > +				if (excl->flags & WQ_FLAG_EXCLUSIVE) {
> > +					head = &excl->task_list;
> > +					break;
> > +				}
>
> I prefer an extra pair of { } here,

OK,

> but the main concern would be the
> cost of that iteration.

Yes.

This change assumes that we do not mix exclusive and !exclusive; in
that case list_for_each_entry() is cheap: the list is either empty
or the first entry is WQ_FLAG_EXCLUSIVE.

Otherwise the user has only itself to blame, but the code will still
work correctly.

Or we can do

	if (WQ_FLAG_EXCLUSIVE_HEAD) {
		WARN_ON(!list_empty(head) &&
			(list_first_entry(...)->flags & WQ_FLAG_EXCLUSIVE));
		...
	}

> Other than that, yes something like that would do I suppose.

OK, I'll try to test/cleanup/resend tomorrow.

Oleg.



* Re: [PATCH RFC] sched: introduce add_wait_queue_exclusive_head
  2014-03-19  2:22           ` Peng Tao
@ 2014-03-19 17:33             ` Oleg Nesterov
  2014-03-19 19:44               ` Dilger, Andreas
  0 siblings, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2014-03-19 17:33 UTC (permalink / raw)
  To: Peng Tao
  Cc: Peter Zijlstra, Linux Kernel Mailing List, Ingo Molnar,
	Oleg Drokin, Andreas Dilger

On 03/19, Peng Tao wrote:
>
> On Wed, Mar 19, 2014 at 12:23 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > Firstly, cfs_block_sigs/cfs_block_sigsinv/etc are not exactly right,
> > they need set_current_blocked(). And you can read "old" lockless.
> >
> It seems that set_current_blocked() is not exported. Can we ask to export it?

Why not. If you are going to change this code to use set_current_blocked(),
I'd suggest you send the "export set_current_blocked" patch in the series.
Otherwise, if it is sent separately, your change will depend on another
tree.

Or you can use sigprocmask(). Actually it should die, but this won't happen
soon and it is already exported.

> And looking at other similar places like coda_block_signals(),

Yes, it can have much more users.

But note that set_current_blocked() can't help you to really block
SIGKILL anyway.

Could you explain why __l_wait_event() can't use TASK_UNINTERRUPTIBLE
instead of cfs_block_sigsinv(0) ?

Oleg.



* Re: [PATCH RFC] sched: introduce add_wait_queue_exclusive_head
  2014-03-19 17:33             ` Oleg Nesterov
@ 2014-03-19 19:44               ` Dilger, Andreas
  2014-03-19 19:55                 ` Peter Zijlstra
  0 siblings, 1 reply; 21+ messages in thread
From: Dilger, Andreas @ 2014-03-19 19:44 UTC (permalink / raw)
  To: Oleg Nesterov, Peng Tao
  Cc: Peter Zijlstra, Linux Kernel Mailing List, Ingo Molnar, Drokin, Oleg

On 2014/03/19, 11:33 AM, "Oleg Nesterov" <oleg@redhat.com> wrote:
>On 03/19, Peng Tao wrote:
>>
>> On Wed, Mar 19, 2014 at 12:23 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>> >
>> > Firstly, cfs_block_sigs/cfs_block_sigsinv/etc are not exactly right,
>> > they need set_current_blocked(). And you can read "old" lockless.
>> >
>> It seems that set_current_blocked() is not exported. Can we ask to
>>export it?
>
>Why not. If you are going to change this code to use
>set_current_blocked(), I'd suggest you send the "export
>set_current_blocked" patch in the series. Otherwise, if it is sent
> separately, your change will depend on another tree.
>
>Or you can use sigprocmask(). Actually it should die, but this won't
>happen soon and it is already exported.
>
>> And looking at other similar places like coda_block_signals(),
>
>Yes, it can have much more users.
>
>But note that set_current_blocked() can't help you to really block
>SIGKILL anyway.
>
>Could you explain why __l_wait_event() can't use TASK_UNINTERRUPTIBLE
>instead of cfs_block_sigsinv(0) ?

The original reason for l_wait_event() not using TASK_UNINTERRUPTIBLE
is to avoid the load on the server continually being "num_service_threads"
regardless of whether they are actually doing something or not.  We
added various cases for periodic wakeups and such afterward.

l_wait_event() was originally developed for 2.4 kernels, so there may
well be better primitives to use today.

I'd be happy to move toward replacing l_wait_event() with kernel
primitives if possible, but we need to ensure that this is tested
sufficiently since it can otherwise be a source of hard-to-find bugs.

Cheers, Andreas
-- 
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division




* Re: [PATCH RFC] sched: introduce add_wait_queue_exclusive_head
  2014-03-19 19:44               ` Dilger, Andreas
@ 2014-03-19 19:55                 ` Peter Zijlstra
  2014-03-20  7:06                   ` Dilger, Andreas
  2014-03-20 18:49                   ` Oleg Nesterov
  0 siblings, 2 replies; 21+ messages in thread
From: Peter Zijlstra @ 2014-03-19 19:55 UTC (permalink / raw)
  To: Dilger, Andreas
  Cc: Oleg Nesterov, Peng Tao, Linux Kernel Mailing List, Ingo Molnar,
	Drokin, Oleg

On Wed, Mar 19, 2014 at 07:44:29PM +0000, Dilger, Andreas wrote:
> The original reason for l_wait_event() not using TASK_UNINTERRUPTIBLE
> is to avoid the load on the server continually being "num_service_threads"
> regardless of whether they are actually doing something or not.  We
> added various cases for periodic wakeups and such afterward.

Hmm, maybe we should finally do the TASK_IDLE thing;

  https://lkml.org/lkml/2013/11/12/710

Oleg?


* Re: [PATCH RFC] sched: introduce add_wait_queue_exclusive_head
  2014-03-19 19:55                 ` Peter Zijlstra
@ 2014-03-20  7:06                   ` Dilger, Andreas
  2014-03-20 18:49                   ` Oleg Nesterov
  1 sibling, 0 replies; 21+ messages in thread
From: Dilger, Andreas @ 2014-03-20  7:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oleg Nesterov, Peng Tao, Linux Kernel Mailing List, Ingo Molnar,
	Drokin, Oleg

On 2014/03/19, 1:55 PM, "Peter Zijlstra" <peterz@infradead.org> wrote:

>On Wed, Mar 19, 2014 at 07:44:29PM +0000, Dilger, Andreas wrote:
>> The original reason for l_wait_event() not using TASK_UNINTERRUPTIBLE
>> is to avoid the load on the server continually being
>>"num_service_threads"
>> regardless of whether they are actually doing something or not.  We
>> added various cases for periodic wakeups and such afterward.
>
>Hmm, maybe we should finally do the TASK_IDLE thing;
>
>  https://lkml.org/lkml/2013/11/12/710

That looks pretty interesting, and appears to do what we need.

Cheers, Andreas
-- 
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division




* [PATCH 0/2] wait: introduce WQ_FLAG_EXCLUSIVE_HEAD
  2014-03-19 17:19               ` Oleg Nesterov
@ 2014-03-20 17:51                 ` Oleg Nesterov
  2014-03-20 17:51                   ` [PATCH 1/2] wait: turn "bool exclusive" arg of __wait_event() into wflags Oleg Nesterov
                                     ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Oleg Nesterov @ 2014-03-20 17:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Peng Tao, Linux Kernel Mailing List, Ingo Molnar, Oleg Drokin,
	Andreas Dilger

On 03/19, Oleg Nesterov wrote:
>
> OK, I'll try to test/cleanup/resend tomorrow.

Cough. Still un-tested, sorry. I will test it somehow and report,
but I'd like to send this for review right now.

Because I simply can't decide what the new flag should actually
do, so please ack/nack the semantics/naming at least.

Changes:

	1. I decided it would be better to change __wait_event()
	   to accept wait.flags right now. This looks better in
	   any case to me, and otherwise we need to introduce the
	   __wait_exclusive_enum.

	   The change looks trivial (both actually), please tell
	   me if you think it doesn't deserve a separate patch.

	2. I won't insist, but WQ_FLAG_EXCLUSIVE_HEAD can be used
	   without WQ_FLAG_EXCLUSIVE.

	   Unlikely this can be useful, but it looks more natural
	   this way. Otherwise we need to add another check to
	   ensure that WQ_FLAG_EXCLUSIVE_HEAD can't come alone.

	   However, perhaps this means the new flag needs another
	   name. I agree in advance with any.

Oleg.



* [PATCH 1/2] wait: turn "bool exclusive" arg of __wait_event() into wflags
  2014-03-20 17:51                 ` [PATCH 0/2] wait: introduce WQ_FLAG_EXCLUSIVE_HEAD Oleg Nesterov
@ 2014-03-20 17:51                   ` Oleg Nesterov
  2014-03-20 17:51                   ` [PATCH 2/2] wait: introduce WQ_FLAG_EXCLUSIVE_HEAD Oleg Nesterov
  2014-03-21  2:45                   ` [PATCH 0/2] " Dilger, Andreas
  2 siblings, 0 replies; 21+ messages in thread
From: Oleg Nesterov @ 2014-03-20 17:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Peng Tao, Linux Kernel Mailing List, Ingo Molnar, Oleg Drokin,
	Andreas Dilger

Change ___wait_event() to accept __wait.flags as an argument instead
of "exclusive", and change the only caller which uses exclusive == 1.

This allows us to add another WQ_FLAG (see the next patch). It is also
more flexible: we can overload this argument to pass more info.

This should not affect the generated code; currently this argument is
always __builtin_constant_p().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/wait.h |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index 559044c..e547c6c 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -16,6 +16,7 @@ int default_wake_function(wait_queue_t *wait, unsigned mode, int flags, void *ke
 struct __wait_queue {
 	unsigned int		flags;
 #define WQ_FLAG_EXCLUSIVE	0x01
+#define WQ_FLAG_MASK		WQ_FLAG_EXCLUSIVE
 	void			*private;
 	wait_queue_func_t	func;
 	struct list_head	task_list;
@@ -191,17 +192,16 @@ wait_queue_head_t *bit_waitqueue(void *, int);
 	(!__builtin_constant_p(state) ||				\
 		state == TASK_INTERRUPTIBLE || state == TASK_KILLABLE)	\
 
-#define ___wait_event(wq, condition, state, exclusive, ret, cmd)	\
+#define ___wait_event(wq, condition, state, wflags, ret, cmd)		\
 ({									\
 	__label__ __out;						\
 	wait_queue_t __wait;						\
 	long __ret = ret;						\
 									\
 	INIT_LIST_HEAD(&__wait.task_list);				\
-	if (exclusive)							\
-		__wait.flags = WQ_FLAG_EXCLUSIVE;			\
-	else								\
-		__wait.flags = 0;					\
+	BUILD_BUG_ON(__builtin_constant_p(wflags) &&			\
+			((wflags) & ~WQ_FLAG_MASK));			\
+	__wait.flags = wflags;						\
 									\
 	for (;;) {							\
 		long __int = prepare_to_wait_event(&wq, &__wait, state);\
@@ -211,7 +211,7 @@ wait_queue_head_t *bit_waitqueue(void *, int);
 									\
 		if (___wait_is_interruptible(state) && __int) {		\
 			__ret = __int;					\
-			if (exclusive) {				\
+			if ((wflags) & WQ_FLAG_EXCLUSIVE) {		\
 				abort_exclusive_wait(&wq, &__wait,	\
 						     state, NULL);	\
 				goto __out;				\
@@ -438,8 +438,8 @@ do {									\
 })
 
 #define __wait_event_interruptible_exclusive(wq, condition)		\
-	___wait_event(wq, condition, TASK_INTERRUPTIBLE, 1, 0,		\
-		      schedule())
+	___wait_event(wq, condition, TASK_INTERRUPTIBLE,		\
+		      WQ_FLAG_EXCLUSIVE, 0, schedule())
 
 #define wait_event_interruptible_exclusive(wq, condition)		\
 ({									\
-- 
1.5.5.1




* [PATCH 2/2] wait: introduce WQ_FLAG_EXCLUSIVE_HEAD
  2014-03-20 17:51                 ` [PATCH 0/2] wait: introduce WQ_FLAG_EXCLUSIVE_HEAD Oleg Nesterov
  2014-03-20 17:51                   ` [PATCH 1/2] wait: turn "bool exclusive" arg of __wait_event() into wflags Oleg Nesterov
@ 2014-03-20 17:51                   ` Oleg Nesterov
  2014-03-21  2:45                   ` [PATCH 0/2] " Dilger, Andreas
  2 siblings, 0 replies; 21+ messages in thread
From: Oleg Nesterov @ 2014-03-20 17:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Peng Tao, Linux Kernel Mailing List, Ingo Molnar, Oleg Drokin,
	Andreas Dilger

Normally wait_queue_t is a FIFO list for exclusive waiting tasks,
but Lustre wants LIFO to wake up the most recently active thread and
avoid unnecessary cache line pollution.

As Peter suggested, we add a new WQ_FLAG_EXCLUSIVE_HEAD flag and
teach prepare_to_wait_event() to insert the new entry right before
the first WQ_FLAG_EXCLUSIVE task.

Note:
	- this obviously assumes that the user of EXCLUSIVE_HEAD
	  doesn't mix exclusive and !exclusive too much; otherwise
	  it must accept the cost of the additional list_for_each().

	- WQ_FLAG_EXCLUSIVE_HEAD doesn't imply WQ_FLAG_EXCLUSIVE,
	  it only controls the placement in the queue. So the new flag
	  can be used on its own, even if this is unlikely to be useful.

Requested-by: Peng Tao <bergwolf@gmail.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/wait.h |    3 ++-
 kernel/sched/wait.c  |   30 ++++++++++++++++++++++++------
 2 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index e547c6c..afd41eb 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -16,7 +16,8 @@ int default_wake_function(wait_queue_t *wait, unsigned mode, int flags, void *ke
 struct __wait_queue {
 	unsigned int		flags;
 #define WQ_FLAG_EXCLUSIVE	0x01
-#define WQ_FLAG_MASK		WQ_FLAG_EXCLUSIVE
+#define WQ_FLAG_EXCLUSIVE_HEAD	0x02
+#define WQ_FLAG_MASK		(WQ_FLAG_EXCLUSIVE | WQ_FLAG_EXCLUSIVE_HEAD)
 	void			*private;
 	wait_queue_func_t	func;
 	struct list_head	task_list;
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 7d50f79..894ff75 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -195,6 +195,28 @@ prepare_to_wait_exclusive(wait_queue_head_t *q, wait_queue_t *wait, int state)
 }
 EXPORT_SYMBOL(prepare_to_wait_exclusive);
 
+static void add_wait_queue_flag(wait_queue_head_t *q, wait_queue_t *wait)
+{
+	struct list_head *head = &q->task_list;
+
+	if (wait->flags & (WQ_FLAG_EXCLUSIVE | WQ_FLAG_EXCLUSIVE_HEAD)) {
+		if (wait->flags & WQ_FLAG_EXCLUSIVE_HEAD) {
+			wait_queue_t *curr;
+			/* find the first exclusive entry */
+			list_for_each_entry(curr, head, task_list) {
+				if (likely(curr->flags & WQ_FLAG_EXCLUSIVE)) {
+					head = &curr->task_list;
+					break;
+				}
+			}
+		}
+		/* turn list_add() below into list_add_tail() */
+		head = head->prev;
+	}
+
+	list_add(&wait->task_list, head);
+}
+
 long prepare_to_wait_event(wait_queue_head_t *q, wait_queue_t *wait, int state)
 {
 	unsigned long flags;
@@ -206,12 +228,8 @@ long prepare_to_wait_event(wait_queue_head_t *q, wait_queue_t *wait, int state)
 	wait->func = autoremove_wake_function;
 
 	spin_lock_irqsave(&q->lock, flags);
-	if (list_empty(&wait->task_list)) {
-		if (wait->flags & WQ_FLAG_EXCLUSIVE)
-			__add_wait_queue_tail(q, wait);
-		else
-			__add_wait_queue(q, wait);
-	}
+	if (list_empty(&wait->task_list))
+		add_wait_queue_flag(q, wait);
 	set_current_state(state);
 	spin_unlock_irqrestore(&q->lock, flags);
 
-- 
1.5.5.1
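
With both patches applied, a caller could request LIFO exclusive waits
directly through ___wait_event(); a hypothetical wrapper (not part of this
series) might look like:

	#define wait_event_exclusive_head(wq, condition)			\
		___wait_event(wq, condition, TASK_INTERRUPTIBLE,		\
			      WQ_FLAG_EXCLUSIVE | WQ_FLAG_EXCLUSIVE_HEAD,	\
			      0, schedule())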




* Re: [PATCH RFC] sched: introduce add_wait_queue_exclusive_head
  2014-03-19 19:55                 ` Peter Zijlstra
  2014-03-20  7:06                   ` Dilger, Andreas
@ 2014-03-20 18:49                   ` Oleg Nesterov
  1 sibling, 0 replies; 21+ messages in thread
From: Oleg Nesterov @ 2014-03-20 18:49 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dilger, Andreas, Peng Tao, Linux Kernel Mailing List,
	Ingo Molnar, Drokin, Oleg

On 03/19, Peter Zijlstra wrote:
>
> On Wed, Mar 19, 2014 at 07:44:29PM +0000, Dilger, Andreas wrote:
> > The original reason for l_wait_event() not using TASK_UNINTERRUPTIBLE
> > is to avoid the load on the server continually being "num_service_threads"
> > regardless of whether they are actually doing something or not.  We
> > added various cases for periodic wakeups and such afterward.
>
> Hmm, maybe we should finally do the TASK_IDLE thing;
>
>   https://lkml.org/lkml/2013/11/12/710

Agreed. And if we are not going to expose the TASK_IDLE bit in /proc (I think
we should not), the patch should be really trivial.

Oleg.



* Re: [PATCH 0/2] wait: introduce WQ_FLAG_EXCLUSIVE_HEAD
  2014-03-20 17:51                 ` [PATCH 0/2] wait: introduce WQ_FLAG_EXCLUSIVE_HEAD Oleg Nesterov
  2014-03-20 17:51                   ` [PATCH 1/2] wait: turn "bool exclusive" arg of __wait_event() into wflags Oleg Nesterov
  2014-03-20 17:51                   ` [PATCH 2/2] wait: introduce WQ_FLAG_EXCLUSIVE_HEAD Oleg Nesterov
@ 2014-03-21  2:45                   ` Dilger, Andreas
  2014-03-21 18:49                     ` Oleg Nesterov
  2 siblings, 1 reply; 21+ messages in thread
From: Dilger, Andreas @ 2014-03-21  2:45 UTC (permalink / raw)
  To: Oleg Nesterov, Peter Zijlstra
  Cc: Peng Tao, Linux Kernel Mailing List, Ingo Molnar, Drokin, Oleg

On 2014/03/20, 11:51 AM, "Oleg Nesterov" <oleg@redhat.com> wrote:

>On 03/19, Oleg Nesterov wrote:
>>
>> OK, I'll try to test/cleanup/resend tomorrow.
>
>Cough. Still un-tested, sorry. I will test it somehow and report,
>but I'd like to send this for review right now.
>
>Because I simply can't decide what the new flag should actually
>do, so please ack/nack the semantics/naming at least.
>
>Changes:
>
>	1. I decided it would be better to change __wait_event()
>	   to accept wait.flags right now. This looks better in
>	   any case to me, and otherwise we need to introduce the
>	   __wait_exclusive_enum.
>
>	   The change looks trivial (both actually), please tell
>	   me if you think it doesn't deserve a separate patch.
>
>	2. I won't insist, but WQ_FLAG_EXCLUSIVE_HEAD can be used
>	   without WQ_FLAG_EXCLUSIVE.
>
>	   Unlikely this can be useful, but it looks more natural
>	   this way. Otherwise we need to add another check to
>	   ensure that WQ_FLAG_EXCLUSIVE_HEAD can't come alone.
>
>	   However, perhaps this means the new flag needs another
>	   name. I agree in advance with any.

What about:

#define WQ_FLAG_HEAD	0x02

#define WQ_FLAG_EXCLUSIVE_HEAD (WQ_FLAG_HEAD | WQ_FLAG_EXCLUSIVE)

That avoids having WQ_FLAG_EXCLUSIVE_HEAD not actually meaning "exclusive"?

Patches look reasonable at first glance.  The second patch would need
to be changed to handle that WQ_FLAG_EXCLUSIVE_HEAD has both bits set
(probably just replace uses of WQ_FLAG_EXCLUSIVE_HEAD with WQ_FLAG_HEAD).

Cheers, Andreas
-- 
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division




* Re: [PATCH 0/2] wait: introduce WQ_FLAG_EXCLUSIVE_HEAD
  2014-03-21  2:45                   ` [PATCH 0/2] " Dilger, Andreas
@ 2014-03-21 18:49                     ` Oleg Nesterov
  0 siblings, 0 replies; 21+ messages in thread
From: Oleg Nesterov @ 2014-03-21 18:49 UTC (permalink / raw)
  To: Dilger, Andreas
  Cc: Peter Zijlstra, Peng Tao, Linux Kernel Mailing List, Ingo Molnar,
	Drokin, Oleg

On 03/21, Dilger, Andreas wrote:
>
> On 2014/03/20, 11:51 AM, "Oleg Nesterov" <oleg@redhat.com> wrote:
>
> >On 03/19, Oleg Nesterov wrote:
> >>
> >> OK, I'll try to test/cleanup/resend tomorrow.
> >
> >Cough. Still un-tested, sorry. I will test it somehow and report,
> >but I'd like to send this for review right now.
> >
> >Because I simply can't decide what the new flag should actually
> >do, so please ack/nack the semantics/naming at least.
> >
> >Changes:
> >
> >	1. I decided it would be better to change __wait_event()
> >	   to accept wait.flags right now. This looks better in
> >	   any case to me, and otherwise we need to introduce the
> >	   __wait_exclusive_enum.
> >
> >	   The change looks trivial (both actually), please tell
> >	   me if you think it doesn't deserve a separate patch.
> >
> >	2. I won't insist, but WQ_FLAG_EXCLUSIVE_HEAD can be used
> >	   without WQ_FLAG_EXCLUSIVE.
> >
> >	   Unlikely this can be useful, but it looks more natural
> >	   this way. Otherwise we need to add another check to
> >	   ensure that WQ_FLAG_EXCLUSIVE_HEAD can't come alone.
> >
> >	   However, perhaps this means the new flag needs another
> >	   name. I agree in advance with any.
>
> What about:
>
> #define WQ_FLAG_HEAD	0x02

I am fine either way ;)

But _HEAD looks a bit confusing too. This flag doesn't add at the head,
it inserts the new entry before other exclusive tasks.

> #define WQ_FLAG_EXCLUSIVE_HEAD (WQ_FLAG_HEAD | WQ_FLAG_EXCLUSIVE)
>
> That avoids having WQ_FLAG_EXCLUSIVE_HEAD not actually meaning "exclusive"?
>
> Patches look reasonable at first glance.  The second patch would need
> to be changed to handle that WQ_FLAG_EXCLUSIVE_HEAD has both bits set
> (probably just replace uses of WQ_FLAG_EXCLUSIVE_HEAD with WQ_FLAG_HEAD).

Yes, s/WQ_FLAG_EXCLUSIVE_HEAD/WQ_FLAG_HEAD/ is the only change we
need in this case.

Other than the #define of WQ_FLAG_EXCLUSIVE_HEAD of course, but this flag
should only be used by ___wait_event() callers.

Peter, what do you think?

Oleg.


