On Tue, 2020-10-27 at 20:09 +0100, Peter Zijlstra wrote: > On Tue, Oct 27, 2020 at 02:39:43PM +0000, David Woodhouse wrote: > > From: David Woodhouse > > > > This allows an exclusive wait_queue_entry to be added at the head of the > > queue, instead of the tail as normal. Thus, it gets to consume events > > first without allowing non-exclusive waiters to be woken at all. > > > > The (first) intended use is for KVM IRQFD, which currently has > > Do you have more? You could easily special case this inside the KVM > code. I don't have more right now. What is the easy special case that you see? > I don't _think_ the other users of __add_wait_queue() will mind the > extra branch, but what do I know. I suppose we could add an unlikely() in there. It seemed like premature optimisation. > > static inline void __add_wait_queue(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry) > > { > > - list_add(&wq_entry->entry, &wq_head->head); > > + struct list_head *head = &wq_head->head; > > + struct wait_queue_entry *wq; > > + > > + list_for_each_entry(wq, &wq_head->head, entry) { > > + if (!(wq->flags & WQ_FLAG_PRIORITY)) > > + break; > > + head = &wq->entry; > > + } > > + list_add(&wq_entry->entry, head); > > } > > So you're adding the PRIORITY things to the head of the list and need > the PRIORITY flag to keep them in FIFO order there, right? No, I don't care about the order of priority entries; there will typically be only one of them; that's the point. (I'd have used the word 'exclusive' if that wasn't already in use for something that... well... isn't.) I only case that the priority entries come *before* the bog-standard non-exclusive entries (like ep_poll_callback). The priority items end up getting added in FIFO order purely by chance, because it was simpler to use the same insertion flow for both priority and normal non-exclusive entries instead of making a new case. So they all get inserted behind any existing priority entries. > While looking at this I found that weird __add_wait_queue_exclusive() > which is used by fs/eventpoll.c and does something similar, except it > doesn't keep the FIFO order. It does, doesn't it? Except those so-called "exclusive" entries end up in FIFO order amongst themselves at the *tail* of the queue, to be woken up only after all the other entries before them *haven't* been excluded. > The Changelog doesn't state how important this property is to you. Because it isn't :) The ordering is: { PRIORITY }* { NON-EXCLUSIVE }* { EXCLUSIVE(sic) }* I care that PRIORITY comes before the others, because I want to actually exclude the others. Especially the "non-exclusive" ones, which the 'exclusive' ones don't actually exclude. I absolutely don't care about ordering *within* the set of PRIORITY entries, since as I said I expect there to be only one.