linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Roman Penyaev <rpenyaev@suse.de>
To: Khazhismel Kumykov <khazhy@google.com>
Cc: viro@zeniv.linux.org.uk, akpm@linux-foundation.org, r@hev.cc,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] eventpoll: fix missing wakeup for ovflist in ep_poll_callback
Date: Fri, 24 Apr 2020 12:11:23 +0200	[thread overview]
Message-ID: <2bd5fcb37337dd7248a5cb245bf8dde9@suse.de> (raw)
In-Reply-To: <20200424025057.118641-1-khazhy@google.com>

Hi Khazhismel,

That seems to be correct. The patch you refer 339ddb53d373
relies on callback path, which *should* wake up, not the path
which harvests events (thus unnecessary wakeups).  When we add
a new event to the ->ovflist nobody wakes up the waiters,
thus missing wakeup. You are right.

May I suggest a small change in order to avoid one new goto?
We can add a new event in either ->ovflist or ->rdllist and
then wakeup should happen. So simple 'else if' branch should
do things right, something like the following:

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 8c596641a72b..7d566667c6ad 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1171,6 +1171,10 @@ static inline bool chain_epi_lockless(struct 
epitem *epi)
  {
         struct eventpoll *ep = epi->ep;

+       /* Fast preliminary check */
+       if (epi->next != EP_UNACTIVE_PTR)
+               return false;
+
         /* Check that the same epi has not been just chained from 
another CPU */
         if (cmpxchg(&epi->next, EP_UNACTIVE_PTR, NULL) != 
EP_UNACTIVE_PTR)
                 return false;
@@ -1237,16 +1241,13 @@ static int ep_poll_callback(wait_queue_entry_t 
*wait, unsigned mode, int sync, v
          * chained in ep->ovflist and requeued later on.
          */
         if (READ_ONCE(ep->ovflist) != EP_UNACTIVE_PTR) {
-               if (epi->next == EP_UNACTIVE_PTR &&
-                   chain_epi_lockless(epi))
+               if (chain_epi_lockless(epi))
                         ep_pm_stay_awake_rcu(epi);
-               goto out_unlock;
         }
-
-       /* If this file is already in the ready list we exit soon */
-       if (!ep_is_linked(epi) &&
-           list_add_tail_lockless(&epi->rdllink, &ep->rdllist)) {
-               ep_pm_stay_awake_rcu(epi);
+       /* Otherwise take usual path and add event to ready list */
+       else if (!ep_is_linked(epi)) {
+               if (list_add_tail_lockless(&epi->rdllink, &ep->rdllist))
+                       ep_pm_stay_awake_rcu(epi);
         }


I also moved 'epi->next == EP_UNACTIVE_PTR' check directly
to the chain_epi_lockless, where it should be.

This is minor, of course, you are free to keep it as is.

Reviewed-by: Roman Penyaev <rpenyaev@suse.de>

--
Roman


On 2020-04-24 04:50, Khazhismel Kumykov wrote:
> In the event that we add to ovflist, before 339ddb53d373 we would be
> woken up by ep_scan_ready_list, and did no wakeup in ep_poll_callback.
> With that wakeup removed, if we add to ovflist here, we may never wake
> up. Rather than adding back the ep_scan_ready_list wakeup - which was
> resulting un uncessary wakeups, trigger a wake-up in ep_poll_callback.
> 
> We noticed that one of our workloads was missing wakeups starting with
> 339ddb53d373 and upon manual inspection, this wakeup seemed missing to
> me. With this patch added, we no longer see missing wakeups. I haven't
> yet tried to make a small reproducer, but the existing kselftests in
> filesystem/epoll passed for me with this patch.
> 
> Fixes: 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested 
> epoll")
> Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
> ---
>  fs/eventpoll.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index 8c596641a72b..40cc89559cf6 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -1240,7 +1240,7 @@ static int ep_poll_callback(wait_queue_entry_t
> *wait, unsigned mode, int sync, v
>  		if (epi->next == EP_UNACTIVE_PTR &&
>  		    chain_epi_lockless(epi))
>  			ep_pm_stay_awake_rcu(epi);
> -		goto out_unlock;
> +		goto out_wakeup_unlock;
>  	}
> 
>  	/* If this file is already in the ready list we exit soon */
> @@ -1249,6 +1249,7 @@ static int ep_poll_callback(wait_queue_entry_t
> *wait, unsigned mode, int sync, v
>  		ep_pm_stay_awake_rcu(epi);
>  	}
> 
> +out_wakeup_unlock:
>  	/*
>  	 * Wake up ( if active ) both the eventpoll wait list and the 
> ->poll()
>  	 * wait list.


  parent reply	other threads:[~2020-04-24 10:11 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-24  2:50 [PATCH] eventpoll: fix missing wakeup for ovflist in ep_poll_callback Khazhismel Kumykov
2020-04-24  3:31 ` Andrew Morton
2020-04-24 10:11 ` Roman Penyaev [this message]
2020-04-24 18:50   ` Khazhismel Kumykov
2020-04-24 19:00 ` [PATCH v2] " Khazhismel Kumykov
2020-04-25 16:17   ` Jason Baron
2020-04-25 20:59     ` Khazhismel Kumykov
2020-04-27 20:38       ` Jason Baron
2020-04-28 18:10         ` Roman Penyaev
2020-04-29  4:12           ` Jason Baron
2020-04-29 14:38             ` Roman Penyaev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2bd5fcb37337dd7248a5cb245bf8dde9@suse.de \
    --to=rpenyaev@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=khazhy@google.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=r@hev.cc \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).