All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>, Michal Hocko <mhocko@kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Michal Hocko <mhocko@suse.com>
Subject: Re: [RFC PATCH] mm: silence soft lockups from unlock_page
Date: Thu, 23 Jul 2020 14:47:50 +0200	[thread overview]
Message-ID: <20200723124749.GA7428@redhat.com> (raw)
In-Reply-To: <CAHk-=whEjnsANEhTA3aqpNLZ3vv7huP7QAmcAEd-GUxm2YMo-Q@mail.gmail.com>

On 07/22, Linus Torvalds wrote:
>
> Comments? Oleg, this should fix the race you talked about too.

Yes.

I still can't convince myself thatI fully understand this patch but I see
nothing really wrong after a quick glance...

> +	 * We can no longer use 'wait' after we've done the
> +	 * list_del_init(&wait->entry),

Yes, but see below,

> +	 * the target may decide it's all done with no
> +	 * other locking, and 'wait' has been allocated on
> +	 * the stack of the target.
>  	 */
> -	if (test_bit(key->bit_nr, &key->page->flags))
> -		return -1;
> +	target = wait->private;
> +	smp_mb();
>
> -	return autoremove_wake_function(wait, mode, sync, key);
> +	/*
> +	 * Ok, we have successfully done what we're waiting for.
> +	 *
> +	 * Now unconditionally remove the wait entry, so that the
> +	 * waiter can use that to see success or not.
> +	 *
> +	 * We _really_ should have a "list_del_init_careful()"
> +	 * to properly pair with an unlocked "list_empty_careful()".
> +	 */
> +	list_del_init(&wait->entry);
> +
> +	/*
> +	 * Theres's another memory barrier in the wakup path, that
> +	 * makes sure the wakup happens after the above is visible
> +	 * to the target.
> +	 */
> +	wake_up_state(target, mode);

We can no longer use 'target'. If it was already woken up it can notice
list_empty_careful(), return without taking q->lock, and exit.

Of course, this is purely theoretical... rcu_read_lock() should help
but perhaps we can avoid it somehow?

Say, can't we abuse WQ_FLAG_WOKEN?

	wake_page_function:
		wait->flags |= WQ_FLAG_WOKEN;
		wmb();
		autoremove_wake_function(...);

	wait_on_page_bit_common:

		for (;;) {
			set_current_state();
			if (wait.flags & WQ_FLAG_WOKEN)
				break;
			schedule();
		}

		finish_wait();

		rmb();
		return wait.flags & WQ_FLAG_WOKEN ? 0 : -EINTR;

Another (cosmetic) problem is that wake_up_state(mode) looks confusing.
It is correct but only because we know that mode == TASK_NORMAL and thus
wake_up_state() can'fail if the target is still blocked.

> +	spin_lock_irq(&q->lock);
> +	SetPageWaiters(page);
> +	if (!trylock_page_bit_common(page, bit_nr, behavior))
> +		__add_wait_queue_entry_tail(q, wait);

do we need SetPageWaiters() if trylock() succeeds ?

Oleg.


  parent reply	other threads:[~2020-07-23 12:48 UTC|newest]

Thread overview: 91+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-21  6:32 [RFC PATCH] mm: silence soft lockups from unlock_page Michal Hocko
2020-07-21 11:10 ` Qian Cai
2020-07-21 11:25   ` Michal Hocko
2020-07-21 11:44     ` Qian Cai
2020-07-21 12:17       ` Michal Hocko
2020-07-21 13:23         ` Qian Cai
2020-07-21 13:38           ` Michal Hocko
2020-07-21 14:15             ` Qian Cai
2020-07-21 14:17 ` Chris Down
2020-07-21 15:00   ` Michal Hocko
2020-07-21 15:33 ` Linus Torvalds
2020-07-21 15:33   ` Linus Torvalds
2020-07-21 15:49   ` Michal Hocko
2020-07-22 18:29   ` Linus Torvalds
2020-07-22 18:29     ` Linus Torvalds
2020-07-22 21:29     ` Hugh Dickins
2020-07-22 21:29       ` Hugh Dickins
2020-07-22 22:10       ` Linus Torvalds
2020-07-22 22:10         ` Linus Torvalds
2020-07-22 23:42         ` Linus Torvalds
2020-07-22 23:42           ` Linus Torvalds
2020-07-23  0:23           ` Linus Torvalds
2020-07-23  0:23             ` Linus Torvalds
2020-07-23 12:47           ` Oleg Nesterov [this message]
2020-07-23 17:32             ` Linus Torvalds
2020-07-23 17:32               ` Linus Torvalds
2020-07-23 18:01               ` Oleg Nesterov
2020-07-23 18:22                 ` Linus Torvalds
2020-07-23 18:22                   ` Linus Torvalds
2020-07-23 19:03                   ` Linus Torvalds
2020-07-23 19:03                     ` Linus Torvalds
2020-07-24 14:45                     ` Oleg Nesterov
2020-07-23 20:03               ` Linus Torvalds
2020-07-23 20:03                 ` Linus Torvalds
2020-07-23 23:11                 ` Hugh Dickins
2020-07-23 23:11                   ` Hugh Dickins
2020-07-23 23:43                   ` Linus Torvalds
2020-07-23 23:43                     ` Linus Torvalds
2020-07-24  0:07                     ` Hugh Dickins
2020-07-24  0:07                       ` Hugh Dickins
2020-07-24  0:46                       ` Linus Torvalds
2020-07-24  0:46                         ` Linus Torvalds
2020-07-24  3:45                         ` Hugh Dickins
2020-07-24  3:45                           ` Hugh Dickins
2020-07-24 15:24                     ` Oleg Nesterov
2020-07-24 17:32                       ` Linus Torvalds
2020-07-24 17:32                         ` Linus Torvalds
2020-07-24 23:25                         ` Linus Torvalds
2020-07-24 23:25                           ` Linus Torvalds
2020-07-25  2:08                           ` Hugh Dickins
2020-07-25  2:08                             ` Hugh Dickins
2020-07-25  2:46                             ` Linus Torvalds
2020-07-25  2:46                               ` Linus Torvalds
2020-07-25 10:14                           ` Oleg Nesterov
2020-07-25 18:48                             ` Linus Torvalds
2020-07-25 18:48                               ` Linus Torvalds
2020-07-25 19:27                               ` Oleg Nesterov
2020-07-25 19:51                                 ` Linus Torvalds
2020-07-25 19:51                                   ` Linus Torvalds
2020-07-26 13:57                                   ` Oleg Nesterov
2020-07-25 21:19                               ` Hugh Dickins
2020-07-25 21:19                                 ` Hugh Dickins
2020-07-26  4:22                                 ` Hugh Dickins
2020-07-26  4:22                                   ` Hugh Dickins
2020-07-26 20:30                                   ` Hugh Dickins
2020-07-26 20:30                                     ` Hugh Dickins
2020-07-26 20:41                                     ` Linus Torvalds
2020-07-26 20:41                                       ` Linus Torvalds
2020-07-26 22:09                                       ` Hugh Dickins
2020-07-26 22:09                                         ` Hugh Dickins
2020-07-27 19:35                                     ` Greg KH
2020-08-06  5:46                                       ` Hugh Dickins
2020-08-06  5:46                                         ` Hugh Dickins
2020-08-18 13:50                                         ` Greg KH
2020-08-06  5:21                                     ` Hugh Dickins
2020-08-06  5:21                                       ` Hugh Dickins
2020-08-06 17:07                                       ` Linus Torvalds
2020-08-06 17:07                                         ` Linus Torvalds
2020-08-06 18:00                                         ` Matthew Wilcox
2020-08-06 18:32                                           ` Linus Torvalds
2020-08-06 18:32                                             ` Linus Torvalds
2020-08-07 18:41                                             ` Hugh Dickins
2020-08-07 18:41                                               ` Hugh Dickins
2020-08-07 19:07                                               ` Linus Torvalds
2020-08-07 19:07                                                 ` Linus Torvalds
2020-08-07 19:35                                               ` Matthew Wilcox
2020-08-03 13:14                           ` Michal Hocko
2020-08-03 17:56                             ` Linus Torvalds
2020-08-03 17:56                               ` Linus Torvalds
2020-07-25  9:39                         ` Oleg Nesterov
2020-07-23  8:03     ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200723124749.GA7428@redhat.com \
    --to=oleg@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mhocko@suse.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.