From: Linus Torvalds <torvalds@linux-foundation.org>
To: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>, Christoph Lameter <cl@linux.com>,
	Linux-MM <linux-mm@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [GIT PULL] percpu fixes for v5.14-rc1
Date: Wed, 7 Jul 2021 11:40:45 -0700
Message-ID: <CAHk-=wgMnammhbrevngFKwP31k9fO2npok26XnVCR0B3HJOUqQ@mail.gmail.com>
In-Reply-To: <YOWld9O5CZpzOUKA@google.com>

On Wed, Jul 7, 2021 at 6:00 AM Dennis Zhou <dennis@kernel.org> wrote:
>
> This is just a single change to fix percpu depopulation. The code relied
> on depopulation code written specifically for the free path and relied
> on vmalloc to do the tlb flush lazily. As we're modifying the backing
> pages during the lifetime of a chunk, we need to also flush the tlb
> accordingly.

I pulled this, but I ended up unpulling after looking at the fix.

The fix may be perfectly correct, but I'm looking at that
pcpu_reclaim_populated() function, and I want somebody to explain to
me why it's ok to drop and re-take the 'pcpu_lock' and just continue.

Because whatever it was protecting is now not protected any more.

It *looks* like it's intended to protect the pcpu_chunk_lists[]
content, and some other functions that do this look ok. So for
example, pcpu_balance_free() at least removes the 'chunk' from the
pcpu_chunk_lists[] before it drops the lock and then works on the
chunk contents.

But pcpu_reclaim_populated() seems to *leave* chunk on the
pcpu_chunk_lists[], drop the lock, and then continue to use 'chunk'.
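
To make the contrast between those two patterns concrete, here is a
minimal userspace sketch. It is not the kernel's percpu code: 'struct
chunk', 'chunk_list', 'list_lock' and both function names are
hypothetical stand-ins, and a pthread mutex stands in for 'pcpu_lock'.

```c
/* Illustrative userspace sketch only, not the kernel's percpu code.
 * struct chunk, chunk_list, list_lock and both functions are hypothetical. */
#include <pthread.h>
#include <stddef.h>

struct chunk {
	struct chunk *next;
	int populated_pages;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct chunk *chunk_list;	/* protected by list_lock */

/* Pattern A: detach the chunk under the lock, then work on it privately. */
static struct chunk *detach_then_work(void)
{
	struct chunk *c;

	pthread_mutex_lock(&list_lock);
	c = chunk_list;
	if (c) {
		chunk_list = c->next;	/* no longer reachable via the list */
		c->next = NULL;
	}
	pthread_mutex_unlock(&list_lock);

	if (c)
		c->populated_pages = 0;	/* nobody can find c through the list */
	return c;
}

/* Pattern B: leave the chunk on the list, drop the lock, keep using it. */
static struct chunk *peek_then_work(void)
{
	struct chunk *c;

	pthread_mutex_lock(&list_lock);
	c = chunk_list;			/* c stays on the shared list */
	pthread_mutex_unlock(&list_lock);

	/* Unprotected window: a thread that takes list_lock can still
	 * reach c through the list and move, modify or free it. */
	if (c)
		c->populated_pages = 0;

	pthread_mutex_lock(&list_lock);
	/* Anything assumed about c or the list before the unlock would
	 * have to be revalidated here before continuing. */
	pthread_mutex_unlock(&list_lock);
	return c;
}
```

In a single thread both functions behave the same; the difference only
shows up under concurrency, where pattern B needs some other guarantee
(for example a flag or reference that keeps other paths away from the
chunk) to be safe.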

That odd "release lock and continue to use the data it's supposed to
protect" seems to be pre-existing, but

 (a) this is the code that caused problems to begin with

and

 (b) it seems to now happen even more.

So maybe this code is right. But it looks very odd to me, and I'd like
to get more explanations of _why_ it would be ok before I pull this
fix, since there seems to be a deeper underlying problem in the code
that this tries to fix.

                 Linus
