LKML Archive on lore.kernel.org
From: Vlastimil Babka <vbabka@suse.cz>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	tglx@linutronix.de, "Steven J . Hill" <steven.hill@cavium.com>,
	Tejun Heo <htejun@gmail.com>, Christoph Lameter <cl@linux.com>
Subject: Re: [PATCH REPOST] Revert mm/vmstat.c: fix vmstat_update() preemption BUG
Date: Thu, 10 May 2018 08:32:50 +0200
Message-ID: <524ecef9-e513-fec4-1178-ac1a87452e57@suse.cz>
In-Reply-To: <20180509223539.43aznhri72ephluc@linutronix.de>

On 05/10/2018 12:35 AM, Sebastian Andrzej Siewior wrote:
> On 2018-05-08 16:02:57 [-0700], Andrew Morton wrote:
>> On Mon, 7 May 2018 09:31:05 +0200 Vlastimil Babka <vbabka@suse.cz> wrote:
>>
>>> In any case I agree that the revert should be done immediately even
>>> before fixing the underlying bug. The preempt_disable/enable doesn't
>>> prevent the bug, it only prevents the debugging code from actually
>>> reporting it! Note that it's debugging code (CONFIG_DEBUG_PREEMPT) that
>>> production kernels most likely don't have enabled, so we are not even
>>> helping them not crash (while allowing possible data corruption).
>>
>> Grumble.
>>
>> I don't see much benefit in emitting warnings into end-users' logs for
>> bugs which we already know about.
> 
> not end-users (not to mention that neither Debian Stretch nor F28 has
> preemption enabled in their kernels). And if so, they may provide
> additional information for someone to fix the bug in the end. I wasn't

Even if end-users have enabled preemption, they likely won't have
enabled CONFIG_DEBUG_PREEMPT anyway.

> able to reproduce the bug but I don't have access to anything MIPSish
> where I can boot my own kernels. At least two people were looking at the
> code after I posted the revert and nobody spotted the bug.
> 
>> The only thing this buys us is that people will hassle us if we forget
>> to fix the bug, and how pathetic is that?  I mean, we may as well put
>>
>> 	printk("don't forget to fix the vmstat_update() bug!\n");
> 
> No, that is different. That would be seen by everyone. The bug was only
> reported by Steven J. Hill, who has not responded since. This message
> would also imply that we know how to fix the bug but haven't done it yet,
> which is not the case. We have seen that something was wrong but have no
> idea *how* it got there.
> 
> The preempt_disable() was added at the end of the v4.16 cycle. The
> smp_processor_id() in vmstat_update() was added in commit 7cc36bbddde5
> ("vmstat: on-demand vmstat workers V8") which was in v3.18-rc1. The
> hotplug rework took place in v4.10-rc1. And it took (counting from the
> hotplug rework) 6 kernel releases for someone to trigger that warning
> _if_ this was related to the hotplug rework.
> 
> What we have *now* is way worse: We have a possible bug that triggered
> the warning. As we see in the report, the code in question was _already_
> invoked on the wrong CPU. The preempt_disable() just silences the
> warning, hiding the real issue, so nobody will do a thing about it since
> it will never be reported again (in a kernel with preemption and debug
> enabled).

Fully agree with everything you said!
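
The point that preempt_disable() only hides the report can be
illustrated with a small userspace model. This is a sketch under
assumptions, not kernel code: every sim_* name is hypothetical, and the
check is only loosely modeled on what CONFIG_DEBUG_PREEMPT does for
smp_processor_id() (warn when the caller is preemptible and not pinned
to a single CPU).

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct sim_task {
	int preempt_count;	/* > 0 means preemption is disabled */
	int current_cpu;	/* CPU the task is running on right now */
	int bound_cpu;		/* CPU the work item was queued for */
	bool pinned;		/* affinity mask allows exactly one CPU */
	int warnings;		/* number of debug warnings emitted */
};

static void sim_preempt_disable(struct sim_task *t) { t->preempt_count++; }
static void sim_preempt_enable(struct sim_task *t) { t->preempt_count--; }

/* Models the debug check: warn when the caller could still migrate. */
static int sim_smp_processor_id(struct sim_task *t)
{
	if (t->preempt_count == 0 && !t->pinned) {
		t->warnings++;
		fprintf(stderr,
			"BUG: using smp_processor_id() in preemptible code\n");
	}
	return t->current_cpu;
}

/* Returns true only when the work ran on the CPU it was queued for. */
static bool sim_vmstat_update(struct sim_task *t, bool with_preempt_disable)
{
	int cpu;

	if (with_preempt_disable)
		sim_preempt_disable(t);
	cpu = sim_smp_processor_id(t);
	if (with_preempt_disable)
		sim_preempt_enable(t);

	/* Disabling preemption cannot undo a wrong placement. */
	return cpu == t->bound_cpu;
}
```

For a task that runs on CPU 1 while its work was queued for CPU 0,
sim_vmstat_update(&t, false) bumps t.warnings and returns false. With
true it still returns false but leaves t.warnings untouched: the work
ran on the wrong CPU both times, only the report went away.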

We could extend the warning to print e.g. the affinity mask of the thread
and the state of CPUs that are subject to ongoing hotplug/hotremove.
But maybe it's not so useful in general, as the common case is likely
indeed a missing preempt_disable(), and this is an exception? In any case,
I would hope that Steven applies some patch locally and we get more
details about what's going on on that MIPS machine.
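
As a rough idea of what such an extended report could carry, here is a
userspace sketch. It is not a kernel patch: the sim_* names and the
message layout are made up for illustration, CPU masks are modeled as
plain bitmasks (bit N = CPU N), and cpu is assumed to be smaller than
the number of bits in an unsigned long.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model: number of CPUs the task is allowed to run on. */
static int sim_allowed_cpus(unsigned long allowed)
{
	int n = 0;

	/* Clear the lowest set bit until the mask is empty. */
	for (; allowed; allowed &= allowed - 1)
		n++;
	return n;
}

/*
 * Formats the extended warning into buf and returns the snprintf()
 * result. A mask with more than one CPU means the thread was not
 * pinned; a set hotplug bit for the current CPU hints at a race with
 * hotplug/hotremove.
 */
static int sim_format_report(char *buf, size_t len, int cpu,
			     unsigned long allowed, unsigned long in_hotplug)
{
	return snprintf(buf, len,
			"BUG: smp_processor_id() in preemptible code: "
			"cpu=%d allowed=0x%lx (%d CPUs) hotplug=0x%lx%s",
			cpu, allowed, sim_allowed_cpus(allowed), in_hotplug,
			((in_hotplug >> cpu) & 1UL) ?
				" [current CPU in hotplug]" : "");
}
```

A report showing an allowed mask with several CPUs would point at lost
affinity rather than at a missing preempt_disable().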


Thread overview: 8+ messages
2018-05-04 10:44 Sebastian Andrzej Siewior
2018-05-07  7:31 ` Vlastimil Babka
2018-05-08 23:02   ` Andrew Morton
2018-05-09 22:35     ` Sebastian Andrzej Siewior
2018-05-10  6:32       ` Vlastimil Babka [this message]
2018-06-13 21:46         ` Thomas Gleixner
2018-06-14 21:27           ` Andrew Morton
2018-06-27 19:40             ` Steven Rostedt
