From: Nicholas Piggin <npiggin@gmail.com>
To: Don Zickus <dzickus@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
LKML <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Borislav Petkov <bp@alien8.de>,
Sebastian Siewior <bigeasy@linutronix.de>,
Chris Metcalf <cmetcalf@mellanox.com>,
Ulrich Obergfell <uobergfe@redhat.com>,
Michael Ellerman <mpe@ellerman.id.au>
Subject: Re: [patch 00/29] lockup_detector: Cure hotplug deadlocks and replace duct tape
Date: Fri, 1 Sep 2017 14:42:29 +1000
Message-ID: <20170901144229.3791e5c9@roar.ozlabs.ibm.com>
In-Reply-To: <20170831221014.b3tlpk3p6zxwclmy@redhat.com>
On Thu, 31 Aug 2017 18:10:14 -0400
Don Zickus <dzickus@redhat.com> wrote:
> On Thu, Aug 31, 2017 at 09:15:58AM +0200, Thomas Gleixner wrote:
> > The lockup detector is broken in several ways:
> >
> > - It's deadlock prone vs. CPU hotplug in various ways. Some of these
> > are due to recursive cpus_read_lock() others are due to
> > cpus_read_lock() from CPU hotplug callbacks which immediately lock
> > the machine because cpus are write locked.
> >
> > - The handling of the cpu hotplug threads happens sideways to the
> > smpboot thread infrastructure, which is racy and pointless
> >
> > - The handling of the user space sysctl interface is a complete
> > trainwreck as it fiddles directly with variables which can be
> > modified or evaluated by the running watchdogs.
> >
> > - The perf event initialization is a steaming pile of duct tape as it
> > idiotically tries to create perf events over and over even if perf is
> > not functional (no hardware, ....). To avoid excessive dmesg spam it
> > contains magic printk ratelimiting along with either wrong or useless
> > messages.
> >
> > - The code structure is horrible as ifdef sections are scattered all
> > over the place which makes it unreadable
> >
> > - There is more wreckage, but see the changelogs for the ugly details.
> >
> > Before I get utterly grumpy, I just pretend that I don't give a sh*t!
> >
> > The following series sanitizes the facility and addresses the problems.
>
> Hi Thomas,
>
> Thanks for the patchset. I agree with most of the issues you complained
> about, I just wasn't smart enough to figure out the right way to solve them.
> Despite your aggressive comments, I will review the code to see if it covers
> the scenarios that have popped up over the years and run some testing on my
> side. Probably need a few days to do that.

The powerpc bits look fine, and there are no real changes pending there,
so just take them through your tree if you like.

I had a glance through the series, no comments yet. The powerpc watchdog
already duplicates the proc tunables rather than using them directly, so
in theory it did not need the two-stage reconfigure. In practice, it has a
brown paper bag bug because it does not stop the watchdog before changing
its internal variables :P Two-stage is probably the safer and clearer way
to go though.
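
To make the two-stage idea concrete, here is a minimal user-space sketch
in C. It is not the kernel or powerpc code: every name in it
(proc_watchdog_thresh, wd_thresh, wd_stop, wd_start, wd_reconfigure) is
hypothetical and chosen only for illustration. It assumes the watchdog
keeps a private copy of the user-visible tunable and refreshes that copy
only while the watchdog is stopped.

```c
#include <assert.h>
#include <stdbool.h>

/* User-visible tunable, standing in for a proc/sysctl value (hypothetical). */
static int proc_watchdog_thresh = 10;

/* Private copy that the running watchdog reads (hypothetical). */
static int wd_thresh;
static bool wd_running;

/* Stage 1: stop the watchdog so nothing evaluates wd_thresh. */
static void wd_stop(void)
{
	wd_running = false;
}

/* Stage 2: refresh the private copy, then start the watchdog again. */
static void wd_start(void)
{
	/* The private copy must only change while the watchdog is stopped. */
	assert(!wd_running);
	wd_thresh = proc_watchdog_thresh;
	wd_running = true;
}

/* Apply a new tunable value using the stop/update/start sequence. */
static int wd_reconfigure(int new_thresh)
{
	wd_stop();
	proc_watchdog_thresh = new_thresh;
	wd_start();
	return wd_thresh;
}
```

Calling wd_start() without a preceding wd_stop() trips the assertion in
this sketch, which is the ordering mistake described above: the internal
variables change while the watchdog is still running.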
Thanks,
Nick
Thread overview: 47+ messages
2017-08-31 7:15 [patch 00/29] lockup_detector: Cure hotplug deadlocks and replace duct tape Thomas Gleixner
2017-08-31 7:15 ` [patch 01/29] hardlockup_detector: Provide interface to stop/restart perf events Thomas Gleixner
2017-09-06 16:14 ` Borislav Petkov
2017-08-31 7:16 ` [patch 02/29] perf/x86/intel: Sanitize PMU HT bug workaround Thomas Gleixner
2017-08-31 7:16 ` [patch 03/29] lockup_detector: Provide interface to stop from poweroff() Thomas Gleixner
2017-08-31 7:16 ` [patch 04/29] parisc: Use lockup_detector_stop() Thomas Gleixner
2017-08-31 7:16 ` [patch 05/29] lockup_detector: Remove broken suspend/resume interfaces Thomas Gleixner
2017-08-31 7:16 ` [patch 06/29] lockup_detector: Rework cpu hotplug locking Thomas Gleixner
2017-08-31 7:16 ` [patch 07/29] lockup_detector: Rename watchdog_proc_mutex Thomas Gleixner
2017-08-31 7:16 ` [patch 08/29] lockup_detector: Mark hardlockup_detector_disable() __init Thomas Gleixner
2017-08-31 7:16 ` [patch 09/29] lockup_detector/perf: Remove broken self disable on failure Thomas Gleixner
2017-08-31 7:16 ` [patch 10/29] lockup_detector/perf: Prevent cpu hotplug deadlock Thomas Gleixner
2017-09-01 19:02 ` Don Zickus
2017-09-01 19:29 ` Thomas Gleixner
2017-09-05 14:51 ` Don Zickus
2017-08-31 7:16 ` [patch 11/29] lockup_detector: Remove park_in_progress hackery Thomas Gleixner
[not found] ` <CAEeg4=CJohPTi8FUNWqb3egsbZnExyJapcNC7wD-2amXTsMrYw@mail.gmail.com>
2017-09-04 12:10 ` Peter Zijlstra
2017-09-05 15:15 ` Don Zickus
2017-09-05 15:42 ` Thomas Gleixner
2017-09-05 13:58 ` Thomas Gleixner
2017-09-05 19:19 ` [patch V2 11/29] lockup_detector: Remove park_in_progress obfuscation Thomas Gleixner
2017-09-14 10:43 ` [tip:core/urgent] watchdog/core: Remove the " tip-bot for Thomas Gleixner
2017-08-31 7:16 ` [patch 12/29] lockup_detector: Cleanup stub functions Thomas Gleixner
2017-08-31 7:16 ` [patch 13/29] lockup_detector: Cleanup the ifdef maze Thomas Gleixner
2017-08-31 7:16 ` [patch 14/29] lockup_detector: Split out cpumask write function Thomas Gleixner
2017-08-31 7:16 ` [patch 15/29] smpboot/threads: Avoid runtime allocation Thomas Gleixner
2017-08-31 7:16 ` [patch 16/29] lockup_detector: Create new thread handling infrastructure Thomas Gleixner
2017-08-31 7:16 ` [patch 17/29] lockup_detector: Get rid of the thread teardown/setup dance Thomas Gleixner
2017-09-01 19:08 ` Don Zickus
2017-09-01 19:45 ` Thomas Gleixner
2017-08-31 7:16 ` [patch 18/29] lockup_detector: Further simplify sysctl handling Thomas Gleixner
2017-08-31 7:16 ` [patch 19/29] lockup_detector: Cleanup header mess Thomas Gleixner
2017-08-31 7:16 ` [patch 20/29] lockup_detector/sysctl: Get rid of the ifdeffery Thomas Gleixner
2017-08-31 7:16 ` [patch 21/29] lockup_detector: Cleanup sysctl variable name space Thomas Gleixner
2017-08-31 7:16 ` [patch 22/29] lockup_detector: Make watchdog_nmi_reconfigure() two stage Thomas Gleixner
2017-08-31 7:16 ` [patch 23/29] lockup_detector: Get rid of the racy update loop Thomas Gleixner
2017-08-31 7:16 ` [patch 24/29] lockup_detector/perf: Implement init time perf validation Thomas Gleixner
2017-09-07 15:58 ` Don Zickus
2017-08-31 7:16 ` [patch 25/29] lockup_detector: Implement init time detection of perf Thomas Gleixner
2017-08-31 7:16 ` [patch 26/29] lockup_detector/perf: Implement CPU enable replacement Thomas Gleixner
2017-08-31 7:16 ` [patch 27/29] lockup_detector: Use new perf CPU enable mechanism Thomas Gleixner
2017-08-31 7:16 ` [patch 28/29] lockup_detector/perf: Simplify deferred event destroy Thomas Gleixner
2017-08-31 7:16 ` [patch 29/29] lockup_detector: Cleanup hotplug locking mess Thomas Gleixner
2017-08-31 22:10 ` [patch 00/29] lockup_detector: Cure hotplug deadlocks and replace duct tape Don Zickus
2017-09-01 4:42 ` Nicholas Piggin [this message]
2017-09-01 9:18 ` Thomas Gleixner
2017-09-07 16:04 ` Don Zickus