linux-kernel.vger.kernel.org archive mirror
From: Greg KH <gregkh@suse.de>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Kay Sievers <kay.sievers@vrfy.org>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>,
	Ming Lei <tom.leiming@gmail.com>,
	Djalal Harouni <tixxdz@opendz.org>,
	Borislav Petkov <borislav.petkov@amd.com>,
	Tony Luck <tony.luck@intel.com>,
	Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>,
	Ingo Molnar <mingo@elte.hu>, Andi Kleen <ak@linux.intel.com>,
	linux-kernel@vger.kernel.org, Kay Sievers <kay.sievers@vrfy.org>,
	gouders@et.bocholt.fh-gelsenkirchen.de,
	Marcos Souza <marcos.mage@gmail.com>,
	Linux PM mailing list <linux-pm@vger.kernel.org>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	prasad@linux.vnet.ibm.com, justinmattock@gmail.com,
	Jeff Chua <jeff.chua.linux@gmail.com>,
	Suresh B Siddha <suresh.b.siddha@intel.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Mel Gorman <mgorman@suse.de>,
	Gilad Ben-Yossef <gilad@benyossef.com>
Subject: Re: x86/mce: machine check warning during poweroff
Date: Mon, 16 Jan 2012 10:11:35 -0800
Message-ID: <20120116181135.GA2680@suse.de>
In-Reply-To: <20120114144938.GA32033@suse.de>

On Sat, Jan 14, 2012 at 06:49:38AM -0800, Greg KH wrote:
> On Fri, Jan 13, 2012 at 06:53:04PM -0800, Linus Torvalds wrote:
> > On Fri, Jan 13, 2012 at 6:41 PM, Srivatsa S. Bhat
> > <srivatsa.bhat@linux.vnet.ibm.com> wrote:
> > >
> > > YES!! Finally I have a fix for this whole MCE thing! :-)
> > 
> > Goodie.
> > 
> > > The patch below works perfectly for me - I tested multiple CPU hotplug
> > > operations as well as multiple pm_test runs at core level. Please let me
> > > know if this solves the suspend issue as well..
> > 
> > Ok, I'll try, and I bet it does.
> > 
> > HOWEVER.
> > 
> > I'd be a whole lot happier knowing exactly which field in "struct
> > device" that needed to be NULL before it gets registered.
> > 
> > I don't like how
> > 
> >   device_register() + device_create_file(dev)..
> > 
> > is not sufficiently undone by
> > 
> >  .. device_remove_file(dev) +  device_unregister()
> > 
> > so that it can't be repeated. Exactly *what* state is stale and
> > re-used incorrectly if you do that device_register() a second time.
> > 
> > It smells like a misfeature of the device core handling.
> 
> It has to do with the fact that this is a "static" device that is being
> reused.  Normally it would be cleaned up properly in the release
> function, but as there isn't one, some fields are being left in a bad
> state.

Kay, I looked at this this morning, and it comes down to the line:

DEFINE_PER_CPU(struct device, mce_device);

This creates static struct device variables.  I'm guessing this is just
done for "convenience", as we really don't care where in memory these
structures are; we just want to make sure we have enough of them around
(this is the way all the other mce per-cpu structures are handled).
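
To make the reuse problem concrete, here is a rough userspace model, not
kernel code.  Everything in it (toy_device, toy_register(),
toy_unregister(), reuse_static()) is hypothetical and only mimics the
shape of the issue: a static object has no release step, so whatever
state registration left behind is still there on the next registration
unless someone clears it first.

```c
/* Userspace model only: toy_device and its helpers are hypothetical,
 * not the kernel's struct device or the driver-core API. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct toy_device {
	bool initialized;	/* set on register; nothing clears it later */
	int id;
};

/* Refuse to register an object that still carries state from a prior use. */
static int toy_register(struct toy_device *dev, int id)
{
	if (dev->initialized)
		return -1;	/* stale state detected */
	dev->initialized = true;
	dev->id = id;
	return 0;
}

/* With no release callback, nothing resets the object on unregister. */
static void toy_unregister(struct toy_device *dev)
{
	(void)dev;
}

static struct toy_device static_dev;	/* stands in for one per-cpu slot */

/* Register, unregister, then register again; return the second result. */
static int reuse_static(bool clear_first)
{
	int rc;

	memset(&static_dev, 0, sizeof(static_dev));
	rc = toy_register(&static_dev, 0);
	assert(rc == 0);
	(void)rc;
	toy_unregister(&static_dev);

	if (clear_first)
		memset(&static_dev, 0, sizeof(static_dev));
	return toy_register(&static_dev, 0);
}
```

In this model reuse_static(false) fails on the second registration and
reuse_static(true) succeeds.  Which fields of the real struct device
play the role of "initialized" is exactly the open question in this
thread; the model does not answer it.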

I couldn't figure out a "simple" way to create a variable per cpu here,
dynamically.  I tried doing something like:
	struct device *mce_device[CONFIG_NR_CPUS];
and dynamically creating the devices and cleaning them up when they go
away, setting the array value to NULL when they are unregistered and
letting them clean up in the release function.  But does that race with
creating the device again?

It seems that this would work, but I'm probably missing something
obvious here, any ideas?
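
A minimal userspace sketch of that pointer-array idea, under stated
assumptions: dyn_device, dyn_create(), dyn_remove(), dyn_get(),
dyn_put() and DYN_NR_CPUS are all made-up names, the reference count is
a plain int, and there is no locking, so this says nothing about real
concurrency.  It only shows that a fresh allocation per registration
cannot inherit stale state, even while an old object is still waiting
for its last reference to be dropped.

```c
/* Userspace model only: all names here are hypothetical stand-ins. */
#include <assert.h>
#include <stdlib.h>

#define DYN_NR_CPUS 4	/* stand-in for CONFIG_NR_CPUS */

struct dyn_device {
	int refcount;
	unsigned int cpu;
};

static struct dyn_device *dyn_mce_device[DYN_NR_CPUS];
static int dyn_live_objects;	/* allocations not yet released */

/* Release hook: runs only when the last reference goes away. */
static void dyn_release(struct dyn_device *dev)
{
	dyn_live_objects--;
	free(dev);
}

static struct dyn_device *dyn_get(struct dyn_device *dev)
{
	dev->refcount++;
	return dev;
}

static void dyn_put(struct dyn_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		dyn_release(dev);
}

/* Allocate a zeroed object for this cpu; fail if the slot is in use. */
static int dyn_create(unsigned int cpu)
{
	struct dyn_device *dev;

	if (cpu >= DYN_NR_CPUS || dyn_mce_device[cpu])
		return -1;
	dev = calloc(1, sizeof(*dev));
	if (!dev)
		return -1;
	dev->refcount = 1;
	dev->cpu = cpu;
	dyn_live_objects++;
	dyn_mce_device[cpu] = dev;
	return 0;
}

/* Clear the slot first, then drop the array's reference. */
static void dyn_remove(unsigned int cpu)
{
	struct dyn_device *dev;

	if (cpu >= DYN_NR_CPUS || !dyn_mce_device[cpu])
		return;
	dev = dyn_mce_device[cpu];
	dyn_mce_device[cpu] = NULL;
	dyn_put(dev);
}
```

If some other holder still has a reference when dyn_remove() runs, the
old object stays alive until that holder calls dyn_put(), while a new
dyn_create() for the same cpu gets its own separate object.  Whether the
real driver core has other ways to collide here (sysfs names, for
instance) is not covered by this sketch.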

The "correct" way to fix this up would be to have a per-cpu structure
for all of the different mce things that are created in this driver
(struct device, struct mce, exception counts, work queues, polling
banks, etc.), but that seems pretty messy, and I imagine some of these
want to stay as-is for performance reasons.  As I don't know this
code at all, I'm a bit leery of making that kind of change.

thanks,

greg k-h

Thread overview: 49+ messages
2012-01-11  0:00 x86/mce: machine check warning during poweroff Djalal Harouni
2012-01-12 14:22 ` Ming Lei
2012-01-13 20:22   ` Srivatsa S. Bhat
2012-01-13 20:34     ` Justin P. Mattock
2012-01-13 20:37     ` Linus Torvalds
2012-01-13 20:53       ` Srivatsa S. Bhat
2012-01-13 21:08         ` Linus Torvalds
2012-01-13 21:24           ` Andi Kleen
2012-01-13 21:38             ` Justin P. Mattock
2012-01-13 22:06               ` Srivatsa S. Bhat
2012-01-13 22:17                 ` Alan Stern
2012-01-13 22:18                 ` Srivatsa S. Bhat
2012-01-13 23:13             ` Andi Kleen
2012-01-14  0:44       ` Dirk Gouders
2012-01-13 23:02     ` Linus Torvalds
2012-01-13 23:27       ` Srivatsa S. Bhat
2012-01-14  0:05         ` Linus Torvalds
2012-01-14  2:41           ` Srivatsa S. Bhat
2012-01-14  2:53             ` Linus Torvalds
2012-01-14  3:05               ` Srivatsa S. Bhat
2012-01-14  3:10                 ` Linus Torvalds
2012-01-14  3:18                   ` Srivatsa S. Bhat
2012-01-14  3:41                     ` Linus Torvalds
2012-01-14  5:15                   ` Tony Luck
2012-01-14 14:49               ` Greg KH
2012-01-14 16:30                 ` Alan Stern
2012-01-14 20:45                   ` Jeff Chua
2012-01-15  2:05                   ` Tony Luck
2012-01-15  2:34                     ` Greg KH
2012-01-15  3:36                       ` Alan Stern
2012-01-16 18:15                         ` Greg KH
2012-01-16 18:11                 ` Greg KH [this message]
2012-01-16 18:27                   ` Luck, Tony
2012-01-16 18:34                     ` Greg KH
2012-01-16 18:42                   ` Kay Sievers
2012-01-17  2:21             ` Suresh Siddha
2012-01-17  9:52               ` Srivatsa S. Bhat
2012-01-17 16:15                 ` Jeff Chua
2012-01-17 16:36                   ` Srivatsa S. Bhat
2012-01-18  3:17                 ` Suresh Siddha
2012-01-18 10:19                   ` Srivatsa S. Bhat
2012-01-18 13:15                   ` Srivatsa S. Bhat
2012-01-18 13:32                     ` Sergey Senozhatsky
2012-01-18 22:08                       ` Suresh Siddha
2012-01-19  7:50                         ` Sergey Senozhatsky
2012-01-19 12:02                         ` Srivatsa S. Bhat
2012-01-20  2:28                           ` Suresh Siddha
2012-01-23  8:43                             ` Peter Zijlstra
2012-01-26 20:27                             ` [tip:sched/urgent] sched/nohz: Fix nohz cpu idle load balancing state with cpu hotplug tip-bot for Suresh Siddha
