All of lore.kernel.org
 help / color / mirror / Atom feed
From: Borislav Petkov <bp@alien8.de>
To: Tony Luck <tony.luck@gmail.com>
Cc: Havard Skinnemoen <hskinnemoen@google.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Ewout van Bekkum <ewout@google.com>
Subject: Re: [PATCH 4/6] x86-mce: Add spinlocks to prevent duplicated MCP and CMCI reports.
Date: Fri, 11 Jul 2014 21:52:00 +0200	[thread overview]
Message-ID: <20140711195200.GA18246@pd.tnic> (raw)
In-Reply-To: <CA+8MBbKYo_Tmb3Zk4W9gBZmZqBpCu4uLUu9M5C2MZAYZEb16Vg@mail.gmail.com>

On Fri, Jul 11, 2014 at 12:06:40PM -0700, Tony Luck wrote:
> > +               if (atomic_add_unless(&mce_banks[i].poll_reader, 1, 1)) {
> > +                       m.status = mce_rdmsrl(MSR_IA32_MCx_STATUS(i));
> 
> Same as yesterday. You may skip reading a bank because someone else
> is reading the same bank number, even though you don't share that bank
> with them.

Not if those banks are in a percpu variable. And this is what
machine_check_poll gets. The ->poll_reader thing is then per-cpu too.

For shared banks it should work also as expected since we want there
only one reader to see the MCE signature.

> If we are willing to be rather flexible amount when polling happens,
> and not allow very fast poll rates. Then we could do something like
> have the lowest numbered online cpu be the only one that sets a
> timer. When it goes off, it scans its own banks, and then uses an
> async cross-processor call to poke the next highest numbered
> online cpu to have it scan banks and poke the next guy.
> 
> That way we know that two cpus can't be polling at the same time,
> because we convoy them all one at a time.

See above - those banks are percpu. And besides, mce_timer_fn already
has the WARN_ON which otherwise be firing left and right.

It seems, Havard's issue is only with shared banks. I think they only
cause the repeated error records.

> Fast poll rates would be a problem on very large systems. Might
> need to have the highest numbered cpu notice that it is at the
> end of the chain and set some flag so the lowest one can tell
> whether it is safe to begin the next ripple.

Well, if fast polling rates will be a problem anyway, we probably should
talk about adjusting the polling alg. too.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

  reply	other threads:[~2014-07-11 19:52 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-09 17:09 [PATCH 0/6] x86 mce fixes Havard Skinnemoen
2014-07-09 17:09 ` [PATCH 1/6] x86-mce: Modify CMCI poll interval to adjust for small check_interval values Havard Skinnemoen
2014-07-09 19:17   ` Borislav Petkov
2014-07-09 21:24     ` Havard Skinnemoen
2014-07-10  9:01       ` Chen, Gong
2014-07-10 17:16         ` Havard Skinnemoen
2014-07-11  2:12           ` Chen, Gong
2014-07-10 11:42       ` Borislav Petkov
2014-07-10 17:51         ` Havard Skinnemoen
2014-07-10 18:55           ` Tony Luck
2014-07-10 22:45             ` Havard Skinnemoen
2014-07-11 15:35               ` Borislav Petkov
2014-07-11 18:56                 ` Havard Skinnemoen
2014-07-11 20:10                   ` Borislav Petkov
2014-07-11 20:39                     ` Havard Skinnemoen
2014-07-14 14:57                       ` Borislav Petkov
2014-07-11 20:22                   ` Borislav Petkov
2014-07-12  0:10                     ` Havard Skinnemoen
2014-07-14 15:14                       ` Borislav Petkov
2014-07-11 20:36                   ` Borislav Petkov
2014-07-11 21:05                     ` Havard Skinnemoen
2014-07-09 17:09 ` [PATCH 2/6] x86-mce: Modify CMCI storm exit to reenable instead of rediscover banks Havard Skinnemoen
2014-07-09 20:20   ` Luck, Tony
2014-07-09 21:34     ` Havard Skinnemoen
2014-07-10 15:51       ` Borislav Petkov
2014-07-10 18:32         ` Havard Skinnemoen
2014-07-09 17:09 ` [PATCH 3/6] x86-mce: Clear CMCI enable on all claimed CMCI banks before reboot Havard Skinnemoen
2014-07-09 20:36   ` Luck, Tony
2014-07-09 21:40     ` Havard Skinnemoen
2014-07-10 16:24       ` Borislav Petkov
2014-07-10 16:33         ` Tony Luck
2014-07-10 17:56         ` Havard Skinnemoen
2014-07-10 18:27           ` Tony Luck
2014-07-10 18:30           ` Borislav Petkov
2014-07-09 17:09 ` [PATCH 4/6] x86-mce: Add spinlocks to prevent duplicated MCP and CMCI reports Havard Skinnemoen
2014-07-09 20:35   ` Andi Kleen
2014-07-09 21:51     ` Havard Skinnemoen
2014-07-09 23:32       ` Luck, Tony
2014-07-10  8:16         ` Borislav Petkov
2014-07-09 20:47   ` Luck, Tony
2014-07-09 21:56     ` Havard Skinnemoen
2014-07-10 16:41   ` Borislav Petkov
2014-07-10 18:03     ` Havard Skinnemoen
2014-07-10 18:44       ` Borislav Petkov
2014-07-10 18:57         ` Tony Luck
2014-07-10 19:12           ` Borislav Petkov
2014-07-11  9:24             ` Borislav Petkov
2014-07-11 19:06               ` Tony Luck
2014-07-11 19:52                 ` Borislav Petkov [this message]
2014-07-11 21:15                   ` Havard Skinnemoen
2014-07-17 10:50                     ` Borislav Petkov
2014-07-18 21:23                       ` Tony Luck
2014-07-18 21:31                         ` Borislav Petkov
2014-07-09 17:09 ` [PATCH 5/6] x86-mce: check if no_way_out applies before deciding not to clear MCE banks Havard Skinnemoen
2014-07-09 21:00   ` Luck, Tony
2014-07-09 23:00     ` Havard Skinnemoen
2014-07-09 23:27       ` Luck, Tony
2014-07-10 16:49         ` Borislav Petkov
2014-07-09 17:09 ` [PATCH 6/6] x86-mce: ensure the MCP timer is not already set in the mce_timer_fn Havard Skinnemoen
2014-07-09 21:04   ` Luck, Tony
2014-07-09 23:01     ` Havard Skinnemoen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140711195200.GA18246@pd.tnic \
    --to=bp@alien8.de \
    --cc=ewout@google.com \
    --cc=hskinnemoen@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tony.luck@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.