From: Borislav Petkov <bp@alien8.de>
To: Tony Luck <tony.luck@gmail.com>
Cc: Havard Skinnemoen <hskinnemoen@google.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Ewout van Bekkum <ewout@google.com>
Subject: Re: [PATCH 4/6] x86-mce: Add spinlocks to prevent duplicated MCP and CMCI reports.
Date: Fri, 11 Jul 2014 11:24:54 +0200
Message-ID: <20140711092454.GA17083@pd.tnic>
In-Reply-To: <20140710191224.GF5603@pd.tnic>

On Thu, Jul 10, 2014 at 09:12:24PM +0200, Borislav Petkov wrote:
> I'll think about it more tomorrow - my brain is twisted enough for
> today. :-)

Ok, new day, new luck. :-)

So, following yesterday's discussion, our problem is IMHO that shared
banks could be read multiple times before they're finally cleared,
leading to repeated MCE records.

Now, staring at machine_check_poll, the processing is controlled by one
bit - MCI_STATUS_VAL - which decides what happens next.

So how about we change processing around this one bit: we let only one
reader access MSR_IA32_MCx_STATUS(i), save its contents to m.status and
clear the MSR right afterwards.

Concurrent callers of machine_check_poll will not read the MCI_STATUS
MSR at all: their local copy m.status stays 0, so they'll go to the
next bank.

And this for the cost of a locked CMPXCHG when we have to inc
poll_reader, which should be cheaper than disabling IRQs every time.

I.e., something like that. Hmm...
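
To make the claim/read/clear/release sequence easier to eyeball outside
the kernel, here is a minimal user-space sketch of the same idea. It is
only an illustration under stated assumptions: struct fake_bank,
claim_bank(), release_bank() and poll_bank() are hypothetical names, the
"MSR" is a plain variable, and a C11 compare-exchange stands in for
atomic_add_unless(v, 1, 1), which behaves the same as long as the
counter is only ever 0 or 1.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 63 of the status word, standing in for MCI_STATUS_VAL. */
#define STATUS_VAL (1ULL << 63)

/* Hypothetical stand-in for one shared bank: fake status MSR + claim counter. */
struct fake_bank {
	uint64_t status_msr;
	atomic_int poll_reader;
};

/* Analogue of atomic_add_unless(&poll_reader, 1, 1): move 0 -> 1, fail if already 1. */
static bool claim_bank(struct fake_bank *b)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&b->poll_reader, &expected, 1);
}

/* Analogue of atomic_dec(&poll_reader); only the current owner may call this. */
static void release_bank(struct fake_bank *b)
{
	int prev = atomic_fetch_sub(&b->poll_reader, 1);

	assert(prev == 1);
}

/* Returns the status this caller should process, or 0 if it should skip the bank. */
static uint64_t poll_bank(struct fake_bank *b)
{
	uint64_t status = 0;	/* the local copy, like m.status = 0 */

	if (claim_bank(b)) {
		status = b->status_msr;

		/* Clear right after reading so no later reader sees it again. */
		if (status & STATUS_VAL)
			b->status_msr = 0;

		release_bank(b);
	}

	return status;
}
```

A caller that loses the race gets 0 back and moves on to the next bank
even if the bank holds a valid record; it relies on the winner to log
it.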

---
diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index 09edd0b65fef..5483b507025a 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -19,6 +19,7 @@ struct mce_bank {
 	unsigned char init;				/* initialise bank? */
 	struct device_attribute attr;			/* device attribute */
 	char			attrname[ATTR_LEN];	/* attribute name */
+	atomic_t		poll_reader;		/* sync for polled shared banks */
 };
 
 int mce_severity(struct mce *a, int tolerant, char **msg);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index bb92f38153b2..443861da86e4 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -609,9 +609,20 @@ void machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 		m.addr = 0;
 		m.bank = i;
 		m.tsc = 0;
+		m.status = 0;
 
 		barrier();
-		m.status = mce_rdmsrl(MSR_IA32_MCx_STATUS(i));
+
+		if (atomic_add_unless(&mce_banks[i].poll_reader, 1, 1)) {
+			m.status = mce_rdmsrl(MSR_IA32_MCx_STATUS(i));
+
+			if (m.status & MCI_STATUS_VAL)
+				/* clear status register for this bank */
+				mce_wrmsrl(MSR_IA32_MCx_STATUS(i), 0);
+
+			atomic_dec(&mce_banks[i].poll_reader);
+		}
+
 		if (!(m.status & MCI_STATUS_VAL))
 			continue;
 
@@ -637,17 +648,12 @@ void machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 		if (!(flags & MCP_DONTLOG) && !mca_cfg.dont_log_ce)
 			mce_log(&m);
 
-		/*
-		 * Clear state for this bank.
-		 */
-		mce_wrmsrl(MSR_IA32_MCx_STATUS(i), 0);
 	}
 
 	/*
 	 * Don't clear MCG_STATUS here because it's only defined for
 	 * exceptions.
 	 */
-
 	sync_core();
 }
 EXPORT_SYMBOL_GPL(machine_check_poll);



-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.