Linux-EDAC Archive on lore.kernel.org
From: "Ghannam, Yazen" <Yazen.Ghannam@amd.com>
To: Borislav Petkov <bp@alien8.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"bp@suse.de" <bp@suse.de>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"x86@kernel.org" <x86@kernel.org>
Subject: RE: [PATCH v3 4/6] x86/MCE: Make number of MCA banks per_cpu
Date: Tue, 21 May 2019 17:52:42 +0000
Message-ID: <SN6PR12MB2639571E33EBC7342A0607F8F8070@SN6PR12MB2639.namprd12.prod.outlook.com>
In-Reply-To: <20190518112530.GA26276@zn.tnic>

> -----Original Message-----
> From: Borislav Petkov <bp@alien8.de>
> Sent: Saturday, May 18, 2019 6:26 AM
> To: Ghannam, Yazen <Yazen.Ghannam@amd.com>
> Cc: linux-edac@vger.kernel.org; linux-kernel@vger.kernel.org; bp@suse.de; tony.luck@intel.com; x86@kernel.org
> Subject: Re: [PATCH v3 4/6] x86/MCE: Make number of MCA banks per_cpu
> 
> 
> On Tue, Apr 30, 2019 at 08:32:20PM +0000, Ghannam, Yazen wrote:
> > From: Yazen Ghannam <yazen.ghannam@amd.com>
> >
> > The number of MCA banks is provided per logical CPU. Historically, this
> > number has been the same across all CPUs, but this is not an
> > architectural guarantee. Future AMD systems may have MCA bank counts
> > that vary between logical CPUs in a system.
> >
> > This issue was partially addressed in
> >
> > 006c077041dc ("x86/mce: Handle varying MCA bank counts")
> >
> > by allocating structures using the maximum number of MCA banks and by
> > saving the maximum MCA bank count in a system as the global count. This
> > means that some extra structures are allocated. Also, this means that
> > CPUs will spend more time in the #MC and other handlers checking extra
> > MCA banks.
> 
> ...
> 
> > @@ -1480,14 +1482,15 @@ EXPORT_SYMBOL_GPL(mce_notify_irq);
> >
> >  static int __mcheck_cpu_mce_banks_init(void)
> >  {
> > +     u8 n_banks = this_cpu_read(mce_num_banks);
> >       struct mce_bank *mce_banks;
> >       int i;
> >
> > -     mce_banks = kcalloc(MAX_NR_BANKS, sizeof(struct mce_bank), GFP_KERNEL);
> > +     mce_banks = kcalloc(n_banks, sizeof(struct mce_bank), GFP_KERNEL);
> 
> Something changed in mm land or maybe we were lucky and got away with an
> atomic GFP_KERNEL allocation until now but:
> 
> [    2.447838] smp: Bringing up secondary CPUs ...
> [    2.456895] x86: Booting SMP configuration:
> [    2.457822] .... node  #0, CPUs:        #1

The issue seems to be that the allocation is now happening on CPUs other than CPU0.

Patch 2 in this set has the same issue. I didn't see it until I turned on the "Lock Debugging" config options.

> [    1.344284] BUG: sleeping function called from invalid context at mm/slab.h:418

This message comes from ___might_sleep(), which checks system_state.

On CPU0, system_state=SYSTEM_BOOTING.

On every other CPU, system_state=SYSTEM_SCHEDULING, and that's the only system_state where the message is shown.

Changing GFP_KERNEL to GFP_ATOMIC seems to fix it. Is this appropriate? Or do you think there's something else we could try?
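
To illustrate the behavior described above, here is a minimal user-space sketch. It is not the kernel's code: the enum names and would_warn() are hypothetical, only the two system states mentioned in this message are modeled, and the in_atomic_context flag is an assumption standing in for whatever the real check looks at (preemption and interrupt state).

```c
#include <stdbool.h>

/* Only the two states named in this message are modeled. */
enum boot_phase {
	PHASE_BOOTING,     /* stands in for SYSTEM_BOOTING (CPU0 during early init) */
	PHASE_SCHEDULING,  /* stands in for SYSTEM_SCHEDULING (secondary CPU bring-up) */
};

/* Hypothetical stand-ins for the two allocation modes under discussion. */
enum alloc_mode {
	ALLOC_MAY_SLEEP,   /* GFP_KERNEL-like: the allocator is allowed to block */
	ALLOC_ATOMIC,      /* GFP_ATOMIC-like: the allocator must not block */
};

/*
 * Returns true when the modeled debug check would report
 * "BUG: sleeping function called from invalid context".
 */
bool would_warn(enum boot_phase phase, bool in_atomic_context,
		enum alloc_mode mode)
{
	if (mode == ALLOC_ATOMIC)
		return false;   /* non-blocking request: no sleep check is triggered */
	if (phase == PHASE_BOOTING)
		return false;   /* the check stays quiet while the system is booting */
	return in_atomic_context;
}
```

Under this model, would_warn(PHASE_BOOTING, true, ALLOC_MAY_SLEEP) is false, which matches CPU0 staying quiet, while would_warn(PHASE_SCHEDULING, true, ALLOC_MAY_SLEEP) is true, which matches the splat on the secondary CPUs. Switching to ALLOC_ATOMIC silences it in both phases.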

Thanks,
Yazen

