All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Raise maximum number of memory controllers
@ 2018-09-25 14:34 ` Justin Ernst
  0 siblings, 0 replies; 56+ messages in thread
From: Justin Ernst @ 2018-09-25 14:34 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: russ.anderson, Justin Ernst, Mauro Carvalho Chehab, linux-edac,
	linux-kernel

We observe an oops in the skx_edac module during boot.
Examining /var/log/messages:
[ 3401.985757] EDAC MC0: Giving out device to module skx_edac controller Skylake Socket#0 IMC#0
[ 3401.985887] EDAC MC1: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
[ 3401.986014] EDAC MC2: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
...
[ 3401.987318] EDAC MC13: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
[ 3401.987435] EDAC MC14: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
[ 3401.987556] EDAC MC15: Giving out device to module skx_edac controller Skylake Socket#1 IMC#1
[ 3401.987579] Too many memory controllers: 16
[ 3402.042614] EDAC MC: Removed device 0 for skx_edac Skylake Socket#0 IMC#0

We observe there are two memory controllers per socket, with a limit of 16.
Raise the maximum number of memory controllers from 16 to 2 * MAX_NUMNODES (1024).

Cc: Borislav Petkov <bp@alien8.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Russ Anderson <russ.anderson@hpe.com>
Signed-off-by: Justin Ernst <justin.ernst@hpe.com>
---
 include/linux/edac.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/edac.h b/include/linux/edac.h
index bffb97828ed6..958d69332c1d 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -17,6 +17,7 @@
 #include <linux/completion.h>
 #include <linux/workqueue.h>
 #include <linux/debugfs.h>
+#include <linux/numa.h>
 
 #define EDAC_DEVICE_NAME_LEN	31
 
@@ -670,6 +671,6 @@ struct mem_ctl_info {
 /*
  * Maximum number of memory controllers in the coherent fabric.
  */
-#define EDAC_MAX_MCS	16
+#define EDAC_MAX_MCS	2 * MAX_NUMNODES
 
 #endif
-- 
2.12.3


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-25 14:34 ` Justin Ernst
  0 siblings, 0 replies; 56+ messages in thread
From: Justin Ernst @ 2018-09-25 14:34 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: russ.anderson, Justin Ernst, Mauro Carvalho Chehab, linux-edac,
	linux-kernel

We observe an oops in the skx_edac module during boot.
Examining /var/log/messages:
[ 3401.985757] EDAC MC0: Giving out device to module skx_edac controller Skylake Socket#0 IMC#0
[ 3401.985887] EDAC MC1: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
[ 3401.986014] EDAC MC2: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
...
[ 3401.987318] EDAC MC13: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
[ 3401.987435] EDAC MC14: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
[ 3401.987556] EDAC MC15: Giving out device to module skx_edac controller Skylake Socket#1 IMC#1
[ 3401.987579] Too many memory controllers: 16
[ 3402.042614] EDAC MC: Removed device 0 for skx_edac Skylake Socket#0 IMC#0

We observe there are two memory controllers per socket, with a limit of 16.
Raise the maximum number of memory controllers from 16 to 2 * MAX_NUMNODES (1024).

Cc: Borislav Petkov <bp@alien8.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Russ Anderson <russ.anderson@hpe.com>
Signed-off-by: Justin Ernst <justin.ernst@hpe.com>
---
 include/linux/edac.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/edac.h b/include/linux/edac.h
index bffb97828ed6..958d69332c1d 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -17,6 +17,7 @@
 #include <linux/completion.h>
 #include <linux/workqueue.h>
 #include <linux/debugfs.h>
+#include <linux/numa.h>
 
 #define EDAC_DEVICE_NAME_LEN	31
 
@@ -670,6 +671,6 @@ struct mem_ctl_info {
 /*
  * Maximum number of memory controllers in the coherent fabric.
  */
-#define EDAC_MAX_MCS	16
+#define EDAC_MAX_MCS	2 * MAX_NUMNODES
 
 #endif

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-25 15:26   ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-09-25 15:26 UTC (permalink / raw)
  To: Tony Luck
  Cc: Justin Ernst, russ.anderson, Mauro Carvalho Chehab, linux-edac,
	linux-kernel

On Tue, Sep 25, 2018 at 09:34:49AM -0500, Justin Ernst wrote:
> We observe an oops in the skx_edac module during boot.
> Examining /var/log/messages:
> [ 3401.985757] EDAC MC0: Giving out device to module skx_edac controller Skylake Socket#0 IMC#0
> [ 3401.985887] EDAC MC1: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
> [ 3401.986014] EDAC MC2: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
> ...
> [ 3401.987318] EDAC MC13: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
> [ 3401.987435] EDAC MC14: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
> [ 3401.987556] EDAC MC15: Giving out device to module skx_edac controller Skylake Socket#1 IMC#1
> [ 3401.987579] Too many memory controllers: 16
> [ 3402.042614] EDAC MC: Removed device 0 for skx_edac Skylake Socket#0 IMC#0
> 
> We observe there are two memory controllers per socket, with a limit of 16.
> Raise the maximum number of memory controllers from 16 to 2 * MAX_NUMNODES (1024).

Tony,

can we read that out from the hardware instead of having this silly
static number?

Leaving in the rest.

> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
> Cc: linux-edac@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Acked-by: Russ Anderson <russ.anderson@hpe.com>
> Signed-off-by: Justin Ernst <justin.ernst@hpe.com>
> ---
>  include/linux/edac.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/edac.h b/include/linux/edac.h
> index bffb97828ed6..958d69332c1d 100644
> --- a/include/linux/edac.h
> +++ b/include/linux/edac.h
> @@ -17,6 +17,7 @@
>  #include <linux/completion.h>
>  #include <linux/workqueue.h>
>  #include <linux/debugfs.h>
> +#include <linux/numa.h>
>  
>  #define EDAC_DEVICE_NAME_LEN	31
>  
> @@ -670,6 +671,6 @@ struct mem_ctl_info {
>  /*
>   * Maximum number of memory controllers in the coherent fabric.
>   */
> -#define EDAC_MAX_MCS	16
> +#define EDAC_MAX_MCS	2 * MAX_NUMNODES
>  
>  #endif
> -- 
> 2.12.3
> 

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-25 15:26   ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-09-25 15:26 UTC (permalink / raw)
  To: Tony Luck
  Cc: Justin Ernst, russ.anderson, Mauro Carvalho Chehab, linux-edac,
	linux-kernel

On Tue, Sep 25, 2018 at 09:34:49AM -0500, Justin Ernst wrote:
> We observe an oops in the skx_edac module during boot.
> Examining /var/log/messages:
> [ 3401.985757] EDAC MC0: Giving out device to module skx_edac controller Skylake Socket#0 IMC#0
> [ 3401.985887] EDAC MC1: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
> [ 3401.986014] EDAC MC2: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
> ...
> [ 3401.987318] EDAC MC13: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
> [ 3401.987435] EDAC MC14: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
> [ 3401.987556] EDAC MC15: Giving out device to module skx_edac controller Skylake Socket#1 IMC#1
> [ 3401.987579] Too many memory controllers: 16
> [ 3402.042614] EDAC MC: Removed device 0 for skx_edac Skylake Socket#0 IMC#0
> 
> We observe there are two memory controllers per socket, with a limit of 16.
> Raise the maximum number of memory controllers from 16 to 2 * MAX_NUMNODES (1024).

Tony,

can we read that out from the hardware instead of having this silly
static number?

Leaving in the rest.

> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
> Cc: linux-edac@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Acked-by: Russ Anderson <russ.anderson@hpe.com>
> Signed-off-by: Justin Ernst <justin.ernst@hpe.com>
> ---
>  include/linux/edac.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/edac.h b/include/linux/edac.h
> index bffb97828ed6..958d69332c1d 100644
> --- a/include/linux/edac.h
> +++ b/include/linux/edac.h
> @@ -17,6 +17,7 @@
>  #include <linux/completion.h>
>  #include <linux/workqueue.h>
>  #include <linux/debugfs.h>
> +#include <linux/numa.h>
>  
>  #define EDAC_DEVICE_NAME_LEN	31
>  
> @@ -670,6 +671,6 @@ struct mem_ctl_info {
>  /*
>   * Maximum number of memory controllers in the coherent fabric.
>   */
> -#define EDAC_MAX_MCS	16
> +#define EDAC_MAX_MCS	2 * MAX_NUMNODES
>  
>  #endif
> -- 
> 2.12.3
>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-25 17:50     ` Luck, Tony
  0 siblings, 0 replies; 56+ messages in thread
From: Luck, Tony @ 2018-09-25 17:50 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Justin Ernst, russ.anderson, Mauro Carvalho Chehab, linux-edac,
	linux-kernel

On Tue, Sep 25, 2018 at 05:26:59PM +0200, Borislav Petkov wrote:
> On Tue, Sep 25, 2018 at 09:34:49AM -0500, Justin Ernst wrote:
> > We observe an oops in the skx_edac module during boot.
> > Examining /var/log/messages:
> > [ 3401.985757] EDAC MC0: Giving out device to module skx_edac controller Skylake Socket#0 IMC#0
> > [ 3401.985887] EDAC MC1: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
> > [ 3401.986014] EDAC MC2: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
> > ...
> > [ 3401.987318] EDAC MC13: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
> > [ 3401.987435] EDAC MC14: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
> > [ 3401.987556] EDAC MC15: Giving out device to module skx_edac controller Skylake Socket#1 IMC#1
> > [ 3401.987579] Too many memory controllers: 16
> > [ 3402.042614] EDAC MC: Removed device 0 for skx_edac Skylake Socket#0 IMC#0
> > 
> > We observe there are two memory controllers per socket, with a limit of 16.
> > Raise the maximum number of memory controllers from 16 to 2 * MAX_NUMNODES (1024).
> 
> Tony,
> 
> can we read that out from the hardware instead of having this silly
> static number?
> 
> Leaving in the rest.

There are way too many places where we use the identifier "bus"
in the edac core and drivers. But I'm not sure that we need a
static array mc_bus[EDAC_MAX_MCS].

Why can't we:


-	mci->bus = &mc_bus[mci->mc_idx];
+	mci->bus = kmalloc(sizeof *(mci->bus), GFP_KERNEL);

and then figure out where to kfree(mci->bus) on driver removal?

Do we every do arithmetic on different mci->bus pointers that
assume they are all part of a single array?

-Tony

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-25 17:50     ` Luck, Tony
  0 siblings, 0 replies; 56+ messages in thread
From: Luck, Tony @ 2018-09-25 17:50 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Justin Ernst, russ.anderson, Mauro Carvalho Chehab, linux-edac,
	linux-kernel

On Tue, Sep 25, 2018 at 05:26:59PM +0200, Borislav Petkov wrote:
> On Tue, Sep 25, 2018 at 09:34:49AM -0500, Justin Ernst wrote:
> > We observe an oops in the skx_edac module during boot.
> > Examining /var/log/messages:
> > [ 3401.985757] EDAC MC0: Giving out device to module skx_edac controller Skylake Socket#0 IMC#0
> > [ 3401.985887] EDAC MC1: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
> > [ 3401.986014] EDAC MC2: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
> > ...
> > [ 3401.987318] EDAC MC13: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
> > [ 3401.987435] EDAC MC14: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
> > [ 3401.987556] EDAC MC15: Giving out device to module skx_edac controller Skylake Socket#1 IMC#1
> > [ 3401.987579] Too many memory controllers: 16
> > [ 3402.042614] EDAC MC: Removed device 0 for skx_edac Skylake Socket#0 IMC#0
> > 
> > We observe there are two memory controllers per socket, with a limit of 16.
> > Raise the maximum number of memory controllers from 16 to 2 * MAX_NUMNODES (1024).
> 
> Tony,
> 
> can we read that out from the hardware instead of having this silly
> static number?
> 
> Leaving in the rest.

There are way too many places where we use the identifier "bus"
in the edac core and drivers. But I'm not sure that we need a
static array mc_bus[EDAC_MAX_MCS].

Why can't we:


-	mci->bus = &mc_bus[mci->mc_idx];
+	mci->bus = kmalloc(sizeof *(mci->bus), GFP_KERNEL);

and then figure out where to kfree(mci->bus) on driver removal?

Do we every do arithmetic on different mci->bus pointers that
assume they are all part of a single array?

-Tony

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-25 18:07       ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-09-25 18:07 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Justin Ernst, russ.anderson, Mauro Carvalho Chehab, linux-edac,
	linux-kernel

On Tue, Sep 25, 2018 at 10:50:23AM -0700, Luck, Tony wrote:
> There are way too many places where we use the identifier "bus"
> in the edac core and drivers. But I'm not sure that we need a
> static array mc_bus[EDAC_MAX_MCS].

That, of course, is another way of looking at it which I didn't think
of.

> Why can't we:
> 
> 
> -	mci->bus = &mc_bus[mci->mc_idx];
> +	mci->bus = kmalloc(sizeof *(mci->bus), GFP_KERNEL);
> 
> and then figure out where to kfree(mci->bus) on driver removal?

AFAICT, in _edac_mc_free(). We free there mci itself so kfree(mci->bus)
can happen directly before it.

> Do we every do arithmetic on different mci->bus pointers that
> assume they are all part of a single array?

AFAICT, we use that thing for the bus_reg/unreg functions and we hand it
back'n'forth in edac_mc_sysfs.c, see

$ git grep -E "mci.*bus" drivers/edac/
drivers/edac/edac_mc.c:763:     mci->bus = &mc_bus[mci->mc_idx];
drivers/edac/edac_mc_sysfs.c:408:       csrow->dev.bus = mci->bus;
drivers/edac/edac_mc_sysfs.c:639:       dimm->dev.bus = mci->bus;
drivers/edac/edac_mc_sysfs.c:928:       mci->bus->name = name;
drivers/edac/edac_mc_sysfs.c:930:       edac_dbg(0, "creating bus %s\n", mci->bus->name);
drivers/edac/edac_mc_sysfs.c:932:       err = bus_register(mci->bus);
drivers/edac/edac_mc_sysfs.c:943:       mci->dev.bus = mci->bus;
drivers/edac/edac_mc_sysfs.c:1002:      bus_unregister(mci->bus);
drivers/edac/edac_mc_sysfs.c:1035:      struct bus_type *bus = mci->bus;
drivers/edac/edac_mc_sysfs.c:1036:      const char *name = mci->bus->name;
drivers/edac/edac_mc_sysfs.c:1071:      mci_pdev->bus = edac_get_sysfs_subsys();
drivers/edac/i5100_edac.c:967:  priv->debugfs = edac_debugfs_create_dir_at(mci->bus->name, i5100_debugfs);
drivers/edac/i7core_edac.c:1170:        pvt->addrmatch_dev->bus = mci->dev.bus;
drivers/edac/i7core_edac.c:1191:                pvt->chancounts_dev->bus = mci->dev.bus;

HOWEVER, look at

  88d84ac97378 ("EDAC: Fix lockdep splat")

Now I remember. I did that for lockdep because it wants statically
allocated memory. I'll try to think of something tomorrow.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-25 18:07       ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-09-25 18:07 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Justin Ernst, russ.anderson, Mauro Carvalho Chehab, linux-edac,
	linux-kernel

On Tue, Sep 25, 2018 at 10:50:23AM -0700, Luck, Tony wrote:
> There are way too many places where we use the identifier "bus"
> in the edac core and drivers. But I'm not sure that we need a
> static array mc_bus[EDAC_MAX_MCS].

That, of course, is another way of looking at it which I didn't think
of.

> Why can't we:
> 
> 
> -	mci->bus = &mc_bus[mci->mc_idx];
> +	mci->bus = kmalloc(sizeof *(mci->bus), GFP_KERNEL);
> 
> and then figure out where to kfree(mci->bus) on driver removal?

AFAICT, in _edac_mc_free(). We free there mci itself so kfree(mci->bus)
can happen directly before it.

> Do we every do arithmetic on different mci->bus pointers that
> assume they are all part of a single array?

AFAICT, we use that thing for the bus_reg/unreg functions and we hand it
back'n'forth in edac_mc_sysfs.c, see

$ git grep -E "mci.*bus" drivers/edac/
drivers/edac/edac_mc.c:763:     mci->bus = &mc_bus[mci->mc_idx];
drivers/edac/edac_mc_sysfs.c:408:       csrow->dev.bus = mci->bus;
drivers/edac/edac_mc_sysfs.c:639:       dimm->dev.bus = mci->bus;
drivers/edac/edac_mc_sysfs.c:928:       mci->bus->name = name;
drivers/edac/edac_mc_sysfs.c:930:       edac_dbg(0, "creating bus %s\n", mci->bus->name);
drivers/edac/edac_mc_sysfs.c:932:       err = bus_register(mci->bus);
drivers/edac/edac_mc_sysfs.c:943:       mci->dev.bus = mci->bus;
drivers/edac/edac_mc_sysfs.c:1002:      bus_unregister(mci->bus);
drivers/edac/edac_mc_sysfs.c:1035:      struct bus_type *bus = mci->bus;
drivers/edac/edac_mc_sysfs.c:1036:      const char *name = mci->bus->name;
drivers/edac/edac_mc_sysfs.c:1071:      mci_pdev->bus = edac_get_sysfs_subsys();
drivers/edac/i5100_edac.c:967:  priv->debugfs = edac_debugfs_create_dir_at(mci->bus->name, i5100_debugfs);
drivers/edac/i7core_edac.c:1170:        pvt->addrmatch_dev->bus = mci->dev.bus;
drivers/edac/i7core_edac.c:1191:                pvt->chancounts_dev->bus = mci->dev.bus;

HOWEVER, look at

  88d84ac97378 ("EDAC: Fix lockdep splat")

Now I remember. I did that for lockdep because it wants statically
allocated memory. I'll try to think of something tomorrow.

Thx.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* RE: [PATCH] Raise maximum number of memory controllers
@ 2018-09-26  7:55   ` Qiuxu Zhuo
  0 siblings, 0 replies; 56+ messages in thread
From: Zhuo, Qiuxu @ 2018-09-26  7:55 UTC (permalink / raw)
  To: Justin Ernst, Borislav Petkov
  Cc: Luck, Tony, russ.anderson, Mauro Carvalho Chehab, linux-edac,
	linux-kernel

Hi Justin,

> [ 3401.987556] EDAC MC15: Giving out device to module skx_edac controller Skylake Socket#1 IMC#1
    Just curious, has the system(two memory controllers per socket) got more than 8 sockets?
    Normally, the number "1" in the above string "Skylake Socekt#1 IMC#1" should be 7 (that was 15/2), but it was 1 here.

Thanks!
-Qiuxu

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-26  7:55   ` Qiuxu Zhuo
  0 siblings, 0 replies; 56+ messages in thread
From: Qiuxu Zhuo @ 2018-09-26  7:55 UTC (permalink / raw)
  To: Justin Ernst, Borislav Petkov
  Cc: Luck, Tony, russ.anderson, Mauro Carvalho Chehab, linux-edac,
	linux-kernel

Hi Justin,

> [ 3401.987556] EDAC MC15: Giving out device to module skx_edac controller Skylake Socket#1 IMC#1
    Just curious, has the system(two memory controllers per socket) got more than 8 sockets?
    Normally, the number "1" in the above string "Skylake Socekt#1 IMC#1" should be 7 (that was 15/2), but it was 1 here.

Thanks!
-Qiuxu

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-26  9:35         ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-09-26  9:35 UTC (permalink / raw)
  To: Luck, Tony, Greg KH
  Cc: Justin Ernst, russ.anderson, Mauro Carvalho Chehab, linux-edac,
	linux-kernel

On Tue, Sep 25, 2018 at 08:07:33PM +0200, Borislav Petkov wrote:
> Now I remember. I did that for lockdep because it wants statically
> allocated memory. I'll try to think of something tomorrow.

Some more info after some staring:

We could've made the lock_class_key only static storage so that lockdep
is happy but then struct bus_type embeds the *whole* lock_class_key and
not a pointer to it.

Which leaves us with either:

* this fix postponing the problem

* or Greg coming and saying, you're using bus_type all wrong and you
shouldn't and you should remove it completely! :-)

Which would be much better.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-26  9:35         ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-09-26  9:35 UTC (permalink / raw)
  To: Luck, Tony, Greg KH
  Cc: Justin Ernst, russ.anderson, Mauro Carvalho Chehab, linux-edac,
	linux-kernel

On Tue, Sep 25, 2018 at 08:07:33PM +0200, Borislav Petkov wrote:
> Now I remember. I did that for lockdep because it wants statically
> allocated memory. I'll try to think of something tomorrow.

Some more info after some staring:

We could've made the lock_class_key only static storage so that lockdep
is happy but then struct bus_type embeds the *whole* lock_class_key and
not a pointer to it.

Which leaves us with either:

* this fix postponing the problem

* or Greg coming and saying, you're using bus_type all wrong and you
shouldn't and you should remove it completely! :-)

Which would be much better.

Thx.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-26 13:53     ` Russ Anderson
  0 siblings, 0 replies; 56+ messages in thread
From: Russ Anderson @ 2018-09-26 13:53 UTC (permalink / raw)
  To: Zhuo, Qiuxu
  Cc: Justin Ernst, Borislav Petkov, Luck, Tony, russ.anderson,
	Mauro Carvalho Chehab, linux-edac, linux-kernel

On Wed, Sep 26, 2018 at 07:55:39AM +0000, Zhuo, Qiuxu wrote:
> Hi Justin,
> 
> > [ 3401.987556] EDAC MC15: Giving out device to module skx_edac controller Skylake Socket#1 IMC#1
>     Just curious, has the system(two memory controllers per socket) got more than 8 sockets?
>     Normally, the number "1" in the above string "Skylake Socekt#1 IMC#1" should be 7 (that was 15/2), but it was 1 here.

Yes, that is from a 32 socket system.

Thanks.
-- 
Russ Anderson,  SuperDome Flex Linux Kernel Group Manager
HPE - Hewlett Packard Enterprise (formerly SGI)  rja@hpe.com

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-26 13:53     ` Russ Anderson
  0 siblings, 0 replies; 56+ messages in thread
From: Russ Anderson @ 2018-09-26 13:53 UTC (permalink / raw)
  To: Zhuo, Qiuxu
  Cc: Justin Ernst, Borislav Petkov, Luck, Tony, russ.anderson,
	Mauro Carvalho Chehab, linux-edac, linux-kernel

On Wed, Sep 26, 2018 at 07:55:39AM +0000, Zhuo, Qiuxu wrote:
> Hi Justin,
> 
> > [ 3401.987556] EDAC MC15: Giving out device to module skx_edac controller Skylake Socket#1 IMC#1
>     Just curious, has the system(two memory controllers per socket) got more than 8 sockets?
>     Normally, the number "1" in the above string "Skylake Socekt#1 IMC#1" should be 7 (that was 15/2), but it was 1 here.

Yes, that is from a 32 socket system.

Thanks.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-26 15:27           ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-09-26 15:27 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Greg KH, Justin Ernst, russ.anderson, Mauro Carvalho Chehab,
	linux-edac, linux-kernel

On Wed, Sep 26, 2018 at 11:35:11AM +0200, Borislav Petkov wrote:
> * or Greg coming and saying, you're using bus_type all wrong and you
> shouldn't and you should remove it completely! :-)

Yap, and so he did! :-)

It looks like we can remove the whole per-MC bus thing, see below. Patch
seems to work here and grepping sysfs hierarchy before and after:

find /sys/ | grep -i edac > ...

doesn't show any differences.

Tony, I'd appreciate running this on one of your big boxes to check
nothing is missing in sysfs...

Thx.

---
 drivers/edac/edac_mc.c       |  9 +--------
 drivers/edac/edac_mc_sysfs.c | 30 ++----------------------------
 include/linux/edac.h         |  6 ------
 3 files changed, 3 insertions(+), 42 deletions(-)

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 7d3edd713932..13594ffadcb3 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -55,8 +55,6 @@ static LIST_HEAD(mc_devices);
  */
 static const char *edac_mc_owner;
 
-static struct bus_type mc_bus[EDAC_MAX_MCS];
-
 int edac_get_report_status(void)
 {
 	return edac_report;
@@ -716,11 +714,6 @@ int edac_mc_add_mc_with_groups(struct mem_ctl_info *mci,
 	int ret = -EINVAL;
 	edac_dbg(0, "\n");
 
-	if (mci->mc_idx >= EDAC_MAX_MCS) {
-		pr_warn_once("Too many memory controllers: %d\n", mci->mc_idx);
-		return -ENODEV;
-	}
-
 #ifdef CONFIG_EDAC_DEBUG
 	if (edac_debug_level >= 3)
 		edac_mc_dump_mci(mci);
@@ -760,7 +753,7 @@ int edac_mc_add_mc_with_groups(struct mem_ctl_info *mci,
 	/* set load time so that error rate can be tracked */
 	mci->start_time = jiffies;
 
-	mci->bus = &mc_bus[mci->mc_idx];
+	mci->bus = edac_get_sysfs_subsys();
 
 	if (edac_create_sysfs_mci_device(mci, groups)) {
 		edac_mc_printk(mci, KERN_WARNING,
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 20374b8248f0..2ca2012f2857 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -914,27 +914,8 @@ static const struct device_type mci_attr_type = {
 int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
 				 const struct attribute_group **groups)
 {
-	char *name;
 	int i, err;
 
-	/*
-	 * The memory controller needs its own bus, in order to avoid
-	 * namespace conflicts at /sys/bus/edac.
-	 */
-	name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
-	if (!name)
-		return -ENOMEM;
-
-	mci->bus->name = name;
-
-	edac_dbg(0, "creating bus %s\n", mci->bus->name);
-
-	err = bus_register(mci->bus);
-	if (err < 0) {
-		kfree(name);
-		return err;
-	}
-
 	/* get the /sys/devices/system/edac subsys reference */
 	mci->dev.type = &mci_attr_type;
 	device_initialize(&mci->dev);
@@ -950,7 +931,7 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
 	err = device_add(&mci->dev);
 	if (err < 0) {
 		edac_dbg(1, "failure: create device %s\n", dev_name(&mci->dev));
-		goto fail_unregister_bus;
+		goto out;
 	}
 
 	/*
@@ -998,10 +979,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
 		device_unregister(&dimm->dev);
 	}
 	device_unregister(&mci->dev);
-fail_unregister_bus:
-	bus_unregister(mci->bus);
-	kfree(name);
 
+out:
 	return err;
 }
 
@@ -1032,13 +1011,8 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 
 void edac_unregister_sysfs(struct mem_ctl_info *mci)
 {
-	struct bus_type *bus = mci->bus;
-	const char *name = mci->bus->name;
-
 	edac_dbg(1, "Unregistering device %s\n", dev_name(&mci->dev));
 	device_unregister(&mci->dev);
-	bus_unregister(bus);
-	kfree(name);
 }
 
 static void mc_attr_release(struct device *dev)
diff --git a/include/linux/edac.h b/include/linux/edac.h
index a45ce1f84bfc..dd8d006f1837 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -668,10 +668,4 @@ struct mem_ctl_info {
 	bool fake_inject_ue;
 	u16 fake_inject_count;
 };
-
-/*
- * Maximum number of memory controllers in the coherent fabric.
- */
-#define EDAC_MAX_MCS	16
-
 #endif
-- 
2.17.0.582.gccdcbd54c

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-26 15:27           ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-09-26 15:27 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Greg KH, Justin Ernst, russ.anderson, Mauro Carvalho Chehab,
	linux-edac, linux-kernel

On Wed, Sep 26, 2018 at 11:35:11AM +0200, Borislav Petkov wrote:
> * or Greg coming and saying, you're using bus_type all wrong and you
> shouldn't and you should remove it completely! :-)

Yap, and so he did! :-)

It looks like we can remove the whole per-MC bus thing, see below. Patch
seems to work here and grepping sysfs hierarchy before and after:

find /sys/ | grep -i edac > ...

doesn't show any differences.

Tony, I'd appreciate running this on one of your big boxes to check
nothing is missing in sysfs...

Thx.
---
 drivers/edac/edac_mc.c       |  9 +--------
 drivers/edac/edac_mc_sysfs.c | 30 ++----------------------------
 include/linux/edac.h         |  6 ------
 3 files changed, 3 insertions(+), 42 deletions(-)

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 7d3edd713932..13594ffadcb3 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -55,8 +55,6 @@ static LIST_HEAD(mc_devices);
  */
 static const char *edac_mc_owner;
 
-static struct bus_type mc_bus[EDAC_MAX_MCS];
-
 int edac_get_report_status(void)
 {
 	return edac_report;
@@ -716,11 +714,6 @@ int edac_mc_add_mc_with_groups(struct mem_ctl_info *mci,
 	int ret = -EINVAL;
 	edac_dbg(0, "\n");
 
-	if (mci->mc_idx >= EDAC_MAX_MCS) {
-		pr_warn_once("Too many memory controllers: %d\n", mci->mc_idx);
-		return -ENODEV;
-	}
-
 #ifdef CONFIG_EDAC_DEBUG
 	if (edac_debug_level >= 3)
 		edac_mc_dump_mci(mci);
@@ -760,7 +753,7 @@ int edac_mc_add_mc_with_groups(struct mem_ctl_info *mci,
 	/* set load time so that error rate can be tracked */
 	mci->start_time = jiffies;
 
-	mci->bus = &mc_bus[mci->mc_idx];
+	mci->bus = edac_get_sysfs_subsys();
 
 	if (edac_create_sysfs_mci_device(mci, groups)) {
 		edac_mc_printk(mci, KERN_WARNING,
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 20374b8248f0..2ca2012f2857 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -914,27 +914,8 @@ static const struct device_type mci_attr_type = {
 int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
 				 const struct attribute_group **groups)
 {
-	char *name;
 	int i, err;
 
-	/*
-	 * The memory controller needs its own bus, in order to avoid
-	 * namespace conflicts at /sys/bus/edac.
-	 */
-	name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
-	if (!name)
-		return -ENOMEM;
-
-	mci->bus->name = name;
-
-	edac_dbg(0, "creating bus %s\n", mci->bus->name);
-
-	err = bus_register(mci->bus);
-	if (err < 0) {
-		kfree(name);
-		return err;
-	}
-
 	/* get the /sys/devices/system/edac subsys reference */
 	mci->dev.type = &mci_attr_type;
 	device_initialize(&mci->dev);
@@ -950,7 +931,7 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
 	err = device_add(&mci->dev);
 	if (err < 0) {
 		edac_dbg(1, "failure: create device %s\n", dev_name(&mci->dev));
-		goto fail_unregister_bus;
+		goto out;
 	}
 
 	/*
@@ -998,10 +979,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
 		device_unregister(&dimm->dev);
 	}
 	device_unregister(&mci->dev);
-fail_unregister_bus:
-	bus_unregister(mci->bus);
-	kfree(name);
 
+out:
 	return err;
 }
 
@@ -1032,13 +1011,8 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 
 void edac_unregister_sysfs(struct mem_ctl_info *mci)
 {
-	struct bus_type *bus = mci->bus;
-	const char *name = mci->bus->name;
-
 	edac_dbg(1, "Unregistering device %s\n", dev_name(&mci->dev));
 	device_unregister(&mci->dev);
-	bus_unregister(bus);
-	kfree(name);
 }
 
 static void mc_attr_release(struct device *dev)
diff --git a/include/linux/edac.h b/include/linux/edac.h
index a45ce1f84bfc..dd8d006f1837 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -668,10 +668,4 @@ struct mem_ctl_info {
 	bool fake_inject_ue;
 	u16 fake_inject_count;
 };
-
-/*
- * Maximum number of memory controllers in the coherent fabric.
- */
-#define EDAC_MAX_MCS	16
-
 #endif

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-26 16:03             ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2018-09-26 16:03 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Luck, Tony, Greg KH, Justin Ernst, russ.anderson,
	Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

Em Wed, 26 Sep 2018 17:27:52 +0200
Borislav Petkov <bp@alien8.de> escreveu:

> On Wed, Sep 26, 2018 at 11:35:11AM +0200, Borislav Petkov wrote:
> > * or Greg coming and saying, you're using bus_type all wrong and you
> > shouldn't and you should remove it completely! :-)  
> 
> Yap, and so he did! :-)
> 
> It looks like we can remove the whole per-MC bus thing, see below. 

I guess this is/was needed to create things like this:

	lrwxrwxrwx 1 root root 0 set 26 05:24 /sys/bus/edac/devices/mc -> ../../../devices/system/edac/mc

(I don't test this logic for a while).

> Patch
> seems to work here and grepping sysfs hierarchy before and after:
> 
> find /sys/ | grep -i edac > ...
> 
> doesn't show any differences.
> 
> Tony, I'd appreciate running this on one of your big boxes to check
> nothing is missing in sysfs...

There are a few EDAC drivers for old server chipsets that create
error report mechanisms for PCI bus controllers. I don't remember
the specifics anymore. I *suspect* those are the drivers that
do such usage:

	$ git grep -l edac_pci_add_device
	drivers/edac/amd8111_edac.c
	drivers/edac/amd8131_edac.c
	drivers/edac/edac_pci.c
	drivers/edac/edac_pci.h
	drivers/edac/mpc85xx_edac.c
	drivers/edac/mv64x60_edac.c
	drivers/edac/octeon_edac-pci.c

None of them use Intel chipsets.

Last time I tried to test it (more than 5 years ago), it was
hard to find such systems at the labs I had access on that time.

Anyway, as this is a more unusual usage of EDAC API, it could be
worth to double-check if this patch won't break it, maybe doing
some test on such machines before/after this patch in order to 
verify that this won't cause regressions there.

In thesis, it shouldn't affect, as the changes are happening at
edac_mc*.c, but I won't doubt that it might have a hidden dependency
somewhere.

> 
> Thx.
> 
> ---
>  drivers/edac/edac_mc.c       |  9 +--------
>  drivers/edac/edac_mc_sysfs.c | 30 ++----------------------------
>  include/linux/edac.h         |  6 ------
>  3 files changed, 3 insertions(+), 42 deletions(-)
> 
> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
> index 7d3edd713932..13594ffadcb3 100644
> --- a/drivers/edac/edac_mc.c
> +++ b/drivers/edac/edac_mc.c
> @@ -55,8 +55,6 @@ static LIST_HEAD(mc_devices);
>   */
>  static const char *edac_mc_owner;
>  
> -static struct bus_type mc_bus[EDAC_MAX_MCS];
> -
>  int edac_get_report_status(void)
>  {
>  	return edac_report;
> @@ -716,11 +714,6 @@ int edac_mc_add_mc_with_groups(struct mem_ctl_info *mci,
>  	int ret = -EINVAL;
>  	edac_dbg(0, "\n");
>  
> -	if (mci->mc_idx >= EDAC_MAX_MCS) {
> -		pr_warn_once("Too many memory controllers: %d\n", mci->mc_idx);
> -		return -ENODEV;
> -	}
> -
>  #ifdef CONFIG_EDAC_DEBUG
>  	if (edac_debug_level >= 3)
>  		edac_mc_dump_mci(mci);
> @@ -760,7 +753,7 @@ int edac_mc_add_mc_with_groups(struct mem_ctl_info *mci,
>  	/* set load time so that error rate can be tracked */
>  	mci->start_time = jiffies;
>  
> -	mci->bus = &mc_bus[mci->mc_idx];
> +	mci->bus = edac_get_sysfs_subsys();
>  
>  	if (edac_create_sysfs_mci_device(mci, groups)) {
>  		edac_mc_printk(mci, KERN_WARNING,
> diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
> index 20374b8248f0..2ca2012f2857 100644
> --- a/drivers/edac/edac_mc_sysfs.c
> +++ b/drivers/edac/edac_mc_sysfs.c
> @@ -914,27 +914,8 @@ static const struct device_type mci_attr_type = {
>  int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
>  				 const struct attribute_group **groups)
>  {
> -	char *name;
>  	int i, err;
>  
> -	/*
> -	 * The memory controller needs its own bus, in order to avoid
> -	 * namespace conflicts at /sys/bus/edac.
> -	 */
> -	name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
> -	if (!name)
> -		return -ENOMEM;
> -
> -	mci->bus->name = name;
> -
> -	edac_dbg(0, "creating bus %s\n", mci->bus->name);
> -
> -	err = bus_register(mci->bus);
> -	if (err < 0) {
> -		kfree(name);
> -		return err;
> -	}
> -
>  	/* get the /sys/devices/system/edac subsys reference */
>  	mci->dev.type = &mci_attr_type;
>  	device_initialize(&mci->dev);
> @@ -950,7 +931,7 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
>  	err = device_add(&mci->dev);
>  	if (err < 0) {
>  		edac_dbg(1, "failure: create device %s\n", dev_name(&mci->dev));
> -		goto fail_unregister_bus;
> +		goto out;
>  	}
>  
>  	/*
> @@ -998,10 +979,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
>  		device_unregister(&dimm->dev);
>  	}
>  	device_unregister(&mci->dev);
> -fail_unregister_bus:
> -	bus_unregister(mci->bus);
> -	kfree(name);
>  
> +out:
>  	return err;
>  }
>  
> @@ -1032,13 +1011,8 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
>  
>  void edac_unregister_sysfs(struct mem_ctl_info *mci)
>  {
> -	struct bus_type *bus = mci->bus;
> -	const char *name = mci->bus->name;
> -
>  	edac_dbg(1, "Unregistering device %s\n", dev_name(&mci->dev));
>  	device_unregister(&mci->dev);
> -	bus_unregister(bus);
> -	kfree(name);
>  }
>  
>  static void mc_attr_release(struct device *dev)
> diff --git a/include/linux/edac.h b/include/linux/edac.h
> index a45ce1f84bfc..dd8d006f1837 100644
> --- a/include/linux/edac.h
> +++ b/include/linux/edac.h
> @@ -668,10 +668,4 @@ struct mem_ctl_info {
>  	bool fake_inject_ue;
>  	u16 fake_inject_count;
>  };
> -
> -/*
> - * Maximum number of memory controllers in the coherent fabric.
> - */
> -#define EDAC_MAX_MCS	16
> -
>  #endif



Thanks,
Mauro

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-26 16:03             ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2018-09-26 16:03 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Luck, Tony, Greg KH, Justin Ernst, russ.anderson,
	Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

Em Wed, 26 Sep 2018 17:27:52 +0200
Borislav Petkov <bp@alien8.de> escreveu:

> On Wed, Sep 26, 2018 at 11:35:11AM +0200, Borislav Petkov wrote:
> > * or Greg coming and saying, you're using bus_type all wrong and you
> > shouldn't and you should remove it completely! :-)  
> 
> Yap, and so he did! :-)
> 
> It looks like we can remove the whole per-MC bus thing, see below. 

I guess this is/was needed to create things like this:

	lrwxrwxrwx 1 root root 0 set 26 05:24 /sys/bus/edac/devices/mc -> ../../../devices/system/edac/mc

(I don't test this logic for a while).

> Patch
> seems to work here and grepping sysfs hierarchy before and after:
> 
> find /sys/ | grep -i edac > ...
> 
> doesn't show any differences.
> 
> Tony, I'd appreciate running this on one of your big boxes to check
> nothing is missing in sysfs...

There are a few EDAC drivers for old server chipsets that create
error report mechanisms for PCI bus controllers. I don't remember
the specifics anymore. I *suspect* those are the drivers that
do such usage:

	$ git grep -l edac_pci_add_device
	drivers/edac/amd8111_edac.c
	drivers/edac/amd8131_edac.c
	drivers/edac/edac_pci.c
	drivers/edac/edac_pci.h
	drivers/edac/mpc85xx_edac.c
	drivers/edac/mv64x60_edac.c
	drivers/edac/octeon_edac-pci.c

None of them use Intel chipsets.

Last time I tried to test it (more than 5 years ago), it was
hard to find such systems at the labs I had access on that time.

Anyway, as this is a more unusual usage of EDAC API, it could be
worth to double-check if this patch won't break it, maybe doing
some test on such machines before/after this patch in order to 
verify that this won't cause regressions there.

In thesis, it shouldn't affect, as the changes are happening at
edac_mc*.c, but I won't doubt that it might have a hidden dependency
somewhere.

> 
> Thx.
> 
> ---
>  drivers/edac/edac_mc.c       |  9 +--------
>  drivers/edac/edac_mc_sysfs.c | 30 ++----------------------------
>  include/linux/edac.h         |  6 ------
>  3 files changed, 3 insertions(+), 42 deletions(-)
> 
> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
> index 7d3edd713932..13594ffadcb3 100644
> --- a/drivers/edac/edac_mc.c
> +++ b/drivers/edac/edac_mc.c
> @@ -55,8 +55,6 @@ static LIST_HEAD(mc_devices);
>   */
>  static const char *edac_mc_owner;
>  
> -static struct bus_type mc_bus[EDAC_MAX_MCS];
> -
>  int edac_get_report_status(void)
>  {
>  	return edac_report;
> @@ -716,11 +714,6 @@ int edac_mc_add_mc_with_groups(struct mem_ctl_info *mci,
>  	int ret = -EINVAL;
>  	edac_dbg(0, "\n");
>  
> -	if (mci->mc_idx >= EDAC_MAX_MCS) {
> -		pr_warn_once("Too many memory controllers: %d\n", mci->mc_idx);
> -		return -ENODEV;
> -	}
> -
>  #ifdef CONFIG_EDAC_DEBUG
>  	if (edac_debug_level >= 3)
>  		edac_mc_dump_mci(mci);
> @@ -760,7 +753,7 @@ int edac_mc_add_mc_with_groups(struct mem_ctl_info *mci,
>  	/* set load time so that error rate can be tracked */
>  	mci->start_time = jiffies;
>  
> -	mci->bus = &mc_bus[mci->mc_idx];
> +	mci->bus = edac_get_sysfs_subsys();
>  
>  	if (edac_create_sysfs_mci_device(mci, groups)) {
>  		edac_mc_printk(mci, KERN_WARNING,
> diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
> index 20374b8248f0..2ca2012f2857 100644
> --- a/drivers/edac/edac_mc_sysfs.c
> +++ b/drivers/edac/edac_mc_sysfs.c
> @@ -914,27 +914,8 @@ static const struct device_type mci_attr_type = {
>  int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
>  				 const struct attribute_group **groups)
>  {
> -	char *name;
>  	int i, err;
>  
> -	/*
> -	 * The memory controller needs its own bus, in order to avoid
> -	 * namespace conflicts at /sys/bus/edac.
> -	 */
> -	name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
> -	if (!name)
> -		return -ENOMEM;
> -
> -	mci->bus->name = name;
> -
> -	edac_dbg(0, "creating bus %s\n", mci->bus->name);
> -
> -	err = bus_register(mci->bus);
> -	if (err < 0) {
> -		kfree(name);
> -		return err;
> -	}
> -
>  	/* get the /sys/devices/system/edac subsys reference */
>  	mci->dev.type = &mci_attr_type;
>  	device_initialize(&mci->dev);
> @@ -950,7 +931,7 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
>  	err = device_add(&mci->dev);
>  	if (err < 0) {
>  		edac_dbg(1, "failure: create device %s\n", dev_name(&mci->dev));
> -		goto fail_unregister_bus;
> +		goto out;
>  	}
>  
>  	/*
> @@ -998,10 +979,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
>  		device_unregister(&dimm->dev);
>  	}
>  	device_unregister(&mci->dev);
> -fail_unregister_bus:
> -	bus_unregister(mci->bus);
> -	kfree(name);
>  
> +out:
>  	return err;
>  }
>  
> @@ -1032,13 +1011,8 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
>  
>  void edac_unregister_sysfs(struct mem_ctl_info *mci)
>  {
> -	struct bus_type *bus = mci->bus;
> -	const char *name = mci->bus->name;
> -
>  	edac_dbg(1, "Unregistering device %s\n", dev_name(&mci->dev));
>  	device_unregister(&mci->dev);
> -	bus_unregister(bus);
> -	kfree(name);
>  }
>  
>  static void mc_attr_release(struct device *dev)
> diff --git a/include/linux/edac.h b/include/linux/edac.h
> index a45ce1f84bfc..dd8d006f1837 100644
> --- a/include/linux/edac.h
> +++ b/include/linux/edac.h
> @@ -668,10 +668,4 @@ struct mem_ctl_info {
>  	bool fake_inject_ue;
>  	u16 fake_inject_count;
>  };
> -
> -/*
> - * Maximum number of memory controllers in the coherent fabric.
> - */
> -#define EDAC_MAX_MCS	16
> -
>  #endif



Thanks,
Mauro

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-26 16:13   ` Aristeu Rozanski
  0 siblings, 0 replies; 56+ messages in thread
From: Aristeu Rozanski @ 2018-09-26 16:13 UTC (permalink / raw)
  To: Justin Ernst
  Cc: Borislav Petkov, russ.anderson, Mauro Carvalho Chehab,
	linux-edac, linux-kernel

On Tue, Sep 25, 2018 at 09:34:49AM -0500, Justin Ernst wrote:
> We observe an oops in the skx_edac module during boot.

That's happening also on sb_edac too and the oops comes from memory corruption
after trying to load the module several times during boot.

-- 
Aristeu


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-26 16:13   ` Aristeu Rozanski
  0 siblings, 0 replies; 56+ messages in thread
From: Aristeu Rozanski @ 2018-09-26 16:13 UTC (permalink / raw)
  To: Justin Ernst
  Cc: Borislav Petkov, russ.anderson, Mauro Carvalho Chehab,
	linux-edac, linux-kernel

On Tue, Sep 25, 2018 at 09:34:49AM -0500, Justin Ernst wrote:
> We observe an oops in the skx_edac module during boot.

That's happening also on sb_edac too and the oops comes from memory corruption
after trying to load the module several times during boot.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-26 16:17               ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-09-26 16:17 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Luck, Tony, Greg KH, Justin Ernst, russ.anderson,
	Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

On Wed, Sep 26, 2018 at 01:03:40PM -0300, Mauro Carvalho Chehab wrote:
> I guess this is/was needed to create things like this:
> 
> 	lrwxrwxrwx 1 root root 0 set 26 05:24 /sys/bus/edac/devices/mc -> ../../../devices/system/edac/mc

They're still there:

$ ls -l /sys/bus/edac/devices/
total 0
lrwxrwxrwx 1 root root 0 Sep 26 18:15 csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm9 -> ../../../devices/system/edac/mc/mc0/dimm9
lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc -> ../../../devices/system/edac/mc
lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc0 -> ../../../devices/system/edac/mc/mc0

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-26 16:17               ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-09-26 16:17 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Luck, Tony, Greg KH, Justin Ernst, russ.anderson,
	Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

On Wed, Sep 26, 2018 at 01:03:40PM -0300, Mauro Carvalho Chehab wrote:
> I guess this is/was needed to create things like this:
> 
> 	lrwxrwxrwx 1 root root 0 set 26 05:24 /sys/bus/edac/devices/mc -> ../../../devices/system/edac/mc

They're still there:

$ ls -l /sys/bus/edac/devices/
total 0
lrwxrwxrwx 1 root root 0 Sep 26 18:15 csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm9 -> ../../../devices/system/edac/mc/mc0/dimm9
lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc -> ../../../devices/system/edac/mc
lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc0 -> ../../../devices/system/edac/mc/mc0

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-26 17:39                 ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2018-09-26 17:39 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Luck, Tony, Greg KH, Justin Ernst, russ.anderson,
	Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

Em Wed, 26 Sep 2018 18:17:49 +0200
Borislav Petkov <bp@alien8.de> escreveu:

> On Wed, Sep 26, 2018 at 01:03:40PM -0300, Mauro Carvalho Chehab wrote:
> > I guess this is/was needed to create things like this:
> > 
> > 	lrwxrwxrwx 1 root root 0 set 26 05:24 /sys/bus/edac/devices/mc -> ../../../devices/system/edac/mc  
> 
> They're still there:
> 
> $ ls -l /sys/bus/edac/devices/
> total 0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm9 -> ../../../devices/system/edac/mc/mc0/dimm9
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc -> ../../../devices/system/edac/mc
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc0 -> ../../../devices/system/edac/mc/mc0
> 

Ok. Yeah, probably some changes at the driver's core made that bus
thing obsolete. Or maybe it is there to allow unbind/remove
the edac driver. Had you test it with KASAN and tried to unbind/remove
the EDAC driver?

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-26 17:39                 ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2018-09-26 17:39 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Luck, Tony, Greg KH, Justin Ernst, russ.anderson,
	Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

Em Wed, 26 Sep 2018 18:17:49 +0200
Borislav Petkov <bp@alien8.de> escreveu:

> On Wed, Sep 26, 2018 at 01:03:40PM -0300, Mauro Carvalho Chehab wrote:
> > I guess this is/was needed to create things like this:
> > 
> > 	lrwxrwxrwx 1 root root 0 set 26 05:24 /sys/bus/edac/devices/mc -> ../../../devices/system/edac/mc  
> 
> They're still there:
> 
> $ ls -l /sys/bus/edac/devices/
> total 0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm9 -> ../../../devices/system/edac/mc/mc0/dimm9
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc -> ../../../devices/system/edac/mc
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc0 -> ../../../devices/system/edac/mc/mc0
> 

Ok. Yeah, probably some changes at the driver's core made that bus
thing obsolete. Or maybe it is there to allow unbind/remove
the edac driver. Had you test it with KASAN and tried to unbind/remove
the EDAC driver?

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-26 18:10                 ` Luck, Tony
  0 siblings, 0 replies; 56+ messages in thread
From: Luck, Tony @ 2018-09-26 18:10 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Mauro Carvalho Chehab, Greg KH, Justin Ernst, russ.anderson,
	Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

On Wed, Sep 26, 2018 at 06:17:49PM +0200, Borislav Petkov wrote:
> On Wed, Sep 26, 2018 at 01:03:40PM -0300, Mauro Carvalho Chehab wrote:
> > I guess this is/was needed to create things like this:
> > 
> > 	lrwxrwxrwx 1 root root 0 set 26 05:24 /sys/bus/edac/devices/mc -> ../../../devices/system/edac/mc
> 
> They're still there:
> 
> $ ls -l /sys/bus/edac/devices/
> total 0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm9 -> ../../../devices/system/edac/mc/mc0/dimm9
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc -> ../../../devices/system/edac/mc
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc0 -> ../../../devices/system/edac/mc/mc0

I ran into trouble on my 4 socket broadwell server (so 8 memory controllers,
a whole pile of DIMMs, running from sb_edac.c)

Things start going wrong with:

[   45.216657] sysfs: cannot create duplicate filename '/bus/edac/devices/dimm0'
[   45.216663] CPU: 37 PID: 2034 Comm: systemd-udevd Not tainted 4.19.0-rc5 #1
[   45.216665] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0338.V01.1603162127 03/16/2016
[   45.216667] Call Trace:
[   45.216688]  dump_stack+0x5c/0x7b
[   45.216697]  sysfs_warn_dup+0x56/0x70
[   45.216702]  sysfs_do_create_link_sd.isra.2+0x98/0xb0
[   45.216714]  bus_add_device+0x77/0x160
[   45.216720]  device_add+0x424/0x660
[   45.216731]  edac_create_sysfs_mci_device+0xb9/0x2f0
[   45.216738]  edac_mc_add_mc_with_groups+0x111/0x2b0
[   45.216747]  sbridge_init+0x13c9/0x2000 [sb_edac]
[   45.216757]  ? _raw_spin_lock+0x1d/0x20
[   45.216765]  ? free_pcppages_bulk+0x2ca/0x630
[   45.216769]  ? 0xffffffffc050f000
[   45.216779]  do_one_initcall+0x46/0x1c8
[   45.216784]  ? free_unref_page_commit+0x95/0x120
[   45.216791]  ? _cond_resched+0x15/0x40
[   45.216798]  ? kmem_cache_alloc_trace+0x153/0x1c0
[   45.216805]  do_init_module+0x5b/0x208
[   45.216826]  load_module+0x1a2d/0x1fb0
[   45.216835]  ? __do_sys_finit_module+0xe9/0x110
[   45.216840]  __do_sys_finit_module+0xe9/0x110
[   45.216847]  do_syscall_64+0x5b/0x180
[   45.216852]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   45.216856] RIP: 0033:0x7fcdec618bd9

and fell off a cliff after that.

Going back to the old code I have a "dimm0" on each of the eight controllers:

# find /sys -name dimm0
/sys/devices/system/edac/mc/mc6/dimm0
/sys/devices/system/edac/mc/mc4/dimm0
/sys/devices/system/edac/mc/mc2/dimm0
/sys/devices/system/edac/mc/mc0/dimm0
/sys/devices/system/edac/mc/mc7/dimm0
/sys/devices/system/edac/mc/mc5/dimm0
/sys/devices/system/edac/mc/mc3/dimm0
/sys/devices/system/edac/mc/mc1/dimm0
/sys/bus/mc6/devices/dimm0
/sys/bus/mc4/devices/dimm0
/sys/bus/mc2/devices/dimm0
/sys/bus/mc0/devices/dimm0
/sys/bus/mc7/devices/dimm0
/sys/bus/mc5/devices/dimm0
/sys/bus/mc3/devices/dimm0
/sys/bus/mc1/devices/dimm0
# ls -l /sys/bus/mc0/devices
total 0
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm9 -> ../../../devices/system/edac/mc/mc0/dimm9
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 mc0 -> ../../../devices/system/edac/mc/mc0

It looks like the new code isn't trying to place the dimm symlinks
in the proper subdirectories.

-Tony

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-26 18:10                 ` Luck, Tony
  0 siblings, 0 replies; 56+ messages in thread
From: Luck, Tony @ 2018-09-26 18:10 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Mauro Carvalho Chehab, Greg KH, Justin Ernst, russ.anderson,
	Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

On Wed, Sep 26, 2018 at 06:17:49PM +0200, Borislav Petkov wrote:
> On Wed, Sep 26, 2018 at 01:03:40PM -0300, Mauro Carvalho Chehab wrote:
> > I guess this is/was needed to create things like this:
> > 
> > 	lrwxrwxrwx 1 root root 0 set 26 05:24 /sys/bus/edac/devices/mc -> ../../../devices/system/edac/mc
> 
> They're still there:
> 
> $ ls -l /sys/bus/edac/devices/
> total 0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm9 -> ../../../devices/system/edac/mc/mc0/dimm9
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc -> ../../../devices/system/edac/mc
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc0 -> ../../../devices/system/edac/mc/mc0

I ran into trouble on my 4 socket broadwell server (so 8 memory controllers,
a whole pile of DIMMs, running from sb_edac.c)

Things start going wrong with:

[   45.216657] sysfs: cannot create duplicate filename '/bus/edac/devices/dimm0'
[   45.216663] CPU: 37 PID: 2034 Comm: systemd-udevd Not tainted 4.19.0-rc5 #1
[   45.216665] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0338.V01.1603162127 03/16/2016
[   45.216667] Call Trace:
[   45.216688]  dump_stack+0x5c/0x7b
[   45.216697]  sysfs_warn_dup+0x56/0x70
[   45.216702]  sysfs_do_create_link_sd.isra.2+0x98/0xb0
[   45.216714]  bus_add_device+0x77/0x160
[   45.216720]  device_add+0x424/0x660
[   45.216731]  edac_create_sysfs_mci_device+0xb9/0x2f0
[   45.216738]  edac_mc_add_mc_with_groups+0x111/0x2b0
[   45.216747]  sbridge_init+0x13c9/0x2000 [sb_edac]
[   45.216757]  ? _raw_spin_lock+0x1d/0x20
[   45.216765]  ? free_pcppages_bulk+0x2ca/0x630
[   45.216769]  ? 0xffffffffc050f000
[   45.216779]  do_one_initcall+0x46/0x1c8
[   45.216784]  ? free_unref_page_commit+0x95/0x120
[   45.216791]  ? _cond_resched+0x15/0x40
[   45.216798]  ? kmem_cache_alloc_trace+0x153/0x1c0
[   45.216805]  do_init_module+0x5b/0x208
[   45.216826]  load_module+0x1a2d/0x1fb0
[   45.216835]  ? __do_sys_finit_module+0xe9/0x110
[   45.216840]  __do_sys_finit_module+0xe9/0x110
[   45.216847]  do_syscall_64+0x5b/0x180
[   45.216852]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   45.216856] RIP: 0033:0x7fcdec618bd9

and fell off a cliff after that.

Going back to the old code I have a "dimm0" on each of the eight controllers:

# find /sys -name dimm0
/sys/devices/system/edac/mc/mc6/dimm0
/sys/devices/system/edac/mc/mc4/dimm0
/sys/devices/system/edac/mc/mc2/dimm0
/sys/devices/system/edac/mc/mc0/dimm0
/sys/devices/system/edac/mc/mc7/dimm0
/sys/devices/system/edac/mc/mc5/dimm0
/sys/devices/system/edac/mc/mc3/dimm0
/sys/devices/system/edac/mc/mc1/dimm0
/sys/bus/mc6/devices/dimm0
/sys/bus/mc4/devices/dimm0
/sys/bus/mc2/devices/dimm0
/sys/bus/mc0/devices/dimm0
/sys/bus/mc7/devices/dimm0
/sys/bus/mc5/devices/dimm0
/sys/bus/mc3/devices/dimm0
/sys/bus/mc1/devices/dimm0
# ls -l /sys/bus/mc0/devices
total 0
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm9 -> ../../../devices/system/edac/mc/mc0/dimm9
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 mc0 -> ../../../devices/system/edac/mc/mc0

It looks like the new code isn't trying to place the dimm symlinks
in the proper subdirectories.

-Tony

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-26 18:23                   ` Russ Anderson
  0 siblings, 0 replies; 56+ messages in thread
From: Russ Anderson @ 2018-09-26 18:23 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Borislav Petkov, Mauro Carvalho Chehab, Greg KH, Justin Ernst,
	russ.anderson, Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

On Wed, Sep 26, 2018 at 11:10:35AM -0700, Luck, Tony wrote:
> On Wed, Sep 26, 2018 at 06:17:49PM +0200, Borislav Petkov wrote:
> > On Wed, Sep 26, 2018 at 01:03:40PM -0300, Mauro Carvalho Chehab wrote:
> > > I guess this is/was needed to create things like this:
> > > 
> > > 	lrwxrwxrwx 1 root root 0 set 26 05:24 /sys/bus/edac/devices/mc -> ../../../devices/system/edac/mc
> > 
> > They're still there:
> > 
> > $ ls -l /sys/bus/edac/devices/
> > total 0
> > lrwxrwxrwx 1 root root 0 Sep 26 18:15 csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
> > lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
> > lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
> > lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
> > lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm9 -> ../../../devices/system/edac/mc/mc0/dimm9
> > lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc -> ../../../devices/system/edac/mc
> > lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc0 -> ../../../devices/system/edac/mc/mc0
> 
> I ran into trouble on my 4 socket broadwell server (so 8 memory controllers,
> a whole pile of DIMMs, running from sb_edac.c)

We are also having trouble on a 32 socket system.

---------------------------------------------------------------------------------------------
[  OK  ] Started Load kdump kernel early on startup.
[   ***] (2 of 2) A start job is running for...work interfaces (18s / no limit)[  132.638611] BUG: unable to handle kernel paging request at ffff8c7efeebefff
[  132.640895] PGD 5fec3fdd067 P4D 5fec3fdd067 PUD 5fec3fda067 PMD 0 
[  132.640895] Oops: 0002 [#1] SMP PTI
[  132.640895] CPU: 650 PID: 9884 Comm: kworker/650:1 Kdump: loaded Tainted: G            E     4.19.0-rc4-ernstj+ #6
[  132.640895] Hardware name: HPE Superdome Flex/Superdome Flex, BIOS Bundle:3.0.196 SFW:IP147.007.000.071.000.1809242200 09/24/2018
[  132.640895] Workqueue: events cache_reap
[  132.640895] RIP: 0010:free_block+0x11c/0x1e0
[  132.640895] Code: ea 20 45 29 d7 41 d3 ef 0f b6 4f 1d 45 01 fa 41 d3 ea 8b 48 30 44 8d 79 ff 48 8b 48 20 44 89 78 30 48 85 c9 0f 84 a5 00 00 00 <46> 88 14 39 8b 48 30 85 c9 0f 84 32 ff ff ff 49 8b 49 10 4c 8d 50
[  132.640895] RSP: 0018:ffffc9004b0c7d90 EFLAGS: 00010086
[  132.640895] RAX: ffffea11f7fbafc0 RBX: 0000000000000002 RCX: ffff8c7dfeebf000
[  132.640895] RDX: 0000000000000005 RSI: ffff8c7e08da9328 RDI: ffff880147c02000
[  132.640895] RBP: 0000000080000000 R08: ffff8c5047c004a8 R09: ffff8c5047c00480
[  132.640895] R10: 0000000000000001 R11: ffff8c7dfeebf800 R12: ffffea0000000000
[  132.640895] R13: 000077ff80000000 R14: ffff8c5047c00488 R15: 00000000ffffffff
[  132.640895] FS:  0000000000000000(0000) GS:ffff8c7e08d80000(0000) knlGS:0000000000000000
[  132.640895] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  132.640895] CR2: ffff8c7efeebefff CR3: 000000000200a006 CR4: 00000000007606e0
[  132.640895] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  132.640895] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  132.640895] PKRU: 55555554
[  132.640895] Call Trace:
[  132.640895]  drain_array_locked+0x5b/0x80
[  132.640895]  drain_array+0x63/0x90
[  132.640895]  cache_reap+0x68/0x1f0
[  132.640895]  process_one_work+0x165/0x360
[  132.640895]  worker_thread+0x49/0x3e0
[  132.640895]  kthread+0xf8/0x130
[  132.640895]  ? max_active_store+0x60/0x60
[  132.640895]  ? kthread_bind+0x10/0x10
[  132.640895]  ret_from_fork+0x35/0x40
[  132.640895] Modules linked in: acpi_cpufreq(E-) skx_edac(E+) intel_rapl(E) nfit(E) libnvdimm(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) pcbc(E) aesni_intel(E) aes_x86_64(E) iscsi_ibft(E) crypto_simd(E) iscsi_boot_sysfs(E) cryptd(E) glue_helper(E) pcspkr(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) joydev(E) i40e(E) ipmi_ssif(E) lpc_ich(E) i2c_i801(E) mfd_core(E) wmi(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) button(E) xfs(E) libcrc32c(E) hid_generic(E) usbhid(E) sd_mod(E) mgag200(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) crc32c_intel(E) sysfillrect(E) xhci_pci(E) sysimgblt(E) fb_sys_fops(E) xhci_hcd(E) ahci(E) libahci(E) ttm(E) drm(E) libata(E) usbcore(E) sg(E) dm_multipath(E)
[  132.916934]  dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) msr(E) efivarfs(E) autofs4(E)
[  132.916934] CR2: ffff8c7efeebefff
[  132.916934] ---[ end trace 9ee1381bf4bae01f ]---
[  *** ] (2 of 2) A start job is running for.[  132.916934] RIP: 0010:free_block+0x11c/0x1e0
[  132.916934] Code: ea 20 45 29 d7 41 d3 ef 0f b6 4f 1d 45 01 fa 41 d3 ea 8b 48 30 44 8d 79 ff 48 8b 48 20 44 89 78 30 48 85 c9 0f 84 a5 00 00 00 <46> 88 14 39 8b 48 30 85 c9 0f 84 32 ff ff ff 49 8b 49 10 4c 8d 50
[  132.916934] RSP: 0018:ffffc9004b0c7d90 EFLAGS: 00010086
..work interface[  132.977236] EDAC MC: Removed device 0 for skx_edac Skylake Socket#0 IMC#0: DEV 0000:80:0a.0
[  132.916934] RAX: ffffea11f7fbafc0 RBX: 0000000000000002 RCX: ffff8c7dfeebf000
[  132.916934] RDX: 0000000000000005 RSI: ffff8c7e08da9328 RDI: ffff880147c02000
s (19s / no limi[  133.004953] RBP: 0000000080000000 R08: ffff8c5047c004a8 R09: ffff8c5047c00480
[  133.004953] R10: 0000000000000001 R11: ffff8c7dfeebf800 R12: ffffea0000000000
[  133.004953] R13: 000077ff80000000 R14: ffff8c5047c00488 R15: 00000000ffffffff
[  133.004953] FS:  0000000000000000(0000) GS:ffff8c7e08d80000(0000) knlGS:0000000000000000
[  133.004953] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  133.004953] CR2: ffff8c7efeebefff CR3: 000000000200a006 CR4: 00000000007606e0
[  133.004953] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  133.004953] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  133.004953] PKRU: 55555554
[  133.004953] Kernel panic - not syncing: Fatal exception
---------------------------------------------------------------------------------------------

> Things start going wrong with:
> 
> [   45.216657] sysfs: cannot create duplicate filename '/bus/edac/devices/dimm0'
> [   45.216663] CPU: 37 PID: 2034 Comm: systemd-udevd Not tainted 4.19.0-rc5 #1
> [   45.216665] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0338.V01.1603162127 03/16/2016
> [   45.216667] Call Trace:
> [   45.216688]  dump_stack+0x5c/0x7b
> [   45.216697]  sysfs_warn_dup+0x56/0x70
> [   45.216702]  sysfs_do_create_link_sd.isra.2+0x98/0xb0
> [   45.216714]  bus_add_device+0x77/0x160
> [   45.216720]  device_add+0x424/0x660
> [   45.216731]  edac_create_sysfs_mci_device+0xb9/0x2f0
> [   45.216738]  edac_mc_add_mc_with_groups+0x111/0x2b0
> [   45.216747]  sbridge_init+0x13c9/0x2000 [sb_edac]
> [   45.216757]  ? _raw_spin_lock+0x1d/0x20
> [   45.216765]  ? free_pcppages_bulk+0x2ca/0x630
> [   45.216769]  ? 0xffffffffc050f000
> [   45.216779]  do_one_initcall+0x46/0x1c8
> [   45.216784]  ? free_unref_page_commit+0x95/0x120
> [   45.216791]  ? _cond_resched+0x15/0x40
> [   45.216798]  ? kmem_cache_alloc_trace+0x153/0x1c0
> [   45.216805]  do_init_module+0x5b/0x208
> [   45.216826]  load_module+0x1a2d/0x1fb0
> [   45.216835]  ? __do_sys_finit_module+0xe9/0x110
> [   45.216840]  __do_sys_finit_module+0xe9/0x110
> [   45.216847]  do_syscall_64+0x5b/0x180
> [   45.216852]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [   45.216856] RIP: 0033:0x7fcdec618bd9
> 
> and fell off a cliff after that.
> 
> Going back to the old code I have a "dimm0" on each of the eight controllers:
> 
> # find /sys -name dimm0
> /sys/devices/system/edac/mc/mc6/dimm0
> /sys/devices/system/edac/mc/mc4/dimm0
> /sys/devices/system/edac/mc/mc2/dimm0
> /sys/devices/system/edac/mc/mc0/dimm0
> /sys/devices/system/edac/mc/mc7/dimm0
> /sys/devices/system/edac/mc/mc5/dimm0
> /sys/devices/system/edac/mc/mc3/dimm0
> /sys/devices/system/edac/mc/mc1/dimm0
> /sys/bus/mc6/devices/dimm0
> /sys/bus/mc4/devices/dimm0
> /sys/bus/mc2/devices/dimm0
> /sys/bus/mc0/devices/dimm0
> /sys/bus/mc7/devices/dimm0
> /sys/bus/mc5/devices/dimm0
> /sys/bus/mc3/devices/dimm0
> /sys/bus/mc1/devices/dimm0
> # ls -l /sys/bus/mc0/devices
> total 0
> lrwxrwxrwx. 1 root root 0 Sep 26 11:08 csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
> lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
> lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
> lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
> lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm9 -> ../../../devices/system/edac/mc/mc0/dimm9
> lrwxrwxrwx. 1 root root 0 Sep 26 11:08 mc0 -> ../../../devices/system/edac/mc/mc0
> 
> It looks like the new code isn't trying to place the dimm symlinks
> in the proper subdirectories.
> 
> -Tony

-- 
Russ Anderson,  SuperDome Flex Linux Kernel Group Manager
HPE - Hewlett Packard Enterprise (formerly SGI)  rja@hpe.com

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-26 18:23                   ` Russ Anderson
  0 siblings, 0 replies; 56+ messages in thread
From: Russ Anderson @ 2018-09-26 18:23 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Borislav Petkov, Mauro Carvalho Chehab, Greg KH, Justin Ernst,
	russ.anderson, Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

On Wed, Sep 26, 2018 at 11:10:35AM -0700, Luck, Tony wrote:
> On Wed, Sep 26, 2018 at 06:17:49PM +0200, Borislav Petkov wrote:
> > On Wed, Sep 26, 2018 at 01:03:40PM -0300, Mauro Carvalho Chehab wrote:
> > > I guess this is/was needed to create things like this:
> > > 
> > > 	lrwxrwxrwx 1 root root 0 set 26 05:24 /sys/bus/edac/devices/mc -> ../../../devices/system/edac/mc
> > 
> > They're still there:
> > 
> > $ ls -l /sys/bus/edac/devices/
> > total 0
> > lrwxrwxrwx 1 root root 0 Sep 26 18:15 csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
> > lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
> > lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
> > lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
> > lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm9 -> ../../../devices/system/edac/mc/mc0/dimm9
> > lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc -> ../../../devices/system/edac/mc
> > lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc0 -> ../../../devices/system/edac/mc/mc0
> 
> I ran into trouble on my 4 socket broadwell server (so 8 memory controllers,
> a whole pile of DIMMs, running from sb_edac.c)

We are also having trouble on a 32 socket system.

---------------------------------------------------------------------------------------------
[  OK  ] Started Load kdump kernel early on startup.
[   ***] (2 of 2) A start job is running for...work interfaces (18s / no limit)[  132.638611] BUG: unable to handle kernel paging request at ffff8c7efeebefff
[  132.640895] PGD 5fec3fdd067 P4D 5fec3fdd067 PUD 5fec3fda067 PMD 0 
[  132.640895] Oops: 0002 [#1] SMP PTI
[  132.640895] CPU: 650 PID: 9884 Comm: kworker/650:1 Kdump: loaded Tainted: G            E     4.19.0-rc4-ernstj+ #6
[  132.640895] Hardware name: HPE Superdome Flex/Superdome Flex, BIOS Bundle:3.0.196 SFW:IP147.007.000.071.000.1809242200 09/24/2018
[  132.640895] Workqueue: events cache_reap
[  132.640895] RIP: 0010:free_block+0x11c/0x1e0
[  132.640895] Code: ea 20 45 29 d7 41 d3 ef 0f b6 4f 1d 45 01 fa 41 d3 ea 8b 48 30 44 8d 79 ff 48 8b 48 20 44 89 78 30 48 85 c9 0f 84 a5 00 00 00 <46> 88 14 39 8b 48 30 85 c9 0f 84 32 ff ff ff 49 8b 49 10 4c 8d 50
[  132.640895] RSP: 0018:ffffc9004b0c7d90 EFLAGS: 00010086
[  132.640895] RAX: ffffea11f7fbafc0 RBX: 0000000000000002 RCX: ffff8c7dfeebf000
[  132.640895] RDX: 0000000000000005 RSI: ffff8c7e08da9328 RDI: ffff880147c02000
[  132.640895] RBP: 0000000080000000 R08: ffff8c5047c004a8 R09: ffff8c5047c00480
[  132.640895] R10: 0000000000000001 R11: ffff8c7dfeebf800 R12: ffffea0000000000
[  132.640895] R13: 000077ff80000000 R14: ffff8c5047c00488 R15: 00000000ffffffff
[  132.640895] FS:  0000000000000000(0000) GS:ffff8c7e08d80000(0000) knlGS:0000000000000000
[  132.640895] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  132.640895] CR2: ffff8c7efeebefff CR3: 000000000200a006 CR4: 00000000007606e0
[  132.640895] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  132.640895] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  132.640895] PKRU: 55555554
[  132.640895] Call Trace:
[  132.640895]  drain_array_locked+0x5b/0x80
[  132.640895]  drain_array+0x63/0x90
[  132.640895]  cache_reap+0x68/0x1f0
[  132.640895]  process_one_work+0x165/0x360
[  132.640895]  worker_thread+0x49/0x3e0
[  132.640895]  kthread+0xf8/0x130
[  132.640895]  ? max_active_store+0x60/0x60
[  132.640895]  ? kthread_bind+0x10/0x10
[  132.640895]  ret_from_fork+0x35/0x40
[  132.640895] Modules linked in: acpi_cpufreq(E-) skx_edac(E+) intel_rapl(E) nfit(E) libnvdimm(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) pcbc(E) aesni_intel(E) aes_x86_64(E) iscsi_ibft(E) crypto_simd(E) iscsi_boot_sysfs(E) cryptd(E) glue_helper(E) pcspkr(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) joydev(E) i40e(E) ipmi_ssif(E) lpc_ich(E) i2c_i801(E) mfd_core(E) wmi(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) button(E) xfs(E) libcrc32c(E) hid_generic(E) usbhid(E) sd_mod(E) mgag200(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) crc32c_intel(E) sysfillrect(E) xhci_pci(E) sysimgblt(E) fb_sys_fops(E) xhci_hcd(E) ahci(E) libahci(E) ttm(E) drm(E) libata(E) usbcore(E) sg(E) dm_multipath(E)
[  132.916934]  dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) msr(E) efivarfs(E) autofs4(E)
[  132.916934] CR2: ffff8c7efeebefff
[  132.916934] ---[ end trace 9ee1381bf4bae01f ]---
[  *** ] (2 of 2) A start job is running for.[  132.916934] RIP: 0010:free_block+0x11c/0x1e0
[  132.916934] Code: ea 20 45 29 d7 41 d3 ef 0f b6 4f 1d 45 01 fa 41 d3 ea 8b 48 30 44 8d 79 ff 48 8b 48 20 44 89 78 30 48 85 c9 0f 84 a5 00 00 00 <46> 88 14 39 8b 48 30 85 c9 0f 84 32 ff ff ff 49 8b 49 10 4c 8d 50
[  132.916934] RSP: 0018:ffffc9004b0c7d90 EFLAGS: 00010086
..work interface[  132.977236] EDAC MC: Removed device 0 for skx_edac Skylake Socket#0 IMC#0: DEV 0000:80:0a.0
[  132.916934] RAX: ffffea11f7fbafc0 RBX: 0000000000000002 RCX: ffff8c7dfeebf000
[  132.916934] RDX: 0000000000000005 RSI: ffff8c7e08da9328 RDI: ffff880147c02000
s (19s / no limi[  133.004953] RBP: 0000000080000000 R08: ffff8c5047c004a8 R09: ffff8c5047c00480
[  133.004953] R10: 0000000000000001 R11: ffff8c7dfeebf800 R12: ffffea0000000000
[  133.004953] R13: 000077ff80000000 R14: ffff8c5047c00488 R15: 00000000ffffffff
[  133.004953] FS:  0000000000000000(0000) GS:ffff8c7e08d80000(0000) knlGS:0000000000000000
[  133.004953] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  133.004953] CR2: ffff8c7efeebefff CR3: 000000000200a006 CR4: 00000000007606e0
[  133.004953] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  133.004953] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  133.004953] PKRU: 55555554
[  133.004953] Kernel panic - not syncing: Fatal exception
---------------------------------------------------------------------------------------------

> Things start going wrong with:
> 
> [   45.216657] sysfs: cannot create duplicate filename '/bus/edac/devices/dimm0'
> [   45.216663] CPU: 37 PID: 2034 Comm: systemd-udevd Not tainted 4.19.0-rc5 #1
> [   45.216665] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0338.V01.1603162127 03/16/2016
> [   45.216667] Call Trace:
> [   45.216688]  dump_stack+0x5c/0x7b
> [   45.216697]  sysfs_warn_dup+0x56/0x70
> [   45.216702]  sysfs_do_create_link_sd.isra.2+0x98/0xb0
> [   45.216714]  bus_add_device+0x77/0x160
> [   45.216720]  device_add+0x424/0x660
> [   45.216731]  edac_create_sysfs_mci_device+0xb9/0x2f0
> [   45.216738]  edac_mc_add_mc_with_groups+0x111/0x2b0
> [   45.216747]  sbridge_init+0x13c9/0x2000 [sb_edac]
> [   45.216757]  ? _raw_spin_lock+0x1d/0x20
> [   45.216765]  ? free_pcppages_bulk+0x2ca/0x630
> [   45.216769]  ? 0xffffffffc050f000
> [   45.216779]  do_one_initcall+0x46/0x1c8
> [   45.216784]  ? free_unref_page_commit+0x95/0x120
> [   45.216791]  ? _cond_resched+0x15/0x40
> [   45.216798]  ? kmem_cache_alloc_trace+0x153/0x1c0
> [   45.216805]  do_init_module+0x5b/0x208
> [   45.216826]  load_module+0x1a2d/0x1fb0
> [   45.216835]  ? __do_sys_finit_module+0xe9/0x110
> [   45.216840]  __do_sys_finit_module+0xe9/0x110
> [   45.216847]  do_syscall_64+0x5b/0x180
> [   45.216852]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [   45.216856] RIP: 0033:0x7fcdec618bd9
> 
> and fell off a cliff after that.
> 
> Going back to the old code I have a "dimm0" on each of the eight controllers:
> 
> # find /sys -name dimm0
> /sys/devices/system/edac/mc/mc6/dimm0
> /sys/devices/system/edac/mc/mc4/dimm0
> /sys/devices/system/edac/mc/mc2/dimm0
> /sys/devices/system/edac/mc/mc0/dimm0
> /sys/devices/system/edac/mc/mc7/dimm0
> /sys/devices/system/edac/mc/mc5/dimm0
> /sys/devices/system/edac/mc/mc3/dimm0
> /sys/devices/system/edac/mc/mc1/dimm0
> /sys/bus/mc6/devices/dimm0
> /sys/bus/mc4/devices/dimm0
> /sys/bus/mc2/devices/dimm0
> /sys/bus/mc0/devices/dimm0
> /sys/bus/mc7/devices/dimm0
> /sys/bus/mc5/devices/dimm0
> /sys/bus/mc3/devices/dimm0
> /sys/bus/mc1/devices/dimm0
> # ls -l /sys/bus/mc0/devices
> total 0
> lrwxrwxrwx. 1 root root 0 Sep 26 11:08 csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
> lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
> lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
> lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
> lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm9 -> ../../../devices/system/edac/mc/mc0/dimm9
> lrwxrwxrwx. 1 root root 0 Sep 26 11:08 mc0 -> ../../../devices/system/edac/mc/mc0
> 
> It looks like the new code isn't trying to place the dimm symlinks
> in the proper subdirectories.
> 
> -Tony

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-26 23:02                     ` Luck, Tony
  0 siblings, 0 replies; 56+ messages in thread
From: Luck, Tony @ 2018-09-26 23:02 UTC (permalink / raw)
  To: Russ Anderson
  Cc: Borislav Petkov, Mauro Carvalho Chehab, Greg KH, Justin Ernst,
	russ.anderson, Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

This issue has made me look a bit more at what EDAC puts in sysfs.
It seems like the current code inherits some useless baggage
from the device calls it makes.

E.g. all the "power" subdirectories:

$ find /sys/devices/system/edac -name power
/sys/devices/system/edac/power
/sys/devices/system/edac/mc/mc6/dimm3/power
/sys/devices/system/edac/mc/mc6/power
/sys/devices/system/edac/mc/mc6/csrow0/power
/sys/devices/system/edac/mc/mc6/dimm6/power
/sys/devices/system/edac/mc/mc6/dimm0/power
/sys/devices/system/edac/mc/mc6/dimm9/power
/sys/devices/system/edac/mc/mc4/dimm3/power
/sys/devices/system/edac/mc/mc4/power
... total of 50 of these ...

$ grep -r . /sys/devices/system/edac/mc/mc6/dimm0/power
/sys/devices/system/edac/mc/mc6/dimm0/power/runtime_active_time:0
/sys/devices/system/edac/mc/mc6/dimm0/power/runtime_status:unsupported
grep: /sys/devices/system/edac/mc/mc6/dimm0/power/autosuspend_delay_ms: Input/output error
/sys/devices/system/edac/mc/mc6/dimm0/power/runtime_suspended_time:0
/sys/devices/system/edac/mc/mc6/dimm0/power/control:auto

We don't have stats, nor control of power on a per memory controller
or per dimm basis. So all these files are just noise.


But ... we are at -rc5. Not sure that we'll figure out, write, test & debug
the proper solution in the next 3-4 weeks. So perhaps we should apply

-#define EDAC_MAX_MCS   16
+#define EDAC_MAX_MCS   64

as a temporary band-aid to get HPE's 32-socket machine running while
we work on the proper fix?

-Tony

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-26 23:02                     ` Luck, Tony
  0 siblings, 0 replies; 56+ messages in thread
From: Luck, Tony @ 2018-09-26 23:02 UTC (permalink / raw)
  To: Russ Anderson
  Cc: Borislav Petkov, Mauro Carvalho Chehab, Greg KH, Justin Ernst,
	russ.anderson, Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

This issue has made me look a bit more at what EDAC puts in sysfs.
It seems like the current code inherits some useless baggage
from the device calls it makes.

E.g. all the "power" subdirectories:

$ find /sys/devices/system/edac -name power
/sys/devices/system/edac/power
/sys/devices/system/edac/mc/mc6/dimm3/power
/sys/devices/system/edac/mc/mc6/power
/sys/devices/system/edac/mc/mc6/csrow0/power
/sys/devices/system/edac/mc/mc6/dimm6/power
/sys/devices/system/edac/mc/mc6/dimm0/power
/sys/devices/system/edac/mc/mc6/dimm9/power
/sys/devices/system/edac/mc/mc4/dimm3/power
/sys/devices/system/edac/mc/mc4/power
... total of 50 of these ...

$ grep -r . /sys/devices/system/edac/mc/mc6/dimm0/power
/sys/devices/system/edac/mc/mc6/dimm0/power/runtime_active_time:0
/sys/devices/system/edac/mc/mc6/dimm0/power/runtime_status:unsupported
grep: /sys/devices/system/edac/mc/mc6/dimm0/power/autosuspend_delay_ms: Input/output error
/sys/devices/system/edac/mc/mc6/dimm0/power/runtime_suspended_time:0
/sys/devices/system/edac/mc/mc6/dimm0/power/control:auto

We don't have stats, nor control of power on a per memory controller
or per dimm basis. So all these files are just noise.


But ... we are at -rc5. Not sure that we'll figure out, write, test & debug
the proper solution in the next 3-4 weeks. So perhaps we should apply

-#define EDAC_MAX_MCS   16
+#define EDAC_MAX_MCS   64

as a temporary band-aid to get HPE's 32-socket machine running while
we work on the proper fix?

-Tony

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-27  4:52                       ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-09-27  4:52 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Russ Anderson, Mauro Carvalho Chehab, Greg KH, Justin Ernst,
	russ.anderson, Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

On Wed, Sep 26, 2018 at 04:02:57PM -0700, Luck, Tony wrote:
> We don't have stats, nor control of power on a per memory controller
> or per dimm basis. So all these files are just noise.

Yeah, and also, looking at your previous mail, stuff like:

/sys/bus/mc6/devices/dimm0
/sys/bus/mc4/devices/dimm0

doesn't make any sense: why is mc* directly under bus? It should be
under ...bus/edac/mc/...

We'll have to clean it up carefully, when there's time.

> But ... we are at -rc5. Not sure that we'll figure out, write, test & debug
> the proper solution in the next 3-4 weeks. So perhaps we should apply
> 
> -#define EDAC_MAX_MCS   16
> +#define EDAC_MAX_MCS   64
> 
> as a temporary band-aid to get HPE's 32-socket machine running while
> we work on the proper fix?

Yeah, after sleeping on it I see it the same way - band-aid it now and
clean it up properly later.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-27  4:52                       ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-09-27  4:52 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Russ Anderson, Mauro Carvalho Chehab, Greg KH, Justin Ernst,
	russ.anderson, Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

On Wed, Sep 26, 2018 at 04:02:57PM -0700, Luck, Tony wrote:
> We don't have stats, nor control of power on a per memory controller
> or per dimm basis. So all these files are just noise.

Yeah, and also, looking at your previous mail, stuff like:

/sys/bus/mc6/devices/dimm0
/sys/bus/mc4/devices/dimm0

doesn't make any sense: why is mc* directly under bus? It should be
under ...bus/edac/mc/...

We'll have to clean it up carefully, when there's time.

> But ... we are at -rc5. Not sure that we'll figure out, write, test & debug
> the proper solution in the next 3-4 weeks. So perhaps we should apply
> 
> -#define EDAC_MAX_MCS   16
> +#define EDAC_MAX_MCS   64
> 
> as a temporary band-aid to get HPE's 32-socket machine running while
> we work on the proper fix?

Yeah, after sleeping on it I see it the same way - band-aid it now and
clean it up properly later.

Thx.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-27  5:56   ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-09-27  5:56 UTC (permalink / raw)
  To: Justin Ernst
  Cc: russ.anderson, Mauro Carvalho Chehab, linux-edac, linux-kernel

On Tue, Sep 25, 2018 at 09:34:49AM -0500, Justin Ernst wrote:
> We observe an oops in the skx_edac module during boot.
> Examining /var/log/messages:
> [ 3401.985757] EDAC MC0: Giving out device to module skx_edac controller Skylake Socket#0 IMC#0
> [ 3401.985887] EDAC MC1: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
> [ 3401.986014] EDAC MC2: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
> ...
> [ 3401.987318] EDAC MC13: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
> [ 3401.987435] EDAC MC14: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
> [ 3401.987556] EDAC MC15: Giving out device to module skx_edac controller Skylake Socket#1 IMC#1
> [ 3401.987579] Too many memory controllers: 16
> [ 3402.042614] EDAC MC: Removed device 0 for skx_edac Skylake Socket#0 IMC#0
> 
> We observe there are two memory controllers per socket, with a limit of 16.
> Raise the maximum number of memory controllers from 16 to 2 * MAX_NUMNODES (1024).
> 
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
> Cc: linux-edac@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Acked-by: Russ Anderson <russ.anderson@hpe.com>
> Signed-off-by: Justin Ernst <justin.ernst@hpe.com>
> ---
>  include/linux/edac.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Applied, thanks.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-27  5:56   ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-09-27  5:56 UTC (permalink / raw)
  To: Justin Ernst
  Cc: russ.anderson, Mauro Carvalho Chehab, linux-edac, linux-kernel

On Tue, Sep 25, 2018 at 09:34:49AM -0500, Justin Ernst wrote:
> We observe an oops in the skx_edac module during boot.
> Examining /var/log/messages:
> [ 3401.985757] EDAC MC0: Giving out device to module skx_edac controller Skylake Socket#0 IMC#0
> [ 3401.985887] EDAC MC1: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
> [ 3401.986014] EDAC MC2: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
> ...
> [ 3401.987318] EDAC MC13: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
> [ 3401.987435] EDAC MC14: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
> [ 3401.987556] EDAC MC15: Giving out device to module skx_edac controller Skylake Socket#1 IMC#1
> [ 3401.987579] Too many memory controllers: 16
> [ 3402.042614] EDAC MC: Removed device 0 for skx_edac Skylake Socket#0 IMC#0
> 
> We observe there are two memory controllers per socket, with a limit of 16.
> Raise the maximum number of memory controllers from 16 to 2 * MAX_NUMNODES (1024).
> 
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
> Cc: linux-edac@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Acked-by: Russ Anderson <russ.anderson@hpe.com>
> Signed-off-by: Justin Ernst <justin.ernst@hpe.com>
> ---
>  include/linux/edac.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Applied, thanks.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-27 21:44                         ` Luck, Tony
  0 siblings, 0 replies; 56+ messages in thread
From: Luck, Tony @ 2018-09-27 21:44 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Russ Anderson, Mauro Carvalho Chehab, Greg KH, Justin Ernst,
	russ.anderson, Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

On Thu, Sep 27, 2018 at 06:52:44AM +0200, Borislav Petkov wrote:
> On Wed, Sep 26, 2018 at 04:02:57PM -0700, Luck, Tony wrote:
> > But ... we are at -rc5. Not sure that we'll figure out, write, test & debug
> > the proper solution in the next 3-4 weeks. So perhaps we should apply
> > 
> > -#define EDAC_MAX_MCS   16
> > +#define EDAC_MAX_MCS   64
> > 
> > as a temporary band-aid to get HPE's 32-socket machine running while
> > we work on the proper fix?
> 
> Yeah, after sleeping on it I see it the same way - band-aid it now and
> clean it up properly later.

The problem with your patch that gets rid of EDAC_MAX_MCS is making
device links under /sys/bus/edac.  Which is hinted at in some of the
code your patch deleted:

-       /*
-        * The memory controller needs its own bus, in order to avoid
-        * namespace conflicts at /sys/bus/edac.
-        */
-       name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
-       if (!name)
-               return -ENOMEM;
-
-       mci->bus->name = name;
-
-       edac_dbg(0, "creating bus %s\n", mci->bus->name);
-
-       err = bus_register(mci->bus);

Just to see if there was anything else wrong I added a patch to
make the names unique:


diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 2ca2012f2857..6ec6d8a2adb8 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -410,7 +410,7 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci,
 	device_initialize(&csrow->dev);
 	csrow->dev.parent = &mci->dev;
 	csrow->mci = mci;
-	dev_set_name(&csrow->dev, "csrow%d", index);
+	dev_set_name(&csrow->dev, "mci%d_csrow%d", mci->mc_idx, index);
 	dev_set_drvdata(&csrow->dev, csrow);
 
 	edac_dbg(0, "creating (virtual) csrow node %s\n",
@@ -641,9 +641,9 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
 
 	dimm->dev.parent = &mci->dev;
 	if (mci->csbased)
-		dev_set_name(&dimm->dev, "rank%d", index);
+		dev_set_name(&dimm->dev, "mci%d_rank%d", mci->mc_idx, index);
 	else
-		dev_set_name(&dimm->dev, "dimm%d", index);
+		dev_set_name(&dimm->dev, "mci%d_dimm%d", mci->mc_idx, index);
 	dev_set_drvdata(&dimm->dev, dimm);
 	pm_runtime_forbid(&mci->dev);
 

which seemed to work.  But then I began wondering what are ABI expectations
from applications that read the EDAC /sys files?

Is this this current source repository?  https://github.com/grondo/edac-utils

This code doesn't seem to know about the "dimm*" directories below the
"mc*" level. It just looks for the csrow* entries.

-Tony

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-27 21:44                         ` Luck, Tony
  0 siblings, 0 replies; 56+ messages in thread
From: Luck, Tony @ 2018-09-27 21:44 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Russ Anderson, Mauro Carvalho Chehab, Greg KH, Justin Ernst,
	russ.anderson, Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

On Thu, Sep 27, 2018 at 06:52:44AM +0200, Borislav Petkov wrote:
> On Wed, Sep 26, 2018 at 04:02:57PM -0700, Luck, Tony wrote:
> > But ... we are at -rc5. Not sure that we'll figure out, write, test & debug
> > the proper solution in the next 3-4 weeks. So perhaps we should apply
> > 
> > -#define EDAC_MAX_MCS   16
> > +#define EDAC_MAX_MCS   64
> > 
> > as a temporary band-aid to get HPE's 32-socket machine running while
> > we work on the proper fix?
> 
> Yeah, after sleeping on it I see it the same way - band-aid it now and
> clean it up properly later.

The problem with your patch that gets rid of EDAC_MAX_MCS is making
device links under /sys/bus/edac.  Which is hinted at in some of the
code your patch deleted:

-       /*
-        * The memory controller needs its own bus, in order to avoid
-        * namespace conflicts at /sys/bus/edac.
-        */
-       name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
-       if (!name)
-               return -ENOMEM;
-
-       mci->bus->name = name;
-
-       edac_dbg(0, "creating bus %s\n", mci->bus->name);
-
-       err = bus_register(mci->bus);

Just to see if there was anything else wrong I added a patch to
make the names unique:



which seemed to work.  But then I began wondering what are ABI expectations
from applications that read the EDAC /sys files?

Is this this current source repository?  https://github.com/grondo/edac-utils

This code doesn't seem to know about the "dimm*" directories below the
"mc*" level. It just looks for the csrow* entries.

-Tony

diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 2ca2012f2857..6ec6d8a2adb8 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -410,7 +410,7 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci,
 	device_initialize(&csrow->dev);
 	csrow->dev.parent = &mci->dev;
 	csrow->mci = mci;
-	dev_set_name(&csrow->dev, "csrow%d", index);
+	dev_set_name(&csrow->dev, "mci%d_csrow%d", mci->mc_idx, index);
 	dev_set_drvdata(&csrow->dev, csrow);
 
 	edac_dbg(0, "creating (virtual) csrow node %s\n",
@@ -641,9 +641,9 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
 
 	dimm->dev.parent = &mci->dev;
 	if (mci->csbased)
-		dev_set_name(&dimm->dev, "rank%d", index);
+		dev_set_name(&dimm->dev, "mci%d_rank%d", mci->mc_idx, index);
 	else
-		dev_set_name(&dimm->dev, "dimm%d", index);
+		dev_set_name(&dimm->dev, "mci%d_dimm%d", mci->mc_idx, index);
 	dev_set_drvdata(&dimm->dev, dimm);
 	pm_runtime_forbid(&mci->dev);
 

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-27 22:03                           ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-09-27 22:03 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Russ Anderson, Mauro Carvalho Chehab, Greg KH, Justin Ernst,
	russ.anderson, Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

On Thu, Sep 27, 2018 at 02:44:01PM -0700, Luck, Tony wrote:
> The problem with your patch that gets rid of EDAC_MAX_MCS is making
> device links under /sys/bus/edac.  Which is hinted at in some of the
> code your patch deleted:
> 
> -       /*
> -        * The memory controller needs its own bus, in order to avoid
> -        * namespace conflicts at /sys/bus/edac.
> -        */
> -       name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
> -       if (!name)
> -               return -ENOMEM;
> -
> -       mci->bus->name = name;

Yes, and that needed to go because I am using a single bus. Which kinda
makes sense because you want to have a single bus and multiple devices
on it. I mean, if we *have* to have a bus.

I think this whole /sys/bus/edac thing is crap and needs to go. We
have a perfectly fine hierarchy under /sys/devices/system/edac and
duplicating it under /sys/bus/edac is just bollocks. IMHO. Feel free to
correct me with, but but, this is useful for...

> which seemed to work.

Right.

> But then I began wondering what are ABI expectations
> from applications that read the EDAC /sys files?
> 
> Is this this current source repository?  https://github.com/grondo/edac-utils
> 
> This code doesn't seem to know about the "dimm*" directories below the
> "mc*" level. It just looks for the csrow* entries.

I guess this is a question for Mauro. I never really needed any special
edac tool to get info and if you ask me, we probably should try to keep
it simple and grep sysfs. So that you can always get the info without
having to install any special tools. Like ftrace works on every system
with just a shell and basic tools. I think this is very powerful. But
this is old spartan me only thinking out loud.

In any case, I'm more than fine with dropping the bus hierarchy if
nothing uses it.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-27 22:03                           ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-09-27 22:03 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Russ Anderson, Mauro Carvalho Chehab, Greg KH, Justin Ernst,
	russ.anderson, Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

On Thu, Sep 27, 2018 at 02:44:01PM -0700, Luck, Tony wrote:
> The problem with your patch that gets rid of EDAC_MAX_MCS is making
> device links under /sys/bus/edac.  Which is hinted at in some of the
> code your patch deleted:
> 
> -       /*
> -        * The memory controller needs its own bus, in order to avoid
> -        * namespace conflicts at /sys/bus/edac.
> -        */
> -       name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
> -       if (!name)
> -               return -ENOMEM;
> -
> -       mci->bus->name = name;

Yes, and that needed to go because I am using a single bus. Which kinda
makes sense because you want to have a single bus and multiple devices
on it. I mean, if we *have* to have a bus.

I think this whole /sys/bus/edac thing is crap and needs to go. We
have a perfectly fine hierarchy under /sys/devices/system/edac and
duplicating it under /sys/bus/edac is just bollocks. IMHO. Feel free to
correct me with, but but, this is useful for...

> which seemed to work.

Right.

> But then I began wondering what are ABI expectations
> from applications that read the EDAC /sys files?
> 
> Is this this current source repository?  https://github.com/grondo/edac-utils
> 
> This code doesn't seem to know about the "dimm*" directories below the
> "mc*" level. It just looks for the csrow* entries.

I guess this is a question for Mauro. I never really needed any special
edac tool to get info and if you ask me, we probably should try to keep
it simple and grep sysfs. So that you can always get the info without
having to install any special tools. Like ftrace works on every system
with just a shell and basic tools. I think this is very powerful. But
this is old spartan me only thinking out loud.

In any case, I'm more than fine with dropping the bus hierarchy if
nothing uses it.

Thx.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-09-28  1:10                             ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2018-09-28  1:10 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Luck, Tony, Russ Anderson, Greg KH, Justin Ernst, russ.anderson,
	Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

Em Fri, 28 Sep 2018 00:03:55 +0200
Borislav Petkov <bp@alien8.de> escreveu:

> On Thu, Sep 27, 2018 at 02:44:01PM -0700, Luck, Tony wrote:
> > The problem with your patch that gets rid of EDAC_MAX_MCS is making
> > device links under /sys/bus/edac.  Which is hinted at in some of the
> > code your patch deleted:
> > 
> > -       /*
> > -        * The memory controller needs its own bus, in order to avoid
> > -        * namespace conflicts at /sys/bus/edac.
> > -        */
> > -       name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
> > -       if (!name)
> > -               return -ENOMEM;
> > -
> > -       mci->bus->name = name;  
> 
> Yes, and that needed to go because I am using a single bus. Which kinda
> makes sense because you want to have a single bus and multiple devices
> on it. I mean, if we *have* to have a bus.
> 
> I think this whole /sys/bus/edac thing is crap and needs to go. We
> have a perfectly fine hierarchy under /sys/devices/system/edac and
> duplicating it under /sys/bus/edac is just bollocks. IMHO. Feel free to
> correct me with, but but, this is useful for...
> 
> > which seemed to work.  
> 
> Right.
> 
> > But then I began wondering what are ABI expectations
> > from applications that read the EDAC /sys files?
> > 
> > Is this this current source repository?  https://github.com/grondo/edac-utils
> > 
> > This code doesn't seem to know about the "dimm*" directories below the
> > "mc*" level. It just looks for the csrow* entries.  
> 
> I guess this is a question for Mauro. I never really needed any special
> edac tool to get info and if you ask me, we probably should try to keep
> it simple and grep sysfs. So that you can always get the info without
> having to install any special tools. Like ftrace works on every system
> with just a shell and basic tools. I think this is very powerful. But
> this is old spartan me only thinking out loud.
> 
> In any case, I'm more than fine with dropping the bus hierarchy if
> nothing uses it.

I don't remember about any rationale behind /sys/bus/edac. It was
there already before I start working on EDAC about 10 years ago.
I guess it was used in the past by edac-utils (or maybe it is just a
side effect of the need to create a bus on some past).

Btw, The documented EDAC ABI is /sys/devices/system/edac, as
described at Documentation/ABI/testing/sysfs-devices-edac.

So, I suspect it should be safe to get rid of /sys/bus/edac,
provided that it won't cause side effects at /sys/devices/system/edac.

Why I think it is safe to get rid of /sys/bus/edac?
---------------------------------------------------

As far as I can tell, there are only two toolsets: the legacy edac-utils
and the rasdaemon. At least on Fedora 28, both applications are 
packaged (meaning that there are probably people using both).

The edac-utils uses the old sysfs entries (the ones whose entries
are dated up to 2007). I don't see any changes upstream for it
since 2008:

	https://sourceforge.net/projects/edac-utils/

I did a grep on its source code (on its version 0.16, from 2018). It seems
that it uses only /sys/devices/system/edac.

The rasdaemon uses also the new sysfs entries (the ones dated as
2012 and 2016). I'm maintaining it. Rastool not only receive traces,
but it can also store them on a database and even generate ABRT events.
It also uses only /sys/devices/system/edac.

On both toolsets, the sysfs entries there are important, in order to 
not only list the memory layout and error counts, but also to store
the dimm labels.

The rasdaemon itself uses perf trace events, although Aris added
support for it to work on non-daemon mode, where it just reads
the counters via sysfs, at /sys/devices/system/edac.

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-09-28  1:10                             ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2018-09-28  1:10 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Luck, Tony, Russ Anderson, Greg KH, Justin Ernst, russ.anderson,
	Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

Em Fri, 28 Sep 2018 00:03:55 +0200
Borislav Petkov <bp@alien8.de> escreveu:

> On Thu, Sep 27, 2018 at 02:44:01PM -0700, Luck, Tony wrote:
> > The problem with your patch that gets rid of EDAC_MAX_MCS is making
> > device links under /sys/bus/edac.  Which is hinted at in some of the
> > code your patch deleted:
> > 
> > -       /*
> > -        * The memory controller needs its own bus, in order to avoid
> > -        * namespace conflicts at /sys/bus/edac.
> > -        */
> > -       name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
> > -       if (!name)
> > -               return -ENOMEM;
> > -
> > -       mci->bus->name = name;  
> 
> Yes, and that needed to go because I am using a single bus. Which kinda
> makes sense because you want to have a single bus and multiple devices
> on it. I mean, if we *have* to have a bus.
> 
> I think this whole /sys/bus/edac thing is crap and needs to go. We
> have a perfectly fine hierarchy under /sys/devices/system/edac and
> duplicating it under /sys/bus/edac is just bollocks. IMHO. Feel free to
> correct me with, but but, this is useful for...
> 
> > which seemed to work.  
> 
> Right.
> 
> > But then I began wondering what are ABI expectations
> > from applications that read the EDAC /sys files?
> > 
> > Is this this current source repository?  https://github.com/grondo/edac-utils
> > 
> > This code doesn't seem to know about the "dimm*" directories below the
> > "mc*" level. It just looks for the csrow* entries.  
> 
> I guess this is a question for Mauro. I never really needed any special
> edac tool to get info and if you ask me, we probably should try to keep
> it simple and grep sysfs. So that you can always get the info without
> having to install any special tools. Like ftrace works on every system
> with just a shell and basic tools. I think this is very powerful. But
> this is old spartan me only thinking out loud.
> 
> In any case, I'm more than fine with dropping the bus hierarchy if
> nothing uses it.

I don't remember about any rationale behind /sys/bus/edac. It was
there already before I start working on EDAC about 10 years ago.
I guess it was used in the past by edac-utils (or maybe it is just a
side effect of the need to create a bus on some past).

Btw, The documented EDAC ABI is /sys/devices/system/edac, as
described at Documentation/ABI/testing/sysfs-devices-edac.

So, I suspect it should be safe to get rid of /sys/bus/edac,
provided that it won't cause side effects at /sys/devices/system/edac.

Why I think it is safe to get rid of /sys/bus/edac?
---------------------------------------------------

As far as I can tell, there are only two toolsets: the legacy edac-utils
and the rasdaemon. At least on Fedora 28, both applications are 
packaged (meaning that there are probably people using both).

The edac-utils uses the old sysfs entries (the ones whose entries
are dated up to 2007). I don't see any changes upstream for it
since 2008:

	https://sourceforge.net/projects/edac-utils/

I did a grep on its source code (on its version 0.16, from 2018). It seems
that it uses only /sys/devices/system/edac.

The rasdaemon uses also the new sysfs entries (the ones dated as
2012 and 2016). I'm maintaining it. Rastool not only receive traces,
but it can also store them on a database and even generate ABRT events.
It also uses only /sys/devices/system/edac.

On both toolsets, the sysfs entries there are important, in order to 
not only list the memory layout and error counts, but also to store
the dimm labels.

The rasdaemon itself uses perf trace events, although Aris added
support for it to work on non-daemon mode, where it just reads
the counters via sysfs, at /sys/devices/system/edac.

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] Raise maximum number of memory controllers
@ 2018-10-01 12:47                               ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-10-01 12:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Luck, Tony, Russ Anderson, Greg KH, Justin Ernst, russ.anderson,
	Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

On Thu, Sep 27, 2018 at 10:10:54PM -0300, Mauro Carvalho Chehab wrote:
> I don't remember about any rationale behind /sys/bus/edac. It was
> there already before I start working on EDAC about 10 years ago.
> I guess it was used in the past by edac-utils (or maybe it is just a
> side effect of the need to create a bus on some past).
> 
> Btw, The documented EDAC ABI is /sys/devices/system/edac, as
> described at Documentation/ABI/testing/sysfs-devices-edac.
> 
> So, I suspect it should be safe to get rid of /sys/bus/edac,
> provided that it won't cause side effects at /sys/devices/system/edac.
> 
> Why I think it is safe to get rid of /sys/bus/edac?
> ---------------------------------------------------

...

Thanks for the analysis. So yap, I think we should try to rip out the
whole bus hierarchy then, when we have a quiet minute and whoever does
this, should add your analysis to the commit message so that we know.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Raise maximum number of memory controllers
@ 2018-10-01 12:47                               ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-10-01 12:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Luck, Tony, Russ Anderson, Greg KH, Justin Ernst, russ.anderson,
	Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

On Thu, Sep 27, 2018 at 10:10:54PM -0300, Mauro Carvalho Chehab wrote:
> I don't remember about any rationale behind /sys/bus/edac. It was
> there already before I start working on EDAC about 10 years ago.
> I guess it was used in the past by edac-utils (or maybe it is just a
> side effect of the need to create a bus on some past).
> 
> Btw, The documented EDAC ABI is /sys/devices/system/edac, as
> described at Documentation/ABI/testing/sysfs-devices-edac.
> 
> So, I suspect it should be safe to get rid of /sys/bus/edac,
> provided that it won't cause side effects at /sys/devices/system/edac.
> 
> Why I think it is safe to get rid of /sys/bus/edac?
> ---------------------------------------------------

...

Thanks for the analysis. So yap, I think we should try to rip out the
whole bus hierarchy then, when we have a quiet minute and whoever does
this, should add your analysis to the commit message so that we know.

Thx.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH] EDAC: Don't add devices under /sys/bus/edac
@ 2018-10-01 22:43                                 ` Luck, Tony
  0 siblings, 0 replies; 56+ messages in thread
From: Luck, Tony @ 2018-10-01 22:43 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Mauro Carvalho Chehab, Russ Anderson, Greg KH, Justin Ernst,
	russ.anderson, Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

Nobody(*) uses them.  Dropping this will allow us to make the total
number of memory controllers configurable (as we won't have to
worry about duplicated device names under this directory).

(*) https://marc.info/?l=linux-edac&m=153809709903987&w=2

Signed-off-by: Tony Luck <tony.luck@intel.com>
---

Boris: Apply this, then your earlier patch to get rid of the
hard coded limit on the number of memory controllers:
  https://marc.info/?l=linux-edac&m=153797567628947&w=2
the combination works on my 4 socket machine. Perhaps HPE
can test on their superdome.

 drivers/edac/edac_mc_sysfs.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 20374b8248f0..4c1bee59c2e6 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -405,7 +405,6 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci,
 				    struct csrow_info *csrow, int index)
 {
 	csrow->dev.type = &csrow_attr_type;
-	csrow->dev.bus = mci->bus;
 	csrow->dev.groups = csrow_dev_groups;
 	device_initialize(&csrow->dev);
 	csrow->dev.parent = &mci->dev;
@@ -636,7 +635,6 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
 	dimm->mci = mci;
 
 	dimm->dev.type = &dimm_attr_type;
-	dimm->dev.bus = mci->bus;
 	device_initialize(&dimm->dev);
 
 	dimm->dev.parent = &mci->dev;
@@ -940,7 +938,6 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
 	device_initialize(&mci->dev);
 
 	mci->dev.parent = mci_pdev;
-	mci->dev.bus = mci->bus;
 	mci->dev.groups = groups;
 	dev_set_name(&mci->dev, "mc%d", mci->mc_idx);
 	dev_set_drvdata(&mci->dev, mci);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* EDAC: Don't add devices under /sys/bus/edac
@ 2018-10-01 22:43                                 ` Luck, Tony
  0 siblings, 0 replies; 56+ messages in thread
From: Luck, Tony @ 2018-10-01 22:43 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Mauro Carvalho Chehab, Russ Anderson, Greg KH, Justin Ernst,
	russ.anderson, Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

Nobody(*) uses them.  Dropping this will allow us to make the total
number of memory controllers configurable (as we won't have to
worry about duplicated device names under this directory).

(*) https://marc.info/?l=linux-edac&m=153809709903987&w=2

Signed-off-by: Tony Luck <tony.luck@intel.com>
---

Boris: Apply this, then your earlier patch to get rid of the
hard coded limit on the number of memory controllers:
  https://marc.info/?l=linux-edac&m=153797567628947&w=2
the combination works on my 4 socket machine. Perhaps HPE
can test on their superdome.

 drivers/edac/edac_mc_sysfs.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 20374b8248f0..4c1bee59c2e6 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -405,7 +405,6 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci,
 				    struct csrow_info *csrow, int index)
 {
 	csrow->dev.type = &csrow_attr_type;
-	csrow->dev.bus = mci->bus;
 	csrow->dev.groups = csrow_dev_groups;
 	device_initialize(&csrow->dev);
 	csrow->dev.parent = &mci->dev;
@@ -636,7 +635,6 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
 	dimm->mci = mci;
 
 	dimm->dev.type = &dimm_attr_type;
-	dimm->dev.bus = mci->bus;
 	device_initialize(&dimm->dev);
 
 	dimm->dev.parent = &mci->dev;
@@ -940,7 +938,6 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
 	device_initialize(&mci->dev);
 
 	mci->dev.parent = mci_pdev;
-	mci->dev.bus = mci->bus;
 	mci->dev.groups = groups;
 	dev_set_name(&mci->dev, "mc%d", mci->mc_idx);
 	dev_set_drvdata(&mci->dev, mci);

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH] EDAC: Don't add devices under /sys/bus/edac
@ 2018-10-02  1:22                                   ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2018-10-02  1:22 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Borislav Petkov, Russ Anderson, Greg KH, Justin Ernst,
	russ.anderson, Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

Em Mon, 1 Oct 2018 15:43:13 -0700
"Luck, Tony" <tony.luck@intel.com> escreveu:

> Nobody(*) uses them.  Dropping this will allow us to make the total
> number of memory controllers configurable (as we won't have to
> worry about duplicated device names under this directory).
> 
> (*) https://marc.info/?l=linux-edac&m=153809709903987&w=2
> 
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
> 
> Boris: Apply this, then your earlier patch to get rid of the
> hard coded limit on the number of memory controllers:
>   https://marc.info/?l=linux-edac&m=153797567628947&w=2
> the combination works on my 4 socket machine. Perhaps HPE
> can test on their superdome.
> 

For both this and the referred patch:

Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>

>  drivers/edac/edac_mc_sysfs.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
> index 20374b8248f0..4c1bee59c2e6 100644
> --- a/drivers/edac/edac_mc_sysfs.c
> +++ b/drivers/edac/edac_mc_sysfs.c
> @@ -405,7 +405,6 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci,
>  				    struct csrow_info *csrow, int index)
>  {
>  	csrow->dev.type = &csrow_attr_type;
> -	csrow->dev.bus = mci->bus;
>  	csrow->dev.groups = csrow_dev_groups;
>  	device_initialize(&csrow->dev);
>  	csrow->dev.parent = &mci->dev;
> @@ -636,7 +635,6 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
>  	dimm->mci = mci;
>  
>  	dimm->dev.type = &dimm_attr_type;
> -	dimm->dev.bus = mci->bus;
>  	device_initialize(&dimm->dev);
>  
>  	dimm->dev.parent = &mci->dev;
> @@ -940,7 +938,6 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
>  	device_initialize(&mci->dev);
>  
>  	mci->dev.parent = mci_pdev;
> -	mci->dev.bus = mci->bus;
>  	mci->dev.groups = groups;
>  	dev_set_name(&mci->dev, "mc%d", mci->mc_idx);
>  	dev_set_drvdata(&mci->dev, mci);



Thanks,
Mauro

^ permalink raw reply	[flat|nested] 56+ messages in thread

* EDAC: Don't add devices under /sys/bus/edac
@ 2018-10-02  1:22                                   ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2018-10-02  1:22 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Borislav Petkov, Russ Anderson, Greg KH, Justin Ernst,
	russ.anderson, Mauro Carvalho Chehab, linux-edac, linux-kernel,
	Aristeu Rozanski Filho

Em Mon, 1 Oct 2018 15:43:13 -0700
"Luck, Tony" <tony.luck@intel.com> escreveu:

> Nobody(*) uses them.  Dropping this will allow us to make the total
> number of memory controllers configurable (as we won't have to
> worry about duplicated device names under this directory).
> 
> (*) https://marc.info/?l=linux-edac&m=153809709903987&w=2
> 
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
> 
> Boris: Apply this, then your earlier patch to get rid of the
> hard coded limit on the number of memory controllers:
>   https://marc.info/?l=linux-edac&m=153797567628947&w=2
> the combination works on my 4 socket machine. Perhaps HPE
> can test on their superdome.
> 

For both this and the referred patch:

Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>

>  drivers/edac/edac_mc_sysfs.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
> index 20374b8248f0..4c1bee59c2e6 100644
> --- a/drivers/edac/edac_mc_sysfs.c
> +++ b/drivers/edac/edac_mc_sysfs.c
> @@ -405,7 +405,6 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci,
>  				    struct csrow_info *csrow, int index)
>  {
>  	csrow->dev.type = &csrow_attr_type;
> -	csrow->dev.bus = mci->bus;
>  	csrow->dev.groups = csrow_dev_groups;
>  	device_initialize(&csrow->dev);
>  	csrow->dev.parent = &mci->dev;
> @@ -636,7 +635,6 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
>  	dimm->mci = mci;
>  
>  	dimm->dev.type = &dimm_attr_type;
> -	dimm->dev.bus = mci->bus;
>  	device_initialize(&dimm->dev);
>  
>  	dimm->dev.parent = &mci->dev;
> @@ -940,7 +938,6 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
>  	device_initialize(&mci->dev);
>  
>  	mci->dev.parent = mci_pdev;
> -	mci->dev.bus = mci->bus;
>  	mci->dev.groups = groups;
>  	dev_set_name(&mci->dev, "mc%d", mci->mc_idx);
>  	dev_set_drvdata(&mci->dev, mci);



Thanks,
Mauro

^ permalink raw reply	[flat|nested] 56+ messages in thread

* RE: [PATCH] EDAC: Don't add devices under /sys/bus/edac
@ 2018-10-02 15:51                                     ` Justin Ernst
  0 siblings, 0 replies; 56+ messages in thread
From: Ernst, Justin @ 2018-10-02 15:51 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Borislav Petkov, Greg KH, Anderson, Russ, Mauro Carvalho Chehab,
	linux-edac, linux-kernel, Aristeu Rozanski Filho

The combined patches work on a 20 socket system. 
Thanks!
-Justin

> -----Original Message-----
> From: Mauro Carvalho Chehab [mailto:mchehab+samsung@kernel.org]
> Sent: Monday, October 1, 2018 8:23 PM
> To: Luck, Tony <tony.luck@intel.com>
> Cc: Borislav Petkov <bp@alien8.de>; Anderson, Russ
> <russ.anderson@hpe.com>; Greg KH <gregkh@linuxfoundation.org>; Ernst,
> Justin <justin.ernst@hpe.com>; Anderson, Russ <russ.anderson@hpe.com>;
> Mauro Carvalho Chehab <mchehab@kernel.org>; linux-
> edac@vger.kernel.org; linux-kernel@vger.kernel.org; Aristeu Rozanski Filho
> <arozansk@redhat.com>
> Subject: Re: [PATCH] EDAC: Don't add devices under /sys/bus/edac
> 
> Em Mon, 1 Oct 2018 15:43:13 -0700
> "Luck, Tony" <tony.luck@intel.com> escreveu:
> 
> > Nobody(*) uses them.  Dropping this will allow us to make the total
> > number of memory controllers configurable (as we won't have to
> > worry about duplicated device names under this directory).
> >
> > (*) https://marc.info/?l=linux-edac&m=153809709903987&w=2
> >
> > Signed-off-by: Tony Luck <tony.luck@intel.com>
> > ---
> >
> > Boris: Apply this, then your earlier patch to get rid of the
> > hard coded limit on the number of memory controllers:
> >   https://marc.info/?l=linux-edac&m=153797567628947&w=2
> > the combination works on my 4 socket machine. Perhaps HPE
> > can test on their superdome.
> >
> 
> For both this and the referred patch:
> 
> Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
> 
> >  drivers/edac/edac_mc_sysfs.c | 3 ---
> >  1 file changed, 3 deletions(-)
> >
> > diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
> > index 20374b8248f0..4c1bee59c2e6 100644
> > --- a/drivers/edac/edac_mc_sysfs.c
> > +++ b/drivers/edac/edac_mc_sysfs.c
> > @@ -405,7 +405,6 @@ static int edac_create_csrow_object(struct
> mem_ctl_info *mci,
> >  				    struct csrow_info *csrow, int index)
> >  {
> >  	csrow->dev.type = &csrow_attr_type;
> > -	csrow->dev.bus = mci->bus;
> >  	csrow->dev.groups = csrow_dev_groups;
> >  	device_initialize(&csrow->dev);
> >  	csrow->dev.parent = &mci->dev;
> > @@ -636,7 +635,6 @@ static int edac_create_dimm_object(struct
> mem_ctl_info *mci,
> >  	dimm->mci = mci;
> >
> >  	dimm->dev.type = &dimm_attr_type;
> > -	dimm->dev.bus = mci->bus;
> >  	device_initialize(&dimm->dev);
> >
> >  	dimm->dev.parent = &mci->dev;
> > @@ -940,7 +938,6 @@ int edac_create_sysfs_mci_device(struct
> mem_ctl_info *mci,
> >  	device_initialize(&mci->dev);
> >
> >  	mci->dev.parent = mci_pdev;
> > -	mci->dev.bus = mci->bus;
> >  	mci->dev.groups = groups;
> >  	dev_set_name(&mci->dev, "mc%d", mci->mc_idx);
> >  	dev_set_drvdata(&mci->dev, mci);
> 
> 
> 
> Thanks,
> Mauro

^ permalink raw reply	[flat|nested] 56+ messages in thread

* EDAC: Don't add devices under /sys/bus/edac
@ 2018-10-02 15:51                                     ` Justin Ernst
  0 siblings, 0 replies; 56+ messages in thread
From: Justin Ernst @ 2018-10-02 15:51 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Borislav Petkov, Greg KH, Anderson, Russ, Mauro Carvalho Chehab,
	linux-edac, linux-kernel, Aristeu Rozanski Filho

The combined patches work on a 20 socket system. 
Thanks!
-Justin

> -----Original Message-----
> From: Mauro Carvalho Chehab [mailto:mchehab+samsung@kernel.org]
> Sent: Monday, October 1, 2018 8:23 PM
> To: Luck, Tony <tony.luck@intel.com>
> Cc: Borislav Petkov <bp@alien8.de>; Anderson, Russ
> <russ.anderson@hpe.com>; Greg KH <gregkh@linuxfoundation.org>; Ernst,
> Justin <justin.ernst@hpe.com>; Anderson, Russ <russ.anderson@hpe.com>;
> Mauro Carvalho Chehab <mchehab@kernel.org>; linux-
> edac@vger.kernel.org; linux-kernel@vger.kernel.org; Aristeu Rozanski Filho
> <arozansk@redhat.com>
> Subject: Re: [PATCH] EDAC: Don't add devices under /sys/bus/edac
> 
> Em Mon, 1 Oct 2018 15:43:13 -0700
> "Luck, Tony" <tony.luck@intel.com> escreveu:
> 
> > Nobody(*) uses them.  Dropping this will allow us to make the total
> > number of memory controllers configurable (as we won't have to
> > worry about duplicated device names under this directory).
> >
> > (*) https://marc.info/?l=linux-edac&m=153809709903987&w=2
> >
> > Signed-off-by: Tony Luck <tony.luck@intel.com>
> > ---
> >
> > Boris: Apply this, then your earlier patch to get rid of the
> > hard coded limit on the number of memory controllers:
> >   https://marc.info/?l=linux-edac&m=153797567628947&w=2
> > the combination works on my 4 socket machine. Perhaps HPE
> > can test on their superdome.
> >
> 
> For both this and the referred patch:
> 
> Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
> 
> >  drivers/edac/edac_mc_sysfs.c | 3 ---
> >  1 file changed, 3 deletions(-)
> >
> > diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
> > index 20374b8248f0..4c1bee59c2e6 100644
> > --- a/drivers/edac/edac_mc_sysfs.c
> > +++ b/drivers/edac/edac_mc_sysfs.c
> > @@ -405,7 +405,6 @@ static int edac_create_csrow_object(struct
> mem_ctl_info *mci,
> >  				    struct csrow_info *csrow, int index)
> >  {
> >  	csrow->dev.type = &csrow_attr_type;
> > -	csrow->dev.bus = mci->bus;
> >  	csrow->dev.groups = csrow_dev_groups;
> >  	device_initialize(&csrow->dev);
> >  	csrow->dev.parent = &mci->dev;
> > @@ -636,7 +635,6 @@ static int edac_create_dimm_object(struct
> mem_ctl_info *mci,
> >  	dimm->mci = mci;
> >
> >  	dimm->dev.type = &dimm_attr_type;
> > -	dimm->dev.bus = mci->bus;
> >  	device_initialize(&dimm->dev);
> >
> >  	dimm->dev.parent = &mci->dev;
> > @@ -940,7 +938,6 @@ int edac_create_sysfs_mci_device(struct
> mem_ctl_info *mci,
> >  	device_initialize(&mci->dev);
> >
> >  	mci->dev.parent = mci_pdev;
> > -	mci->dev.bus = mci->bus;
> >  	mci->dev.groups = groups;
> >  	dev_set_name(&mci->dev, "mc%d", mci->mc_idx);
> >  	dev_set_drvdata(&mci->dev, mci);
> 
> 
> 
> Thanks,
> Mauro

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] EDAC: Don't add devices under /sys/bus/edac
@ 2018-10-02 16:26                                       ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-10-02 16:26 UTC (permalink / raw)
  To: Ernst, Justin
  Cc: Luck, Tony, Greg KH, Anderson, Russ, Mauro Carvalho Chehab,
	linux-edac, linux-kernel, Aristeu Rozanski Filho

On Tue, Oct 02, 2018 at 03:51:41PM +0000, Ernst, Justin wrote:
> The combined patches work on a 20 socket system. 
> Thanks!

Cool, thanks for testing.

Nevertheless, I'll queue them for 4.21 so that we have a full cycle of
testing before we really kill the bus thing.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* EDAC: Don't add devices under /sys/bus/edac
@ 2018-10-02 16:26                                       ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-10-02 16:26 UTC (permalink / raw)
  To: Ernst, Justin
  Cc: Luck, Tony, Greg KH, Anderson, Russ, Mauro Carvalho Chehab,
	linux-edac, linux-kernel, Aristeu Rozanski Filho

On Tue, Oct 02, 2018 at 03:51:41PM +0000, Ernst, Justin wrote:
> The combined patches work on a 20 socket system. 
> Thanks!

Cool, thanks for testing.

Nevertheless, I'll queue them for 4.21 so that we have a full cycle of
testing before we really kill the bus thing.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] EDAC: Don't add devices under /sys/bus/edac
@ 2018-11-06 14:45                                         ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-11-06 14:45 UTC (permalink / raw)
  To: Ernst, Justin, Luck, Tony
  Cc: Greg KH, Anderson, Russ, Mauro Carvalho Chehab, linux-edac,
	linux-kernel, Aristeu Rozanski Filho

On Tue, Oct 02, 2018 at 06:26:08PM +0200, Borislav Petkov wrote:
> On Tue, Oct 02, 2018 at 03:51:41PM +0000, Ernst, Justin wrote:
> > The combined patches work on a 20 socket system. 
> > Thanks!
> 
> Cool, thanks for testing.
> 
> Nevertheless, I'll queue them for 4.21 so that we have a full cycle of
> testing before we really kill the bus thing.

Ok, I've pushed the two patches ontop of EDAC's for-next branch, here:

https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=edac-for-4.21-bus

and I'd appreciate testing them one more time on the big boxes you guys have.

Diffing sysfs here shows this:

--- edac.before	2018-11-06 13:37:37.925448609 +0100
+++ edac.after	2018-11-06 15:36:11.229497795 +0100
@@ -37,7 +37,6 @@
 /sys/devices/system/edac/mc/mc0/dimm3/power/control
 /sys/devices/system/edac/mc/mc0/dimm3/dimm_dev_type
 /sys/devices/system/edac/mc/mc0/dimm3/size
-/sys/devices/system/edac/mc/mc0/dimm3/subsystem
 /sys/devices/system/edac/mc/mc0/dimm3/dimm_ce_count
 /sys/devices/system/edac/mc/mc0/dimm3/dimm_label
 /sys/devices/system/edac/mc/mc0/dimm3/dimm_location

which is the bus symlink:

subsystem -> ../../../../../../bus/mc0

and I think that's ok as nothing should be using it.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* EDAC: Don't add devices under /sys/bus/edac
@ 2018-11-06 14:45                                         ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-11-06 14:45 UTC (permalink / raw)
  To: Ernst, Justin, Luck, Tony
  Cc: Greg KH, Anderson, Russ, Mauro Carvalho Chehab, linux-edac,
	linux-kernel, Aristeu Rozanski Filho

On Tue, Oct 02, 2018 at 06:26:08PM +0200, Borislav Petkov wrote:
> On Tue, Oct 02, 2018 at 03:51:41PM +0000, Ernst, Justin wrote:
> > The combined patches work on a 20 socket system. 
> > Thanks!
> 
> Cool, thanks for testing.
> 
> Nevertheless, I'll queue them for 4.21 so that we have a full cycle of
> testing before we really kill the bus thing.

Ok, I've pushed the two patches ontop of EDAC's for-next branch, here:

https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=edac-for-4.21-bus

and I'd appreciate testing them one more time on the big boxes you guys have.

Diffing sysfs here shows this:


which is the bus symlink:

subsystem -> ../../../../../../bus/mc0

and I think that's ok as nothing should be using it.

Thx.

--- edac.before	2018-11-06 13:37:37.925448609 +0100
+++ edac.after	2018-11-06 15:36:11.229497795 +0100
@@ -37,7 +37,6 @@
 /sys/devices/system/edac/mc/mc0/dimm3/power/control
 /sys/devices/system/edac/mc/mc0/dimm3/dimm_dev_type
 /sys/devices/system/edac/mc/mc0/dimm3/size
-/sys/devices/system/edac/mc/mc0/dimm3/subsystem
 /sys/devices/system/edac/mc/mc0/dimm3/dimm_ce_count
 /sys/devices/system/edac/mc/mc0/dimm3/dimm_label
 /sys/devices/system/edac/mc/mc0/dimm3/dimm_location

^ permalink raw reply	[flat|nested] 56+ messages in thread

* RE: [PATCH] EDAC: Don't add devices under /sys/bus/edac
@ 2018-11-13 19:09                                           ` Justin Ernst
  0 siblings, 0 replies; 56+ messages in thread
From: Ernst, Justin @ 2018-11-13 19:09 UTC (permalink / raw)
  To: Borislav Petkov, Luck, Tony
  Cc: Greg KH, Anderson, Russ, Mauro Carvalho Chehab, linux-edac,
	linux-kernel, Aristeu Rozanski Filho, Ernst, Justin

Looks good on a 32 socket system. All 64 memory controllers show up and I'm able to see the same sysfs diff.
Thanks
-Justin

> -----Original Message-----
> From: Borislav Petkov [mailto:bp@alien8.de]
> Sent: Tuesday, November 6, 2018 8:46 AM
> To: Ernst, Justin <justin.ernst@hpe.com>; Luck, Tony <tony.luck@intel.com>
> Cc: Greg KH <gregkh@linuxfoundation.org>; Anderson, Russ
> <russ.anderson@hpe.com>; Mauro Carvalho Chehab
> <mchehab@kernel.org>; linux-edac@vger.kernel.org; linux-
> kernel@vger.kernel.org; Aristeu Rozanski Filho <arozansk@redhat.com>
> Subject: Re: [PATCH] EDAC: Don't add devices under /sys/bus/edac
> 
> On Tue, Oct 02, 2018 at 06:26:08PM +0200, Borislav Petkov wrote:
> > On Tue, Oct 02, 2018 at 03:51:41PM +0000, Ernst, Justin wrote:
> > > The combined patches work on a 20 socket system.
> > > Thanks!
> >
> > Cool, thanks for testing.
> >
> > Nevertheless, I'll queue them for 4.21 so that we have a full cycle of
> > testing before we really kill the bus thing.
> 
> Ok, I've pushed the two patches ontop of EDAC's for-next branch, here:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=edac-for-
> 4.21-bus
> 
> and I'd appreciate testing them one more time on the big boxes you guys
> have.
> 
> Diffing sysfs here shows this:
> 
> --- edac.before	2018-11-06 13:37:37.925448609 +0100
> +++ edac.after	2018-11-06 15:36:11.229497795 +0100
> @@ -37,7 +37,6 @@
>  /sys/devices/system/edac/mc/mc0/dimm3/power/control
>  /sys/devices/system/edac/mc/mc0/dimm3/dimm_dev_type
>  /sys/devices/system/edac/mc/mc0/dimm3/size
> -/sys/devices/system/edac/mc/mc0/dimm3/subsystem
>  /sys/devices/system/edac/mc/mc0/dimm3/dimm_ce_count
>  /sys/devices/system/edac/mc/mc0/dimm3/dimm_label
>  /sys/devices/system/edac/mc/mc0/dimm3/dimm_location
> 
> which is the bus symlink:
> 
> subsystem -> ../../../../../../bus/mc0
> 
> and I think that's ok as nothing should be using it.
> 
> Thx.
> 
> --
> Regards/Gruss,
>     Boris.
> 
> Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* EDAC: Don't add devices under /sys/bus/edac
@ 2018-11-13 19:09                                           ` Justin Ernst
  0 siblings, 0 replies; 56+ messages in thread
From: Justin Ernst @ 2018-11-13 19:09 UTC (permalink / raw)
  To: Borislav Petkov, Luck, Tony
  Cc: Greg KH, Anderson, Russ, Mauro Carvalho Chehab, linux-edac,
	linux-kernel, Aristeu Rozanski Filho, Ernst, Justin

Looks good on a 32 socket system. All 64 memory controllers show up and I'm able to see the same sysfs diff.
Thanks
-Justin

> -----Original Message-----
> From: Borislav Petkov [mailto:bp@alien8.de]
> Sent: Tuesday, November 6, 2018 8:46 AM
> To: Ernst, Justin <justin.ernst@hpe.com>; Luck, Tony <tony.luck@intel.com>
> Cc: Greg KH <gregkh@linuxfoundation.org>; Anderson, Russ
> <russ.anderson@hpe.com>; Mauro Carvalho Chehab
> <mchehab@kernel.org>; linux-edac@vger.kernel.org; linux-
> kernel@vger.kernel.org; Aristeu Rozanski Filho <arozansk@redhat.com>
> Subject: Re: [PATCH] EDAC: Don't add devices under /sys/bus/edac
> 
> On Tue, Oct 02, 2018 at 06:26:08PM +0200, Borislav Petkov wrote:
> > On Tue, Oct 02, 2018 at 03:51:41PM +0000, Ernst, Justin wrote:
> > > The combined patches work on a 20 socket system.
> > > Thanks!
> >
> > Cool, thanks for testing.
> >
> > Nevertheless, I'll queue them for 4.21 so that we have a full cycle of
> > testing before we really kill the bus thing.
> 
> Ok, I've pushed the two patches ontop of EDAC's for-next branch, here:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=edac-for-
> 4.21-bus
> 
> and I'd appreciate testing them one more time on the big boxes you guys
> have.
> 
> Diffing sysfs here shows this:
> 
> --- edac.before	2018-11-06 13:37:37.925448609 +0100
> +++ edac.after	2018-11-06 15:36:11.229497795 +0100
> @@ -37,7 +37,6 @@
>  /sys/devices/system/edac/mc/mc0/dimm3/power/control
>  /sys/devices/system/edac/mc/mc0/dimm3/dimm_dev_type
>  /sys/devices/system/edac/mc/mc0/dimm3/size
> -/sys/devices/system/edac/mc/mc0/dimm3/subsystem
>  /sys/devices/system/edac/mc/mc0/dimm3/dimm_ce_count
>  /sys/devices/system/edac/mc/mc0/dimm3/dimm_label
>  /sys/devices/system/edac/mc/mc0/dimm3/dimm_location
> 
> which is the bus symlink:
> 
> subsystem -> ../../../../../../bus/mc0
> 
> and I think that's ok as nothing should be using it.
> 
> Thx.
> 
> --
> Regards/Gruss,
>     Boris.
> 
> Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] EDAC: Don't add devices under /sys/bus/edac
@ 2018-11-13 19:15                                             ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-11-13 19:15 UTC (permalink / raw)
  To: Ernst, Justin
  Cc: Luck, Tony, Greg KH, Anderson, Russ, Mauro Carvalho Chehab,
	linux-edac, linux-kernel, Aristeu Rozanski Filho

On Tue, Nov 13, 2018 at 07:09:24PM +0000, Ernst, Justin wrote:
> Looks good on a 32 socket system. All 64 memory controllers show up
> and I'm able to see the same sysfs diff. Thanks

Thanks for testing, much appreciated!

I'm queuing the stuff for 4.21.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* EDAC: Don't add devices under /sys/bus/edac
@ 2018-11-13 19:15                                             ` Borislav Petkov
  0 siblings, 0 replies; 56+ messages in thread
From: Borislav Petkov @ 2018-11-13 19:15 UTC (permalink / raw)
  To: Ernst, Justin
  Cc: Luck, Tony, Greg KH, Anderson, Russ, Mauro Carvalho Chehab,
	linux-edac, linux-kernel, Aristeu Rozanski Filho

On Tue, Nov 13, 2018 at 07:09:24PM +0000, Ernst, Justin wrote:
> Looks good on a 32 socket system. All 64 memory controllers show up
> and I'm able to see the same sysfs diff. Thanks

Thanks for testing, much appreciated!

I'm queuing the stuff for 4.21.

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2018-11-13 19:15 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-25 14:34 [PATCH] Raise maximum number of memory controllers Justin Ernst
2018-09-25 14:34 ` Justin Ernst
2018-09-25 15:26 ` [PATCH] " Borislav Petkov
2018-09-25 15:26   ` Borislav Petkov
2018-09-25 17:50   ` [PATCH] " Luck, Tony
2018-09-25 17:50     ` Luck, Tony
2018-09-25 18:07     ` [PATCH] " Borislav Petkov
2018-09-25 18:07       ` Borislav Petkov
2018-09-26  9:35       ` [PATCH] " Borislav Petkov
2018-09-26  9:35         ` Borislav Petkov
2018-09-26 15:27         ` [PATCH] " Borislav Petkov
2018-09-26 15:27           ` Borislav Petkov
2018-09-26 16:03           ` [PATCH] " Mauro Carvalho Chehab
2018-09-26 16:03             ` Mauro Carvalho Chehab
2018-09-26 16:17             ` [PATCH] " Borislav Petkov
2018-09-26 16:17               ` Borislav Petkov
2018-09-26 17:39               ` [PATCH] " Mauro Carvalho Chehab
2018-09-26 17:39                 ` Mauro Carvalho Chehab
2018-09-26 18:10               ` [PATCH] " Luck, Tony
2018-09-26 18:10                 ` Luck, Tony
2018-09-26 18:23                 ` [PATCH] " Russ Anderson
2018-09-26 18:23                   ` Russ Anderson
2018-09-26 23:02                   ` [PATCH] " Luck, Tony
2018-09-26 23:02                     ` Luck, Tony
2018-09-27  4:52                     ` [PATCH] " Borislav Petkov
2018-09-27  4:52                       ` Borislav Petkov
2018-09-27 21:44                       ` [PATCH] " Luck, Tony
2018-09-27 21:44                         ` Luck, Tony
2018-09-27 22:03                         ` [PATCH] " Borislav Petkov
2018-09-27 22:03                           ` Borislav Petkov
2018-09-28  1:10                           ` [PATCH] " Mauro Carvalho Chehab
2018-09-28  1:10                             ` Mauro Carvalho Chehab
2018-10-01 12:47                             ` [PATCH] " Borislav Petkov
2018-10-01 12:47                               ` Borislav Petkov
2018-10-01 22:43                               ` [PATCH] EDAC: Don't add devices under /sys/bus/edac Luck, Tony
2018-10-01 22:43                                 ` Luck, Tony
2018-10-02  1:22                                 ` [PATCH] " Mauro Carvalho Chehab
2018-10-02  1:22                                   ` Mauro Carvalho Chehab
2018-10-02 15:51                                   ` [PATCH] " Ernst, Justin
2018-10-02 15:51                                     ` Justin Ernst
2018-10-02 16:26                                     ` [PATCH] " Borislav Petkov
2018-10-02 16:26                                       ` Borislav Petkov
2018-11-06 14:45                                       ` [PATCH] " Borislav Petkov
2018-11-06 14:45                                         ` Borislav Petkov
2018-11-13 19:09                                         ` [PATCH] " Ernst, Justin
2018-11-13 19:09                                           ` Justin Ernst
2018-11-13 19:15                                           ` [PATCH] " Borislav Petkov
2018-11-13 19:15                                             ` Borislav Petkov
2018-09-26  7:55 ` [PATCH] Raise maximum number of memory controllers Zhuo, Qiuxu
2018-09-26  7:55   ` Qiuxu Zhuo
2018-09-26 13:53   ` [PATCH] " Russ Anderson
2018-09-26 13:53     ` Russ Anderson
2018-09-26 16:13 ` [PATCH] " Aristeu Rozanski
2018-09-26 16:13   ` Aristeu Rozanski
2018-09-27  5:56 ` [PATCH] " Borislav Petkov
2018-09-27  5:56   ` Borislav Petkov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.