Linux-EDAC Archive on lore.kernel.org
 help / color / Atom feed
From: James Morse <james.morse@arm.com>
To: Robert Richter <rrichter@marvell.com>
Cc: Borislav Petkov <bp@alien8.de>, Tony Luck <tony.luck@intel.com>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 12/21] EDAC, ghes: Add support for legacy API counters
Date: Wed, 19 Jun 2019 18:22:32 +0100
Message-ID: <c08290d8-3690-efa9-3bc7-37f8b1fdbfd4@arm.com> (raw)
In-Reply-To: <20190612184058.2plbdweri6bjmppr@rric.localdomain>

Hi Robert,

On 12/06/2019 19:41, Robert Richter wrote:
> On 29.05.19 16:13:02, James Morse wrote:
>> On 29/05/2019 09:44, Robert Richter wrote:
>>> The ghes driver is not able yet to count legacy API counters in sysfs,
>>> e.g.:
>>>
>>>  /sys/devices/system/edac/mc/mc0/csrow2/ce_count
>>>  /sys/devices/system/edac/mc/mc0/csrow2/ch0_ce_count
>>>  /sys/devices/system/edac/mc/mc0/csrow2/ch1_ce_count
>>>
>>> Make counting csrows/channels generic so that the ghes driver can use
>>> it too.
>>
>> What for?
> 
> With EDAC_LEGACY_SYSFS enabled those counters are exposed to sysfs,
> but the numbers are wrong (all zero).

Excellent, so its legacy and broken.


>> Is this for an arm64 system? Surely we don't have any systems that used to work with these
>> legacy counters. Aren't they legacy because we want new software to stop using them!
> 
> The option is to support legacy userland. If we want to provide a> similar "user experience" as for x86 the counters should be correct.

The flip-side is arm64 doesn't have the same baggage. These counters have never worked
with this driver (even on x86).

This ghes driver also probes on HPE Server platform, so the architecture isn't really
relevant. (I was curious why Marvell care).


> Of course it is not a real mapping to csrows, but it makes that i/f
> work.

(...which stinks)


> In any case, this patch cleans up code as old API's counter code is
> isolated and moved to common code. Making the counter's work for ghes
> is actually a side-effect here. The cleanup is a prerequisit for
> follow on patches.

I'm all for removing/warning-its-broken it when ghes_edac is in use. But the convincing
argument is debian ships a 'current' version of edac-utils that predates 199747106934,
(that made all this fake csrow stuff deprecated), and debian's popcon says ~1000 people
have it installed.


If you want it fixed, please don't do it as a side effect of cleanup. Fixes need to be a
small separate series that can be backported. (unless we're confident no-one uses it, in
which case, why fix it?)



Thanks,

James

  reply index

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-29  8:44 [PATCH 00/21] EDAC, mc, ghes: Fixes and updates to improve memory error reporting Robert Richter
2019-05-29  8:44 ` [PATCH 01/21] EDAC, mc: Fix edac_mc_find() in case no device is found Robert Richter
2019-05-29  8:44 ` [PATCH 02/21] EDAC: Fixes to use put_device() after device_add() errors Robert Richter
2019-06-11 17:28   ` Borislav Petkov
2019-06-12 17:17     ` Robert Richter
2019-05-29  8:44 ` [PATCH 03/21] EDAC: Kill EDAC_DIMM_PTR() macro Robert Richter
2019-05-29  8:44 ` [PATCH 04/21] EDAC: Kill EDAC_DIMM_OFF() macro Robert Richter
2019-05-29  8:44 ` [PATCH 05/21] EDAC: Introduce mci_for_each_dimm() iterator Robert Richter
2019-05-29  8:44 ` [PATCH 06/21] EDAC, mc: Cleanup _edac_mc_free() code Robert Richter
2019-05-29  8:44 ` [PATCH 07/21] EDAC, mc: Remove per layer counters Robert Richter
2019-05-29  8:44 ` [PATCH 08/21] EDAC, mc: Rework edac_raw_mc_handle_error() to use struct dimm_info Robert Richter
2019-05-29  8:44 ` [PATCH 09/21] EDAC, ghes: Use standard kernel macros for page calculations Robert Richter
2019-05-29 15:13   ` James Morse
2019-05-29  8:44 ` [PATCH 10/21] EDAC, ghes: Remove pvt->detail_location string Robert Richter
2019-05-29 15:13   ` James Morse
2019-06-12 18:13     ` Robert Richter
2019-05-29  8:44 ` [PATCH 11/21] EDAC, ghes: Unify trace_mc_event() code with edac_mc driver Robert Richter
2019-05-29 15:12   ` James Morse
2019-06-03 13:10     ` Robert Richter
2019-06-04 17:15       ` James Morse
2019-06-13 22:23         ` Robert Richter
2019-05-29  8:44 ` [PATCH 12/21] EDAC, ghes: Add support for legacy API counters Robert Richter
2019-05-29 15:13   ` James Morse
2019-06-12 18:41     ` Robert Richter
2019-06-19 17:22       ` James Morse [this message]
2019-06-20  6:55         ` Robert Richter
2019-06-26  9:33           ` James Morse
2019-06-26 10:27             ` Robert Richter
2019-05-29  8:44 ` [PATCH 13/21] EDAC, ghes: Rework memory hierarchy detection Robert Richter
2019-05-29 15:06   ` James Morse
2019-05-31 13:41     ` Robert Richter
2019-05-29  8:44 ` [PATCH 14/21] EDAC, ghes: Extract numa node information for each dimm Robert Richter
2019-05-29 17:51   ` James Morse
2019-06-13 20:52     ` Robert Richter
2019-05-29  8:44 ` [PATCH 15/21] EDAC, ghes: Moving code around ghes_edac_register() Robert Richter
2019-05-29  8:44 ` [PATCH 16/21] EDAC, ghes: Create one memory controller device per node Robert Richter
2019-05-29  8:44 ` [PATCH 17/21] EDAC, ghes: Fill sysfs with the DMI DIMM label information Robert Richter
2019-05-29  8:44 ` [PATCH 18/21] EDAC, mc: Introduce edac_mc_alloc_by_dimm() for per dimm allocation Robert Richter
2019-05-29  8:44 ` [PATCH 19/21] EDAC, ghes: Identify dimm by node, card, module and handle Robert Richter
2019-05-29  8:44 ` [PATCH 20/21] EDAC, ghes: Enable per-layer reporting based on card/module Robert Richter
2019-05-29  8:44 ` [PATCH 21/21] EDAC, Documentation: Describe CPER module definition and DIMM ranks Robert Richter
2019-05-29 14:54 ` [PATCH 00/21] EDAC, mc, ghes: Fixes and updates to improve memory error reporting Borislav Petkov
2019-05-31 14:48   ` Robert Richter

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c08290d8-3690-efa9-3bc7-37f8b1fdbfd4@arm.com \
    --to=james.morse@arm.com \
    --cc=bp@alien8.de \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mchehab@kernel.org \
    --cc=rrichter@marvell.com \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-EDAC Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-edac/0 linux-edac/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-edac linux-edac/ https://lore.kernel.org/linux-edac \
		linux-edac@vger.kernel.org linux-edac@archiver.kernel.org
	public-inbox-index linux-edac

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-edac


AGPL code for this site: git clone https://public-inbox.org/ public-inbox