linux-edac.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Borislav Petkov <bp@alien8.de>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Shiju Jose <shiju.jose@huawei.com>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"dave@stgolabs.net" <dave@stgolabs.net>,
	"dave.jiang@intel.com" <dave.jiang@intel.com>,
	"alison.schofield@intel.com" <alison.schofield@intel.com>,
	"vishal.l.verma@intel.com" <vishal.l.verma@intel.com>,
	"ira.weiny@intel.com" <ira.weiny@intel.com>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"david@redhat.com" <david@redhat.com>,
	"Vilas.Sridharan@amd.com" <Vilas.Sridharan@amd.com>,
	"leo.duran@amd.com" <leo.duran@amd.com>,
	"Yazen.Ghannam@amd.com" <Yazen.Ghannam@amd.com>,
	"rientjes@google.com" <rientjes@google.com>,
	"jiaqiyan@google.com" <jiaqiyan@google.com>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"Jon.Grimm@amd.com" <Jon.Grimm@amd.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"rafael@kernel.org" <rafael@kernel.org>,
	"lenb@kernel.org" <lenb@kernel.org>,
	"naoya.horiguchi@nec.com" <naoya.horiguchi@nec.com>,
	"james.morse@arm.com" <james.morse@arm.com>,
	"jthoughton@google.com" <jthoughton@google.com>,
	"somasundaram.a@hpe.com" <somasundaram.a@hpe.com>,
	"erdemaktas@google.com" <erdemaktas@google.com>,
	"pgonda@google.com" <pgonda@google.com>,
	"duenwen@google.com" <duenwen@google.com>,
	"mike.malvestuto@intel.com" <mike.malvestuto@intel.com>,
	"gthelen@google.com" <gthelen@google.com>,
	"wschwartz@amperecomputing.com" <wschwartz@amperecomputing.com>,
	"dferguson@amperecomputing.com" <dferguson@amperecomputing.com>,
	"wbs@os.amperecomputing.com" <wbs@os.amperecomputing.com>,
	"nifan.cxl@gmail.com" <nifan.cxl@gmail.com>,
	tanxiaofei <tanxiaofei@huawei.com>,
	"Zengtao (B)" <prime.zeng@hisilicon.com>,
	"kangkang.shen@futurewei.com" <kangkang.shen@futurewei.com>,
	wanghuiqiang <wanghuiqiang@huawei.com>,
	Linuxarm <linuxarm@huawei.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Jean Delvare <jdelvare@suse.com>,
	Guenter Roeck <linux@roeck-us.net>,
	Dmitry Torokhov <dmitry.torokhov@gmail.com>
Subject: Re: [RFC PATCH v8 01/10] ras: scrub: Add scrub subsystem
Date: Fri, 10 May 2024 11:25:11 +0200	[thread overview]
Message-ID: <20240510092511.GBZj3n9ye_BCSepFZy@fat_crate.local> (raw)
In-Reply-To: <663d55515a2d9_db82d2941e@dwillia2-xfh.jf.intel.com.notmuch>

On Thu, May 09, 2024 at 03:59:29PM -0700, Dan Williams wrote:
> No, at a minimum that's a layering violation. This is a generic library
> facility that should not care if it is being called for a CXL device or
> an ACPI device.

Really?

Because this looks like creating a subsystem for those two newfangled
scrubbing functionalities which will be present in CXL devices and
by that ACPI RAS2 thing.

Because we have a *lot* of hw scrubbing functionality already. Just do:

git grep "scrub"

Some of it controls hw scrubbing. If this is a generic facility, does
this mean that all those older scrubbers should be converted to it?

Or is this thing going to support only new stuff? I.e., we want to
disable our scrubbers when doing performance-sensitive workloads and
reenable them after that but we don't care about old systems?

And all that other bla about controlling scrubbers from userspace.

So which is it?

> I think it works for x86 drivers because the functionality in those
> modules is wholly contained within that one module. This scrub module is
> a service library for other modules.

Well, you have that thing in EDAC. edac_core.ko is that service module
and the chipset-specific drivers - at least on x86 - use a match_id to
load only on the systems they should load on.

If I had a BIOS table which had "EDAC" in it, I won't load edac_core.ko
either but there isn't one.

> It is functionally the wrong place to do the check. When module_init()
> fails it causes not only the current module to be unloaded but any
> dependent modules will also fail to load.

See above. I'm under the assumption that this is using two methods to
detect scrubbing functionality.

> Let's take an example of the CXL driver wanting to register with this
> scrub interface to support the capability that *might* be available on
> some CXL devices. The cxl_pci.ko module, that houses cxl_pci_driver,
> grows a call to scrub_device_register(). That scrub_device_register()
> call is statically present in cxl_pci.ko so that when cxl_pci.ko loads
> symbol resolution requires scrub.ko to load.
> 
> Neither of those modules (cxl_pci.ko or scrub.ko) load automatically.
> Either udev loads cxl_pci.ko because it sees a device that matches
> cxl_mem_pci_tbl, or the user manually insmods those modules because they
> think they know better. No memory wasted unless the user explicitly asks
> for memory to be wasted.

The scrub.ko goes and asks the system: "Do you have a CXL device with
scrubbing functionality?" "Yes: load." "No: ok, won't."

> If no CXL devices in the system have scrub capabilities, great, then
> scrub_device_register() will never be called.
> 
> Now, if memory_scrub_control_init() did its own awkward and redundant
> CXL scan, and fails with "no CXL scrub capable devices found" it would
> not only block scrub.ko from loading, but also cxl_pci.ko since
> cxl_pci.ko needs to resolve that symbol to load.

Why would it fail the scan?

Isn't this fancy GET_SUPPORTED_FEATURES command giving you all info you
need?

> I would not say "no" to a generic facility that patches out module
> dependencies until the first call, just not sure the return on
> investment would be worth it.

From the discussion so far yeah, I think you're right, it is not worth
the effort.

> Lastly I think drivers based on ACPI tables are awkward. They really
> need to have an ACPI device to attach so that typical automatic Linux
> module loading machinery can be used. The fact this function is a
> subsys_initcall() is a red-flag since nothing should be depending on the
> load order of a little driver to control scrub parameters.

Yeah, it is becoming a mess before it has even started.

So I don't mind if such drivers get loaded as long as doing this is the
best we can do given the situation. What gets me up the palms, as they
say in .de, is "just because" and "look, the others do it too."

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

  reply	other threads:[~2024-05-10  9:26 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-19 16:47 [RFC PATCH v8 00/10] ras: scrub: introduce subsystem + CXL/ACPI-RAS2 drivers shiju.jose
2024-04-19 16:47 ` [RFC PATCH v8 01/10] ras: scrub: Add scrub subsystem shiju.jose
2024-04-24 20:25   ` fan
2024-04-25 10:38     ` Shiju Jose
2024-04-25 10:15   ` Borislav Petkov
2024-04-25 18:11     ` Shiju Jose
2024-05-06 10:30       ` Borislav Petkov
2024-05-08 16:59         ` Shiju Jose
2024-05-08 17:20           ` Borislav Petkov
2024-05-08 17:44             ` Shiju Jose
2024-05-08 19:25               ` Borislav Petkov
2024-05-09  9:19                 ` Jonathan Cameron
2024-05-09 15:52                   ` Borislav Petkov
2024-05-09 20:03                     ` Borislav Petkov
2024-05-09 21:21                       ` Dan Williams
2024-05-09 21:51                         ` Borislav Petkov
2024-05-09 22:59                           ` Dan Williams
2024-05-10  9:25                             ` Borislav Petkov [this message]
2024-05-10 17:13                               ` Dan Williams
2024-05-11 10:17                                 ` Borislav Petkov
2024-05-17 11:15                                   ` Jonathan Cameron
2024-05-17 11:44                                     ` Jonathan Cameron
2024-05-21  8:06                                       ` Borislav Petkov
2024-05-22  9:40                                         ` Jonathan Cameron
2024-05-27  9:09                                           ` Borislav Petkov
2024-05-20 10:54                                   ` Shiju Jose
2024-05-20 11:58                                     ` Jonathan Cameron
2024-05-27  9:21                                       ` Borislav Petkov
2024-05-28  9:06                                         ` Jonathan Cameron
2024-05-10 13:31                     ` Jonathan Cameron
2024-05-09 21:47   ` Dan Williams
2024-05-10  9:03     ` Jonathan Cameron
2024-04-19 16:47 ` [RFC PATCH v8 02/10] cxl/mbox: Add GET_SUPPORTED_FEATURES mailbox command shiju.jose
2024-04-19 16:47 ` [RFC PATCH v8 03/10] cxl/mbox: Add GET_FEATURE " shiju.jose
2024-04-24 23:19   ` fan
2024-04-25 10:38     ` Shiju Jose
2024-04-19 16:47 ` [RFC PATCH v8 04/10] cxl/mbox: Add SET_FEATURE " shiju.jose
2024-04-25 17:26   ` fan
2024-04-19 16:47 ` [RFC PATCH v8 05/10] cxl/memscrub: Add CXL device patrol scrub control feature shiju.jose
2024-04-26 23:56   ` fan
2024-04-29 11:20     ` Shiju Jose
2024-04-29 12:21       ` Jonathan Cameron
2024-05-10  0:26   ` Dan Williams
2024-05-10 11:23     ` Jonathan Cameron
2024-04-19 16:47 ` [RFC PATCH v8 06/10] ACPICA: Add __free() based cleanup function for acpi_put_table shiju.jose
2024-04-19 18:06   ` Jonathan Cameron
2024-04-19 16:47 ` [RFC PATCH v8 07/10] platform: Add __free() based cleanup function for platform_device_put shiju.jose
2024-04-19 16:47 ` [RFC PATCH v8 08/10] ACPI:RAS2: Add ACPI RAS2 driver shiju.jose
2024-04-19 16:47 ` [RFC PATCH v8 09/10] ras: scrub: Add scrub control attributes for ACPI RAS2 shiju.jose
2024-04-19 16:47 ` [RFC PATCH v8 10/10] ras: scrub: ACPI RAS2: Add memory ACPI RAS2 driver shiju.jose

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240510092511.GBZj3n9ye_BCSepFZy@fat_crate.local \
    --to=bp@alien8.de \
    --cc=Jon.Grimm@amd.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=Vilas.Sridharan@amd.com \
    --cc=Yazen.Ghannam@amd.com \
    --cc=alison.schofield@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=david@redhat.com \
    --cc=dferguson@amperecomputing.com \
    --cc=dmitry.torokhov@gmail.com \
    --cc=duenwen@google.com \
    --cc=erdemaktas@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=gthelen@google.com \
    --cc=ira.weiny@intel.com \
    --cc=james.morse@arm.com \
    --cc=jdelvare@suse.com \
    --cc=jiaqiyan@google.com \
    --cc=jthoughton@google.com \
    --cc=kangkang.shen@futurewei.com \
    --cc=lenb@kernel.org \
    --cc=leo.duran@amd.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux@roeck-us.net \
    --cc=linuxarm@huawei.com \
    --cc=mike.malvestuto@intel.com \
    --cc=naoya.horiguchi@nec.com \
    --cc=nifan.cxl@gmail.com \
    --cc=pgonda@google.com \
    --cc=prime.zeng@hisilicon.com \
    --cc=rafael@kernel.org \
    --cc=rientjes@google.com \
    --cc=shiju.jose@huawei.com \
    --cc=somasundaram.a@hpe.com \
    --cc=tanxiaofei@huawei.com \
    --cc=tony.luck@intel.com \
    --cc=vishal.l.verma@intel.com \
    --cc=wanghuiqiang@huawei.com \
    --cc=wbs@os.amperecomputing.com \
    --cc=wschwartz@amperecomputing.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).