All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tony Asleson <tasleson@redhat.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Christoph Hellwig <hch@infradead.org>,
	linux-scsi@vger.kernel.org, linux-block@vger.kernel.org,
	linux-ide@vger.kernel.org
Cc: Hannes Reinecke <hare@suse.de>
Subject: Re: [v5 01/12] struct device: Add function callback durable_name
Date: Tue, 29 Sep 2020 17:04:32 -0500	[thread overview]
Message-ID: <20e220a6-4bde-2331-6e5e-24de39f9aa3b@redhat.com> (raw)
In-Reply-To: <20200929180415.GA1400445@kroah.com>

On 9/29/20 1:04 PM, Greg Kroah-Hartman wrote:
> On Tue, Sep 29, 2020 at 06:51:02PM +0100, Christoph Hellwig wrote:
>> Independ of my opinion on the whole scheme that I shared last time,
>> we really should not bloat struct device with function pointers.

Christoph, thank you for sharing a bit more of your concerns and
bringing Greg into the discussion.  It's more productive.

>>
>> On Fri, Sep 25, 2020 at 11:19:18AM -0500, Tony Asleson wrote:
>>> Function callback and function to be used to write a persistent
>>> durable name to the supplied character buffer.  This will be used to add
>>> structured key-value data to log messages for hardware related errors
>>> which allows end users to correlate message and specific hardware.
>>>
>>> Signed-off-by: Tony Asleson <tasleson@redhat.com>
>>> ---
>>>  drivers/base/core.c    | 24 ++++++++++++++++++++++++
>>>  include/linux/device.h |  4 ++++
>>>  2 files changed, 28 insertions(+)
> 
> I can't find this patch anywhere in my archives, why was I not cc:ed on
> it?  It's a v5 and no one thought to ask the driver core
> developers/maintainers about it???

You were CC'd into v3, ref.

https://lore.kernel.org/linux-ide/20200714081750.GB862637@kroah.com/

I should have continued to CC you on it, sorry about that.


> {sigh}
> 
> And for log messages, what about the dynamic debug developers, why not
> include them as well?  Since when is this a storage-only thing?

Hannes Reinecke has been involved in the discussion some and he's
involved in dynamic debug AFAIK.

If others have a need to identify a specific piece of hardware in a
potential sea of identical hardware that is encountering errors and
logging messages and can optionally be added and removed at run-time,
then yes they should be included too.  There is nothing with this patch
series that is preventing any device from registering a callback which
provides this information when asked.


>>> diff --git a/include/linux/device.h b/include/linux/device.h
>>> index 5efed864b387..074125999dd8 100644
>>> --- a/include/linux/device.h
>>> +++ b/include/linux/device.h
>>> @@ -614,6 +614,8 @@ struct device {
>>>  	struct iommu_group	*iommu_group;
>>>  	struct dev_iommu	*iommu;
>>>  
>>> +	int (*durable_name)(const struct device *dev, char *buff, size_t len);
> 
> No, that's not ok at all, why is this even a thing?
> 
> Who is setting this?  Why can't the bus do this without anything
> "special" needed from the driver core?

I'm setting it in the different storage subsystems.  The intent is that
when we go through a common logging path eg. dev_printk you can ask the
device what it's ID is so that it can be logged as structured data with
the log message.  I was trying to avoid having separate logging
functions all over the place which enforces a consistent meta data to
log messages, but maybe that would be better than adding a function
pointer to struct device?  My first patch tried adding a call back to
struct device_type, but that ran into the issue where struct device_type
is static const in a number of areas.

Basically for any piece of hardware with a serial number, call this
function to get it.  That was the intent anyway.

> We have a mapping of 'struct device' to a unique hardware device at a
> specific point in time, why are you trying to create another one?

I don't think I'm creating anything new.  Can you clarify this a bit
more?  I'm trying to leverage what is already in place.

> What is wrong with what we have today?

I'm trying to figure out a way to positively identify which storage
device an error belongs to over time.  Logging how something is attached
and not what is attached, only solves for right now, this point in time.
 Additionally when the only identifier is in the actual message itself,
it makes user space break every time the message content changes.  Also
what we have today in dev_printk doesn't tag the meta-data consistently
for journald to leverage.

This patch series adds structured data to the log entry for positive
identification.  It doesn't change when the content of the log message
changes.  It's true over time and it's true if a device gets moved to a
different system.

> So this is a HARD NAK on this patch for now.

Thank you for supplying some feedback and asking questions.  I've been
asking for suggestions and would very much like to have a discussion on
how this issue is best solved.  I'm not attached to what I've provided.
I'm just trying to get towards a solution.


We've looked at user space quite a bit and there is an inherit race
condition with trying to fetch the unique hardware id for a message when
it gets emitted from the kernel as udev rules haven't even run (assuming
we even have the meta-data to make the association).  The last thing
people want to do is delay writing the log message to disk until the
device it belongs to can be identified.  Of course this patch series
still has a window from where the device is first discovered by the
kernel and fetches the needed vpd data from the device.  Any kernel
messages logged during that time have no id to associate with it.

Thanks,
Tony



  reply	other threads:[~2020-09-29 22:04 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-25 16:19 [v5 00/12] Add persistent durable identifier to storage log messages Tony Asleson
2020-09-25 16:19 ` [v5 01/12] struct device: Add function callback durable_name Tony Asleson
2020-09-26  9:08   ` Sergei Shtylyov
2020-09-27 14:22     ` Tony Asleson
2020-09-27 16:15       ` Sergei Shtylyov
2020-09-29 17:51   ` Christoph Hellwig
2020-09-29 18:04     ` Greg Kroah-Hartman
2020-09-29 22:04       ` Tony Asleson [this message]
2020-09-30  7:38         ` Greg Kroah-Hartman
2020-09-30  7:40           ` Greg Kroah-Hartman
2020-09-30 14:35           ` Tony Asleson
2020-10-01 11:48             ` Greg Kroah-Hartman
2020-10-07 20:10               ` Tony Asleson
2020-10-08  4:48                 ` Greg Kroah-Hartman
2020-10-08 20:49                   ` Martin K. Petersen
2020-10-08  5:54                 ` Hannes Reinecke
2020-10-08  6:22                 ` Finn Thain
2020-09-25 16:19 ` [v5 02/12] create_syslog_header: Add durable name Tony Asleson
2020-09-25 16:19 ` [v5 03/12] dev_vprintk_emit: Increase hdr size Tony Asleson
2020-09-25 16:19 ` [v5 04/12] scsi: Add durable_name for dev_printk Tony Asleson
2020-09-25 16:19 ` [v5 05/12] nvme: Add durable name " Tony Asleson
2020-09-25 16:19 ` [v5 06/12] libata: Add ata_scsi_durable_name Tony Asleson
2020-09-25 16:19 ` [v5 07/12] libata: Make ata_scsi_durable_name static Tony Asleson
2020-09-26  8:40   ` Sergei Shtylyov
2020-09-26 14:17     ` Tony Asleson
2020-09-26 15:57       ` James Bottomley
2020-09-28 20:35     ` Tony Asleson
2020-09-25 16:19 ` [v5 08/12] Add durable_name_printk Tony Asleson
2020-09-26 23:53   ` Randy Dunlap
2020-09-28 15:52     ` Tony Asleson
2020-09-28 17:32       ` Randy Dunlap
2020-09-25 16:19 ` [v5 09/12] libata: use durable_name_printk Tony Asleson
2020-09-25 16:19 ` [v5 10/12] Add durable_name_printk_ratelimited Tony Asleson
2020-09-25 16:19 ` [v5 11/12] print_req_error: Use durable_name_printk_ratelimited Tony Asleson
2020-09-25 16:19 ` [v5 12/12] buffer_io_error: " Tony Asleson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20e220a6-4bde-2331-6e5e-24de39f9aa3b@redhat.com \
    --to=tasleson@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hare@suse.de \
    --cc=hch@infradead.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.