linux-scsi.vger.kernel.org archive mirror
From: Alasdair G Kergon <agk@redhat.com>
To: Tony Asleson <tasleson@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>,
	Sweet Tea Dorminy <sweettea@redhat.com>,
	James Bottomley <James.Bottomley@hansenpartnership.com>,
	linux-scsi@vger.kernel.org, linux-block@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [RFC 9/9] __xfs_printk: Add durable name to output
Date: Thu, 9 Jan 2020 01:41:17 +0000
Message-ID: <20200109014117.GB3809@agk-dp.fab.redhat.com>
In-Reply-To: <9e449c65-193c-d69c-1454-b1059221e5dc@redhat.com>

On Wed, Jan 08, 2020 at 10:53:13AM -0600, Tony Asleson wrote:
> We are not removing any existing information, we are adding.

A difficulty with this approach is:  Where do you stop when your storage
configuration is complicated and changing?  Do you add the complete
relevant part of the storage stack configuration to every storage
message in the kernel so that it is easy to search later?

Or do you catch the messages in userspace and add some of this
information there before sending them on to your favourite log message
database?  (ref. peripety, various rsyslog extensions)

> I think all the file systems should include their FS UUID in the FS log
> messages too, but that is not part of the problem we are trying to solve.

Each layer (subsystem) should already be tagging its messages in an
easy-to-parse way so that all those relating to the same object (e.g.
filesystem instance, disk) at its level of the stack can easily be
matched together later.  Where this doesn't already happen, we should
certainly be fixing that, as it's a prerequisite for any sensible
post-processing.  As long as the right information gets recorded, it can
all be joined together on demand later by some userspace software.
 
> The user has to systematically and methodically go through the logs
> trying to deduce what the identifier was referring to at the time of the
> error.  This isn't trivial and virtually impossible at times depending
> on circumstances.

So how about logging what these identifiers reference at different times
in a way that is easy to query later?

Come to think of it, we already get uevents when the references change,
and udev rules already create neat "by-*" links for us.  Maybe we just
need to do a better job of logging what udev is already doing?

Then we could reproduce what the storage configuration looked like at
any particular time in the past to provide the missing context for
the identifiers in the log messages.
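
To make the idea concrete, here is a minimal sketch of replaying logged
events to recover what a kernel name referred to at a given time.  The
Event record, its field names and the "disk-A"/"disk-B" identifiers are
made up for illustration; this is not how any existing tool stores or
replays its data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical logged uevent; field names are illustrative only."""
    timestamp: float      # seconds since the epoch
    action: str           # "add", "change" or "remove"
    kernel_name: str      # transient name, e.g. "sdb"
    durable_id: str = ""  # persistent identifier; unused on "remove"


def state_at(events, when):
    """Return {kernel_name: durable_id} as it stood at time `when`."""
    state = {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.timestamp > when:
            break  # later events must not influence the past
        if ev.action == "remove":
            state.pop(ev.kernel_name, None)
        else:
            # "add" and "change" both (re)bind the name to an identifier
            state[ev.kernel_name] = ev.durable_id
    return state


# Made-up history: "sdb" is reused by a different disk after a removal.
events = [
    Event(100.0, "add", "sdb", "disk-A"),
    Event(200.0, "remove", "sdb"),
    Event(300.0, "add", "sdb", "disk-B"),
]

print(state_at(events, 150.0))
print(state_at(events, 350.0))
```

A log message mentioning "sdb" at time 150 and another at time 350
would then resolve to different devices, which is exactly the context
that is missing from the messages themselves.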

                    ---------------------
 
Which seems like an appropriate time to introduce storage-logger.

    https://github.com/lvmteam/storage-logger

    Fedora rawhide packages:
      https://copr.fedorainfracloud.org/coprs/agk/storage-logger/ 

The goal of this particular project is to maintain a record of the
storage configuration as it changes over time.  It should provide a
quick way to check the state of a system at a specified time in the
past.

The initial logging implementation is triggered by storage uevents and
consists of two components:

1. A new udev rule file, 99-zzz-storage-logger.rules, which runs after
all the other rules have run and invokes:

2. A script, udev_storage_logger.sh, that captures relevant
information about devices that changed and stores it in the system
journal.

The effect is to log the data from relevant uevents plus some
supplementary information (including device-mapper tables, for example).
It does not yet handle filesystem-related events.

Two methods to query the data are offered:

1. journalctl
Data is tagged with the identifier UDEVLOG and retrievable as
key-value pairs.
  journalctl -t UDEVLOG --output verbose

  journalctl -t UDEVLOG --output json \
    --since 'YYYY-MM-DD HH:MM:SS' \
    --until 'YYYY-MM-DD HH:MM:SS'

  journalctl -t UDEVLOG --output verbose \
    --output-fields=PERSISTENT_STORAGE_ID,MAJOR,MINOR \
    PERSISTENT_STORAGE_ID=dm-name-vg1-lvol0

2. lsblkj  [appended j for journal]
This lsblk wrapper reprocesses the logged uevents to reconstruct a
dummy system environment that "looks like" the system did at a
specified earlier time and then runs lsblk against it.
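
For the journalctl method, the JSON output can also be post-processed by
a script.  A rough sketch follows, assuming input in the form produced
by "journalctl -t UDEVLOG --output json" (one JSON object per line, with
__REALTIME_TIMESTAMP in microseconds as a string).  The helper names and
the SAMPLE records are invented for illustration; only the field names
PERSISTENT_STORAGE_ID, MAJOR and MINOR come from the examples above.

```python
import json


def parse_journal_json(lines):
    """Parse journalctl JSON output: one JSON object per non-empty line."""
    records = []
    for line in lines:
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records


def select(records, storage_id, since_us=None, until_us=None):
    """Return sorted (timestamp_us, MAJOR, MINOR) tuples for one device.

    since_us/until_us are optional bounds in microseconds since the epoch.
    """
    out = []
    for rec in records:
        if rec.get("PERSISTENT_STORAGE_ID") != storage_id:
            continue
        ts = int(rec["__REALTIME_TIMESTAMP"])
        if since_us is not None and ts < since_us:
            continue
        if until_us is not None and ts > until_us:
            continue
        out.append((ts, rec.get("MAJOR"), rec.get("MINOR")))
    return sorted(out)


# Made-up sample input; real records carry many more fields.
SAMPLE = [
    '{"__REALTIME_TIMESTAMP": "1000000", '
    '"PERSISTENT_STORAGE_ID": "dm-name-vg1-lvol0", "MAJOR": "253", "MINOR": "0"}',
    '{"__REALTIME_TIMESTAMP": "2000000", '
    '"PERSISTENT_STORAGE_ID": "dm-name-vg1-lvol1", "MAJOR": "253", "MINOR": "1"}',
    '{"__REALTIME_TIMESTAMP": "3000000", '
    '"PERSISTENT_STORAGE_ID": "dm-name-vg1-lvol0", "MAJOR": "253", "MINOR": "2"}',
]

records = parse_journal_json(SAMPLE)
for row in select(records, "dm-name-vg1-lvol0"):
    print(row)
```

On a live system the lines could instead be read from the standard
output of the journalctl command itself.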

I'm looking for feedback to help decide whether it's worth developing
this any further.

Alasdair


