All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Brian Foster <bfoster@redhat.com>,
	sandeen@sandeen.net, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 5/8] xfs_db: check that metadata updates have been committed
Date: Thu, 20 Feb 2020 16:39:21 -0800	[thread overview]
Message-ID: <20200221003921.GE9506@magnolia> (raw)
In-Reply-To: <20200221000119.GR10776@dread.disaster.area>

On Fri, Feb 21, 2020 at 11:01:19AM +1100, Dave Chinner wrote:
> On Thu, Feb 20, 2020 at 10:34:50AM -0800, Darrick J. Wong wrote:
> > On Thu, Feb 20, 2020 at 12:58:57PM -0500, Brian Foster wrote:
> > > On Thu, Feb 20, 2020 at 08:58:40AM -0800, Darrick J. Wong wrote:
> > > > On Thu, Feb 20, 2020 at 09:06:23AM -0500, Brian Foster wrote:
> > > > > On Wed, Feb 19, 2020 at 05:42:13PM -0800, Darrick J. Wong wrote:
> > > > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > > 
> > > > > > Add a new function that will ensure that everything we scribbled on has
> > > > > > landed on stable media, and report the results.
> > > > > > 
> > > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > > ---
> > > > > >  db/init.c |   14 ++++++++++++++
> > > > > >  1 file changed, 14 insertions(+)
> > > > > > 
> > > > > > 
> > > > > > diff --git a/db/init.c b/db/init.c
> > > > > > index 0ac37368..e92de232 100644
> > > > > > --- a/db/init.c
> > > > > > +++ b/db/init.c
> > > > > > @@ -184,6 +184,7 @@ main(
> > > > > >  	char	*input;
> > > > > >  	char	**v;
> > > > > >  	int	start_iocur_sp;
> > > > > > +	int	d, l, r;
> > > > > >  
> > > > > >  	init(argc, argv);
> > > > > >  	start_iocur_sp = iocur_sp;
> > > > > > @@ -216,6 +217,19 @@ main(
> > > > > >  	 */
> > > > > >  	while (iocur_sp > start_iocur_sp)
> > > > > >  		pop_cur();
> > > > > > +
> > > > > > +	libxfs_flush_devices(mp, &d, &l, &r);
> > > > > > +	if (d)
> > > > > > +		fprintf(stderr, _("%s: cannot flush data device (%d).\n"),
> > > > > > +				progname, d);
> > > > > > +	if (l)
> > > > > > +		fprintf(stderr, _("%s: cannot flush log device (%d).\n"),
> > > > > > +				progname, l);
> > > > > > +	if (r)
> > > > > > +		fprintf(stderr, _("%s: cannot flush realtime device (%d).\n"),
> > > > > > +				progname, r);
> > > > > > +
> > > > > > +
> > > > > 
> > > > > Seems like we could reduce some boilerplate by passing progname into
> > > > > libxfs_flush_devices() and letting it dump out of the error messages,
> > > > > unless there's some future code that cares about individual device error
> > > > > state.
> > > > 
> > > > Such a program could call libxfs_flush_devices directly, as we do here.
> > > > 
> > > 
> > > Right.. but does anything actually care about that level of granularity
> > > right now beyond having a nicer error message?
> > 
> > No, afaict.
> > 
> > > > Also, progname is defined in libxfs so we don't even need to pass it as
> > > > an argument.
> > > > 
> > > 
> > > Ok.
> > > 
> > > > I had originally thought that we should try not to add fprintf calls to
> > > > libxfs because libraries aren't really supposed to be doing things like
> > > > that, but perhaps you're right that all of this should be melded into
> > > > something else.
> > > > 
> > > 
> > > Yeah, fair point, though I guess it depends on the particular library. 
> > 
> > I mean... is libxfs even a real library? :)
> 
> It's an internal abstraction to allow code to be shared easily with
> the kernel and between xfsprogs binaries. It is not a library in the
> sense it has a fixed API and ABI that compatibility has to be
> maintained across releases (i.e. like, say, libhandle). It is a
> library in the sense is contains shared code that things within the
> build don't have to re-implement them over and over again, and
> because it is internal that means the rules for external/distro
> level libraries don't need to be strictly applied.
> 
> e.g. if -everything- uses a global variable and has to declares it
> themselves so the shared internal code can access it, then why not
> just declare it in the shared code? :)

Fair enough.  It's a sharedcode library.

> > > > I dunno.  My current thinking is that libxfs_umount should call
> > > > libxfs_flush_devices() and print error messages as necessary, and return
> > > > error codes as appropriate.  xfs_repair can then check the umount return
> > > > value and translate that into exit(1) as required.  The device_close
> > > > functions will fsync a second time, but that shouldn't be a big deal
> > > > because we haven't dirtied anything in the meantime.
> > > > 
> > > > Thoughts?
> > > > 
> > > 
> > > I was thinking of having a per-device libxfs_device_flush() along the
> > > lines of libxfs_device_close() and separating out that functionality,
> > > but one could argue we're also a bit inconsistent between libxfs_init()
> > > opening the devices and having to close them individually.
> > 
> > Yeah, I don't understand why libxfs_destroy doesn't empty out the same
> > struct libxfs_init that libxfs_init populates.  Or why we have a global
> > variable named "x", or why the buffer cache is a global variable.
> > However, those sound like refactoring for another series.
> 
> You mean structure the unmount code like we do in teh kernel? e.g:
> 
> > > I think
> > > having libxfs_umount() do a proper purge -> flush and returning any
> > > errors instead is a fair tradeoff for simplicity. Removing the
> > > flush_devices() API also eliminates risk of somebody incorrectly
> > > attempting the flush after the umount frees the buftarg structures
> > > (without reinitializing pointers :P).
> 
> You mean like this code that I'm slowly working on to bring the
> xfs_buf.c code across to userspace and get rid of the heap of crap
> we have in libxfs/{rdwr,cache}.c now and be able to use AIO properly
> and do non-synchronous delayed writes like we do in the kernel?
> 
> libxfs/init.c:
> ....
> static void
> buftarg_cleanup(
>         struct xfs_buftarg      *btp)
> {
>         if (!btp)
>                 return;
> 
>         while (btp->bt_lru.l_count > 0)
>                 xfs_buftarg_shrink(btp, 1000);
>         xfs_buftarg_wait(btp);
>         xfs_buftarg_free(btp);

Not quite what the v3 series does, but only because it's still stuck
with "whack the bcache and then go see what happened to each buftarg".

> }
> 
> /*
>  * Release any resource obtained during a mount.
>  */
> void
> libxfs_umount(
>         struct xfs_mount        *mp)
> {
>         struct xfs_perag        *pag;
>         int                     agno;
> 
>         libxfs_rtmount_destroy(mp);
> 
>         buftarg_cleanup(mp->m_ddev_targp);
>         buftarg_cleanup(mp->m_rtdev_targp);
>         if (mp->m_logdev_targp != mp->m_ddev_targp)
>                 buftarg_cleanup(mp->m_logdev_targp);

Yep, that's exactly where I moved the cleanup call in v3.

> .....
> 
> libxfs/xfs_buftarg.c:
> .....
> void
> xfs_buftarg_free(
>         struct xfs_buftarg      *btp)
> {
>         ASSERT(percpu_counter_sum(&btp->bt_io_count) == 0);
>         percpu_counter_destroy(&btp->bt_io_count);
>         platform_flush_device(btp->bt_fd, btp->bt_bdev);
> 	libxfs_device_close(btp->bt_bdev);
>         free(btp);

I'm assuming this means you've killed off the buffer handling parts of
struct libxfs_xinit too?

> }
> 
> I haven't added the error returns for this code yet - I'm still
> doing the conversion and making it work.
> 
> I'll probably have to throw the vast majority of that patchset away
> and start again if all this API change that darrick has done is
> merged. And that will probably involve me throwing away all of the
> changes in this patch series because they just don't make any sense
> once the code is restructured properly....

...or just throw them at me in whatever state they're in now and let me
help figure out how to get there?

Everyone: don't be afraid of the 'RFCRAP' for interim patchsets.
Granted, posting git branches with a timestamp might be more
practicable...

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

  reply	other threads:[~2020-02-21  0:41 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-20  1:41 [PATCH v2 0/8] xfsprogs: actually check that writes succeeded Darrick J. Wong
2020-02-20  1:41 ` [PATCH 1/8] libxfs: libxfs_buf_delwri_submit should write buffers immediately Darrick J. Wong
2020-02-20  1:41 ` [PATCH 2/8] libxfs: complain when write IOs fail Darrick J. Wong
2020-02-20  1:42 ` [PATCH 3/8] libxfs: return flush failures Darrick J. Wong
2020-02-20  1:42 ` [PATCH 4/8] libxfs: enable tools to check that metadata updates have been committed Darrick J. Wong
2020-02-20 14:06   ` Brian Foster
2020-02-20 16:46     ` Darrick J. Wong
2020-02-20 17:58       ` Brian Foster
2020-02-20 18:26         ` Darrick J. Wong
2020-02-20 18:50           ` Brian Foster
2020-02-20 23:40   ` Dave Chinner
2020-02-21  0:33     ` Darrick J. Wong
2020-02-20  1:42 ` [PATCH 5/8] xfs_db: " Darrick J. Wong
2020-02-20 14:06   ` Brian Foster
2020-02-20 16:58     ` Darrick J. Wong
2020-02-20 17:58       ` Brian Foster
2020-02-20 18:34         ` Darrick J. Wong
2020-02-21  0:01           ` Dave Chinner
2020-02-21  0:39             ` Darrick J. Wong [this message]
2020-02-21  1:17               ` Dave Chinner
2020-02-20  1:42 ` [PATCH 6/8] mkfs: " Darrick J. Wong
2020-02-20  1:42 ` [PATCH 7/8] xfs_repair: " Darrick J. Wong
2020-02-20  1:42 ` [PATCH 8/8] libfrog: always fsync when flushing a device Darrick J. Wong
2020-02-20 14:06   ` Brian Foster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200221003921.GE9506@magnolia \
    --to=darrick.wong@oracle.com \
    --cc=bfoster@redhat.com \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=sandeen@sandeen.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.