From: Andy Rudoff <andy@rudoff.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: lsf-pc <lsf-pc@lists.linux-foundation.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	jmoyer <jmoyer@redhat.com>, david <david@fromorbit.com>,
	Chris Mason <clm@fb.com>, Jens Axboe <axboe@kernel.dk>,
	Bryan E Veal <bryan.e.veal@intel.com>,
	Annie Foong <annie.foong@intel.com>
Subject: Re: [LSF/MM TOPIC] atomic block device
Date: Sat, 15 Feb 2014 10:55:34 -0700
Message-ID: <CABBL8ELycRzfyDGtKWk1nFySh9-a5Rh5uZXdgGEwMYHxCQzO3Q@mail.gmail.com>
In-Reply-To: <CAA9_cmf7Y1TL8XqR7dYUn=Pv-En2e0X0FM0zdpkiBkUuNBGKfQ@mail.gmail.com>

On Sat, Feb 15, 2014 at 8:04 AM, Dan Williams <dan.j.williams@intel.com> wrote:
>
> In response to Dave's call [1] and highlighting Jeff's attend request
> [2] I'd like to stoke a discussion on an emulation layer for atomic
> block commands.  Specifically, SNIA has laid out their position on the
> command set an atomic block device may support (NVM Programming Model
> [3]) and it is a good conversation piece for this effort.  The goal
> would be to review the proposed operations, identify the capabilities
> that would be readily useful to filesystems / existing use cases, and
> tear down a straw man implementation proposal.
...
> The argument for not doing this as a
> device-mapper target or stacked block device driver is to ease
> provisioning and make the emulation transparent.  On the other hand,
> the argument for doing this as a virtual block device is that the
> "failed to parse device metadata" is a known failure scenario for
> dm/md, but not sd for example.


Hi Dan,

Like Jeff, I'm a member of the NVMP workgroup and I'd like to weigh in
here with a couple of observations.  I think the most interesting cases
where atomics provide a benefit are those where storage is RAIDed
across multiple devices.  Part of the argument for atomic writes on
SSDs is that databases and file systems can save bandwidth and
complexity by avoiding write-ahead logging.  But even if every SSD
supported it, the majority of production databases span multiple
devices for capacity, performance, or, most likely, high-availability
reasons.  So in my opinion, that very much supports the idea of doing
atomics at a layer where they apply to SW RAIDed storage (as I believe
Dave and others are suggesting).
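
To make the write-ahead-logging point concrete, here is a minimal C
sketch of the two patterns.  It is only illustrative: atomic_pwritev()
below is a made-up stand-in for whatever atomic multi-block interface
this discussion settles on (no such syscall exists today), and the
stub simply falls back to a plain pwritev() so the sketch compiles.

#define _GNU_SOURCE             /* for pwritev() */
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Today: to update N blocks atomically, a database first logs the new
 * contents, flushes the log, then writes the blocks in place -- two
 * writes and two flushes per transaction (short-write handling
 * omitted for brevity).
 */
static int wal_update(int logfd, int datafd,
                      const struct iovec *iov, int iovcnt, off_t off)
{
        if (writev(logfd, iov, iovcnt) < 0)         /* 1. write-ahead log */
                return -1;
        if (fdatasync(logfd) < 0)                   /* 2. log is durable  */
                return -1;
        if (pwritev(datafd, iov, iovcnt, off) < 0)  /* 3. write in place  */
                return -1;
        return fdatasync(datafd);                   /* 4. data is durable */
}

/*
 * With a block-level atomic multi-write, steps 1 and 2 disappear
 * because the device (or an emulation layer sitting above SW RAID)
 * guarantees all-or-nothing behaviour.
 */
static int atomic_pwritev(int fd, const struct iovec *iov, int iovcnt,
                          off_t off)
{
        return pwritev(fd, iov, iovcnt, off);       /* placeholder only   */
}

static int atomic_update(int datafd,
                         const struct iovec *iov, int iovcnt, off_t off)
{
        if (atomic_pwritev(datafd, iov, iovcnt, off) < 0)
                return -1;
        return fdatasync(datafd);
}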

On the other side of the coin, I remember Dave talking about this
during our NVM discussion at LSF last year, and I got the impression
that the size and number of writes he would need supported before he
could really stop using his journaling code were potentially large.
Dave: perhaps you can restate the number of writes, and their total
size, that block-level atomics would have to support in order for
them to be worth using by XFS?

Finally, I think atomics for file system use are interesting, but
exposing them for database use is also very interesting.  That means
exposing the supported size and number of writes to the app, and
making the file system able to turn around and leverage those
capabilities when a database app tries to use them via the file
system.  This has been the primary focus of the NVMP workgroup:
helping ISVs determine what features they can leverage in a uniform
way.  So my point here is that we get the most out of atomics by
exposing them both in-kernel for file systems and in user space for
apps.
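
Just to make the "exposing the size and number of writes supported"
part concrete, here is a rough sketch of what such a query could look
like from user space.  The struct, the ioctl number, and the field
names are all hypothetical -- the NVM Programming Model describes the
attributes abstractly and leaves the OS interface undefined -- so
treat this as a shape, not a proposal.

#include <sys/ioctl.h>
#include <linux/ioctl.h>

/*
 * Hypothetical per-file capability query, filled in by the file
 * system after it consults the block layer (or the atomics emulation
 * layer) underneath it.  None of these names exist in any kernel
 * header today.
 */
struct atomic_write_caps {
        unsigned long long max_write_bytes; /* largest single atomic write        */
        unsigned int max_vectors;           /* discontiguous ranges per operation */
        unsigned int granularity;           /* required alignment, in bytes       */
};

#define FS_IOC_GET_ATOMIC_CAPS  _IOR('f', 0x7f, struct atomic_write_caps)

static int query_atomic_caps(int fd, struct atomic_write_caps *caps)
{
        /*
         * A database would call this once at startup and decide which
         * transactions fit inside the advertised limits (and so can
         * skip the write-ahead log), falling back to logging for the
         * rest.
         */
        return ioctl(fd, FS_IOC_GET_ATOMIC_CAPS, caps);
}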

-andy

Thread overview: 18+ messages
2014-02-15 15:04 [LSF/MM TOPIC] atomic block device Dan Williams
2014-02-15 17:55 ` Andy Rudoff [this message]
2014-02-15 18:29   ` Howard Chu
2014-02-15 18:31     ` Howard Chu
2014-02-15 18:02 ` James Bottomley
2014-02-15 18:15   ` Andy Rudoff
2014-02-15 20:25     ` James Bottomley
2014-03-20 20:10       ` Jeff Moyer
     [not found] ` <CABBL8E+r+Uao9aJsezy16K_JXQgVuoD7ArepB46WTS=zruHL4g@mail.gmail.com>
2014-02-15 21:35   ` Dan Williams
2014-02-17  8:56   ` Dave Chinner
2014-02-17  9:51     ` [Lsf-pc] " Jan Kara
2014-02-17 10:20       ` Howard Chu
2014-02-18  0:10         ` Dave Chinner
2014-02-18  8:59           ` Alex Elsayed
2014-02-18 13:17             ` Dave Chinner
2014-02-18 14:09               ` Theodore Ts'o
2014-02-17 13:05 ` Chris Mason
2014-02-18 19:07   ` Dan Williams
