* [LSF/MM TOPIC] atomic block device
@ 2014-02-15 15:04 Dan Williams
  2014-02-15 17:55 ` Andy Rudoff
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Dan Williams @ 2014-02-15 15:04 UTC (permalink / raw)
  To: lsf-pc
  Cc: linux-fsdevel, jmoyer, david, Chris Mason, Jens Axboe,
	Bryan E Veal, Annie Foong

In response to Dave's call [1] and highlighting Jeff's attend request
[2] I'd like to stoke a discussion on an emulation layer for atomic
block commands.  Specifically, SNIA has laid out their position on the
command set an atomic block device may support (NVM Programming Model
[3]) and it is a good conversation piece for this effort.  The goal
would be to review the proposed operations, identify the capabilities
that would be readily useful to filesystems / existing use cases, and
tear down a straw man implementation proposal.

The SNIA defined capabilities that seem the highest priority to implement are:
* ATOMIC_MULTIWRITE - dis-contiguous LBA ranges, power fail atomic, no
ordering constraint relative to other i/o

* ATOMIC_WRITE - contiguous LBA range, power fail atomic, no ordering
constraint relative to other i/o

* EXISTS - not an atomic command, but defined in the NPM.  It is akin
to SEEK_{DATA|HOLE} (see the userspace sketch after this list): it tests
whether an LBA is mapped or unmapped and, if the LBA is mapped,
additionally reports whether data is present or the LBA is only
allocated.

* SCAR - again not an atomic command, but once we have metadata can
implement a bad block list, analogous to the bad-block-list support in
md.
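
For reference, the file-level analogue of EXISTS is already available in
userspace.  A minimal sketch using the existing lseek() SEEK_DATA/SEEK_HOLE
interface (nothing new here, it just shows the semantics the block-level
capability would mirror):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Report whether the byte at 'off' is backed by data or sits in a hole. */
static void probe(int fd, off_t off)
{
        off_t next_data = lseek(fd, off, SEEK_DATA);

        if (next_data == off)
                printf("%lld: mapped, data present\n", (long long)off);
        else if (next_data == (off_t)-1)
                printf("%lld: hole, no data at or beyond this offset\n",
                       (long long)off);
        else
                printf("%lld: hole, next data at %lld\n",
                       (long long)off, (long long)next_data);
}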

Initial thought is that this functionality is better implemented as a
library a block device driver (bio-based or request-based) can call to
emulate these features.  In the case where the feature is directly
supported by the underlying hardware device the emulation layer will
stub out and pass it through.  The argument for not doing this as a
device-mapper target or stacked block device driver is to ease
provisioning and make the emulation transparent.  On the other hand,
the argument for doing this as a virtual block device is that the
"failed to parse device metadata" is a known failure scenario for
dm/md, but not sd for example.
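
To make the library idea a bit more concrete, here is a rough sketch of the
hook points.  All names and types below are invented for illustration; none
of this is an existing kernel interface:

/* Capabilities the driver's hardware implements natively; anything left
 * unset is emulated (with on-device metadata) by the library. */
struct blk_atomic_caps {
        unsigned int    native_atomic_write:1;      /* ATOMIC_WRITE      */
        unsigned int    native_atomic_multiwrite:1; /* ATOMIC_MULTIWRITE */
        unsigned int    max_atomic_bytes;
        unsigned int    max_atomic_segments;
};

struct blk_atomic_ctx;  /* opaque per-device state owned by the library */

/* Called once at probe time: allocate, or re-parse, the library's metadata
 * region for whatever has to be emulated. */
struct blk_atomic_ctx *blk_atomic_register(struct gendisk *disk,
                                           const struct blk_atomic_caps *caps);

/* Called from the driver's submission path (bio- or request-based).  With
 * native support this is a pass-through stub; otherwise the library provides
 * power-fail atomicity via its own log/shadow-copy metadata. */
int blk_atomic_submit(struct blk_atomic_ctx *ctx, struct bio *bio);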

Thoughts?

--
Dan

[1]: http://marc.info/?l=linux-fsdevel&m=138438717002687&w=2
[2]: http://marc.info/?l=linux-fsdevel&m=139041672718333&w=2
[3]: http://snia.org/sites/default/files/NVMProgrammingModel_v1.pdf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LSF/MM TOPIC] atomic block device
  2014-02-15 15:04 [LSF/MM TOPIC] atomic block device Dan Williams
@ 2014-02-15 17:55 ` Andy Rudoff
  2014-02-15 18:29   ` Howard Chu
  2014-02-15 18:02 ` James Bottomley
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: Andy Rudoff @ 2014-02-15 17:55 UTC (permalink / raw)
  To: Dan Williams
  Cc: lsf-pc, linux-fsdevel, jmoyer, david, Chris Mason, Jens Axboe,
	Bryan E Veal, Annie Foong

On Sat, Feb 15, 2014 at 8:04 AM, Dan Williams <dan.j.williams@intel.com> wrote:
>
> In response to Dave's call [1] and highlighting Jeff's attend request
> [2] I'd like to stoke a discussion on an emulation layer for atomic
> block commands.  Specifically, SNIA has laid out their position on the
> command set an atomic block device may support (NVM Programming Model
> [3]) and it is a good conversation piece for this effort.  The goal
> would be to review the proposed operations, identify the capabilities
> that would be readily useful to filesystems / existing use cases, and
> tear down a straw man implementation proposal.
...
> The argument for not doing this as a
> device-mapper target or stacked block device driver is to ease
> provisioning and make the emulation transparent.  On the other hand,
> the argument for doing this as a virtual block device is that the
> "failed to parse device metadata" is a known failure scenario for
> dm/md, but not sd for example.


Hi Dan,

Like Jeff, I'm a member of the NVMP workgroup and I'd like to ring in
here with a couple observations.  I think the most interesting cases
where atomics provide a benefit are cases where storage is RAIDed
across multiple devices.  Part of the argument for atomic writes on
SSDs is that databases and file systems can save bandwidth and
complexity by avoiding write-ahead-logging.  But even if every SSD
supported it, the majority of production databases span across devices
for either capacity, performance, or, most likely, high availability
reasons.  So in my opinion, that very much supports the idea of doing
atomics at a layer where it applies to SW RAIDed storage (as I believe
Dave and others are suggesting).

On the other side of the coin, I remember Dave talking about this
during our NVM discussion at LSF last year and I got the impression
the size and number of writes he'd need supported before he could
really stop using his journaling code was potentially large.  Dave:
perhaps you can re-state the number of writes and their total size
that would have to be supported by block level atomics in order for
them to be worth using by XFS?

Finally, I think atomics for file system use is interesting, but also
exposing them for database use is very interesting.  That means
exposing the size and number of writes supported to the app and making
the file system able to turn around and leverage those when a database
app tries to use them via the file system.  This has been the primary
focus of the NVMP workgroup, helping ISVs determine what features they
can leverage in a uniform way.  So my point here is we get the most
use out of atomics by exposing them both in-kernel for file systems
and in user space for apps.

-andy

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LSF/MM TOPIC] atomic block device
  2014-02-15 15:04 [LSF/MM TOPIC] atomic block device Dan Williams
  2014-02-15 17:55 ` Andy Rudoff
@ 2014-02-15 18:02 ` James Bottomley
  2014-02-15 18:15   ` Andy Rudoff
       [not found] ` <CABBL8E+r+Uao9aJsezy16K_JXQgVuoD7ArepB46WTS=zruHL4g@mail.gmail.com>
  2014-02-17 13:05 ` Chris Mason
  3 siblings, 1 reply; 18+ messages in thread
From: James Bottomley @ 2014-02-15 18:02 UTC (permalink / raw)
  To: Dan Williams
  Cc: lsf-pc, linux-fsdevel, jmoyer, david, Chris Mason, Jens Axboe,
	Bryan E Veal, Annie Foong, linux-scsi, Christoph Lameter

On Sat, 2014-02-15 at 07:04 -0800, Dan Williams wrote:
> In response to Dave's call [1] and highlighting Jeff's attend request
> [2] I'd like to stoke a discussion on an emulation layer for atomic
> block commands.  Specifically, SNIA has laid out their position on the
> command set an atomic block device may support (NVM Programming Model
> [3]) and it is a good conversation piece for this effort.  The goal
> would be to review the proposed operations, identify the capabilities
> that would be readily useful to filesystems / existing use cases, and
> tear down a straw man implementation proposal.
> 
> The SNIA defined capabilities that seem the highest priority to implement are:
> * ATOMIC_MULTIWRITE - dis-contiguous LBA ranges, power fail atomic, no
> ordering constraint relative to other i/o
> 
> * ATOMIC_WRITE - contiguous LBA range, power fail atomic, no ordering
> constraint relative to other i/o
> 
> * EXISTS - not an atomic command, but defined in the NPM.  It is akin
> to SEEK_{DATA|HOLE} to test whether an LBA is mapped or unmapped.  If
> the LBA is mapped additionally specifies whether data is present or
> the LBA is only allocated.
> 
> * SCAR - again not an atomic command, but once we have metadata can
> implement a bad block list, analogous to the bad-block-list support in
> md.
> 
> Initial thought is that this functionality is better implemented as a
> library a block device driver (bio-based or request-based) can call to
> emulate these features.  In the case where the feature is directly
> supported by the underlying hardware device the emulation layer will
> stub out and pass it through.  The argument for not doing this as a
> device-mapper target or stacked block device driver is to ease
> provisioning and make the emulation transparent.  On the other hand,
> the argument for doing this as a virtual block device is that the
> "failed to parse device metadata" is a known failure scenario for
> dm/md, but not sd for example.
> 
> Thoughts?

Actually, this topic has already been suggested by Christoph Lameter ...
he just didn't copy any external mailing lists (bad Christoph, rolled up
newspaper for you).

For those following at home, the SNIA proposal is here:

http://snia.org/sites/default/files/NVMProgrammingModel_v1.pdf

And this was my initial reply:

OK, I'm prepared to look through it, but I should warn you that after
the SNIA HBAAPI cluster fuck I'm not well disposed towards any APIs that
come out of SNIA.  I've read the first 30 pages and they don't inspire
confidence; it's basically going the same way as HBA API.  The failure
there was trying to define universal interfaces for every OS regardless
of the existing interfaces they currently had.  This NVM model seems to
define a lot of existing stuff in block and VFS but slightly
differently.  Why do you think it's a good idea?

I'll further add that what we really need are use cases, not an API
chocolate box.  I think some DB people will be coming to LSF, so we
should really talk use cases with them.

James



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LSF/MM TOPIC] atomic block device
  2014-02-15 18:02 ` James Bottomley
@ 2014-02-15 18:15   ` Andy Rudoff
  2014-02-15 20:25     ` James Bottomley
  0 siblings, 1 reply; 18+ messages in thread
From: Andy Rudoff @ 2014-02-15 18:15 UTC (permalink / raw)
  To: James Bottomley
  Cc: Dan Williams, lsf-pc, linux-fsdevel, jmoyer, david, Chris Mason,
	Jens Axboe, Bryan E Veal, Annie Foong, linux-scsi,
	Christoph Lameter

> OK, I'm prepared to look through it, but I should warn you that after
> the SNIA HBAAPI cluster fuck I'm not well disposed towards any APIs that
> come out of SNIA.  I've read the first 30 pages and they don't inspire
> confidence; it's basically going the same way as HBA API.  The failure
> there was trying to define universal interfaces for every OS regardless
> of the existing interfaces they currently had.  This NVM model seems to
> define a lot of existing stuff in block and VFS but slightly
> differently.  Why do you think it's a good idea?

Note that the NVMP workgroup did not define any APIs.  Instead, we
concentrated on defining the actions that we see applications needing
(or being able to use) and defining some common terminology.  We leave
the API definition to the operating system authors so they can create
them in the way that makes the most sense for their environment (much
like you are suggesting above, I think).

> I'll further add that what we really need are use cases, not an API
> chocolate box.  I think some DB people will be coming to LSF, so we
> should really talk use cases with them.

Totally agree.

-andy

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LSF/MM TOPIC] atomic block device
  2014-02-15 17:55 ` Andy Rudoff
@ 2014-02-15 18:29   ` Howard Chu
  2014-02-15 18:31     ` Howard Chu
  0 siblings, 1 reply; 18+ messages in thread
From: Howard Chu @ 2014-02-15 18:29 UTC (permalink / raw)
  To: Andy Rudoff, Dan Williams
  Cc: lsf-pc, linux-fsdevel, jmoyer, david, Chris Mason, Jens Axboe,
	Bryan E Veal, Annie Foong

Andy Rudoff wrote:
> On the other side of the coin, I remember Dave talking about this
> during our NVM discussion at LSF last year and I got the impression
> the size and number of writes he'd need supported before he could
> really stop using his journaling code was potentially large.  Dave:
> perhaps you can re-state the number of writes and their total size
> that would have to be supported by block level atomics in order for
> them to be worth using by XFS?

If you're dealing with a typical update-in-place database then there's no 
upper bound on this, a DB transaction can be arbitrarily large and any partial 
write will result in corrupted data structures.

On the other hand, with a multi-version copy-on-write DB (like mine, 
http://symas.com/mdb/ ) all you need is a guarantee that all data writes 
complete before any metadata is updated.

IMO, catering to the update-in-place approach is an exercise in futility since 
it will require significant memory resources on every link in the storage 
chain and whatever amount you have available will never be sufficient.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LSF/MM TOPIC] atomic block device
  2014-02-15 18:29   ` Howard Chu
@ 2014-02-15 18:31     ` Howard Chu
  0 siblings, 0 replies; 18+ messages in thread
From: Howard Chu @ 2014-02-15 18:31 UTC (permalink / raw)
  To: Andy Rudoff, Dan Williams
  Cc: lsf-pc, linux-fsdevel, jmoyer, david, Chris Mason, Jens Axboe,
	Bryan E Veal, Annie Foong

Howard Chu wrote:
> Andy Rudoff wrote:
>> On the other side of the coin, I remember Dave talking about this
>> during our NVM discussion at LSF last year and I got the impression
>> the size and number of writes he'd need supported before he could
>> really stop using his journaling code was potentially large.  Dave:
>> perhaps you can re-state the number of writes and their total size
>> that would have to be supported by block level atomics in order for
>> them to be worth using by XFS?
>
> If you're dealing with a typical update-in-place database then there's no
> upper bound on this, a DB transaction can be arbitrarily large and any partial
> write will result in corrupted data structures.
>
> On the other hand, with a multi-version copy-on-write DB (like mine,
> http://symas.com/mdb/ ) all you need is a guarantee that all data writes
> complete before any metadata is updated.
>
> IMO, catering to the update-in-place approach is an exercise in futility since
> it will require significant memory resources on every link in the storage
> chain and whatever amount you have available will never be sufficient.
>
My proposal from last November could be implemented without requiring any more
state than is already present in current storage controllers.

http://www.spinics.net/lists/linux-fsdevel/msg70047.html

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LSF/MM TOPIC] atomic block device
  2014-02-15 18:15   ` Andy Rudoff
@ 2014-02-15 20:25     ` James Bottomley
  2014-03-20 20:10       ` Jeff Moyer
  0 siblings, 1 reply; 18+ messages in thread
From: James Bottomley @ 2014-02-15 20:25 UTC (permalink / raw)
  To: Andy Rudoff
  Cc: Dan Williams, lsf-pc, linux-fsdevel, jmoyer, david, Chris Mason,
	Jens Axboe, Bryan E Veal, Annie Foong, linux-scsi,
	Christoph Lameter

On Sat, 2014-02-15 at 11:15 -0700, Andy Rudoff wrote:
> > OK, I'm prepared to look through it, but I should warn you that after
> > the SNIA HBAAPI cluster fuck I'm not well disposed towards any APIs that
> > come out of SNIA.  I've read the first 30 pages and they don't inspire
> > confidence; it's basically going the same way as HBA API.  The failure
> > there was trying to define universal interfaces for every OS regardless
> > of the existing interfaces they currently had.  This NVM model seems to
> > define a lot of existing stuff in block and VFS but slightly
> > differently.  Why do you think it's a good idea?
> 
> Note that the NVMP workgroup did not define any APIs.  Instead, we
> concentrated on defining the actions that we see applications needing
> (or being able to use) and defining some common terminology.  We leave
> the API definition to the operating system authors so they can create
> them in the way that makes the most sense for their environment (much
> like you are suggesting above, I think).

Well, the actions do define input and output properties ... we can
argue about what level of semantics you have to define before an action
becomes an API, but the real question is the use cases:

> > I'll further add that what we really need are use cases, not an API
> > chocolate box.  I think some DB people will be coming to LSF, so we
> > should really talk use cases with them.
> 
> Totally agree.

OK, so what the Database people are currently fretting about is how the
Linux cache fights with the WAL.  Pretty much all DBs sit on filesystems
these days, so the first question is are block operations even relevant
and do the (rather scant) file operations fit their need.  The basic
problem with the file mode is the granularity.  What does a DB do with
transactions which go over the limits?  It also looks like the NVM file
actions need to work over DIO, so the question is how. (And the other
problem is that only a few DBs seem to use DIO).

James



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LSF/MM TOPIC] atomic block device
       [not found] ` <CABBL8E+r+Uao9aJsezy16K_JXQgVuoD7ArepB46WTS=zruHL4g@mail.gmail.com>
@ 2014-02-15 21:35   ` Dan Williams
  2014-02-17  8:56   ` Dave Chinner
  1 sibling, 0 replies; 18+ messages in thread
From: Dan Williams @ 2014-02-15 21:35 UTC (permalink / raw)
  To: Andy Rudoff
  Cc: lsf-pc, linux-fsdevel, jmoyer, david, Chris Mason, Jens Axboe,
	Bryan E Veal, Annie Foong

On Sat, Feb 15, 2014 at 9:47 AM, Andy Rudoff <andy@rudoff.com> wrote:
> On Sat, Feb 15, 2014 at 8:04 AM, Dan Williams <dan.j.williams@intel.com>
> wrote:
>>
>> In response to Dave's call [1] and highlighting Jeff's attend request
>> [2] I'd like to stoke a discussion on an emulation layer for atomic
>> block commands.  Specifically, SNIA has laid out their position on the
>> command set an atomic block device may support (NVM Programming Model
>> [3]) and it is a good conversation piece for this effort.  The goal
>> would be to review the proposed operations, identify the capabilities
>> that would be readily useful to filesystems / existing use cases, and
>> tear down a straw man implementation proposal.
>
> ...
>>
>> The argument for not doing this as a
>> device-mapper target or stacked block device driver is to ease
>> provisioning and make the emulation transparent.  On the other hand,
>> the argument for doing this as a virtual block device is that the
>> "failed to parse device metadata" is a known failure scenario for
>> dm/md, but not sd for example.
>
>
> Hi Dan,

Hi Andy.

> Like Jeff, I'm a member of the NVMP workgroup and I'd like to ring in here
> with a couple observations.  I think the most interesting cases where
> atomics provide a benefit are cases where storage is RAIDed across multiple
> devices.  Part of the argument for atomic writes on SSDs is that databases
> and file systems can save bandwidth and complexity by avoiding
> write-ahead-logging.  But even if every SSD supported it, the majority of
> production databases span across devices for either capacity, performance,
> or, most likely, high availability reasons.

The primary Facebook database server (Type 3 [1]) is single-device,
are they an outlier?  I would think scale-out architectures in general
handle database capacity and availability by scaling at the node
level... that said I don't doubt that some are dependent on
multi-device configurations.

[1]: http://opencompute.org/summit/ (slide 12)

> So in my opinion, that very
> much supports the idea of doing atomics at a layer where it applies to SW
> RAIDed storage (as I believe Dave and others are suggesting).

Sure this can expand to a multi-device capability, but that is
incremental to the single device use case.

> On the other side of the coin, I remember Dave talking about this during our
> NVM discussion at LSF last year and I got the impression the size and number
> of writes he'd need supported before he could really stop using his
> journaling code was potentially large.  Dave: perhaps you can re-state the
> number of writes and their total size that would have to be supported by
> block level atomics in order for them to be worth using by XFS?

...and that's the driving example of the value of having a solution
like this upstream.  Beat up on a common layer to determine the
minimum practical requirements across different use cases.

> Finally, I think atomics for file system use is interesting, but also
> exposing them for database use is very interesting.  That means exposing the
> size and number of writes supported to the app and making the file system
> able to turn around and leverage those when a database app tries to use them
> via the file system.  This has been the primary focus of the NVMP workgroup,
> helping ISVs determine what features they can leverage in a uniform way.  So
> my point here is we get the most use out of atomics by exposing them both
> in-kernel for file systems and in user space for apps.

*nod*

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LSF/MM TOPIC] atomic block device
       [not found] ` <CABBL8E+r+Uao9aJsezy16K_JXQgVuoD7ArepB46WTS=zruHL4g@mail.gmail.com>
  2014-02-15 21:35   ` Dan Williams
@ 2014-02-17  8:56   ` Dave Chinner
  2014-02-17  9:51     ` [Lsf-pc] " Jan Kara
  1 sibling, 1 reply; 18+ messages in thread
From: Dave Chinner @ 2014-02-17  8:56 UTC (permalink / raw)
  To: Andy Rudoff
  Cc: Dan Williams, lsf-pc, linux-fsdevel, jmoyer, Chris Mason,
	Jens Axboe, Bryan E Veal, Annie Foong

On Sat, Feb 15, 2014 at 10:47:12AM -0700, Andy Rudoff wrote:
> On Sat, Feb 15, 2014 at 8:04 AM, Dan Williams <dan.j.williams@intel.com>wrote:
> 
> > In response to Dave's call [1] and highlighting Jeff's attend request
> > [2] I'd like to stoke a discussion on an emulation layer for atomic
> > block commands.  Specifically, SNIA has laid out their position on the
> > command set an atomic block device may support (NVM Programming Model
> > [3]) and it is a good conversation piece for this effort.  The goal
> > would be to review the proposed operations, identify the capabilities
> > that would be readily useful to filesystems / existing use cases, and
> > tear down a straw man implementation proposal.
> >
> ...
> 
> > The argument for not doing this as a
> > device-mapper target or stacked block device driver is to ease
> > provisioning and make the emulation transparent.  On the other hand,
> > the argument for doing this as a virtual block device is that the
> > "failed to parse device metadata" is a known failure scenario for
> > dm/md, but not sd for example.
> >
> 
> Hi Dan,
> 
> Like Jeff, I'm a member of the NVMP workgroup and I'd like to ring in here
> with a couple observations.  I think the most interesting cases where
> atomics provide a benefit are cases where storage is RAIDed across multiple
> devices.  Part of the argument for atomic writes on SSDs is that databases
> and file systems can save bandwidth and complexity by avoiding
> write-ahead-logging.  But even if every SSD supported it, the majority of
> production databases span across devices for either capacity, performance,
> or, most likely, high availability reasons.  So in my opinion, that very
> much supports the idea of doing atomics at a layer where it applies to SW
> RAIDed storage (as I believe Dave and others are suggesting).
> 
> On the other side of the coin, I remember Dave talking about this during
> our NVM discussion at LSF last year and I got the impression the size and
> number of writes he'd need supported before he could really stop using his
> journaling code was potentially large.  Dave: perhaps you can re-state the
> number of writes and their total size that would have to be supported by
> block level atomics in order for them to be worth using by XFS?

Hi Andy - the numbers I gave last year were at the upper end of the
number of iovecs we can dump into an atomic checkpoint in the XFS
log at a time.  Because that is typically based on log size and the
log can be up to 2GB in size, this tends to max out at somewhere
around 150,000-200,000 individual iovecs and/or roughly 100MB of
metadata.

Yeah, it's a lot, but keep in mind that a workload running 250,000
file creates a second on XFS is retiring somewhere around 300,000
individual transactions per second, each of which will typically
have 10-20 dirty regions in them.  If we were to write them as
individual atomic writes at transaction commit time we'd need to
sustain somewhere in the order of 3-6 _million IOPS_ to maintain
this transaction rate with individual atomic writes for each
transaction.

That would also introduce unacceptable IO latency as we can't modify
metadata while it is under IO, especially as a large number of these
regions are redirtied repeatedly during ongoing operations (e.g.
directory data and index blocks). Hence to avoid this problem with
atomic writes, we still need asynchronous transactions and
in-memory aggregation of changes.  IOWs, checkpoints are the unit
of atomic write we need to support in XFS.

We can limit the size of checkpoints in XFS without too much
trouble, either by amount of data or number of iovecs, but that
comes at a performance cost. To maintain current levels of
performance we need a decent amount of in-memory change aggregation
and hence we are going to need - at minimum - thousands of vectors
in each atomic write. I'd prefer tens of thousands to hundreds of
thousands of vectors because that's our typical unit of "atomic
write" at current performance levels, but several thousand vectors
and tens of MB is sufficient to start with....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Lsf-pc] [LSF/MM TOPIC] atomic block device
  2014-02-17  8:56   ` Dave Chinner
@ 2014-02-17  9:51     ` Jan Kara
  2014-02-17 10:20       ` Howard Chu
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kara @ 2014-02-17  9:51 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Andy Rudoff, Jens Axboe, Bryan E Veal, Annie Foong, Chris Mason,
	jmoyer, linux-fsdevel, Dan Williams, lsf-pc

On Mon 17-02-14 19:56:27, Dave Chinner wrote:
> On Sat, Feb 15, 2014 at 10:47:12AM -0700, Andy Rudoff wrote:
> > On Sat, Feb 15, 2014 at 8:04 AM, Dan Williams <dan.j.williams@intel.com>wrote:
> > 
> > > In response to Dave's call [1] and highlighting Jeff's attend request
> > > [2] I'd like to stoke a discussion on an emulation layer for atomic
> > > block commands.  Specifically, SNIA has laid out their position on the
> > > command set an atomic block device may support (NVM Programming Model
> > > [3]) and it is a good conversation piece for this effort.  The goal
> > > would be to review the proposed operations, identify the capabilities
> > > that would be readily useful to filesystems / existing use cases, and
> > > tear down a straw man implementation proposal.
> > >
> > ...
> > 
> > > The argument for not doing this as a
> > > device-mapper target or stacked block device driver is to ease
> > > provisioning and make the emulation transparent.  On the other hand,
> > > the argument for doing this as a virtual block device is that the
> > > "failed to parse device metadata" is a known failure scenario for
> > > dm/md, but not sd for example.
> > >
> > 
> > Hi Dan,
> > 
> > Like Jeff, I'm a member of the NVMP workgroup and I'd like to ring in here
> > with a couple observations.  I think the most interesting cases where
> > atomics provide a benefit are cases where storage is RAIDed across multiple
> > devices.  Part of the argument for atomic writes on SSDs is that databases
> > and file systems can save bandwidth and complexity by avoiding
> > write-ahead-logging.  But even if every SSD supported it, the majority of
> > production databases span across devices for either capacity, performance,
> > or, most likely, high availability reasons.  So in my opinion, that very
> > much supports the idea of doing atomics at a layer where it applies to SW
> > RAIDed storage (as I believe Dave and others are suggesting).
> > 
> > On the other side of the coin, I remember Dave talking about this during
> > our NVM discussion at LSF last year and I got the impression the size and
> > number of writes he'd need supported before he could really stop using his
> > journaling code was potentially large.  Dave: perhaps you can re-state the
> > number of writes and their total size that would have to be supported by
> > block level atomics in order for them to be worth using by XFS?
> 
> Hi Andy - the numbers I gave last year were at the upper end of the
> number of iovecs we can dump into an atomic checkpoint in the XFS
> log at a time.  Because that is typically based on log size and the
> log can be up to 2GB in size, this tends to max out at somewhere
> around 150,000-200,000 individual iovecs and/or roughly 100MB of
> metadata.
> 
> Yeah, it's a lot, but keep in mind that a workload running 250,000
> file creates a second on XFS is retiring somewhere around 300,000
> individual transactions per second, each of which will typically
> have 10-20 dirty regions in them.  If we were to write them as
> individual atomic writes at transaction commit time we'd need to
> sustain somewhere in the order of 3-6 _million IOPS_ to maintain
> this transaction rate with individual atomic writes for each
> transaction.
> 
> That would also introduce unacceptable IO latency as we can't modify
> metadata while it is under IO, especially as a large number of these
> regions are redirtied repeatedly during ongoing operations (e.g.
> directory data and index blocks). Hence to avoid this problem with
> atomic writes, we still need asynchronous transactions and
> in-memory aggregation of changes.  IOWs, checkpoints are the unit
> of atomic write we need to support in XFS.
> 
> We can limit the size of checkpoints in XFS without too much
> trouble, either by amount of data or number of iovecs, but that
> comes at a performance cost. To maintain current levels of
> performance we need a decent amount of in-memory change aggregation
> and hence we are going to need - at minimum - thousands of vectors
> in each atomic write. I'd prefer tens of thousands to hundreds of
> thousands of vectors because that's our typical unit of "atomic
> write" at current performance levels, but several thousand vectors
> and tens of MB is sufficient to start with....
  I did the math for ext4 and it worked out rather similarly. After the
transaction batching we do in memory, we have transactions which are tens
of MB in size. These go first to a physically contiguous journal during
transaction commit (that's the easy part but it would already save us one
cache flush + FUA write) and then during checkpoint to final locations on
disk which can be physically discontiguous so that can be thousands to tens
of thousands different locations (this would save us another cache flush +
FUA write).

Similarly to the XFS case, it is easy to force smaller transactions in ext4,
but the smaller you make them the larger the journaling overhead...
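
Roughly, in pseudo-kernel C (every name below is made up purely to show
where the flush + FUA pair would disappear; this is not real jbd2 code):

/* today: checkpoint writes each batched buffer to its final, scattered
 * location, then pays a cache flush + FUA write to make the set durable */
list_for_each_entry(buf, &checkpoint_list, cp_list)
        checkpoint_write_buffer(buf);
issue_cache_flush_and_fua(journal_bdev);

/* with ATOMIC_MULTIWRITE: hand the same discontiguous set over as one
 * power-fail-atomic operation and drop the flush + FUA */
atomic_multiwrite_submit(journal_bdev, &checkpoint_list);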

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Lsf-pc] [LSF/MM TOPIC] atomic block device
  2014-02-17  9:51     ` [Lsf-pc] " Jan Kara
@ 2014-02-17 10:20       ` Howard Chu
  2014-02-18  0:10         ` Dave Chinner
  0 siblings, 1 reply; 18+ messages in thread
From: Howard Chu @ 2014-02-17 10:20 UTC (permalink / raw)
  To: Jan Kara, Dave Chinner
  Cc: Andy Rudoff, Jens Axboe, Bryan E Veal, Annie Foong, Chris Mason,
	jmoyer, linux-fsdevel, Dan Williams, lsf-pc

Jan Kara wrote:
> On Mon 17-02-14 19:56:27, Dave Chinner wrote:
>> On Sat, Feb 15, 2014 at 10:47:12AM -0700, Andy Rudoff wrote:
>>> On Sat, Feb 15, 2014 at 8:04 AM, Dan Williams <dan.j.williams@intel.com>wrote:
>>>
>>>> In response to Dave's call [1] and highlighting Jeff's attend request
>>>> [2] I'd like to stoke a discussion on an emulation layer for atomic
>>>> block commands.  Specifically, SNIA has laid out their position on the
>>>> command set an atomic block device may support (NVM Programming Model
>>>> [3]) and it is a good conversation piece for this effort.  The goal
>>>> would be to review the proposed operations, identify the capabilities
>>>> that would be readily useful to filesystems / existing use cases, and
>>>> tear down a straw man implementation proposal.
>>>>
>>> ...
>>>
>>>> The argument for not doing this as a
>>>> device-mapper target or stacked block device driver is to ease
>>>> provisioning and make the emulation transparent.  On the other hand,
>>>> the argument for doing this as a virtual block device is that the
>>>> "failed to parse device metadata" is a known failure scenario for
>>>> dm/md, but not sd for example.
>>>>
>>>
>>> Hi Dan,
>>>
>>> Like Jeff, I'm a member of the NVMP workgroup and I'd like to ring in here
>>> with a couple observations.  I think the most interesting cases where
>>> atomics provide a benefit are cases where storage is RAIDed across multiple
>>> devices.  Part of the argument for atomic writes on SSDs is that databases
>>> and file systems can save bandwidth and complexity by avoiding
>>> write-ahead-logging.  But even if every SSD supported it, the majority of
>>> production databases span across devices for either capacity, performance,
>>> or, most likely, high availability reasons.  So in my opinion, that very
>>> much supports the idea of doing atomics at a layer where it applies to SW
>>> RAIDed storage (as I believe Dave and others are suggesting).
>>>
>>> On the other side of the coin, I remember Dave talking about this during
>>> our NVM discussion at LSF last year and I got the impression the size and
>>> number of writes he'd need supported before he could really stop using his
>>> journaling code was potentially large.  Dave: perhaps you can re-state the
>>> number of writes and their total size that would have to be supported by
>>> block level atomics in order for them to be worth using by XFS?
>>
>> Hi Andy - the numbers I gave last year were at the upper end of the
>> number of iovecs we can dump into an atomic checkpoint in the XFS
>> log at a time.  Because that is typically based on log size and the
>> log can be up to 2GB in size, this tends to max out at somewhere
>> around 150,000-200,000 individual iovecs and/or roughly 100MB of
>> metadata.
>>
>> Yeah, it's a lot, but keep in mind that a workload running 250,000
>> file creates a second on XFS is retiring somewhere around 300,000
>> individual transactions per second, each of which will typically
>> have 10-20 dirty regions in them.  If we were to write them as
>> individual atomic writes at transaction commit time we'd need to
>> sustain somewhere in the order of 3-6 _million IOPS_ to maintain
>> this transaction rate with individual atomic writes for each
>> transaction.
>>
>> That would also introduce unacceptable IO latency as we can't modify
>> metadata while it is under IO, especially as a large number of these
>> regions are redirtied repeatedly during ongoing operations (e.g.
>> directory data and index blocks). Hence to avoid this problem with
>> atomic writes, we still need asynchronous transactions and
>> in-memory aggregation of changes.  IOWs, checkpoints are the unit
>> of atomic write we need to support in XFS.
>>
>> We can limit the size of checkpoints in XFS without too much
>> trouble, either by amount of data or number of iovecs, but that
>> comes at a performance cost. To maintain current levels of
>> performance we need a decent amount of in-memory change aggregation
>> and hence we are going to need - at minimum - thousands of vectors
>> in each atomic write. I'd prefer tens of thousands to hundreds of
>> thousands of vectors because that's our typical unit of "atomic
>> write" at current performance levels, but several thousand vectors
>> and tens of MB is sufficient to start with....
>    I did the math for ext4 and it worked out rather similarly. After the
> transaction batching we do in memory, we have transactions which are tens
> of MB in size. These go first to a physically contiguous journal during
> transaction commit (that's the easy part but it would already save us one
> cache flush + FUA write) and then during checkpoint to final locations on
> disk which can be physically discontiguous so that can be thousands to tens
> of thousands different locations (this would save us another cache flush +
> FUA write).
>
> Similarly to the XFS case, it is easy to force smaller transactions in ext4,
> but the smaller you make them the larger the journaling overhead...

Again, if you simply tag writes with group IDs as I outlined before 
http://www.spinics.net/lists/linux-fsdevel/msg70047.html then you don't need 
explicit cache flushes, nor do you need to worry about transaction size 
limits. All you actually need is to ensure the ordering of a specific set of 
writes in relation to another specific set of writes, completely independent 
of other arbitrary writes. You folks are cooking up a solution for NVMe that's 
only practical when data transfer rates are fast enough that a 100MB write can 
be done in ~1ms, whereas a simple tweak of command tagging will work for 
everything from the slowest HDD to the fastest storage device.
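
To show the shape of what I mean (the helper below is hypothetical, not an
existing block-layer call): all of a transaction's data writes carry group
N, the commit write carries group N+1, and the device must not start any
group N+1 write until every group N write is durable; no cache flush and
no transaction size limit required:

/* illustration only: bio_set_write_group() is an invented helper */
bio_set_write_group(data_bio, txn_group);        /* data pages    */
bio_set_write_group(commit_bio, txn_group + 1);  /* commit record */
submit_bio(WRITE, data_bio);
submit_bio(WRITE, commit_bio);   /* can be issued immediately; ordering
                                  * is enforced per group tag by the device */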

As it is, the Atomic Write mechanism will be unusable for DBs when the 
transaction size exceeds whatever limit a particular device supports, thus 
requiring DB software to still provide a fallback mechanism, e.g. standard 
WAL, which only results in more complicated software. That's not a solution, 
that's just a new problem.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LSF/MM TOPIC] atomic block device
  2014-02-15 15:04 [LSF/MM TOPIC] atomic block device Dan Williams
                   ` (2 preceding siblings ...)
       [not found] ` <CABBL8E+r+Uao9aJsezy16K_JXQgVuoD7ArepB46WTS=zruHL4g@mail.gmail.com>
@ 2014-02-17 13:05 ` Chris Mason
  2014-02-18 19:07   ` Dan Williams
  3 siblings, 1 reply; 18+ messages in thread
From: Chris Mason @ 2014-02-17 13:05 UTC (permalink / raw)
  To: Dan Williams, lsf-pc
  Cc: linux-fsdevel, jmoyer, david, Jens Axboe, Bryan E Veal, Annie Foong

On 02/15/2014 10:04 AM, Dan Williams wrote:
> In response to Dave's call [1] and highlighting Jeff's attend request
> [2] I'd like to stoke a discussion on an emulation layer for atomic
> block commands.  Specifically, SNIA has laid out their position on the
> command set an atomic block device may support (NVM Programming Model
> [3]) and it is a good conversation piece for this effort.  The goal
> would be to review the proposed operations, identify the capabilities
> that would be readily useful to filesystems / existing use cases, and
> tear down a straw man implementation proposal.
>
> The SNIA defined capabilities that seem the highest priority to implement are:
> * ATOMIC_MULTIWRITE - dis-contiguous LBA ranges, power fail atomic, no
> ordering constraint relative to other i/o
>
> * ATOMIC_WRITE - contiguous LBA range, power fail atomic, no ordering
> constraint relative to other i/o
>
> * EXISTS - not an atomic command, but defined in the NPM.  It is akin
> to SEEK_{DATA|HOLE} to test whether an LBA is mapped or unmapped.  If
> the LBA is mapped additionally specifies whether data is present or
> the LBA is only allocated.
>
> * SCAR - again not an atomic command, but once we have metadata can
> implement a bad block list, analogous to the bad-block-list support in
> md.
>
> Initial thought is that this functionality is better implemented as a
> library a block device driver (bio-based or request-based) can call to
> emulate these features.  In the case where the feature is directly
> supported by the underlying hardware device the emulation layer will
> stub out and pass it through.  The argument for not doing this as a
> device-mapper target or stacked block device driver is to ease
> provisioning and make the emulation transparent.  On the other hand,
> the argument for doing this as a virtual block device is that the
> "failed to parse device metadata" is a known failure scenario for
> dm/md, but not sd for example.

Hi Dan,

I'd suggest a dm device instead of a special library, mostly because the 
emulated device is likely to need some kind of cleanup action after a 
crash, and the dm model is best suited to cleanly provide that.  It's 
also a good fit for people that want to duct tape a small amount of very 
fast nvm onto relatively slower devices.

The absolute minimum to provide something useful is a 16K discontig 
atomic.  That won't help the filesystems much, but it will allow mysql 
to turn off double buffering.  Oracle would benefit from ~64K, mostly 
from a safety point of view since they don't double buffer.

Helping the filesystems is harder: we need atomics bigger than any
individual device is likely to provide.  But as Dave says elsewhere in 
the thread, we can limit that for specific workloads.

I'm not sold on SCAR, since I'd expect the FTL or drive firmware to provide
that for us.  What use case do you have in mind there?

-chris

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Lsf-pc] [LSF/MM TOPIC] atomic block device
  2014-02-17 10:20       ` Howard Chu
@ 2014-02-18  0:10         ` Dave Chinner
  2014-02-18  8:59           ` Alex Elsayed
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Chinner @ 2014-02-18  0:10 UTC (permalink / raw)
  To: Howard Chu
  Cc: Jan Kara, Jens Axboe, Andy Rudoff, Annie Foong, Chris Mason,
	jmoyer, Bryan E Veal, linux-fsdevel, Dan Williams, lsf-pc

On Mon, Feb 17, 2014 at 02:20:50AM -0800, Howard Chu wrote:
> Jan Kara wrote:
> >On Mon 17-02-14 19:56:27, Dave Chinner wrote:
> >>On Sat, Feb 15, 2014 at 10:47:12AM -0700, Andy Rudoff wrote:
> >>>On Sat, Feb 15, 2014 at 8:04 AM, Dan Williams <dan.j.williams@intel.com>wrote:
> >>>
> >>>>In response to Dave's call [1] and highlighting Jeff's attend request
> >>>>[2] I'd like to stoke a discussion on an emulation layer for atomic
> >>>>block commands.  Specifically, SNIA has laid out their position on the
> >>>>command set an atomic block device may support (NVM Programming Model
> >>>>[3]) and it is a good conversation piece for this effort.  The goal
> >>>>would be to review the proposed operations, identify the capabilities
> >>>>that would be readily useful to filesystems / existing use cases, and
> >>>>tear down a straw man implementation proposal.
> >>>>
> >>>...
> >>>
> >>>>The argument for not doing this as a
> >>>>device-mapper target or stacked block device driver is to ease
> >>>>provisioning and make the emulation transparent.  On the other hand,
> >>>>the argument for doing this as a virtual block device is that the
> >>>>"failed to parse device metadata" is a known failure scenario for
> >>>>dm/md, but not sd for example.
> >>>>
> >>>
> >>>Hi Dan,
> >>>
> >>>Like Jeff, I'm a member of the NVMP workgroup and I'd like to ring in here
> >>>with a couple observations.  I think the most interesting cases where
> >>>atomics provide a benefit are cases where storage is RAIDed across multiple
> >>>devices.  Part of the argument for atomic writes on SSDs is that databases
> >>>and file systems can save bandwidth and complexity by avoiding
> >>>write-ahead-logging.  But even if every SSD supported it, the majority of
> >>>production databases span across devices for either capacity, performance,
> >>>or, most likely, high availability reasons.  So in my opinion, that very
> >>>much supports the idea of doing atomics at a layer where it applies to SW
> >>>RAIDed storage (as I believe Dave and others are suggesting).
> >>>
> >>>On the other side of the coin, I remember Dave talking about this during
> >>>our NVM discussion at LSF last year and I got the impression the size and
> >>>number of writes he'd need supported before he could really stop using his
> >>>journaling code was potentially large.  Dave: perhaps you can re-state the
> >>>number of writes and their total size that would have to be supported by
> >>>block level atomics in order for them to be worth using by XFS?
> >>
> >>Hi Andy - the numbers I gave last year were at the upper end of the
> >>number of iovecs we can dump into an atomic checkpoint in the XFS
> >>log at a time.  Because that is typically based on log size and the
> >>log can be up to 2GB in size, this tends to max out at somewhere
> >>around 150,000-200,000 individual iovecs and/or roughly 100MB of
> >>metadata.
> >>
> >>Yeah, it's a lot, but keep in mind that a workload running 250,000
> >>file creates a second on XFS is retiring somewhere around 300,000
> >>individual transactions per second, each of which will typically
> >>have 10-20 dirty regions in them.  If we were to write them as
> >>individual atomic writes at transaction commit time we'd need to
> >>sustain somewhere in the order of 3-6 _million IOPS_ to maintain
> >>this transaction rate with individual atomic writes for each
> >>transaction.
> >>
> >>That would also introduce unacceptable IO latency as we can't modify
> >>metadata while it is under IO, especially as a large number of these
> >>regions are redirtied repeatedly during ongoing operations (e.g.
> >>directory data and index blocks). Hence to avoid this problem with
> >>atomic writes, we still need asynchronous transactions and
> >>in-memory aggregation of changes.  IOWs, checkpoints are the unit
> >>of atomic write we need to support in XFS.
> >>
> >>We can limit the size of checkpoints in XFS without too much
> >>trouble, either by amount of data or number of iovecs, but that
> >>comes at a performance cost. To maintain current levels of
> >>performance we need a decent amount of in-memory change aggregation
> >>and hence we are going to need - at minimum - thousands of vectors
> >>in each atomic write. I'd prefer tens of thousands to hundreds of
> >>thousands of vectors because that's our typical unit of "atomic
> >>write" at current performance levels, but several thousand vectors
> >>and tens of MB is sufficient to start with....
> >   I did the math for ext4 and it worked out rather similarly. After the
> >transaction batching we do in memory, we have transactions which are tens
> >of MB in size. These go first to a physically contiguous journal during
> >transaction commit (that's the easy part but it would already save us one
> >cache flush + FUA write) and then during checkpoint to final locations on
> >disk which can be physically discontiguous so that can be thousands to tens
> >of thousands different locations (this would save us another cache flush +
> >FUA write).
> >
> >Similarly to the XFS case, it is easy to force smaller transactions in ext4,
> >but the smaller you make them the larger the journaling overhead...
> 
> Again, if you simply tag writes with group IDs as I outlined before
> http://www.spinics.net/lists/linux-fsdevel/msg70047.html then you
> don't need explicit cache flushes, nor do you need to worry about
> transaction size limits.
>
> All you actually need is to ensure the
> ordering of a specific set of writes in relation to another specific
> set of writes, completely independent of other arbitrary writes. You
> folks are cooking up a solution for NVMe that's only practical when
> data transfer rates are fast enough that a 100MB write can be done
> in ~1ms, whereas a simple tweak of command tagging will work for
> everything from the slowest HDD to the fastest storage device.

Perhaps you'd like to outline how you avoid IO priority inversion in
a journal with such a scheme where current checkpoints are held off
by all other metadata writeback because, by definition, metadata
writeback must be in a lower ordered tag group than the current
checkpoint.

> As it is, the Atomic Write mechanism will be unusable for DBs when
> the transaction size exceeds whatever limit a particular device
> supports, thus requiring DB software to still provide a fallback
> mechanism, e.g. standard WAL, which only results in more complicated
> software. That's not a solution, that's just a new problem.

Realistically, I haven't seen a single proposal coming out of the
hardware vendors that makes filesystem journalling more efficient
than it already is. Atomic writes might be able to save a journal
flush on an fsync() and so make databases go faster, but it gives
up a whole heap of other optimisations that make non-database
workloads go fast. e.g. untarring a tarball.

Similarly, things like ordered writes are great until you consider
how they interact with journalling and cause priority inversion
issues. The only way to make use of ordered writes is to design the
filesystem around ordered writes from the ground up. i.e. the
soft updates complexity problem. Unlike atomic writes, this can't
easily be retrofitted to an existing filesystem, and once you have
soft updates in place you are effectively fixing the format and
features of the filesystem in stone, because if you need to change a
single operation or on-disk structure you have to work out the
dependency graph for the entire filesystem from the ground up again.

Perhaps - just perhaps - we're doing this all wrong. Bottom up
design of hardware offload features has a history of resulting in
functionality that looks good on paper but can't be used in general
production systems because it is too limited or has undesirable side
effects.  Perhaps we need to be more top down, similar to how I
proposed a "dm-atomic" layer to implement atomic writes in software.

That is, design the software layer first, then convert filesystems
to use it. Once the concept is proven (a software implementation
should be no slower than what it replaced), the hardware offload
primitives can be derived from the algorithms that the software
offload uses.

i.e. design offload algorithms that work for existing users, prove
they work, then provide those primitives in hardware knowing that
they work and will be useful....

You can implement all of this ordered group write scheme in a DM module
quite easily: it's trivial to extend submit_bio to take a 64-bit sequence
tag for ordered group writes. All metadata IO in XFS already has an
ordered 64-bit tag associated with it (funnily enough, called the
Log Sequence Number) and you can tell XFS not to send cache flushes
simply by using the nobarrier mount option.
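
i.e. something of this shape (sketch only; the three-argument variant and
the dm target that interprets the tag are hypothetical):

/* today */
void submit_bio(int rw, struct bio *bio);

/* sketch: same call plus a 64-bit ordering tag.  XFS would pass the
 * buffer's LSN, and a dm-ordered/dm-atomic target would guarantee that no
 * bio with a higher tag becomes durable before all bios with lower tags,
 * taking over the job of the cache flushes that "nobarrier" turns off. */
void submit_bio_ordered(int rw, struct bio *bio, u64 order_tag);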

So there's your proof of concept implementation - prove it works,
that priority inversion isn't a problem and that performance is
equivalent to the existing cache flush based implementation, and
then you have a proposal that we can take seriously.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Lsf-pc] [LSF/MM TOPIC] atomic block device
  2014-02-18  0:10         ` Dave Chinner
@ 2014-02-18  8:59           ` Alex Elsayed
  2014-02-18 13:17             ` Dave Chinner
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Elsayed @ 2014-02-18  8:59 UTC (permalink / raw)
  To: linux-fsdevel

Dave Chinner wrote:

(My apologies for snipping, but I only wanted to address a very small part 
of what you said)
> Similarly, things like ordered writes are great until you consider
> how they interact with journalling and cause priority inversion
> issues. The only way to make use of ordered writes is to design the
> filesystem around ordered writes from the ground up. i.e. the
> soft updates complexity problem. Unlike atomic writes, this can't
> easily be retrofitted to an existing filesystem, and once you have
> soft updates in place you are effectively fixing the format and
> features of the filesystem in stone because if you need to change a
> single operation or on disk structure you have to work out the
> dependency graph for the entire filesystem from the ground up again.

One thing that keeps coming to mind whenever ordering guarantees in 
filesystems come up is Featherstitch[1], which is (to quote the article) "a 
generalization of the soft updates system of write dependencies and rollback 
data" (since "not enough file systems geniuses exist in the world to write 
and maintain more than one instance of soft updates").

Aside from its relevance to your observations on soft updates, it had a 
userspace API that provided similar guarantees to Howard Chu's suggestion.

[1] article by Valerie Aurora: https://lwn.net/Articles/354861/


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Lsf-pc] [LSF/MM TOPIC] atomic block device
  2014-02-18  8:59           ` Alex Elsayed
@ 2014-02-18 13:17             ` Dave Chinner
  2014-02-18 14:09               ` Theodore Ts'o
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Chinner @ 2014-02-18 13:17 UTC (permalink / raw)
  To: Alex Elsayed; +Cc: linux-fsdevel

On Tue, Feb 18, 2014 at 12:59:49AM -0800, Alex Elsayed wrote:
> Dave Chinner wrote:
> 
> (My apologies for snipping, but I only wanted to address a very small part 
> of what you said)
> > Similarly, things like ordered writes are great until you consider
> > how they interact with journalling and cause priority inversion
> > issues. The only way to make use of ordered writes is to design the
> > filesystem around ordered writes from the ground up. i.e. the
> > soft updates complexity problem. Unlike atomic writes, this can't
> > easily be retrofitted to an existing filesystem, and once you have
> > soft updates in place you are effectively fixing the format and
> > features of the filesystem in stone because if you need to change a
> > single operation or on disk structure you have to work out the
> > dependency graph for the entire filesystem from the ground up again.
> 
> One thing that keeps coming to mind whenever ordering guarantees in 
> filesystems come up is Featherstitch[1], which is (to quote the article) "a 
> generalization of the soft updates system of write dependencies and rollback 
> data" (since "not enough file systems geniuses exist in the world to write 
> and maintain more than one instance of soft updates").

Generalising a complex concept by abstracting it doesn't mean the
result is easier to understand. Indeed, when I first looked at
featherstitch back when that article was published (as a follow-up to
Val's article on soft updates where she characterised them as "an
evolutionary dead end") I couldn't find anything in the source code
that generalised the process of determining the dependencies in a
filesystem and verifying that the featherstitch core handled them
correctly.

i.e. the complexity problem is still there - all it provides is a
generalised method of tracking changes and resolving dependencies
that have been defined. Determining whether a filesystem has a
dependency that the featherstitch resolver doesn't handle requires
exactly the same understanding of the dependencies in the filesystem
that implementing soft updates requires.

> Aside from its relevance to your observations on soft updates, it had a 
> userspace API that provided similar guarantees to Howard Chu's suggestion.

Which requires featherstitch to be implemented in all filesystems
so that the dependencies that the userspace API introduces can be
resolved correctly. It's an all or nothing solution, and requires
deep, dark surgery to every single filesystem that you want to
support that API.

Worse: I've actually looked at the featherstitch code and it's
pretty nasty.  All the filesystem modules it has are built into the
featherstitch kernel module, and called through a VFS shim layer
that re-implements much of the generic paths to add callouts to the
filesystem modules to track metadata updates in a featherstitch
aware fashion.

This is not helped by a lack of documentation and a distinct lack of
useful comments in the code. Not to mention that it's full of TODO
and FIXME items, and its error handling:

        r = patch_create_empty_list(NULL, &top, top_keep, NULL);
        if(r < 0)
                kpanic("Can't recover from failure!");

took lessons from btrfs and topped the class.

Quite frankly, featherstitch *was* a research project and it shows.
I say *was*, because it hasn't been updated since 2.6.20 and so from
that point of view it is a dead project.  Maybe some of the ideas
can be used in some way, but IMO the complexity of algorithms and
implementation just kills these ordered dependency graph solutions
stone dead.

Besides that, it needs a complete rearchitecting and
re-implementation, as would all the filesystems that use it.  Hence
I just don't see it as a viable path to solving the issues at hand.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Lsf-pc] [LSF/MM TOPIC] atomic block device
  2014-02-18 13:17             ` Dave Chinner
@ 2014-02-18 14:09               ` Theodore Ts'o
  0 siblings, 0 replies; 18+ messages in thread
From: Theodore Ts'o @ 2014-02-18 14:09 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Alex Elsayed, linux-fsdevel

In addition to Dave's comments, consider the following from Val's
article that Alex cited:

   The overall performance result was that the Featherstitch
   implementations were on par with or somewhat better than the comparable
   ext3 version in elapsed time, but used significantly more CPU
   time.....

   So, you can use Featherstitch to re-implement all kinds of file
   system consistency schemes - soft updates, copy-on-write,
   journaling of all flavors - and it will go about as fast as the old
   version while using up more of your CPU.

And note that this was comparing against ext3, which is not exactly a
shining example of performance.  (i.e., ext4 and xfs tend to beat ext3
handily on most benchmarks.)

Furthermore, given the sort of dependency tracking which Featherstitch
is attempting, I suspect that the results will be at the very least
interesting on a system with a large number of cores; it's very likely
that its CPU scalability leaves much to be desired.

Finally, note that many disk drives do not perform all that well with
writeback caching disabled (which is required for soft updates and its
variants).  So when people do benchmarks comparing soft updates against
traditional file systems, an important question to ask is (1) did
they remember to disable writeback caching for the soft updates run
(which is not the default, and if you don't disable it, you lose your
powerfail reliability), and (2) was writeback caching enabled or
disabled when benchmarking the traditional file system, which can
safely use the default HDD writeback caching.

Regards,

					- Ted

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LSF/MM TOPIC] atomic block device
  2014-02-17 13:05 ` Chris Mason
@ 2014-02-18 19:07   ` Dan Williams
  0 siblings, 0 replies; 18+ messages in thread
From: Dan Williams @ 2014-02-18 19:07 UTC (permalink / raw)
  To: Chris Mason
  Cc: lsf-pc, linux-fsdevel, jmoyer, david, Jens Axboe, Bryan E Veal,
	Annie Foong

On Mon, Feb 17, 2014 at 5:05 AM, Chris Mason <clm@fb.com> wrote:
> On 02/15/2014 10:04 AM, Dan Williams wrote:
>>
>> In response to Dave's call [1] and highlighting Jeff's attend request
>> [2] I'd like to stoke a discussion on an emulation layer for atomic
>> block commands.  Specifically, SNIA has laid out their position on the
>> command set an atomic block device may support (NVM Programming Model
>> [3]) and it is a good conversation piece for this effort.  The goal
>> would be to review the proposed operations, identify the capabilities
>> that would be readily useful to filesystems / existing use cases, and
>> tear down a straw man implementation proposal.
>>
>> The SNIA defined capabilities that seem the highest priority to implement
>> are:
>> * ATOMIC_MULTIWRITE - dis-contiguous LBA ranges, power fail atomic, no
>> ordering constraint relative to other i/o
>>
>> * ATOMIC_WRITE - contiguous LBA range, power fail atomic, no ordering
>> constraint relative to other i/o
>>
>> * EXISTS - not an atomic command, but defined in the NPM.  It is akin
>> to SEEK_{DATA|HOLE} to test whether an LBA is mapped or unmapped.  If
>> the LBA is mapped additionally specifies whether data is present or
>> the LBA is only allocated.
>>
>> * SCAR - again not an atomic command, but once we have metadata can
>> implement a bad block list, analogous to the bad-block-list support in
>> md.
>>
>> Initial thought is that this functionality is better implemented as a
>> library a block device driver (bio-based or request-based) can call to
>> emulate these features.  In the case where the feature is directly
>> supported by the underlying hardware device the emulation layer will
>> stub out and pass it through.  The argument for not doing this as a
>> device-mapper target or stacked block device driver is to ease
>> provisioning and make the emulation transparent.  On the other hand,
>> the argument for doing this as a virtual block device is that the
>> "failed to parse device metadata" is a known failure scenario for
>> dm/md, but not sd for example.
>
>
> Hi Dan,
>
> I'd suggest a dm device instead of a special library, mostly because the
> emulated device is likely to need some kind of cleanup action after a crash,
> and the dm model is best suited to cleanly provide that.  It's also a good
> fit for people that want to duct tape a small amount of very fast nvm onto
> relatively slower devices.

Hi Chris,

I can see that.  It would be surprising if sda failed to show up due
to metadata corruption.  Support for making the transition transparent
when the backing device supports the offloads can come later.

> The absolute minimum to provide something useful is a 16K discontig atomic.
> That won't help the filesystems much, but it will allow mysql to turn off
> double buffering.  Oracle would benefit from ~64K, mostly from a safety
> point of view since they don't double buffer.
>
> Helping the filesystems is harder, we need atomics bigger than any
> individual device is likely to provide.  But as Dave says elsewhere in the
> thread, we can limit that for specific workloads.

This sounds like the difference between "atomically handle a set of
commands up to the device's in-flight queue depth" and "guarantee
atomic commit of transactions that may have landed on media a while
ago along with the current set of in-flight requests".  Am I parsing
the difference correctly?
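
To make the first of those two models concrete, the library entry
point could be as small as this purely hypothetical sketch (none of
these names exist in the tree today):

        #include <linux/blkdev.h>
        #include <linux/types.h>

        /*
         * One dis-contiguous range of an atomic multiwrite.  Either
         * every range becomes durable or none of them do, even across
         * a power failure.
         */
        struct atomic_range {
                sector_t        lba;            /* start of this range */
                unsigned int    nr_sectors;     /* length of this range */
                struct page     *page;          /* payload for this range */
        };

        /*
         * Passes straight through when the device advertises native
         * support for nr_ranges dis-contiguous ranges; otherwise the
         * library stages the data in its own metadata area and commits
         * it with a single atomic metadata update on completion.
         */
        int blk_atomic_multiwrite(struct block_device *bdev,
                                  const struct atomic_range *ranges,
                                  unsigned int nr_ranges);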

> I'm not sold on SCAR, since I'd expect the FTL or drive firmware to provide
> that for us.  What use case do you have in mind there?

The only use case I know of for SCAR is the internal functionality RAID
firmware implements to continue an array rebuild upon encountering a bad
block.  Rather than stop the rebuild or silently corrupt data, the
firmware "scars" the LBA ranges on the incoming rebuild target that
otherwise could not be recovered due to bad blocks on the other array
members.
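
As a rough illustration of how little metadata SCAR needs -- a
hypothetical layout, nothing is implemented -- the emulation could
keep a persistent list of scarred extents that the read path checks:

        #include <asm/byteorder.h>
        #include <linux/types.h>

        /*
         * Hypothetical on-media record for one scarred extent.  Reads
         * that overlap a scar fail with a media error until the range
         * is rewritten, much like md's bad-block list.
         */
        struct scar_extent {
                __le64  lba;            /* first scarred sector */
                __le32  nr_sectors;     /* length of the scarred range */
                __le32  flags;          /* reserved */
        };

        static bool lba_is_scarred(const struct scar_extent *s, u64 lba)
        {
                u64 start = le64_to_cpu(s->lba);

                return lba >= start &&
                       lba < start + le32_to_cpu(s->nr_sectors);
        }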

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LSF/MM TOPIC] atomic block device
  2014-02-15 20:25     ` James Bottomley
@ 2014-03-20 20:10       ` Jeff Moyer
  0 siblings, 0 replies; 18+ messages in thread
From: Jeff Moyer @ 2014-03-20 20:10 UTC (permalink / raw)
  To: James Bottomley
  Cc: Andy Rudoff, Dan Williams, lsf-pc, linux-fsdevel, david,
	Chris Mason, Jens Axboe, Bryan E Veal, Annie Foong, linux-scsi,
	Christoph Lameter

James Bottomley <James.Bottomley@HansenPartnership.com> writes:

> OK, so what the Database people are currently fretting about is how the
> Linux cache fights with the WAL.  Pretty much all DBs sit on filesystems
> these days, so the first question is are block operations even relevant

Yes, they are relevant so long as there are users.  Not all databases
run on file systems.  More to the point, I think the spec includes them
for completeness.

> and do the (rather scant) file operations fit their need.  The basic
> problem with the file mode is the granularity.  What does a DB do with
> transactions which go over the limits?

s/transactions/atomic multiwrites/?  There are a couple of options
there.  You could either emulate the multiwrite transparently to the
application, or you could fail it.  I think we'd have to support both,
since some applications will not want the less performant fallback.
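
To sketch that split -- every name below is hypothetical, nothing like
it exists in the kernel today:

        #include <linux/bio.h>
        #include <linux/blkdev.h>
        #include <linux/errno.h>

        /* Hypothetical helpers, declared only to show the shape. */
        bool hw_supports_multiwrite(struct block_device *bdev, struct bio *bios);
        int hw_atomic_multiwrite(struct block_device *bdev, struct bio *bios);
        int emulate_atomic_multiwrite(struct block_device *bdev, struct bio *bios);

        /*
         * 'bios' is a chain describing the dis-contiguous ranges of
         * one atomic multiwrite: pass through when the hardware can do
         * it natively, emulate when the caller allows it, fail
         * otherwise.
         */
        static int submit_atomic_multiwrite(struct block_device *bdev,
                                            struct bio *bios,
                                            bool allow_fallback)
        {
                if (hw_supports_multiwrite(bdev, bios))
                        return hw_atomic_multiwrite(bdev, bios);

                if (!allow_fallback)
                        return -EOPNOTSUPP;     /* app opted out of the slow path */

                /*
                 * Slower path: journal the data first, then flip a
                 * single piece of emulation metadata to commit it.
                 */
                return emulate_atomic_multiwrite(bdev, bios);
        }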

> It also looks like the NVM file actions need to work over DIO, so the
> question is how.

DIO just means avoid using the page cache (or, I guess you could make it
more generic by saying avoid double buffering).  If the hardware
supports atomic writes, then this is easy.  If it doesn't, then we can
still do the emulation.  DIO already has fallback to buffered, so it's
not like we always honor that flag.
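
For reference, the application-visible half of DIO is just an O_DIRECT
open plus alignment -- a minimal userspace sketch, assuming a 4096-byte
logical block size:

        #define _GNU_SOURCE             /* for O_DIRECT */
        #include <fcntl.h>
        #include <stdlib.h>
        #include <string.h>
        #include <unistd.h>

        /*
         * Minimal direct I/O write: buffer, offset and length must all
         * be suitably aligned or the kernel may reject the request (or
         * quietly fall back to buffered I/O, depending on the
         * filesystem).
         */
        static int dio_write(const char *path, off_t off,
                             const void *data, size_t len)
        {
                void *buf;
                ssize_t ret;
                int fd = open(path, O_WRONLY | O_DIRECT);

                if (fd < 0)
                        return -1;
                if (posix_memalign(&buf, 4096, len)) {
                        close(fd);
                        return -1;
                }
                memcpy(buf, data, len);         /* stage into the aligned buffer */
                ret = pwrite(fd, buf, len, off);
                free(buf);
                close(fd);
                return ret == (ssize_t)len ? 0 : -1;
        }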

> (And the other problem is that only a few DBs seem to use DIO).

This is the first time I have ever heard anyone state that not using DIO
was a problem.  :-)  Also, I'm not sure what problem you are thinking
of.  Perhaps you are referring to the interactions of the WAL and the
page cache (that you mentioned in the first paragraph).

Cheers,
Jeff

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2014-03-20 20:10 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-15 15:04 [LSF/MM TOPIC] atomic block device Dan Williams
2014-02-15 17:55 ` Andy Rudoff
2014-02-15 18:29   ` Howard Chu
2014-02-15 18:31     ` Howard Chu
2014-02-15 18:02 ` James Bottomley
2014-02-15 18:15   ` Andy Rudoff
2014-02-15 20:25     ` James Bottomley
2014-03-20 20:10       ` Jeff Moyer
     [not found] ` <CABBL8E+r+Uao9aJsezy16K_JXQgVuoD7ArepB46WTS=zruHL4g@mail.gmail.com>
2014-02-15 21:35   ` Dan Williams
2014-02-17  8:56   ` Dave Chinner
2014-02-17  9:51     ` [Lsf-pc] " Jan Kara
2014-02-17 10:20       ` Howard Chu
2014-02-18  0:10         ` Dave Chinner
2014-02-18  8:59           ` Alex Elsayed
2014-02-18 13:17             ` Dave Chinner
2014-02-18 14:09               ` Theodore Ts'o
2014-02-17 13:05 ` Chris Mason
2014-02-18 19:07   ` Dan Williams
