All of lore.kernel.org
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-xfs@vger.kernel.org,
	linux-block <linux-block@vger.kernel.org>,
	linux-api@vger.kernel.org,
	"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Subject: Re: [RFC] failure atomic writes for file systems and block devices
Date: Wed, 1 Mar 2017 13:21:41 +0200	[thread overview]
Message-ID: <CAOQ4uxjYwP+XGdDu-HiKmiAVEFydqYj6xfQbaKULB9WWUSuX5g@mail.gmail.com> (raw)
In-Reply-To: <20170228145737.19016-1-hch@lst.de>

On Tue, Feb 28, 2017 at 4:57 PM, Christoph Hellwig <hch@lst.de> wrote:
> Hi all,
>
> this series implements a new O_ATOMIC flag for failure atomic writes
> to files.   It is based on and tries to unify to earlier proposals,
> the first one for block devices by Chris Mason:
>
>         https://lwn.net/Articles/573092/
>
> and the second one for regular files, published by HP Research at
> Usenix FAST 2015:
>
>         https://www.usenix.org/conference/fast15/technical-sessions/presentation/verma
>
> It adds a new O_ATOMIC flag for open, which requests writes to be
> failure-atomic, that is either the whole write makes it to persistent
> storage, or none of it, even in case of power of other failures.
>
> There are two implementation various of this:  on block devices O_ATOMIC
> must be combined with O_(D)SYNC so that storage devices that can handle
> large writes atomically can simply do that without any additional work.
> This case is supported by NVMe.
>
> The second case is for file systems, where we simply write new blocks
> out of places and then remap them into the file atomically on either
> completion of an O_(D)SYNC write or when fsync is called explicitly.
>
> The semantics of the latter case are explained in detail at the Usenix
> paper above.
>
> Last but not least a new fcntl is implemented to provide information
> about I/O restrictions such as alignment requirements and the maximum
> atomic write size.
>
> The implementation is simple and clean, but I'm rather unhappy about
> the interface as it has too many failure modes to bullet proof.  For
> one old kernels ignore unknown open flags silently, so applications
> have to check the F_IOINFO fcntl before, which is a bit of a killer.
> Because of that I've also not implemented any other validity checks
> yet, as they might make thing even worse when an open on a not supported
> file system or device fails, but not on an old kernel.  Maybe we need
> a new open version that checks arguments properly first?
>

[CC += linux-api@vger.kernel.org] for that question and for the new API

> Also I'm really worried about the NVMe failure modes - devices simply
> advertise an atomic write size, with no way for the device to know
> that the host requested a given write to be atomic, and thus no
> error reporting.  This is made worse by NVMe 1.2 adding per-namespace
> atomic I/O parameters that devices can use to introduce additional
> odd alignment quirks - while there is some language in the spec
> requiring them not to weaken the per-controller guarantees it all
> looks rather weak and I'm not too confident in all implementations
> getting everything right.
>
> Last but not least this depends on a few XFS patches, so to actually
> apply / run the patches please use this git tree:
>
>     git://git.infradead.org/users/hch/vfs.git O_ATOMIC
>
> Gitweb:
>
>     http://git.infradead.org/users/hch/vfs.git/shortlog/refs/heads/O_ATOMIC

  parent reply	other threads:[~2017-03-01 11:24 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-28 14:57 Christoph Hellwig
2017-02-28 14:57 ` [PATCH 01/12] uapi/fs: add O_ATOMIC to the open flags Christoph Hellwig
2017-02-28 14:57 ` [PATCH 02/12] iomap: pass IOMAP_* flags to actors Christoph Hellwig
2017-02-28 14:57 ` [PATCH 03/12] iomap: add a IOMAP_ATOMIC flag Christoph Hellwig
2017-02-28 14:57 ` [PATCH 04/12] fs: add a BH_Atomic flag Christoph Hellwig
2017-02-28 14:57 ` [PATCH 05/12] fs: add a F_IOINFO fcntl Christoph Hellwig
2017-02-28 16:51   ` Darrick J. Wong
2017-03-01 15:11     ` Christoph Hellwig
2017-02-28 14:57 ` [PATCH 06/12] xfs: cleanup is_reflink checks Christoph Hellwig
2017-02-28 14:57 ` [PATCH 07/12] xfs: implement failure-atomic writes Christoph Hellwig
2017-02-28 23:09   ` Darrick J. Wong
2017-03-01 15:17     ` Christoph Hellwig
2017-02-28 14:57 ` [PATCH 08/12] xfs: implement the F_IOINFO fcntl Christoph Hellwig
2017-02-28 14:57 ` [PATCH 09/12] block: advertize max atomic write limit Christoph Hellwig
2017-02-28 14:57 ` [PATCH 10/12] block_dev: set REQ_NOMERGE for O_ATOMIC writes Christoph Hellwig
2017-02-28 14:57 ` [PATCH 11/12] block_dev: implement the F_IOINFO fcntl Christoph Hellwig
2017-02-28 14:57 ` [PATCH 12/12] nvme: export the atomic write limit Christoph Hellwig
2017-02-28 20:48 ` [RFC] failure atomic writes for file systems and block devices Chris Mason
2017-02-28 20:48   ` Chris Mason
2017-03-01 15:07   ` Christoph Hellwig
2017-02-28 23:22 ` Darrick J. Wong
2017-03-01 15:09   ` Christoph Hellwig
2017-03-01 11:21 ` Amir Goldstein [this message]
2017-03-01 15:07   ` Christoph Hellwig
2017-03-01 15:07     ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAOQ4uxjYwP+XGdDu-HiKmiAVEFydqYj6xfQbaKULB9WWUSuX5g@mail.gmail.com \
    --to=amir73il@gmail.com \
    --cc=hch@lst.de \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=mtk.manpages@gmail.com \
    --subject='Re: [RFC] failure atomic writes for file systems and block devices' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.