linux-mm.kvack.org archive mirror
From: Christopher Lameter <cl@linux.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: linux-rdma@vger.kernel.org, linux-mm@kvack.org,
	Michal Hocko <mhocko@kernel.org>,
	Dan Williams <dan.j.williams@intel.com>
Subject: Re: [LSFMM] RDMA data corruption potential during FS writeback
Date: Fri, 18 May 2018 16:47:48 +0000
Message-ID: <0100016374267882-16b274b1-d6f6-4c13-94bb-8e78a51e9091-000000@email.amazonses.com>
In-Reply-To: <20180518154945.GC15611@ziepe.ca>

On Fri, 18 May 2018, Jason Gunthorpe wrote:

> > The solution that was proposed at the meeting was that mmu notifiers can
> > remedy that situation by allowing callbacks to the RDMA device to ensure
> > that the RDMA device and the filesystem do not do concurrent writeback.
>
> This keeps coming up, and I understand why it seems appealing from the
> MM side, but the reality is that very little RDMA hardware supports
> this, and it carries with it a fairly big performance penalty so many
> users don't like using it.

OK, so we have a latent data corruption issue that is not being addressed.

> > But could we do more to prevent issues here? I think what may be useful is
> > to not allow memory registrations of file-backed writable mappings
> > unless the device driver provides mmu callbacks or something like that.
>
> Why does every proposed solution to this involve crippling RDMA? Are
> there really no ideas to allow the FS side to accommodate
> this use case??

The newcomer here is RDMA. The FS side is the mainstream use case and has
been there since Unix learned to do paging.

> > There may even be more issues if DAX is being used but the FS writeback
> > has the potential of biting anyone at this point it seems.
>
> I think Dan already 'solved' this via get_user_pages_longterm which
> just fails for DAX backed pages.

That is indeed crippling and would kill the ideas that we had around
here for using DAX. There needs to be a way to shove large amounts of
data into memory via RDMA and from there onto a disk without too much
fuss and without copying. In the case of DAX this should trivially avoid
the copy to disk since the data is already in memory. If this does not
work then the whole thing is no longer high-performance, since it
requires a copy operation.

Thread overview: 13+ messages
2018-05-18 14:37 [LSFMM] RDMA data corruption potential during FS writeback Christopher Lameter
2018-05-18 15:49 ` Jason Gunthorpe
2018-05-18 16:47   ` Christopher Lameter [this message]
2018-05-18 17:36     ` Jason Gunthorpe
2018-05-18 20:23       ` Dan Williams
2018-05-19  2:33         ` John Hubbard
2018-05-19  3:24           ` Jason Gunthorpe
2018-05-19  3:51             ` Dan Williams
2018-05-19  5:38               ` John Hubbard
2018-05-21 14:38               ` Matthew Wilcox
2018-05-23 23:03                 ` John Hubbard
2018-05-21 13:37             ` Christopher Lameter
2018-05-21 13:59           ` Christopher Lameter
