LKML Archive on lore.kernel.org
 help / Atom feed
From: Jan Kara <jack@suse.cz>
To: Dave Chinner <david@fromorbit.com>
Cc: Jerome Glisse <jglisse@redhat.com>, Jan Kara <jack@suse.cz>,
	John Hubbard <jhubbard@nvidia.com>,
	Matthew Wilcox <willy@infradead.org>,
	Dan Williams <dan.j.williams@intel.com>,
	John Hubbard <john.hubbard@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux MM <linux-mm@kvack.org>,
	tom@talpey.com, Al Viro <viro@zeniv.linux.org.uk>,
	benve@cisco.com, Christoph Hellwig <hch@infradead.org>,
	Christopher Lameter <cl@linux.com>,
	"Dalessandro, Dennis" <dennis.dalessandro@intel.com>,
	Doug Ledford <dledford@redhat.com>,
	Jason Gunthorpe <jgg@ziepe.ca>, Michal Hocko <mhocko@kernel.org>,
	mike.marciniszyn@intel.com, rcampbell@nvidia.com,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions
Date: Fri, 14 Dec 2018 16:43:21 +0100
Message-ID: <20181214154321.GF8896@quack2.suse.cz> (raw)
In-Reply-To: <20181212214641.GB29416@dastard>

Hi!

On Thu 13-12-18 08:46:41, Dave Chinner wrote:
> On Wed, Dec 12, 2018 at 10:03:20AM -0500, Jerome Glisse wrote:
> > On Mon, Dec 10, 2018 at 11:28:46AM +0100, Jan Kara wrote:
> > > On Fri 07-12-18 21:24:46, Jerome Glisse wrote:
> > > So this approach doesn't look like a win to me over using counter in struct
> > > page and I'd rather try looking into squeezing HMM public page usage of
> > > struct page so that we can fit that gup counter there as well. I know that
> > > it may be easier said than done...
> > 
> > So i want back to the drawing board and first i would like to ascertain
> > that we all agree on what the objectives are:
> > 
> >     [O1] Avoid write back from a page still being written by either a
> >          device or some direct I/O or any other existing user of GUP.
> >          This would avoid possible file system corruption.
> > 
> >     [O2] Avoid crash when set_page_dirty() is call on a page that is
> >          considered clean by core mm (buffer head have been remove and
> >          with some file system this turns into an ugly mess).
> 
> I think that's wrong. This isn't an "avoid a crash" case, this is a
> "prevent data and/or filesystem corruption" case. The primary goal
> we have here is removing our exposure to potential corruption, which
> has the secondary effect of avoiding the crash/panics that currently
> occur as a result of inconsistent page/filesystem state.
> 
> i.e. The goal is to have ->page_mkwrite() called on the clean page
> /before/ the file-backed page is marked dirty, and hence we don't
> expose ourselves to potential corruption or crashes that are a
> result of inappropriately calling set_page_dirty() on clean
> file-backed pages.

I agree that [O1] - i.e., avoid corrupting fs data - is more important and
[O2] is just one consequence of [O1].

> > For [O1] and [O2] i believe a solution with mapcount would work. So
> > no new struct, no fake vma, nothing like that. In GUP for file back
> > pages we increment both refcount and mapcount (we also need a special
> > put_user_page to decrement mapcount when GUP user are done with the
> > page).
> 
> I don't see how a mapcount can prevent anyone from calling
> set_page_dirty() inappropriately.
> 
> > Now for [O1] the write back have to call page_mkclean() to go through
> > all reverse mapping of the page and map read only. This means that
> > we can count the number of real mapping and see if the mapcount is
> > bigger than that. If mapcount is bigger than page is pin and we need
> > to use a bounce page to do the writeback.
> 
> Doesn't work. Generally filesystems have already mapped the page
> into bios before they call clear_page_dirty_for_io(), so it's too
> late for the filesystem to bounce the page at that point.

Yes, for filesystem it is too late. But the plan we figured back in October
was to do the bouncing in the block layer. I.e., mark the bio (or just the
particular page) as needing bouncing and then use the existing page
bouncing mechanism in the block layer to do the bouncing for us. Ext3 (when
it was still a separate fs driver) has been using a mechanism like this to
make DIF/DIX work with its metadata.

> > For [O2] i believe we can handle that case in the put_user_page()
> > function to properly dirty the page without causing filesystem
> > freak out.
> 
> I'm pretty sure you can't call ->page_mkwrite() from
> put_user_page(), so I don't think this is workable at all.

Yes, calling ->page_mkwrite() in put_user_page() is not only technically
complicated but also too late - DMA has already modified page contents.
What we planned to do (again discussed back in October) was to never allow
the pinned page to become clean. I.e., clear_page_dirty_for_io() would
leave pinned pages dirty. Also we would skip pinned pages for WB_SYNC_NONE
writeback as there's no point in that really. That way MM and filesystems
would be aware of the real page state - i.e., what's in memory is not in
sync (potentially) with what's on disk. I was thinking whether this
permanently-dirty state couldn't confuse filesystem in some way but I
didn't find anything serious - the worst I could think of are places that
do filemap_write_and_wait() and then invalidate page cache e.g. before hole
punching or extent shifting. But these should work fine as is (page cache
invalidation will just happily truncate dirty pages). DIO might get
confused by the inability to invalidate dirty pages but then user combining
RDMA with DIO on the same file at one moment gets what he deserves...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  parent reply index

Thread overview: 179+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-04  0:17 [PATCH 0/2] put_user_page*(): start converting the call sites john.hubbard
2018-12-04  0:17 ` [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions john.hubbard
2018-12-04  7:53   ` Mike Rapoport
2018-12-05  1:40     ` John Hubbard
2018-12-04 20:28   ` Dan Williams
2018-12-04 21:56     ` John Hubbard
2018-12-04 23:03       ` Dan Williams
2018-12-05  0:36         ` Jerome Glisse
2018-12-05  0:40           ` Dan Williams
2018-12-05  0:59             ` John Hubbard
2018-12-05  0:58         ` John Hubbard
2018-12-05  1:00           ` Dan Williams
2018-12-05  1:15           ` Matthew Wilcox
2018-12-05  1:44             ` Jerome Glisse
2018-12-05  1:57               ` John Hubbard
2018-12-07  2:45                 ` John Hubbard
2018-12-07 19:16                   ` Jerome Glisse
2018-12-07 19:26                     ` Dan Williams
2018-12-07 19:40                       ` Jerome Glisse
2018-12-08  0:52                     ` John Hubbard
2018-12-08  2:24                       ` Jerome Glisse
2018-12-10 10:28                         ` Jan Kara
2018-12-12 15:03                           ` Jerome Glisse
2018-12-12 16:27                             ` Dan Williams
2018-12-12 17:02                               ` Jerome Glisse
2018-12-12 17:49                                 ` Dan Williams
2018-12-12 19:07                                   ` John Hubbard
2018-12-12 21:30                               ` Jerome Glisse
2018-12-12 21:40                                 ` Dan Williams
2018-12-12 21:53                                   ` Jerome Glisse
2018-12-12 22:11                                     ` Matthew Wilcox
2018-12-12 22:16                                       ` Jerome Glisse
2018-12-12 23:37                                     ` Jason Gunthorpe
2018-12-12 23:46                                       ` John Hubbard
2018-12-12 23:54                                       ` Dan Williams
2018-12-13  0:01                                       ` Jerome Glisse
2018-12-13  0:18                                         ` Dan Williams
2018-12-13  0:44                                           ` Jerome Glisse
2018-12-13  3:26                                             ` Jason Gunthorpe
2018-12-13  3:20                                         ` Jason Gunthorpe
2018-12-13 12:43                                           ` Jerome Glisse
2018-12-13 13:40                                             ` Tom Talpey
2018-12-13 14:18                                               ` Jerome Glisse
2018-12-13 14:51                                                 ` Tom Talpey
2018-12-13 15:18                                                   ` Jerome Glisse
2018-12-13 18:12                                                     ` Tom Talpey
2018-12-13 19:18                                                       ` Jerome Glisse
2018-12-14 10:41                                             ` Jan Kara
2018-12-14 15:25                                               ` Jerome Glisse
2018-12-12 21:56                                 ` John Hubbard
2018-12-12 22:04                                   ` Jerome Glisse
2018-12-12 22:11                                     ` John Hubbard
2018-12-12 22:14                                       ` Jerome Glisse
2018-12-12 22:17                                         ` John Hubbard
2018-12-12 21:46                             ` Dave Chinner
2018-12-12 21:59                               ` Jerome Glisse
2018-12-13  0:51                                 ` Dave Chinner
2018-12-13  2:02                                   ` Jerome Glisse
2018-12-13 15:56                                     ` Christopher Lameter
2018-12-13 16:02                                       ` Jerome Glisse
2018-12-14  6:00                                     ` Dave Chinner
2018-12-14 15:13                                       ` Jerome Glisse
2018-12-14  3:52                                   ` John Hubbard
2018-12-14  5:21                                     ` Dan Williams
2018-12-14  6:11                                       ` John Hubbard
2018-12-14 15:20                                         ` Jerome Glisse
2018-12-14 19:38                                         ` Dan Williams
2018-12-14 19:48                                           ` Matthew Wilcox
2018-12-14 19:53                                             ` Dave Hansen
2018-12-14 20:03                                               ` Matthew Wilcox
2018-12-14 20:17                                                 ` Dan Williams
2018-12-14 20:29                                                   ` Matthew Wilcox
2018-12-15  0:41                                                 ` John Hubbard
2018-12-17  8:56                                           ` Jan Kara
2018-12-17 18:28                                             ` Dan Williams
2018-12-14 15:43                               ` Jan Kara [this message]
2018-12-16 21:58                                 ` Dave Chinner
2018-12-17 18:11                                   ` Jerome Glisse
2018-12-17 18:34                                     ` Matthew Wilcox
2018-12-17 19:48                                       ` Jerome Glisse
2018-12-17 19:51                                         ` Matthew Wilcox
2018-12-17 19:54                                           ` Jerome Glisse
2018-12-17 19:59                                             ` Matthew Wilcox
2018-12-17 20:55                                               ` Jerome Glisse
2018-12-17 21:03                                                 ` Matthew Wilcox
2018-12-17 21:15                                                   ` Jerome Glisse
2018-12-18  1:09                                       ` Dave Chinner
2018-12-18  6:12                                       ` Darrick J. Wong
2018-12-18  9:30                                       ` Jan Kara
2018-12-18 23:29                                         ` John Hubbard
2018-12-19  2:07                                           ` Jerome Glisse
2018-12-19 11:08                                             ` Jan Kara
2018-12-20 10:54                                               ` John Hubbard
2018-12-20 16:50                                                 ` Jerome Glisse
2018-12-20 16:57                                                   ` Dan Williams
2018-12-20 16:49                                               ` Jerome Glisse
2019-01-03  1:55                                               ` Jerome Glisse
2019-01-03  3:27                                                 ` John Hubbard
2019-01-03 14:57                                                   ` Jerome Glisse
2019-01-03  9:26                                                 ` Jan Kara
2019-01-03 14:44                                                   ` Jerome Glisse
2019-01-11  2:59                                                     ` John Hubbard
2019-01-11 16:51                                                       ` Jerome Glisse
2019-01-12  1:04                                                         ` John Hubbard
2019-01-12  2:02                                                           ` Jerome Glisse
2019-01-12  2:38                                                             ` John Hubbard
2019-01-12  2:46                                                               ` Jerome Glisse
2019-01-12  3:06                                                                 ` John Hubbard
2019-01-12  3:25                                                                   ` Jerome Glisse
2019-01-12 20:46                                                                     ` John Hubbard
2019-01-14 14:54                                                                   ` Jan Kara
2019-01-14 17:21                                                                     ` Jerome Glisse
2019-01-14 19:09                                                                       ` John Hubbard
2019-01-15  8:34                                                                         ` Jan Kara
2019-01-15 21:39                                                                           ` John Hubbard
2019-01-15  8:07                                                                       ` Jan Kara
2019-01-15 17:15                                                                         ` Jerome Glisse
2019-01-15 21:56                                                                           ` John Hubbard
2019-01-15 22:12                                                                             ` Jerome Glisse
2019-01-16  0:44                                                                               ` John Hubbard
2019-01-16  1:56                                                                                 ` Jerome Glisse
2019-01-16  2:01                                                                                   ` Dan Williams
2019-01-16  2:23                                                                                     ` Jerome Glisse
2019-01-16  4:34                                                                                       ` Dave Chinner
2019-01-16 14:50                                                                                         ` Jerome Glisse
2019-01-16 22:51                                                                                           ` Dave Chinner
2019-01-16 11:38                                                                         ` Jan Kara
2019-01-16 13:08                                                                           ` Jerome Glisse
2019-01-17  5:42                                                                             ` John Hubbard
2019-01-17 15:21                                                                               ` Jerome Glisse
2019-01-18  0:16                                                                                 ` Dave Chinner
2019-01-18  1:59                                                                                   ` Jerome Glisse
2019-01-17  9:30                                                                             ` Jan Kara
2019-01-17 15:17                                                                               ` Jerome Glisse
2019-01-22 15:24                                                                                 ` Jan Kara
2019-01-22 16:46                                                                                   ` Jerome Glisse
2019-01-23 18:02                                                                                     ` Jan Kara
2019-01-23 19:04                                                                                       ` Jerome Glisse
2019-01-29  0:22                                                                                         ` John Hubbard
2019-01-29  1:23                                                                                           ` Jerome Glisse
2019-01-29  6:41                                                                                             ` John Hubbard
2019-01-29 10:12                                                                                               ` Jan Kara
2019-01-30  2:21                                                                                                 ` John Hubbard
2019-01-17  5:25                                                                         ` John Hubbard
2019-01-17  9:04                                                                           ` Jan Kara
2019-01-12  3:14                                                               ` Jerome Glisse
2018-12-18 10:33                                   ` Jan Kara
2018-12-18 23:42                                     ` Dave Chinner
2018-12-19  3:03                                       ` Jason Gunthorpe
2018-12-19  5:26                                         ` Dan Williams
2018-12-19 11:19                                           ` Jan Kara
2018-12-19 10:28                                         ` Dave Chinner
2018-12-19 11:35                                           ` Jan Kara
2018-12-19 16:56                                             ` Jason Gunthorpe
2018-12-19 22:33                                             ` Dave Chinner
2018-12-20  9:07                                               ` Jan Kara
2018-12-20 16:54                                               ` Jerome Glisse
2018-12-19 13:24                                       ` Jan Kara
2018-12-08  5:18                       ` Matthew Wilcox
2018-12-12 19:13                         ` John Hubbard
2018-12-08  7:16                       ` Dan Williams
2018-12-08 16:33                         ` Jerome Glisse
2018-12-08 16:48                           ` Christoph Hellwig
2018-12-08 17:47                             ` Jerome Glisse
2018-12-08 18:26                               ` Christoph Hellwig
2018-12-08 18:45                                 ` Jerome Glisse
2018-12-08 18:09                             ` Dan Williams
2018-12-08 18:12                               ` Christoph Hellwig
2018-12-11  6:18                               ` Dave Chinner
2018-12-05  5:52             ` Dan Williams
2018-12-05 11:16       ` Jan Kara
2018-12-04  0:17 ` [PATCH 2/2] infiniband/mm: convert put_page() to put_user_page*() john.hubbard
2018-12-04 17:10 ` [PATCH 0/2] put_user_page*(): start converting the call sites David Laight
2018-12-05  1:05   ` John Hubbard
2018-12-05 14:08     ` David Laight
2018-12-28  8:37       ` Pavel Machek
2019-02-08  7:56 [PATCH 0/2] mm: put_user_page() call site conversion first john.hubbard
2019-02-08  7:56 ` [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions john.hubbard
2019-02-08 10:32   ` Mike Rapoport
2019-02-08 20:44     ` John Hubbard

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181214154321.GF8896@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=benve@cisco.com \
    --cc=cl@linux.com \
    --cc=dan.j.williams@intel.com \
    --cc=david@fromorbit.com \
    --cc=dennis.dalessandro@intel.com \
    --cc=dledford@redhat.com \
    --cc=hch@infradead.org \
    --cc=jgg@ziepe.ca \
    --cc=jglisse@redhat.com \
    --cc=jhubbard@nvidia.com \
    --cc=john.hubbard@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mike.marciniszyn@intel.com \
    --cc=rcampbell@nvidia.com \
    --cc=tom@talpey.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org linux-kernel@archiver.kernel.org
	public-inbox-index lkml


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox