All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Boaz Harrosh <boaz@plexistor.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Jan Kara <jack@suse.cz>, Hugh Dickins <hughd@google.com>,
	Mel Gorman <mgorman@suse.de>,
	linux-mm@kvack.org, linux-nvdimm <linux-nvdimm@ml01.01.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Eryu Guan <eguan@redhat.com>
Subject: Re: [PATCH 3/3] RFC: dax: dax_prepare_freeze
Date: Wed, 25 Mar 2015 20:29:22 +1100	[thread overview]
Message-ID: <20150325092922.GH31342@dastard> (raw)
In-Reply-To: <55126D77.7040105@plexistor.com>

On Wed, Mar 25, 2015 at 10:10:31AM +0200, Boaz Harrosh wrote:
> On 03/25/2015 04:22 AM, Dave Chinner wrote:
> > On Tue, Mar 24, 2015 at 08:14:59AM +0200, Boaz Harrosh wrote:
> <>
> > 
> > Then we have wider problem with DAX, then: sync doesn't work
> > properly. i.e. if we still has write mapped pages, then we haven't
> > flushed dirty cache lines on write-mapped files to the persistent
> > domain by the time sync completes.
> > 
> > So, this shouldn't be some special case that only the freeze code
> > takes into account - we need to make sure that sync (and therefore
> > freeze) flushes all dirty cache lines and marks all mappings
> > clean....
> > 
> 
> This is not how I understood it and how I read the code.
> 
> The sync does happen, .fsync of the FS is called on each
> file just as if the user called it. If this is broken it just
> needs to be fixed there at the .fsync vector. POSIX mandate
> persistence at .fsync so at the vfs layer we rely on that.

right now, the filesystems will see that there are no dirty pages
on the inode, and then just sync the inode metadata. They will do
nothing else as filesystems are not aware of CPU cachelines at all.

> So everything at this stage should be synced to real media.

Actually no. This is what intel are introducing new CPU instructions
for - so fsync can flush the cpu caches and commit them to th
persistence domain correctly.

> What does not happen is writeback. since dax does not have
> any writeback.

Which is precisely the problem we need to address - we don't need
writeback to a block device, but we do need the dirty CPU cachelines
flushed and the mappings cleaned.

> And because of that nothing turned the
> user mappings to read only. This is what I do here but
> instead of write-protecting I just unmap because it is
> easier for me to code it.

That doesn't mean it is the correct solution.

Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com

WARNING: multiple messages have this Message-ID (diff)
From: Dave Chinner <david@fromorbit.com>
To: Boaz Harrosh <boaz@plexistor.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Jan Kara <jack@suse.cz>, Hugh Dickins <hughd@google.com>,
	Mel Gorman <mgorman@suse.de>,
	linux-mm@kvack.org, linux-nvdimm <linux-nvdimm@ml01.01.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Eryu Guan <eguan@redhat.com>
Subject: Re: [PATCH 3/3] RFC: dax: dax_prepare_freeze
Date: Wed, 25 Mar 2015 20:29:22 +1100	[thread overview]
Message-ID: <20150325092922.GH31342@dastard> (raw)
In-Reply-To: <55126D77.7040105@plexistor.com>

On Wed, Mar 25, 2015 at 10:10:31AM +0200, Boaz Harrosh wrote:
> On 03/25/2015 04:22 AM, Dave Chinner wrote:
> > On Tue, Mar 24, 2015 at 08:14:59AM +0200, Boaz Harrosh wrote:
> <>
> > 
> > Then we have wider problem with DAX, then: sync doesn't work
> > properly. i.e. if we still has write mapped pages, then we haven't
> > flushed dirty cache lines on write-mapped files to the persistent
> > domain by the time sync completes.
> > 
> > So, this shouldn't be some special case that only the freeze code
> > takes into account - we need to make sure that sync (and therefore
> > freeze) flushes all dirty cache lines and marks all mappings
> > clean....
> > 
> 
> This is not how I understood it and how I read the code.
> 
> The sync does happen, .fsync of the FS is called on each
> file just as if the user called it. If this is broken it just
> needs to be fixed there at the .fsync vector. POSIX mandate
> persistence at .fsync so at the vfs layer we rely on that.

right now, the filesystems will see that there are no dirty pages
on the inode, and then just sync the inode metadata. They will do
nothing else as filesystems are not aware of CPU cachelines at all.

> So everything at this stage should be synced to real media.

Actually no. This is what intel are introducing new CPU instructions
for - so fsync can flush the cpu caches and commit them to th
persistence domain correctly.

> What does not happen is writeback. since dax does not have
> any writeback.

Which is precisely the problem we need to address - we don't need
writeback to a block device, but we do need the dirty CPU cachelines
flushed and the mappings cleaned.

> And because of that nothing turned the
> user mappings to read only. This is what I do here but
> instead of write-protecting I just unmap because it is
> easier for me to code it.

That doesn't mean it is the correct solution.

Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2015-03-25  9:29 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-23 12:47 [PATCH 0/3 v3] dax: Fix mmap-write not updating c/mtime Boaz Harrosh
2015-03-23 12:47 ` Boaz Harrosh
2015-03-23 12:49 ` [PATCH 1/3] mm: New pfn_mkwrite same as page_mkwrite for VM_PFNMAP Boaz Harrosh
2015-03-23 22:49   ` Andrew Morton
2015-03-23 22:49     ` Andrew Morton
2015-03-23 12:52 ` [PATCH 2/3] dax: use pfn_mkwrite to update c/mtime + freeze protection Boaz Harrosh
2015-03-23 12:54 ` [PATCH 3/3] RFC: dax: dax_prepare_freeze Boaz Harrosh
2015-03-23 22:40   ` Dave Chinner
2015-03-23 22:40     ` Dave Chinner
2015-03-24  6:14     ` Boaz Harrosh
2015-03-24  6:14       ` Boaz Harrosh
2015-03-25  2:22       ` Dave Chinner
2015-03-25  2:22         ` Dave Chinner
2015-03-25  8:10         ` Boaz Harrosh
2015-03-25  9:29           ` Dave Chinner [this message]
2015-03-25  9:29             ` Dave Chinner
2015-03-25 10:19             ` Boaz Harrosh
2015-03-25 10:19               ` Boaz Harrosh
2015-03-25 20:00               ` Dave Chinner
2015-03-25 20:00                 ` Dave Chinner
2015-03-26  8:02                 ` Boaz Harrosh
2015-03-26 20:58                   ` Dave Chinner
2015-03-26 20:58                     ` Dave Chinner
2015-03-24 12:37   ` Boaz Harrosh
2015-03-24 12:37     ` Boaz Harrosh
2015-03-25  2:26     ` Dave Chinner
2015-03-25  2:26       ` Dave Chinner
2015-03-25  8:31       ` Boaz Harrosh
2015-03-25  8:31         ` Boaz Harrosh
2015-03-25  9:41         ` Dave Chinner
2015-03-25  9:41           ` Dave Chinner
2015-03-25 10:40           ` Boaz Harrosh
2015-03-25 10:40             ` Boaz Harrosh
2015-03-25 20:05             ` Dave Chinner
2015-03-25 20:05               ` Dave Chinner
2015-03-23 12:56 ` [PATCH v4] xfstest: generic/080 test that mmap-write updates c/mtime Boaz Harrosh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150325092922.GH31342@dastard \
    --to=david@fromorbit.com \
    --cc=akpm@linux-foundation.org \
    --cc=boaz@plexistor.com \
    --cc=eguan@redhat.com \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@ml01.01.org \
    --cc=matthew.r.wilcox@intel.com \
    --cc=mgorman@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.