From: Boaz Harrosh <boaz@plexistor.com>
To: Dave Chinner <david@fromorbit.com>, Boaz Harrosh <boaz@plexistor.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Jan Kara <jack@suse.cz>, Hugh Dickins <hughd@google.com>,
	Mel Gorman <mgorman@suse.de>,
	linux-mm@kvack.org, linux-nvdimm <linux-nvdimm@ml01.01.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Eryu Guan <eguan@redhat.com>
Subject: Re: [PATCH 3/3] RFC: dax: dax_prepare_freeze
Date: Wed, 25 Mar 2015 12:40:44 +0200
Message-ID: <551290AC.7080402@plexistor.com>
In-Reply-To: <20150325094135.GI31342@dastard>

On 03/25/2015 11:41 AM, Dave Chinner wrote:
> On Wed, Mar 25, 2015 at 10:31:22AM +0200, Boaz Harrosh wrote:
>> On 03/25/2015 04:26 AM, Dave Chinner wrote:
<>
>> sync and fsync should and will work correctly, but this does not
>> solve our problem, because what turns pages read-only is the
>> writeback, and we do not have this in dax. Therefore we need to
>> do this here as a special case.
> 
> We can still use exactly the same dirty tracking as we use for data
> writeback. The difference is that we don't need to go through all
> the page writeback; we can just flush the CPU caches and mark all
> the mappings clean, then clear the I_DIRTY_PAGES flag and move on to
> inode writeback....
> 

I see what you mean: the sb-wide sync will not step into mmapped inodes
and fsync them.

Suppose we go my way and write NT (Non-Temporal) style in the kernel.
NT instructions have existed since Xeon, and all the Intel iX Core CPUs
have them. In tests we conducted, Xeon NT-writes vs
regular-writes-and-cl_flush at .fsync showed a minimum of 20% improvement.
That is on very large IOs; on 4k IOs it was even better.
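
To make the two variants concrete, here is a rough user-space sketch,
assuming x86-64 with SSE2 and a 64-byte cache line. The names copy_nt and
copy_then_flush are made up for illustration; this is not the kernel code
and not the code used in the tests mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <emmintrin.h>   /* SSE2: _mm_stream_si64, _mm_clflush, _mm_sfence */

#define CACHELINE 64     /* assumption: 64-byte cache lines */

/*
 * Variant 1 (hypothetical helper): non-temporal stores.
 * The data bypasses the cache, so no per-line flush is needed afterwards,
 * only a store fence to order the streaming stores.
 */
void copy_nt(void *dst, const void *src, size_t len)
{
	long long *d = dst;
	const unsigned char *s = src;

	/* Sketch restriction: 8-byte aligned destination, length multiple of 8. */
	assert(((uintptr_t)dst % 8) == 0 && (len % 8) == 0);

	for (size_t i = 0; i < len / 8; i++) {
		long long v;

		memcpy(&v, s + i * 8, sizeof(v));  /* safe unaligned load */
		_mm_stream_si64(&d[i], v);         /* movnti */
	}
	_mm_sfence();
}

/*
 * Variant 2 (hypothetical helper): regular cached copy, then flush every
 * cache line that covers the destination range, as would happen at fsync.
 */
void copy_then_flush(void *dst, const void *src, size_t len)
{
	uintptr_t p, end;

	memcpy(dst, src, len);
	if (len == 0)
		return;

	p = (uintptr_t)dst & ~(uintptr_t)(CACHELINE - 1);
	end = (uintptr_t)dst + len;
	for (; p < end; p += CACHELINE)
		_mm_clflush((const void *)p);
	_mm_sfence();
}
```

Both helpers leave the same bytes in the destination; the difference is
only in whether the cache is polluted and has to be flushed later.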

It looks like you have a much better picture in your mind of how to
fit this properly into the inode-dirty picture. Can you attempt a rough draft?

If we are going the NT way, then we can I_DIRTY_ track only the mmapped
inodes. For me this is really scary because I do not want to trigger
any writeback threads. If you could please draw me an outline (or write
something up ;-)) it would be great.

> Cheers,
> Dave.

Thanks
Boaz




