linux-kernel.vger.kernel.org archive mirror
From: Akira Hayakawa <ruby.wktk@gmail.com>
To: snitzer@redhat.com
Cc: hch@infradead.org, dm-devel@redhat.com,
	devel@driverdev.osuosl.org, thornber@redhat.com,
	gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org,
	mpatocka@redhat.com, dan.carpenter@oracle.com, joe@perches.com,
	akpm@linux-foundation.org, m.chehab@samsung.com, ejt@redhat.com,
	agk@redhat.com, cesarb@cesarb.net, ruby.wktk@gmail.com
Subject: Re: Reworking dm-writeboost [was: Re: staging: Add dm-writeboost]
Date: Wed, 09 Oct 2013 10:07:34 +0900
Message-ID: <5254AC56.3070608@gmail.com>
In-Reply-To: <20131008152924.GA3644@redhat.com>

Mike,

I am happy to see that
people from the filesystem layer down to the block subsystem
have been discussing how to handle barriers in each layer
almost independently.

>> Merging the barriers and replacing it with a single FLUSH
>> by accepting a lot of writes
>> is the reason for deferring barriers in writeboost.
>> If you want to know further I recommend you to
>> look at the source code to see
>> how queue_barrier_io() is used and
>> how the barriers are kidnapped in queue_flushing().
> 
> AFAICT, this is an unfortunate hack resulting from dm-writeboost being a
> bio-based DM target.  The block layer already has support for FLUSH
> merging, see commit ae1b1539622fb4 ("block: reimplement FLUSH/FUA to
> support merge")

I have read the comments on this patch.
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ae1b1539622fb46e51b4d13b3f9e5f4c713f86ae

My understanding is that
REQ_FUA and REQ_FLUSH are decomposed into more primitive flags
according to the properties of the device.
{PRE|POST}FLUSH requests are queued in one of the two flush_queue lists
(often called the "pending" queue), and
blk_kick_flush is called, which defers the flush. Later,
if a few conditions are satisfied, it actually inserts "a single" flush request
no matter how many flush requests are in the pending queue
(this is judged only by !list_empty(pending)).
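
As a rough illustration of that understanding, a toy user-space model
in C might look like the following. It is only a sketch and not the
kernel code: struct flush_model and the model_* functions are made-up
names, and the only condition modelled is "no flush is already in
flight and the pending queue is not empty".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; these are not the kernel's data structures. */
struct flush_model {
	unsigned pending;        /* flush requests waiting in the pending queue */
	unsigned running;        /* requests covered by the in-flight flush */
	bool in_flight;          /* a device flush was issued and has not completed */
	unsigned flushes_issued; /* flushes actually sent to the device */
	unsigned completed;      /* flush requests completed so far */
};

/* Issue at most one device flush for everything that is pending. */
static bool model_kick_flush(struct flush_model *m)
{
	if (m->in_flight || m->pending == 0)
		return false;    /* deferred: nothing to do, or a flush is running */

	m->running = m->pending; /* one flush covers every pending request */
	m->pending = 0;
	m->in_flight = true;
	m->flushes_issued++;
	return true;
}

/* A new flush request arrives from the upper layer. */
static void model_queue_flush(struct flush_model *m)
{
	m->pending++;
	model_kick_flush(m);
}

/* The device reports that the in-flight flush has finished. */
static void model_flush_done(struct flush_model *m)
{
	assert(m->in_flight);
	m->completed += m->running;
	m->running = 0;
	m->in_flight = false;
	model_kick_flush(m);     /* pick up requests that arrived meanwhile */
}
```

In this model, four flush requests arriving back to back cost two
device flushes: the first request is flushed alone, and the three
that arrive while it is in flight share the second flush.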

If my understanding is correct,
we are deferring flush across three layers.

Let me summarize.
- For the filesystem, Dave said that metadata journaling defers
  barriers.
- For device-mapper, writeboost, dm-cache and dm-thin defer
  barriers.
- For the block layer, it defers barriers and ends up
  merging several requests into one.

I think writeboost cannot discard this deferring hack, because
deferring the barriers is usually very effective at
making it likely that the RAM buffer fills up, which
raises the write throughput and lowers the CPU usage.
However, for particular cases such as the one Dave pointed out,
this hack is just a disturbance.
Unfortunately, even for writeboost, the hack in the patch
is just a disturbance too.
That an upper layer dislikes a lower layer's hidden optimization is
just a limitation of the layered architecture of the Linux kernel.

I think these three layers are all thinking almost the same thing:
these hacks are all good, and having each layer
provide a switch to turn its optimization on or off
is what we have to do as a compromise.

All these problems originate from the fact that
we have volatile caches, and persistent memory can
take these problems away.

With persistent memory provided,
writeboost can switch off the deferring of barriers.
However,
I think a world where all servers are equipped with
persistent memory is still a future tale.
So, my idea is to maintain both modes,
one for each RAM buffer type (volatile, non-volatile),
and in the case of the former type
the deferring hack is a good compromise.
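
The two modes could be sketched roughly as below, again as a toy
user-space model in C rather than writeboost's actual code. The names
(enum buf_type, struct wb_model, wb_write, wb_barrier) are
hypothetical, and the sketch assumes that a barrier can be acknowledged
immediately when the RAM buffer is non-volatile. It has no timer, so
in the volatile mode a deferred barrier waits until the buffer fills.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical switch between the two RAM buffer types. */
enum buf_type { BUF_VOLATILE, BUF_NONVOLATILE };

/* Toy model only; not writeboost's real structures. */
struct wb_model {
	enum buf_type type;
	unsigned capacity;          /* writes the RAM buffer can hold */
	unsigned buffered;          /* writes currently in the RAM buffer */
	unsigned deferred_barriers; /* barriers waiting for the next flush */
	unsigned acked_barriers;    /* barriers acknowledged to the upper layer */
	unsigned flushes;           /* times the RAM buffer was flushed out */
};

/* Write the RAM buffer out once and acknowledge every deferred barrier. */
static void wb_flush_buffer(struct wb_model *wb)
{
	wb->flushes++;
	wb->buffered = 0;
	wb->acked_barriers += wb->deferred_barriers;
	wb->deferred_barriers = 0;
}

/* Accept one write; flush only when the RAM buffer is full. */
static void wb_write(struct wb_model *wb)
{
	assert(wb->capacity > 0);
	wb->buffered++;
	if (wb->buffered == wb->capacity)
		wb_flush_buffer(wb);
}

/* Accept one barrier request. */
static void wb_barrier(struct wb_model *wb)
{
	if (wb->type == BUF_NONVOLATILE) {
		/* Assumption: buffered data is already persistent. */
		wb->acked_barriers++;
		return;
	}
	wb->deferred_barriers++; /* volatile buffer: defer until the flush */
}
```

With a volatile buffer of four writes, two barriers interleaved with
the writes are both acknowledged by the single flush that happens when
the fourth write arrives. With a non-volatile buffer the same barriers
are acknowledged at once and no early flush is needed.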

Akira
