From: "Jiang, Dave" <dave.jiang@intel.com>
To: "Williams, Dan J" <dan.j.williams@intel.com>
Cc: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>,
	"neilb@suse.de" <neilb@suse.de>,
	"Kernel-team@fb.com" <Kernel-team@fb.com>,
	"piergiorgio.sartor@nexgo.de" <piergiorgio.sartor@nexgo.de>,
	"shli@fb.com" <shli@fb.com>,
	"songliubraving@fb.com" <songliubraving@fb.com>
Subject: Re: [RFC] raid5: add a log device to fix raid5/6 write hole issue
Date: Wed, 1 Apr 2015 20:07:01 +0000
Message-ID: <1427918821.125860.139.camel@intel.com>
In-Reply-To: <CAPcyv4jQZ6jeQpFU+cZ-AXzfQVXQCdK1muxNO35pCdavrggh4Q@mail.gmail.com>


On Wed, 2015-04-01 at 18:46 +0000, Williams, Dan J wrote:
> On Wed, Apr 1, 2015 at 11:36 AM, Piergiorgio Sartor
> <piergiorgio.sartor@nexgo.de> wrote:
> > On Tue, Mar 31, 2015 at 08:47:04PM -0700, Dan Williams wrote:
> >> On Mon, Mar 30, 2015 at 3:25 PM, Shaohua Li <shli@fb.com> wrote:
> >> > This is my attempt to fix the raid5/6 write hole issue. It's not ready
> >> > for merge yet; I'm posting it for comments. Any comments and
> >> > suggestions are welcome!
> >> >
> >> > Thanks,
> >> > Shaohua
> >> >
> >> > We want a complete raid5/6 stack with reliability and high
> >> > performance. Currently raid5/6 has 2 issues:
> >> >
> >> > 1. read-modify-write for small IO. To fix this issue, a cache layer
> >> > above raid5/6 can be used to aggregate writes into full-stripe writes.
> >> > 2. write hole issue. A write log below raid5/6 can fix the issue.
> >> >
> >> > We plan to use an SSD to fix the two issues. Here we just fix the write
> >> > hole issue.
> >> >
> >> > 1. We don't try to fix the issues together. A cache layer will do write
> >> > acceleration. A log layer will fix the write hole. The separation will
> >> > simplify things a lot.
> >> >
> >> > 2. The current assumption is that flashcache/bcache will be used as the
> >> > cache layer. If they don't work well, we can fix them or add a simple
> >> > cache layer for raid write aggregation later. We also assume the cache
> >> > layer will absorb writes, so the log doesn't need to worry about write
> >> > latency.
> >>
> >> It seems neither bcache nor dm-cache is tackling the write-buffering
> >> problem head on... they still seem to be concerned with some amount of
> >> read caching, which I can see as useful for file servers and
> >> workstations, but not necessarily for scale-out storage.
> >>
> >> I'll try to set aside time to take a look at the patch this week.
> >
> > There is one thing I do not really get.
> >
> > The target is to avoid the "write hole", which happens,
> > for example, when there is a sudden power failure.
> >
> > Now, how can it be assured, in that case, that the "cache"
> > device is safe after the power is restored?
> 
> If you lose the cache the data-loss damage is greater, but this has
> always been the case with hardware-raid adapters.
> 
> > Doesn't this solution just shift the problem from
> > the array to a different device (SSD, for example)?
> >
> > Speaking of SSD, these are quite "power failure"
> > sensitive, it seems...
> 
> Simple: if a cache device is not itself power-failure safe, then it
> should not be used for power-failure protection.

I think this would be a good application for some of the newer
technologies coming out, such as NVDIMM and persistent memory.
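
To make the write hole and the role of the log concrete, here is a
minimal Python sketch. It is a toy model, not the RFC patch: the names
Stripe, write_d0 and recover are hypothetical, the stripe has only two
data blocks plus XOR parity, and the log is assumed to be durable across
power loss (the condition Dan states above, and the property NVDIMM or
persistent memory would be expected to provide).

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equal-length blocks (RAID5-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


class Stripe:
    """Toy RAID5 stripe: two data blocks and one parity block."""

    def __init__(self, d0: bytes, d1: bytes):
        self.blocks = {"d0": d0, "d1": d1, "p": xor_blocks(d0, d1)}

    def consistent(self) -> bool:
        return self.blocks["p"] == xor_blocks(self.blocks["d0"], self.blocks["d1"])

    def rebuild(self, lost: str) -> bytes:
        """Reconstruct one lost block from the two surviving blocks."""
        others = [v for k, v in self.blocks.items() if k != lost]
        return xor_blocks(*others)


def write_d0(stripe, new_d0, log=None, crash_after_data=False):
    """Update d0 and parity; with a log, record data and parity first."""
    new_p = xor_blocks(new_d0, stripe.blocks["d1"])
    if log is not None:
        # Assumption: this append is durable before any array write starts.
        log.append({"d0": new_d0, "p": new_p})
    stripe.blocks["d0"] = new_d0
    if crash_after_data:
        return  # simulated power loss before the parity write
    stripe.blocks["p"] = new_p
    if log is not None:
        log.pop()  # both writes landed, so the entry can be reclaimed


def recover(stripe, log):
    """Replay logged writes after a crash; replaying twice is harmless."""
    for entry in log:
        stripe.blocks.update(entry)
    log.clear()


# Without a log: the crash leaves stale parity, so a later loss of d1
# rebuilds the wrong contents even though d1 was never written.
s = Stripe(b"AAAA", b"BBBB")
write_d0(s, b"CCCC", crash_after_data=True)
no_log_consistent = s.consistent()
no_log_rebuilt_d1 = s.rebuild("d1")

# With a log: recovery replays the entry and the stripe is whole again.
log = []
t = Stripe(b"AAAA", b"BBBB")
write_d0(t, b"CCCC", log=log, crash_after_data=True)
recover(t, log)
logged_consistent = t.consistent()
logged_rebuilt_d1 = t.rebuild("d1")
```

The sketch only helps if the log entry survives the crash. If the log
device can itself lose or tear writes on power failure, recover() has
nothing trustworthy to replay, which is why the log's own persistence
matters here.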

Thread overview: 22+ messages
2015-03-30 22:25 [RFC] raid5: add a log device to fix raid5/6 write hole issue Shaohua Li
2015-04-01  3:47 ` Dan Williams
2015-04-01  5:53   ` Shaohua Li
2015-04-01  6:02     ` NeilBrown
2015-04-01 17:14       ` Shaohua Li
2015-04-01 18:36   ` Piergiorgio Sartor
2015-04-01 18:46     ` Dan Williams
2015-04-01 20:07       ` Jiang, Dave [this message]
2015-04-01 18:46     ` Alireza Haghdoost
2015-04-01 19:57       ` Wols Lists
2015-04-01 20:04         ` Alireza Haghdoost
2015-04-01 20:18           ` Wols Lists
2015-04-01 20:17         ` Jens Axboe
2015-04-01 21:53 ` NeilBrown
2015-04-01 23:40   ` Shaohua Li
2015-04-02  0:19     ` NeilBrown
2015-04-02  4:07       ` Shaohua Li
2015-04-09  0:43         ` Shaohua Li
2015-04-09  5:04           ` NeilBrown
2015-04-09  6:15             ` Shaohua Li
2015-04-09 15:37               ` Dan Williams
2015-04-09 16:03                 ` Shaohua Li
