From: Benjamin ESTRABAUD <be@mpstor.com>
To: NeilBrown <neilb@suse.de>
Cc: Ric Wheeler <ricwheeler@gmail.com>,
	Chris Friesen <chris.friesen@genband.com>,
	linux-raid@vger.kernel.org
Subject: Re: RFC: use TRIM data from filesystems to speed up array rebuild?
Date: Thu, 06 Sep 2012 18:17:35 +0100
Message-ID: <5048DAAF.8060300@mpstor.com>
In-Reply-To: <20120905062405.3741239a@notabene.brown>

On 04/09/12 21:24, NeilBrown wrote:
> On Tue, 04 Sep 2012 15:11:26 -0400 Ric Wheeler<ricwheeler@gmail.com>  wrote:
>
>> On 09/04/2012 02:06 PM, Chris Friesen wrote:
>>> Hi,
>>>
>>> I'm not really a filesystem guy so this may be a really dumb question.
>>>
>>> We currently have an issue where we have a ~1TB RAID1 array that is mostly
>>> given over to LVM.  If we swap one of the disks it will rebuild everything,
>>> even though we may only be using a small fraction of the space.
>>>
>>> This got me thinking.  Has anyone given thought to using the TRIM information
>>> from filesystems to allow the RAID code to maintain a bitmask of used disk
>>> blocks and only sync the ones that are actually used?
>>>
>>> Presumably this bitmask would itself need to be stored on the disk.
>>>
>>> Thanks,
>>> Chris
>>>
>> Device mapper has a "thin" target now that tracks blocks that are allocated or
>> free (and works with discard).
>>
>> That might be a basis for doing a focused RAID rebuild,
> I wonder how....
> Maybe the block-layer interface could grow something equivalent to
> "SEEK_HOLE" and friends so that the upper level can find "holes" and
> "allocated space" in the underlying device.
> I wonder if it is time to discard the 'block device' abstraction and just use
> files every .... but I seriously doubt it.
>
> NeilBrown
Hi,

I've got a brief question about this feature that seems extremely promising:

You mentioned on your blog:

"A 'write' to a non-in-sync region should cause that region to be 
resynced. Writing zeros would in some sense be ideal, but to do that we 
would have to block the write, which would be unfortunate."

So, if a write hit a "non-in-sync" region (let's imagine the bitmap has
1M granularity), would we compute and update the parity of every stripe
that the write touches? And is the point of zeroing the region to avoid
reading and rewriting stripe data just to compute parity, both for the
stripes the write touches and for the other stripes in the same
"non-in-sync" region that the write doesn't affect, so that the entire
region can then be flipped to "clean"?

Would this open the door to some "thin provisioned" MD RAID, where one
could grow the underlying devices (in the case of a RAID built on top
of, say, LVM devices) and mark the new space as "non-in-sync" without
disrupting (slowing) operations on the array with a sync?
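
As a rough illustration of that idea (again a hypothetical sketch, not
how MD handles a grow today), extending the bitmap could be as cheap as
appending non-in-sync entries for the new space. The function name and
the 1 MiB region size are assumptions, and the old array size is assumed
to be a multiple of the region size.

```python
REGION_SIZE = 1 << 20  # assumed 1 MiB bitmap granularity


def grow_bitmap(in_sync, new_size_bytes):
    """Return a bitmap covering new_size_bytes. Existing regions keep
    their state; newly added regions start out non-in-sync, so no
    resync is needed at grow time."""
    needed = -(-new_size_bytes // REGION_SIZE)  # ceiling division
    if needed < len(in_sync):
        raise ValueError("shrinking is not handled in this sketch")
    return in_sync + [False] * (needed - len(in_sync))


old = [True, True]                        # 2 MiB array, fully in sync
new = grow_bitmap(old, 4 * REGION_SIZE)   # grown to 4 MiB
print(new)  # [True, True, False, False]
```

The new regions would then only be synced lazily, when a write first
touches them.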

In any case, this seems like a great feature.

Regards,
Ben.

