From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH RFC 00/14] Yet Another In-band(online) deduplication implement
Date: Tue, 28 Jul 2015 16:56:10 +0800
Message-ID: <55B743AA.80906@cn.fujitsu.com>
In-Reply-To: <1438072250-2871-1-git-send-email-quwenruo@cn.fujitsu.com>

Oh, there seems to be something wrong with the internal mail server.

The code and patches are also available on GitHub, as only the first 4
patches were sent successfully...

https://github.com/adam900710/linux/tree/dedup

Thanks,
Qu

Qu Wenruo wrote on 2015/07/28 16:30 +0800:
> Although Liu Bo has already submitted a V10 version of his deduplication
> implementation, here is another implementation of it.
>
> [[CORE FEATURES]]
> The main design concept is the following:
> 1) Controllable memory usage
> 2) No guarantee to dedup every duplication.
> 3) No on-disk format change or new format
> 4) Page size level deduplication
>
> [[IMPLEMENTATION]]
> Implementation details include the following:
> 1) LRU hash maps to limit the memory usage
>     The hash -> extent mapping is controlled by LRU (or unlimited), to
>     get a controllable memory usage (can be tuned by mount option)
>     along with controllable read/write overhead used for hash searching.
>
> 2) Reuse existing ordered_extent infrastructure
>     For a duplicated page, it still submits an ordered_extent (only one
>     page long) to make full use of the existing infrastructure; it just
>     does not submit a bio.
>     This reduces the number of code lines.
>
> 3) Mount option to control dedup behavior
>     Deduplication and its memory usage can be tuned by mount option.
>     No dedicated ioctl interface is needed.
>     Furthermore, it can easily support a BTRFS_INODE flag, like
>     compression does, to allow further per-file dedup fine tuning.
>
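
The LRU-limited hash -> extent map from item 1) and the page-size granularity from the core features can be modelled in user space roughly as follows. This is only a sketch, not the patchset's code: the class and function names, the `limit` parameter (standing in for the mount-option tunable) and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
from collections import OrderedDict

PAGE_SIZE = 4096  # dedup granularity: one page


class DedupHashLRU:
    """Toy hash -> extent map with a bounded number of entries (hypothetical)."""

    def __init__(self, limit):
        self.limit = limit            # assumed knob, stands in for the mount option
        self.entries = OrderedDict()  # digest -> bytenr of the extent holding that page

    def search(self, digest):
        bytenr = self.entries.get(digest)
        if bytenr is not None:
            self.entries.move_to_end(digest)  # a hit makes the entry most recently used
        return bytenr

    def add(self, digest, bytenr):
        self.entries[digest] = bytenr
        self.entries.move_to_end(digest)
        while len(self.entries) > self.limit:
            self.entries.popitem(last=False)  # evict the least recently used hash


def write_pages(lru, data, next_bytenr):
    """Return (bytenr, was_dedup) for each page-sized chunk of data."""
    results = []
    for off in range(0, len(data), PAGE_SIZE):
        # SHA-256 is an assumption; the cover letter does not name a hash.
        digest = hashlib.sha256(data[off:off + PAGE_SIZE]).digest()
        found = lru.search(digest)
        if found is not None:
            results.append((found, True))   # duplicate: reference the existing extent
        else:
            lru.add(digest, next_bytenr)
            results.append((next_bytenr, False))
            next_bytenr += PAGE_SIZE
    return results
```

With a small `limit`, a page whose hash has been evicted is written again instead of being deduplicated, which is the "no guarantee to dedup every duplication" trade-off accepted in exchange for bounded memory.
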
> [[TODO]]
> 1. Add support for compressed extents
>     Shouldn't be too hard.
> 2. Try to merge dedup extents to reduce metadata size
>     Currently, a dedup extent is always 4K in size, although its
>     reference source can be quite large.
> 3. Add support for per file dedup flags
>     Much easier, just like compression flags.
>
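
TODO item 2 is not implemented in the patchset, so the following is only one conceivable way to think about it: adjacent 4K dedup extents could be merged when they are contiguous both in the file and in the source extent they reference. The tuple layout `(file_offset, source_bytenr, length)` and the function name are hypothetical.

```python
def merge_dedup_extents(extents):
    """Merge (file_offset, source_bytenr, length) tuples that are contiguous
    both in the file and in the source extent. Toy model, not btrfs code."""
    merged = []
    for file_off, src, length in sorted(extents):
        if merged:
            m_off, m_src, m_len = merged[-1]
            # Only merge when the file range and the source range both continue.
            if m_off + m_len == file_off and m_src + m_len == src:
                merged[-1] = (m_off, m_src, m_len + length)
                continue
        merged.append((file_off, src, length))
    return merged
```

Three 4K extents that point at consecutive 4K ranges of one source extent collapse into a single 12K item, while an extent pointing elsewhere stays separate.
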
> [[KNOWN BUG, NEED HELP!]]
> On the other hand, since it's still an RFC patchset, it must have one or
> more problems:
> 1) Race between __btrfs_free_extent() and dedup ordered_extent.
>     The hook in __btrfs_free_extent() will free the corresponding hashes
>     of an extent, even if there is a dedup ordered_extent referring to it.
>
>     The problem happens in a case like the following:
> ======================================================================
>     cow_file_range()
>       Submit dedup ordered_extent for extent A
>
>     commit_transaction()
>       Extent A needs freeing, as its ref count has dropped to 0.
>       The dedup ordered_extent can only add its ref at endio time.
>
>     finish_ordered_io()
>       Add a reference to extent A for the dedup ordered_extent.
>       But it was already freed in the previous transaction,
>       causing abort_transaction().
> ======================================================================
>     I'd like to keep the current ordered_extent method, as it adds the
>     fewest lines of code.
>     But I can't find a good way to either delay the transaction commit
>     until the dedup ordered_extent is done, or something along those lines.
>
>     Trans->ordered seems to be a good idea, but it seems to cause list
>     corruption without extra protection in tree log infrastructure.
>
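
The ordering problem described above can be replayed with a toy reference-count model. This is a user-space sketch under stated assumptions, not kernel code: `ToyExtentTree` and `replay_race` are hypothetical names, and the methods only mimic the roles of __btrfs_free_extent() and finish_ordered_io() as described in the cover letter.

```python
class ToyExtentTree:
    """Toy model of extent reference counts; not btrfs code."""

    def __init__(self):
        self.refs = {}    # bytenr -> reference count
        self.hashes = {}  # digest -> bytenr, the in-memory dedup hashes

    def alloc(self, bytenr, digest):
        self.refs[bytenr] = 1
        self.hashes[digest] = bytenr

    def drop_ref(self, bytenr):
        # Plays the role of __btrfs_free_extent(): at zero refs the extent
        # and its dedup hashes are removed.
        self.refs[bytenr] -= 1
        if self.refs[bytenr] == 0:
            del self.refs[bytenr]
            self.hashes = {d: b for d, b in self.hashes.items() if b != bytenr}

    def add_ref(self, bytenr):
        # Plays the role of the reference added in finish_ordered_io().
        if bytenr not in self.refs:
            raise RuntimeError("extent %d already freed" % bytenr)
        self.refs[bytenr] += 1


def replay_race():
    tree = ToyExtentTree()
    tree.alloc(4096, "digest-A")  # extent A exists with one reference

    # cow_file_range(): a hash hit creates a dedup ordered_extent for A,
    # but no reference is taken yet.
    pending = tree.hashes["digest-A"]

    # commit_transaction(): the last real reference to A goes away.
    tree.drop_ref(4096)

    # finish_ordered_io(): the deferred reference arrives too late.
    try:
        tree.add_ref(pending)
    except RuntimeError as err:
        return str(err)
    return "no race"
```

The model fails at the last step because nothing records the pending dedup ordered_extent between the hash hit and endio, which is the window the cover letter asks for help closing.
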
> That's the only problem spotted so far.
> Any early review or advice/question on the design is welcome.
>
> Thanks.
>
> Qu Wenruo (14):
>    btrfs: file-item: Introduce btrfs_setup_file_extent function.
>    btrfs: Use btrfs_fill_file_extent to reduce duplicated codes
>    btrfs: dedup: Add basic init/free functions for inband dedup.
>    btrfs: dedup: Add internal add/remove/search function for btrfs dedup.
>    btrfs: dedup: add ordered extent hook for inband dedup
>    btrfs: dedup: Apply dedup hook for write time dedup.
>    btrfs: extent_map: Add new dedup flag and corresponding hook.
>    btrfs: extent-map: Introduce orig_block_start member for extent-map.
>    btrfs: dedup: Add inband dedup hook for read extent.
>    btrfs: dedup: Introduce btrfs_dedup_free_extent_range function.
>    btrfs: dedup: Add hook to free dedup hash at extent free time.
>    btrfs: dedup: Add mount option support for btrfs inband deduplication.
>    Btrfs: dedup: Support dedup change at remount time.
>    btrfs: dedup: Add mount option output for inband dedup.
>
>   fs/btrfs/Makefile       |   2 +-
>   fs/btrfs/ctree.h        |  16 ++
>   fs/btrfs/dedup.c        | 701 ++++++++++++++++++++++++++++++++++++++++++++++++
>   fs/btrfs/dedup.h        | 132 +++++++++
>   fs/btrfs/disk-io.c      |   7 +
>   fs/btrfs/extent-tree.c  |  10 +
>   fs/btrfs/extent_io.c    |   6 +-
>   fs/btrfs/extent_map.h   |   4 +
>   fs/btrfs/file-item.c    |  61 +++--
>   fs/btrfs/inode.c        | 228 ++++++++++++----
>   fs/btrfs/ordered-data.c |  32 ++-
>   fs/btrfs/ordered-data.h |   8 +
>   fs/btrfs/super.c        |  39 ++-
>   13 files changed, 1163 insertions(+), 83 deletions(-)
>   create mode 100644 fs/btrfs/dedup.c
>   create mode 100644 fs/btrfs/dedup.h
>
