All of lore.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: linux-btrfs@vger.kernel.org
Cc: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Subject: [PATCH v8 01/27] btrfs: dedupe: Introduce dedupe framework and its header
Date: Tue, 22 Mar 2016 09:35:26 +0800	[thread overview]
Message-ID: <1458610552-9845-2-git-send-email-quwenruo@cn.fujitsu.com> (raw)
In-Reply-To: <1458610552-9845-1-git-send-email-quwenruo@cn.fujitsu.com>

From: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>

Introduce the header for btrfs online(write time) de-duplication
framework and needed header.

The new de-duplication framework is going to support 2 different dedupe
methods and 1 dedupe hash.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
---
 fs/btrfs/ctree.h   |   5 ++
 fs/btrfs/dedupe.h  | 132 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/disk-io.c |   1 +
 3 files changed, 138 insertions(+)
 create mode 100644 fs/btrfs/dedupe.h

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 84a6a5b..022ab61 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1860,6 +1860,11 @@ struct btrfs_fs_info {
 	struct list_head pinned_chunks;
 
 	int creating_free_space_tree;
+
+	/* Inband de-duplication related structures*/
+	unsigned int dedupe_enabled:1;
+	struct btrfs_dedupe_info *dedupe_info;
+	struct mutex dedupe_ioctl_lock;
 };
 
 struct btrfs_subvolume_writers {
diff --git a/fs/btrfs/dedupe.h b/fs/btrfs/dedupe.h
new file mode 100644
index 0000000..916039c
--- /dev/null
+++ b/fs/btrfs/dedupe.h
@@ -0,0 +1,132 @@
+/*
+ * Copyright (C) 2015 Fujitsu.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+#ifndef __BTRFS_DEDUPE__
+#define __BTRFS_DEDUPE__
+
+#include <linux/btrfs.h>
+#include <linux/wait.h>
+#include <crypto/hash.h>
+
+/*
+ * Dedup storage backend
+ * On disk is persist storage but overhead is large
+ * In memory is fast but will lose all its hash on umount
+ */
+#define BTRFS_DEDUPE_BACKEND_INMEMORY		0
+#define BTRFS_DEDUPE_BACKEND_ONDISK		1
+#define BTRFS_DEDUPE_BACKEND_COUNT		2
+
+/* Dedup block size limit and default value */
+#define BTRFS_DEDUPE_BLOCKSIZE_MAX	(8 * 1024 * 1024)
+#define BTRFS_DEDUPE_BLOCKSIZE_MIN	(16 * 1024)
+#define BTRFS_DEDUPE_BLOCKSIZE_DEFAULT	(128 * 1024)
+
+/* Hash algorithm, only support SHA256 yet */
+#define BTRFS_DEDUPE_HASH_SHA256		0
+
+static int btrfs_dedupe_sizes[] = { 32 };
+
+/*
+ * For caller outside of dedup.c
+ *
+ * Different dedupe backends should have their own hash structure
+ */
+struct btrfs_dedupe_hash {
+	u64 bytenr;
+	u32 num_bytes;
+
+	/* last field is a variable length array of dedupe hash */
+	u8 hash[];
+};
+
+struct btrfs_dedupe_info {
+	/* dedupe blocksize */
+	u64 blocksize;
+	u16 backend;
+	u16 hash_type;
+
+	struct crypto_shash *dedupe_driver;
+	struct mutex lock;
+
+	/* following members are only used in in-memory dedupe mode */
+	struct rb_root hash_root;
+	struct rb_root bytenr_root;
+	struct list_head lru_list;
+	u64 limit_nr;
+	u64 current_nr;
+};
+
+struct btrfs_trans_handle;
+
+static inline int btrfs_dedupe_hash_hit(struct btrfs_dedupe_hash *hash)
+{
+	return (hash && hash->bytenr);
+}
+
+int btrfs_dedupe_hash_size(u16 type);
+struct btrfs_dedupe_hash *btrfs_dedupe_alloc_hash(u16 type);
+
+/*
+ * Initial inband dedupe info
+ * Called at dedupe enable time.
+ */
+int btrfs_dedupe_enable(struct btrfs_fs_info *fs_info, u16 type, u16 backend,
+			u64 blocksize, u64 limit_nr);
+
+/*
+ * Disable dedupe and invalidate all its dedupe data.
+ * Called at dedupe disable time.
+ */
+int btrfs_dedupe_disable(struct btrfs_fs_info *fs_info);
+
+/*
+ * Calculate hash for dedup.
+ * Caller must ensure [start, start + dedupe_bs) has valid data.
+ */
+int btrfs_dedupe_calc_hash(struct btrfs_fs_info *fs_info,
+			   struct inode *inode, u64 start,
+			   struct btrfs_dedupe_hash *hash);
+
+/*
+ * Search for duplicated extents by calculated hash
+ * Caller must call btrfs_dedupe_calc_hash() first to get the hash.
+ *
+ * @inode: the inode for we are writing
+ * @file_pos: offset inside the inode
+ * As we will increase extent ref immediately after a hash match,
+ * we need @file_pos and @inode in this case.
+ *
+ * Return > 0 for a hash match, and the extent ref will be
+ * *INCREASED*, and hash->bytenr/num_bytes will record the existing
+ * extent data.
+ * Return 0 for a hash miss. Nothing is done
+ */
+int btrfs_dedupe_search(struct btrfs_fs_info *fs_info,
+			struct inode *inode, u64 file_pos,
+			struct btrfs_dedupe_hash *hash);
+
+/* Add a dedupe hash into dedupe info */
+int btrfs_dedupe_add(struct btrfs_trans_handle *trans,
+		     struct btrfs_fs_info *fs_info,
+		     struct btrfs_dedupe_hash *hash);
+
+/* Remove a dedupe hash from dedupe info */
+int btrfs_dedupe_del(struct btrfs_trans_handle *trans,
+		     struct btrfs_fs_info *fs_info, u64 bytenr);
+#endif
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index c95e3ce..3cf4c11 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2584,6 +2584,7 @@ int open_ctree(struct super_block *sb,
 	mutex_init(&fs_info->reloc_mutex);
 	mutex_init(&fs_info->delalloc_root_mutex);
 	mutex_init(&fs_info->cleaner_delayed_iput_mutex);
+	mutex_init(&fs_info->dedupe_ioctl_lock);
 	seqlock_init(&fs_info->profiles_lock);
 
 	INIT_LIST_HEAD(&fs_info->dirty_cowonly_roots);
-- 
2.7.3




  reply	other threads:[~2016-03-22  1:38 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-22  1:35 [PATCH v8 00/27][For 4.7] Btrfs: Add inband (write time) de-duplication framework Qu Wenruo
2016-03-22  1:35 ` Qu Wenruo [this message]
2016-03-22  1:35 ` [PATCH v8 02/27] btrfs: dedupe: Introduce function to initialize dedupe info Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 03/27] btrfs: dedupe: Introduce function to add hash into in-memory tree Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 04/27] btrfs: dedupe: Introduce function to remove hash from " Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 05/27] btrfs: delayed-ref: Add support for increasing data ref under spinlock Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 06/27] btrfs: dedupe: Introduce function to search for an existing hash Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 07/27] btrfs: dedupe: Implement btrfs_dedupe_calc_hash interface Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 08/27] btrfs: ordered-extent: Add support for dedupe Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 09/27] btrfs: dedupe: Inband in-memory only de-duplication implement Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 10/27] btrfs: dedupe: Add basic tree structure for on-disk dedupe method Qu Wenruo
2016-03-24 20:58   ` Chris Mason
2016-03-25  1:59     ` Qu Wenruo
2016-03-25 15:11       ` Chris Mason
2016-03-26 13:11         ` Qu Wenruo
2016-03-28 14:09           ` Chris Mason
2016-03-29  1:47             ` Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 11/27] btrfs: dedupe: Introduce interfaces to resume and cleanup dedupe info Qu Wenruo
2016-03-29 17:31   ` Alex Lyakas
2016-03-30  0:26     ` Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 12/27] btrfs: dedupe: Add support for on-disk hash search Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 13/27] btrfs: dedupe: Add support to delete hash for on-disk backend Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 14/27] btrfs: dedupe: Add support for adding " Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 15/27] btrfs: dedupe: Add ioctl for inband dedupelication Qu Wenruo
2016-03-22  2:29   ` kbuild test robot
2016-03-22  2:48   ` kbuild test robot
2016-03-22  1:35 ` [PATCH v8 16/27] btrfs: dedupe: add an inode nodedupe flag Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 17/27] btrfs: dedupe: add a property handler for online dedupe Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 18/27] btrfs: dedupe: add per-file online dedupe control Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 19/27] btrfs: try more times to alloc metadata reserve space Qu Wenruo
2016-04-22 18:06   ` Josef Bacik
2016-04-25  0:54     ` Qu Wenruo
2016-04-25 14:05       ` Josef Bacik
2016-04-26  0:50         ` Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 20/27] btrfs: dedupe: Fix a bug when running inband dedupe with balance Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 21/27] btrfs: Fix a memory leak in inband dedupe hash Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 22/27] btrfs: dedupe: Fix metadata balance error when dedupe is enabled Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 23/27] btrfs: dedupe: Avoid submit IO for hash hit extent Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 24/27] btrfs: dedupe: Preparation for compress-dedupe co-work Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 25/27] btrfs: dedupe: Add support for compression and dedpue Qu Wenruo
2016-03-24 20:35   ` Chris Mason
2016-03-25  1:44     ` Qu Wenruo
2016-03-25 15:12       ` Chris Mason
2016-03-22  1:35 ` [PATCH v8 26/27] btrfs: relocation: Enhance error handling to avoid BUG_ON Qu Wenruo
2016-03-22  1:35 ` [PATCH v8 27/27] btrfs: dedupe: Fix a space cache delalloc bytes underflow bug Qu Wenruo
2016-03-22 13:38 ` [PATCH v8 00/27][For 4.7] Btrfs: Add inband (write time) de-duplication framework David Sterba
2016-03-23  2:25   ` Qu Wenruo
2016-03-24 13:42     ` David Sterba
2016-03-25  1:38       ` Qu Wenruo
2016-04-04 16:55         ` David Sterba
2016-04-05  3:08           ` Qu Wenruo
2016-04-20  2:02             ` Qu Wenruo
2016-04-20 19:14               ` Chris Mason
2016-04-06  3:47           ` Nicholas D Steeves
2016-04-06  5:22             ` Qu Wenruo
2016-04-22 22:14               ` Nicholas D Steeves
2016-04-25  1:25                 ` Qu Wenruo
2016-03-29 17:22 ` Alex Lyakas
2016-03-30  0:34   ` Qu Wenruo
2016-03-30 10:36     ` Alex Lyakas
2016-04-03  8:22     ` Alex Lyakas
2016-04-05  3:51       ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1458610552-9845-2-git-send-email-quwenruo@cn.fujitsu.com \
    --to=quwenruo@cn.fujitsu.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wangxg.fnst@cn.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.