From: Timofey Titovets <nefelim4ag@gmail.com>
To: linux-btrfs@vger.kernel.org
Cc: Timofey Titovets <nefelim4ag@gmail.com>
Subject: [PATCH v7 5/6] Btrfs: heuristic add byte set calculation
Date: Fri, 25 Aug 2017 12:18:44 +0300
Message-ID: <20170825091845.4120-6-nefelim4ag@gmail.com>
In-Reply-To: <20170825091845.4120-1-nefelim4ag@gmail.com>

Calculate the byte set size for the data sample:
count how many unique byte values occur in the sample
by counting all buckets with count > 0.
If the byte set is small (at most ~25% of the 256 possible byte
values, i.e. BYTE_SET_THRESHOLD = 64), the data is easily compressible.

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
---
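For readers who want to try the byte set counting described above outside
the kernel, here is a minimal user-space sketch. It is an illustration, not
the code from this patch: the sketch_* function names and the use of
uint32_t/uint8_t in place of the kernel's u32/u8 are assumptions made for
this example, and the bucket array is passed directly instead of through
struct workspace. The two constants match the values used in heuristic.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUCKET_SIZE 256
#define BYTE_SET_THRESHOLD 64	/* 25% of the 256 possible byte values */

/* User-space stand-in for the per-workspace bucket (assumption). */
struct bucket_item {
	uint32_t count;
};

/* Hypothetical helper: build a byte histogram of the sample. */
static void sketch_fill_buckets(struct bucket_item *bucket,
				const uint8_t *sample, size_t len)
{
	size_t i;

	memset(bucket, 0, BUCKET_SIZE * sizeof(*bucket));
	for (i = 0; i < len; i++)
		bucket[sample[i]].count++;
}

/*
 * Hypothetical helper: count distinct byte values in the histogram.
 * Returns early once the count exceeds BYTE_SET_THRESHOLD, so the
 * result is exact only while it is <= BYTE_SET_THRESHOLD.
 */
static uint32_t sketch_byte_set_size(const struct bucket_item *bucket)
{
	uint32_t i;
	uint32_t set_size = 0;

	/* The first 64 buckets alone can never exceed the threshold. */
	for (i = 0; i < BYTE_SET_THRESHOLD; i++) {
		if (bucket[i].count > 0)
			set_size++;
	}

	/* From here on the threshold can be crossed, so check each time. */
	for (; i < BUCKET_SIZE; i++) {
		if (bucket[i].count > 0) {
			set_size++;
			if (set_size > BYTE_SET_THRESHOLD)
				return set_size;
		}
	}

	return set_size;
}
```

A caller would fill the buckets from a sample and compare the result with
BYTE_SET_THRESHOLD, the same comparison heuristic() does in this patch.
Because of the early return, a value above the threshold only means "more
than 64 distinct bytes", not the exact size of the set.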
 fs/btrfs/heuristic.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/fs/btrfs/heuristic.c b/fs/btrfs/heuristic.c
index f1fa6e4f1c11..ef723e991576 100644
--- a/fs/btrfs/heuristic.c
+++ b/fs/btrfs/heuristic.c
@@ -24,6 +24,7 @@
 #define ITER_SHIFT 256
 #define BUCKET_SIZE 256
 #define MAX_SAMPLE_SIZE (BTRFS_MAX_UNCOMPRESSED*READ_SIZE/ITER_SHIFT)
+#define BYTE_SET_THRESHOLD 64

 struct bucket_item {
 	u32 count;
@@ -66,6 +67,27 @@ static struct list_head *heuristic_alloc_workspace(void)
 	return ERR_PTR(-ENOMEM);
 }

+static u32 byte_set_size(const struct workspace *ws)
+{
+	u32 a = 0;
+	u32 byte_set_size = 0;
+
+	for (; a < BYTE_SET_THRESHOLD; a++) {
+		if (ws->bucket[a].count > 0)
+			byte_set_size++;
+	}
+
+	for (; a < BUCKET_SIZE; a++) {
+		if (ws->bucket[a].count > 0) {
+			byte_set_size++;
+			if (byte_set_size > BYTE_SET_THRESHOLD)
+				return byte_set_size;
+		}
+	}
+
+	return byte_set_size;
+}
+
 static bool sample_repeated_patterns(struct workspace *ws)
 {
 	u32 i = 0;
@@ -138,6 +160,10 @@ static int heuristic(struct list_head *ws, struct inode *inode,
 		workspace->bucket[byte].count++;
 	}

+	a = byte_set_size(workspace);
+	if (a > BYTE_SET_THRESHOLD)
+		return 2;
+
 	return 1;
 }

--
2.14.1

