From: David Sterba <dsterba@suse.cz>
To: Timofey Titovets <nefelim4ag@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v7 5/6] Btrfs: heuristic add byte set calculation
Date: Wed, 27 Sep 2017 15:50:43 +0200
Message-ID: <20170927135043.GE31640@twin.jikos.cz>
In-Reply-To: <20170825091845.4120-6-nefelim4ag@gmail.com>

On Fri, Aug 25, 2017 at 12:18:44PM +0300, Timofey Titovets wrote:
> Calculate the byte set size for the data sample:
> count how many unique byte values occur in the sample
> by counting all buckets with count > 0.
> If the byte set is small (~25% of the possible values), the data are
> easily compressible.
> 
> Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
> ---
>  fs/btrfs/heuristic.c | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/fs/btrfs/heuristic.c b/fs/btrfs/heuristic.c
> index f1fa6e4f1c11..ef723e991576 100644
> --- a/fs/btrfs/heuristic.c
> +++ b/fs/btrfs/heuristic.c
> @@ -24,6 +24,7 @@
>  #define ITER_SHIFT 256
>  #define BUCKET_SIZE 256
>  #define MAX_SAMPLE_SIZE (BTRFS_MAX_UNCOMPRESSED*READ_SIZE/ITER_SHIFT)
> +#define BYTE_SET_THRESHOLD 64

This needs an explanation. The changelog partially covers it, but we
need to see it in the code as well.

> 
>  struct bucket_item {
>  	u32 count;
> @@ -66,6 +67,27 @@ static struct list_head *heuristic_alloc_workspace(void)
>  	return ERR_PTR(-ENOMEM);
>  }
> 
> +static u32 byte_set_size(const struct workspace *ws)
> +{
> +	u32 a = 0;

Please use 'i'.

> +	u32 byte_set_size = 0;
> +
> +	for (; a < BYTE_SET_THRESHOLD; a++) {

	for (i = 0; ...)


> +		if (ws->bucket[a].count > 0)
> +			byte_set_size++;
> +	}
> +
> +	for (; a < BUCKET_SIZE; a++) {

So the initialization is intentionally skipped here. Please add a
comment saying that this is expected, or explain it with something like
"continue past the first sample until ...".
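
Taken together, the comments on this function could be addressed roughly
as sketched here. This is only an illustration built as a userspace
stand-in: the u32 typedef and the reduced struct workspace are
assumptions so that it compiles outside the kernel, and the comment
wording is a suggestion derived from the changelog, not the final
rationale.

```c
#include <stdint.h>

typedef uint32_t u32;	/* userspace stand-in for the kernel type */

#define BUCKET_SIZE 256
/*
 * Assumed rationale, taken from the changelog: if the sample uses no
 * more than 64 of the 256 possible byte values (~25%), the data are
 * considered easily compressible.
 */
#define BYTE_SET_THRESHOLD 64

struct bucket_item {
	u32 count;
};

/* Hypothetical reduction: only the member used below is kept. */
struct workspace {
	struct bucket_item bucket[BUCKET_SIZE];
};

static u32 byte_set_size(const struct workspace *ws)
{
	u32 i;
	u32 byte_set_size = 0;

	for (i = 0; i < BYTE_SET_THRESHOLD; i++) {
		if (ws->bucket[i].count > 0)
			byte_set_size++;
	}

	/*
	 * Continue past the first BYTE_SET_THRESHOLD buckets, i is
	 * intentionally not reset. The threshold cannot be exceeded
	 * before this point, so the early exit is only needed here.
	 */
	for (; i < BUCKET_SIZE; i++) {
		if (ws->bucket[i].count > 0) {
			byte_set_size++;
			if (byte_set_size > BYTE_SET_THRESHOLD)
				return byte_set_size;
		}
	}

	return byte_set_size;
}
```

The logic is unchanged from the patch, only the counter name, the
explicit initialization and the comments differ.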

> +		if (ws->bucket[a].count > 0) {
> +			byte_set_size++;
> +			if (byte_set_size > BYTE_SET_THRESHOLD)
> +				return byte_set_size;
> +		}
> +	}
> +
> +	return byte_set_size;
> +}
> +
>  static bool sample_repeated_patterns(struct workspace *ws)
>  {
>  	u32 i = 0;
> @@ -138,6 +160,10 @@ static int heuristic(struct list_head *ws, struct inode *inode,
>  		workspace->bucket[byte].count++;
>  	}
> 
> +	a = byte_set_size(workspace);
> +	if (a > BYTE_SET_THRESHOLD)
> +		return 2;
> +
>  	return 1;
>  }
> 
> --
> 2.14.1