linux-erofs.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Gao Xiang <hsiangkao@linux.alibaba.com>
To: Huang Jianan <huangjianan@oppo.com>
Cc: yh@oppo.com, kevin.liw@oppo.com, guoweichao@oppo.com,
	linux-erofs@lists.ozlabs.org, guanyuwei@oppo.com
Subject: Re: [PATCH v2] erofs-utils: support per-inode compress pcluster
Date: Wed, 25 Aug 2021 09:17:38 +0800	[thread overview]
Message-ID: <YSWaMjYusTMt7Ccf@B-P7TQMD6M-0146.local> (raw)
In-Reply-To: <20210818042715.24416-1-huangjianan@oppo.com>

On Wed, Aug 18, 2021 at 12:27:15PM +0800, Huang Jianan via Linux-erofs wrote:
> Add an option to configure per-inode compression strategy. Each line
> of the file should be in the following form:
> 
> <Regular-expression> <pcluster-in-bytes>
> 
> When pcluster is 0, it means that the file shouldn't be compressed.
> 
> Signed-off-by: Huang Jianan <huangjianan@oppo.com>

Sorry for the delay. Due to busy work, I will look into the details
this weekend. Some comments in advance.

> ---
> changes since v1:
>  - rename c_pclusterblks to c_physical_clusterblks and place it in union
>  - change cfg.c_physical_clusterblks > 1 to erofs_sb_has_big_pcluster()
>    since it's per-inode compression strategy
> 
>  include/erofs/compress_rule.h |  25 ++++++++

How about calling "compress_hints"? Does it sound better?

>  include/erofs/config.h        |   1 +

...

> index 0000000..497d662
> --- /dev/null
> +++ b/lib/compress_rule.c
> @@ -0,0 +1,106 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * erofs-utils/lib/compress_rule.c
> + *
> + * Copyright (C), 2008-2021, OPPO Mobile Comm Corp., Ltd.
> + * Created by Huang Jianan <huangjianan@oppo.com>
> + */
> +#include <string.h>
> +#include <stdlib.h>
> +#include "erofs/err.h"
> +#include "erofs/list.h"
> +#include "erofs/print.h"
> +#include "erofs/compress_rule.h"
> +
> +static LIST_HEAD(compress_rule_head);
> +
> +static void dump_regerror(int errcode, const char *s, const regex_t *preg)
> +{
> +	char str[512];
> +
> +	regerror(errcode, preg, str, sizeof(str));
> +	erofs_err("invalid regex %s (%s)\n", s, str);
> +}
> +
> +static int erofs_insert_compress_rule(const char *s, unsigned int blks)
> +{
> +	struct erofs_compress_rule *r;
> +	int ret = 0;
> +
> +	r = malloc(sizeof(struct erofs_compress_rule));
> +	if (!r)
> +		return -ENOMEM;
> +
> +	ret = regcomp(&r->reg, s, REG_EXTENDED|REG_NOSUB);
> +	if (ret) {
> +		dump_regerror(ret, s, &r->reg);
> +		goto err;
> +	}
> +	r->c_physical_clusterblks = blks;
> +
> +	list_add_tail(&r->list, &compress_rule_head);
> +	erofs_info("insert compress rule %s: %u", s, blks);
> +	return ret;
> +
> +err:
> +	free(r);
> +	return ret;
> +}
> +
> +unsigned int erofs_parse_pclusterblks(struct erofs_inode *inode)
> +{
> +	const char *s;
> +	struct erofs_compress_rule *r;
> +
> +	if (inode->c_physical_clusterblks)
> +		return inode->c_physical_clusterblks;
> +
> +	s = erofs_fspath(inode->i_srcpath);
> +
> +	list_for_each_entry(r, &compress_rule_head, list) {
> +		int ret = regexec(&r->reg, s, (size_t)0, NULL, 0);
> +
> +		if (!ret) {
> +			inode->c_physical_clusterblks = r->c_physical_clusterblks;
> +			return r->c_physical_clusterblks;
> +		}
> +		if (ret > REG_NOMATCH)
> +			dump_regerror(ret, s, &r->reg);
> +	}
> +
> +	inode->c_physical_clusterblks = cfg.c_physical_clusterblks;
> +	return cfg.c_physical_clusterblks;
> +}
> +
> +int erofs_load_compress_rule()
> +{
> +	char buf[PATH_MAX + 100];
> +	FILE* f;
> +	int ret = 0;
> +
> +	if (!cfg.compress_rule_file)
> +		return 0;
> +
> +	f = fopen(cfg.compress_rule_file, "r");
> +	if (f == NULL)
> +		return -errno;
> +
> +	while (fgets(buf, sizeof(buf), f)) {
> +		char* line = buf;
> +		char* s;
> +		unsigned int blks;
> +
> +		s = strtok(line, " ");
> +		blks = atoi(strtok(NULL, " "));
> +		if (blks % EROFS_BLKSIZ) {

We might need to guarantee these are power of 2.
Also, how about just printing out warning message but using default "-C"
value instead?

> +			erofs_err("invalid physical clustersize %u", blks);
> +			ret = -EINVAL;
> +			goto out;
> +		}
> +		erofs_insert_compress_rule(s, blks / EROFS_BLKSIZ);
> +	}
> +
> +out:
> +	fclose(f);
> +	return ret;
> +}

...

> diff --git a/mkfs/main.c b/mkfs/main.c
> index 10fe14d..467e875 100644
> --- a/mkfs/main.c
> +++ b/mkfs/main.c
> @@ -23,6 +23,7 @@
>  #include "erofs/xattr.h"
>  #include "erofs/exclude.h"
>  #include "erofs/block_list.h"
> +#include "erofs/compress_rule.h"
>  
>  #ifdef HAVE_LIBUUID
>  #include <uuid.h>
> @@ -44,11 +45,12 @@ static struct option long_options[] = {
>  	{"random-pclusterblks", no_argument, NULL, 8},
>  #endif
>  	{"max-extent-bytes", required_argument, NULL, 9},
> +	{"compress-rule", required_argument, NULL, 10},
>  #ifdef WITH_ANDROID
> -	{"mount-point", required_argument, NULL, 10},
> -	{"product-out", required_argument, NULL, 11},
> -	{"fs-config-file", required_argument, NULL, 12},
> -	{"block-list-file", required_argument, NULL, 13},
> +	{"mount-point", required_argument, NULL, 20},
> +	{"product-out", required_argument, NULL, 21},
> +	{"fs-config-file", required_argument, NULL, 22},
> +	{"block-list-file", required_argument, NULL, 23},

I think we might clean up these first with a separated patch.
Use >= 256 for all of them instead.

Thanks,
Gao Xiang


  reply	other threads:[~2021-08-25  1:18 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-16  9:40 [PATCH] erofs-utils: support per-inode compress pcluster Huang Jianan via Linux-erofs
2021-08-18  4:27 ` [PATCH v2] " Huang Jianan via Linux-erofs
2021-08-25  1:17   ` Gao Xiang [this message]
2021-08-25  1:27     ` Gao Xiang
2021-08-25  2:38       ` Huang Jianan via Linux-erofs
2021-08-25  3:35   ` [PATCH v3] " Huang Jianan via Linux-erofs
2021-09-05 17:59     ` Gao Xiang
2021-09-06  9:38       ` Huang Jianan via Linux-erofs
2021-09-07  0:12         ` Gao Xiang
2021-09-15 11:21           ` [PATCH] erofs-utils: tests: check the compress-hints functionality Huang Jianan via Linux-erofs
2021-09-15 15:10             ` Gao Xiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YSWaMjYusTMt7Ccf@B-P7TQMD6M-0146.local \
    --to=hsiangkao@linux.alibaba.com \
    --cc=guanyuwei@oppo.com \
    --cc=guoweichao@oppo.com \
    --cc=huangjianan@oppo.com \
    --cc=kevin.liw@oppo.com \
    --cc=linux-erofs@lists.ozlabs.org \
    --cc=yh@oppo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).