linux-kernel.vger.kernel.org archive mirror
From: Jeff Layton <jlayton@kernel.org>
To: David Howells <dhowells@redhat.com>, linux-cachefs@redhat.com
Cc: Trond Myklebust <trondmy@hammerspace.com>,
	Anna Schumaker <anna.schumaker@netapp.com>,
	Steve French <sfrench@samba.org>,
	Dominique Martinet <asmadeus@codewreck.org>,
	Matthew Wilcox <willy@infradead.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Omar Sandoval <osandov@osandov.com>,
	JeffleXu <jefflexu@linux.alibaba.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org,
	linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org,
	v9fs-developer@lists.sourceforge.net,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 44/68] cachefiles: Implement key to filename encoding
Date: Thu, 06 Jan 2022 12:43:37 -0500
Message-ID: <1e102cc81aaf71df2b7f5ae906b79c188a34a111.camel@kernel.org>
In-Reply-To: <164021549223.640689.14762875188193982341.stgit@warthog.procyon.org.uk>

On Wed, 2021-12-22 at 23:24 +0000, David Howells wrote:
> Implement a function to encode a binary cookie key as something that can be
> used as a filename.  Four options are considered:
> 
>  (1) All printable chars with no '/' characters.  Prepend a 'D' to indicate
>      the encoding but otherwise use as-is.
> 
>  (2) Appears to be an array of __be32.  Encode as 'S' plus a list of
>      hex-encoded 32-bit ints separated by commas.  If a number is 0, it is
>      rendered as "" instead of "0".
> 
>  (3) Appears to be an array of __le32.  Encoded as (2) but with a 'T'
>      encoding prefix.
> 
>  (4) Encoded as base64 with an 'E' prefix plus a second char indicating how
>      much padding is involved.  A non-standard base64 encoding is used
>      because '/' cannot be used in the encoded form.
> 
> If (1) is not possible, whichever of (2), (3) or (4) produces the shortest
> string is selected (hex-encoding a number may be less dense than base64
> encoding it).
> 

Since most cookies are fairly small, is there any real benefit to
optimizing for length here? How much inflation are we talking about?
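
To put rough numbers on that, here is a userspace sketch (not the kernel
code) of the length arithmetic the patch performs. The helper names
hex_digits(), hex_name_len() and b64_name_len() are made up for
illustration, and the sketch assumes the key length is already a multiple
of 4, where the patch instead relies on the key being padded:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hex digits "%x" would print for x; the patch renders 0 as "". */
static unsigned int hex_digits(uint32_t x)
{
	unsigned int n = 0;

	while (x) {
		n++;
		x >>= 4;
	}
	return n;
}

/*
 * Length of the 'S' (big-endian) or 'T' (little-endian) form: one prefix
 * or comma per 32-bit word plus that word's hex digits.
 * Assumption: keylen is a multiple of 4.
 */
static unsigned int hex_name_len(const uint8_t *key, size_t keylen,
				 int big_endian)
{
	unsigned int len = 0;
	size_t i;

	assert(keylen % 4 == 0);
	for (i = 0; i < keylen; i += 4) {
		uint32_t x;

		if (big_endian)
			x = (uint32_t)key[i] << 24 |
			    (uint32_t)key[i + 1] << 16 |
			    (uint32_t)key[i + 2] << 8 |
			    (uint32_t)key[i + 3];
		else
			x = (uint32_t)key[i + 3] << 24 |
			    (uint32_t)key[i + 2] << 16 |
			    (uint32_t)key[i + 1] << 8 |
			    (uint32_t)key[i];
		len += 1 + hex_digits(x);
	}
	return len;
}

/* Length of the 'E' form: two prefix chars plus 4 chars per 3 raw bytes. */
static unsigned int b64_name_len(size_t keylen)
{
	return 2 + 4 * (unsigned int)((keylen + 2) / 3);
}
```

For an 8-byte key holding the big-endian words 1 and 2, this gives 4
characters for the 'S' form ("S1,2") against 14 for base64. For 8 bytes
such as de ad be ef 12 34 56 78, either hex form comes to 18 against 14.
So the spread for a small key is a handful of characters either way.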

> Note that the prefix characters have to be selected from the set [DEIJST@]
> lest cachefilesd remove the files because it doesn't recognise the name.
> 
> Changes
> =======
> ver #2:
>  - Fix a short allocation that didn't allow for a string terminator[1]
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: linux-cachefs@redhat.com
> Link: https://lore.kernel.org/r/bcefb8f2-576a-b3fc-cc29-89808ebfd7c1@linux.alibaba.com/ [1]
> Link: https://lore.kernel.org/r/163819640393.215744.15212364106412961104.stgit@warthog.procyon.org.uk/ # v1
> Link: https://lore.kernel.org/r/163906940529.143852.17352132319136117053.stgit@warthog.procyon.org.uk/ # v2
> Link: https://lore.kernel.org/r/163967149827.1823006.6088580775428487961.stgit@warthog.procyon.org.uk/ # v3
> ---
> 
>  fs/cachefiles/Makefile   |    1 
>  fs/cachefiles/internal.h |    5 ++
>  fs/cachefiles/key.c      |  138 ++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 144 insertions(+)
>  create mode 100644 fs/cachefiles/key.c
> 
> diff --git a/fs/cachefiles/Makefile b/fs/cachefiles/Makefile
> index d67210ece9cd..6f025940a65c 100644
> --- a/fs/cachefiles/Makefile
> +++ b/fs/cachefiles/Makefile
> @@ -7,6 +7,7 @@ cachefiles-y := \
>  	cache.o \
>  	daemon.o \
>  	interface.o \
> +	key.o \
>  	main.o \
>  	namei.o \
>  	security.o \
> diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
> index 8763ee4a0df2..dbc37f5d4714 100644
> --- a/fs/cachefiles/internal.h
> +++ b/fs/cachefiles/internal.h
> @@ -173,6 +173,11 @@ extern struct cachefiles_object *cachefiles_grab_object(struct cachefiles_object
>  extern void cachefiles_put_object(struct cachefiles_object *object,
>  				  enum cachefiles_obj_ref_trace why);
>  
> +/*
> + * key.c
> + */
> +extern bool cachefiles_cook_key(struct cachefiles_object *object);
> +
>  /*
>   * main.c
>   */
> diff --git a/fs/cachefiles/key.c b/fs/cachefiles/key.c
> new file mode 100644
> index 000000000000..bf935e25bdbe
> --- /dev/null
> +++ b/fs/cachefiles/key.c
> @@ -0,0 +1,138 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/* Key to pathname encoder
> + *
> + * Copyright (C) 2021 Red Hat, Inc. All Rights Reserved.
> + * Written by David Howells (dhowells@redhat.com)
> + */
> +
> +#include <linux/slab.h>
> +#include "internal.h"
> +
> +static const char cachefiles_charmap[64] =
> +	"0123456789"			/* 0 - 9 */
> +	"abcdefghijklmnopqrstuvwxyz"	/* 10 - 35 */
> +	"ABCDEFGHIJKLMNOPQRSTUVWXYZ"	/* 36 - 61 */
> +	"_-"				/* 62 - 63 */
> +	;
> +
> +static const char cachefiles_filecharmap[256] = {
> +	/* we skip space and tab and control chars */
> +	[33 ... 46] = 1,		/* '!' -> '.' */
> +	/* we skip '/' as it's significant to pathwalk */
> +	[48 ... 127] = 1,		/* '0' -> '~' */
> +};
> +
> +static inline unsigned int how_many_hex_digits(unsigned int x)
> +{
> +	return x ? round_up(ilog2(x) + 1, 4) / 4 : 0;
> +}
> +
> +/*
> + * turn the raw key into something cooked
> + * - the key may be up to NAME_MAX in length (including the length word)
> + *   - "base64" encode the strange keys, mapping 3 bytes of raw to four of
> + *     cooked
> + *   - need to cut the cooked key into 252 char lengths (189 raw bytes)
> + */
> +bool cachefiles_cook_key(struct cachefiles_object *object)
> +{
> +	const u8 *key = fscache_get_key(object->cookie), *kend;
> +	unsigned char ch;
> +	unsigned int acc, i, n, nle, nbe, keylen = object->cookie->key_len;
> +	unsigned int b64len, len, print, pad;
> +	char *name, sep;
> +
> +	_enter(",%u,%*phN", keylen, keylen, key);
> +
> +	BUG_ON(keylen > NAME_MAX - 3);
> +
> +	print = 1;
> +	for (i = 0; i < keylen; i++) {
> +		ch = key[i];
> +		print &= cachefiles_filecharmap[ch];
> +	}
> +
> +	/* If the path is usable ASCII, then we render it directly */
> +	if (print) {
> +		len = 1 + keylen;
> +		name = kmalloc(len + 1, GFP_KERNEL);
> +		if (!name)
> +			return false;
> +
> +		name[0] = 'D'; /* Data object type, string encoding */
> +		memcpy(name + 1, key, keylen);
> +		goto success;
> +	}
> +
> +	/* See if it makes sense to encode it as "hex,hex,hex" for each 32-bit
> +	 * chunk.  We rely on the key having been padded out to a whole number
> +	 * of 32-bit words.
> +	 */
> +	n = round_up(keylen, 4);
> +	nbe = nle = 0;
> +	for (i = 0; i < n; i += 4) {
> +		u32 be = be32_to_cpu(*(__be32 *)(key + i));
> +		u32 le = le32_to_cpu(*(__le32 *)(key + i));
> +
> +		nbe += 1 + how_many_hex_digits(be);
> +		nle += 1 + how_many_hex_digits(le);
> +	}
> +
> +	b64len = DIV_ROUND_UP(keylen, 3);
> +	pad = b64len * 3 - keylen;
> +	b64len = 2 + b64len * 4; /* Length if we base64-encode it */
> +	_debug("len=%u nbe=%u nle=%u b64=%u", keylen, nbe, nle, b64len);
> +	if (nbe < b64len || nle < b64len) {
> +		unsigned int nlen = min(nbe, nle) + 1;
> +		name = kmalloc(nlen, GFP_KERNEL);
> +		if (!name)
> +			return false;
> +		sep = (nbe <= nle) ? 'S' : 'T'; /* Encoding indicator */
> +		len = 0;
> +		for (i = 0; i < n; i += 4) {
> +			u32 x;
> +			if (nbe <= nle)
> +				x = be32_to_cpu(*(__be32 *)(key + i));
> +			else
> +				x = le32_to_cpu(*(__le32 *)(key + i));
> +			name[len++] = sep;
> +			if (x != 0)
> +				len += snprintf(name + len, nlen - len, "%x", x);
> +			sep = ',';
> +		}
> +		goto success;
> +	}
> +
> +	/* We need to base64-encode it */
> +	name = kmalloc(b64len + 1, GFP_KERNEL);
> +	if (!name)
> +		return false;
> +
> +	name[0] = 'E';
> +	name[1] = '0' + pad;
> +	len = 2;
> +	kend = key + keylen;
> +	do {
> +		acc  = *key++;
> +		if (key < kend) {
> +			acc |= *key++ << 8;
> +			if (key < kend)
> +				acc |= *key++ << 16;
> +		}
> +
> +		name[len++] = cachefiles_charmap[acc & 63];
> +		acc >>= 6;
> +		name[len++] = cachefiles_charmap[acc & 63];
> +		acc >>= 6;
> +		name[len++] = cachefiles_charmap[acc & 63];
> +		acc >>= 6;
> +		name[len++] = cachefiles_charmap[acc & 63];
> +	} while (key < kend);

It might be good to eventually consolidate this code with the base64
scheme that fscrypt uses. Are they compatible? If so, then that can be
done in a later merge.
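
For comparison purposes, a userspace sketch of the encoding loop quoted
above, as I read it. The function name encode_key_b64() and the buffer
contract are assumptions for illustration only; the alphabet and the bit
packing are copied from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same 64-character alphabet as cachefiles_charmap in the quoted patch. */
static const char charmap[65] =
	"0123456789"
	"abcdefghijklmnopqrstuvwxyz"
	"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	"_-";

/*
 * Encode keylen (> 0) raw bytes into out, which must have room for
 * 2 + 4 * ((keylen + 2) / 3) + 1 bytes.  Returns the string length.
 */
static size_t encode_key_b64(const uint8_t *key, size_t keylen, char *out)
{
	size_t groups = (keylen + 2) / 3;
	size_t pad = groups * 3 - keylen;
	size_t len = 0, i = 0;
	int j;

	assert(keylen > 0);

	out[len++] = 'E';
	out[len++] = (char)('0' + pad);	/* number of missing raw bytes */

	while (i < keylen) {
		/* Pack up to three bytes, first byte in the low bits. */
		uint32_t acc = key[i++];

		if (i < keylen)
			acc |= (uint32_t)key[i++] << 8;
		if (i < keylen)
			acc |= (uint32_t)key[i++] << 16;

		/* Emit four 6-bit symbols, lowest bits first. */
		for (j = 0; j < 4; j++) {
			out[len++] = charmap[acc & 63];
			acc >>= 6;
		}
	}
	out[len] = '\0';
	return len;
}
```

With this, the single byte 0xff comes out as "E2-300". Standard base64
(RFC 4648) puts the first byte in the high bits of each group and uses a
different alphabet, so the output is not interchangeable with it symbol
for symbol. Whether the fscrypt variant lines up would need checking on
both of those points.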

> +
> +success:
> +	name[len] = 0;
> +	object->d_name = name;
> +	object->d_name_len = len;
> +	_leave(" = %s", object->d_name);
> +	return true;
> +}
> 
> 

-- 
Jeff Layton <jlayton@kernel.org>
