From: Shreeya Patel <shreeya.patel@collabora.com>
To: tytso@mit.edu, adilger.kernel@dilger.ca, jaegeuk@kernel.org,
	chao@kernel.org, krisman@collabora.com, ebiggers@google.com,
	drosen@google.com, ebiggers@kernel.org, yuchao0@huawei.com
Cc: linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net,
	linux-fsdevel@vger.kernel.org, kernel@collabora.com,
	andre.almeida@collabora.com
Subject: [PATCH v8 0/4] Make UTF-8 encoding loadable
Date: Sat, 24 Apr 2021 02:21:32 +0530
Message-ID: <20210423205136.1015456-1-shreeya.patel@collabora.com>

utf8data.h_shipped contains a large, auto-generated database table: the
decoding trie used by the unicode normalization functions. There is no
need to carry this table in the kernel unless a filesystem requires it
at boot time.

The goal is to make the UTF-8 encoding loadable by converting it into a
module and adding a unicode subsystem layer between the filesystems and
the utf8 module.
This layer loads the module whenever a filesystem that needs unicode is
mounted. Alternatively, utf8 can be built into the kernel in case a
filesystem requires it at boot time.
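
As a rough illustration of the load-on-mount idea, here is a userspace
sketch, not the code from the series. Every name in it (struct
sketch_module, sketch_request_module(), sketch_unicode_load(),
sketch_unicode_unload()) is hypothetical, and the "module loader" is
just a flag. It only shows that the first mount triggers a load, later
mounts take a reference, and a built-in encoding never needs a load
request.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a loadable encoding module. */
struct sketch_module {
	const char *name;
	bool loaded;		/* true if built in or already loaded */
	int load_requests;	/* how often the loader was asked to act */
	int refcount;		/* mounted filesystems using the module */
};

/* Pretend loader: the real work belongs to the kernel module loader. */
int sketch_request_module(struct sketch_module *mod)
{
	mod->load_requests++;
	mod->loaded = true;
	return 0;
}

/* Called at mount time: load on first use, then take a reference. */
int sketch_unicode_load(struct sketch_module *mod)
{
	if (!mod->loaded && sketch_request_module(mod) != 0)
		return -ENODEV;	/* error code chosen for the sketch */

	mod->refcount++;
	return 0;
}

/* Called at unmount time: drop the reference taken at mount. */
void sketch_unicode_unload(struct sketch_module *mod)
{
	if (mod->refcount > 0)
		mod->refcount--;
}
```

Setting loaded to true in the initializer models the built-in case: the
load path is skipped and only the reference count changes.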

Currently, only the UTF-8 encoding is supported. If other encodings are
supported in the future, the layer would be responsible for loading the
desired encoding module.

The 1st patch in the series resolves the warning reported by the kernel
test robot by using strscpy() instead of strncpy().
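
For readers less familiar with the difference, the following userspace
sketch mimics the strscpy() contract: the destination is always
NUL-terminated and truncation is reported as -E2BIG. sketch_strscpy()
and sketch_copy_version() are hypothetical names made up for this
example. They are not the kernel implementation and not the code
changed by patch 1.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/*
 * Userspace stand-in for strscpy(): copy at most size - 1 bytes,
 * always NUL-terminate when size > 0, and return -E2BIG if the
 * source did not fit. On success, return the number of bytes copied.
 */
ssize_t sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t len;

	if (size == 0)
		return -E2BIG;

	len = strnlen(src, size);
	if (len == size) {
		/* Source too long: truncate, terminate, report it. */
		memcpy(dst, src, size - 1);
		dst[size - 1] = '\0';
		return -E2BIG;
	}

	memcpy(dst, src, len + 1);
	return (ssize_t)len;
}

/*
 * Hypothetical caller: reject an overly long version string instead
 * of silently working with a truncated copy.
 */
int sketch_copy_version(char *dst, size_t size, const char *version)
{
	if (sketch_strscpy(dst, version, size) < 0)
		return -EINVAL;
	return 0;
}
```

With strncpy() the caller would have to terminate the buffer by hand
and would get no indication that the input had been cut short.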

Unicode is the subsystem and utf8 is a character encoding for the
subsystem. Hence the 2nd and 3rd patches in the series rename the
functions and the file from utf8 to unicode, to make the difference
between the UTF-8 module and the unicode layer easier to understand.

The last patch in the series adds the layer and the utf8 module. It also
uses static calls, which give a performance benefit compared to indirect
calls through function pointers.
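
To make the dispatch idea concrete, here is a userspace sketch of a
layer that forwards to whichever implementation is currently
registered and returns an error while none is. Note the limitation:
userspace C has no static calls, so the sketch uses an ordinary
function pointer, which is exactly the indirect call that the kernel's
static_call() mechanism avoids by patching the call site. All names,
the choice of -EOPNOTSUPP, and the ASCII-only check are assumptions
made for the example.

```c
#include <errno.h>
#include <stddef.h>

typedef int (*sketch_validate_fn)(const char *name, size_t len);

/* Default target while no encoding module is loaded: report an error. */
int sketch_validate_default(const char *name, size_t len)
{
	(void)name;
	(void)len;
	return -EOPNOTSUPP;
}

/*
 * Hypothetical module implementation. It accepts 7-bit ASCII only and
 * is not real UTF-8 validation.
 */
int sketch_validate_ascii(const char *name, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if ((unsigned char)name[i] > 0x7f)
			return -EINVAL;
	return 0;
}

/* Current target of the layer; starts out at the default. */
sketch_validate_fn sketch_validate_target = sketch_validate_default;

/* Module load: point the layer at the module's implementation. */
void sketch_module_init(void)
{
	sketch_validate_target = sketch_validate_ascii;
}

/* Module unload: restore the default so later calls fail cleanly. */
void sketch_module_exit(void)
{
	sketch_validate_target = sketch_validate_default;
}

/* Entry point the filesystems would call. */
int sketch_unicode_validate(const char *name, size_t len)
{
	return sketch_validate_target(name, len);
}
```

Restoring the default on unload means a caller never jumps through a
stale pointer into code that is no longer present.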

---
Changes in v8
  - Improve the commit message of patch 1 to describe how
    overly-long strings should be handled.
  - Improve the commit messages in patches 2/3/4 to better explain
    the use of the built-in option.
  - Improve the help text in Kconfig to avoid contradictory
    statements.
  - Make spinlock definition static.
  - Use int instead of bool to avoid gcc warning.
  - Add a comment describing why we are using
    try_then_request_module() instead of request_module().

Changes in v7
  - Update the help text in Kconfig
  - Handle the unicode_load_static_call function failure by decrementing
    the reference.
  - Correct the code for handling built-in utf8 option as well.
  - Correct the synchronization for accessing utf8mod.
  - Make changes to unicode_unload() for handling the situation where
    utf8mod != NULL and um == NULL.

Changes in v6
  - Add spinlock to protect utf8mod and avoid NULL pointer
    dereference.
  - Change the static call function names for being consistent with
    kernel coding style.
  - Merge the unicode_load_module function with unicode_load as it is
    not really needed to have a separate function.
  - Use try_then_request_module() instead of request_module() to avoid
    loading the module when it is already loaded.
  - Improve the commit message.

Changes in v5
  - Remove patch which adds NULL check in ext4/super.c and f2fs/super.c
    before calling unicode_unload().
  - Rename global variables and default static call functions for better
    understanding
  - Make only config UNICODE_UTF8 visible and config UNICODE to be always
    enabled provided UNICODE_UTF8 is enabled.  
  - Improve the documentation for Kconfig
  - Improve the commit message.
 
Changes in v4
  - Return error from the static calls instead of doing nothing and
    succeeding even without loading the module.
  - Remove the complete usage of utf8_ops and use static calls at all
    places.
  - Restore the static calls to default values when module is unloaded.
  - Decrement the reference of module after calling the unload function.
  - Remove spinlock as there will be no race conditions after removing
    utf8_ops.

Changes in v3
  - Add a patch which checks if utf8 is loaded before calling utf8_unload()
    in ext4 and f2fs filesystems
  - Return error if strscpy() returns value < 0
  - Correct the conditions to prevent NULL pointer dereference while
    accessing functions via utf8_ops variable.
  - Add spinlock to avoid race conditions.
  - Use static_call() for preventing speculative execution attacks.

Changes in v2
  - Remove the duplicate file from the last patch.
  - Make the wrapper functions inline.
  - Remove msleep and use try_module_get() and module_put()
    for ensuring that module is loaded correctly and also
    doesn't get unloaded while in use.
  - Resolve the warning reported by kernel test robot.
  - Resolve all the checkpatch.pl warnings.

Shreeya Patel (4):
  fs: unicode: Use strscpy() instead of strncpy()
  fs: unicode: Rename function names from utf8 to unicode
  fs: unicode: Rename utf8-core file to unicode-core
  fs: unicode: Add utf8 module and a unicode layer

 fs/ext4/hash.c                             |   2 +-
 fs/ext4/namei.c                            |  12 +-
 fs/ext4/super.c                            |   6 +-
 fs/f2fs/dir.c                              |  12 +-
 fs/f2fs/super.c                            |   6 +-
 fs/libfs.c                                 |   6 +-
 fs/unicode/Kconfig                         |  26 ++-
 fs/unicode/Makefile                        |   5 +-
 fs/unicode/unicode-core.c                  | 175 +++++++++++++++++++++
 fs/unicode/{utf8-core.c => unicode-utf8.c} |  98 +++++++-----
 fs/unicode/utf8-selftest.c                 |   8 +-
 include/linux/unicode.h                    | 100 ++++++++++--
 12 files changed, 374 insertions(+), 82 deletions(-)
 create mode 100644 fs/unicode/unicode-core.c
 rename fs/unicode/{utf8-core.c => unicode-utf8.c} (57%)

-- 
2.30.2


Thread overview: 30+ messages
2021-04-23 20:51 Shreeya Patel [this message]
2021-04-23 20:51 ` [f2fs-dev] [PATCH v8 0/4] Make UTF-8 encoding loadable Shreeya Patel
2021-04-23 20:51 ` [PATCH v8 1/4] fs: unicode: Use strscpy() instead of strncpy() Shreeya Patel
2021-04-23 20:51   ` [f2fs-dev] " Shreeya Patel
2021-04-23 20:51 ` [PATCH v8 2/4] fs: unicode: Rename function names from utf8 to unicode Shreeya Patel
2021-04-23 20:51   ` [f2fs-dev] " Shreeya Patel
2021-04-23 20:51 ` [PATCH v8 3/4] fs: unicode: Rename utf8-core file to unicode-core Shreeya Patel
2021-04-23 20:51   ` [f2fs-dev] " Shreeya Patel
2021-04-23 20:51 ` [PATCH v8 4/4] fs: unicode: Add utf8 module and a unicode layer Shreeya Patel
2021-04-23 20:51   ` [f2fs-dev] " Shreeya Patel
2021-04-27  6:29   ` Christoph Hellwig
2021-04-27  6:29     ` [f2fs-dev] " Christoph Hellwig
2021-04-27 10:09     ` Shreeya Patel
2021-04-27 10:09       ` [f2fs-dev] " Shreeya Patel
2021-04-27 14:50       ` Theodore Ts'o
2021-04-27 14:50         ` [f2fs-dev] " Theodore Ts'o
2021-04-27 15:06         ` Gabriel Krisman Bertazi
2021-04-27 15:06           ` [f2fs-dev] " Gabriel Krisman Bertazi
2021-04-28 14:12           ` Theodore Ts'o
2021-04-28 14:12             ` [f2fs-dev] " Theodore Ts'o
2021-04-28 18:58             ` Gabriel Krisman Bertazi
2021-04-28 18:58               ` [f2fs-dev] " Gabriel Krisman Bertazi
     [not found]               ` <7caab939-2800-0cc2-7b65-345af3fce73d@collabora.com>
2021-05-11  4:35                 ` Christoph Hellwig
2021-05-11  4:35                   ` [f2fs-dev] " Christoph Hellwig
2021-05-20 20:19                   ` Shreeya Patel
2021-05-20 20:19                     ` [f2fs-dev] " Shreeya Patel
2021-06-03  0:07                     ` Gabriel Krisman Bertazi
2021-06-03  0:07                       ` [f2fs-dev] " Gabriel Krisman Bertazi
2021-06-16  4:09                       ` Christoph Hellwig
2021-06-16  4:09                         ` [f2fs-dev] " Christoph Hellwig
