* [PATCH] unicode: refactor the rule for regenerating utf8data.h
From: Masahiro Yamada @ 2019-04-27 6:24 UTC
To: Olaf Weber, Gabriel Krisman Bertazi, Theodore Ts'o
Cc: Masahiro Yamada, Gabriel Krisman Bertazi, linux-doc,
linux-kbuild, linux-kernel, Jonathan Corbet, Michal Marek,
linux-fsdevel
scripts/mkutf8data is used only when regenerating utf8data.h,
which never happens in the normal kernel build. However, it is
built unconditionally whenever CONFIG_UNICODE is enabled.
Moreover, there is no good reason for it to reside in the scripts/
directory since it is only used in fs/unicode/.
Hence, move it from scripts/ to fs/unicode/.
In some cases, we bypass build artifacts in the normal build. The
conventianl way to do so is to surround the code with ifdef REGENERATE_*.
For example,
- 7373f4f83c71 ("kbuild: add implicit rules for parser generation")
- 6aaf49b495b4 ("crypto: arm,arm64 - Fix random regeneration of S_shipped")
I rewrote the rule in a more kbuild'ish style.
It works like this:
$ make REGENERATE_UTF8DATA=1 fs/unicode/
[ snip ]
HOSTCC fs/unicode/mkutf8data
GEN fs/unicode/utf8data.h
CC fs/unicode/utf8-norm.o
CC fs/unicode/utf8-core.o
AR fs/unicode/built-in.a
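The ifdef REGENERATE_* convention can be tried outside the kernel with a toy Makefile. This is only a sketch, and it assumes GNU make is installed: REGENERATE_DEMO, data.h and regen_demo are hypothetical names invented for the illustration, and nothing here is kbuild code.

```shell
#!/bin/sh
# Toy model of the "ifdef REGENERATE_*" convention.
# REGENERATE_DEMO, data.h and regen_demo are made-up names, not kernel code.
set -eu

regen_demo() {
    # Any arguments (e.g. REGENERATE_DEMO=1) are passed straight to make.
    demo_dir=$(mktemp -d)
    echo "shipped" > "$demo_dir/data.h"   # stands in for the checked-in header
    cat > "$demo_dir/Makefile" <<'EOF'
all: data.h ; @cat data.h

ifdef REGENERATE_DEMO
FORCE:
data.h: FORCE ; @echo "regenerated" > $@
endif
EOF
    status=0
    make -s -C "$demo_dir" "$@" || status=$?
    rm -rf "$demo_dir"
    return "$status"
}

regen_demo                     # normal build: the shipped file is used as-is
regen_demo REGENERATE_DEMO=1   # explicit request: the file is rebuilt first
```

Without the variable, make never sees a rule for data.h, so the shipped copy cannot be overwritten by accident.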
Also, I added mkutf8data to .gitignore and dontdiff.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
---
Documentation/dontdiff | 1 +
fs/unicode/.gitignore | 1 +
fs/unicode/Makefile | 37 +++++++++++++++++++++++++-----------
fs/unicode/README.utf8data | 10 +++++-----
{scripts => fs/unicode}/mkutf8data.c | 0
scripts/Makefile | 1 -
6 files changed, 33 insertions(+), 17 deletions(-)
create mode 100644 fs/unicode/.gitignore
rename {scripts => fs/unicode}/mkutf8data.c (100%)
diff --git a/Documentation/dontdiff b/Documentation/dontdiff
index ef25a06..bc353ad 100644
--- a/Documentation/dontdiff
+++ b/Documentation/dontdiff
@@ -176,6 +176,7 @@ mkprep
mkregtable
mktables
mktree
+mkutf8data
modpost
modules.builtin
modules.order
diff --git a/fs/unicode/.gitignore b/fs/unicode/.gitignore
new file mode 100644
index 0000000..44811fc
--- /dev/null
+++ b/fs/unicode/.gitignore
@@ -0,0 +1 @@
+mkutf8data
diff --git a/fs/unicode/Makefile b/fs/unicode/Makefile
index 671d31f..1a109b7 100644
--- a/fs/unicode/Makefile
+++ b/fs/unicode/Makefile
@@ -5,15 +5,30 @@ obj-$(CONFIG_UNICODE_NORMALIZATION_SELFTEST) += utf8-selftest.o
unicode-y := utf8-norm.o utf8-core.o
-# This rule is not invoked during the kernel compilation. It is used to
-# regenerate the utf8data.h header file.
-utf8data.h.new: *.txt $(objdir)/scripts/mkutf8data
- $(objdir)/scripts/mkutf8data \
- -a DerivedAge.txt \
- -c DerivedCombiningClass.txt \
- -p DerivedCoreProperties.txt \
- -d UnicodeData.txt \
- -f CaseFolding.txt \
- -n NormalizationCorrections.txt \
- -t NormalizationTest.txt \
+
+# To regenerate utf8data.h, run the following in the top directory:
+# $ make REGENERATE_UTF8DATA=1 fs/unicode/
+ifdef REGENERATE_UTF8DATA
+
+$(obj)/utf8-norm.o: $(obj)/utf8data.h
+
+quiet_cmd_utf8data = GEN $@
+ cmd_utf8data = $(obj)/mkutf8data \
+ -a $(src)/DerivedAge.txt \
+ -c $(src)/DerivedCombiningClass.txt \
+ -p $(src)/DerivedCoreProperties.txt \
+ -d $(src)/UnicodeData.txt \
+ -f $(src)/CaseFolding.txt \
+ -n $(src)/NormalizationCorrections.txt \
+ -t $(src)/NormalizationTest.txt \
-o $@
+
+$(obj)/utf8data.h: $(filter %.txt, $(cmd_utf8data)) $(obj)/mkutf8data FORCE
+ $(call if_changed,utf8data)
+
+always += utf8data.h
+no-clean-files += utf8data.h
+
+endif
+
+hostprogs-y += mkutf8data
diff --git a/fs/unicode/README.utf8data b/fs/unicode/README.utf8data
index eeb7561..155d56e 100644
--- a/fs/unicode/README.utf8data
+++ b/fs/unicode/README.utf8data
@@ -41,15 +41,15 @@ released version of the UCD can be found here:
http://www.unicode.org/Public/UCD/latest/
-To build the utf8data.h file, from a kernel tree that has been built,
-cd to this directory (fs/unicode) and run this command:
+To regenerate utf8data.h in the build process, pass REGENERATE_UTF8DATA=1
+from the command line. The easiest command to update it is this:
- make C=../.. objdir=../.. utf8data.h.new
+ make REGENERATE_UTF8DATA=1 fs/unicode/
-After sanity checking the newly generated utf8data.h.new file (the
+After sanity checking the newly generated utf8data.h file (the
version generated from the 12.1.0 UCD should be 4,109 lines long, and
have a total size of 324k) and/or comparing it with the older version
-of utf8data.h, rename it to utf8data.h.
+of utf8data.h, check it in.
If you are a kernel developer updating to a newer version of the
Unicode Character Database, please update this README.utf8data file
diff --git a/scripts/mkutf8data.c b/fs/unicode/mkutf8data.c
similarity index 100%
rename from scripts/mkutf8data.c
rename to fs/unicode/mkutf8data.c
diff --git a/scripts/Makefile b/scripts/Makefile
index b87e3e0..9d442ee 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -20,7 +20,6 @@ hostprogs-$(CONFIG_ASN1) += asn1_compiler
hostprogs-$(CONFIG_MODULE_SIG) += sign-file
hostprogs-$(CONFIG_SYSTEM_TRUSTED_KEYRING) += extract-cert
hostprogs-$(CONFIG_SYSTEM_EXTRA_CERTIFICATE) += insert-sys-cert
-hostprogs-$(CONFIG_UNICODE) += mkutf8data
HOSTCFLAGS_sortextable.o = -I$(srctree)/tools/include
HOSTCFLAGS_asn1_compiler.o = -I$(srctree)/include
--
2.7.4
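The README hunk above describes the sanity check on the regenerated header only in prose. A minimal sketch of that check follows; check_generated is a hypothetical helper and not part of the patch, and the expected line count is whatever the README states for the UCD version in use.

```shell
#!/bin/sh
# Hypothetical helper; the README only describes this check in prose.
set -eu

check_generated() {
    # $1 = generated file, $2 = expected line count
    lines=$(wc -l < "$1")
    bytes=$(wc -c < "$1")
    # Arithmetic expansion strips the padding some wc implementations emit.
    lines=$((lines))
    bytes=$((bytes))
    echo "$1: $lines lines, $bytes bytes"
    # The function's exit status reports whether the line count matched.
    [ "$lines" -eq "$2" ]
}

# Example, run from the top of a kernel tree after regenerating:
#   check_generated fs/unicode/utf8data.h 4109
```

Comparing the result against the old header with diff remains the more thorough check; the line count only catches gross breakage.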
* Re: [PATCH] unicode: refactor the rule for regenerating utf8data.h
From: Masahiro Yamada @ 2019-04-27 6:27 UTC
To: Olaf Weber, Gabriel Krisman Bertazi, Theodore Ts'o
Cc: Gabriel Krisman Bertazi, open list:DOCUMENTATION,
Linux Kbuild mailing list, Linux Kernel Mailing List,
Jonathan Corbet, Michal Marek, linux-fsdevel
On Sat, Apr 27, 2019 at 3:24 PM Masahiro Yamada
<yamada.masahiro@socionext.com> wrote:
>
> scripts/mkutf8data is used only when regenerating utf8data.h,
> which never happens in the normal kernel build. However, it is
> built unconditionally whenever CONFIG_UNICODE is enabled.
>
> Moreover, there is no good reason for it to reside in the scripts/
> directory since it is only used in fs/unicode/.
>
> Hence, move it from scripts/ to fs/unicode/.
>
> In some cases, we bypass build artifacts in the normal build. The
> conventianl way to do so is to surround the code with ifdef REGENERATE_*.
This is a typo.
conventianl -> conventional
> For example,
>
> - 7373f4f83c71 ("kbuild: add implicit rules for parser generation")
> - 6aaf49b495b4 ("crypto: arm,arm64 - Fix random regeneration of S_shipped")
>
> I rewrote the rule in a more kbuild'ish style.
>
> It works like this:
>
> $ make REGENERATE_UTF8DATA=1 fs/unicode/
> [ snip ]
> HOSTCC fs/unicode/mkutf8data
> GEN fs/unicode/utf8data.h
> CC fs/unicode/utf8-norm.o
> CC fs/unicode/utf8-core.o
> AR fs/unicode/built-in.a
>
> Also, I added mkutf8data to .gitignore and dontdiff.
>
> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
> ---
--
Best Regards
Masahiro Yamada
* Re: [PATCH] unicode: refactor the rule for regenerating utf8data.h
From: Theodore Ts'o @ 2019-04-28 17:43 UTC
To: Masahiro Yamada
Cc: Olaf Weber, Gabriel Krisman Bertazi, Gabriel Krisman Bertazi,
open list:DOCUMENTATION, Linux Kbuild mailing list,
Linux Kernel Mailing List, Jonathan Corbet, Michal Marek,
linux-fsdevel
Thanks for the suggestion and the patch! I agree it's much better
with your proposed change (and it gets mkutf8data out of your hair :-)
I did need to make one change to your patch in order for it to work
correctly with a build directory. That is, to support
make O=/build/ext4 REGENERATE_UTF8DATA=1 fs/unicode/
I'll apply it to the ext4 git tree with this change.
- Ted
diff --git a/fs/unicode/Makefile b/fs/unicode/Makefile
index 1a109b7a1da9..45955264ac04 100644
--- a/fs/unicode/Makefile
+++ b/fs/unicode/Makefile
@@ -14,13 +14,13 @@ $(obj)/utf8-norm.o: $(obj)/utf8data.h
quiet_cmd_utf8data = GEN $@
cmd_utf8data = $(obj)/mkutf8data \
- -a $(src)/DerivedAge.txt \
- -c $(src)/DerivedCombiningClass.txt \
- -p $(src)/DerivedCoreProperties.txt \
- -d $(src)/UnicodeData.txt \
- -f $(src)/CaseFolding.txt \
- -n $(src)/NormalizationCorrections.txt \
- -t $(src)/NormalizationTest.txt \
+ -a $(srctree)/$(src)/DerivedAge.txt \
+ -c $(srctree)/$(src)/DerivedCombiningClass.txt \
+ -p $(srctree)/$(src)/DerivedCoreProperties.txt \
+ -d $(srctree)/$(src)/UnicodeData.txt \
+ -f $(srctree)/$(src)/CaseFolding.txt \
+ -n $(srctree)/$(src)/NormalizationCorrections.txt \
+ -t $(srctree)/$(src)/NormalizationTest.txt \
-o $@
$(obj)/utf8data.h: $(filter %.txt, $(cmd_utf8data)) $(obj)/mkutf8data FORCE
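A rough sketch of why the $(srctree)/ prefix matters for an out-of-tree build. It assumes that make runs from the object tree and that $(src) is a relative path such as fs/unicode; the temporary directories and the resolve_input helper are hypothetical stand-ins, not kbuild internals.

```shell
#!/bin/sh
# Toy layout: temp directories stand in for the source and object trees.
# resolve_input is a made-up helper for illustration only.
set -eu

resolve_input() {
    # $1 = directory make runs in (the object tree for O= builds)
    # $2 = path as written in the rule
    if ( cd "$1" && [ -e "$2" ] ); then
        echo found
    else
        echo missing
    fi
}

srctree=$(mktemp -d)
objtree=$(mktemp -d)
src=fs/unicode
mkdir -p "$srctree/$src" "$objtree/$src"
: > "$srctree/$src/DerivedAge.txt"

resolve_input "$objtree" "$src/DerivedAge.txt"            # relative to the object tree
resolve_input "$objtree" "$srctree/$src/DerivedAge.txt"   # anchored at the source tree
resolve_input "$srctree" "$src/DerivedAge.txt"            # in-tree build: trees coincide

rm -rf "$srctree" "$objtree"
```

The in-tree case hides the problem because the working directory and the source tree are the same place.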
* Re: [PATCH] unicode: refactor the rule for regenerating utf8data.h
From: Masahiro Yamada @ 2019-04-29 3:09 UTC
To: Theodore Ts'o, Masahiro Yamada, Olaf Weber,
Gabriel Krisman Bertazi, Gabriel Krisman Bertazi,
open list:DOCUMENTATION, Linux Kbuild mailing list,
Linux Kernel Mailing List, Jonathan Corbet, Michal Marek,
linux-fsdevel
On Mon, Apr 29, 2019 at 2:44 AM Theodore Ts'o <tytso@mit.edu> wrote:
>
> Thanks for the suggestion and the patch! I agree it's much better
> with your proposed change (and it gets mkutf8data out of your hair :-)
Yes, this is my main motivation.
>
> I did need to make one change to your patch in order for it to work
> correctly with a build directory. That is, to support
>
> make O=/build/ext4 REGENERATE_UTF8DATA=1 fs/unicode/
>
> I'll apply it to the ext4 git tree with this change.
Thanks for the suggestion.
My first thought was "don't do this",
but it would be better to make it work with the O= option.
However, even with your fix-up, it won't work correctly.
If O= is given, the newly generated utf8data.h will be
put in the object tree.
It will coexist with the old checked-in utf8data.h,
and the old one will be included because
the include paths in the srctree are searched first.
I will send v2 shortly so that O= builds will work
correctly.
Thanks.
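The lookup problem described above can be modelled as a first-match-wins search over an ordered list of directories. This is a deliberately simplified sketch and not the compiler's actual algorithm; first_match and the temporary directories are hypothetical.

```shell
#!/bin/sh
# Simplified first-match-wins model of header lookup.
# first_match is a made-up helper, not how the compiler is implemented.
set -eu

first_match() {
    # $1 = header name; remaining args = directories in search order
    header=$1
    shift
    for dir in "$@"; do
        if [ -e "$dir/$header" ]; then
            echo "$dir/$header"
            return 0
        fi
    done
    return 1
}

srctree=$(mktemp -d)
objtree=$(mktemp -d)
echo "old, checked in" > "$srctree/utf8data.h"
echo "new, regenerated" > "$objtree/utf8data.h"

# Source tree searched first, as described above: the stale copy wins.
cat "$(first_match utf8data.h "$srctree" "$objtree")"

rm -rf "$srctree" "$objtree"
```

As long as a copy exists in the source tree and that tree is searched first, regenerating into the object tree has no effect on what gets compiled.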
> - Ted
>
> diff --git a/fs/unicode/Makefile b/fs/unicode/Makefile
> index 1a109b7a1da9..45955264ac04 100644
> --- a/fs/unicode/Makefile
> +++ b/fs/unicode/Makefile
> @@ -14,13 +14,13 @@ $(obj)/utf8-norm.o: $(obj)/utf8data.h
>
> quiet_cmd_utf8data = GEN $@
> cmd_utf8data = $(obj)/mkutf8data \
> - -a $(src)/DerivedAge.txt \
> - -c $(src)/DerivedCombiningClass.txt \
> - -p $(src)/DerivedCoreProperties.txt \
> - -d $(src)/UnicodeData.txt \
> - -f $(src)/CaseFolding.txt \
> - -n $(src)/NormalizationCorrections.txt \
> - -t $(src)/NormalizationTest.txt \
> + -a $(srctree)/$(src)/DerivedAge.txt \
> + -c $(srctree)/$(src)/DerivedCombiningClass.txt \
> + -p $(srctree)/$(src)/DerivedCoreProperties.txt \
> + -d $(srctree)/$(src)/UnicodeData.txt \
> + -f $(srctree)/$(src)/CaseFolding.txt \
> + -n $(srctree)/$(src)/NormalizationCorrections.txt \
> + -t $(srctree)/$(src)/NormalizationTest.txt \
> -o $@
>
> $(obj)/utf8data.h: $(filter %.txt, $(cmd_utf8data)) $(obj)/mkutf8data FORCE
--
Best Regards
Masahiro Yamada