All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jiang Xin <worldhello.net@gmail.com>
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Git List" <git@vger.kernel.org>,
	"Jiang Xin" <zhiyou.jx@alibaba-inc.com>,
	"Alexander Shopov" <ash@kambanaria.org>,
	"Jordi Mas" <jmas@softcatala.org>,
	"Matthias Rüster" <matthias.ruester@gmail.com>,
	"Jimmy Angelakos" <vyruss@hellug.gr>,
	"Christopher Díaz" <christopher.diaz.riv@gmail.com>,
	"Jean-Noël Avila" <jn.avila@free.fr>,
	"Bagas Sanjaya" <bagasdotme@gmail.com>,
	"Alessandro Menti" <alessandro.menti@alessandromenti.it>,
	"Gwan-gyeong Mun" <elongbug@gmail.com>, Arusekk <arek_koz@o2.pl>,
	"Daniel Santos" <dacs.git@brilhante.top>,
	"Dimitriy Ryazantcev" <DJm00n@mail.ru>,
	"Peter Krefting" <peter@softwolves.pp.se>,
	"Emir SARI" <bitigchi@me.com>,
	"Trần Ngọc Quân" <vnwildman@gmail.com>,
	"Fangyi Zhou" <me@fangyi.io>, "Yi-Jyun Pan" <pan93412@gmail.com>
Subject: Re: [PATCH] Makefile: dedup git-ls-files output to prevent duplicate targets
Date: Thu, 26 May 2022 12:00:04 +0200	[thread overview]
Message-ID: <220526.86tu9c625s.gmgdl@evledraar.gmail.com> (raw)
In-Reply-To: <CANYiYbGn08N_9bOw+ss6L4U_iTomc-08_961bk40eq1BnEstiw@mail.gmail.com>


On Thu, May 26 2022, Jiang Xin wrote:

> On Thu, May 26, 2022 at 2:23 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Jiang Xin <worldhello.net@gmail.com> writes:
>>
>> >>         )
>> >> -FOUND_SOURCE_FILES := $(shell $(SOURCES_CMD))
>> >> +FOUND_SOURCE_FILES := $(sort $(shell $(SOURCES_CMD)))
>> >>
>> >>  FOUND_C_SOURCES = $(filter %.c,$(FOUND_SOURCE_FILES))
>> >>  FOUND_H_SOURCES = $(filter %.h,$(FOUND_SOURCE_FILES))
>> >>
>> >
>> > If I disabled the git-ls-files command like below,
>> >
>> >     @@ -846,7 +846,7 @@ generated-hdrs: $(GENERATED_H)
>> >...
>> > This is because the three generated header files (defined in
>> > $(GENERATED_H)) are also included in the result of "SOURCES_CMD". We
>> > can fix this by sorting LOCALIZED_C:
>> >
>> >     -LOCALIZED_C = $(FOUND_C_SOURCES) $(FOUND_H_SOURCES) $(SCALAR_SOURCES) \
>> >     -             $(GENERATED_H)
>> >     +LOCALIZED_C = $(sort $(FOUND_C_SOURCES) $(FOUND_H_SOURCES)
>> > $(SCALAR_SOURCES) \
>> >     +               $(GENERATED_H))
>>
>> If you make FOUND_SOURCE_FILES unique upfront, the at least there
>> wouldn't be any duplicates there.  Do you mean that some of what is
>> in FOUND_SOURCE_FILES appear in either SCALAR_SOURCES or GENERATED_H?
>
> Yes. If find source files using "git-ls-files", the generated headers
> are not included in FOUND_SOURCE_FILES, but this is not the case for
> the find command.
>
>> If not, I think deduplicating near the source of the issue, i.e.
>>
>>   FOUND_SOURCE_FILES := $(sort $(shell $(SOURCES_CMD)))
>
> If we pass the "--deduplicate" option to git-ls-files, we can make
> sure files are unique in FOUND_SOURCE_FILES. So sorting
> FOUND_SOURCE_FILES becomes unnecessary. But FOUND_SOURCE_FILES may
> contain generated files if using find instead of git-ls-files in
> SOURCES_CMD, that means sort FOUND_SOURCE_FILES is not enough. Instead
> of sorting it, we may want to filter-out the generated files from it,
> like:
>
>     FOUND_SOURCE_FILES := $(filter-out $(GENERATED_H),$(shell $(SOURCES_CMD)))
>
> Exclude the several generated headers by add extra fixed options to
> find command is not good, but we may need to exclude the ".build"
> directory from the find command in SOURCES_CMD, in case we may
> duplicate C files in it in future version.
>
>
>     --- a/Makefile
>     +++ b/Makefile
>     @@ -846,7 +846,7 @@ generated-hdrs: $(GENERATED_H)
>     ## Exhaustive lists of our source files, either dynamically generated,
>     ## or hardcoded.
>     SOURCES_CMD = ( \
>     -       git ls-files \
>     +       git ls-files --deduplicate \
>                    '*.[hcS]' \
>                    '*.sh' \
>                    ':!*[tp][0-9][0-9][0-9][0-9]*' \
>     @@ -857,12 +857,13 @@ SOURCES_CMD = ( \
>                    -o \( -name '[tp][0-9][0-9][0-9][0-9]*' -prune \) \
>                    -o \( -name contrib -type d -prune \) \
>                    -o \( -name build -type d -prune \) \
>     +               -o \( -name .build -type d -prune \) \

This change is good and something we should do in any case, I think (and
this is my fault) that due to this we'd have broken other things,
e.g. included these in the TAGS file if we don't have "git ls-files".

>                    -o \( -name 'trash*' -type d -prune \) \
>                    -o \( -name '*.[hcS]' -type f -print \) \
>                    -o \( -name '*.sh' -type f -print \) \
>                    | sed -e 's|^\./||' \
>            )
>     -FOUND_SOURCE_FILES := $(shell $(SOURCES_CMD))
>     +FOUND_SOURCE_FILES := $(filter-out $(GENERATED_H),$(shell $(SOURCES_CMD)))
>
>     FOUND_C_SOURCES = $(filter %.c,$(FOUND_SOURCE_FILES))
>     FOUND_H_SOURCES = $(filter %.h,$(FOUND_SOURCE_FILES))
>
>> may be sufficient.  Deduplicating near the consumer, like
>> LOCALIZED_C, may force us to dedup all the consumers of it (e.g.
>> LOCALIZED_C is not the sole consumer of FOUND_C_SOURCES; you'd need
>> to sort the input to COCCI_SOURCES, for example).
>
> If we apply the above patch, sorting LOCALIZED_C is not necessary.

Per earlier feedback in
https://lore.kernel.org/git/220519.86tu9l6fw4.gmgdl@evledraar.gmail.com/
this all seesm a bit too complex, especially now.

I pointed out then that with --sort-by-file added we:

 * Don't group the translations by C/SH/Perl anymore
 * Change the sort order within files, to be line/sorted instead of
   line/order (i.e. first occurring translations first)

I suggested then to just use $(sort) on the respective lists.

So why not just:

 1. Switch to the $(FOUND_C_SOURCES) (good)
 2. Filter that by C/Perl/SH as before (just a simple $(filter)
 3. $(sort) that (which as noted, also de-dupes it)

Then we don't have any of the behavior change of --sort-by-file, and we
don't have to carefully curate the ls-files/find commands to not include
duplicates (although as seen here that seems to have been a useful
canary in the "find" case).

  reply	other threads:[~2022-05-26 10:04 UTC|newest]

Thread overview: 110+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-03 13:23 [PATCH 0/9] Incremental po/git.pot update and new l10n workflow Jiang Xin
2022-05-03 13:23 ` [PATCH 1/9] Makefile: sort "po/git.pot" by file location Jiang Xin
2022-05-03 13:23 ` [PATCH 2/9] Makefile: generate "po/git.pot" from stable LOCALIZED_C Jiang Xin
2022-05-03 13:23 ` [PATCH 3/9] Makefile: have "make pot" not "reset --hard" Jiang Xin
2022-05-03 13:23 ` [PATCH 4/9] i18n CI: stop allowing non-ASCII source messages in po/git.pot Jiang Xin
2022-05-03 13:23 ` [PATCH 5/9] po/git.pot: don't check in result of "make pot" Jiang Xin
2022-05-03 13:23 ` [PATCH 6/9] po/git.pot: remove this now generated file, see preceding commit Jiang Xin
2022-05-03 13:23 ` [PATCH 7/9] Makefile: add "po-update" rule to update po/XX.po Jiang Xin
2022-05-03 13:23 ` [PATCH 8/9] Makefile: add "po-init" rule to initialize po/XX.po Jiang Xin
2022-05-03 13:23 ` [PATCH 9/9] l10n: Document the new l10n workflow Jiang Xin
2022-05-03 14:07 ` [PATCH 0/9] Incremental po/git.pot update and " Peter Krefting
2022-05-04 12:41   ` Jiang Xin
2022-05-04 14:35 ` Junio C Hamano
2022-05-04 14:51   ` Daniel Santos
2022-05-05  0:20     ` Jiang Xin
2022-05-05 22:00       ` Daniel Santos
2022-05-05 22:49         ` Junio C Hamano
2022-05-06  0:50         ` Jiang Xin
2022-05-05  0:07   ` Jiang Xin
2022-05-04 17:58 ` Junio C Hamano
2022-05-19  8:15 ` [PATCH v2 " Jiang Xin
2022-05-19 10:28   ` Ævar Arnfjörð Bjarmason
2022-05-19 14:32     ` Jiang Xin
2022-05-19 14:41       ` Ævar Arnfjörð Bjarmason
2022-05-23  1:25   ` [PATCH v3 " Jiang Xin
2022-05-23  7:15     ` Ævar Arnfjörð Bjarmason
2022-05-23  8:12       ` Ævar Arnfjörð Bjarmason
2022-05-23 13:42         ` Jiang Xin
2022-05-23 14:38           ` Ævar Arnfjörð Bjarmason
2022-05-23 16:13             ` Jiang Xin
2022-05-23  8:26       ` Jiang Xin
2022-05-23 15:21     ` [PATCH v4 " Jiang Xin
2022-05-23 18:19       ` Junio C Hamano
2022-05-26 14:50       ` [PATCH v5 00/10] " Jiang Xin
2022-05-26 14:50       ` [PATCH v5 01/10] Makefile: sort source files before feeding to xgettext Jiang Xin
2022-05-26 14:50       ` [PATCH v5 02/10] Makefile: generate "po/git.pot" from stable LOCALIZED_C Jiang Xin
2022-05-26 14:50       ` [PATCH v5 03/10] Makefile: have "make pot" not "reset --hard" Jiang Xin
2022-05-26 14:50       ` [PATCH v5 04/10] i18n CI: stop allowing non-ASCII source messages in po/git.pot Jiang Xin
2022-05-26 14:50       ` [PATCH v5 05/10] Makefile: remove duplicate and unwanted files in FOUND_SOURCE_FILES Jiang Xin
2022-05-26 14:50       ` [PATCH v5 06/10] po/git.pot: this is now a generated file Jiang Xin
2022-05-26 17:32         ` Junio C Hamano
2022-05-26 14:50       ` [PATCH v5 07/10] po/git.pot: don't check in result of "make pot" Jiang Xin
2022-05-26 14:50       ` [PATCH v5 08/10] Makefile: add "po-update" rule to update po/XX.po Jiang Xin
2022-05-26 14:50       ` [PATCH v5 09/10] Makefile: add "po-init" rule to initialize po/XX.po Jiang Xin
2022-05-26 14:50       ` [PATCH v5 10/10] l10n: Document the new l10n workflow Jiang Xin
2022-05-23 15:21     ` [PATCH v4 1/9] Makefile: sort "po/git.pot" by file location Jiang Xin
2022-05-23 15:21     ` [PATCH v4 2/9] Makefile: generate "po/git.pot" from stable LOCALIZED_C Jiang Xin
2022-05-23 15:21     ` [PATCH v4 3/9] Makefile: have "make pot" not "reset --hard" Jiang Xin
2022-05-25 22:19       ` Junio C Hamano
2022-05-25 22:24         ` Junio C Hamano
2022-05-26  1:10           ` Jiang Xin
2022-05-26  2:15           ` [PATCH] Makefile: dedup git-ls-files output to prevent duplicate targets Jiang Xin
2022-05-26  4:02             ` Junio C Hamano
2022-05-26  6:06               ` Jiang Xin
2022-05-26  6:23                 ` Junio C Hamano
2022-05-26  7:04                   ` Jiang Xin
2022-05-26 10:00                     ` Ævar Arnfjörð Bjarmason [this message]
2022-05-26 11:06                       ` Jiang Xin
2022-05-26 17:18                       ` Junio C Hamano
2022-05-26 18:25                         ` Ævar Arnfjörð Bjarmason
2022-05-26 19:00                           ` Junio C Hamano
2022-05-26 19:17                             ` Ævar Arnfjörð Bjarmason
2022-05-23 15:21     ` [PATCH v4 4/9] i18n CI: stop allowing non-ASCII source messages in po/git.pot Jiang Xin
2022-05-23 15:21     ` [PATCH v4 5/9] po/git.pot: this is now a generated file Jiang Xin
2022-05-23 15:21     ` [PATCH v4 6/9] po/git.pot: don't check in result of "make pot" Jiang Xin
2022-05-23 15:21     ` [PATCH v4 7/9] Makefile: add "po-update" rule to update po/XX.po Jiang Xin
2022-05-23 15:21     ` [PATCH v4 8/9] Makefile: add "po-init" rule to initialize po/XX.po Jiang Xin
2022-05-23 15:21     ` [PATCH v4 9/9] l10n: Document the new l10n workflow Jiang Xin
2022-05-23  1:25   ` [PATCH v3 1/9] Makefile: sort "po/git.pot" by file location Jiang Xin
2022-05-23  8:05     ` Junio C Hamano
2022-05-23  8:50       ` Jiang Xin
2022-05-23  1:25   ` [PATCH v3 2/9] Makefile: generate "po/git.pot" from stable LOCALIZED_C Jiang Xin
2022-05-23  8:05     ` Junio C Hamano
2022-05-23  1:25   ` [PATCH v3 3/9] Makefile: have "make pot" not "reset --hard" Jiang Xin
2022-05-23  7:28     ` Ævar Arnfjörð Bjarmason
2022-05-23 15:00       ` Jiang Xin
2022-05-24  0:56       ` Jiang Xin
2022-05-23  8:15     ` Junio C Hamano
2022-05-23  9:37       ` Jiang Xin
2022-05-23  1:25   ` [PATCH v3 4/9] i18n CI: stop allowing non-ASCII source messages in po/git.pot Jiang Xin
2022-05-23  1:25   ` [PATCH v3 5/9] po/git.pot: this is now a generated file Jiang Xin
2022-05-23  1:25   ` [PATCH v3 6/9] po/git.pot: don't check in result of "make pot" Jiang Xin
2022-05-23  7:26     ` Ævar Arnfjörð Bjarmason
2022-05-23  8:30       ` Jiang Xin
2022-05-23  8:35         ` Jiang Xin
2022-05-23  9:28         ` Ævar Arnfjörð Bjarmason
2022-05-23  1:25   ` [PATCH v3 7/9] Makefile: add "po-update" rule to update po/XX.po Jiang Xin
2022-05-23  1:25   ` [PATCH v3 8/9] Makefile: add "po-init" rule to initialize po/XX.po Jiang Xin
2022-05-23  1:25   ` [PATCH v3 9/9] l10n: Document the new l10n workflow Jiang Xin
2022-05-19  8:15 ` [PATCH v2 1/9] Makefile: sort "po/git.pot" by file location Jiang Xin
2022-05-19  8:53   ` Ævar Arnfjörð Bjarmason
2022-05-19 12:41     ` Jiang Xin
2022-05-19  8:15 ` [PATCH v2 2/9] Makefile: generate "po/git.pot" from stable LOCALIZED_C Jiang Xin
2022-05-19  9:18   ` Ævar Arnfjörð Bjarmason
2022-05-19 12:48     ` Jiang Xin
2022-05-19  8:15 ` [PATCH v2 3/9] Makefile: have "make pot" not "reset --hard" Jiang Xin
2022-05-19  9:43   ` Ævar Arnfjörð Bjarmason
2022-05-19 13:19     ` Jiang Xin
2022-05-19 14:06       ` Ævar Arnfjörð Bjarmason
2022-05-19  8:15 ` [PATCH v2 4/9] i18n CI: stop allowing non-ASCII source messages in po/git.pot Jiang Xin
2022-05-19 10:02   ` Ævar Arnfjörð Bjarmason
2022-05-19  8:15 ` [PATCH v2 5/9] po/git.pot: this is now a generated file Jiang Xin
2022-05-19  8:15 ` [PATCH v2 6/9] po/git.pot: don't check in result of "make pot" Jiang Xin
2022-05-19  8:15 ` [PATCH v2 7/9] Makefile: add "po-update" rule to update po/XX.po Jiang Xin
2022-05-19 10:07   ` Ævar Arnfjörð Bjarmason
2022-05-19  8:15 ` [PATCH v2 8/9] Makefile: add "po-init" rule to initialize po/XX.po Jiang Xin
2022-05-19 10:22   ` Ævar Arnfjörð Bjarmason
2022-05-19  8:15 ` [PATCH v2 9/9] l10n: Document the new l10n workflow Jiang Xin
2022-05-19 17:18   ` Junio C Hamano
2022-05-21 15:06     ` Jiang Xin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=220526.86tu9c625s.gmgdl@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=DJm00n@mail.ru \
    --cc=alessandro.menti@alessandromenti.it \
    --cc=arek_koz@o2.pl \
    --cc=ash@kambanaria.org \
    --cc=bagasdotme@gmail.com \
    --cc=bitigchi@me.com \
    --cc=christopher.diaz.riv@gmail.com \
    --cc=dacs.git@brilhante.top \
    --cc=elongbug@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jmas@softcatala.org \
    --cc=jn.avila@free.fr \
    --cc=matthias.ruester@gmail.com \
    --cc=me@fangyi.io \
    --cc=pan93412@gmail.com \
    --cc=peter@softwolves.pp.se \
    --cc=vnwildman@gmail.com \
    --cc=vyruss@hellug.gr \
    --cc=worldhello.net@gmail.com \
    --cc=zhiyou.jx@alibaba-inc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.