All of lore.kernel.org
 help / color / mirror / Atom feed
* [Buildroot] [PATCH] core/pkg-generic: only save latest package list
@ 2018-04-29 13:07 John Keeping
  2018-04-30 16:47 ` Yann E. MORIN
  0 siblings, 1 reply; 16+ messages in thread
From: John Keeping @ 2018-04-29 13:07 UTC (permalink / raw)
  To: buildroot

When rebuilding a package, simply appending the package's file list to
the global list means that the package list grows for every rebuild, as
does the time taken to check for files installed by multiple packages.
Furthermore, we get false positives where a file is reported as being
installed by multiple copies of the same package.

With this approach we may end up with orphaned files in the target
filesystem if a package that has been updated and rebuilt no longer
installs the same set of files, but we know that only a clean build will
produce reliable results.  In fact it may be helpful to identify these
orphaned files as evidence that the build is not clean.

Signed-off-by: John Keeping <john@metanate.com>
---
 package/pkg-generic.mk | 1 +
 1 file changed, 1 insertion(+)

diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
index 1c9dd1d734..edc2c9349c 100644
--- a/package/pkg-generic.mk
+++ b/package/pkg-generic.mk
@@ -64,6 +64,7 @@ GLOBAL_INSTRUMENTATION_HOOKS += step_time
 # $(3): suffix of file  (optional)
 define step_pkg_size_inner
 	cd $(2); \
+	$(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt; \
 	find . \( -type f -o -type l \) \
 		-newer $($(PKG)_DIR)/.stamp_built \
 		-exec printf '$(1),%s\n' {} + \
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Buildroot] [PATCH] core/pkg-generic: only save latest package list
  2018-04-29 13:07 [Buildroot] [PATCH] core/pkg-generic: only save latest package list John Keeping
@ 2018-04-30 16:47 ` Yann E. MORIN
  2018-04-30 16:56   ` John Keeping
  0 siblings, 1 reply; 16+ messages in thread
From: Yann E. MORIN @ 2018-04-30 16:47 UTC (permalink / raw)
  To: buildroot

John, All,

On 2018-04-29 14:07 +0100, John Keeping spake thusly:
> When rebuilding a package, simply appending the package's file list to
> the global list means that the package list grows for every rebuild, as
> does the time taken to check for files installed by multiple packages.
> Furthermore, we get false positives where a file is reported as being
> installed by multiple copies of the same package.
> 
> With this approach we may end up with orphaned files in the target
> filesystem if a package that has been updated and rebuilt no longer
> installs the same set of files, but we know that only a clean build will
> produce reliable results.  In fact it may be helpful to identify these
> orphaned files as evidence that the build is not clean.
> 
> Signed-off-by: John Keeping <john@metanate.com>
> ---
>  package/pkg-generic.mk | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> index 1c9dd1d734..edc2c9349c 100644
> --- a/package/pkg-generic.mk
> +++ b/package/pkg-generic.mk
> @@ -64,6 +64,7 @@ GLOBAL_INSTRUMENTATION_HOOKS += step_time
>  # $(3): suffix of file  (optional)
>  define step_pkg_size_inner
>  	cd $(2); \
> +	$(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt; \

Since BUILD_DIR is a fully-qualified path, I would have put it as the
first line of the macros, and thus it would not have required the
trailing '\':

    $(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
    cd $(2); \
    find [...]

Otherwise:

Reviewed-by: "Yann E. MORIN" <yann.morin.1998@free.fr>

Regards,
Yann E. MORIN.

>  	find . \( -type f -o -type l \) \
>  		-newer $($(PKG)_DIR)/.stamp_built \
>  		-exec printf '$(1),%s\n' {} + \
> -- 
> 2.17.0
> 

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Buildroot] [PATCH] core/pkg-generic: only save latest package list
  2018-04-30 16:47 ` Yann E. MORIN
@ 2018-04-30 16:56   ` John Keeping
  2018-04-30 19:41     ` Yann E. MORIN
  0 siblings, 1 reply; 16+ messages in thread
From: John Keeping @ 2018-04-30 16:56 UTC (permalink / raw)
  To: buildroot

On Mon, 30 Apr 2018 18:47:28 +0200
"Yann E. MORIN" <yann.morin.1998@free.fr> wrote:

> John, All,
> 
> On 2018-04-29 14:07 +0100, John Keeping spake thusly:
> > When rebuilding a package, simply appending the package's file list
> > to the global list means that the package list grows for every
> > rebuild, as does the time taken to check for files installed by
> > multiple packages. Furthermore, we get false positives where a file
> > is reported as being installed by multiple copies of the same
> > package.
> > 
> > With this approach we may end up with orphaned files in the target
> > filesystem if a package that has been updated and rebuilt no longer
> > installs the same set of files, but we know that only a clean build
> > will produce reliable results.  In fact it may be helpful to
> > identify these orphaned files as evidence that the build is not
> > clean.
> > 
> > Signed-off-by: John Keeping <john@metanate.com>
> > ---
> >  package/pkg-generic.mk | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> > index 1c9dd1d734..edc2c9349c 100644
> > --- a/package/pkg-generic.mk
> > +++ b/package/pkg-generic.mk
> > @@ -64,6 +64,7 @@ GLOBAL_INSTRUMENTATION_HOOKS += step_time
> >  # $(3): suffix of file  (optional)
> >  define step_pkg_size_inner
> >  	cd $(2); \
> > +	$(SED) '/^$(1),/d'
> > $(BUILD_DIR)/packages-file-list$(3).txt; \  
> 
> Since BUILD_DIR is a fully-qualified path, I would have put it as the
> first line of the macros, and thus it would not have required the
> trailing '\':
> 
>     $(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
>     cd $(2); \
>     find [...]

This will cause a build error if packages-file-list$(3).txt doesn't
exist, which will be the case for the first package built.

We could use:

      -$(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt

but then make will log that it is ignoring an error.


Regards,
John

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Buildroot] [PATCH] core/pkg-generic: only save latest package list
  2018-04-30 16:56   ` John Keeping
@ 2018-04-30 19:41     ` Yann E. MORIN
  2018-05-01 11:13       ` [Buildroot] [PATCH v2] " John Keeping
  0 siblings, 1 reply; 16+ messages in thread
From: Yann E. MORIN @ 2018-04-30 19:41 UTC (permalink / raw)
  To: buildroot

John, All,

On 2018-04-30 17:56 +0100, John Keeping spake thusly:
> On Mon, 30 Apr 2018 18:47:28 +0200
> "Yann E. MORIN" <yann.morin.1998@free.fr> wrote:
> > On 2018-04-29 14:07 +0100, John Keeping spake thusly:
> > > When rebuilding a package, simply appending the package's file list
> > > to the global list means that the package list grows for every
> > > rebuild, as does the time taken to check for files installed by
> > > multiple packages. Furthermore, we get false positives where a file
> > > is reported as being installed by multiple copies of the same
> > > package.
> > > 
> > > With this approach we may end up with orphaned files in the target
> > > filesystem if a package that has been updated and rebuilt no longer
> > > installs the same set of files, but we know that only a clean build
> > > will produce reliable results.  In fact it may be helpful to
> > > identify these orphaned files as evidence that the build is not
> > > clean.
> > > 
> > > Signed-off-by: John Keeping <john@metanate.com>
> > > ---
> > >  package/pkg-generic.mk | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> > > index 1c9dd1d734..edc2c9349c 100644
> > > --- a/package/pkg-generic.mk
> > > +++ b/package/pkg-generic.mk
> > > @@ -64,6 +64,7 @@ GLOBAL_INSTRUMENTATION_HOOKS += step_time
> > >  # $(3): suffix of file  (optional)
> > >  define step_pkg_size_inner
> > >  	cd $(2); \
> > > +	$(SED) '/^$(1),/d'
> > > $(BUILD_DIR)/packages-file-list$(3).txt; \  
> > 
> > Since BUILD_DIR is a fully-qualified path, I would have put it as the
> > first line of the macros, and thus it would not have required the
> > trailing '\':
> > 
> >     $(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
> >     cd $(2); \
> >     find [...]
> 
> This will cause a build error if packages-file-list$(3).txt doesn't
> exist, which will be the case for the first package built.

Hmm, indeed. But that is the case with either your original solution or
my proposal, except in your case, the error is silently ignored.

> We could use:
>       -$(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
> but then make will log that it is ignoring an error.

I don;t like that much either. What about:

    @touch $(BUILD_DIR)/packages-file-list$(3).txt
    $(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
    cd $(2); \
    find [...]

Regards,
Yann E. MORIN.

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Buildroot] [PATCH v2] core/pkg-generic: only save latest package list
  2018-04-30 19:41     ` Yann E. MORIN
@ 2018-05-01 11:13       ` John Keeping
  2018-05-01 12:04         ` Yann E. MORIN
  0 siblings, 1 reply; 16+ messages in thread
From: John Keeping @ 2018-05-01 11:13 UTC (permalink / raw)
  To: buildroot

When rebuilding a package, simply appending the package's file list to
the global list means that the package list grows for every rebuild, as
does the time taken to check for files installed by multiple packages.
Furthermore, we get false positives where a file is reported as being
installed by multiple copies of the same package.

With this approach we may end up with orphaned files in the target
filesystem if a package that has been updated and rebuilt no longer
installs the same set of files, but we know that only a clean build will
produce reliable results.  In fact it may be helpful to identify these
orphaned files as evidence that the build is not clean.

Signed-off-by: John Keeping <john@metanate.com>
---
Changes in v2:
- Do the sed as a separate step rather than in the same shell invocation
  as the find (we know that $(BUILD_DIR) is an absolute path); since we
  no longer ignore the exit status of sed, we also have to ensure that
  the file exists before executing sed

 package/pkg-generic.mk | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
index bd47ca1964..8ebbacc180 100644
--- a/package/pkg-generic.mk
+++ b/package/pkg-generic.mk
@@ -63,6 +63,8 @@ GLOBAL_INSTRUMENTATION_HOOKS += step_time
 # $(2): base directory to search in
 # $(3): suffix of file  (optional)
 define step_pkg_size_inner
+	@>$(BUILD_DIR)/packages-file-list$(3).txt
+	$(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
 	cd $(2); \
 	find . \( -type f -o -type l \) \
 		-newer $($(PKG)_DIR)/.stamp_built \
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Buildroot] [PATCH v2] core/pkg-generic: only save latest package list
  2018-05-01 11:13       ` [Buildroot] [PATCH v2] " John Keeping
@ 2018-05-01 12:04         ` Yann E. MORIN
  2018-05-01 12:26           ` John Keeping
  0 siblings, 1 reply; 16+ messages in thread
From: Yann E. MORIN @ 2018-05-01 12:04 UTC (permalink / raw)
  To: buildroot

John, All,

On 2018-05-01 12:13 +0100, John Keeping spake thusly:
> When rebuilding a package, simply appending the package's file list to
> the global list means that the package list grows for every rebuild, as
> does the time taken to check for files installed by multiple packages.
> Furthermore, we get false positives where a file is reported as being
> installed by multiple copies of the same package.
> 
> With this approach we may end up with orphaned files in the target
> filesystem if a package that has been updated and rebuilt no longer
> installs the same set of files, but we know that only a clean build will
> produce reliable results.  In fact it may be helpful to identify these
> orphaned files as evidence that the build is not clean.
> 
> Signed-off-by: John Keeping <john@metanate.com>
> ---
> Changes in v2:
> - Do the sed as a separate step rather than in the same shell invocation
>   as the find (we know that $(BUILD_DIR) is an absolute path); since we
>   no longer ignore the exit status of sed, we also have to ensure that
>   the file exists before executing sed
> 
>  package/pkg-generic.mk | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> index bd47ca1964..8ebbacc180 100644
> --- a/package/pkg-generic.mk
> +++ b/package/pkg-generic.mk
> @@ -63,6 +63,8 @@ GLOBAL_INSTRUMENTATION_HOOKS += step_time
>  # $(2): base directory to search in
>  # $(3): suffix of file  (optional)
>  define step_pkg_size_inner
> +	@>$(BUILD_DIR)/packages-file-list$(3).txt

Have you tested that? In my little experience with dealing with the
shell, this actually truncates the file to a zero-size, and thus we
loose the existing package->file list, even for other packages.

So, just do as I suggested: use touch to ensure the file exists.

(Yeah, one could also use the append-redirection >> instead of touch,
but is that trick really an improvement over touch? touch is well known,
and virtually everyone will recognise a construct that uses touch before
sed-ing a file.)

Regards,
Yann E. MORIN.

> +	$(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
>  	cd $(2); \
>  	find . \( -type f -o -type l \) \
>  		-newer $($(PKG)_DIR)/.stamp_built \
> -- 
> 2.17.0
> 

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Buildroot] [PATCH v2] core/pkg-generic: only save latest package list
  2018-05-01 12:04         ` Yann E. MORIN
@ 2018-05-01 12:26           ` John Keeping
  2018-05-01 12:28             ` [Buildroot] [PATCH v3] " John Keeping
  0 siblings, 1 reply; 16+ messages in thread
From: John Keeping @ 2018-05-01 12:26 UTC (permalink / raw)
  To: buildroot

On Tue, 1 May 2018 14:04:16 +0200
"Yann E. MORIN" <yann.morin.1998@free.fr> wrote:

> John, All,
> 
> On 2018-05-01 12:13 +0100, John Keeping spake thusly:
> > When rebuilding a package, simply appending the package's file list
> > to the global list means that the package list grows for every
> > rebuild, as does the time taken to check for files installed by
> > multiple packages. Furthermore, we get false positives where a file
> > is reported as being installed by multiple copies of the same
> > package.
> > 
> > With this approach we may end up with orphaned files in the target
> > filesystem if a package that has been updated and rebuilt no longer
> > installs the same set of files, but we know that only a clean build
> > will produce reliable results.  In fact it may be helpful to
> > identify these orphaned files as evidence that the build is not
> > clean.
> > 
> > Signed-off-by: John Keeping <john@metanate.com>
> > ---
> > Changes in v2:
> > - Do the sed as a separate step rather than in the same shell
> > invocation as the find (we know that $(BUILD_DIR) is an absolute
> > path); since we no longer ignore the exit status of sed, we also
> > have to ensure that the file exists before executing sed
> > 
> >  package/pkg-generic.mk | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> > index bd47ca1964..8ebbacc180 100644
> > --- a/package/pkg-generic.mk
> > +++ b/package/pkg-generic.mk
> > @@ -63,6 +63,8 @@ GLOBAL_INSTRUMENTATION_HOOKS += step_time
> >  # $(2): base directory to search in
> >  # $(3): suffix of file  (optional)
> >  define step_pkg_size_inner
> > +	@>$(BUILD_DIR)/packages-file-list$(3).txt  
> 
> Have you tested that? In my little experience with dealing with the
> shell, this actually truncates the file to a zero-size, and thus we
> loose the existing package->file list, even for other packages.
> 
> So, just do as I suggested: use touch to ensure the file exists.
> 
> (Yeah, one could also use the append-redirection >> instead of touch,
> but is that trick really an improvement over touch? touch is well
> known, and virtually everyone will recognise a construct that uses
> touch before sed-ing a file.)

Oops, you're right, that will truncate the file.  I'm used to the Git
guidelines of avoiding touch when we don't care about the mtime of the
file, but I think it's fine here.

> > +	$(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
> >  	cd $(2); \
> >  	find . \( -type f -o -type l \) \
> >  		-newer $($(PKG)_DIR)/.stamp_built \
> > -- 
> > 2.17.0
> >   
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Buildroot] [PATCH v3] core/pkg-generic: only save latest package list
  2018-05-01 12:26           ` John Keeping
@ 2018-05-01 12:28             ` John Keeping
  2018-05-01 12:31               ` Yann E. MORIN
                                 ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: John Keeping @ 2018-05-01 12:28 UTC (permalink / raw)
  To: buildroot

When rebuilding a package, simply appending the package's file list to
the global list means that the package list grows for every rebuild, as
does the time taken to check for files installed by multiple packages.
Furthermore, we get false positives where a file is reported as being
installed by multiple copies of the same package.

With this approach we may end up with orphaned files in the target
filesystem if a package that has been updated and rebuilt no longer
installs the same set of files, but we know that only a clean build will
produce reliable results.  In fact it may be helpful to identify these
orphaned files as evidence that the build is not clean.

Signed-off-by: John Keeping <john@metanate.com>
---
Changes in v3:
- Use touch instead of redirect to avoid truncating the file

 package/pkg-generic.mk | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
index bd47ca1964..87c856407d 100644
--- a/package/pkg-generic.mk
+++ b/package/pkg-generic.mk
@@ -63,6 +63,8 @@ GLOBAL_INSTRUMENTATION_HOOKS += step_time
 # $(2): base directory to search in
 # $(3): suffix of file  (optional)
 define step_pkg_size_inner
+	@touch $(BUILD_DIR)/packages-file-list$(3).txt
+	$(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
 	cd $(2); \
 	find . \( -type f -o -type l \) \
 		-newer $($(PKG)_DIR)/.stamp_built \
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Buildroot] [PATCH v3] core/pkg-generic: only save latest package list
  2018-05-01 12:28             ` [Buildroot] [PATCH v3] " John Keeping
@ 2018-05-01 12:31               ` Yann E. MORIN
  2018-05-01 13:26               ` Thomas Petazzoni
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Yann E. MORIN @ 2018-05-01 12:31 UTC (permalink / raw)
  To: buildroot

John, All,

On 2018-05-01 13:28 +0100, John Keeping spake thusly:
> When rebuilding a package, simply appending the package's file list to
> the global list means that the package list grows for every rebuild, as
> does the time taken to check for files installed by multiple packages.
> Furthermore, we get false positives where a file is reported as being
> installed by multiple copies of the same package.
> 
> With this approach we may end up with orphaned files in the target
> filesystem if a package that has been updated and rebuilt no longer
> installs the same set of files, but we know that only a clean build will
> produce reliable results.  In fact it may be helpful to identify these
> orphaned files as evidence that the build is not clean.
> 
> Signed-off-by: John Keeping <john@metanate.com>

Reviewed-by: "Yann E. MORIN" <yann.morin.1998@free.fr>

Regards,
Yann E. MORIN.

> ---
> Changes in v3:
> - Use touch instead of redirect to avoid truncating the file
> 
>  package/pkg-generic.mk | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> index bd47ca1964..87c856407d 100644
> --- a/package/pkg-generic.mk
> +++ b/package/pkg-generic.mk
> @@ -63,6 +63,8 @@ GLOBAL_INSTRUMENTATION_HOOKS += step_time
>  # $(2): base directory to search in
>  # $(3): suffix of file  (optional)
>  define step_pkg_size_inner
> +	@touch $(BUILD_DIR)/packages-file-list$(3).txt
> +	$(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
>  	cd $(2); \
>  	find . \( -type f -o -type l \) \
>  		-newer $($(PKG)_DIR)/.stamp_built \
> -- 
> 2.17.0
> 

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Buildroot] [PATCH v3] core/pkg-generic: only save latest package list
  2018-05-01 12:28             ` [Buildroot] [PATCH v3] " John Keeping
  2018-05-01 12:31               ` Yann E. MORIN
@ 2018-05-01 13:26               ` Thomas Petazzoni
  2018-05-01 21:01               ` Peter Korsgaard
  2019-01-04 13:12               ` Yann E. MORIN
  3 siblings, 0 replies; 16+ messages in thread
From: Thomas Petazzoni @ 2018-05-01 13:26 UTC (permalink / raw)
  To: buildroot

Hello,

On Tue,  1 May 2018 13:28:41 +0100, John Keeping wrote:
> When rebuilding a package, simply appending the package's file list to
> the global list means that the package list grows for every rebuild, as
> does the time taken to check for files installed by multiple packages.
> Furthermore, we get false positives where a file is reported as being
> installed by multiple copies of the same package.
> 
> With this approach we may end up with orphaned files in the target
> filesystem if a package that has been updated and rebuilt no longer
> installs the same set of files, but we know that only a clean build will
> produce reliable results.  In fact it may be helpful to identify these
> orphaned files as evidence that the build is not clean.
> 
> Signed-off-by: John Keeping <john@metanate.com>
> ---
> Changes in v3:
> - Use touch instead of redirect to avoid truncating the file

Applied to master, thanks.

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Buildroot] [PATCH v3] core/pkg-generic: only save latest package list
  2018-05-01 12:28             ` [Buildroot] [PATCH v3] " John Keeping
  2018-05-01 12:31               ` Yann E. MORIN
  2018-05-01 13:26               ` Thomas Petazzoni
@ 2018-05-01 21:01               ` Peter Korsgaard
  2019-01-04 13:12               ` Yann E. MORIN
  3 siblings, 0 replies; 16+ messages in thread
From: Peter Korsgaard @ 2018-05-01 21:01 UTC (permalink / raw)
  To: buildroot

>>>>> "John" == John Keeping <john@metanate.com> writes:

 > When rebuilding a package, simply appending the package's file list to
 > the global list means that the package list grows for every rebuild, as
 > does the time taken to check for files installed by multiple packages.
 > Furthermore, we get false positives where a file is reported as being
 > installed by multiple copies of the same package.

 > With this approach we may end up with orphaned files in the target
 > filesystem if a package that has been updated and rebuilt no longer
 > installs the same set of files, but we know that only a clean build will
 > produce reliable results.  In fact it may be helpful to identify these
 > orphaned files as evidence that the build is not clean.

 > Signed-off-by: John Keeping <john@metanate.com>
 > ---
 > Changes in v3:
 > - Use touch instead of redirect to avoid truncating the file

Committed to 2018.02.x, thanks.

-- 
Bye, Peter Korsgaard

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Buildroot] [PATCH v3] core/pkg-generic: only save latest package list
  2018-05-01 12:28             ` [Buildroot] [PATCH v3] " John Keeping
                                 ` (2 preceding siblings ...)
  2018-05-01 21:01               ` Peter Korsgaard
@ 2019-01-04 13:12               ` Yann E. MORIN
  2019-01-04 15:30                 ` Nicolas Cavallari
  3 siblings, 1 reply; 16+ messages in thread
From: Yann E. MORIN @ 2019-01-04 13:12 UTC (permalink / raw)
  To: buildroot

John. All,

On 2018-05-01 13:28 +0100, John Keeping spake thusly:
> When rebuilding a package, simply appending the package's file list to
> the global list means that the package list grows for every rebuild, as

Is there a particular issue with those three files growing on rebuild?

Compared to the rest of the build tree, those three files have very
small sizes in fact, and if they eventually end up big enough to be a
substantial part of the build tree, it's most probably time for a clean
build from scratch anyway, no?

Note that I am not questioning the same-package reported multiple time,
just the rationale part about size. If size is in fact not a problem,
then I have an alternate solution to not report the same package
multiple times.

Regards,
Yann E. MORIN.

> does the time taken to check for files installed by multiple packages.
> Furthermore, we get false positives where a file is reported as being
> installed by multiple copies of the same package.
> 
> With this approach we may end up with orphaned files in the target
> filesystem if a package that has been updated and rebuilt no longer
> installs the same set of files, but we know that only a clean build will
> produce reliable results.  In fact it may be helpful to identify these
> orphaned files as evidence that the build is not clean.
> 
> Signed-off-by: John Keeping <john@metanate.com>
> ---
> Changes in v3:
> - Use touch instead of redirect to avoid truncating the file
> 
>  package/pkg-generic.mk | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> index bd47ca1964..87c856407d 100644
> --- a/package/pkg-generic.mk
> +++ b/package/pkg-generic.mk
> @@ -63,6 +63,8 @@ GLOBAL_INSTRUMENTATION_HOOKS += step_time
>  # $(2): base directory to search in
>  # $(3): suffix of file  (optional)
>  define step_pkg_size_inner
> +	@touch $(BUILD_DIR)/packages-file-list$(3).txt
> +	$(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
>  	cd $(2); \
>  	find . \( -type f -o -type l \) \
>  		-newer $($(PKG)_DIR)/.stamp_built \
> -- 
> 2.17.0
> 

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Buildroot] [PATCH v3] core/pkg-generic: only save latest package list
  2019-01-04 13:12               ` Yann E. MORIN
@ 2019-01-04 15:30                 ` Nicolas Cavallari
  2019-01-04 17:51                   ` Yann E. MORIN
  0 siblings, 1 reply; 16+ messages in thread
From: Nicolas Cavallari @ 2019-01-04 15:30 UTC (permalink / raw)
  To: buildroot

On 04/01/2019 14:12, Yann E. MORIN wrote:
> John. All,
> 
> On 2018-05-01 13:28 +0100, John Keeping spake thusly:
>> When rebuilding a package, simply appending the package's file list to
>> the global list means that the package list grows for every rebuild, as
> 
> Is there a particular issue with those three files growing on rebuild?

The problem is that not only does this makes the file list grow, it also
adds loads of incorrect data: when rebuilding a package that was
compiled early, all files modified since the package was built will be
included. So the list will include files that have been installed by
this package and all packages compiled after it. For some core packages,
this usually means "almost everything".

> Compared to the rest of the build tree, those three files have very
> small sizes in fact, and if they eventually end up big enough to be a
> substantial part of the build tree, it's most probably time for a clean
> build from scratch anyway, no?

Having a big list would not a problem if nobody used the content of this
file, but some tools scans it and use it to check files. After many
recompiling, I remember having problems with one tool taking a
considerable amount of time for each build because of all the duplicates
in the list. But i don't remember which tool it was.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Buildroot] [PATCH v3] core/pkg-generic: only save latest package list
  2019-01-04 15:30                 ` Nicolas Cavallari
@ 2019-01-04 17:51                   ` Yann E. MORIN
  2019-01-05 10:23                     ` Nicolas Cavallari
  0 siblings, 1 reply; 16+ messages in thread
From: Yann E. MORIN @ 2019-01-04 17:51 UTC (permalink / raw)
  To: buildroot

Nicolas, All,

On 2019-01-04 16:30 +0100, Nicolas Cavallari spake thusly:
> On 04/01/2019 14:12, Yann E. MORIN wrote:
> > On 2018-05-01 13:28 +0100, John Keeping spake thusly:
> >> When rebuilding a package, simply appending the package's file list to
> >> the global list means that the package list grows for every rebuild, as
> > 
> > Is there a particular issue with those three files growing on rebuild?
> 
> The problem is that not only does this makes the file list grow, it also
> adds loads of incorrect data: when rebuilding a package that was
> compiled early, all files modified since the package was built will be
> included. So the list will include files that have been installed by
> this package and all packages compiled after it. For some core packages,
> this usually means "almost everything".

Sorry, but I don't understand what you mean.

Whether you install or reinstall a package, there is no guarantte on the
order of the other packages it is unrelated to: they may very well be
built before or after the package.

So yes, when you re-install a package, the file list contains all files
installed before it, but they are not assigned to this package, with one
exception: the .la files, which are all assigned to the latest package
installed. And in any case, even if you were not re-installing your
package, but it was the latest to be built, then you'd still have the
issue that the .la files would all be accounted to for your package.

But, two things;
  - this only accounts for .la files in staging, not in target/, and
  - I have already fixed that locally and will send the patch shortly.

Still, I am not usre I understood your problem anyway, so can you
explain in more details, and provide an easy reproducing sequence (based
on the git master)?

> > Compared to the rest of the build tree, those three files have very
> > small sizes in fact, and if they eventually end up big enough to be a
> > substantial part of the build tree, it's most probably time for a clean
> > build from scratch anyway, no?
> 
> Having a big list would not a problem if nobody used the content of this
> file, but some tools scans it and use it to check files. After many
> recompiling, I remember having problems with one tool taking a
> considerable amount of time for each build because of all the duplicates
> in the list. But i don't remember which tool it was.

Are you talking about a tool in Buildroot? Can you try to provide more
details, please? Are you still able to reproduce the slowness on the
current git master?

The only two tools that make use of the packages-file-list.txt are
check-uniq-files (which actually checks that files are not installed by
two or more packages), and support/scripts/size-stats, which graphs a
pie-chart representing the size contribution of each package to the
content of target/. The former is always run at the end of the build, in
the target-finalize step, while the latter is only run on-demand.

They both are written in python, and check-uniq-files is pretty trivial
and lightwieght, and in my experience, very fast.

And in any case, if the list becomes *so big* that reading it takes an
insane amount of time, then probably your host/ staging/ and target/
directories are in a state that is so far from being clean that you
probably should condider a rebuild from scratch at that point.

Regards,
Yann E. MORIN.

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Buildroot] [PATCH v3] core/pkg-generic: only save latest package list
  2019-01-04 17:51                   ` Yann E. MORIN
@ 2019-01-05 10:23                     ` Nicolas Cavallari
  2019-01-05 16:39                       ` Yann E. MORIN
  0 siblings, 1 reply; 16+ messages in thread
From: Nicolas Cavallari @ 2019-01-05 10:23 UTC (permalink / raw)
  To: buildroot

On 04/01/2019 18:51, Yann E. MORIN wrote:
> Nicolas, All,
> 
> On 2019-01-04 16:30 +0100, Nicolas Cavallari spake thusly:
>> On 04/01/2019 14:12, Yann E. MORIN wrote:
>>> On 2018-05-01 13:28 +0100, John Keeping spake thusly:
>>>> When rebuilding a package, simply appending the package's file list to
>>>> the global list means that the package list grows for every rebuild, as
>>>
>>> Is there a particular issue with those three files growing on rebuild?
>>
>> The problem is that not only does this makes the file list grow, it also
>> adds loads of incorrect data: when rebuilding a package that was
>> compiled early, all files modified since the package was built will be
>> included. So the list will include files that have been installed by
>> this package and all packages compiled after it. For some core packages,
>> this usually means "almost everything".
> 
> [...]>
> Still, I am not usre I understood your problem anyway, so can you
> explain in more details, and provide an easy reproducing sequence (based
> on the git master)?

make qemu_x86_defconfig
make
make ifupdown-scripts-reinstall
grep "ifupdown-scripts" output/build/packages-file-list.txt

=>
ifupdown-scripts,./bin/busybox
ifupdown-scripts,./etc/issue
ifupdown-scripts,./etc/init.d/S20urandom
ifupdown-scripts,./etc/init.d/S40network
ifupdown-scripts,./etc/init.d/rcK
ifupdown-scripts,./etc/init.d/rcS
ifupdown-scripts,./etc/shells
ifupdown-scripts,./etc/os-release
ifupdown-scripts,./etc/network/interfaces
ifupdown-scripts,./etc/network/nfs_check
ifupdown-scripts,./etc/profile
ifupdown-scripts,./etc/hostname
ifupdown-scripts,./etc/inittab
ifupdown-scripts,./sbin/ldconfig
ifupdown-scripts,./lib/ld-uClibc-1.0.31.so
ifupdown-scripts,./lib/libgcc_s.so.1
ifupdown-scripts,./lib/libuClibc-1.0.31.so
ifupdown-scripts,./lib/modules/4.16.7/modules.dep.bin
ifupdown-scripts,./lib/modules/4.16.7/modules.order
ifupdown-scripts,./lib/modules/4.16.7/modules.builtin.bin
ifupdown-scripts,./lib/modules/4.16.7/modules.dep
ifupdown-scripts,./lib/modules/4.16.7/modules.symbols.bin
ifupdown-scripts,./lib/modules/4.16.7/modules.softdep
ifupdown-scripts,./lib/modules/4.16.7/kernel/crypto/jitterentropy_rng.ko
ifupdown-scripts,./lib/modules/4.16.7/kernel/crypto/sha256_generic.ko
ifupdown-scripts,./lib/modules/4.16.7/kernel/crypto/authencesn.ko
ifupdown-scripts,./lib/modules/4.16.7/kernel/crypto/drbg.ko
ifupdown-scripts,./lib/modules/4.16.7/kernel/crypto/hmac.ko
ifupdown-scripts,./lib/modules/4.16.7/kernel/crypto/crypto_engine.ko
ifupdown-scripts,./lib/modules/4.16.7/kernel/crypto/echainiv.ko
ifupdown-scripts,./lib/modules/4.16.7/kernel/crypto/authenc.ko
ifupdown-scripts,./lib/modules/4.16.7/kernel/drivers/char/hw_random/geode-rng.ko
ifupdown-scripts,./lib/modules/4.16.7/kernel/drivers/char/hw_random/amd-rng.ko
ifupdown-scripts,./lib/modules/4.16.7/kernel/drivers/char/hw_random/virtio-rng.ko
ifupdown-scripts,./lib/modules/4.16.7/kernel/drivers/char/hw_random/intel-rng.ko
ifupdown-scripts,./lib/modules/4.16.7/kernel/drivers/char/hw_random/via-rng.ko
ifupdown-scripts,./lib/modules/4.16.7/kernel/drivers/char/hw_random/rng-core.ko
ifupdown-scripts,./lib/modules/4.16.7/kernel/drivers/crypto/virtio/virtio_crypto.ko
ifupdown-scripts,./lib/modules/4.16.7/kernel/drivers/thermal/x86_pkg_temp_thermal.ko
ifupdown-scripts,./lib/modules/4.16.7/modules.builtin
ifupdown-scripts,./lib/modules/4.16.7/modules.symbols
ifupdown-scripts,./lib/modules/4.16.7/modules.alias.bin
ifupdown-scripts,./lib/modules/4.16.7/modules.alias
ifupdown-scripts,./lib/modules/4.16.7/modules.devname
ifupdown-scripts,./lib/libatomic.so.1.2.0
ifupdown-scripts,./THIS_IS_NOT_YOUR_ROOT_FILESYSTEM
ifupdown-scripts,./usr/bin/getconf
ifupdown-scripts,./usr/bin/ldd
ifupdown-scripts,./usr/lib/os-release
ifupdown-scripts,./usr/lib32
ifupdown-scripts,./lib32

Which is almost the whole system.

Now why is that ? The answer is here:

find . \( -type f -o -type l \) \
          -newer $($(PKG)_DIR)/.stamp_built \
          -exec printf '$(1),%s\n' {} + \
          >> $(BUILD_DIR)/packages-file-list$(3).txt

Since .stamp_built has not been modified since the first time the
package was built, this will list all files that have been installed
after ifupdown-scripts was built.

With a bigger system, the list of files can be much larger. And without
this patch, the list would grow by maybe the size of the whole system
each time an early package is reinstalled.

>>> Compared to the rest of the build tree, those three files have very
>>> small sizes in fact, and if they eventually end up big enough to be a
>>> substantial part of the build tree, it's most probably time for a clean
>>> build from scratch anyway, no?
>>
>> Having a big list would not a problem if nobody used the content of this
>> file, but some tools scans it and use it to check files. After many
>> recompiling, I remember having problems with one tool taking a
>> considerable amount of time for each build because of all the duplicates
>> in the list. But i don't remember which tool it was.
> 
> Are you talking about a tool in Buildroot? Can you try to provide more
> details, please? Are you still able to reproduce the slowness on the
> current git master?
> 
> The only two tools that make use of the packages-file-list.txt are
> check-uniq-files (which actually checks that files are not installed by
> two or more packages), and support/scripts/size-stats, which graphs a
> pie-chart representing the size contribution of each package to the
> content of target/. The former is always run at the end of the build, in
> the target-finalize step, while the latter is only run on-demand.

I don't remember which tool it was, but it was probably check-bin-arch.

It will scan the package file list and call readelf on all files. If the
package file list has duplicates, then readelf will be called multiple
times on the same files.  And it is also run at make $pkg-reinstall
times, which fits my memory of make $pkg-reinstall being slower over time.

> And in any case, if the list becomes *so big* that reading it takes an
> insane amount of time, then probably your host/ staging/ and target/
> directories are in a state that is so far from being clean that you
> probably should condider a rebuild from scratch at that point.

reinstalling a non-library package should not make the build unworkable.

In anyway, having check-bin-arch take 15s, 60s or minutes for nothing
will still be faster than rebuilding from scratch.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Buildroot] [PATCH v3] core/pkg-generic: only save latest package list
  2019-01-05 10:23                     ` Nicolas Cavallari
@ 2019-01-05 16:39                       ` Yann E. MORIN
  0 siblings, 0 replies; 16+ messages in thread
From: Yann E. MORIN @ 2019-01-05 16:39 UTC (permalink / raw)
  To: buildroot

Nicolas, All,

On 2019-01-05 11:23 +0100, Nicolas Cavallari spake thusly:
> make qemu_x86_defconfig
> make
> make ifupdown-scripts-reinstall
> grep "ifupdown-scripts" output/build/packages-file-list.txt
> 
> =>
> ifupdown-scripts,./bin/busybox
[--SNIP--]
> ifupdown-scripts,./lib/libuClibc-1.0.31.so
> ifupdown-scripts,./lib/modules/4.16.7/modules.dep.bin
[--SNIP--]
> ifupdown-scripts,./lib32
> 
> Which is almost the whole system.
> 
> Now why is that ? The answer is here:
> 
> find . \( -type f -o -type l \) \
>           -newer $($(PKG)_DIR)/.stamp_built \
>           -exec printf '$(1),%s\n' {} + \
>           >> $(BUILD_DIR)/packages-file-list$(3).txt
> 
> Since .stamp_built has not been modified since the first time the
> package was built, this will list all files that have been installed
> after ifupdown-scripts was built.

OK, so there are in fact two problems in what you described:

  - upon a re-install (not a rebuild), files are improperly reported
    to the just-reinstalled package,

  - the list grows after every re-install (which could have been
    triggered by a re-build, a re-configure or a dirclean).

The first issue is a consequnce of commit 7fb6e7825. Before that, we
would md5sum all files prior to a package installation, then install the
package, and then md5sum all files again (with new files!), and report
lines that were not in the first set, on the assumption that either they
were new files, or modified files. Back then the first issue did not
exist.

But since 7fb6e7825, we have no starting point before the installation,
so we need to find one. The best we found back at that time was to list
all files that were "new" since the package was built.

Of course, we did not envision the reinstall step of an early package.
And sonce noone complained, we did not even notice it. And even if those
files were to grow up to 10-100MiB, that is not really a problem for the
vast majority of people in practice, so the size increase was not even
spottted either.

Which gets us to the size increase, which is not realy a very big
problem most of the time. It just gets exacerbated by the full-list
duplication. And yet again, for most people this is not really a problem
either.

So, I think a middle ground and practical solution, would be that we find
another object whose date can serve as the starting point for the -newer
test. Since we do not have that, we have no choice but to introduce a
new timestamp file, I'm afraid...

Note: in addition to the above, the refereenced patch in this thread
also dealt with check-uniq-files reporting the same package as
installing the same files more than once.

> With a bigger system, the list of files can be much larger. And without
> this patch, the list would grow by maybe the size of the whole system
> each time an early package is reinstalled.

Well, I would highly doubt the list of all files would be "the size of
the whole system" (but I do understand that it could contain almost all
files of a system when re-installign an early package, yes).

> >>> Compared to the rest of the build tree, those three files have very
> >>> small sizes in fact, and if they eventually end up big enough to be a
> >>> substantial part of the build tree, it's most probably time for a clean
> >>> build from scratch anyway, no?
> >>
> >> Having a big list would not a problem if nobody used the content of this
> >> file, but some tools scans it and use it to check files. After many
> >> recompiling, I remember having problems with one tool taking a
> >> considerable amount of time for each build because of all the duplicates
> >> in the list. But i don't remember which tool it was.
> > 
> > Are you talking about a tool in Buildroot? Can you try to provide more
> > details, please? Are you still able to reproduce the slowness on the
> > current git master?
> > 
> > The only two tools that make use of the packages-file-list.txt are
> > check-uniq-files (which actually checks that files are not installed by
> > two or more packages), and support/scripts/size-stats, which graphs a
> > pie-chart representing the size contribution of each package to the
> > content of target/. The former is always run at the end of the build, in
> > the target-finalize step, while the latter is only run on-demand.
> 
> I don't remember which tool it was, but it was probably check-bin-arch.

Arg, indeed I had mnissed that one... Thanks for spotting it! :-)

> It will scan the package file list and call readelf on all files. If the
> package file list has duplicates, then readelf will be called multiple
> times on the same files.  And it is also run at make $pkg-reinstall
> times, which fits my memory of make $pkg-reinstall being slower over time.

I see, indeed.

However, removing from the list all files installed by a package is also
incorrect anyway. Consider this sequence:

  - make
    - package foo installs f1, f2, and f3
    - this is listed in packages-file-list.txt

  - user modifies package foo source because f3 should not be installed
    in this configuration

  - make foo-reinstall
    - all entries of foo are removed from packages-file-list,txt
    - package foo installs f1 and f2

  - f3 is still in target and no longer assigned to a package

So, we have to keep files already installed by all paclages up until
now.

Regards,
Yann E. MORIN.

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-01-05 16:39 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-29 13:07 [Buildroot] [PATCH] core/pkg-generic: only save latest package list John Keeping
2018-04-30 16:47 ` Yann E. MORIN
2018-04-30 16:56   ` John Keeping
2018-04-30 19:41     ` Yann E. MORIN
2018-05-01 11:13       ` [Buildroot] [PATCH v2] " John Keeping
2018-05-01 12:04         ` Yann E. MORIN
2018-05-01 12:26           ` John Keeping
2018-05-01 12:28             ` [Buildroot] [PATCH v3] " John Keeping
2018-05-01 12:31               ` Yann E. MORIN
2018-05-01 13:26               ` Thomas Petazzoni
2018-05-01 21:01               ` Peter Korsgaard
2019-01-04 13:12               ` Yann E. MORIN
2019-01-04 15:30                 ` Nicolas Cavallari
2019-01-04 17:51                   ` Yann E. MORIN
2019-01-05 10:23                     ` Nicolas Cavallari
2019-01-05 16:39                       ` Yann E. MORIN

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.