* [PATCH V2] archiver: Configurable tarball compression
@ 2021-09-20 10:25 Ian Ray
  2021-09-24 16:09 ` [OE-core] " Khem Raj
  0 siblings, 1 reply; 5+ messages in thread
From: Ian Ray @ 2021-09-20 10:25 UTC
  To: openembedded-core; +Cc: ian.ray

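Make the compression method used to create GPL source archives
configurable through ARCHIVER_MODE[compression], so that a more
efficient method such as xz can be selected. The default remains gz.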

Signed-off-by: Fabien Lahoudere <fabien.lahoudere@collabora.com>
[V1 was https://patchwork.openembedded.org/patch/155985/]
[Rebased]
Signed-off-by: Ian Ray <ian.ray@ge.com>
---
 meta/classes/archiver.bbclass | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/meta/classes/archiver.bbclass b/meta/classes/archiver.bbclass
index dd31dc0..411d459 100644
--- a/meta/classes/archiver.bbclass
+++ b/meta/classes/archiver.bbclass
@@ -51,6 +51,7 @@ ARCHIVER_MODE[diff-exclude] ?= ".pc autom4te.cache patches"
 ARCHIVER_MODE[dumpdata] ?= "0"
 ARCHIVER_MODE[recipe] ?= "0"
 ARCHIVER_MODE[mirror] ?= "split"
+ARCHIVER_MODE[compression] ?= "gz"
 
 DEPLOY_DIR_SRC ?= "${DEPLOY_DIR}/sources"
 ARCHIVER_TOPDIR ?= "${WORKDIR}/archiver-sources"
@@ -409,15 +410,16 @@ def create_tarball(d, srcdir, suffix, ar_outdir):
     # that we archive the actual directory and not just the link.
     srcdir = os.path.realpath(srcdir)
 
+    compression_method = d.getVarFlag('ARCHIVER_MODE', 'compression')
     bb.utils.mkdirhier(ar_outdir)
     if suffix:
-        filename = '%s-%s.tar.gz' % (d.getVar('PF'), suffix)
+        filename = '%s-%s.tar.%s' % (d.getVar('PF'), suffix, compression_method)
     else:
-        filename = '%s.tar.gz' % d.getVar('PF')
+        filename = '%s.tar.%s' % (d.getVar('PF'), compression_method)
     tarname = os.path.join(ar_outdir, filename)
 
     bb.note('Creating %s' % tarname)
-    tar = tarfile.open(tarname, 'w:gz')
+    tar = tarfile.open(tarname, 'w:%s' % compression_method)
     tar.add(srcdir, arcname=os.path.basename(srcdir), filter=exclude_useless_paths)
     tar.close()
 
-- 
2.10.1
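
As a rough illustration of what the second hunk does, here is a standalone
sketch of create_tarball() without the BitBake datastore. The pf and
compression_method parameters stand in for d.getVar('PF') and
d.getVarFlag('ARCHIVER_MODE', 'compression'), and the
exclude_useless_paths filter is left out. The SUPPORTED_METHODS check is an
assumption added for illustration and is not part of the patch; it reflects
that the flag value is passed straight to tarfile.open(), which only accepts
the write modes it knows about.

```python
import os
import tarfile

# Assumption: accept only modes the standard tarfile module could write at
# the time of this thread. This check is illustrative, not part of the patch.
SUPPORTED_METHODS = ("gz", "bz2", "xz")


def create_tarball(srcdir, pf, suffix, ar_outdir, compression_method="gz"):
    """Archive srcdir into ar_outdir, naming the file after pf and suffix."""
    if compression_method not in SUPPORTED_METHODS:
        raise ValueError("Unsupported compression method: %r" % compression_method)

    # Archive the real directory, not a symlink pointing to it.
    srcdir = os.path.realpath(srcdir)
    os.makedirs(ar_outdir, exist_ok=True)

    if suffix:
        filename = "%s-%s.tar.%s" % (pf, suffix, compression_method)
    else:
        filename = "%s.tar.%s" % (pf, compression_method)
    tarname = os.path.join(ar_outdir, filename)

    # The context manager closes the archive even if tar.add() raises.
    with tarfile.open(tarname, "w:%s" % compression_method) as tar:
        tar.add(srcdir, arcname=os.path.basename(srcdir))
    return tarname
```

With the patch applied, the method would presumably be selected in a
configuration file with ARCHIVER_MODE[compression] = "xz".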



* Re: [OE-core] [PATCH V2] archiver: Configurable tarball compression
  2021-09-20 10:25 [PATCH V2] archiver: Configurable tarball compression Ian Ray
@ 2021-09-24 16:09 ` Khem Raj
  0 siblings, 0 replies; 5+ messages in thread
From: Khem Raj @ 2021-09-24 16:09 UTC
  To: Ian Ray, openembedded-core



On 9/20/21 3:25 AM, Ian Ray wrote:
> In order to be more efficient, we use xz as compression method
> to create GPL sources archives.
> 
> Signed-off-by: Fabien Lahoudere <fabien.lahoudere@collabora.com>
> [V1 was https://patchwork.openembedded.org/patch/155985/]
> [Rebased]

xz has a mind of its own when it comes to parallel threads. How do we
control that here? Does it use the global setting to control the number
of xz threads?
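
Some context on this question, as a sketch rather than a definitive
answer: the patch passes 'w:xz' to Python's tarfile module, which
compresses in-process through the standard lzma module instead of running
the xz binary. Settings aimed at the xz command line would therefore
presumably not be consulted, and the lzma module does not, as far as is
known here, offer a thread-count option. The knob tarfile does expose for
xz is the preset level. The helper name create_xz_tarball below is
hypothetical.

```python
import os
import tarfile


def create_xz_tarball(srcdir, tarname, preset=6):
    """Write srcdir to tarname as .tar.xz using the in-process lzma codec."""
    # 'preset' (0-9) is forwarded by tarfile to lzma.LZMAFile; higher values
    # trade CPU time and memory for a smaller archive.
    with tarfile.open(tarname, "w:xz", preset=preset) as tar:
        tar.add(srcdir, arcname=os.path.basename(srcdir))
    return tarname
```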

> Signed-off-by: Ian Ray <ian.ray@ge.com>
> ---
>   meta/classes/archiver.bbclass | 8 +++++---
>   1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/meta/classes/archiver.bbclass b/meta/classes/archiver.bbclass
> index dd31dc0..411d459 100644
> --- a/meta/classes/archiver.bbclass
> +++ b/meta/classes/archiver.bbclass
> @@ -51,6 +51,7 @@ ARCHIVER_MODE[diff-exclude] ?= ".pc autom4te.cache patches"
>   ARCHIVER_MODE[dumpdata] ?= "0"
>   ARCHIVER_MODE[recipe] ?= "0"
>   ARCHIVER_MODE[mirror] ?= "split"
> +ARCHIVER_MODE[compression] ?= "gz"
>   
>   DEPLOY_DIR_SRC ?= "${DEPLOY_DIR}/sources"
>   ARCHIVER_TOPDIR ?= "${WORKDIR}/archiver-sources"
> @@ -409,15 +410,16 @@ def create_tarball(d, srcdir, suffix, ar_outdir):
>       # that we archive the actual directory and not just the link.
>       srcdir = os.path.realpath(srcdir)
>   
> +    compression_method = d.getVarFlag('ARCHIVER_MODE', 'compression')
>       bb.utils.mkdirhier(ar_outdir)
>       if suffix:
> -        filename = '%s-%s.tar.gz' % (d.getVar('PF'), suffix)
> +        filename = '%s-%s.tar.%s' % (d.getVar('PF'), suffix, compression_method)
>       else:
> -        filename = '%s.tar.gz' % d.getVar('PF')
> +        filename = '%s.tar.%s' % (d.getVar('PF'), compression_method)
>       tarname = os.path.join(ar_outdir, filename)
>   
>       bb.note('Creating %s' % tarname)
> -    tar = tarfile.open(tarname, 'w:gz')
> +    tar = tarfile.open(tarname, 'w:%s' % compression_method)
>       tar.add(srcdir, arcname=os.path.basename(srcdir), filter=exclude_useless_paths)
>       tar.close()
>   



* Re: [OE-core] [PATCH V2] archiver: Configurable tarball compression
       [not found]   ` <16A6D8EA388563F3.1316@lists.openembedded.org>
@ 2021-09-21 13:48     ` Michael Opdenacker
  0 siblings, 0 replies; 5+ messages in thread
From: Michael Opdenacker @ 2021-09-21 13:48 UTC
  To: openembedded-core


On 9/21/21 3:20 PM, Michael Opdenacker wrote:
> By the way, zstd seems to be marginally worse (+1%) than xz in terms of
> compressed size, but is orders of magnitude faster (see
> https://archlinux.org/news/now-using-zstandard-instead-of-xz-for-package-compression/).


Actually, that article only mentions decompression speed, but the same
holds for compression speed.

Here are my own tests:

mike@mike-laptop:~/tmp$ time gzip linux-5.15-rc2.tar

real    0m29.293s
user    0m28.712s
sys    0m0.553s

mike@mike-laptop:~/tmp$ time xz linux-5.15-rc2.tar

real    7m2.658s
user    7m1.096s
sys    0m1.280s

mike@mike-laptop:~/tmp$ time zstd linux-5.15-rc2.tar
linux-5.15-rc2.tar   : 16.29%   (1136803840 => 185233271 bytes, linux-5.15-rc2.tar.zst)

real    0m5.476s
user    0m5.530s
sys    0m0.864s

mike@mike-laptop:~/tmp$ ls -la linux-5.15*
-rw-rw-r-- 1 mike mike 1136803840 Sep 21 15:31 linux-5.15-rc2.tar
-rw-rw-r-- 1 mike mike  198135832 Sep 21 15:24 linux-5.15-rc2.tar.gz
-rw-rw-r-- 1 mike mike  125980548 Sep 21 15:26 linux-5.15-rc2.tar.xz
-rw-rw-r-- 1 mike mike  185233271 Sep 21 15:31 linux-5.15-rc2.tar.zst

So here, the claim that zstd (with default options) is almost as good as
xz in compressed size is not confirmed: the zstd archive is about 47%
larger than the xz one. However, zstd is the clear winner in compression
speed, and its output is still smaller than gzip's. The switch is worth
making.
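
The comparison can be made explicit with a little arithmetic on the figures
listed above. The sizes and "real" times below are copied from the listing;
the variable names are only illustrative.

```python
# Sizes in bytes and wall-clock ("real") times in seconds, from the listing.
sizes = {"tar": 1136803840, "gz": 198135832, "xz": 125980548, "zst": 185233271}
real_seconds = {"gz": 29.293, "xz": 7 * 60 + 2.658, "zst": 5.476}


def ratio_percent(fmt):
    """Compressed size as a percentage of the uncompressed tarball."""
    return 100.0 * sizes[fmt] / sizes["tar"]


for fmt in ("gz", "xz", "zst"):
    print("%-3s %6.2f%% of original, %8.3f s"
          % (fmt, ratio_percent(fmt), real_seconds[fmt]))

# How much larger the zstd archive is than the xz one, in percent.
zst_vs_xz = 100.0 * (sizes["zst"] / sizes["xz"] - 1)
# How much smaller the zstd archive is than the gzip one, in percent.
zst_vs_gz = 100.0 * (1 - sizes["zst"] / sizes["gz"])
# How many times longer xz and gzip took than zstd.
xz_slowdown = real_seconds["xz"] / real_seconds["zst"]
gz_slowdown = real_seconds["gz"] / real_seconds["zst"]
```

On these numbers the zstd archive is roughly 47% larger than the xz one and
about 6.5% smaller than the gzip one, while xz took about 77 times and gzip
about 5 times as long as zstd.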

Cheers

Michael

-- 
Michael Opdenacker, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com



* Re: [OE-core] [PATCH V2] archiver: Configurable tarball compression
  2021-09-21 12:18 ` [OE-core] " Richard Purdie
@ 2021-09-21 13:20   ` Michael Opdenacker
       [not found]   ` <16A6D8EA388563F3.1316@lists.openembedded.org>
  1 sibling, 0 replies; 5+ messages in thread
From: Michael Opdenacker @ 2021-09-21 13:20 UTC
  To: Richard Purdie, Ian Ray, openembedded-core


On 9/21/21 2:18 PM, Richard Purdie wrote:
> On Tue, 2021-09-21 at 09:15 +0300, Ian Ray wrote:
>> In order to be more efficient, we use xz as compression method
>> to create GPL sources archives.
>>
>> Signed-off-by: Fabien Lahoudere <fabien.lahoudere@collabora.com>
>> [V1 was https://patchwork.openembedded.org/patch/155985/]
>> [Rebased]
>> Signed-off-by: Ian Ray <ian.ray@ge.com>
>> ---
>>  meta/classes/archiver.bbclass | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
> Would it be better just to move to zstd? That is, rather than making it
> configurable, just switch to the better compression format?
>
> Configurability is good but where there is a clear good choice, it may be better
> just to do that rather than giving too many options if they aren't really
> needed?


I agree. We shouldn't add unnecessary complexity to the manuals ;-)

By the way, zstd seems to be marginally worse (+1%) than xz in terms of
compressed size, but is orders of magnitude faster (see
https://archlinux.org/news/now-using-zstandard-instead-of-xz-for-package-compression/).

I vote for zstd.

Thanks for starting the discussion.

Cheers,
Michael.

-- 
Michael Opdenacker, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com



* Re: [OE-core] [PATCH V2] archiver: Configurable tarball compression
  2021-09-21  6:15 Ian Ray
@ 2021-09-21 12:18 ` Richard Purdie
  2021-09-21 13:20   ` Michael Opdenacker
       [not found]   ` <16A6D8EA388563F3.1316@lists.openembedded.org>
  0 siblings, 2 replies; 5+ messages in thread
From: Richard Purdie @ 2021-09-21 12:18 UTC
  To: Ian Ray, openembedded-core

On Tue, 2021-09-21 at 09:15 +0300, Ian Ray wrote:
> In order to be more efficient, we use xz as compression method
> to create GPL sources archives.
> 
> Signed-off-by: Fabien Lahoudere <fabien.lahoudere@collabora.com>
> [V1 was https://patchwork.openembedded.org/patch/155985/]
> [Rebased]
> Signed-off-by: Ian Ray <ian.ray@ge.com>
> ---
>  meta/classes/archiver.bbclass | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)

Would it be better just to move to zstd? That is, rather than making it
configurable, just switch to the better compression format?

Configurability is good but where there is a clear good choice, it may be better
just to do that rather than giving too many options if they aren't really
needed?
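
For reference, a switch to zstd would not be a one-word change to the
'w:gz' mode string, since the tarfile module in the Python versions current
at the time of this thread had no zstd write mode. One possible approach,
sketched here under assumptions, is to stream an uncompressed tar into an
external compressor over a pipe. The function name create_tarball_piped is
hypothetical, and the default command assumes a zstd binary is installed
and that "zstd -q -c" reads from stdin and writes to stdout.

```python
import os
import subprocess
import tarfile


def create_tarball_piped(srcdir, tarname, compressor=("zstd", "-q", "-c")):
    """Stream an uncompressed tar of srcdir through an external compressor."""
    with open(tarname, "wb") as out:
        proc = subprocess.Popen(list(compressor), stdin=subprocess.PIPE, stdout=out)
        try:
            # Mode 'w|' writes a non-seekable tar stream, suitable for a pipe.
            with tarfile.open(fileobj=proc.stdin, mode="w|") as tar:
                tar.add(srcdir, arcname=os.path.basename(srcdir))
        finally:
            # Closing stdin signals end of input so the compressor can finish.
            proc.stdin.close()
            returncode = proc.wait()
    if returncode != 0:
        raise RuntimeError("%s exited with status %d" % (compressor[0], returncode))
    return tarname
```

Running the compressor as a separate process also means that any thread or
level options could be passed on its command line, at the cost of depending
on a host tool.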

Cheers,

Richard

