* [PATCH V2] archiver: Configurable tarball compression
From: Ian Ray @ 2021-09-21 6:15 UTC
To: openembedded-core; +Cc: ian.ray
In order to be more efficient, allow the compression method used to
create GPL source archives to be configured (for example, xz).
The default remains gz.
Signed-off-by: Fabien Lahoudere <fabien.lahoudere@collabora.com>
[V1 was https://patchwork.openembedded.org/patch/155985/]
[Rebased]
Signed-off-by: Ian Ray <ian.ray@ge.com>
---
meta/classes/archiver.bbclass | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/meta/classes/archiver.bbclass b/meta/classes/archiver.bbclass
index dd31dc0..411d459 100644
--- a/meta/classes/archiver.bbclass
+++ b/meta/classes/archiver.bbclass
@@ -51,6 +51,7 @@ ARCHIVER_MODE[diff-exclude] ?= ".pc autom4te.cache patches"
ARCHIVER_MODE[dumpdata] ?= "0"
ARCHIVER_MODE[recipe] ?= "0"
ARCHIVER_MODE[mirror] ?= "split"
+ARCHIVER_MODE[compression] ?= "gz"
DEPLOY_DIR_SRC ?= "${DEPLOY_DIR}/sources"
ARCHIVER_TOPDIR ?= "${WORKDIR}/archiver-sources"
@@ -409,15 +410,16 @@ def create_tarball(d, srcdir, suffix, ar_outdir):
     # that we archive the actual directory and not just the link.
     srcdir = os.path.realpath(srcdir)
+    compression_method = d.getVarFlag('ARCHIVER_MODE', 'compression')
     bb.utils.mkdirhier(ar_outdir)
     if suffix:
-        filename = '%s-%s.tar.gz' % (d.getVar('PF'), suffix)
+        filename = '%s-%s.tar.%s' % (d.getVar('PF'), suffix, compression_method)
     else:
-        filename = '%s.tar.gz' % d.getVar('PF')
+        filename = '%s.tar.%s' % (d.getVar('PF'), compression_method)
     tarname = os.path.join(ar_outdir, filename)
     bb.note('Creating %s' % tarname)
-    tar = tarfile.open(tarname, 'w:gz')
+    tar = tarfile.open(tarname, 'w:%s' % compression_method)
     tar.add(srcdir, arcname=os.path.basename(srcdir), filter=exclude_useless_paths)
     tar.close()
--
2.10.1
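The patch passes the flag value straight into the tarfile mode string, so any value that tarfile does not recognise fails only when the archive is opened. A minimal standalone sketch of validating the value first is shown here. The helper names (archive_name_and_mode, create_tarball_sketch) and the SUPPORTED tuple are hypothetical and not part of archiver.bbclass. The sketch assumes a Python whose tarfile offers only the gz, bz2 and xz write modes, which is why zst is rejected.

```python
import os
import tarfile

# Suffixes that map directly onto tarfile write modes ('w:gz', 'w:bz2', 'w:xz').
# Assumption: the running Python's tarfile has no zstd write mode.
SUPPORTED = ("gz", "bz2", "xz")


def archive_name_and_mode(pf, suffix, compression):
    """Return (filename, tarfile mode) for a given PF, suffix and compression."""
    if compression not in SUPPORTED:
        raise ValueError("unsupported compression %r, expected one of: %s"
                         % (compression, ", ".join(SUPPORTED)))
    stem = "%s-%s" % (pf, suffix) if suffix else pf
    return "%s.tar.%s" % (stem, compression), "w:%s" % compression


def create_tarball_sketch(srcdir, outdir, pf, suffix, compression):
    """Create the archive in outdir and return its path (hypothetical helper)."""
    filename, mode = archive_name_and_mode(pf, suffix, compression)
    os.makedirs(outdir, exist_ok=True)
    tarname = os.path.join(outdir, filename)
    # Resolve symlinks so the real directory is archived, as the class does.
    srcdir = os.path.realpath(srcdir)
    with tarfile.open(tarname, mode) as tar:
        tar.add(srcdir, arcname=os.path.basename(srcdir))
    return tarname
```

For example, archive_name_and_mode("foo-1.0-r0", "patched", "xz") gives the file name foo-1.0-r0-patched.tar.xz and the mode w:xz, while "zst" raises ValueError before any file is created.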
* Re: [OE-core] [PATCH V2] archiver: Configurable tarball compression
From: Richard Purdie @ 2021-09-21 12:18 UTC
To: Ian Ray, openembedded-core
On Tue, 2021-09-21 at 09:15 +0300, Ian Ray wrote:
> In order to be more efficient, we use xz as compression method
> to create GPL sources archives.
>
> Signed-off-by: Fabien Lahoudere <fabien.lahoudere@collabora.com>
> [V1 was https://patchwork.openembedded.org/patch/155985/]
> [Rebased]
> Signed-off-by: Ian Ray <ian.ray@ge.com>
> ---
> meta/classes/archiver.bbclass | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
Would it be better just to move to zstd rather than making it configurable,
and simply switch to the better compression format?
Configurability is good but where there is a clear good choice, it may be better
just to do that rather than giving too many options if they aren't really
needed?
Cheers,
Richard
* Re: [OE-core] [PATCH V2] archiver: Configurable tarball compression
From: Michael Opdenacker @ 2021-09-21 13:20 UTC
To: Richard Purdie, Ian Ray, openembedded-core
On 9/21/21 2:18 PM, Richard Purdie wrote:
> On Tue, 2021-09-21 at 09:15 +0300, Ian Ray wrote:
>> In order to be more efficient, we use xz as compression method
>> to create GPL sources archives.
>>
>> Signed-off-by: Fabien Lahoudere <fabien.lahoudere@collabora.com>
>> [V1 was https://patchwork.openembedded.org/patch/155985/]
>> [Rebased]
>> Signed-off-by: Ian Ray <ian.ray@ge.com>
>> ---
>> meta/classes/archiver.bbclass | 8 +++++---
>> 1 file changed, 5 insertions(+), 3 deletions(-)
> Would it be better just to move to zstd and rather than making it configurable,
> just switch to the better compression format?
>
> Configurability is good but where there is a clear good choice, it may be better
> just to do that rather than giving too many options if they aren't really
> needed?
I agree. We shouldn't add unnecessary complexity to the manuals ;-)
By the way, zstd seems to be marginally worse (+1%) than xz in terms of
compressed size, but is orders of magnitude faster (see
https://archlinux.org/news/now-using-zstandard-instead-of-xz-for-package-compression/).
I vote for zstd.
Thanks for starting the discussion.
Cheers,
Michael.
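A switch to zstd would not be a one-word change to the mode string on Python versions whose tarfile module has no zstd mode. One possible approach, sketched here as an assumption and not as what the class does, is to stream an uncompressed tar into the external zstd command. The function name create_zst_tarball is hypothetical, and the sketch needs a zstd binary on PATH.

```python
import shutil
import subprocess
import tarfile


def create_zst_tarball(srcdir, tarname, arcname):
    """Write srcdir as a zstd-compressed tarball by piping tar data to `zstd`."""
    zstd = shutil.which("zstd")
    if zstd is None:
        raise RuntimeError("zstd command not found on PATH")
    with open(tarname, "wb") as out:
        # -q: quiet, -c: write the compressed stream to stdout (our file).
        proc = subprocess.Popen([zstd, "-q", "-c"],
                                stdin=subprocess.PIPE, stdout=out)
        try:
            # 'w|' writes an uncompressed tar stream to a non-seekable file object.
            with tarfile.open(fileobj=proc.stdin, mode="w|") as tar:
                tar.add(srcdir, arcname=arcname)
        finally:
            # tarfile does not close a caller-supplied file object, so do it here.
            proc.stdin.close()
            returncode = proc.wait()
    if returncode != 0:
        raise RuntimeError("zstd exited with status %d" % returncode)
    return tarname
```

The output file would conventionally be named with a .tar.zst suffix.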
--
Michael Opdenacker, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
* Re: [OE-core] [PATCH V2] archiver: Configurable tarball compression
From: Michael Opdenacker @ 2021-09-21 13:48 UTC
To: openembedded-core
On 9/21/21 3:20 PM, Michael Opdenacker wrote:
> By the way, zstd seems to be marginally worse (+1%) than xz in terms of
> compressed size, but is orders of magnitude faster (see
> https://archlinux.org/news/now-using-zstandard-instead-of-xz-for-package-compression/).
Actually, this article only mentions decompression speed, but that's
also true for compression speed.
Here are my own tests:
mike@mike-laptop:~/tmp$ time gzip linux-5.15-rc2.tar
real 0m29.293s
user 0m28.712s
sys 0m0.553s
mike@mike-laptop:~/tmp$ time xz linux-5.15-rc2.tar
real 7m2.658s
user 7m1.096s
sys 0m1.280s
mike@mike-laptop:~/tmp$ time zstd linux-5.15-rc2.tar
linux-5.15-rc2.tar : 16.29% (1136803840 => 185233271 bytes, linux-5.15-rc2.tar.zst)
real 0m5.476s
user 0m5.530s
sys 0m0.864s
mike@mike-laptop:~/tmp$ ls -la linux-5.15*
-rw-rw-r-- 1 mike mike 1136803840 Sep 21 15:31 linux-5.15-rc2.tar
-rw-rw-r-- 1 mike mike 198135832 Sep 21 15:24 linux-5.15-rc2.tar.gz
-rw-rw-r-- 1 mike mike 125980548 Sep 21 15:26 linux-5.15-rc2.tar.xz
-rw-rw-r-- 1 mike mike 185233271 Sep 21 15:31 linux-5.15-rc2.tar.zst
So here, the claim that zstd (with default options) is almost as good as
xz in compressed size is not confirmed: the zstd output is about 47% larger
than the xz one. However, zstd is the clear winner in compression speed,
and its output is still smaller than gzip's. It is worth switching.
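For reference, the ratios behind that conclusion can be worked out from the figures in the transcript above. This is only arithmetic on the posted numbers; the variable names are made up for the sketch.

```python
# Figures copied from the `time` and `ls -la` output above.
SIZES = {"gz": 198135832, "xz": 125980548, "zst": 185233271}  # bytes
WALL = {"gz": 29.293, "xz": 7 * 60 + 2.658, "zst": 5.476}     # seconds (real)


def ratio(table, a, b):
    """How many times larger (or slower) a is than b."""
    return table[a] / table[b]


zst_vs_xz_size = ratio(SIZES, "zst", "xz")   # zstd output relative to xz
zst_vs_gz_size = ratio(SIZES, "zst", "gz")   # zstd output relative to gzip
xz_vs_zst_time = ratio(WALL, "xz", "zst")    # xz compression time relative to zstd
gz_vs_zst_time = ratio(WALL, "gz", "zst")    # gzip compression time relative to zstd

print("zstd size / xz size:   %.2f" % zst_vs_xz_size)
print("zstd size / gzip size: %.2f" % zst_vs_gz_size)
print("xz time / zstd time:   %.1f" % xz_vs_zst_time)
print("gzip time / zstd time: %.1f" % gz_vs_zst_time)
```

On this one tarball, zstd with default options is roughly 77 times faster than xz and about 5 times faster than gzip, at the cost of an archive about 1.47 times the size of the xz one.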
Cheers
Michael
--
Michael Opdenacker, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
* Re: EXT: Re: [OE-core] [PATCH V2] archiver: Configurable tarball compression
From: Ian Ray @ 2021-09-22 8:29 UTC
To: Michael Opdenacker; +Cc: openembedded-core
On Tue, Sep 21, 2021 at 03:48:29PM +0200, Michael Opdenacker wrote:
> So, here the claim that zstd (with default options) is almost as good as
> xz in compressed size is not confirmed. However, zstd is a clear winner
> in terms of compression speed, and anyway better than gzip. This is
> worth switching.
Thank you for measuring this!
I will re-submit the patch when we update to a more recent Yocto
version.
* Re: [PATCH V2] archiver: Configurable tarball compression
From: Martyn Welch @ 2021-10-27 11:31 UTC
To: openembedded-core
>
> So, here the claim that zstd (with default options) is almost as good as
> xz in compressed size is not confirmed. However, zstd is a clear winner
> in terms of compression speed, and anyway better than gzip. This is
> worth switching.
>
That claim doesn't seem to be confirmed by any of the (admittedly small) selection of archives I tried: the zstd output was approximately 21 to 69% larger than the xz output, although zstd was still the fastest at both compression and decompression.
However, I think this neatly highlights why it may make sense to make this configurable, as which algorithm is "best" depends on whether you're optimising for (de)compression speed or for size.
Testing results below,
Martyn
---
$ time gzip -k linux-5.14.tar
real 0m26.807s
user 0m26.392s
sys 0m0.368s
$ time xz -k linux-5.14.tar
real 6m42.494s
user 6m40.167s
sys 0m1.757s
$ time zstd -k linux-5.14.tar
linux-5.14.tar : 16.28% (1126737920 => 183398470 bytes, linux-5.14.tar.zst)
real 0m3.531s
user 0m3.631s
sys 0m0.509s
$ ls -la *
-rw-r--r-- 1 martyn martyn 1126737920 Oct 27 10:54 linux-5.14.tar
-rw-r--r-- 1 martyn martyn 196107916 Oct 27 10:54 linux-5.14.tar.gz
-rw-r--r-- 1 martyn martyn 124724612 Oct 27 10:54 linux-5.14.tar.xz
-rw-r--r-- 1 martyn martyn 183398470 Oct 27 10:54 linux-5.14.tar.zst
$ time gunzip linux-5.14.tar.gz
real 0m5.141s
user 0m4.462s
sys 0m0.613s
$ time xz -d linux-5.14.tar.xz
real 0m8.571s
user 0m7.739s
sys 0m0.820s
$ time zstd -d linux-5.14.tar.zst
linux-5.14.tar.zst : 1126737920 bytes
real 0m1.906s
user 0m1.185s
sys 0m0.710s
$ time gzip -k coreutils-9.0.tar
real 0m1.685s
user 0m1.669s
sys 0m0.016s
$ time xz -k coreutils-9.0.tar
real 0m14.891s
user 0m14.795s
sys 0m0.060s
$ time zstd -k coreutils-9.0.tar
coreutils-9.0.tar : 19.21% (54394880 => 10447053 bytes, coreutils-9.0.tar.zst)
real 0m0.207s
user 0m0.215s
sys 0m0.029s
$ ls -la coreutils-9.0.tar*
-rw-r--r-- 1 martyn martyn 54394880 Oct 27 11:16 coreutils-9.0.tar
-rw-r--r-- 1 martyn martyn 13595007 Oct 27 11:16 coreutils-9.0.tar.gz
-rw-r--r-- 1 martyn martyn 6177372 Oct 27 11:16 coreutils-9.0.tar.xz
-rw-r--r-- 1 martyn martyn 10447053 Oct 27 11:16 coreutils-9.0.tar.zst
$ time gzip -d coreutils-9.0.tar.gz
real 0m0.362s
user 0m0.280s
sys 0m0.048s
$ time xz -d coreutils-9.0.tar.xz
real 0m0.444s
user 0m0.424s
sys 0m0.020s
$ time zstd -d coreutils-9.0.tar.zst
coreutils-9.0.tar.zst: 54394880 bytes
real 0m0.095s
user 0m0.044s
sys 0m0.052s
$ time gzip -k tcp_wrappers_7.6.tar
real 0m0.033s
user 0m0.033s
sys 0m0.000s
$ time xz -k tcp_wrappers_7.6.tar
real 0m0.116s
user 0m0.104s
sys 0m0.012s
$ time zstd -k tcp_wrappers_7.6.tar
tcp_wrappers_7.6.tar : 26.57% (360448 => 95772 bytes, tcp_wrappers_7.6.tar.zst)
real 0m0.006s
user 0m0.003s
sys 0m0.003s
$ ls -la tcp_wrappers_7.6.tar*
-rw-r--r-- 1 martyn martyn 360448 Oct 27 11:15 tcp_wrappers_7.6.tar
-rw-r--r-- 1 martyn martyn 99459 Oct 27 11:15 tcp_wrappers_7.6.tar.gz
-rw-r--r-- 1 martyn martyn 79316 Oct 27 11:15 tcp_wrappers_7.6.tar.xz
-rw-r--r-- 1 martyn martyn 95772 Oct 27 11:15 tcp_wrappers_7.6.tar.zst
$ time gzip -d tcp_wrappers_7.6.tar.gz
real 0m0.008s
user 0m0.004s
sys 0m0.004s
$ time xz -d tcp_wrappers_7.6.tar.xz
real 0m0.019s
user 0m0.015s
sys 0m0.004s
$ time zstd -d tcp_wrappers_7.6.tar.zst
tcp_wrappers_7.6.tar.zst: 360448 bytes
real 0m0.005s
user 0m0.000s
sys 0m0.005s
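The 21 to 69% range quoted at the top of this message can be reproduced from the ls listings above. A small sketch of the arithmetic, with made-up variable names:

```python
# .tar.xz and .tar.zst sizes in bytes, copied from the `ls -la` output above.
ARCHIVES = {
    "linux-5.14": {"xz": 124724612, "zst": 183398470},
    "coreutils-9.0": {"xz": 6177372, "zst": 10447053},
    "tcp_wrappers_7.6": {"xz": 79316, "zst": 95772},
}


def overhead_percent(sizes):
    """How much larger the zstd archive is than the xz archive, in percent."""
    return 100.0 * (sizes["zst"] - sizes["xz"]) / sizes["xz"]


overheads = {name: overhead_percent(sizes) for name, sizes in ARCHIVES.items()}

for name, pct in sorted(overheads.items(), key=lambda item: item[1]):
    print("%-18s zstd is %.0f%% larger than xz" % (name, pct))
```

The smallest overhead comes from tcp_wrappers_7.6 and the largest from coreutils-9.0, with the kernel tarball in between at about 47%.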