linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/7] lib/lzo: performance improvements
@ 2018-11-27 16:19 Dave Rodgman
  2018-11-27 16:19 ` [PATCH 1/7] lib/lzo: tidy-up ifdefs Dave Rodgman
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Dave Rodgman @ 2018-11-27 16:19 UTC
  To: linux-kernel
  Cc: nd, herbert, davem, Matt Sealey, nitingupta910, rpurdie, markus,
	minchan, sergey.senozhatsky.work, sonnyrao, gregkh, akpm

This patch series introduces performance improvements for lzo.

The previous version of this patchset is here:
https://lkml.org/lkml/2018/11/21/625

This version tidies up the ifdefs as per Christoph's comment (although
certainly more could be done, this is at least a bit more consistent
with normal kernel coding style).

On 23/11/2018 2:12 am, Sergey Senozhatsky wrote:

>> The graph below shows the weighted round-trip throughput of lzo, lz4 and
>> lzo-rle, for randomly generated 4k chunks of data with varying levels of
>> entropy. (To calculate weighted round-trip throughput, compression performance
>> is emphasised to reflect the fact that zram does around 2.25x more compression
>> than decompression.)
> 
> Right. The number is data dependent. Not all swapped out pages can be
> compressed; compressed pages that end up being >= zs_huge_class_size() are
> considered incompressible and stored as is.
> 
> I'd say that on my setups around 50-60% of pages are incompressible.

So, just to give a bit more detail: the test setup was a Samsung
Chromebook Pro, cycling through 80 tabs in Chrome. With lzo-rle, only
5% of pages increased in size, and 90% of pages compressed to 75% of
their original size (or better). Mean compression ratio was 41%.
Importantly for lzo-rle, there are a lot of low-entropy pages where it
can do well: in total, about 20% of the data is zeros forming part of a
run of 4 or more bytes.
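
For anyone who wants to check the zero-run figure on their own data, the
measurement can be sketched as below. This is a minimal illustration in
Python, not the tool used to produce the numbers above; the function name
`zero_run_fraction` and the `min_run` parameter are hypothetical.

```python
def zero_run_fraction(data: bytes, min_run: int = 4) -> float:
    """Return the fraction of bytes in `data` that are zero and belong
    to a run of at least `min_run` consecutive zero bytes."""
    if not data:
        return 0.0

    counted = 0  # zero bytes that sit inside a long-enough run
    run = 0      # length of the zero run currently being scanned

    for byte in data:
        if byte == 0:
            run += 1
        else:
            # A non-zero byte ends the current run; count it if long enough.
            if run >= min_run:
                counted += run
            run = 0

    # Account for a run that extends to the end of the buffer.
    if run >= min_run:
        counted += run

    return counted / len(data)


if __name__ == "__main__":
    # Hypothetical 4 KiB page: 1 KiB of zeros followed by non-zero bytes.
    page = bytes(1024) + bytes([0xAB]) * 3072
    print(f"{zero_run_fraction(page):.0%} of the page is in zero runs")
```

Short runs (fewer than `min_run` zeros) are deliberately ignored, matching
the "run of 4 or more bytes" criterion.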

As a quick summary of the impact of these patches on bigger chunks of
data, I've compared the performance of four different variants of lzo
on two large (~40 MB) files. The numbers show round-trip throughput
in MB/s:

Variant         | Low-entropy | High-entropy
----------------+-------------+-------------
Current lzo     |  242        | 157
Arm opts        |  290        | 159
RLE             |  876        | 151
Arm opts + RLE  | 1150        | 181

So both the Arm optimisations (the 8- and 16-byte copy and CTZ patches)
and the RLE implementation make a significant contribution to the
overall performance uplift.
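
To make the relative gains easier to read, the throughput figures in the
table above can be normalised against current lzo. The snippet below is
only arithmetic on those figures; the helper name `speedups` is
hypothetical.

```python
# Round-trip throughput in MB/s, copied from the table:
# variant -> (low-entropy file, high-entropy file)
throughput = {
    "Current lzo": (242, 157),
    "Arm opts": (290, 159),
    "RLE": (876, 151),
    "Arm opts + RLE": (1150, 181),
}


def speedups(table, baseline="Current lzo"):
    """Return each variant's throughput divided by the baseline's."""
    base_low, base_high = table[baseline]
    return {
        name: (low / base_low, high / base_high)
        for name, (low, high) in table.items()
    }


for name, (low, high) in speedups(throughput).items():
    print(f"{name:<15} | {low:5.2f}x | {high:5.2f}x")
```

On these figures the combined series is roughly 4.75x faster than current
lzo on the low-entropy file and about 1.15x faster on the high-entropy
file, while RLE on its own is slightly slower (about 0.96x) on the
high-entropy file.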



Thread overview: 19+ messages
2018-11-27 16:19 [PATCH v2 0/7] lib/lzo: performance improvements Dave Rodgman
2018-11-27 16:19 ` [PATCH 1/7] lib/lzo: tidy-up ifdefs Dave Rodgman
2018-11-27 16:19 ` [PATCH 2/7] lib/lzo: clean-up by introducing COPY16 Dave Rodgman
2018-11-27 22:50   ` Andrew Morton
2018-11-27 16:19 ` [PATCH 3/7] lib/lzo: enable 64-bit CTZ on Arm Dave Rodgman
2018-11-27 16:19 ` [PATCH 4/7] lib/lzo: 64-bit CTZ on arm64 Dave Rodgman
2018-11-27 16:19 ` [PATCH 5/7] lib/lzo: fast 8-byte copy " Dave Rodgman
2018-11-27 16:19 ` [PATCH 6/7] lib/lzo: implement run-length encoding Dave Rodgman
2018-11-29  3:08   ` Sergey Senozhatsky
2018-11-29  3:11     ` Sergey Senozhatsky
2018-11-27 16:19 ` [PATCH 7/7] lib/lzo: separate lzo-rle from lzo Dave Rodgman
2018-11-29  4:43   ` Sergey Senozhatsky
2018-11-29 10:21     ` Dave Rodgman
2018-11-29 20:32       ` Andrew Morton
2018-11-30  3:05       ` Sergey Senozhatsky
2018-11-30 10:45     ` Dave Rodgman
2018-12-03  2:40       ` Sergey Senozhatsky
2018-12-03  2:53         ` Herbert Xu
2018-11-29  4:46 ` [PATCH v2 0/7] lib/lzo: performance improvements Sergey Senozhatsky
