* [PATCH v2 0/7] lib/lzo: performance improvements
@ 2018-11-27 16:19 Dave Rodgman
From: Dave Rodgman @ 2018-11-27 16:19 UTC
  To: linux-kernel
  Cc: nd, herbert, davem, Matt Sealey, nitingupta910, rpurdie, markus,
	minchan, sergey.senozhatsky.work, sonnyrao, gregkh, akpm

This patch series introduces performance improvements for lzo.

The previous version of this patchset is here:
https://lkml.org/lkml/2018/11/21/625

This version tidies up the ifdefs as per Christoph's comment (although
certainly more could be done, this is at least a bit more consistent
with normal kernel coding style).

On 23/11/2018 2:12 am, Sergey Senozhatsky wrote:

>> The graph below shows the weighted round-trip throughput of lzo, lz4 and
>> lzo-rle, for randomly generated 4k chunks of data with varying levels of
>> entropy. (To calculate weighted round-trip throughput, compression performance
>> is emphasised to reflect the fact that zram does around 2.25x more compression
>> than decompression.)
> 
> Right. The number is data dependent. Not all swapped out pages can be
> compressed; compressed pages that end up being >= zs_huge_class_size() are
> considered incompressible and stored as is.
> 
> I'd say that on my setups around 50-60% of pages are incompressible.
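
As an aside on the weighting mentioned in the quoted text: the exact
formula is not spelled out in this thread, so the following is only a
sketch under an assumption, namely that the weighted figure is a
time-weighted (harmonic) mean in which comp_weight times more data is
compressed than decompressed. The function name and parameters are
hypothetical.

```c
/*
 * Hypothetical helper (assumed formula, not taken from the patches):
 * time-weighted round-trip throughput.
 *
 * comp_mbps:   compression throughput in MB/s
 * decomp_mbps: decompression throughput in MB/s
 * comp_weight: how many times more data is compressed than
 *              decompressed (around 2.25 for zram per the text above)
 */
static double weighted_round_trip(double comp_mbps, double decomp_mbps,
				  double comp_weight)
{
	/* Seconds needed to compress comp_weight MB and decompress 1 MB. */
	double seconds = comp_weight / comp_mbps + 1.0 / decomp_mbps;

	/* Total MB processed divided by the total time taken. */
	return (comp_weight + 1.0) / seconds;
}
```

With equal compression and decompression speeds this reduces to that
common speed; otherwise the result is pulled towards the compression
speed, which is the intent of the weighting.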

So, just to give a bit more detail: the test setup was a Samsung
Chromebook Pro, cycling through 80 tabs in Chrome. With lzo-rle, only
5% of pages increased in size, and 90% of pages compressed to 75% of
their original size (or better). The mean compression ratio was 41%.
Importantly for lzo-rle, there are a lot of low-entropy pages where it
can do well: in total, about 20% of the data is zeros forming part of a
run of 4 or more bytes.
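
To make the "zeros forming part of a run of 4 or more bytes" metric
concrete, here is a minimal sketch of how such a count could be taken
over a buffer. This is not the code used to gather the numbers above;
count_zero_run_bytes() is a hypothetical helper and is not part of
lib/lzo.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of lib/lzo: returns the number of
 * bytes in buf that are zero and belong to a run of at least min_run
 * consecutive zero bytes.
 */
static size_t count_zero_run_bytes(const unsigned char *buf, size_t len,
				   size_t min_run)
{
	size_t total = 0;
	size_t i = 0;

	while (i < len) {
		size_t run = 0;

		if (buf[i] != 0) {
			i++;
			continue;
		}

		/* Measure the length of the zero run starting at i. */
		while (i + run < len && buf[i + run] == 0)
			run++;

		/* Only runs that are long enough contribute to the total. */
		if (run >= min_run)
			total += run;

		i += run;
	}

	return total;
}
```

Dividing the result for min_run = 4 by the buffer length gives the
fraction of the data that falls into such runs.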

As a quick summary of the impact of these patches on bigger chunks of
data, I've compared the performance of four different variants of lzo
on two large (~40 MB) files. The numbers show round-trip throughput
in MB/s:

Variant         | Low-entropy (MB/s) | High-entropy (MB/s)
----------------+--------------------+--------------------
Current lzo     |  242               | 157
Arm opts        |  290               | 159
RLE             |  876               | 151
Arm opts + RLE  | 1150               | 181

So both the Arm optimisations (the 8-byte and 16-byte copy patches and
the CTZ patches) and the RLE implementation make a significant
contribution to the overall performance uplift.
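
For readers who want the relative numbers, the sketch below derives
speedups over the current lzo from the throughput figures in the table.
The struct and function names are illustrative only; the values are
copied from the table and nothing else is measured here.

```c
/* Round-trip throughput in MB/s, copied from the table above. */
struct lzo_result {
	const char *variant;
	double low_entropy;
	double high_entropy;
};

static const struct lzo_result results[] = {
	{ "Current lzo",     242.0, 157.0 },
	{ "Arm opts",        290.0, 159.0 },
	{ "RLE",             876.0, 151.0 },
	{ "Arm opts + RLE", 1150.0, 181.0 },
};

/* Speedup of variant i over the "Current lzo" baseline (index 0). */
static double speedup_low(unsigned int i)
{
	return results[i].low_entropy / results[0].low_entropy;
}

static double speedup_high(unsigned int i)
{
	return results[i].high_entropy / results[0].high_entropy;
}
```

On these figures the combined variant comes out at roughly 4.75x on the
low-entropy file and roughly 1.15x on the high-entropy file, while RLE
alone is about 4% slower than current lzo on the high-entropy file
(151 vs 157 MB/s).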



Thread overview: 19+ messages
2018-11-27 16:19 [PATCH v2 0/7] lib/lzo: performance improvements Dave Rodgman
2018-11-27 16:19 ` [PATCH 1/7] lib/lzo: tidy-up ifdefs Dave Rodgman
2018-11-27 16:19 ` [PATCH 2/7] lib/lzo: clean-up by introducing COPY16 Dave Rodgman
2018-11-27 22:50   ` Andrew Morton
2018-11-27 16:19 ` [PATCH 3/7] lib/lzo: enable 64-bit CTZ on Arm Dave Rodgman
2018-11-27 16:19 ` [PATCH 4/7] lib/lzo: 64-bit CTZ on arm64 Dave Rodgman
2018-11-27 16:19 ` [PATCH 5/7] lib/lzo: fast 8-byte copy " Dave Rodgman
2018-11-27 16:19 ` [PATCH 6/7] lib/lzo: implement run-length encoding Dave Rodgman
2018-11-29  3:08   ` Sergey Senozhatsky
2018-11-29  3:11     ` Sergey Senozhatsky
2018-11-27 16:19 ` [PATCH 7/7] lib/lzo: separate lzo-rle from lzo Dave Rodgman
2018-11-29  4:43   ` Sergey Senozhatsky
2018-11-29 10:21     ` Dave Rodgman
2018-11-29 20:32       ` Andrew Morton
2018-11-30  3:05       ` Sergey Senozhatsky
2018-11-30 10:45     ` Dave Rodgman
2018-12-03  2:40       ` Sergey Senozhatsky
2018-12-03  2:53         ` Herbert Xu
2018-11-29  4:46 ` [PATCH v2 0/7] lib/lzo: performance improvements Sergey Senozhatsky
