* mm performance with zram
@ 2015-01-08 22:49 Luigi Semenzato
  2015-01-09  6:30 ` Andrew Morton
  2015-01-13  9:18 ` Vlastimil Babka
  0 siblings, 2 replies; 8+ messages in thread
From: Luigi Semenzato @ 2015-01-08 22:49 UTC (permalink / raw)
  To: linux-mm

I am taking a closer look at the performance of the Linux MM in the
context of heavy zram usage.  The bottom line is that there is
surprisingly high overhead (35-40%) from MM code other than
compression/decompression routines.  I'd like to share some results in
the hope they will be helpful in planning future development.

SETUP

I am running on an ASUS Chromebox with about 2GB RAM (actually 4GB,
but with mem=1931M).  The zram block device size is approx. 2.8GB
(uncompressed size).

http://www.amazon.com/Asus-CHROMEBOX-M004U-ASUS-Desktop/dp/B00IT1WJZQ

Intel(R) Celeron(R) 2955U @ 1.40GHz
MemTotal:        1930456 kB
SwapTotal:       2827816 kB

I took the kernel from Linus's tree a few days ago: Linux localhost
3.19.0-rc2+ (...) x86_64.  I also set maxcpus=1.  The kernel
configuration is available if needed.

EXPERIMENTS

I wrote a page walker (historically called "balloon") which allocates
a lot of memory, more than physical RAM, and fills it with a dump of
/dev/mem from a Chrome OS system running at capacity.  The memory
compresses down to about 35% of its original size.  I ran two main
experiments; a rough sketch of the walker follows their descriptions.

1. Compression/decompression.  After filling the memory, the program
touches the first byte of every page in a random permutation (I also
tried a sequential order; it makes little difference).  At steady
state, this forces one page decompression and one compression (on
average) at each step of the walk.

2. Decompression only.  After filling the memory, the program walks
all pages sequentially.  Then it frees the second half of the pages
(the ones most recently touched), and walks the first half.  This
causes one page decompression at each step, and almost no
compressions.
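
For concreteness, here is a minimal sketch of the kind of walker used
in experiment 1.  This is not the actual balloon program: the mapping
size, the fill pattern, and the RNG are placeholders, and error
handling is minimal.

/*
 * Minimal sketch of a "balloon"-style walker for experiment 1 (not the
 * actual Chrome OS tool; sizes, fill data and RNG are placeholders).
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t npages = (size_t)700 * 1024;   /* ~2.8GB with 4KB pages */
	char *mem = mmap(NULL, npages * page, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	size_t *order = malloc(npages * sizeof(*order));
	struct timespec t0, t1;
	volatile char sink = 0;
	size_t i, j, tmp;

	if (mem == MAP_FAILED || order == NULL)
		return 1;

	/* Fill every page with somewhat compressible data (a stand-in
	 * for the /dev/mem dump used in the real experiment). */
	for (i = 0; i < npages; i++)
		memset(mem + i * page, (int)(i & 0x3f), page / 3);

	/* Random permutation of page indices (Fisher-Yates). */
	for (i = 0; i < npages; i++)
		order[i] = i;
	srandom(1);
	for (i = npages - 1; i > 0; i--) {
		j = (size_t)random() % (i + 1);
		tmp = order[i]; order[i] = order[j]; order[j] = tmp;
	}

	/* Touch the first byte of every page in random order; at steady
	 * state each touch forces roughly one decompression plus one
	 * compression through the zram swap device. */
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < npages; i++)
		sink += mem[order[i] * page];
	clock_gettime(CLOCK_MONOTONIC, &t1);

	printf("%.1f us/page\n",
	       ((t1.tv_sec - t0.tv_sec) * 1e6 +
		(t1.tv_nsec - t0.tv_nsec) / 1e3) / npages);

	/* Experiment 2 would instead walk sequentially, release the
	 * second half (e.g. with madvise(MADV_DONTNEED)), and then
	 * re-walk the first half. */
	return 0;
}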

RESULTS

The average time (real time) to walk a page in microseconds is

experiment 1 (compress + decompress): 26.5 us/page
experiment 2 (decompress only): 9.3 us/page

I ran "perf record -ag"during the relevant parts of the experiment.
(CAVEAT: the version of perf I used doesn't match the kernel, it's
quite a bit older, but that should be mostly OK).  I put the output of
"perf report" in this Google Drive folder:

https://drive.google.com/folderview?id=0B6kmZ3mOd0bzVzJKeTV6eExfeFE&usp=sharing

(You shouldn't need a Google ID to access it.  You may have to re-join
the link if the plain text mailer splits it into multiple lines.)

I also tried to analyze cumulative graph profiles.  Interestingly, the
only tool I found to do this is gprof2dot (any other suggestions?  I
would prefer a text-based tool).  The output is in the .png files in
the same folder.  The interesting numbers are:

experiment 1
compression 43.2%
decompression 20.4%
everything else 36.4%

experiment 2
decompression 61.7%
everything else 38.3%
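
Combining these shares with the per-page times above (and assuming the
cumulative profile shares roughly track wall-clock time on this
single-CPU, CPU-bound workload), the absolute cost of the "everything
else" part comes out to roughly:

experiment 1: 0.364 * 26.5 us/page ~= 9.6 us/page
experiment 2: 0.383 *  9.3 us/page ~= 3.6 us/page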

The graph profiles don't seem to show low-hanging fruit on any path.

CONCLUSION

Before zram, in a situation involving swapping, the MM overhead was
probably nearly invisible, especially with rotating disks.  But with
zram the MM is surprisingly close to being the main bottleneck.
Compression/decompression speeds will likely improve, and they are
tunable (a tradeoff between compression ratio and speed).  Compression
can often happen in the background, so decompression speed matters more
for latency, and LZ4 decompression can already be a lot faster than LZO
(the experiments use LZO, and LZ4 can be 2x faster).
This suggests that simplifying and speeding up the relevant code paths
in the Linux MM may be worth the effort.


* Re: mm performance with zram
  2015-01-08 22:49 mm performance with zram Luigi Semenzato
@ 2015-01-09  6:30 ` Andrew Morton
  2015-01-09 16:45   ` Luigi Semenzato
  2015-01-13  9:18 ` Vlastimil Babka
  1 sibling, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2015-01-09  6:30 UTC (permalink / raw)
  To: Luigi Semenzato; +Cc: linux-mm

On Thu, 8 Jan 2015 14:49:45 -0800 Luigi Semenzato <semenzato@google.com> wrote:

> I am taking a closer look at the performance of the Linux MM in the
> context of heavy zram usage.  The bottom line is that there is
> surprisingly high overhead (35-40%) from MM code other than
> compression/decompression routines.

Those images hurt my eyes.

Did you work out where the time is being spent?


* Re: mm performance with zram
  2015-01-09  6:30 ` Andrew Morton
@ 2015-01-09 16:45   ` Luigi Semenzato
  2015-01-10  0:06     ` Joonsoo Kim
  0 siblings, 1 reply; 8+ messages in thread
From: Luigi Semenzato @ 2015-01-09 16:45 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm

On Thu, Jan 8, 2015 at 10:30 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Thu, 8 Jan 2015 14:49:45 -0800 Luigi Semenzato <semenzato@google.com> wrote:
>
>> I am taking a closer look at the performance of the Linux MM in the
>> context of heavy zram usage.  The bottom line is that there is
>> surprisingly high overhead (35-40%) from MM code other than
>> compression/decompression routines.
>
> Those images hurt my eyes.

Sorry about that.  I didn't find other ways of computing the
cumulative cost of functions (i.e. time spent in a function and all
its descendants, like in gprof).  I couldn't get perf to do that
either.  A flat profile shows most functions take a fraction of 1%, so
it's not useful.  If anybody knows a better way I'll be glad to use
it.

> Did you work out where the time is being spent?

No, unfortunately it's difficult to make sense of the graph profile as
well, especially given my limited familiarity with the code.  There is
a surprisingly large number of different callers into the heaviest
nodes, and I cannot tell which paths correspond to which high-level
actions.


* Re: mm performance with zram
  2015-01-09 16:45   ` Luigi Semenzato
@ 2015-01-10  0:06     ` Joonsoo Kim
  2015-01-10  0:19       ` Luigi Semenzato
  0 siblings, 1 reply; 8+ messages in thread
From: Joonsoo Kim @ 2015-01-10  0:06 UTC (permalink / raw)
  To: Luigi Semenzato; +Cc: Andrew Morton, Linux Memory Management List

2015-01-10 1:45 GMT+09:00 Luigi Semenzato <semenzato@google.com>:
> On Thu, Jan 8, 2015 at 10:30 PM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
>> On Thu, 8 Jan 2015 14:49:45 -0800 Luigi Semenzato <semenzato@google.com> wrote:
>>
>>> I am taking a closer look at the performance of the Linux MM in the
>>> context of heavy zram usage.  The bottom line is that there is
>>> surprisingly high overhead (35-40%) from MM code other than
>>> compression/decompression routines.
>>
>> Those images hurt my eyes.
>
> Sorry about that.  I didn't find other ways of computing the
> cumulative cost of functions (i.e. time spent in a function and all
> its descendants, like in gprof).  I couldn't get perf to do that
> either.  A flat profile shows most functions take a fraction of 1%, so
> it's not useful.  If anybody knows a better way I'll be glad to use
> it.

Hello,

Recent versions of perf can compute the cumulative cost of functions,
and it's the default configuration. :)
If you switch to a recent version of perf, you can easily get that data.

Thanks.


* Re: mm performance with zram
  2015-01-10  0:06     ` Joonsoo Kim
@ 2015-01-10  0:19       ` Luigi Semenzato
  2015-01-10  0:42         ` Joonsoo Kim
  0 siblings, 1 reply; 8+ messages in thread
From: Luigi Semenzato @ 2015-01-10  0:19 UTC (permalink / raw)
  To: Joonsoo Kim; +Cc: Andrew Morton, Linux Memory Management List

Thank you!  I am using perf version 3.13.11.10, will look for newer versions.

On Fri, Jan 9, 2015 at 4:06 PM, Joonsoo Kim <js1304@gmail.com> wrote:
> 2015-01-10 1:45 GMT+09:00 Luigi Semenzato <semenzato@google.com>:
>> On Thu, Jan 8, 2015 at 10:30 PM, Andrew Morton
>> <akpm@linux-foundation.org> wrote:
>>> On Thu, 8 Jan 2015 14:49:45 -0800 Luigi Semenzato <semenzato@google.com> wrote:
>>>
>>>> I am taking a closer look at the performance of the Linux MM in the
>>>> context of heavy zram usage.  The bottom line is that there is
>>>> surprisingly high overhead (35-40%) from MM code other than
>>>> compression/decompression routines.
>>>
>>> Those images hurt my eyes.
>>
>> Sorry about that.  I didn't find other ways of computing the
>> cumulative cost of functions (i.e. time spent in a function and all
>> its descendants, like in gprof).  I couldn't get perf to do that
>> either.  A flat profile shows most functions take a fraction of 1%, so
>> it's not useful.  If anybody knows a better way I'll be glad to use
>> it.
>
> Hello,
>
> Recent version of perf has an ability to compute cumulative cost of functions.
> And, it's a default configuration. :)
> If you change your perf to recent version, you can easily get the data.
>
> Thanks.


* Re: mm performance with zram
  2015-01-10  0:19       ` Luigi Semenzato
@ 2015-01-10  0:42         ` Joonsoo Kim
  0 siblings, 0 replies; 8+ messages in thread
From: Joonsoo Kim @ 2015-01-10  0:42 UTC (permalink / raw)
  To: Luigi Semenzato; +Cc: Andrew Morton, Linux Memory Management List

2015-01-10 9:19 GMT+09:00 Luigi Semenzato <semenzato@google.com>:
> Thank you!  I am using perf version 3.13.11.10, will look for newer versions.

I said one misleading word, 'default'.  The command should have the -g
option when recording:

perf record -g xxxx
perf report

Also, I did a quick test and found that this feature is usable with
perf >= 3.16.

Thanks.


* Re: mm performance with zram
  2015-01-08 22:49 mm performance with zram Luigi Semenzato
  2015-01-09  6:30 ` Andrew Morton
@ 2015-01-13  9:18 ` Vlastimil Babka
  2015-01-13 16:25   ` Luigi Semenzato
  1 sibling, 1 reply; 8+ messages in thread
From: Vlastimil Babka @ 2015-01-13  9:18 UTC (permalink / raw)
  To: Luigi Semenzato, linux-mm

On 01/08/2015 11:49 PM, Luigi Semenzato wrote:
> I am taking a closer look at the performance of the Linux MM in the
> context of heavy zram usage.  The bottom line is that there is
> surprisingly high overhead (35-40%) from MM code other than
> compression/decompression routines.  I'd like to share some results in
> the hope they will be helpful in planning future development.
> 
> SETUP
> 
> I am running on an ASUS Chromebox with about 2GB RAM (actually 4GB,
> but with mem=1931M).  The zram block device size is approx. 2.8GB
> (uncompressed size).
> 
> http://www.amazon.com/Asus-CHROMEBOX-M004U-ASUS-Desktop/dp/B00IT1WJZQ
> 
> Intel(R) Celeron(R) 2955U @ 1.40GHz
> MemTotal:        1930456 kB
> SwapTotal:       2827816 kB
> 
> I took the kernel from Linus's tree a few days ago: Linux localhost
> 3.19.0-rc2+ (...) x86_64.  I also set maxcpus=1.  The kernel
> configuration is available if needed.
> 
> EXPERIMENTS
> 
> I wrote a page walker (historically called "balloon") which allocates
> a lot of memory, more than physical RAM, and fills it with a dump of
> /dev/mem from a Chrome OS system running at capacity.  The memory
> compresses down to about 35%.  I ran two main experiments.
> 
> 1. Compression/decompression.  After filling the memory, the program
> touches the first byte of all pages in a random permutation (I tried
> sequentially too, it makes little difference).  At steady state, this
> forces one page decompression and one compression (on average) at each
> step of the walk.
> 
> 2. Decompression only.  After filling the memory, the program walks
> all pages sequentially.  Then it frees the second half of the pages
> (the ones most recently touched), and walks the first half.  This
> causes one page decompression at each step, and almost no
> compressions.
> 
> RESULTS
> 
> The average time (real time) to walk a page in microseconds is
> 
> experiment 1 (compress + decompress): 26.5  us/page
> experiment 2 (decompress only): 9.3 us/page
> 
> I ran "perf record -ag"during the relevant parts of the experiment.
> (CAVEAT: the version of perf I used doesn't match the kernel, it's
> quite a bit older, but that should be mostly OK).  I put the output of
> "perf report" in this Google Drive folder:
> 
> https://drive.google.com/folderview?id=0B6kmZ3mOd0bzVzJKeTV6eExfeFE&usp=sharing
> 
> (You shouldn't need a Google ID to access it.  You may have to re-join
> the link if the plain text mailer splits it into multiple lines.)
> 
> I also tried to analyze cumulative graph profiles.  Interestingly the
> only tool I found to do this is gprof2dot (any other suggestion?  I

I think this could be useful for better graphs here:

http://www.brendangregg.com/flamegraphs.html

> would prefer a text-based tool).  The output is in the .png files in
> the same folder.  The interesting numbers are:
> 
> experiment 1
> compression 43.2%
> decompression 20.4%
> everything else 36.4%
> 
> experiment 2
> decompression 61.7%
> everything else 38.3%
> 
> The graph profiles don't seem to show low-hanging fruits on any path.
> 
> CONCLUSION
> 
> Before zram, in a situation involving swapping, the MM overhead was
> probably nearly invisible, especially with rotating disks.  But with
> zram the MM is surprisingly close to being the main bottleneck.
> Compression/decompression speeds will likely improve, and they are
> tuneable (tradeoff between compression ratio and speed).  Compression
> can happen often in the background, so decompression speed is more
> important for latency, and LZ4 decompression can already be a lot
> faster than LZO (the experiments use LZO, and LZ4 can be 2x faster).
> This suggests that simplifying and speeding up the relevant code paths
> in the Linux MM may be worth the effort.

* Re: mm performance with zram
  2015-01-13  9:18 ` Vlastimil Babka
@ 2015-01-13 16:25   ` Luigi Semenzato
  0 siblings, 0 replies; 8+ messages in thread
From: Luigi Semenzato @ 2015-01-13 16:25 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: Linux Memory Management List

Very nice!  But that looks like a tree.  What if a function is called
from multiple locations?

On Tue, Jan 13, 2015 at 1:18 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
> On 01/08/2015 11:49 PM, Luigi Semenzato wrote:
>> I am taking a closer look at the performance of the Linux MM in the
>> context of heavy zram usage.  The bottom line is that there is
>> surprisingly high overhead (35-40%) from MM code other than
>> compression/decompression routines.  I'd like to share some results in
>> the hope they will be helpful in planning future development.
>>
>> SETUP
>>
>> I am running on an ASUS Chromebox with about 2GB RAM (actually 4GB,
>> but with mem=1931M).  The zram block device size is approx. 2.8GB
>> (uncompressed size).
>>
>> http://www.amazon.com/Asus-CHROMEBOX-M004U-ASUS-Desktop/dp/B00IT1WJZQ
>>
>> Intel(R) Celeron(R) 2955U @ 1.40GHz
>> MemTotal:        1930456 kB
>> SwapTotal:       2827816 kB
>>
>> I took the kernel from Linus's tree a few days ago: Linux localhost
>> 3.19.0-rc2+ (...) x86_64.  I also set maxcpus=1.  The kernel
>> configuration is available if needed.
>>
>> EXPERIMENTS
>>
>> I wrote a page walker (historically called "balloon") which allocates
>> a lot of memory, more than physical RAM, and fills it with a dump of
>> /dev/mem from a Chrome OS system running at capacity.  The memory
>> compresses down to about 35%.  I ran two main experiments.
>>
>> 1. Compression/decompression.  After filling the memory, the program
>> touches the first byte of all pages in a random permutation (I tried
>> sequentially too, it makes little difference).  At steady state, this
>> forces one page decompression and one compression (on average) at each
>> step of the walk.
>>
>> 2. Decompression only.  After filling the memory, the program walks
>> all pages sequentially.  Then it frees the second half of the pages
>> (the ones most recently touched), and walks the first half.  This
>> causes one page decompression at each step, and almost no
>> compressions.
>>
>> RESULTS
>>
>> The average time (real time) to walk a page in microseconds is
>>
>> experiment 1 (compress + decompress): 26.5  us/page
>> experiment 2 (decompress only): 9.3 us/page
>>
>> I ran "perf record -ag"during the relevant parts of the experiment.
>> (CAVEAT: the version of perf I used doesn't match the kernel, it's
>> quite a bit older, but that should be mostly OK).  I put the output of
>> "perf report" in this Google Drive folder:
>>
>> https://drive.google.com/folderview?id=0B6kmZ3mOd0bzVzJKeTV6eExfeFE&usp=sharing
>>
>> (You shouldn't need a Google ID to access it.  You may have to re-join
>> the link if the plain text mailer splits it into multiple lines.)
>>
>> I also tried to analyze cumulative graph profiles.  Interestingly the
>> only tool I found to do this is gprof2dot (any other suggestion?  I
>
> I think this could be useful for better graphs here:
>
> http://www.brendangregg.com/flamegraphs.html
>
>> would prefer a text-based tool).  The output is in the .png files in
>> the same folder.  The interesting numbers are:
>>
>> experiment 1
>> compression 43.2%
>> decompression 20.4%
>> everything else 36.4%
>>
>> experiment 2
>> decompression 61.7%
>> everything else 38.3%
>>
>> The graph profiles don't seem to show low-hanging fruits on any path.
>>
>> CONCLUSION
>>
>> Before zram, in a situation involving swapping, the MM overhead was
>> probably nearly invisible, especially with rotating disks.  But with
>> zram the MM is surprisingly close to being the main bottleneck.
>> Compression/decompression speeds will likely improve, and they are
>> tuneable (tradeoff between compression ratio and speed).  Compression
>> can happen often in the background, so decompression speed is more
>> important for latency, and LZ4 decompression can already be a lot
>> faster than LZO (the experiments use LZO, and LZ4 can be 2x faster).
>> This suggests that simplifying and speeding up the relevant code paths
>> in the Linux MM may be worth the effort.
