* [PATCH 0/6 v2] PM / Hibernate: Memory bitmap scalability improvements
@ 2014-07-21 10:26 Joerg Roedel
  2014-07-21 10:26 ` [PATCH 1/6] PM / Hibernate: Create a Radix-Tree to store memory bitmap Joerg Roedel
                   ` (6 more replies)
  0 siblings, 7 replies; 28+ messages in thread
From: Joerg Roedel @ 2014-07-21 10:26 UTC (permalink / raw)
  To: Rafael J. Wysocki, Pavel Machek, Len Brown
  Cc: linux-pm, linux-kernel, Joerg Roedel

Changes v1->v2:

* Rebased to v3.16-rc6
* Fixed the style issues in Patch 1 mentioned by Rafael

Hi,

here is the revised patch set to improve the scalability of
the memory bitmap implementation used for hibernation. The
current implementation does not scale well to machines with
several TB of memory. A resume on those machines may cause
soft lockups to be reported.

These patches improve the data structure by adding a radix
tree to the linked list structure to improve random access
performance from O(n) to O(log_b(n)), where b depends on the
architecture (b=512 on amd64, b=1024 on i386).
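
As a rough illustration of the O(log_b(n)) claim, the following Python
sketch estimates how many inner levels a radix tree needs to index the
bitmap leaves of a machine. It is not the kernel code: it assumes 4 KiB
pages, one bit per page frame, leaf and inner nodes that each occupy one
page, a single tree covering all memory, and it reads "12TB" as 12 TiB.
The function name rtree_levels is made up for this example.

```python
def rtree_levels(mem_bytes, page_size=4096, pointer_size=8):
    """Return (leaf_blocks, branching_factor, levels) for a bitmap
    that covers mem_bytes of memory (all parameters are assumptions)."""
    pfns = mem_bytes // page_size          # one bit per page frame
    bits_per_leaf = page_size * 8          # a leaf is one page full of bits
    leaves = -(-pfns // bits_per_leaf)     # ceiling division
    b = page_size // pointer_size          # pointers that fit in one inner node
    levels, capacity = 1, b
    while capacity < leaves:               # add levels until every leaf is reachable
        capacity *= b
        levels += 1
    return leaves, b, levels


leaves, b, levels = rtree_levels(12 * 2**40)
print(leaves, b, levels)   # 98304 512 2
```

Under these assumptions a 12 TiB machine has 98304 leaf blocks. A tree
with b=512 reaches any of them in two inner-node steps, whereas a walk
over a linked list may have to visit up to 98304 entries.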

A test on a 12TB machine showed an improvement in resume
time from 76s with the old implementation to 2.4s with the
radix tree and the improved swsusp_free function. See below
for details of this test.

Patches 1-3 add the radix tree while keeping the
existing memory bitmap implementation in place and add code
to compare the results between both implementations. This
was used during development to make sure both data
structures return the same results.
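
The cross-check described above can be sketched in user space as
follows. This is a simplified Python model, not the code from patches
1-3: the block list is a plain list of (start_pfn, leaf) pairs, the
radix tree is built from nested dicts instead of pages of pointers, and
the names find_linear, build_radix and find_radix as well as the
constants are assumptions chosen for illustration.

```python
BITS_PER_LEAF = 32768   # assumed: one 4 KiB page of bits per leaf
FANOUT = 512            # assumed: 8-byte pointers in a 4 KiB inner node


def find_linear(blocks, pfn):
    """List-style lookup: walk the blocks until the range matches, O(n)."""
    for steps, (start, leaf) in enumerate(blocks, 1):
        if start <= pfn < start + BITS_PER_LEAF:
            return leaf, steps
    raise KeyError(pfn)


def build_radix(blocks, levels):
    """Index the same leaves by one base-FANOUT digit of the block number per level."""
    root = {}
    for start, leaf in blocks:
        block_nr = start // BITS_PER_LEAF
        node = root
        for level in range(levels - 1, 0, -1):
            node = node.setdefault((block_nr // FANOUT**level) % FANOUT, {})
        node[block_nr % FANOUT] = leaf
    return root


def find_radix(root, levels, pfn):
    """Radix lookup: exactly `levels` steps regardless of the number of leaves."""
    block_nr = pfn // BITS_PER_LEAF
    node, steps = root, 0
    for level in range(levels - 1, -1, -1):
        node = node[(block_nr // FANOUT**level) % FANOUT]
        steps += 1
    return node, steps


blocks = [(i * BITS_PER_LEAF, f"leaf{i}") for i in range(1000)]
root = build_radix(blocks, levels=2)
pfn = 999 * BITS_PER_LEAF + 5
# Both structures must agree on the leaf; only the step count differs.
assert find_linear(blocks, pfn)[0] == find_radix(root, 2, pfn)[0]
```

For the last of the 1000 blocks the linear walk takes 1000 steps and the
two-level tree takes 2, while both return the same leaf.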

Patch 4 re-implements the swsusp_free() function to not
iterate over all pfns but only over the bits set in the
bitmaps. This proved to scale better on large-memory
machines.
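
The idea behind this change can be illustrated with a small Python
sketch. It is a user-space model under stated assumptions, not the
kernel implementation: bitmaps are lists of 64-bit words, a page is
taken to be freeable when its bit is set in both bitmaps, and set_bits
and pages_to_free are hypothetical helper names.

```python
def set_bits(bitmap_words, word_bits=64):
    """Yield the index of every set bit; all-zero words cost one check each."""
    for word_index, word in enumerate(bitmap_words):
        base = word_index * word_bits
        while word:
            lowest = word & -word                  # isolate the lowest set bit
            yield base + lowest.bit_length() - 1
            word ^= lowest                         # clear it and continue


def pages_to_free(forbidden_words, free_words):
    """Return the pfns whose bit is set in both bitmaps (assumed free criterion)."""
    both = (f & g for f, g in zip(forbidden_words, free_words))
    return list(set_bits(both))


forbidden = [0b1010, 0, 1 << 63]
free = [0b0110, 0xFF, 1 << 63]
print(pages_to_free(forbidden, free))   # [1, 191]
```

The work done is proportional to the number of words plus the number of
set bits, instead of one test per pfn in every zone.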

Patch 5 removes the old memory bitmap implementation now
that the radix tree is in place and working correctly.

The last patch touches the soft lockup watchdog in
rtree_next_node(). This is necessary because the worst-case
performance (all bits set in the forbidden_pages_map and
free_pages_map) is the same as with the old implementation
and may still cause soft lockups. Patch 6 avoids this.
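
A minimal sketch of that pattern, in Python and under stated
assumptions: each leaf is modelled as one integer, walk_bitmap is a
hypothetical name, and touch_watchdog is a caller-supplied stand-in for
the watchdog touch, called once each time the walk advances to the next
leaf node. It only shows where the call sits in the iteration, not how
the kernel watchdog works.

```python
def walk_bitmap(leaves, bits_per_leaf, touch_watchdog):
    """Yield the global index of every set bit, leaf by leaf."""
    for leaf_index, leaf in enumerate(leaves):
        touch_watchdog()                 # stand-in: signal progress per node
        base = leaf_index * bits_per_leaf
        while leaf:
            lowest = leaf & -leaf        # isolate the lowest set bit
            yield base + lowest.bit_length() - 1
            leaf ^= lowest


touches = []
bits = list(walk_bitmap([0b101, 0, 0b1], 8, lambda: touches.append(1)))
print(bits, len(touches))   # [0, 2, 16] 3
```

Because the hook runs per node rather than per bit, its cost stays small
even when every bit is set, while a long walk still reports progress
regularly.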

The code was tested on 32-bit and 64-bit x86 and showed no
issues there.

Below is an example test that shows the performance
improvement on a 12TB machine. First the test with the old
memory bitmap:

# time perf record /usr/sbin/resume $sdev
resume: libgcrypt version: 1.5.0
[ perf record: Woken up 12 times to write data ]
[ perf record: Captured and wrote 2.882 MB perf.data (~125898 samples) ]

real    1m16.043s
user    0m0.016s
sys     0m0.312s
# perf report --stdio |head -50
# Events: 75K cycles
#
# Overhead  Command      Shared Object                     Symbol
# ........  .......  .................  .........................
#
    56.16%   resume  [kernel.kallsyms]  [k] memory_bm_test_bit
    19.35%   resume  [kernel.kallsyms]  [k] swsusp_free
    14.90%   resume  [kernel.kallsyms]  [k] memory_bm_find_bit
     7.28%   resume  [kernel.kallsyms]  [k] swsusp_page_is_forbidden

And here is the same test on the same machine with these
patches applied:

#  time perf record /usr/sbin/resume $sdev
resume: libgcrypt version: 1.5.0
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.039 MB perf.data (~1716 samples) ]

real    0m2.376s
user    0m0.020s
sys     0m0.408s

# perf report --stdio |head -50
# Events: 762  cycles
#
# Overhead  Command      Shared Object                     Symbol
# ........  .......  .................  .........................
#
    34.78%   resume  [kernel.kallsyms]  [k] find_next_bit
    27.03%   resume  [kernel.kallsyms]  [k] clear_page_c_e
     9.70%   resume  [kernel.kallsyms]  [k] mark_nosave_pages
     3.92%   resume  [kernel.kallsyms]  [k] alloc_rtree_node
     2.38%   resume  [kernel.kallsyms]  [k] get_image_page

As can be seen from these results, these patches improve the
scalability significantly. Please review; any comments are
appreciated.

Thanks,

	Joerg

Joerg Roedel (6):
  PM / Hibernate: Create a Radix-Tree to store memory bitmap
  PM / Hibernate: Add memory_rtree_find_bit function
  PM / Hibernate: Implement position keeping in radix tree
  PM / Hibernate: Iterate over set bits instead of PFNs in swsusp_free()
  PM / Hibernate: Remove the old memory-bitmap implementation
  PM / Hibernate: Touch Soft Lockup Watchdog in rtree_next_node

 kernel/power/snapshot.c | 494 +++++++++++++++++++++++++++++++++++-------------
 1 file changed, 367 insertions(+), 127 deletions(-)

-- 
1.9.1


* [PATCH 0/6] PM / Hibernate: Memory bitmap scalability improvements
@ 2014-07-18 11:57 Joerg Roedel
  2014-07-18 11:57 ` [PATCH 1/6] PM / Hibernate: Create a Radix-Tree to store memory bitmap Joerg Roedel
  0 siblings, 1 reply; 28+ messages in thread
From: Joerg Roedel @ 2014-07-18 11:57 UTC (permalink / raw)
  To: Rafael J. Wysocki, Pavel Machek, Len Brown
  Cc: linux-pm, linux-kernel, Joerg Roedel

Hi,

here is a patch set to improve the scalability of the memory
bitmap implementation used for hibernation. The current
implementation does not scale well to machines with several
TB of memory. A resume on those machines may cause soft
lockups to be reported.

These patches improve the data structure by adding a radix
tree to the linked list structure to improve random access
performance from O(n) to O(log_b(n)), where b depends on the
architecture (b=512 on amd64, b=1024 on i386).

A test on a 12TB machine showed an improvement in resume
time from 76s with the old implementation to 2.4s with the
radix tree and the improved swsusp_free function. See below
for details of this test.

Patches 1-3 add the radix tree while keeping the
existing memory bitmap implementation in place and add code
to compare the results between both implementations. This
was used during development to make sure both data
structures return the same results.

Patch 4 re-implements the swsusp_free() function to not
iterate over all pfns but only over the bits set in the
bitmaps. This proved to scale better on large-memory
machines.

Patch 5 removes the old memory bitmap implementation now
that the radix tree is in place and working correctly.

The last patch touches the soft lockup watchdog in
rtree_next_node(). This is necessary because the worst-case
performance (all bits set in the forbidden_pages_map and
free_pages_map) is the same as with the old implementation
and may still cause soft lockups. Patch 6 avoids this.

The code was tested on 32-bit and 64-bit x86 and showed no
issues there.

Below is an example test that shows the performance
improvement on a 12TB machine. First the test with the old
memory bitmap:

# time perf record /usr/sbin/resume $sdev
resume: libgcrypt version: 1.5.0
[ perf record: Woken up 12 times to write data ]
[ perf record: Captured and wrote 2.882 MB perf.data (~125898 samples) ]

real    1m16.043s
user    0m0.016s
sys     0m0.312s
# perf report --stdio |head -50
# Events: 75K cycles
#
# Overhead  Command      Shared Object                     Symbol
# ........  .......  .................  .........................
#
    56.16%   resume  [kernel.kallsyms]  [k] memory_bm_test_bit
    19.35%   resume  [kernel.kallsyms]  [k] swsusp_free
    14.90%   resume  [kernel.kallsyms]  [k] memory_bm_find_bit
     7.28%   resume  [kernel.kallsyms]  [k] swsusp_page_is_forbidden

And here is the same test on the same machine with these
patches applied:

#  time perf record /usr/sbin/resume $sdev
resume: libgcrypt version: 1.5.0
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.039 MB perf.data (~1716 samples) ]

real    0m2.376s
user    0m0.020s
sys     0m0.408s

# perf report --stdio |head -50
# Events: 762  cycles
#
# Overhead  Command      Shared Object                     Symbol
# ........  .......  .................  .........................
#
    34.78%   resume  [kernel.kallsyms]  [k] find_next_bit
    27.03%   resume  [kernel.kallsyms]  [k] clear_page_c_e
     9.70%   resume  [kernel.kallsyms]  [k] mark_nosave_pages
     3.92%   resume  [kernel.kallsyms]  [k] alloc_rtree_node
     2.38%   resume  [kernel.kallsyms]  [k] get_image_page

As can be seen from these results, these patches improve the
scalability significantly. Please review; any comments are
appreciated.

Thanks,

	Joerg

Joerg Roedel (6):
  PM / Hibernate: Create a Radix-Tree to store memory bitmap
  PM / Hibernate: Add memory_rtree_find_bit function
  PM / Hibernate: Implement position keeping in radix tree
  PM / Hibernate: Iterate over set bits instead of PFNs in swsusp_free()
  PM / Hibernate: Remove the old memory-bitmap implementation
  PM / Hibernate: Touch Soft Lockup Watchdog in rtree_next_node

 kernel/power/snapshot.c | 510 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 368 insertions(+), 142 deletions(-)

-- 
1.9.1



Thread overview: 28+ messages
2014-07-21 10:26 [PATCH 0/6 v2] PM / Hibernate: Memory bitmap scalability improvements Joerg Roedel
2014-07-21 10:26 ` [PATCH 1/6] PM / Hibernate: Create a Radix-Tree to store memory bitmap Joerg Roedel
2014-07-21 22:36   ` Joerg Roedel
2014-07-21 23:05     ` Pavel Machek
2014-07-21 10:26 ` [PATCH 2/6] PM / Hibernate: Add memory_rtree_find_bit function Joerg Roedel
2014-07-21 10:26 ` [PATCH 3/6] PM / Hibernate: Implement position keeping in radix tree Joerg Roedel
2014-07-21 10:27 ` [PATCH 4/6] PM / Hibernate: Iterate over set bits instead of PFNs in swsusp_free() Joerg Roedel
2014-07-21 10:27 ` [PATCH 5/6] PM / Hibernate: Remove the old memory-bitmap implementation Joerg Roedel
2014-07-21 10:27 ` [PATCH 6/6] PM / Hibernate: Touch Soft Lockup Watchdog in rtree_next_node Joerg Roedel
2014-07-21 12:00 ` [PATCH 0/6 v2] PM / Hibernate: Memory bitmap scalability improvements Pavel Machek
2014-07-21 12:36   ` Joerg Roedel
2014-07-21 13:06     ` Pavel Machek
2014-07-21 13:38       ` Joerg Roedel
2014-07-21 14:10         ` Pavel Machek
2014-07-21 16:03           ` Joerg Roedel
2014-07-21 23:05             ` Pavel Machek
2014-07-22  0:41               ` Rafael J. Wysocki
2014-07-22 10:34                 ` Joerg Roedel
2014-07-22 10:55                   ` Pavel Machek
2014-07-22 12:24                     ` Joerg Roedel
2014-07-22 10:58                   ` Pavel Machek
2014-07-22 12:10                     ` Joerg Roedel
2014-07-23 10:57                       ` Pavel Machek
2014-07-28 13:59                   ` Borislav Petkov
2014-07-29 21:22                     ` Rafael J. Wysocki
  -- strict thread matches above, loose matches on Subject: below --
2014-07-18 11:57 [PATCH 0/6] " Joerg Roedel
2014-07-18 11:57 ` [PATCH 1/6] PM / Hibernate: Create a Radix-Tree to store memory bitmap Joerg Roedel
2014-07-19  0:00   ` Rafael J. Wysocki
2014-07-19  7:57     ` Joerg Roedel
