From: Colin Ian King <colin.king@ubuntu.com>
To: Mel Gorman <mgorman@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	James Bottomley <James.Bottomley@HansenPartnership.com>,
	Raghavendra D Prabhu <raghu.prabhu13@gmail.com>,
	Jan Kara <jack@suse.cz>,
	Chris Mason <chris.mason@oracle.com>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-ext4 <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH 0/4] Reduce impact to overall system of SLUB using high-order allocations V2
Date: Mon, 16 May 2011 12:24:51 +0100
Message-ID: <1305545091.2046.2.camel@lenovo>
In-Reply-To: <20110516083721.GB5279@suse.de>

On Mon, 2011-05-16 at 09:37 +0100, Mel Gorman wrote:
> On Sat, May 14, 2011 at 10:34:33AM +0200, Colin Ian King wrote:
> > On Fri, 2011-05-13 at 15:03 +0100, Mel Gorman wrote:
> > > Changelog since V1
> > >   o kswapd should sleep if need_resched
> > >   o Remove __GFP_REPEAT from GFP flags when speculatively using high
> > >     orders so direct/compaction exits earlier
> > >   o Remove __GFP_NORETRY for correctness
> > >   o Correct logic in sleeping_prematurely
> > >   o Leave SLUB using the default slub_max_order
> > >
> > > There are a few reports of people experiencing hangs when copying
> > > large amounts of data with kswapd using a large amount of CPU which
> > > appear to be due to recent reclaim changes.
> > >
> > > SLUB using high orders is the trigger but not the root cause as SLUB
> > > has been using high orders for a while. The following four patches
> > > aim to fix the problems in reclaim while reducing the cost for SLUB
> > > using those high orders.
> > >
> > > Patch 1 corrects logic introduced by commit [1741c877: mm:
> > > 	kswapd: keep kswapd awake for high-order allocations until
> > > 	a percentage of the node is balanced] to allow kswapd to
> > > 	go to sleep when balanced for high orders.
> > >
> > > Patch 2 prevents kswapd waking up in response to SLUB's speculative
> > > 	use of high orders.
> > >
> > > Patch 3 further reduces the cost by preventing SLUB from entering
> > > 	direct compaction or reclaim paths on the grounds that falling
> > > 	back to order-0 should be cheaper.
> > >
> > > Patch 4 notes that even when kswapd is failing to keep up with
> > > 	allocation requests, it should still go to sleep when its
> > > 	quota has expired to prevent it spinning.
> > >
> > > My own data on this is not great. I haven't really been able to
> > > reproduce the same problem locally.
> > >
> > > The test case is simple. "download tar" wgets a large tar file and
> > > stores it locally. "unpack" is expanding it (15 times physical RAM
> > > in this case) and "delete source dirs" is the tarfile being deleted
> > > again. I also experimented with having the tar copied numerous times
> > > and into deeper directories to increase the size but the results were
> > > not particularly interesting so I left it as one tar.
> > >
> > > In the background, applications are being launched to time to vaguely
> > > simulate activity on the desktop and to measure how long it takes
> > > applications to start.
> > >
> > > Test server, 4 CPU threads, x86_64, 2G of RAM, no PREEMPT, no COMPACTION, X running
> > > LARGE COPY AND UNTAR
> > >                       vanilla         fixprematurely  kswapd-nowwake  slub-noexstep   kswapdsleep
> > > download tar           95 ( 0.00%)     94 ( 1.06%)     94 ( 1.06%)     94 ( 1.06%)     94 ( 1.06%)
> > > unpack tar            654 ( 0.00%)    649 ( 0.77%)    655 (-0.15%)    589 (11.04%)    598 ( 9.36%)
> > > copy source files       0 ( 0.00%)      0 ( 0.00%)      0 ( 0.00%)      0 ( 0.00%)      0 ( 0.00%)
> > > delete source dirs    327 ( 0.00%)    334 (-2.10%)    318 ( 2.83%)    325 ( 0.62%)    320 ( 2.19%)
> > >
> > > MMTests Statistics: duration
> > > User/Sys Time Running Test (seconds)     1139.7  1142.55  1149.78  1109.32  1113.26
> > > Total Elapsed Time (seconds)            1341.59  1342.45  1324.90  1271.02  1247.35
> > >
> > > MMTests Statistics: application launch
> > > evolution-wait30 mean         34.92   34.96   34.92   34.92   35.08
> > > gnome-terminal-find mean       7.96    7.96    8.76    7.80    7.96
> > > iceweasel-table mean           7.93    7.81    7.73    7.65    7.88
> > >
> > > evolution-wait30 stddev        0.96    1.22    1.27    1.20    1.15
> > > gnome-terminal-find stddev     3.02    3.09    3.51    2.99    3.02
> > > iceweasel-table stddev         1.05    0.90    1.09    1.11    1.11
> > >
> > > Having SLUB avoid expensive steps in reclaim improves performance
> > > by quite a bit with the overall test completing 1.5 minutes
> > > faster. Application launch times were not really affected but it's
> > > not something my test machine was suffering from in the first place
> > > so it's not really conclusive. The kswapd patches also did not appear
> > > to help but again, the test machine wasn't suffering that problem.
> > >
> > > These patches are against 2.6.39-rc7. Again, testing would be
> > > appreciated.
> >
> > These patches solve the problem for me. I've been soak testing the file
> > copy test for 3.5 hours with nearly 400 test cycles and observed no
> > lockups at all - rock solid. From my observations from the output from
> > vmstat the system is behaving sanely.
> > Thanks for finding a solution - much appreciated!
> >
>
> Can you tell me if just patches 1 and 4 fix the problem please? It'd be good
> to know if this was only a reclaim-related problem. Thanks.

Hi Mel,

Soak tested just patches 1 + 4 and works fine. Did 250 cycles for ~2 hours,
no lockups, and the output from vmstat looked sane.

Colin
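
For anyone re-checking the figures in the quoted results table, here is a
minimal sketch of the arithmetic. The formula (baseline / patched - 1) * 100
is an assumption inferred from the published numbers, not taken from the
MMTests source; the function and variable names are hypothetical. The timings
are copied from the vanilla and slub-noexstep columns, and the elapsed-time
saving compares vanilla against kswapdsleep.

```python
# Timings in seconds, copied from the "LARGE COPY AND UNTAR" table.
VANILLA = {"download tar": 95, "unpack tar": 654, "delete source dirs": 327}
SLUB_NOEXSTEP = {"download tar": 94, "unpack tar": 589, "delete source dirs": 325}


def gain_percent(baseline, patched):
    """Relative gain in percent; positive means the patched kernel was faster.

    Assumption: the formula is inferred from the table, not from MMTests.
    """
    return (baseline / patched - 1.0) * 100.0


for name, base in VANILLA.items():
    print(f"{name:<20}{gain_percent(base, SLUB_NOEXSTEP[name]):6.2f}%")

# Total elapsed time: vanilla versus the full series (kswapdsleep column).
saved_seconds = 1341.59 - 1247.35
print(f"saved {saved_seconds:.2f}s, about {saved_seconds / 60:.2f} minutes")
```

The elapsed-time difference works out to roughly a minute and a half, which
matches the "1.5 minutes faster" remark in the quoted cover letter.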