Re: [Bug 64121] New: [BISECTED] "mm" performance regression updating from 3.2 to 3.3

* Re: [Bug 64121] New: [BISECTED] "mm" performance regression updating from 3.2 to 3.3
       [not found] <bug-64121-27@https.bugzilla.kernel.org/>
@ 2013-10-31 20:46 ` Andrew Morton
  2013-11-01 18:43   ` Johannes Weiner
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2013-10-31 20:46 UTC (permalink / raw)
  To: thomas.jarosch; +Cc: bugzilla-daemon, linux-mm, Johannes Weiner

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Thu, 31 Oct 2013 10:53:47 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=64121
> 
>             Bug ID: 64121
>            Summary: [BISECTED] "mm" performance regression updating from
>                     3.2 to 3.3
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 3.3
>           Hardware: i386
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>           Assignee: akpm@linux-foundation.org
>           Reporter: thomas.jarosch@intra2net.com
>         Regression: No
> 
> Created attachment 112881
>   --> https://bugzilla.kernel.org/attachment.cgi?id=112881&action=edit
> Dmesg output
> 
> Hi,
> 
> I've updated a productive box running kernel 3.0.x to 3.4.67.
> This caused a severe I/O performance regression.
> 
> After some hours I've bisected it down to this commit:
> 
> ---------------------------
> # git bisect good
> ab8fabd46f811d5153d8a0cd2fac9a0d41fb593d is the first bad commit
> commit ab8fabd46f811d5153d8a0cd2fac9a0d41fb593d
> Author: Johannes Weiner <jweiner@redhat.com>
> Date:   Tue Jan 10 15:07:42 2012 -0800
> 
>     mm: exclude reserved pages from dirtyable memory
> 
>     Per-zone dirty limits try to distribute page cache pages allocated for
>     writing across zones in proportion to the individual zone sizes, to reduce
>     the likelihood of reclaim having to write back individual pages from the
>     LRU lists in order to make progress.
> 
>     ...
> ---------------------------
> 
> With the "problematic" patch:
> # dd_rescue -A /dev/zero img.disk
> dd_rescue: (info): ipos:     15296.0k, opos:     15296.0k, xferd:     15296.0k
>                    errs:      0, errxfer:         0.0k, succxfer:     15296.0k
>              +curr.rate:      681kB/s, avg.rate:      681kB/s, avg.load:  0.3%
> 
> 
> Without the patch (using 25bd91bd27820d5971258cecd1c0e64b0e485144):
> # dd_rescue -A /dev/zero img.disk
> dd_rescue: (info): ipos:    293888.0k, opos:    293888.0k, xferd:    293888.0k
>                    errs:      0, errxfer:         0.0k, succxfer:    293888.0k
>              +curr.rate:    99935kB/s, avg.rate:    51625kB/s, avg.load:  3.3%
> 
> 
> 
> The kernel is 32bit using PAE mode. The system has 32GB of RAM.
> (compiled with "gcc (GCC) 4.4.4 20100630 (Red Hat 4.4.4-10)")
> 
> Interestingly: If I limit the amount of RAM to roughly 20GB
> via the "mem=20000m" boot parameter, the performance is fine.
> When I increase it to f.e. "mem=23000m", performance is bad.
> 
> Also tested kernel 3.10.17 in 32bit + PAE mode,
> it was fine out of the box.
> 
> 
> So basically we need a fix for the LTS kernel 3.4, I can work around
> this issue with "mem=20000m" until I upgrade to 3.10.
> 
> I'll probably have access to the hardware for one more week
> to test patches, it was lent to me to debug this specific problem.
> 
> The same issue appeared on a complete different machine in July
> using the same 3.4.x kernel. The box had 16GB of RAM.
> I didn't get a chance to access the hardware back then.
> 
> Attached is the dmesg output and my kernel config.

32GB of memory on a highmem machine just isn't going to work well,
sorry.  Our rule of thumb is that 16G is the max.  If it was previously
working OK with 32G then you were very lucky!

That being said, we should try to work out exactly why that commit
caused the big slowdown - perhaps there is something we can do to
restore things.  It appears that the (small?) increase in the per-zone
dirty limit is what kicked things over - perhaps we can permit that to
be tuned back again.  Or something.  Johannes, could you please have a
think about it?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 9+ messages in thread