* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
       [not found] <bug-15618-10286@https.bugzilla.kernel.org/>
@ 2010-03-23 14:22   ` Andrew Morton
  0 siblings, 0 replies; 66+ messages in thread
From: Andrew Morton @ 2010-03-23 14:22 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: bugzilla-daemon, bugme-daemon, ant.starikov, Peter Zijlstra


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Tue, 23 Mar 2010 16:13:25 GMT bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=15618
> 
>            Summary: 2.6.18->2.6.32->2.6.33 huge regression in performance
>            Product: Process Management
>            Version: 2.5
>     Kernel Version: 2.6.32
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: Other
>         AssignedTo: process_other@kernel-bugs.osdl.org
>         ReportedBy: ant.starikov@gmail.com
>         Regression: No
> 
> 
> We have benchmarked some multithreaded code here on a 16-core/4-way Opteron 8356
> host on a number of kernels (see below) and found strange results.
> Up to 8 threads we didn't see any noticeable differences in performance, but
> starting from 9 threads performance diverges substantially. I provide here
> results for 14 threads.

lolz.  Catastrophic meltdown.  Thanks for doing all that work - at a
guess I'd say it's mmap_sem.  Perhaps with some assist from the CPU
scheduler.

If you change the config to set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
CONFIG_RWSEM_XCHGADD_ALGORITHM=y does it help?
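
One way to see which of the two rwsem options a given kernel was built
with is to read them out of its config file. The following is only a
minimal sketch: the helper name rwsem_options and the sample text are
made up for illustration, and where the config text comes from (for
instance a distribution-provided config file under /boot) is an
assumption, not something stated in this thread.

```python
import re

# The two options named in the question above.
RWSEM_OPTIONS = (
    "CONFIG_RWSEM_GENERIC_SPINLOCK",
    "CONFIG_RWSEM_XCHGADD_ALGORITHM",
)


def rwsem_options(config_text):
    """Return {option: value} for the two rwsem options.

    Options that are absent, or that appear in the Kconfig
    "# CONFIG_FOO is not set" form, are reported as "n".
    """
    found = {name: "n" for name in RWSEM_OPTIONS}
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if match and match.group(1) in found:
            found[match.group(1)] = match.group(2)
    return found


# Hypothetical config fragment, for illustration only.
sample = """\
CONFIG_RWSEM_GENERIC_SPINLOCK=y
# CONFIG_RWSEM_XCHGADD_ALGORITHM is not set
"""

print(rwsem_options(sample))
```

Feeding it the real config text of each tested kernel would show
whether the kernels under comparison differ in these two options.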

Anyway, there's a testcase in bugzilla and it looks like we got us some
work to do.


> 2.6.18-164.11.1.el5 (centos)
> 
> user time: ~60 sec
> sys time: ~12 sec
> 
> 2.6.32.9-70.fc12.x86_64 (fedora-12)
> 
> user time: ~60 sec
> sys time: ~75 sec
> 
> 2.6.33-0.46.rc8.git1.fc13.x86_64 (fedora-12 + rawhide kernel)
> 
> user time: ~60 sec
> sys time: ~300 sec
> 
> In all three cases real time regresses in line with the numbers given.
> 
> The binary used for all three cases is exactly the same (compiled on centos).
> Setups for all three cases are as identical as possible (last two - the same
> fedora-12 setup booted with different kernels).
> 
> What could be the reason for this regression in performance? Is it possible to
> tune something to recover the performance seen on the 2.6.18 kernel?
> 
> I perf'ed on 2.6.32.9-70.fc12.x86_64 kernel
> 
> report (top part only):
> 
> 43.64% dve22lts-mc [kernel] [k] _spin_lock_irqsave 
> 32.93% dve22lts-mc ./dve22lts-mc [.] DBSLLlookup_ret 
> 5.37% dve22lts-mc ./dve22lts-mc [.] SuperFastHash 
> 3.76% dve22lts-mc /lib64/libc-2.11.1.so [.] __GI_memcpy 
> 2.60% dve22lts-mc [kernel] [k] clear_page_c 
> 1.60% dve22lts-mc ./dve22lts-mc [.] index_next_dfs
> 
> stat: 
> 129875.554435 task-clock-msecs # 10.210 CPUs 
> 1883 context-switches # 0.000 M/sec 
> 17 CPU-migrations # 0.000 M/sec 
> 2695310 page-faults # 0.021 M/sec 
> 298370338040 cycles # 2297.356 M/sec 
> 130581778178 instructions # 0.438 IPC 
> 42517143751 cache-references # 327.368 M/sec 
> 101906904 cache-misses # 0.785 M/sec 
> 
> callgraph(top part only):
> 
> 53.09%      dve22lts-mc  [kernel]  [k] _spin_lock_irqsave
>                |          
>                |--49.90%-- __down_read_trylock
>                |          down_read_trylock
>                |          do_page_fault
>                |          page_fault
>                |          |          
>                |          |--99.99%-- __GI_memcpy
>                |          |          |          
>                |          |          |--84.28%-- (nil)
>                |          |          |          
>                |          |          |--9.78%-- 0x100000000
>                |          |          |          
>                |          |           --5.94%-- 0x1
>                |           --0.01%-- 
> [...]
> 
>                |          
>                |--49.39%-- __up_read
>                |          up_read
>                |          |          
>                |          |--100.00%-- do_page_fault
>                |          |          page_fault
>                |          |          |          
>                |          |          |--99.99%-- __GI_memcpy
>                |          |          |          |          
>                |          |          |          |--84.18%-- (nil)
>                |          |          |          |          
>                |          |          |          |--10.13%-- 0x100000000
>                |          |          |          |          
>                |          |          |           --5.69%-- 0x1
>                |          |           --0.01%-- 
> [...]
> 
>                |           --0.00%-- 
> [...]
> 
>                 --0.72%-- 
> [...]
> 
> 
> 
> On 2.6.33 I see a similar picture with the spin-lock, plus a lot of additional
> time spent in cgroup-related kernel calls.
> 
> If necessary, I can attach the binary for tests.

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 14:22   ` Andrew Morton
@ 2010-03-23 17:34     ` Ingo Molnar
  -1 siblings, 0 replies; 66+ messages in thread
From: Ingo Molnar @ 2010-03-23 17:34 UTC (permalink / raw)
  To: Andrew Morton, Linus Torvalds
  Cc: linux-mm, linux-kernel, bugzilla-daemon, bugme-daemon,
	ant.starikov, Peter Zijlstra


* Andrew Morton <akpm@linux-foundation.org> wrote:

> lolz.  Catastrophic meltdown.  Thanks for doing all that work - at a guess 
> I'd say it's mmap_sem. [...]

Looks like we don't need to guess; just look at the call graph profile (a.k.a.
the smoking gun):

> > I perf'ed on 2.6.32.9-70.fc12.x86_64 kernel
> >
> > [...]
> >
> > callgraph(top part only):
> > 
> > 53.09%      dve22lts-mc  [kernel]  [k] _spin_lock_irqsave
> >                |          
> >                |--49.90%-- __down_read_trylock
> >                |          down_read_trylock
> >                |          do_page_fault
> >                |          page_fault
> >                |          |          
> >                |          |--99.99%-- __GI_memcpy
> >                |          |          |          
> >                |          |          |--84.28%-- (nil)
> >                |          |          |          
> >                |          |          |--9.78%-- 0x100000000
> >                |          |          |          
> >                |          |           --5.94%-- 0x1
> >                |           --0.01%-- 
> > [...]
> > 
> >                |          
> >                |--49.39%-- __up_read
> >                |          up_read
> >                |          |          
> >                |          |--100.00%-- do_page_fault
> >                |          |          page_fault
> >                |          |          |          
> >                |          |          |--99.99%-- __GI_memcpy
> >                |          |          |          |          
> >                |          |          |          |--84.18%-- (nil)
> >                |          |          |          |          
> >                |          |          |          |--10.13%-- 0x100000000
> >                |          |          |          |          
> >                |          |          |           --5.69%-- 0x1
> >                |          |           --0.01%-- 
> > [...]

It shows a very brutal amount of page fault invoked mmap_sem spinning 
overhead.

> Perhaps with some assist from the CPU scheduler.

Doesn't look like it, the perf stat numbers show that the scheduler is only 
very lightly involved:

  > > 129875.554435 task-clock-msecs # 10.210 CPUs 
  > >          1883 context-switches # 0.000 M/sec 
 
a context switch only every ~68 milliseconds.
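
The arithmetic behind that figure can be reproduced from the quoted
perf stat lines. This is just a worked check, not part of the original
report; the variable names are made up. Note that task-clock is CPU
time summed over all threads, so the interval is per unit of CPU time,
and the wall-clock estimate divides by the "CPUs" utilisation figure.

```python
# Values copied from the perf stat output quoted in this thread.
task_clock_ms = 129875.554435    # task-clock-msecs
cpus_utilized = 10.210           # "CPUs" figure next to task-clock
context_switches = 1883
cycles = 298370338040
instructions = 130581778178

# CPU time consumed per context switch (summed over all threads).
ms_cpu_per_switch = task_clock_ms / context_switches

# Approximate elapsed time: total CPU time divided by CPUs in use.
wall_seconds = task_clock_ms / cpus_utilized / 1000.0

# Instructions per cycle, as a cross-check against the IPC column.
ipc = instructions / cycles

print(f"CPU time per context switch: {ms_cpu_per_switch:.1f} ms")
print(f"approximate wall-clock time: {wall_seconds:.1f} s")
print(f"instructions per cycle:      {ipc:.3f}")
```

The first value comes out at roughly 69 ms, in line with the figure
above.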

	Ingo

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 17:34     ` Ingo Molnar
@ 2010-03-23 17:45       ` Linus Torvalds
  -1 siblings, 0 replies; 66+ messages in thread
From: Linus Torvalds @ 2010-03-23 17:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, linux-mm, linux-kernel, bugzilla-daemon,
	bugme-daemon, ant.starikov, Peter Zijlstra



On Tue, 23 Mar 2010, Ingo Molnar wrote:
> 
> It shows a very brutal amount of page fault invoked mmap_sem spinning 
> overhead.

Isn't this already fixed? It's the same old "x86-64 rwsemaphores are using 
the shit-for-brains generic version" thing, and it's fixed by

	1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation
	5d0b723 x86: clean up rwsem type system
	59c33fa x86-32: clean up rwsem inline asm statements

NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
compile his own kernel to test his load.

We could mark them as stable material if the load in question is a real 
load rather than just a test-case. On one of the random page-fault 
benchmarks the rwsem fix was something like a 400% performance 
improvement, and it was apparently visible in real life on some crazy SGI 
"initialize huge heap concurrently on lots of threads" load.

Side note: the reason the spinlock sucks is because of the fair ticket 
locks, it really does all the wrong things for the rwsem code. That's why 
old kernels don't show it - the old unfair locks didn't show the same kind 
of behavior.
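
For readers unfamiliar with the term, the "fair" property of a ticket
lock is that waiters are served strictly in arrival order. The sketch
below is a user-space Python toy, not the kernel's spinlock code: the
class TicketLock and the helper demo are invented for illustration,
and it sleeps on a condition variable instead of spinning. It shows
only the FIFO hand-off and says nothing about the performance effect
discussed here.

```python
import threading
import time


class TicketLock:
    """Toy FIFO lock: each waiter draws a ticket and is served in order."""

    def __init__(self):
        self._cond = threading.Condition()
        self._next_ticket = 0    # next ticket to hand out
        self._now_serving = 0    # ticket currently allowed to hold the lock

    def acquire(self):
        with self._cond:
            my_ticket = self._next_ticket
            self._next_ticket += 1
            # Wait for our own ticket; nobody may jump the queue.
            while self._now_serving != my_ticket:
                self._cond.wait()
        return my_ticket

    def release(self):
        with self._cond:
            self._now_serving += 1
            self._cond.notify_all()

    def tickets_issued(self):
        with self._cond:
            return self._next_ticket


def demo(num_waiters=4):
    """Queue up waiters in a known order and record who gets the lock."""
    lock = TicketLock()
    order = []
    lock.acquire()               # main thread holds ticket 0

    def worker(name):
        lock.acquire()
        order.append(name)       # serialized by the ticket lock
        lock.release()

    threads = []
    for i in range(num_waiters):
        thread = threading.Thread(target=worker, args=(i,))
        thread.start()
        # Let worker i draw its ticket before starting worker i + 1.
        while lock.tickets_issued() < i + 2:
            time.sleep(0.001)
        threads.append(thread)

    lock.release()               # hand the lock to the oldest waiter
    for thread in threads:
        thread.join()
    return order


print(demo())
```

The recorded order matches the order in which the workers queued up,
whichever thread happens to be runnable when the lock is released.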

			Linus

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 17:45       ` Linus Torvalds
@ 2010-03-23 17:57         ` Anton Starikov
  -1 siblings, 0 replies; 66+ messages in thread
From: Anton Starikov @ 2010-03-23 17:57 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Andrew Morton, linux-mm, linux-kernel,
	bugzilla-daemon, bugme-daemon, Peter Zijlstra


On Mar 23, 2010, at 6:45 PM, Linus Torvalds wrote:

> 
> 
> On Tue, 23 Mar 2010, Ingo Molnar wrote:
>> 
>> It shows a very brutal amount of page fault invoked mmap_sem spinning 
>> overhead.
> 
> Isn't this already fixed? It's the same old "x86-64 rwsemaphores are using 
> the shit-for-brains generic version" thing, and it's fixed by
> 
> 	1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation
> 	5d0b723 x86: clean up rwsem type system
> 	59c33fa x86-32: clean up rwsem inline asm statements
> 
> NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
> are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
> compile his own kernel to test his load.

Thanks for the info, I will try it now.

> We could mark them as stable material if the load in question is a real 
> load rather than just a test-case. On one of the random page-fault 
> benchmarks the rwsem fix was something like a 400% performance 
> improvement, and it was apparently visible in real life on some crazy SGI 
> "initialize huge heap concurrently on lots of threads" load.

It is not just a test-case, it is real-life code. With real-life problems on 2.6.32 and later :)


Anton.

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 17:45       ` Linus Torvalds
@ 2010-03-23 18:00         ` Ingo Molnar
  -1 siblings, 0 replies; 66+ messages in thread
From: Ingo Molnar @ 2010-03-23 18:00 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, linux-mm, linux-kernel, bugzilla-daemon,
	bugme-daemon, ant.starikov, Peter Zijlstra


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, 23 Mar 2010, Ingo Molnar wrote:
> > 
> > It shows a very brutal amount of page fault invoked mmap_sem spinning 
> > overhead.
> 
> Isn't this already fixed? It's the same old "x86-64 rwsemaphores are using 
> the shit-for-brains generic version" thing, and it's fixed by
> 
> 	1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation
> 	5d0b723 x86: clean up rwsem type system
> 	59c33fa x86-32: clean up rwsem inline asm statements

Ah, indeed!

> NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
> are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
> compile his own kernel to test his load.

another option is to run the rawhide kernel via something like:

	yum update --enablerepo=development kernel

this will give kernel-2.6.34-0.13.rc1.git1.fc14.x86_64, which has those 
changes included.

OTOH that kernel has debugging [lockdep] enabled so it might not be 
comparable.

> We could mark them as stable material if the load in question is a real load 
> rather than just a test-case. On one of the random page-fault benchmarks the 
> rwsem fix was something like a 400% performance improvement, and it was 
> apparently visible in real life on some crazy SGI "initialize huge heap 
> concurrently on lots of threads" load.
> 
> Side note: the reason the spinlock sucks is because of the fair ticket 
> locks, it really does all the wrong things for the rwsem code. That's why 
> old kernels don't show it - the old unfair locks didn't show the same kind 
> of behavior.

Yeah.

	Ingo

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 18:00         ` Ingo Molnar
@ 2010-03-23 18:03           ` Anton Starikov
  -1 siblings, 0 replies; 66+ messages in thread
From: Anton Starikov @ 2010-03-23 18:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Andrew Morton, linux-mm, linux-kernel,
	bugzilla-daemon, bugme-daemon, Peter Zijlstra


On Mar 23, 2010, at 7:00 PM, Ingo Molnar wrote:
>> NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
>> are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
>> compile his own kernel to test his load.
> 
> another option is to run the rawhide kernel via something like:
> 
> 	yum update --enablerepo=development kernel
> 
> this will give kernel-2.6.34-0.13.rc1.git1.fc14.x86_64, which has those 
> changes included.

I will apply these commits to 2.6.32; I'm afraid the current OFED (which I also need) will not work on 2.6.33+.

Anton.

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 17:34     ` Ingo Molnar
@ 2010-03-23 18:13       ` Andrew Morton
  -1 siblings, 0 replies; 66+ messages in thread
From: Andrew Morton @ 2010-03-23 18:13 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, linux-mm, linux-kernel, bugzilla-daemon,
	bugme-daemon, ant.starikov, Peter Zijlstra

On Tue, 23 Mar 2010 18:34:09 +0100
Ingo Molnar <mingo@elte.hu> wrote:

> 
> It shows a very brutal amount of page fault invoked mmap_sem spinning 
> overhead.
> 

Yes.  Note that we fall off a cliff at nine threads on a 16-way.  As
soon as a core gets two threads scheduled onto it?  Probably triggered
by an MM change, possibly triggered by a sched change which tickled a
preexisting MM shortcoming.  Who knows.

Anton, we have an executable binary in the bugzilla report but it would
be nice to also have at least a description of what that code is
actually doing.  A quick strace shows quite a lot of mprotect activity.
A pseudo-code walkthrough, perhaps?

Thanks.

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 18:13       ` Andrew Morton
@ 2010-03-23 18:19         ` Anton Starikov
  -1 siblings, 0 replies; 66+ messages in thread
From: Anton Starikov @ 2010-03-23 18:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, Linus Torvalds, linux-mm, linux-kernel,
	bugzilla-daemon, bugme-daemon, Peter Zijlstra

On Mar 23, 2010, at 7:13 PM, Andrew Morton wrote:
> Anton, we have an executable binary in the bugzilla report but it would
> be nice to also have at least a description of what that code is
> actually doing.  A quick strace shows quite a lot of mprotect activity.
> A pseudo-code walkthrough, perhaps?


Right now I can't say much about the code (we just gave a neighboring group a chance to run their code on our cluster, so I'm totally unfamiliar with it). I will forward your question to them.

In the meantime, you can probably get more information (including sources) here: http://fmt.cs.utwente.nl/tools/ltsmin/

Anton

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 18:03           ` Anton Starikov
@ 2010-03-23 18:21             ` Andrew Morton
  -1 siblings, 0 replies; 66+ messages in thread
From: Andrew Morton @ 2010-03-23 18:21 UTC (permalink / raw)
  To: Anton Starikov
  Cc: Ingo Molnar, Linus Torvalds, linux-mm, linux-kernel,
	bugzilla-daemon, bugme-daemon, Peter Zijlstra

On Tue, 23 Mar 2010 19:03:36 +0100
Anton Starikov <ant.starikov@gmail.com> wrote:

> 
> On Mar 23, 2010, at 7:00 PM, Ingo Molnar wrote:
> >> NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
> >> are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
> >> compile his own kernel to test his load.
> > 
> > another option is to run the rawhide kernel via something like:
> > 
> > 	yum update --enablerepo=development kernel
> > 
> > this will give kernel-2.6.34-0.13.rc1.git1.fc14.x86_64, which has those 
> > changes included.
> 
> I will apply these commits to 2.6.32; I'm afraid the current OFED (which I also need) will not work on 2.6.33+.
> 

You should be able to simply set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
CONFIG_RWSEM_XCHGADD_ALGORITHM=y by hand, as I mentioned earlier?

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 18:21             ` Andrew Morton
@ 2010-03-23 18:25               ` Anton Starikov
  -1 siblings, 0 replies; 66+ messages in thread
From: Anton Starikov @ 2010-03-23 18:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, Linus Torvalds, linux-mm, linux-kernel,
	bugzilla-daemon, bugme-daemon, Peter Zijlstra

On Mar 23, 2010, at 7:21 PM, Andrew Morton wrote:
>> I will apply these commits to 2.6.32; I'm afraid the current OFED (which I also need) will not work on 2.6.33+.
>> 
> 
> You should be able to simply set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
> CONFIG_RWSEM_XCHGADD_ALGORITHM=y by hand, as I mentioned earlier?

Hm. I tried, but when I run "make oldconfig" the options get rewritten, so I assume they conflict with some other setting in the default Fedora kernel config. I'm trying to figure out which one exactly.
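
To confirm which rwsem implementation a given `.config` actually selects after `make oldconfig` has run, the two options discussed above can be read back from the file. The following is a minimal sketch, not part of any kernel tooling: the function names (`rwsem_options`, `uses_xadd_rwsem`) and the sample fragment are hypothetical, and it assumes only the usual `.config` syntax (`CONFIG_X=y` or `# CONFIG_X is not set`).

```python
import re

# The two options named in this thread.
RWSEM_OPTIONS = (
    "CONFIG_RWSEM_GENERIC_SPINLOCK",
    "CONFIG_RWSEM_XCHGADD_ALGORITHM",
)


def rwsem_options(config_text):
    """Return the state of each rwsem option found in a .config text.

    A value is the string after '=' (normally 'y'), 'n' when the file
    contains '# CONFIG_X is not set', or None when the option is absent.
    """
    state = {name: None for name in RWSEM_OPTIONS}
    for raw in config_text.splitlines():
        line = raw.strip()
        set_match = re.fullmatch(r"(CONFIG_\w+)=(.*)", line)
        if set_match and set_match.group(1) in state:
            state[set_match.group(1)] = set_match.group(2)
            continue
        unset_match = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset_match and unset_match.group(1) in state:
            state[unset_match.group(1)] = "n"
    return state


def uses_xadd_rwsem(config_text):
    """True when the XCHGADD variant is on and the generic spinlock variant is not."""
    state = rwsem_options(config_text)
    return (
        state["CONFIG_RWSEM_XCHGADD_ALGORITHM"] == "y"
        and state["CONFIG_RWSEM_GENERIC_SPINLOCK"] != "y"
    )


# Hypothetical fragment, standing in for a real .config file.
SAMPLE = """\
CONFIG_X86_64=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
# CONFIG_RWSEM_XCHGADD_ALGORITHM is not set
"""

if __name__ == "__main__":
    print(rwsem_options(SAMPLE))
    print("xadd rwsem selected:", uses_xadd_rwsem(SAMPLE))
```

Running this against the real `.config` before and after `make oldconfig` would show whether the hand edit survived.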

Anton.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 18:13       ` Andrew Morton
@ 2010-03-23 18:27         ` Ingo Molnar
  -1 siblings, 0 replies; 66+ messages in thread
From: Ingo Molnar @ 2010-03-23 18:27 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, linux-mm, linux-kernel, bugzilla-daemon,
	bugme-daemon, ant.starikov, Peter Zijlstra


* Andrew Morton <akpm@linux-foundation.org> wrote:

> On Tue, 23 Mar 2010 18:34:09 +0100
> Ingo Molnar <mingo@elte.hu> wrote:
> 
> > 
> > It shows a very brutal amount of page fault invoked mmap_sem spinning 
> > overhead.
> > 
> 
> Yes.  Note that we fall off a cliff at nine threads on a 16-way.  As soon as 
> a core gets two threads scheduled onto it?

it's AMD Opterons so no SMT.

My (wild) guess would be that 8 cpus can still do cacheline ping-pong 
reasonably efficiently, but it starts breaking down very seriously with 9 or 
more cores bouncing the same single cache-line.

Breakdowns in scalability are usually very non-linear, for hardware and 
software reasons. '8 threads' sounds like a hw limit to me. From the scheduler 
POV there's no big difference between 8 or 9 CPUs used [this is non-HT] - with 
8 or 7 cores still idle.

	Ingo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 17:45       ` Linus Torvalds
@ 2010-03-23 19:14         ` Anton Starikov
  -1 siblings, 0 replies; 66+ messages in thread
From: Anton Starikov @ 2010-03-23 19:14 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Andrew Morton, linux-mm, linux-kernel,
	bugzilla-daemon, bugme-daemon, Peter Zijlstra


On Mar 23, 2010, at 6:45 PM, Linus Torvalds wrote:

> 
> 
> On Tue, 23 Mar 2010, Ingo Molnar wrote:
>> 
>> It shows a very brutal amount of page fault invoked mmap_sem spinning 
>> overhead.
> 
> Isn't this already fixed? It's the same old "x86-64 rwsemaphores are using 
> the shit-for-brains generic version" thing, and it's fixed by
> 
> 	1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation
> 	5d0b723 x86: clean up rwsem type system
> 	59c33fa x86-32: clean up rwsem inline asm statements
> 
> NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
> are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
> compile his own kernel to test his load.


Applied the mentioned patches. Things didn't improve much.

before:
prog: Total exploration time 9.880 real 60.620 user 76.970 sys

after:
prog: Total exploration time 9.020 real 59.430 user 66.190 sys
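
To put numbers on the before/after timings above, the relative change of each component can be computed directly from the two lines. This is only a small sketch; the helper names `parse_times` and `percent_change` are made up for illustration, and the only inputs are the two timing lines quoted above.

```python
import re

# Matches the "<real> real <user> user <sys> sys" tail of a timing line.
TIMES = re.compile(r"([\d.]+) real ([\d.]+) user ([\d.]+) sys")


def parse_times(line):
    """Extract real/user/sys seconds from one 'Total exploration time' line."""
    match = TIMES.search(line)
    if match is None:
        raise ValueError(f"no timing triple found in: {line!r}")
    real, user, sys_time = (float(value) for value in match.groups())
    return {"real": real, "user": user, "sys": sys_time}


def percent_change(before, after):
    """Relative change of each component in percent (negative means faster)."""
    return {
        key: 100.0 * (after[key] - before[key]) / before[key]
        for key in before
    }


BEFORE = "prog: Total exploration time 9.880 real 60.620 user 76.970 sys"
AFTER = "prog: Total exploration time 9.020 real 59.430 user 66.190 sys"

if __name__ == "__main__":
    change = percent_change(parse_times(BEFORE), parse_times(AFTER))
    for key, value in change.items():
        print(f"{key}: {value:+.1f}%")
```

With these figures, sys time drops by about 14% and wall-clock time by about 9%, while user time is essentially unchanged.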

perf report:

    38.58%             prog  [kernel]                                           [k] _spin_lock_irqsave
    37.42%             prog  ./prog                                             [.] DBSLLlookup_ret
     6.22%             prog  ./prog                                             [.] SuperFastHash
     3.65%             prog  /lib64/libc-2.11.1.so                              [.] __GI_memcpy
     2.09%             prog  ./anderson.6.dve2C                                 [.] get_successors
     1.75%             prog  [kernel]                                           [k] clear_page_c
     1.73%             prog  ./prog                                             [.] index_next_dfs
     0.71%             prog  [kernel]                                           [k] handle_mm_fault
     0.38%             prog  ./prog                                             [.] cb_hook
     0.33%             prog  ./prog                                             [.] get_local
     0.32%             prog  [kernel]                                           [k] page_fault
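
The profile above can be summarised by splitting the listed samples into kernel (`[k]`) and user (`[.]`) symbols. The sketch below does that for the rows shown; the function names and the regular expression are illustrative assumptions about this particular text layout, not an official perf output parser, and the column whitespace has been compressed.

```python
import re

# One row: "<pct>%  <command>  <object>  [<k or .>] <symbol>"
ROW = re.compile(r"^\s*([\d.]+)%\s+(\S+)\s+(\S+)\s+\[(.)\]\s+(\S+)\s*$")

REPORT = """
    38.58%  prog  [kernel]               [k] _spin_lock_irqsave
    37.42%  prog  ./prog                 [.] DBSLLlookup_ret
     6.22%  prog  ./prog                 [.] SuperFastHash
     3.65%  prog  /lib64/libc-2.11.1.so  [.] __GI_memcpy
     2.09%  prog  ./anderson.6.dve2C     [.] get_successors
     1.75%  prog  [kernel]               [k] clear_page_c
     1.73%  prog  ./prog                 [.] index_next_dfs
     0.71%  prog  [kernel]               [k] handle_mm_fault
     0.38%  prog  ./prog                 [.] cb_hook
     0.33%  prog  ./prog                 [.] get_local
     0.32%  prog  [kernel]               [k] page_fault
"""


def parse_perf_rows(text):
    """Return (percent, command, object, kind, symbol) tuples for matching rows."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            pct, command, obj, kind, symbol = match.groups()
            rows.append((float(pct), command, obj, kind, symbol))
    return rows


def share_by_kind(rows):
    """Sum the sample percentages of kernel ('k') and user ('.') symbols."""
    totals = {"kernel": 0.0, "user": 0.0}
    for pct, _command, _obj, kind, _symbol in rows:
        totals["kernel" if kind == "k" else "user"] += pct
    return totals


if __name__ == "__main__":
    totals = share_by_kind(parse_perf_rows(REPORT))
    for label, pct in totals.items():
        print(f"{label}: {pct:.2f}%")
```

For the rows listed, roughly 41% of samples fall in kernel symbols, and nearly all of that share is `_spin_lock_irqsave`.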

Anton.


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 19:14         ` Anton Starikov
@ 2010-03-23 19:17           ` Peter Zijlstra
  -1 siblings, 0 replies; 66+ messages in thread
From: Peter Zijlstra @ 2010-03-23 19:17 UTC (permalink / raw)
  To: Anton Starikov
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, linux-mm,
	linux-kernel, bugzilla-daemon, bugme-daemon

On Tue, 2010-03-23 at 20:14 +0100, Anton Starikov wrote:
> On Mar 23, 2010, at 6:45 PM, Linus Torvalds wrote:
> 
> > 
> > 
> > On Tue, 23 Mar 2010, Ingo Molnar wrote:
> >> 
> >> It shows a very brutal amount of page fault invoked mmap_sem spinning 
> >> overhead.
> > 
> > Isn't this already fixed? It's the same old "x86-64 rwsemaphores are using 
> > the shit-for-brains generic version" thing, and it's fixed by
> > 
> > 	1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation
> > 	5d0b723 x86: clean up rwsem type system
> > 	59c33fa x86-32: clean up rwsem inline asm statements
> > 
> > NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
> > are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
> > compile his own kernel to test his load.
> 
> 
> Applied the mentioned patches. Things didn't improve much.
> 
> before:
> prog: Total exploration time 9.880 real 60.620 user 76.970 sys
> 
> after:
> prog: Total exploration time 9.020 real 59.430 user 66.190 sys
> 
> perf report:
> 
>     38.58%             prog  [kernel]                                           [k] _spin_lock_irqsave
>     37.42%             prog  ./prog                                             [.] DBSLLlookup_ret
>      6.22%             prog  ./prog                                             [.] SuperFastHash
>      3.65%             prog  /lib64/libc-2.11.1.so                              [.] __GI_memcpy
>      2.09%             prog  ./anderson.6.dve2C                                 [.] get_successors
>      1.75%             prog  [kernel]                                           [k] clear_page_c
>      1.73%             prog  ./prog                                             [.] index_next_dfs
>      0.71%             prog  [kernel]                                           [k] handle_mm_fault
>      0.38%             prog  ./prog                                             [.] cb_hook
>      0.33%             prog  ./prog                                             [.] get_local
>      0.32%             prog  [kernel]                                           [k] page_fault

Could you verify with a callgraph profile what that spin_lock_irqsave()
is? If those rwsem patches were successful, mmap_sem should no longer
have a spinlock to contend on, in which case it might be another lock.

If not, something went wrong with backporting those patches.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 18:25               ` Anton Starikov
@ 2010-03-23 19:22                 ` Robin Holt
  -1 siblings, 0 replies; 66+ messages in thread
From: Robin Holt @ 2010-03-23 19:22 UTC (permalink / raw)
  To: Anton Starikov
  Cc: Andrew Morton, Ingo Molnar, Linus Torvalds, linux-mm,
	linux-kernel, bugzilla-daemon, bugme-daemon, Peter Zijlstra

On Tue, Mar 23, 2010 at 07:25:43PM +0100, Anton Starikov wrote:
> On Mar 23, 2010, at 7:21 PM, Andrew Morton wrote:
> >> I will apply these commits to 2.6.32; I'm afraid the current OFED (which I also need) will not work on 2.6.33+.
> >> 
> > 
> > You should be able to simply set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
> > CONFIG_RWSEM_XCHGADD_ALGORITHM=y by hand, as I mentioned earlier?
> 
> Hm. I tried, but when I run "make oldconfig" the options get rewritten, so I assume they conflict with some other setting in the default Fedora kernel config. I'm trying to figure out which one exactly.

Have you tracked this down yet?  I just got the patches applied against
an older kernel and am running into the same issue.

Thanks,
Robin

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 19:22                 ` Robin Holt
@ 2010-03-23 19:30                   ` Anton Starikov
  -1 siblings, 0 replies; 66+ messages in thread
From: Anton Starikov @ 2010-03-23 19:30 UTC (permalink / raw)
  To: Robin Holt
  Cc: Andrew Morton, Ingo Molnar, Linus Torvalds, linux-mm,
	linux-kernel, bugzilla-daemon, bugme-daemon, Peter Zijlstra


On Mar 23, 2010, at 8:22 PM, Robin Holt wrote:

> On Tue, Mar 23, 2010 at 07:25:43PM +0100, Anton Starikov wrote:
>> On Mar 23, 2010, at 7:21 PM, Andrew Morton wrote:
>>>> I will apply these commits to 2.6.32; I'm afraid the current OFED (which I also need) will not work on 2.6.33+.
>>>> 
>>> 
>>> You should be able to simply set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
>>> CONFIG_RWSEM_XCHGADD_ALGORITHM=y by hand, as I mentioned earlier?
>> 
>> Hm. I tried, but when I run "make oldconfig" the options get rewritten, so I assume they conflict with some other setting in the default Fedora kernel config. I'm trying to figure out which one exactly.
> 
> Have you tracked this down yet?  I just got the patches applied against
> an older kernel and am running into the same issue.

I decided not to track down this issue and just applied the patches. I understood that with these patches there is no need to change these config options. Am I wrong?

Anton

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 19:17           ` Peter Zijlstra
  (?)
@ 2010-03-23 19:42           ` Anton Starikov
  -1 siblings, 0 replies; 66+ messages in thread
From: Anton Starikov @ 2010-03-23 19:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, linux-mm,
	linux-kernel, bugzilla-daemon, bugme-daemon

[-- Attachment #1: Type: text/plain, Size: 157 bytes --]

I'm attaching the callgraph here.

Also, I checked the kernel source; the code that was actually compiled is exactly what it should be after the patches.

Am I missing something?


[-- Attachment #2: callg.txt.gz --]
[-- Type: application/x-gzip, Size: 166398 bytes --]

[-- Attachment #3: Type: text/plain, Size: 2538 bytes --]


On Mar 23, 2010, at 8:17 PM, Peter Zijlstra wrote:

> On Tue, 2010-03-23 at 20:14 +0100, Anton Starikov wrote:
>> On Mar 23, 2010, at 6:45 PM, Linus Torvalds wrote:
>> 
>>> 
>>> 
>>> On Tue, 23 Mar 2010, Ingo Molnar wrote:
>>>> 
>>>> It shows a very brutal amount of page fault invoked mmap_sem spinning 
>>>> overhead.
>>> 
>>> Isn't this already fixed? It's the same old "x86-64 rwsemaphores are using 
>>> the shit-for-brains generic version" thing, and it's fixed by
>>> 
>>> 	1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation
>>> 	5d0b723 x86: clean up rwsem type system
>>> 	59c33fa x86-32: clean up rwsem inline asm statements
>>> 
>>> NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
>>> are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
>>> compile his own kernel to test his load.
>> 
>> 
>> Applied the mentioned patches. Things didn't improve much.
>> 
>> before:
>> prog: Total exploration time 9.880 real 60.620 user 76.970 sys
>> 
>> after:
>> prog: Total exploration time 9.020 real 59.430 user 66.190 sys
>> 
>> perf report:
>> 
>>    38.58%             prog  [kernel]                                           [k] _spin_lock_irqsave
>>    37.42%             prog  ./prog                                             [.] DBSLLlookup_ret
>>     6.22%             prog  ./prog                                             [.] SuperFastHash
>>     3.65%             prog  /lib64/libc-2.11.1.so                              [.] __GI_memcpy
>>     2.09%             prog  ./anderson.6.dve2C                                 [.] get_successors
>>     1.75%             prog  [kernel]                                           [k] clear_page_c
>>     1.73%             prog  ./prog                                             [.] index_next_dfs
>>     0.71%             prog  [kernel]                                           [k] handle_mm_fault
>>     0.38%             prog  ./prog                                             [.] cb_hook
>>     0.33%             prog  ./prog                                             [.] get_local
>>     0.32%             prog  [kernel]                                           [k] page_fault
> 
> Could you verify with a callgraph profile what that spin_lock_irqsave()
> is? If those rwsem patches were successful, mmap_sem should no longer
> have a spinlock to contend on, in which case it might be another lock.
> 
> If not, something went wrong with backporting those patches.


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 19:30                   ` Anton Starikov
@ 2010-03-23 19:49                     ` Robin Holt
  -1 siblings, 0 replies; 66+ messages in thread
From: Robin Holt @ 2010-03-23 19:49 UTC (permalink / raw)
  To: Anton Starikov
  Cc: Robin Holt, Andrew Morton, Ingo Molnar, Linus Torvalds, linux-mm,
	linux-kernel, bugzilla-daemon, bugme-daemon, Peter Zijlstra

On Tue, Mar 23, 2010 at 08:30:19PM +0100, Anton Starikov wrote:
> 
> On Mar 23, 2010, at 8:22 PM, Robin Holt wrote:
> 
> > On Tue, Mar 23, 2010 at 07:25:43PM +0100, Anton Starikov wrote:
> >> On Mar 23, 2010, at 7:21 PM, Andrew Morton wrote:
> >>>> I will apply these commits to 2.6.32; I'm afraid the current OFED (which I also need) will not work on 2.6.33+.
> >>>> 
> >>> 
> >>> You should be able to simply set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
> >>> CONFIG_RWSEM_XCHGADD_ALGORITHM=y by hand, as I mentioned earlier?
> >> 
> >> Hm. I tried, but when I do "make oldconfig", it gets rewritten, so I assume that it conflicts with some other setting from the default Fedora kernel config. Trying to figure out which one exactly.
> > 
> > Have you tracked this down yet?  I just got the patches applied against
> > an older kernel and am running into the same issue.
> 
> I decided not to track down this issue and just applied the patches. I understood that with these patches there is no need to change these config options. Am I wrong?

We might need to also apply:
bafaecd11df15ad5b1e598adc7736afcd38ee13d

Robin

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 19:22                 ` Robin Holt
@ 2010-03-23 19:50                   ` Anton Starikov
  -1 siblings, 0 replies; 66+ messages in thread
From: Anton Starikov @ 2010-03-23 19:50 UTC (permalink / raw)
  To: Robin Holt
  Cc: Andrew Morton, Ingo Molnar, Linus Torvalds, linux-mm,
	linux-kernel, bugzilla-daemon, bugme-daemon, Peter Zijlstra


On Mar 23, 2010, at 8:22 PM, Robin Holt wrote:

> On Tue, Mar 23, 2010 at 07:25:43PM +0100, Anton Starikov wrote:
>> On Mar 23, 2010, at 7:21 PM, Andrew Morton wrote:
>>>> I will apply these commits to 2.6.32; I'm afraid the current OFED (which I also need) will not work on 2.6.33+.
>>>> 
>>> 
>>> You should be able to simply set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
>>> CONFIG_RWSEM_XCHGADD_ALGORITHM=y by hand, as I mentioned earlier?
>> 
>> Hm. I tried, but when I do "make oldconfig", it gets rewritten, so I assume that it conflicts with some other setting from the default Fedora kernel config. Trying to figure out which one exactly.
> 
> Have you tracked this down yet?  I just got the patches applied against
> an older kernel and am running into the same issue.


I think you can prevent these options from being overwritten if you set them in arch/x86/configs/x86_64_defconfig

Anton
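
A quick way to see what "make oldconfig" actually left in place is to grep
the resulting .config for the two rwsem symbols named above. This is only a
sketch: the helper name rwsem_config is made up for illustration, and it
assumes the usual .config layout in which a disabled option appears as
"# CONFIG_... is not set".

```shell
#!/bin/sh
# rwsem_config FILE: print the rwsem-related lines of a kernel .config,
# including "# CONFIG_... is not set" lines for disabled options.
# (rwsem_config is a hypothetical helper name, not part of the kernel tree.)
rwsem_config() {
    grep -E '^(# )?CONFIG_RWSEM_(GENERIC_SPINLOCK|XCHGADD_ALGORITHM)[= ]' "$1"
}

# Example: run from the top of a configured kernel tree, after "make oldconfig":
#   rwsem_config .config
```

If CONFIG_RWSEM_GENERIC_SPINLOCK=y is still printed after oldconfig, the
hand edit did not stick.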


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 18:21             ` Andrew Morton
@ 2010-03-23 19:52               ` Linus Torvalds
  -1 siblings, 0 replies; 66+ messages in thread
From: Linus Torvalds @ 2010-03-23 19:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Anton Starikov, Ingo Molnar, linux-mm, linux-kernel,
	bugzilla-daemon, bugme-daemon, Peter Zijlstra



On Tue, 23 Mar 2010, Andrew Morton wrote:
> 
> You should be able to simply set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
> CONFIG_RWSEM_XCHGADD_ALGORITHM=y by hand, as I mentioned earlier?

No. Doesn't work. The XADD code simply never worked on x86-64, which is 
why those three commits I pointed at are required.

Oh, and you need one more commit (at least) in addition to the three I 
already mentioned - the one that actually adds the x86-64 wrappers and 
Kconfig option:

	bafaecd x86-64: support native xadd rwsem implementation

so the minimal list of commits (on top of 2.6.33) is at least

	59c33fa x86-32: clean up rwsem inline asm statements
	5d0b723 x86: clean up rwsem type system
	bafaecd x86-64: support native xadd rwsem implementation
	1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation

and I just verified that they at least cherry-pick cleanly (in that 
order). I _think_ it would be good to also do

	0d1622d x86-64, rwsem: Avoid store forwarding hazard in __downgrade_write

but that one is a small detail, not anything fundamentally important.

			Linus
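
The minimal list above can be applied in the stated order with a short
script along these lines. It is a sketch under assumptions, not a tested
backport recipe: it uses the full commit hashes as posted elsewhere in this
thread, the branch name rwsem-backport is made up, and it assumes a mainline
clone in which the base tag (for example v2.6.33) exists.

```shell
#!/bin/sh
# The four commits of the minimal list, in the order given in the message
# above (full hashes as posted elsewhere in this thread).
COMMITS="59c33fa7791e9948ba467c2b83e307a0d087ab49
5d0b7235d83eefdafda300656e97d368afcafc9a
bafaecd11df15ad5b1e598adc7736afcd38ee13d
1838ef1d782f7527e6defe87e180598622d2d071"

# backport BASE: create a branch at BASE and cherry-pick each commit in order,
# stopping at the first failure.
# GIT defaults to "git"; set GIT="echo git" for a dry run that only prints
# the commands. The branch name "rwsem-backport" is hypothetical.
backport() {
    base=$1
    ${GIT:-git} checkout -b rwsem-backport "$base" || return 1
    for c in $COMMITS; do
        ${GIT:-git} cherry-pick "$c" || return 1
    done
}

# Example (assumes a mainline clone where the tag v2.6.33 exists):
#   backport v2.6.33
```

Picking the commits one at a time keeps the order explicit and stops at the
first conflict, which matters more on a vendor 2.6.32 tree than on plain
2.6.33.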

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 19:14         ` Anton Starikov
@ 2010-03-23 19:54           ` Linus Torvalds
  -1 siblings, 0 replies; 66+ messages in thread
From: Linus Torvalds @ 2010-03-23 19:54 UTC (permalink / raw)
  To: Anton Starikov
  Cc: Ingo Molnar, Andrew Morton, linux-mm, linux-kernel,
	bugzilla-daemon, bugme-daemon, Peter Zijlstra



On Tue, 23 Mar 2010, Anton Starikov wrote:

> 
> On Mar 23, 2010, at 6:45 PM, Linus Torvalds wrote:
> 
> > 
> > 
> > On Tue, 23 Mar 2010, Ingo Molnar wrote:
> >> 
> >> It shows a very brutal amount of page fault invoked mmap_sem spinning 
> >> overhead.
> > 
> > Isn't this already fixed? It's the same old "x86-64 rwsemaphores are using 
> > the shit-for-brains generic version" thing, and it's fixed by
> > 
> > 	1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation
> > 	5d0b723 x86: clean up rwsem type system
> > 	59c33fa x86-32: clean up rwsem inline asm statements
> > 
> > NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
> > are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
> > compile his own kernel to test his load.
> 
> 
> Applied the mentioned patches. Things didn't improve much.

Yeah, I missed at least one commit, namely

	bafaecd x86-64: support native xadd rwsem implementation

which is the one that actually makes x86-64 able to use the xadd version.

		Linus

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 19:49                     ` Robin Holt
@ 2010-03-23 19:57                       ` Robin Holt
  -1 siblings, 0 replies; 66+ messages in thread
From: Robin Holt @ 2010-03-23 19:57 UTC (permalink / raw)
  To: Anton Starikov
  Cc: Robin Holt, Andrew Morton, Ingo Molnar, Linus Torvalds, linux-mm,
	linux-kernel, bugzilla-daemon, bugme-daemon, Peter Zijlstra

On Tue, Mar 23, 2010 at 02:49:59PM -0500, Robin Holt wrote:
> On Tue, Mar 23, 2010 at 08:30:19PM +0100, Anton Starikov wrote:
> > 
> > On Mar 23, 2010, at 8:22 PM, Robin Holt wrote:
> > 
> > > On Tue, Mar 23, 2010 at 07:25:43PM +0100, Anton Starikov wrote:
> > >> On Mar 23, 2010, at 7:21 PM, Andrew Morton wrote:
> > >>>> I will apply these commits to 2.6.32; I'm afraid the current OFED (which I also need) will not work on 2.6.33+.
> > >>>> 
> > >>> 
> > >>> You should be able to simply set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
> > >>> CONFIG_RWSEM_XCHGADD_ALGORITHM=y by hand, as I mentioned earlier?
> > >> 
> > >> Hm. I tried, but when I do "make oldconfig", it gets rewritten, so I assume that it conflicts with some other setting from the default Fedora kernel config. Trying to figure out which one exactly.
> > > 
> > > Have you tracked this down yet?  I just got the patches applied against
> > > an older kernel and am running into the same issue.
> > 
> > I decided not to track down this issue and just applied the patches. I understood that with these patches there is no need to change these config options. Am I wrong?
> 
> We might need to also apply:
> bafaecd11df15ad5b1e598adc7736afcd38ee13d

For the record, these are the patches I have applied to a 2.6.32 kernel from a vendor:

59c33fa7791e9948ba467c2b83e307a0d087ab49
5d0b7235d83eefdafda300656e97d368afcafc9a
1838ef1d782f7527e6defe87e180598622d2d071
0d1622d7f526311d87d7da2ee7dd14b73e45d3fc
bafaecd11df15ad5b1e598adc7736afcd38ee13d

A quick look at the disassembly makes it look like we are using the
rwsem_64, et al.

Robin

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 19:54           ` Linus Torvalds
@ 2010-03-23 20:43             ` Anton Starikov
  -1 siblings, 0 replies; 66+ messages in thread
From: Anton Starikov @ 2010-03-23 20:43 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Andrew Morton, linux-mm, linux-kernel,
	bugzilla-daemon, bugme-daemon, Peter Zijlstra

I think we got a winner!

Problem seems to be fixed.

Just for the record, I used the following patches:

59c33fa7791e9948ba467c2b83e307a0d087ab49
5d0b7235d83eefdafda300656e97d368afcafc9a
1838ef1d782f7527e6defe87e180598622d2d071
4126faf0ab7417fbc6eb99fb0fd407e01e9e9dfe
bafaecd11df15ad5b1e598adc7736afcd38ee13d
0d1622d7f526311d87d7da2ee7dd14b73e45d3fc


Thanks,
Anton.

On Mar 23, 2010, at 8:54 PM, Linus Torvalds wrote:

> 
> 
> On Tue, 23 Mar 2010, Anton Starikov wrote:
> 
>> 
>> On Mar 23, 2010, at 6:45 PM, Linus Torvalds wrote:
>> 
>>> 
>>> 
>>> On Tue, 23 Mar 2010, Ingo Molnar wrote:
>>>> 
>>>> It shows a very brutal amount of page fault invoked mmap_sem spinning 
>>>> overhead.
>>> 
>>> Isn't this already fixed? It's the same old "x86-64 rwsemaphores are using 
>>> the shit-for-brains generic version" thing, and it's fixed by
>>> 
>>> 	1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation
>>> 	5d0b723 x86: clean up rwsem type system
>>> 	59c33fa x86-32: clean up rwsem inline asm statements
>>> 
>>> NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
>>> are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
>>> compile his own kernel to test his load.
>> 
>> 
>> Applied the mentioned patches. Things didn't improve much.
> 
> Yeah, I missed at least one commit, namely
> 
> 	bafaecd x86-64: support native xadd rwsem implementation
> 
> which is the one that actually makes x86-64 able to use the xadd version.
> 
> 		Linus


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 18:13       ` Andrew Morton
@ 2010-03-23 21:19         ` Anton Starikov
  -1 siblings, 0 replies; 66+ messages in thread
From: Anton Starikov @ 2010-03-23 21:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, Linus Torvalds, linux-mm, linux-kernel,
	bugzilla-daemon, bugme-daemon, Peter Zijlstra

Although the case is solved, I will post a description of the testcase program,
just in case someone wonders or would like to keep it for later tests.

------------------------------------------------------------------------
It is a parallel model checker. The command line you used does reachability
on the state space of model anderson.6, meaning that it searches through all
possible states (int vectors). Each thread gets a vector from the queue,
calculates its successor states and puts them in a lock-less static hash
table (pseudo BFS exploration because the threads each have their own
queue).

How did Ingo run the binary? Because the static table size should be chosen
to fit into memory. "-s 27" allocates 2^27 * (|vector| + 1) * sizeof(int)
bytes. |vector| is equal to 19 for anderson.6, ergo the table size is 10GB.
This could explain the huge number of page faults Ingo gets.

But anyway, you can imagine that the code is quite jumpy and has a big
memory footprint, so the page faults may also be normal.
------------------------------------------------------------------------
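
The table size quoted above can be checked with plain shell arithmetic. The
sketch assumes sizeof(int) is 4 bytes, as is typical on x86-64 Linux, and a
shell with 64-bit arithmetic; the variable names are illustrative only.

```shell
#!/bin/sh
# Hash table size for "-s 27" with |vector| = 19, using the formula
# 2^27 * (|vector| + 1) * sizeof(int) from the description above.
# Assumption: sizeof(int) is 4 bytes (typical on x86-64 Linux).
slots=$((1 << 27))          # 2^27 table slots
ints_per_slot=$((19 + 1))   # |vector| + 1 ints per slot
int_size=4                  # assumed sizeof(int)
bytes=$((slots * ints_per_slot * int_size))
gib=$((bytes / 1024 / 1024 / 1024))
echo "$bytes bytes = $gib GiB"
```

This prints 10737418240 bytes = 10 GiB, matching the 10GB figure, so a
machine with less free memory than that would be expected to fault heavily
with "-s 27".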

On Mar 23, 2010, at 7:13 PM, Andrew Morton wrote:

> Anton, we have an executable binary in the bugzilla report but it would
> be nice to also have at least a description of what that code is
> actually doing.  A quick strace shows quite a lot of mprotect activity.
> A pseudo-code walkthrough, perhaps?
> 
> Thanks.


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 20:43             ` Anton Starikov
@ 2010-03-23 23:04               ` Linus Torvalds
  -1 siblings, 0 replies; 66+ messages in thread
From: Linus Torvalds @ 2010-03-23 23:04 UTC (permalink / raw)
  To: Anton Starikov, Greg KH, stable
  Cc: Ingo Molnar, Andrew Morton, linux-mm, Linux Kernel Mailing List,
	bugzilla-daemon, bugme-daemon, Peter Zijlstra



On Tue, 23 Mar 2010, Anton Starikov wrote:
>
> I think we got a winner!
> 
> Problem seems to be fixed.
> 
> Just for the record, I used the following patches:
> 
> 59c33fa7791e9948ba467c2b83e307a0d087ab49
> 5d0b7235d83eefdafda300656e97d368afcafc9a
> 1838ef1d782f7527e6defe87e180598622d2d071
> 4126faf0ab7417fbc6eb99fb0fd407e01e9e9dfe
> bafaecd11df15ad5b1e598adc7736afcd38ee13d
> 0d1622d7f526311d87d7da2ee7dd14b73e45d3fc

Ok. If you have performance numbers for before/after these patches for 
your actual workload, I'd suggest posting them to stable@kernel.org, and 
maybe those rwsem fixes will get back-ported.

The patches are pretty small, and should be fairly safe. So they are 
certainly stable material.

		Linus

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 23:04               ` Linus Torvalds
@ 2010-03-23 23:19                 ` Anton Starikov
  -1 siblings, 0 replies; 66+ messages in thread
From: Anton Starikov @ 2010-03-23 23:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Greg KH, stable, Ingo Molnar, Andrew Morton, linux-mm,
	Linux Kernel Mailing List, bugzilla-daemon, bugme-daemon,
	Peter Zijlstra

Tomorrow I will try to patch and check 2.6.33 to see whether these patches are enough to restore performance, because on 2.6.33 the performance issue also somehow involved cgroup business (and performance was terrible even compared to the broken 2.6.32). If they do not fix 2.6.33, I will ask to reopen the bug; otherwise I will post to stable@.

Thanks again for the help,
Anton.

On Mar 24, 2010, at 12:04 AM, Linus Torvalds wrote:

> 
> 
> On Tue, 23 Mar 2010, Anton Starikov wrote:
>> 
>> I think we got a winner!
>> 
>> Problem seems to be fixed.
>> 
>> Just for the record, I used the following patches:
>> 
>> 59c33fa7791e9948ba467c2b83e307a0d087ab49
>> 5d0b7235d83eefdafda300656e97d368afcafc9a
>> 1838ef1d782f7527e6defe87e180598622d2d071
>> 4126faf0ab7417fbc6eb99fb0fd407e01e9e9dfe
>> bafaecd11df15ad5b1e598adc7736afcd38ee13d
>> 0d1622d7f526311d87d7da2ee7dd14b73e45d3fc
> 
> Ok. If you have performance numbers for before/after these patches for 
> your actual workload, I'd suggest posting them to stable@kernel.org, and 
> maybe those rwsem fixes will get back-ported.
> 
> The patches are pretty small, and should be fairly safe. So they are 
> certainly stable material.
> 
> 		Linus


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 23:04               ` Linus Torvalds
@ 2010-03-23 23:36                 ` Ingo Molnar
  -1 siblings, 0 replies; 66+ messages in thread
From: Ingo Molnar @ 2010-03-23 23:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Anton Starikov, Greg KH, stable, Andrew Morton, linux-mm,
	Linux Kernel Mailing List, bugzilla-daemon, bugme-daemon,
	Peter Zijlstra


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Tue, 23 Mar 2010, Anton Starikov wrote:
> >
> > I think we got a winner!
> > 
> > Problem seems to be fixed.
> > 
> > Just for the record, I used the following patches:
> > 
> > 59c33fa7791e9948ba467c2b83e307a0d087ab49
> > 5d0b7235d83eefdafda300656e97d368afcafc9a
> > 1838ef1d782f7527e6defe87e180598622d2d071
> > 4126faf0ab7417fbc6eb99fb0fd407e01e9e9dfe
> > bafaecd11df15ad5b1e598adc7736afcd38ee13d
> > 0d1622d7f526311d87d7da2ee7dd14b73e45d3fc
> 
> Ok. If you have performance numbers for before/after these patches for 
> your actual workload, I'd suggest posting them to stable@kernel.org, and 
> maybe those rwsem fixes will get back-ported.
> 
> The patches are pretty small, and should be fairly safe. So they are 
> certainly stable material.

We haven't had any stability problems with them, except one trivial build bug, 
so -stable would be nice.

	Ingo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 23:36                 ` Ingo Molnar
@ 2010-03-23 23:55                   ` Linus Torvalds
  -1 siblings, 0 replies; 66+ messages in thread
From: Linus Torvalds @ 2010-03-23 23:55 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Anton Starikov, Greg KH, stable, Andrew Morton, linux-mm,
	Linux Kernel Mailing List, bugzilla-daemon, bugme-daemon,
	Peter Zijlstra



On Wed, 24 Mar 2010, Ingo Molnar wrote:
> 
> We haven't had any stability problems with them, except one trivial build bug, 
> so -stable would be nice.

Oh, you're right. There was that UML build bug. But I think that was 
included in the list of commits Anton had - commit 4126faf0ab ("x86: Fix 
breakage of UML from the changes in the rwsem system").

		Linus

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 23:55                   ` Linus Torvalds
@ 2010-03-24  0:03                     ` Anton Starikov
  -1 siblings, 0 replies; 66+ messages in thread
From: Anton Starikov @ 2010-03-24  0:03 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Greg KH, stable, Andrew Morton, linux-mm,
	Linux Kernel Mailing List, bugzilla-daemon, bugme-daemon,
	Peter Zijlstra

Yes, it is included in my list.
When I submit the patches to stable, I will include it as well.

Anton

On Mar 24, 2010, at 12:55 AM, Linus Torvalds wrote:

> 
> 
> On Wed, 24 Mar 2010, Ingo Molnar wrote:
>> 
>> We haven't had any stability problems with them, except one trivial build bug, 
>> so -stable would be nice.
> 
> Oh, you're right. There was that UML build bug. But I think that was 
> included in the list of commits Anton had - commit 4126faf0ab ("x86: Fix 
> breakage of UML from the changes in the rwsem system").
> 
> 		Linus


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 23:55                   ` Linus Torvalds
@ 2010-03-24  2:15                     ` Andi Kleen
  -1 siblings, 0 replies; 66+ messages in thread
From: Andi Kleen @ 2010-03-24  2:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Anton Starikov, Greg KH, stable, Andrew Morton,
	linux-mm, Linux Kernel Mailing List, bugzilla-daemon,
	bugme-daemon, Peter Zijlstra

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Wed, 24 Mar 2010, Ingo Molnar wrote:
>> 
>> We haven't had any stability problems with them, except one trivial build bug, 
>> so -stable would be nice.
>
> Oh, you're right. There was that UML build bug. But I think that was 
> included in the list of commits Anton had - commit 4126faf0ab ("x86: Fix 
> breakage of UML from the changes in the rwsem system").

It would also be nice to get that change into 2.6.32 stable. That is
widely used on larger systems.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-24  2:15                     ` Andi Kleen
@ 2010-03-24  3:00                       ` Linus Torvalds
  -1 siblings, 0 replies; 66+ messages in thread
From: Linus Torvalds @ 2010-03-24  3:00 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Ingo Molnar, Anton Starikov, Greg KH, stable, Andrew Morton,
	linux-mm, Linux Kernel Mailing List, bugzilla-daemon,
	bugme-daemon, Peter Zijlstra



On Wed, 24 Mar 2010, Andi Kleen wrote:
> 
> It would also be nice to get that change into 2.6.32 stable. That is
> widely used on larger systems.

Looking at the changes to the files in question, it looks like it should 
all apply cleanly to 2.6.32, so I don't see any reason not to backport 
further back.

Somebody should double-check, though.

		Linus

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 18:03           ` Anton Starikov
@ 2010-03-24 16:40             ` Roland Dreier
  -1 siblings, 0 replies; 66+ messages in thread
From: Roland Dreier @ 2010-03-24 16:40 UTC (permalink / raw)
  To: Anton Starikov
  Cc: Ingo Molnar, Linus Torvalds, Andrew Morton, linux-mm,
	linux-kernel, bugzilla-daemon, bugme-daemon, Peter Zijlstra

 > I will apply this commits to 2.6.32, I afraid current OFED (which I
 > need also) will not work on 2.6.33+.

What do you need from OFED that is not in 2.6.34-rc1?
-- 
Roland Dreier  <rolandd@cisco.com>
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-24 16:40             ` Roland Dreier
@ 2010-03-26  3:24               ` Anton Starikov
  -1 siblings, 0 replies; 66+ messages in thread
From: Anton Starikov @ 2010-03-26  3:24 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Ingo Molnar, Linus Torvalds, Andrew Morton, linux-mm,
	linux-kernel, bugzilla-daemon, bugme-daemon, Peter Zijlstra

On Mar 24, 2010, at 5:40 PM, Roland Dreier wrote:

>> I will apply this commits to 2.6.32, I afraid current OFED (which I
>> need also) will not work on 2.6.33+.
> 
> What do you need from OFED that is not in 2.6.34-rc1?

I didn't go to 2.6.34-rc1.
I tried 2.6.33; the mlx4 driver that comes with the kernel panics on my hardware. And OFED-1.5 doesn't support this kernel (it can probably still be compiled, but I didn't check).

Anton.


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-23 14:22   ` Andrew Morton
  (?)
  (?)
@ 2010-04-02 18:57   ` Lee Schermerhorn
  -1 siblings, 0 replies; 66+ messages in thread
From: Lee Schermerhorn @ 2010-04-02 18:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, bugzilla-daemon, bugme-daemon,
	ant.starikov, Peter Zijlstra, Eric Whitney

[-- Attachment #1: Type: text/plain, Size: 3957 bytes --]

On Tue, 2010-03-23 at 10:22 -0400, Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Tue, 23 Mar 2010 16:13:25 GMT bugzilla-daemon@bugzilla.kernel.org wrote:
> 
> > https://bugzilla.kernel.org/show_bug.cgi?id=15618
> > 
> >            Summary: 2.6.18->2.6.32->2.6.33 huge regression in performance
> >            Product: Process Management
> >            Version: 2.5
> >     Kernel Version: 2.6.32
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: high
> >           Priority: P1
> >          Component: Other
> >         AssignedTo: process_other@kernel-bugs.osdl.org
> >         ReportedBy: ant.starikov@gmail.com
> >         Regression: No
> > 
> > 
> > We have benchmarked some multithreaded code here on 16-core/4-way opteron 8356
> > host on number of kernels (see below) and found strange results.
> > Up to 8 threads we didn't see any noticeable differences in performance, but
> > starting from 9 threads performance diverges substantially. I provide here
> > results for 14 threads
> 
> lolz.  Catastrophic meltdown.  Thanks for doing all that work - at a
> guess I'd say it's mmap_sem.  Perhaps with some assist from the CPU
> scheduler.
> 
> If you change the config to set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
> CONFIG_RWSEM_XCHGADD_ALGORITHM=y does it help?
> 
> Anyway, there's a testcase in bugzilla and it looks like we got us some
> work to do.
> 
<snip>

I had an "opportunity" to investigate page fault behavior on 2.6.18+
[RHEL5.4] on an 8-socket Istanbul system earlier this year.  When I saw
this mail, I collected up the data I had from that adventure and ran
additional tests on 2.6.33 and 2.6.34-rc1.  I have attached plots for
what I call "per node" and "system wide" page fault scalability.

The per node plot [#1] shows the page fault rate of 1 to 6
[nr_cores_per_socket] tasks [processes] and threads faulting in a fixed
GB/task at the same time on a single socket.  The system wide plot [#3]
shows 1 to 48 [nr_sockets * nr_cores_per_socket] tasks and threads again
faulting in a fixed GB/task...   For the latter test, I load one core
per socket at a time, then add the 2nd core per socket, ...  In all
cases, the individual tasks/threads are fork()ed/pthread_create()d by a
parent bound to the cpu where they'll run to obtain node-local kernel
data structures.  The tests run with SCHED_FIFO.

I plot both "faults per wall clock second"--the aggregate rate--and
"faults per cpu second" or normalized rate.  The per node scalability
doesn't look all that different across the 3 releases, especially the
faults per cpu seconds curves.  However, in the system wide
multi-threaded tests, 2.6.33 is an anomaly compared to both 2.6.18+ and
2.6.34-rc1.  The 2.6.18+ and 2.6.34-rc1 multi-threaded tests show a lot
of noise and, of course, a much lower fault rate relative to the
multi-task tests.  I aborted the 2.6.33 system wide multi-threaded test
at 32 threads because it was just taking too long.
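
For readers without the tarball handy, the two rates described above can
be illustrated with a minimal sketch.  This is NOT the pft.c source: the
names run_pft, pft_result and touch_pages are hypothetical, pages touched
are used as a proxy for faults (which assumes one fault per page, i.e. no
transparent huge pages), and there is no CPU binding or SCHED_FIFO.  Each
thread maps and touches its own anonymous region, so all threads fault
against the same mm.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define PFT_MAX_THREADS 64	/* arbitrary cap for this sketch */

/* Hypothetical result record; not taken from pft.c. */
struct pft_result {
	long   pages_touched;		/* proxy for the number of faults */
	double faults_per_wall_sec;	/* aggregate rate */
	double faults_per_cpu_sec;	/* normalized rate */
};

struct worker_arg {
	size_t bytes;	/* size of the region this thread faults in */
	long   pages;	/* pages touched so far */
	int    ok;	/* set to 1 once the region was fully touched */
};

/* Map an anonymous region and write one byte per page to fault it in. */
static void *touch_pages(void *p)
{
	struct worker_arg *a = p;
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	char *buf = mmap(NULL, a->bytes, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED)
		return NULL;
	for (size_t off = 0; off < a->bytes; off += page) {
		buf[off] = 1;
		a->pages++;
	}
	munmap(buf, a->bytes);
	a->ok = 1;
	return NULL;
}

static double ts_to_sec(struct timespec t)
{
	return (double)t.tv_sec + (double)t.tv_nsec / 1e9;
}

/* Run nthreads workers concurrently; returns 0 on success, -1 on error. */
static int run_pft(int nthreads, size_t bytes_per_thread,
		   struct pft_result *out)
{
	pthread_t tid[PFT_MAX_THREADS];
	struct worker_arg arg[PFT_MAX_THREADS];
	struct timespec w0, w1, c0, c1;
	int i, started = 0, rc = 0;

	assert(out != NULL);
	if (nthreads < 1 || nthreads > PFT_MAX_THREADS)
		return -1;

	clock_gettime(CLOCK_MONOTONIC, &w0);		/* wall clock */
	clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &c0);	/* cpu, all threads */
	for (i = 0; i < nthreads; i++) {
		arg[i] = (struct worker_arg){ .bytes = bytes_per_thread };
		if (pthread_create(&tid[i], NULL, touch_pages, &arg[i])) {
			rc = -1;
			break;
		}
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);
	clock_gettime(CLOCK_MONOTONIC, &w1);
	clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &c1);

	out->pages_touched = 0;
	for (i = 0; i < started; i++) {
		if (!arg[i].ok)
			rc = -1;
		out->pages_touched += arg[i].pages;
	}

	double wall = ts_to_sec(w1) - ts_to_sec(w0);
	double cpu = ts_to_sec(c1) - ts_to_sec(c0);

	out->faults_per_wall_sec = wall > 0 ? out->pages_touched / wall : 0.0;
	out->faults_per_cpu_sec = cpu > 0 ? out->pages_touched / cpu : 0.0;
	return rc;
}
```

Calling run_pft(n, bytes, &r) for increasing n and comparing
r.faults_per_cpu_sec across kernels gives a rough analogue of the
normalized curves; older toolchains may need -pthread at link time.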

Unfortunately, with this many curves, the legends obscure much of the
plot.  So, rather than bloat this message any more, I've packaged up the
raw data along with plots with and without legends and placed the
tarball here:

	http://free.linux.hp.com/~lts/Pft/

That directory also contains the source for the version of the pft test
used, along with the scripts used to run the tests and plot the results.
Note that some manual editing of the "plot annotations" in the raw data
was required to generate several different plots from the same data.

The pft test is a highly, uh, "evolved" version of pft.c that Christoph
Lameter pointed me at a few years ago.  This version requires a patched
libnuma with the v2 api.  The required patch to the numactl-2.0.3
package is included in the test tarball.  [I've contacted Cliff about
getting the patch into 2.0.4.]

Lee

[-- Attachment #2: 1-pft-istanbul_per_node_task_vs_thread_18v33v34rc1.png --]
[-- Type: image/png, Size: 110307 bytes --]

[-- Attachment #3: 3-pft-istanbul_task_and_thread_18v33v34rc1.png --]
[-- Type: image/png, Size: 121886 bytes --]

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance
  2010-03-24  3:00                       ` Linus Torvalds
@ 2010-04-19 18:19                         ` Greg KH
  -1 siblings, 0 replies; 66+ messages in thread
From: Greg KH @ 2010-04-19 18:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andi Kleen, Ingo Molnar, Anton Starikov, stable, Andrew Morton,
	linux-mm, Linux Kernel Mailing List, bugzilla-daemon,
	bugme-daemon, Peter Zijlstra

On Tue, Mar 23, 2010 at 08:00:54PM -0700, Linus Torvalds wrote:
> 
> 
> On Wed, 24 Mar 2010, Andi Kleen wrote:
> > 
> > It would also be nice to get that change into 2.6.32 stable. That is
> > widely used on larger systems.
> 
> Looking at the changes to the files in question, it looks like it should 
> all apply cleanly to 2.6.32, so I don't see any reason not to backport 
> further back.
> 
> Somebody should double-check, though.

I have queued them all up for .33 and .32-stable kernel releases now.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 66+ messages in thread

end of thread, other threads:[~2010-04-19 18:32 UTC | newest]

Thread overview: 66+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <bug-15618-10286@https.bugzilla.kernel.org/>
2010-03-23 14:22 ` [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance Andrew Morton
2010-03-23 14:22   ` Andrew Morton
2010-03-23 17:34   ` Ingo Molnar
2010-03-23 17:34     ` Ingo Molnar
2010-03-23 17:45     ` Linus Torvalds
2010-03-23 17:45       ` Linus Torvalds
2010-03-23 17:57       ` Anton Starikov
2010-03-23 17:57         ` Anton Starikov
2010-03-23 18:00       ` Ingo Molnar
2010-03-23 18:00         ` Ingo Molnar
2010-03-23 18:03         ` Anton Starikov
2010-03-23 18:03           ` Anton Starikov
2010-03-23 18:21           ` Andrew Morton
2010-03-23 18:21             ` Andrew Morton
2010-03-23 18:25             ` Anton Starikov
2010-03-23 18:25               ` Anton Starikov
2010-03-23 19:22               ` Robin Holt
2010-03-23 19:22                 ` Robin Holt
2010-03-23 19:30                 ` Anton Starikov
2010-03-23 19:30                   ` Anton Starikov
2010-03-23 19:49                   ` Robin Holt
2010-03-23 19:49                     ` Robin Holt
2010-03-23 19:57                     ` Robin Holt
2010-03-23 19:57                       ` Robin Holt
2010-03-23 19:50                 ` Anton Starikov
2010-03-23 19:50                   ` Anton Starikov
2010-03-23 19:52             ` Linus Torvalds
2010-03-23 19:52               ` Linus Torvalds
2010-03-24 16:40           ` Roland Dreier
2010-03-24 16:40             ` Roland Dreier
2010-03-26  3:24             ` Anton Starikov
2010-03-26  3:24               ` Anton Starikov
2010-03-23 19:14       ` Anton Starikov
2010-03-23 19:14         ` Anton Starikov
2010-03-23 19:17         ` Peter Zijlstra
2010-03-23 19:17           ` Peter Zijlstra
2010-03-23 19:42           ` Anton Starikov
2010-03-23 19:54         ` Linus Torvalds
2010-03-23 19:54           ` Linus Torvalds
2010-03-23 20:43           ` Anton Starikov
2010-03-23 20:43             ` Anton Starikov
2010-03-23 23:04             ` Linus Torvalds
2010-03-23 23:04               ` Linus Torvalds
2010-03-23 23:19               ` Anton Starikov
2010-03-23 23:19                 ` Anton Starikov
2010-03-23 23:36               ` Ingo Molnar
2010-03-23 23:36                 ` Ingo Molnar
2010-03-23 23:55                 ` Linus Torvalds
2010-03-23 23:55                   ` Linus Torvalds
2010-03-24  0:03                   ` Anton Starikov
2010-03-24  0:03                     ` Anton Starikov
2010-03-24  2:15                   ` Andi Kleen
2010-03-24  2:15                     ` Andi Kleen
2010-03-24  3:00                     ` Linus Torvalds
2010-03-24  3:00                       ` Linus Torvalds
2010-04-19 18:19                       ` Greg KH
2010-04-19 18:19                         ` Greg KH
2010-03-23 18:13     ` Andrew Morton
2010-03-23 18:13       ` Andrew Morton
2010-03-23 18:19       ` Anton Starikov
2010-03-23 18:19         ` Anton Starikov
2010-03-23 18:27       ` Ingo Molnar
2010-03-23 18:27         ` Ingo Molnar
2010-03-23 21:19       ` Anton Starikov
2010-03-23 21:19         ` Anton Starikov
2010-04-02 18:57   ` Lee Schermerhorn
