linux-kernel.vger.kernel.org archive mirror
* Mainline kernel OLTP performance update
@ 2009-01-13 21:10 Ma, Chinang
  2009-01-13 22:44 ` Wilcox, Matthew R
  0 siblings, 1 reply; 122+ messages in thread
From: Ma, Chinang @ 2009-01-13 21:10 UTC
  To: linux-kernel
  Cc: Tripathi, Sharad C, arjan, Wilcox, Matthew R, Kleen, Andi,
	Siddha, Suresh B, Chilukuri, Harita, Styner, Douglas W, Wang,
	Peter Xihong, Nueckel, Hubert, Chris Mason

These are the latest OLTP performance results for the 2.6.29-rc1 kernel. Compared to 2.6.24.2, the regression is around 3.5%.

Linux OLTP Performance summary
Kernel#            Speedup(x)   Intr/s  CtxSw/s us%  sys%   idle%  iowait%
2.6.24.2                1.000   21969   43425   76   24     0      0
2.6.27.2                0.973   30402   43523   74   25     0      1
2.6.29-rc1              0.965   30331   41970   74   26     0      0
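
The regression figure follows from the Speedup(x) column: a speedup of 0.965 relative to the 2.6.24.2 baseline corresponds to a regression of about 3.5%. A minimal Python sketch of that arithmetic, using the values from the table above (the name `regression_pct` is illustrative and not part of the original report):

```python
def regression_pct(speedup: float) -> float:
    """Percentage regression relative to a baseline whose speedup is 1.000."""
    return (1.0 - speedup) * 100.0

# Speedup(x) values copied from the summary table above.
speedups = {"2.6.24.2": 1.000, "2.6.27.2": 0.973, "2.6.29-rc1": 0.965}

for kernel, speedup in speedups.items():
    print(f"{kernel:12s} {regression_pct(speedup):.1f}%")
```

Because the table rounds the speedup to three decimals, the result is only good to about a tenth of a percent.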

Server configurations:
Intel Xeon Quad-core 2.0GHz  2 cpus/8 cores/8 threads
64GB memory, 3 qle2462 FC HBA, 450 spindles (30 logical units)

======oprofile CPU_CLK_UNHALTED for top 30 functions
Cycles% 2.6.24.2                   Cycles% 2.6.27.2
1.0500 qla24xx_start_scsi          1.2125 qla24xx_start_scsi
0.8089 schedule                    0.6962 kmem_cache_alloc
0.5864 kmem_cache_alloc            0.6209 qla24xx_intr_handler
0.4989 __blockdev_direct_IO        0.4895 copy_user_generic_string
0.4152 copy_user_generic_string    0.4591 __blockdev_direct_IO
0.3953 qla24xx_intr_handler        0.4409 __end_that_request_first
0.3596 scsi_request_fn             0.3729 __switch_to
0.3188 __switch_to                 0.3716 try_to_wake_up
0.2889 lock_timer_base             0.3531 lock_timer_base
0.2519 task_rq_lock                0.3393 scsi_request_fn
0.2474 aio_complete                0.3038 aio_complete
0.2460 scsi_alloc_sgtable          0.2989 memset_c
0.2445 generic_make_request        0.2633 qla2x00_process_completed_re
0.2263 qla2x00_process_completed_re 0.2583 pick_next_highest_task_rt
0.2118 blk_queue_end_tag           0.2578 generic_make_request
0.2085 dio_bio_complete            0.2510 __list_add
0.2021 e1000_xmit_frame            0.2459 task_rq_lock
0.2006 __end_that_request_first    0.2322 kmem_cache_free
0.1954 generic_file_aio_read       0.2206 blk_queue_end_tag
0.1949 kfree                       0.2205 __mod_timer
0.1915 tcp_sendmsg                 0.2179 update_curr_rt
0.1901 try_to_wake_up              0.2164 sd_prep_fn
0.1895 kref_get                    0.2130 kref_get
0.1864 __mod_timer                 0.2075 dio_bio_complete
0.1863 thread_return               0.2066 push_rt_task
0.1854 math_state_restore          0.1974 qla24xx_msix_default
0.1775 __list_add                  0.1935 generic_file_aio_read
0.1721 memset_c                    0.1870 scsi_device_unbusy
0.1706 find_vma                    0.1861 tcp_sendmsg
0.1688 read_tsc                    0.1843 e1000_xmit_frame
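
One way to read a side-by-side profile like this is to pair up the symbols that appear in both columns and look at how their share of cycles changed. The Python sketch below is only an illustration under that assumption: the helper names `parse_profile` and `deltas` are hypothetical, and the three sample rows are taken from the table above.

```python
import re

# One row: "<pct> <symbol>   <pct> <symbol>". \s* (not \s+) also accepts rows
# where a truncated symbol name runs straight into the second percentage.
ROW = re.compile(r"^(\d+\.\d+) (\S+?)\s*(\d+\.\d+) (\S+)$")

def parse_profile(lines):
    """Return (left, right) dicts mapping symbol -> Cycles% for each kernel."""
    left, right = {}, {}
    for line in lines:
        match = ROW.match(line.strip())
        if match is None:
            continue  # skip headers and blank lines
        l_pct, l_sym, r_pct, r_sym = match.groups()
        left[l_sym] = float(l_pct)
        right[r_sym] = float(r_pct)
    return left, right

def deltas(left, right):
    """Change in Cycles% for symbols listed in both columns, largest increase first."""
    common = left.keys() & right.keys()
    changes = [(sym, right[sym] - left[sym]) for sym in common]
    return sorted(changes, key=lambda item: item[1], reverse=True)

# Sample rows from the 2.6.24.2 vs 2.6.27.2 table.
rows = [
    "1.0500 qla24xx_start_scsi          1.2125 qla24xx_start_scsi",
    "0.5864 kmem_cache_alloc            0.6209 qla24xx_intr_handler",
    "0.3953 qla24xx_intr_handler        0.4409 __end_that_request_first",
]

left, right = parse_profile(rows)
for sym, change in deltas(left, right):
    print(f"{sym:28s} {change:+.4f}")
```

Symbols that appear in only one top-30 list are left out on purpose: dropping off the list does not mean a function consumed zero cycles, so no delta can be computed for it from this data.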

======oprofile CPU_CLK_UNHALTED for top 30 functions
Cycles% 2.6.24.2                   Cycles% 2.6.29-rc1
1.0500 qla24xx_start_scsi          1.0691 qla24xx_intr_handler
0.8089 schedule                    0.7701 copy_user_generic_string
0.5864 kmem_cache_alloc            0.7339 qla24xx_wrt_req_reg
0.4989 __blockdev_direct_IO        0.6458 kmem_cache_alloc
0.4152 copy_user_generic_string    0.5794 qla24xx_start_scsi
0.3953 qla24xx_intr_handler        0.5505 unmap_vmas
0.3596 scsi_request_fn             0.4869 __blockdev_direct_IO
0.3188 __switch_to                 0.4493 try_to_wake_up
0.2889 lock_timer_base             0.4291 scsi_request_fn
0.2519 task_rq_lock                0.4118 clear_page_c
0.2474 aio_complete                0.4002 __switch_to
0.2460 scsi_alloc_sgtable          0.3381 ring_buffer_consume
0.2445 generic_make_request        0.3366 rb_get_reader_page
0.2263 qla2x00_process_completed_re 0.3222 aio_complete
0.2118 blk_queue_end_tag           0.3135 memset_c
0.2085 dio_bio_complete            0.2875 __list_add
0.2021 e1000_xmit_frame            0.2673 task_rq_lock
0.2006 __end_that_request_first    0.2658 __end_that_request_first
0.1954 generic_file_aio_read       0.2615 qla2x00_process_completed_re
0.1949 kfree                       0.2615 lock_timer_base
0.1915 tcp_sendmsg                 0.2456 disk_map_sector_rcu
0.1901 try_to_wake_up              0.2427 tcp_sendmsg
0.1895 kref_get                    0.2413 e1000_xmit_frame
0.1864 __mod_timer                 0.2398 kmem_cache_free
0.1863 thread_return               0.2384 pick_next_highest_task_rt
0.1854 math_state_restore          0.2225 blk_queue_end_tag
0.1775 __list_add                  0.2211 sd_prep_fn
0.1721 memset_c                    0.2167 qla24xx_queuecommand
0.1706 find_vma                    0.2109 scsi_device_unbusy
0.1688 read_tsc                    0.2095 kref_get


* Mainline kernel OLTP performance update
@ 2010-01-25 18:26 Ma, Chinang
  0 siblings, 0 replies; 122+ messages in thread
From: Ma, Chinang @ 2010-01-25 18:26 UTC
  To: linux-kernel
  Cc: arjan, Wilcox, Matthew R, Chris Mason, Kleen, Andi, Garg, Anil K,
	Prickett, Terry O

Here is an OLTP performance summary comparing 2.6.33-rc4 to the Red Hat EL5.4 release. Both kernels were compiled using the same EL5.4 .config to minimize configuration differences.

Compared to the RHEL 5.4 baseline, the 2.6.33-rc4 kernel shows an OLTP performance regression of around 0.8%.

Linux OLTP Performance summary
Kernel#            Speedup(x)   Intr/s  CtxSw/s us%     sys%  idle%  iowait%
2.6.18-164.el5(RHEL5.4) 1.000   144080  181307  68      28      1       4
2.6.32.1                0.983   248305  174940  67      32      0       1
2.6.33-rc4              0.992   221354  180750  68      30      0       2

Hardware configuration
NHM-EP 2.93GHz 2 sockets/8 cores/16 threads
72GB memory
4x LSI 3801SAS + 2x QLA2300, 192 SSDs + 28 spindles log

======oprofile CPU_CLK_UNHALTED for top 30 functions
Cycles% 2.6.18-164.el5             Cycles% 2.6.33-rc4
70.5642 <dbms>                     69.4350 <dbms>
1.4696 mpt_interrupt               0.9540 mpt_interrupt
0.9001 kmem_cache_free             0.7649 scsi_request_fn
0.7807 schedule                    0.6990 __blockdev_direct_IO
0.7769 __blockdev_direct_IO        0.6729 schedule
0.6053 scsi_request_fn             0.6721 kmem_cache_alloc
0.5003 kmem_cache_alloc            0.6245 kmem_cache_free
0.4355 kmem_cache_zalloc           0.4556 pick_next_highest_task_rt
0.4090 list_del                    0.3538 __switch_to
0.3570 gup_huge_pmd                0.3397 memmove
0.3452 __switch_to                 0.3339 rb_get_reader_page
0.3399 kfree                       0.3318 try_to_wake_up
0.3371 task_rq_lock                0.3309 sd_prep_fn
0.3173 __sigsetjmp                 0.3219 list_del
0.3153 memmove                     0.3217 ring_buffer_consume
0.2869 lock_timer_base             0.3100 kfree
0.2851 generic_make_request        0.3085 __sigsetjmp
0.2613 scsi_get_command            0.3073 mptscsih_qcmd
0.2599 __generic_file_aio_read     0.2810 scsi_device_unbusy
0.2567 fget_light                  0.2727 generic_make_request
0.2434 mptscsih_io_done            0.2510 touch_atime
0.2413 touch_atime                 0.2480 generic_file_aio_read
0.2380 get_request                 0.2453 memset_c
0.2282 try_to_wake_up              0.2448 fget_light
0.2248 mptscsih_qcmd               0.2218 dequeue_rt_stack
0.2196 sd_init_command             0.2140 sys_io_submit
0.2035 device_not_available        0.2063 scsi_dispatch_cmd
0.2032 elv_queue_empty             0.2027 _setjmp
0.2007 __errno_location            0.1996 mptscsih_io_done
0.2006 math_state_restore          0.1973 gup_huge_pmd
0.2003 _setjmp                     0.1908 __list_add
0.1995 kref_get                    0.1880 submit_page_section
0.1979 mempool_alloc               0.1879 task_rq_lock
0.1965 scsi_prep_fn                0.1856 __errno_location








* Mainline kernel OLTP performance update
@ 2009-05-04 15:54 Styner, Douglas W
  2009-05-06  6:29 ` Anirban Chakraborty
  0 siblings, 1 reply; 122+ messages in thread
From: Styner, Douglas W @ 2009-05-04 15:54 UTC
  To: linux-kernel
  Cc: Tripathi, Sharad C, arjan, Wilcox, Matthew R, Kleen, Andi,
	Siddha, Suresh B, Ma, Chinang, Wang, Peter Xihong, Nueckel,
	Hubert, Recalde, Luis F, Nelson, Doug, Cheng, Wu-sun, Prickett,
	Terry O, Shunmuganathan, Rajalakshmi, Garg, Anil K, Chilukuri,
	Harita, chris.mason

<this time with subject line>
Summary: Measured the mainline kernel from kernel.org (2.6.30-rc4). 

The regression for 2.6.30-rc4 against the 2.6.24.2 baseline is 2.15% (the 2.6.30-rc3 regression was 1.91%). Oprofile reports 70.1204% user, 29.874% system.

Linux OLTP Performance summary
Kernel#            Speedup(x)   Intr/s  CtxSw/s us%     sys%    idle%   iowait%
2.6.24.2                1.000   22106   43709   75      24      0       0
2.6.30-rc4              0.978   30581   43034   75      25      0       0

Server configurations:
Intel Xeon Quad-core 2.0GHz  2 cpus/8 cores/8 threads
64GB memory, 3 qle2462 FC HBA, 450 spindles (30 logical units)


======oprofile CPU_CLK_UNHALTED for top 30 functions
Cycles% 2.6.24.2                   Cycles% 2.6.30-rc4
74.8578 <database>                 67.8732 <database>
1.0500 qla24xx_start_scsi          1.1162 qla24xx_start_scsi
0.8089 schedule                    0.9888 qla24xx_intr_handler
0.5864 kmem_cache_alloc            0.8776 __schedule
0.4989 __blockdev_direct_IO        0.7401 kmem_cache_alloc
0.4357 __sigsetjmp                 0.4914 read_hpet
0.4152 copy_user_generic_string    0.4792 __sigsetjmp
0.3953 qla24xx_intr_handler        0.4368 __blockdev_direct_IO
0.3850 memcpy                      0.3822 task_rq_lock
0.3596 scsi_request_fn             0.3781 __switch_to
0.3188 __switch_to                 0.3620 __list_add
0.2889 lock_timer_base             0.3377 rb_get_reader_page
0.2750 memmove                     0.3336 copy_user_generic_string
0.2519 task_rq_lock                0.3195 try_to_wake_up
0.2474 aio_complete                0.3114 scsi_request_fn
0.2460 scsi_alloc_sgtable          0.3114 ring_buffer_consume
0.2445 generic_make_request        0.2932 aio_complete
0.2263 qla2x00_process_completed_re 0.2730 lock_timer_base
0.2118 blk_queue_end_tag           0.2588 memset_c
0.2085 dio_bio_complete            0.2588 mod_timer
0.2021 e1000_xmit_frame            0.2447 generic_make_request
0.2006 __end_that_request_first    0.2426 qla2x00_process_completed_re
0.1954 generic_file_aio_read       0.2265 tcp_sendmsg
0.1949 kfree                       0.2184 memmove
0.1915 tcp_sendmsg                 0.2184 kfree
0.1901 try_to_wake_up              0.2103 scsi_device_unbusy
0.1895 kref_get                    0.2083 mempool_free
0.1864 __mod_timer                 0.1961 blk_queue_end_tag
0.1863 thread_return               0.1941 kmem_cache_free
0.1854 math_state_restore          0.1921 kref_get

* Mainline kernel OLTP performance update
@ 2009-04-28 17:22 Styner, Douglas W
  0 siblings, 0 replies; 122+ messages in thread
From: Styner, Douglas W @ 2009-04-28 17:22 UTC
  To: linux-kernel
  Cc: Tripathi, Sharad C, arjan, Wilcox, Matthew R, Kleen, Andi,
	Siddha, Suresh B, Ma, Chinang, Styner, Douglas W, Wang,
	Peter Xihong, Nueckel, Hubert, Recalde, Luis F, Nelson, Doug,
	Cheng, Wu-sun, Prickett, Terry O, Shunmuganathan, Rajalakshmi,
	Garg, Anil K, Chilukuri, Harita, chris.mason

Summary: Measured the mainline kernel from kernel.org (2.6.29.2). 

The regression for 2.6.29.2 against the 2.6.24.2 baseline is 2.07% (the 2.6.29.1 regression was 2.35%). Oprofile reports 70.419% user, 29.5709% system. A 2.6.29.1 -> 2.6.29.2 comparison follows the baseline comparison below.

Linux OLTP Performance summary
Kernel#            Speedup(x)   Intr/s  CtxSw/s us%     sys%    idle%   iowait%
2.6.24.2                1.000   22106   43709   75      24      0       0
2.6.29.2                0.979   30509   43139   75      25      0       0

Server configurations:
Intel Xeon Quad-core 2.0GHz  2 cpus/8 cores/8 threads
64GB memory, 3 qle2462 FC HBA, 450 spindles (30 logical units)

======oprofile CPU_CLK_UNHALTED for top 30 functions
Cycles% 2.6.24.2                   Cycles% 2.6.29.2
74.8578 <database>                 67.7080 <database>
1.0500 qla24xx_start_scsi          0.9487 qla24xx_intr_handler
0.8089 schedule                    0.8117 schedule
0.5864 kmem_cache_alloc            0.6215 qla24xx_wrt_req_reg
0.4989 __blockdev_direct_IO        0.5439 kmem_cache_alloc
0.4357 __sigsetjmp                 0.4784 qla24xx_start_scsi
0.4152 copy_user_generic_string    0.4703 __blockdev_direct_IO
0.3953 qla24xx_intr_handler        0.4416 try_to_wake_up
0.3850 memcpy                      0.4253 __sigsetjmp
0.3596 scsi_request_fn             0.3803 scsi_request_fn
0.3188 __switch_to                 0.3701 __switch_to
0.2889 lock_timer_base             0.3619 copy_user_generic_string
0.2750 memmove                     0.3476 rb_get_reader_page
0.2519 task_rq_lock                0.3149 symbols)
0.2474 aio_complete                0.3006 aio_complete
0.2460 scsi_alloc_sgtable          0.2903 memset_c
0.2445 generic_make_request        0.2883 ring_buffer_consume
0.2263 qla2x00_process_completed_re 0.2719 lock_timer_base
0.2118 blk_queue_end_tag           0.2699 __list_add
0.2085 dio_bio_complete            0.2474 blk_queue_end_tag
0.2021 e1000_xmit_frame            0.2453 memmove
0.2006 __end_that_request_first    0.2392 e1000_xmit_frame
0.1954 generic_file_aio_read       0.2290 ipc_lock
0.1949 kfree                       0.2290 task_rq_lock
0.1915 tcp_sendmsg                 0.2249 generic_make_request
0.1901 try_to_wake_up              0.2167 kref_get
0.1895 kref_get                    0.2147 tcp_sendmsg
0.1864 __mod_timer                 0.2045 qla2x00_process_completed_re
0.1863 thread_return               0.2045 pick_next_highest_task_rt

-- 2.6.29.1 vs. 2.6.29.2
Linux OLTP Performance summary
Kernel#            Speedup(x)   Intr/s  CtxSw/s us%     sys%    idle%   iowait%
2.6.29.1                1.000   30570   42818   74      25      0       0
2.6.29.2                1.003   30509   43139   75      25      0       0
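
The 1.003 figure is consistent with the two regressions quoted against 2.6.24.2 in the summary of this message: rebasing 2.6.29.2 onto 2.6.29.1 amounts to dividing the two baseline-relative speedups. A small Python sketch of that check, assuming the quoted 2.35% and 2.07% regressions (the helper names are illustrative only):

```python
def speedup_from_regression(regression_pct: float) -> float:
    """Convert a percentage regression into a speedup relative to the baseline."""
    return 1.0 - regression_pct / 100.0

def rebase(speedup: float, new_baseline_speedup: float) -> float:
    """Express a speedup relative to a different kernel measured on the same baseline."""
    return speedup / new_baseline_speedup

s_29_1 = speedup_from_regression(2.35)  # about 0.9765 vs 2.6.24.2
s_29_2 = speedup_from_regression(2.07)  # about 0.9793 vs 2.6.24.2

print(f"2.6.29.2 relative to 2.6.29.1: {rebase(s_29_2, s_29_1):.3f}")
```

The check only works because both kernels were measured against the same 2.6.24.2 baseline run.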

Server configurations:
Intel Xeon Quad-core 2.0GHz  2 cpus/8 cores/8 threads
64GB memory, 3 qle2462 FC HBA, 450 spindles (30 logical units)

======oprofile CPU_CLK_UNHALTED for top 30 functions
Cycles% 2.6.29.1                   Cycles% 2.6.29.2
64.5424 <database>                 67.7080 <database>
1.1571 qla24xx_intr_handler        0.9487 qla24xx_intr_handler
0.9209 schedule                    0.8117 schedule
0.6533 kmem_cache_alloc            0.6215 qla24xx_wrt_req_reg
0.5456 qla24xx_wrt_req_reg         0.5439 kmem_cache_alloc
0.5247 try_to_wake_up              0.4784 qla24xx_start_scsi
0.4858 qla24xx_start_scsi          0.4703 __blockdev_direct_IO
0.4485 __sigsetjmp                 0.4416 try_to_wake_up
0.3976 __blockdev_direct_IO        0.4253 __sigsetjmp
0.3857 __switch_to                 0.3803 scsi_request_fn
0.3692 copy_user_generic_string    0.3701 __switch_to
0.3648 aio_complete                0.3619 copy_user_generic_string
0.3633 scsi_request_fn             0.3476 rb_get_reader_page
0.3259 rb_get_reader_page          0.3149 symbols)
0.3109 ring_buffer_consume         0.3006 aio_complete
0.3050 memset_c                    0.2903 memset_c
0.2900 pick_next_highest_task_rt   0.2883 ring_buffer_consume
0.2885 page_fault                  0.2719 lock_timer_base
0.2855 task_rq_lock                0.2699 __list_add
0.2691 mwait_idle                  0.2474 blk_queue_end_tag
0.2661 lock_timer_base             0.2453 memmove
0.2616 symbols)                    0.2392 e1000_xmit_frame
0.2616 __list_add                  0.2290 ipc_lock
0.2541 tcp_sendmsg                 0.2290 task_rq_lock
0.2302 blk_queue_end_tag           0.2249 generic_make_request
0.2242 e1000_xmit_frame            0.2167 kref_get
0.2198 scsi_softirq_done           0.2147 tcp_sendmsg
0.2183 qla2x00_process_completed_re 0.2045 qla2x00_process_completed_re
0.2168 memmove                     0.2045 pick_next_highest_task_rt
0.2138 cpupri_set                  0.2024 __mod_timer
0.2078 qla24xx_process_response_que 0.2004 kmem_cache_free

* Mainline kernel OLTP performance update
@ 2009-04-28 17:08 Styner, Douglas W
  2009-04-29  7:29 ` Andrew Morton
  0 siblings, 1 reply; 122+ messages in thread
From: Styner, Douglas W @ 2009-04-28 17:08 UTC
  To: linux-kernel
  Cc: Tripathi, Sharad C, arjan, Wilcox, Matthew R, Kleen, Andi,
	Siddha, Suresh B, Ma, Chinang, Styner, Douglas W, Wang,
	Peter Xihong, Nueckel, Hubert, Recalde, Luis F, Nelson, Doug,
	Cheng, Wu-sun, Prickett, Terry O, Shunmuganathan, Rajalakshmi,
	Garg, Anil K, Chilukuri, Harita, chris.mason

Summary: Measured the mainline kernel from kernel.org (2.6.30-rc3). 

The regression for 2.6.30-rc3 against the 2.6.24.2 baseline is 1.91%. Oprofile reports 71.1626% user, 28.8295% system.

Linux OLTP Performance summary
Kernel#            Speedup(x)   Intr/s  CtxSw/s us%     sys%    idle%   iowait%
2.6.24.2                1.000   22106   43709   75      24      0       0
2.6.30-rc3              0.981   30645   43027   75      25      0       0

Server configurations:
Intel Xeon Quad-core 2.0GHz  2 cpus/8 cores/8 threads
64GB memory, 3 qle2462 FC HBA, 450 spindles (30 logical units)


======oprofile CPU_CLK_UNHALTED for top 30 functions
Cycles% 2.6.24.2                   Cycles% 2.6.30-rc3
74.8578 <database>                 69.1925 <database>
1.0500 qla24xx_start_scsi          1.1314 qla24xx_intr_handler
0.8089 schedule                    1.0031 qla24xx_start_scsi
0.5864 kmem_cache_alloc            0.8476 __schedule
0.4989 __blockdev_direct_IO        0.6532 kmem_cache_alloc
0.4357 __sigsetjmp                 0.4490 __blockdev_direct_IO
0.4152 copy_user_generic_string    0.4199 __sigsetjmp
0.3953 qla24xx_intr_handler        0.3946 __switch_to
0.3850 memcpy                      0.3538 __list_add
0.3596 scsi_request_fn             0.3499 task_rq_lock
0.3188 __switch_to                 0.3402 scsi_request_fn
0.2889 lock_timer_base             0.3382 rb_get_reader_page
0.2750 memmove                     0.3363 copy_user_generic_string
0.2519 task_rq_lock                0.3324 aio_complete
0.2474 aio_complete                0.3110 try_to_wake_up
0.2460 scsi_alloc_sgtable          0.2877 ring_buffer_consume
0.2445 generic_make_request        0.2683 mod_timer
0.2263 qla2x00_process_completed_re 0.2605 qla2x00_process_completed_re
0.2118 blk_queue_end_tag           0.2566 blk_queue_end_tag
0.2085 dio_bio_complete            0.2566 generic_make_request
0.2021 e1000_xmit_frame            0.2547 tcp_sendmsg
0.2006 __end_that_request_first    0.2372 lock_timer_base
0.1954 generic_file_aio_read       0.2333 memmove
0.1949 kfree                       0.2294 memset_c
0.1915 tcp_sendmsg                 0.2080 mempool_free
0.1901 try_to_wake_up              0.2022 generic_file_aio_read
0.1895 kref_get                    0.1963 scsi_device_unbusy
0.1864 __mod_timer                 0.1963 plist_del
0.1863 thread_return               0.1944 dequeue_rt_stack
0.1854 math_state_restore          0.1924 e1000_xmit_frame

Thanks
Doug

* Mainline kernel OLTP performance update
@ 2009-04-23 16:49 Styner, Douglas W
  2009-04-27  7:02 ` Andi Kleen
  0 siblings, 1 reply; 122+ messages in thread
From: Styner, Douglas W @ 2009-04-23 16:49 UTC
  To: linux-kernel


Summary: Measured the mainline kernel from kernel.org (2.6.30-rc2). 

The regression for 2.6.30-rc2 against the 2.6.24.2 baseline is 1.95%. Note the dip in cycles for <database> in the oprofile data relative to us% in the summary.

Linux OLTP Performance summary
Kernel#            Speedup(x)   Intr/s  CtxSw/s us%     sys%    idle%   iowait%
2.6.24.2                1.000   22106   43709   75      24      0       0
2.6.30-rc2              0.981   30755   43072   75      25      0       0

Server configurations:
Intel Xeon Quad-core 2.0GHz  2 cpus/8 cores/8 threads
64GB memory, 3 qle2462 FC HBA, 450 spindles (30 logical units)


======oprofile 0.9.3 CPU_CLK_UNHALTED for top 30 functions
Cycles% 2.6.24.2                   Cycles% 2.6.30-rc2
74.8578 <database>                 67.6966 <database>
1.0500 qla24xx_start_scsi          1.1724 qla24xx_start_scsi
0.8089 schedule                    1.0578 qla24xx_intr_handler
0.5864 kmem_cache_alloc            0.8259 __schedule
0.4989 __blockdev_direct_IO        0.7451 kmem_cache_alloc
0.4357 __sigsetjmp                 0.4872 __blockdev_direct_IO
0.4152 copy_user_generic_string    0.4390 task_rq_lock
0.3953 qla24xx_intr_handler        0.4338 __sigsetjmp
0.3850 memcpy                      0.4195 __switch_to
0.3596 scsi_request_fn             0.3713 copy_user_generic_string
0.3188 __switch_to                 0.3608 __list_add
0.2889 lock_timer_base             0.3595 rb_get_reader_page
0.2750 memmove                     0.3309 ring_buffer_consume
0.2519 task_rq_lock                0.3152 scsi_request_fn
0.2474 aio_complete                0.3048 try_to_wake_up
0.2460 scsi_alloc_sgtable          0.2983 tcp_sendmsg
0.2445 generic_make_request        0.2931 lock_timer_base
0.2263 qla2x00_process_completed_re 0.2840 aio_complete
0.2118 blk_queue_end_tag           0.2697 memset_c
0.2085 dio_bio_complete            0.2527 mod_timer
0.2021 e1000_xmit_frame            0.2462 qla2x00_process_completed_re
0.2006 __end_that_request_first    0.2449 memmove
0.1954 generic_file_aio_read       0.2358 blk_queue_end_tag
0.1949 kfree                       0.2241 generic_make_request
0.1915 tcp_sendmsg                 0.2215 scsi_device_unbusy
0.1901 try_to_wake_up              0.2162 mempool_free
0.1895 kref_get                    0.2097 e1000_xmit_frame
0.1864 __mod_timer                 0.2097 kmem_cache_free
0.1863 thread_return               0.2058 kfree
0.1854 math_state_restore          0.1993 sched_clock_cpu
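
The dip mentioned in the summary can be made concrete by putting the two measurements side by side: us% is unchanged between the kernels, while the <database> share of sampled cycles falls by roughly seven points. A short Python sketch with the values from the two tables in this message (variable and function names are illustrative):

```python
# Values copied from the summary table and the oprofile table in this message.
user_pct = {"2.6.24.2": 75, "2.6.30-rc2": 75}                    # us% column
database_cycles = {"2.6.24.2": 74.8578, "2.6.30-rc2": 67.6966}   # <database> Cycles%

def point_change(series, base, new):
    """Difference in percentage points between two kernels."""
    return series[new] - series[base]

us_delta = point_change(user_pct, "2.6.24.2", "2.6.30-rc2")
db_delta = point_change(database_cycles, "2.6.24.2", "2.6.30-rc2")

print(f"us% change:               {us_delta:+d} points")
print(f"<database> cycles change: {db_delta:+.4f} points")
```

The two numbers come from different tools, so the sketch only quantifies the gap; it does not explain where the missing cycles went.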
 
Thanks,
Doug

* Mainline kernel OLTP performance update
@ 2009-01-12 18:30 Ma, Chinang
  0 siblings, 0 replies; 122+ messages in thread
From: Ma, Chinang @ 2009-01-12 18:30 UTC
  To: linux-kernel
  Cc: Tripathi, Sharad C, arjan, Wilcox, Matthew R, Kleen, Andi,
	Siddha, Suresh B, Chilukuri, Harita, Styner, Douglas W, Wang,
	Peter Xihong, Nueckel, Hubert, Chris Mason

Here is the latest OLTP performance for the 2.6.28 kernel, compared to 2.6.24.2 and 2.6.27.2.

Linux OLTP Performance summary
Kernel#            Speedup(x)   Intr/s  CtxSw/s us%  sys%  idle% iowait%
2.6.24.2                1.000   21969   43425   76   24    0     0
2.6.27.2                0.973   30402   43523   74   25    0     1
2.6.28                  0.967   30400   42640   74   25    0     0

Server configurations:
Intel Xeon Quad-core 2.0GHz  2 cpus/8 cores/8 threads
64GB memory, 3 qle2462 FC HBA, 450 spindles (30 logical units)

======oprofile CPU_CLK_UNHALTED for top 30 functions
Cycles% 2.6.24.2                   Cycles% 2.6.27.2
1.0500 qla24xx_start_scsi          1.2125 qla24xx_start_scsi
0.8089 schedule                    0.6962 kmem_cache_alloc
0.5864 kmem_cache_alloc            0.6209 qla24xx_intr_handler
0.4989 __blockdev_direct_IO        0.4895 copy_user_generic_string
0.4152 copy_user_generic_string    0.4591 __blockdev_direct_IO
0.3953 qla24xx_intr_handler        0.4409 __end_that_request_first
0.3596 scsi_request_fn             0.3729 __switch_to
0.3188 __switch_to                 0.3716 try_to_wake_up
0.2889 lock_timer_base             0.3531 lock_timer_base
0.2519 task_rq_lock                0.3393 scsi_request_fn
0.2474 aio_complete                0.3038 aio_complete
0.2460 scsi_alloc_sgtable          0.2989 memset_c
0.2445 generic_make_request        0.2633 qla2x00_process_completed_re
0.2263 qla2x00_process_completed_re 0.2583 pick_next_highest_task_rt
0.2118 blk_queue_end_tag           0.2578 generic_make_request
0.2085 dio_bio_complete            0.2510 __list_add
0.2021 e1000_xmit_frame            0.2459 task_rq_lock
0.2006 __end_that_request_first    0.2322 kmem_cache_free
0.1954 generic_file_aio_read       0.2206 blk_queue_end_tag
0.1949 kfree                       0.2205 __mod_timer
0.1915 tcp_sendmsg                 0.2179 update_curr_rt
0.1901 try_to_wake_up              0.2164 sd_prep_fn
0.1895 kref_get                    0.2130 kref_get
0.1864 __mod_timer                 0.2075 dio_bio_complete
0.1863 thread_return               0.2066 push_rt_task
0.1854 math_state_restore          0.1974 qla24xx_msix_default
0.1775 __list_add                  0.1935 generic_file_aio_read
0.1721 memset_c                    0.1870 scsi_device_unbusy
0.1706 find_vma                    0.1861 tcp_sendmsg
0.1688 read_tsc                    0.1843 e1000_xmit_frame


======oprofile CPU_CLK_UNHALTED for top 30 functions
Cycles% 2.6.27.2                   Cycles% 2.6.28
1.2125 qla24xx_start_scsi          1.4257 qla24xx_start_scsi
0.6962 kmem_cache_alloc            0.8784 kmem_cache_alloc
0.6209 qla24xx_intr_handler        0.6876 qla24xx_intr_handler
0.4895 copy_user_generic_string    0.5834 copy_user_generic_string
0.4591 __blockdev_direct_IO        0.4945 scsi_request_fn
0.4409 __end_that_request_first    0.4846 __blockdev_direct_IO
0.3729 __switch_to                 0.4187 try_to_wake_up
0.3716 try_to_wake_up              0.3518 aio_complete
0.3531 lock_timer_base             0.3513 __end_that_request_first
0.3393 scsi_request_fn             0.3483 __switch_to
0.3038 aio_complete                0.3271 memset_c
0.2989 memset_c                    0.2976 qla2x00_process_completed_re
0.2633 qla2x00_process_completed_re 0.2905 __list_add
0.2583 pick_next_highest_task_rt   0.2901 generic_make_request
0.2578 generic_make_request        0.2755 lock_timer_base
0.2510 __list_add                  0.2741 blk_queue_end_tag
0.2459 task_rq_lock                0.2593 kmem_cache_free
0.2322 kmem_cache_free             0.2445 disk_map_sector_rcu
0.2206 blk_queue_end_tag           0.2370 pick_next_highest_task_rt
0.2205 __mod_timer                 0.2323 scsi_device_unbusy
0.2179 update_curr_rt              0.2321 task_rq_lock
0.2164 sd_prep_fn                  0.2316 scsi_dispatch_cmd
0.2130 kref_get                    0.2239 kref_get
0.2075 dio_bio_complete            0.2237 dio_bio_complete
0.2066 push_rt_task                0.2194 push_rt_task
0.1974 qla24xx_msix_default        0.2145 __aio_get_req
0.1935 generic_file_aio_read       0.2143 kfree
0.1870 scsi_device_unbusy          0.2138 __mod_timer
0.1861 tcp_sendmsg                 0.2131 e1000_irq_enable
0.1843 e1000_xmit_frame            0.2091 scsi_softirq_done




Thread overview: 122+ messages
2009-01-13 21:10 Mainline kernel OLTP performance update Ma, Chinang
2009-01-13 22:44 ` Wilcox, Matthew R
2009-01-15  0:35   ` Andrew Morton
2009-01-15  1:21     ` Matthew Wilcox
2009-01-15  2:04       ` Andrew Morton
2009-01-15  2:27         ` Steven Rostedt
2009-01-15  7:11           ` Ma, Chinang
2009-01-19 18:04             ` Chris Mason
2009-01-19 18:37               ` Steven Rostedt
2009-01-19 18:55                 ` Chris Mason
2009-01-19 19:07                   ` Steven Rostedt
2009-01-19 23:40                 ` Ingo Molnar
2009-01-15  2:39         ` Andi Kleen
2009-01-15  2:47           ` Matthew Wilcox
2009-01-15  3:36             ` Andi Kleen
2009-01-20 13:27             ` Jens Axboe
     [not found]               ` <588992150B702C48B3312184F1B810AD03A497632C@azsmsx501.amr.corp.intel.com>
2009-01-22 11:29                 ` Jens Axboe
     [not found]                   ` <588992150B702C48B3312184F1B810AD03A4F59632@azsmsx501.amr.corp.intel.com>
2009-01-27  8:28                     ` Jens Axboe
2009-01-15  7:24         ` Nick Piggin
2009-01-15  9:46           ` Pekka Enberg
2009-01-15 13:52             ` Matthew Wilcox
2009-01-15 14:42               ` Pekka Enberg
2009-01-16 10:16               ` Pekka Enberg
2009-01-16 10:21                 ` Nick Piggin
2009-01-16 10:31                   ` Pekka Enberg
2009-01-16 10:42                     ` Nick Piggin
2009-01-16 10:55                       ` Pekka Enberg
2009-01-19  7:13                         ` Nick Piggin
2009-01-19  8:05                           ` Pekka Enberg
2009-01-19  8:33                             ` Nick Piggin
2009-01-19  8:42                               ` Nick Piggin
2009-01-19  8:47                                 ` Pekka Enberg
2009-01-19  8:57                                   ` Nick Piggin
2009-01-19  9:48                               ` Pekka Enberg
2009-01-19 10:03                                 ` Nick Piggin
2009-01-16 20:59                     ` Christoph Lameter
2009-01-16  0:27           ` Andrew Morton
2009-01-16  4:03             ` Nick Piggin
2009-01-16  4:12               ` Andrew Morton
2009-01-16  6:46                 ` Nick Piggin
2009-01-16  6:55                   ` Matthew Wilcox
2009-01-16  7:06                     ` Nick Piggin
2009-01-16  7:53                     ` Zhang, Yanmin
2009-01-16 10:20                       ` Andi Kleen
2009-01-20  5:16                         ` Zhang, Yanmin
2009-01-21 23:58                           ` Christoph Lameter
2009-01-22  8:36                             ` Zhang, Yanmin
2009-01-22  9:15                               ` Pekka Enberg
2009-01-22  9:28                                 ` Zhang, Yanmin
2009-01-22  9:47                                   ` Pekka Enberg
2009-01-23  3:02                                     ` Zhang, Yanmin
2009-01-23  6:52                                       ` Pekka Enberg
2009-01-23  8:06                                         ` Pekka Enberg
2009-01-23  8:30                                           ` Zhang, Yanmin
2009-01-23  8:40                                             ` Pekka Enberg
2009-01-23  9:46                                             ` Pekka Enberg
2009-01-23 15:22                                               ` Christoph Lameter
2009-01-23 15:31                                                 ` Pekka Enberg
2009-01-23 15:55                                                   ` Christoph Lameter
2009-01-23 16:01                                                     ` Pekka Enberg
2009-01-24  2:55                                                 ` Zhang, Yanmin
2009-01-24  7:36                                                   ` Pekka Enberg
2009-02-12  5:22                                                     ` Zhang, Yanmin
2009-02-12  5:47                                                       ` Zhang, Yanmin
2009-02-12 15:25                                                         ` Christoph Lameter
2009-02-12 16:07                                                           ` Pekka Enberg
2009-02-12 16:03                                                         ` Pekka Enberg
2009-01-26 17:36                                                   ` Christoph Lameter
2009-02-01  2:52                                                     ` Zhang, Yanmin
2009-01-23  8:33                                       ` Nick Piggin
2009-01-23  9:02                                         ` Zhang, Yanmin
2009-01-23 18:40                                           ` care and feeding of netperf (Re: Mainline kernel OLTP performance update) Rick Jones
2009-01-23 18:51                                             ` Grant Grundler
2009-01-24  3:03                                             ` Zhang, Yanmin
2009-01-26 18:26                                               ` Rick Jones
2009-01-16  7:00                   ` Mainline kernel OLTP performance update Andrew Morton
2009-01-16  7:25                     ` Nick Piggin
2009-01-16  8:59                     ` Nick Piggin
2009-01-16 18:11                   ` Rick Jones
2009-01-19  7:43                     ` Nick Piggin
2009-01-19 22:19                       ` Rick Jones
2009-01-15 14:12         ` James Bottomley
2009-01-15 17:44           ` Andrew Morton
2009-01-15 18:00             ` Matthew Wilcox
2009-01-15 18:14               ` Steven Rostedt
2009-01-15 18:44                 ` Gregory Haskins
2009-01-15 18:46                   ` Wilcox, Matthew R
2009-01-15 19:44                     ` Ma, Chinang
2009-01-16 18:14                       ` Gregory Haskins
2009-01-16 19:09                         ` Steven Rostedt
2009-01-20 12:45                         ` Gregory Haskins
2009-01-15 19:28                 ` Ma, Chinang
2009-01-15 16:48       ` Ma, Chinang
  -- strict thread matches above, loose matches on Subject: below --
2010-01-25 18:26 Ma, Chinang
2009-05-04 15:54 Styner, Douglas W
2009-05-06  6:29 ` Anirban Chakraborty
2009-05-06 15:53   ` Wilcox, Matthew R
2009-05-06 18:05     ` Styner, Douglas W
2009-05-06 18:12       ` Wilcox, Matthew R
2009-05-06 18:24         ` Anirban Chakraborty
2009-05-06 19:25           ` Wilcox, Matthew R
2009-05-06 18:19   ` Styner, Douglas W
2009-04-28 17:22 Styner, Douglas W
2009-04-28 17:08 Styner, Douglas W
2009-04-29  7:29 ` Andrew Morton
2009-04-29  8:28   ` Andi Kleen
2009-04-29 16:00     ` Styner, Douglas W
2009-04-29 16:06       ` Wilcox, Matthew R
2009-04-29 16:19         ` Andi Kleen
2009-04-29 15:48   ` Styner, Douglas W
2009-04-29 16:07     ` Andrew Morton
2009-04-29 16:25       ` Peter Zijlstra
2009-04-29 17:46         ` Chris Mason
2009-04-29 18:06           ` Pallipadi, Venkatesh
2009-04-29 18:25             ` Styner, Douglas W
2009-04-29 17:52         ` Styner, Douglas W
2009-04-23 16:49 Styner, Douglas W
2009-04-27  7:02 ` Andi Kleen
2009-04-28 16:57   ` Chuck Ebbert
2009-04-28 17:15     ` James Bottomley
2009-04-28 17:17       ` Styner, Douglas W
2009-01-12 18:30 Ma, Chinang
