Thank you, Dave.
I also ran raw-device and ext4 performance tests with 2.6.39. All of these tests
use an identical IO pattern (non-buffered IO, 16 IO threads, 16KB block size,
mixed random read and write, r:w=9:1):
====== raw device + 2.6.39 ======
Read 21.7GB @ 11.6k IOPS , 185MB/s, av latency of 1.37 ms/IO
Wrote 2.4GB @ 1.3k IOPS, 20MB/s, av latency of 0.095 ms/IO
Total 1.5M IOs, @ 96% <= 2ms
====== ext4 + 2.6.39 ======
Read 21.7GB @ 11.6k IOPS , 185MB/s, av latency of 1.37 ms/IO
Wrote 2.4GB @ 1.3k IOPS, 20MB/s, av latency of 0.1 ms/IO
Total 1.5M IOs, @ 96% <= 2ms
====== XFS + 2.6.39 ======
Read 6.5GB @ 3.5k iops, 55MB/s, av latency of 4.5ms/IO
Wrote 700MB @ 386 iops, 6MB/s, av latency of 0.39ms/IO
Total 460k IOs, @ 95% <= 10ms, 4ms > 50% < 10ms
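As a rough sanity check on the numbers above, the sketch below recomputes throughput as IOPS times block size and estimates the number of IOs in flight as IOPS times average latency (Little's law). The figures are copied from the three result blocks. Reading "16KB" as 16 * 1000 bytes is an assumption, and the helper names are only illustrative. If the reported averages are consistent, the in-flight estimate should come out close to the 16 IO threads in every case, which would suggest XFS is slower per IO rather than running fewer IOs concurrently.

```python
# Sanity check of the reported figures (values copied from the results above).
# Assumption: "16KB" is treated as 16 * 1000 bytes, so MB/s = IOPS * 16 / 1000.
BLOCK_KB = 16

# name -> {op: (IOPS, average latency in ms)}
results = {
    "raw":  {"read": (11600, 1.37), "write": (1300, 0.095)},
    "ext4": {"read": (11600, 1.37), "write": (1300, 0.1)},
    "xfs":  {"read": (3500, 4.5),   "write": (386, 0.39)},
}


def throughput_mb_s(iops, block_kb=BLOCK_KB):
    """Throughput implied by an IOPS figure at a fixed block size."""
    return iops * block_kb / 1000.0


def in_flight(ops):
    """Little's law: average concurrency = sum(IOPS * latency in seconds)."""
    return sum(iops * lat_ms / 1000.0 for iops, lat_ms in ops.values())


for name, ops in results.items():
    read_mb = throughput_mb_s(ops["read"][0])
    write_mb = throughput_mb_s(ops["write"][0])
    print(f"{name:5s} read {read_mb:6.1f} MB/s  write {write_mb:5.1f} MB/s  "
          f"in flight ~{in_flight(ops):.1f}")
```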
Here are the detailed test results: