linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs
@ 2010-08-04  7:35 Dominik Brodowski
  2010-08-04  8:50 ` Christoph Hellwig
  2010-08-04  9:16 ` Michael Monnerie
  0 siblings, 2 replies; 23+ messages in thread
From: Dominik Brodowski @ 2010-08-04  7:35 UTC (permalink / raw)
  To: linux-raid, xfs, linux-kernel

Hey,

on a production system I run kernel 2.6.35 and

	XFS	(rw,relatime,nobarrier)

on a

	logical volume (LV) of a volume group (VG)

consisting of five

	dm-crypt devices (cryptsetup -c aes-lrw-benbi -s 384 create)

, each of which runs on a

	md-raid1 device (mdadm --create --level=raid1 --raid-devices=2)

on two

	750 GB ATA devices.
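
For reference, a rough sketch of how such a stack gets assembled (device
names, sizes and the mapping name below are placeholders; the real setup
has five crypt-on-raid1 pairs feeding one volume group):

    # mdadm --create /dev/md0 --level=raid1 --raid-devices=2 /dev/sda1 /dev/sdb1
    # cryptsetup -c aes-lrw-benbi -s 384 create md0_crypt /dev/md0
    # pvcreate /dev/mapper/md0_crypt
    # vgcreate vg1 /dev/mapper/md0_crypt
    # lvcreate -n data -L 500G vg1
    # mkfs.xfs /dev/vg1/data
    # mount -o relatime,nobarrier /dev/vg1/data /mnt/data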


The read performance is abysmal. The ata devices can be ruled out, as hdparm
resulted in acceptable performance:
> Timing cached reads:   9444 MB in  2.00 seconds = 4733.59 MB/sec
> Timing buffered disk reads:  298 MB in  3.02 seconds =  98.73 MB/sec

How can I best track down the cause of the performance problem, 
a) without rebooting too often, and
b) without breaking up the setup specified above (production system)?

Any ideas? perf(1)? iostat(1)?
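
For instance, I could watch all layers while the slow reads run with
something like this (a rough sketch; sysstat's iostat assumed):

    # iostat -xk 2

and compare await and %util across the sd*, md* and dm-* devices. Would
that be a sensible starting point?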

Thanks & best,

	Dominik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs
  2010-08-04  7:35 How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs Dominik Brodowski
@ 2010-08-04  8:50 ` Christoph Hellwig
  2010-08-04  9:13   ` Dominik Brodowski
  2010-08-04  9:16 ` Michael Monnerie
  1 sibling, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2010-08-04  8:50 UTC (permalink / raw)
  To: Dominik Brodowski, linux-raid, xfs, linux-kernel

On Wed, Aug 04, 2010 at 09:35:46AM +0200, Dominik Brodowski wrote:
> How can I best track down the cause of the performance problem, 
> a) without rebooting too often, and
> b) without breaking up the setup specified above (production system)?

So did you just upgrade the system from an earlier kernel that did not
show these problems?  Or did no one notice them before?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs
  2010-08-04  8:50 ` Christoph Hellwig
@ 2010-08-04  9:13   ` Dominik Brodowski
  2010-08-04  9:21     ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Dominik Brodowski @ 2010-08-04  9:13 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-raid, xfs, linux-kernel

Christoph,

On Wed, Aug 04, 2010 at 04:50:39AM -0400, Christoph Hellwig wrote:
> On Wed, Aug 04, 2010 at 09:35:46AM +0200, Dominik Brodowski wrote:
> > How can I best track down the cause of the performance problem, 
> > a) without rebooting too often, and
> > b) without breaking up the setup specified above (production system)?
> 
> So did you just upgrade the system from an earlier kernel that did not
> show these problems?

No, 2.6.31 to 2.6.34 show similar behaviour.

>  Or did no one notice them before?

Well, there are some reports relating to XFS on MD or RAID, though I couldn't
find a resolution to the issues reported, e.g.

- http://kerneltrap.org/mailarchive/linux-raid/2009/10/12/6490333
- http://lkml.indiana.edu/hypermail/linux/kernel/1006.1/00099.html

However, I think we can rule out barriers, as XFS is mounted "nobarrier"
here.

Best,
	Dominik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs
  2010-08-04  7:35 How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs Dominik Brodowski
  2010-08-04  8:50 ` Christoph Hellwig
@ 2010-08-04  9:16 ` Michael Monnerie
  2010-08-04 10:25   ` Dominik Brodowski
  1 sibling, 1 reply; 23+ messages in thread
From: Michael Monnerie @ 2010-08-04  9:16 UTC (permalink / raw)
  To: Dominik Brodowski, linux-raid, xfs, linux-kernel

[-- Attachment #1: Type: Text/Plain, Size: 1047 bytes --]

On Wednesday, 4 August 2010 Dominik Brodowski wrote:
> The read performance is abysmal. The ata devices can be ruled out, as
> hdparm resulted in acceptable performance:
> > Timing cached reads:   9444 MB in  2.00 seconds = 4733.59 MB/sec
> > Timing buffered disk reads:  298 MB in  3.02 seconds =  98.73 MB/sec

Has that system been running acceptably before? If yes, what has been
changed so that performance is down now?

Or is it a new setup? Then why is it in production already?

Can you run bonnie on that system?
What does "dd if=<your device> of=/dev/null bs=1m count=1024" say?
What does "dd if=/dev/zero of=<your device> bs=1m count=1024" say?


-- 
with kind regards,
Michael Monnerie, Ing. BSc

it-management Internet Services
http://proteger.at [pronounced: Prot-e-schee]
Tel: 0660 / 415 65 31

****** Current radio interview! ******
http://www.it-podcast.at/aktuelle-sendung.html

// We currently have two houses for sale:
// http://zmi.at/langegg/
// http://zmi.at/haus2009/

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs
  2010-08-04  9:13   ` Dominik Brodowski
@ 2010-08-04  9:21     ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2010-08-04  9:21 UTC (permalink / raw)
  To: Dominik Brodowski, Christoph Hellwig, linux-raid, xfs, linux-kernel

On Wed, Aug 04, 2010 at 11:13:17AM +0200, Dominik Brodowski wrote:
> > show these problems?
> 
> No, 2.6.31 to 2.6.34 show similar behaviour.

Ok, so it's been around for a while.  Can you test the read speed of
each individual device layer by doing a large read from it, using:

	dd if=<device> of=/dev/null bs=8k iflag=direct

where device starts with the /dev/sda* device, and goes up to the MD
device, the dm-crypt device and the LV.  And yes, it's safe to read
from the device while it's otherwise mounted/used.
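
A rough sketch (device names are just placeholders, adjust to your setup)
that runs the same bounded read against every layer in one go:

    for dev in /dev/sda2 /dev/sdb2 /dev/md0 \
               /dev/mapper/md0_crypt /dev/mapper/vg1-data; do
        echo "== $dev =="
        dd if=$dev of=/dev/null bs=8k count=131072 iflag=direct 2>&1 | tail -1
    done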


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs
  2010-08-04  9:16 ` Michael Monnerie
@ 2010-08-04 10:25   ` Dominik Brodowski
  2010-08-04 11:18     ` Christoph Hellwig
  2010-08-04 20:33     ` Valdis.Kletnieks
  0 siblings, 2 replies; 23+ messages in thread
From: Dominik Brodowski @ 2010-08-04 10:25 UTC (permalink / raw)
  To: Michael Monnerie, Christoph Hellwig
  Cc: linux-raid, xfs, linux-kernel, dm-devel

Hey,

many thanks for your feedback. It seems the crypto step is the culprit:

Reading 1.1 GB with dd, iflag=direct, bs=8k:

/dev/sd*                35.3 MB/s       ( 90 %)
/dev/md*                39.1 MB/s       (100 %)
/dev/mapper/md*_crypt    3.9 MB/s       ( 10 %)
/dev/mapper/vg1-*        3.9 MB/s       ( 10 %)

The "good" news: it also happens on my notebook, even though it has a
different setup (no raid, disk -> lv/vg -> crypt). On my notebook, I'm
more than happy to test out different kernel versions, patches etc.

/dev/sd*                17.7 MB/s       (100 %)
/dev/mapper/vg1-*       16.2 MB/s       ( 92 %)
/dev/mapper/*_crypt      3.1 MB/s       ( 18 %)

On a different system, a friend of mine reported (with 2.6.33):

/dev/sd*		51.9 MB/s	(100 %)
dm-crypt		32.9 MB/s	( 64 %)

This shows that the speed drop when using dm-crypt is not always a factor of
5 to 10... Btw, it occurs both with aes-lrw-benbi and aes-cbc-essiv:sha256,
and (on my notebook) the CPU is mostly idling or waiting.

Best,
	Dominik

PS: Bonnie output:

Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version 1.03d       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
ABCABCABCABCABC 16G           60186   4 24796   4           53386   5 281.1   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  3176  16 +++++ +++  4641  28  5501  20 +++++ +++  2286   9
ABCABCABCABCABCABC,16G,,,60186,4,24796,4,,,53386,5,281.1,1,16,3176,16,+++++,+++,4641,28,5501,20,+++++,+++,2286,9


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs
  2010-08-04 10:25   ` Dominik Brodowski
@ 2010-08-04 11:18     ` Christoph Hellwig
  2010-08-04 11:24       ` Dominik Brodowski
  2010-08-04 11:53       ` Mikael Abrahamsson
  2010-08-04 20:33     ` Valdis.Kletnieks
  1 sibling, 2 replies; 23+ messages in thread
From: Christoph Hellwig @ 2010-08-04 11:18 UTC (permalink / raw)
  To: Dominik Brodowski, Michael Monnerie, Christoph Hellwig,
	linux-raid, xfs, linux-kernel, dm-devel

On Wed, Aug 04, 2010 at 12:25:26PM +0200, Dominik Brodowski wrote:
> Hey,
> 
> many thanks for your feedback. It seems the crypto step is the culprit:
> 
> Reading 1.1 GB with dd, iflag=direct, bs=8k:
> 
> /dev/sd*                35.3 MB/s       ( 90 %)
> /dev/md*                39.1 MB/s       (100 %)
> /dev/mapper/md*_crypt    3.9 MB/s       ( 10 %)
> /dev/mapper/vg1-*        3.9 MB/s       ( 10 %)
> 
> The "good" news: it also happens on my notebook, even though it has a
> different setup (no raid, disk -> lv/vg -> crypt). On my notebook, I'm
> more than happy to test out different kernel versions, patches etc.
> 
> /dev/sd*                17.7 MB/s       (100 %)
> /dev/mapper/vg1-*       16.2 MB/s       ( 92 %)
> /dev/mapper/*_crypt      3.1 MB/s       ( 18 %)

The good news is that you have it tracked down, the bad news is that
I know very little about dm-crypt.  Maybe the issue is the single
threaded decryption in dm-crypt?  Can you check how much CPU time
the dm crypt kernel thread uses?
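
Something along these lines should show it (just a sketch; the dm-crypt
threads are usually called kcryptd / kcryptd_io):

    # ps -eo pid,comm,time,pcpu | grep -i crypt

or watch top while the dd runs and look for the kcryptd entries.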


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs
  2010-08-04 11:18     ` Christoph Hellwig
@ 2010-08-04 11:24       ` Dominik Brodowski
  2010-08-04 11:53       ` Mikael Abrahamsson
  1 sibling, 0 replies; 23+ messages in thread
From: Dominik Brodowski @ 2010-08-04 11:24 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Michael Monnerie, linux-raid, xfs, linux-kernel, dm-devel

On Wed, Aug 04, 2010 at 07:18:03AM -0400, Christoph Hellwig wrote:
> On Wed, Aug 04, 2010 at 12:25:26PM +0200, Dominik Brodowski wrote:
> > Hey,
> > 
> > many thanks for your feedback. It seems the crypto step is the culprit:
> > 
> > Reading 1.1 GB with dd, iflag=direct, bs=8k:
> > 
> > /dev/sd*                35.3 MB/s       ( 90 %)
> > /dev/md*                39.1 MB/s       (100 %)
> > /dev/mapper/md*_crypt    3.9 MB/s       ( 10 %)
> > /dev/mapper/vg1-*        3.9 MB/s       ( 10 %)
> > 
> > The "good" news: it also happens on my notebook, even though it has a
> > different setup (no raid, disk -> lv/vg -> crypt). On my notebook, I'm
> > more than happy to test out different kernel versions, patches etc.
> > 
> > /dev/sd*                17.7 MB/s       (100 %)
> > /dev/mapper/vg1-*       16.2 MB/s       ( 92 %)
> > /dev/mapper/*_crypt      3.1 MB/s       ( 18 %)
> 
> The good news is that you have it tracked down, the bad news is that
> I know very little about dm-crypt.  Maybe the issue is the single
> threaded decryption in dm-crypt?  Can you check how much CPU time
> the dm crypt kernel thread uses?

2 CPUs overall:
Cpu(s):  1.0%us,  5.7%sy,  0.0%ni, 44.8%id, 47.0%wa,  0.0%hi,  1.5%si, 0.0%st

Thanks & best,
	Dominik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs
  2010-08-04 11:18     ` Christoph Hellwig
  2010-08-04 11:24       ` Dominik Brodowski
@ 2010-08-04 11:53       ` Mikael Abrahamsson
  2010-08-04 12:56         ` Mike Snitzer
  2010-08-04 22:24         ` Neil Brown
  1 sibling, 2 replies; 23+ messages in thread
From: Mikael Abrahamsson @ 2010-08-04 11:53 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Dominik Brodowski, Michael Monnerie, linux-raid, xfs,
	linux-kernel, dm-devel

On Wed, 4 Aug 2010, Christoph Hellwig wrote:

> The good news is that you have it tracked down, the bad news is that I 
> know very little about dm-crypt.  Maybe the issue is the single threaded 
> decryption in dm-crypt?  Can you check how much CPU time the dm crypt 
> kernel thread uses?

I'm not sure it's that. I have a Core i5 with AES-NI and that didn't 
significantly increase my overall performance, as that's not where the
bottleneck is (at least on my system).

I earlier sent out an email wondering if someone could shed some light on
how scheduling, block caching and read-ahead work together when one does
disks->md->crypto->lvm->fs, because that's a lot of layers and potentially
a lot of unneeded buffering, read-ahead and scheduling magic.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs
  2010-08-04 11:53       ` Mikael Abrahamsson
@ 2010-08-04 12:56         ` Mike Snitzer
  2010-08-04 22:24         ` Neil Brown
  1 sibling, 0 replies; 23+ messages in thread
From: Mike Snitzer @ 2010-08-04 12:56 UTC (permalink / raw)
  To: Mikael Abrahamsson
  Cc: Christoph Hellwig, device-mapper development, Michael Monnerie,
	linux-kernel, Andi Kleen, Dominik Brodowski, xfs, linux-raid

On Wed, Aug 04 2010 at  7:53am -0400,
Mikael Abrahamsson <swmike@swm.pp.se> wrote:

> On Wed, 4 Aug 2010, Christoph Hellwig wrote:
> 
> >The good news is that you have it tracked down, the bad news is
> >that I know very little about dm-crypt.  Maybe the issue is the
> >single threaded decryption in dm-crypt?  Can you check how much
> >CPU time the dm crypt kernel thread uses?
> 
> I'm not sure it's that. I have a Core i5 with AES-NI and that didn't
> significantly increase my overall performance, as that's not where the
> bottleneck is (at least on my system).

You could try applying both of these patches that are pending review for
hopeful inclusion in 2.6.36:

https://patchwork.kernel.org/patch/103404/
https://patchwork.kernel.org/patch/112657/

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs
  2010-08-04 10:25   ` Dominik Brodowski
  2010-08-04 11:18     ` Christoph Hellwig
@ 2010-08-04 20:33     ` Valdis.Kletnieks
  2010-08-05  9:31       ` direct-io regression [Was: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs] Dominik Brodowski
  1 sibling, 1 reply; 23+ messages in thread
From: Valdis.Kletnieks @ 2010-08-04 20:33 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: Michael Monnerie, Christoph Hellwig, linux-raid, xfs,
	linux-kernel, dm-devel

[-- Attachment #1: Type: text/plain, Size: 1304 bytes --]

On Wed, 04 Aug 2010 12:25:26 +0200, Dominik Brodowski said:

> The "good" news: it also happens on my notebook, even though it has a
> different setup (no raid, disk -> lv/vg -> crypt). On my notebook, I'm
> more than happy to test out different kernel versions, patches etc.
> 
> /dev/sd*                17.7 MB/s       (100 %)
> /dev/mapper/vg1-*       16.2 MB/s       ( 92 %)
> /dev/mapper/*_crypt      3.1 MB/s       ( 18 %)

Unfortunately, on my laptop with a similar config, I'm seeing this:

# dd if=/dev/sda bs=8k count=1000000 of=/dev/null
1000000+0 records in
1000000+0 records out
8192000000 bytes (8.2 GB) copied, 108.352 s, 75.6 MB/s
# dd if=/dev/sda2 bs=8k count=1000000 of=/dev/null
1000000+0 records in
1000000+0 records out
8192000000 bytes (8.2 GB) copied, 105.105 s, 77.9 MB/s
# dd if=/dev/mapper/vg_blackice-root bs=8k count=100000 of=/dev/null
100000+0 records in
100000+0 records out
819200000 bytes (819 MB) copied, 11.6469 s, 70.3 MB/s

The raw disk, the LUKS-encrypted partition that's got an LVM on it, and a
crypted LVM partition. The last run spikes both CPUs up to about 50% each.
So whatever it is, it's somehow more subtle than that.  Maybe it's the fact
that in my case it's disk, crypto, and LVM on the crypted partition, rather
than crypted filesystems on an LVM volume?


[-- Attachment #2: Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs
  2010-08-04 11:53       ` Mikael Abrahamsson
  2010-08-04 12:56         ` Mike Snitzer
@ 2010-08-04 22:24         ` Neil Brown
  1 sibling, 0 replies; 23+ messages in thread
From: Neil Brown @ 2010-08-04 22:24 UTC (permalink / raw)
  To: Mikael Abrahamsson
  Cc: Christoph Hellwig, Dominik Brodowski, Michael Monnerie,
	linux-raid, xfs, linux-kernel, dm-devel

On Wed, 4 Aug 2010 13:53:03 +0200 (CEST)
Mikael Abrahamsson <swmike@swm.pp.se> wrote:

> On Wed, 4 Aug 2010, Christoph Hellwig wrote:
> 
> > The good news is that you have it tracked down, the bad news is that I 
> > know very little about dm-crypt.  Maybe the issue is the single threaded 
> > decryption in dm-crypt?  Can you check how much CPU time the dm crypt 
> > kernel thread uses?
> 
> I'm not sure it's that. I have a Core i5 with AES-NI and that didn't 
> significantly increase my overall performance, as that's not where the
> bottleneck is (at least on my system).
> 
> I earlier sent out an email wondering if someone could shed some light on
> how scheduling, block caching and read-ahead work together when one does
> disks->md->crypto->lvm->fs, because that's a lot of layers and potentially
> a lot of unneeded buffering, read-ahead and scheduling magic.
> 

Both page-cache and read-ahead work at the filesystem level, so only the
device in the stack that the filesystem mounts from is relevant for these.
Any read-ahead settings on other devices are ignored.
Other levels only have a cache if they explicitly need one, e.g. raid5 has a
stripe-cache to allow parity calculations across all blocks in a stripe.

Scheduling can potentially happen at every layer, but it takes very different
forms.  Crypto, lvm, raid0 etc don't do any scheduling - it is just
first-in-first-out.
RAID5 does some scheduling for writes (but not reads) to try to gather full
stripes.  If you write 2 of 3 blocks in a stripe, then 3 of 3 in another
stripe, the 3 of 3 will be processed immediately while the 2 of 3 might be
delayed a little in the hope that the third will arrive.

The /sys/block/XXX/queue/scheduler setting only applies at the bottom of the
stack (though when you have dm-multipath it is actually one step above the
bottom).
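
To see what is actually in effect you can compare the settings across the
layers, e.g. (a sketch; device names are placeholders):

    # read-ahead on the device the filesystem is mounted from (the one that matters)
    blockdev --getra /dev/vg1/data
    # read-ahead on a lower dm device (ignored for filesystem I/O)
    cat /sys/block/dm-0/queue/read_ahead_kb
    # the elevator only exists at the bottom of the stack
    cat /sys/block/sda/queue/scheduler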

Hope that helps,
NeilBrown

^ permalink raw reply	[flat|nested] 23+ messages in thread

* direct-io regression [Was: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs]
  2010-08-04 20:33     ` Valdis.Kletnieks
@ 2010-08-05  9:31       ` Dominik Brodowski
  2010-08-05 11:32         ` Chris Mason
  0 siblings, 1 reply; 23+ messages in thread
From: Dominik Brodowski @ 2010-08-05  9:31 UTC (permalink / raw)
  To: Valdis.Kletnieks, josef, chris.mason
  Cc: Michael Monnerie, Christoph Hellwig, linux-raid, xfs,
	linux-kernel, dm-devel

Hey,

when attempting to track down insufficient I/O performance, I found the
following regression relating to direct-io on my notebook, where an
ATA device, which consists of several partitions, is combined into an LVM
volume, and one logical volume is then encrypted using dm-crypt. The test
case was the following command:

$ dd if=/dev/mapper/vg0-root_crypt of=/dev/zero iflag=direct bs=8k count=131072

2.6.34 results in ~16 MB/s,
2.6.35 results in ~ 3.1 MB/s

The regression was bisected down to the following commit:

commit c2c6ca417e2db7a519e6e92c82f4a933d940d076
Author: Josef Bacik <josef@redhat.com>
Date:   Sun May 23 11:00:55 2010 -0400

    direct-io: do not merge logically non-contiguous requests
    
...

How to fix this? I do not use btrfs, but ext3 (and the access was done on
the block level, not on the fs level, so this btrfs-related commit should
not cause such a regression).

Best,

	Dominik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: direct-io regression [Was: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs]
  2010-08-05  9:31       ` direct-io regression [Was: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs] Dominik Brodowski
@ 2010-08-05 11:32         ` Chris Mason
  2010-08-05 12:36           ` Josef Bacik
                             ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Chris Mason @ 2010-08-05 11:32 UTC (permalink / raw)
  To: Dominik Brodowski, Valdis.Kletnieks, josef, Michael Monnerie,
	Christoph Hellwig, linux-raid, xfs, linux-kernel, dm-devel

On Thu, Aug 05, 2010 at 11:31:00AM +0200, Dominik Brodowski wrote:
> Hey,
> 
> when attempting to track down insufficient I/O performance, I found the
> following regression relating to direct-io on my notebook, where an
> ATA device, which consists of several partitions, is combined into an LVM
> volume, and one logical volume is then encrypted using dm-crypt. The test
> case was the following command:
> 
> $ dd if=/dev/mapper/vg0-root_crypt of=/dev/zero iflag=direct bs=8k count=131072
> 
> 2.6.34 results in ~16 MB/s,
> 2.6.35 results in ~ 3.1 MB/s
> 
> The regression was bisected down to the following commit:
> 
> commit c2c6ca417e2db7a519e6e92c82f4a933d940d076
> Author: Josef Bacik <josef@redhat.com>
> Date:   Sun May 23 11:00:55 2010 -0400
> 
>     direct-io: do not merge logically non-contiguous requests
>     
> ...
> 
> How to fix this? I do not use btrfs, but ext3 (and the access was done on
> the block level, not on the fs level, so this btrfs-related commit should
> not cause such a regression).

Well, you've already bisected down to an offending if statement, that's
a huge help.  I'll try to reproduce this and fix it up today.

But, I'm surprised your drive is doing 8K dio reads at 16MB/s, that
seems a little high.  

-chris


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: direct-io regression [Was: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs]
  2010-08-05 11:32         ` Chris Mason
@ 2010-08-05 12:36           ` Josef Bacik
  2010-08-05 15:35           ` Dominik Brodowski
  2010-08-05 18:58           ` direct-io regression [Was: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs] Jeff Moyer
  2 siblings, 0 replies; 23+ messages in thread
From: Josef Bacik @ 2010-08-05 12:36 UTC (permalink / raw)
  To: Chris Mason, Dominik Brodowski, Valdis.Kletnieks, josef,
	Michael Monnerie, Christoph Hellwig, linux-raid, xfs,
	linux-kernel, dm-devel

On Thu, Aug 05, 2010 at 07:32:40AM -0400, Chris Mason wrote:
> On Thu, Aug 05, 2010 at 11:31:00AM +0200, Dominik Brodowski wrote:
> > Hey,
> > 
> > when attempting to track down insufficient I/O performance, I found the
> > following regression relating to direct-io on my notebook, where an
> > ATA device, which consists of several partitions, is combined into an LVM
> > volume, and one logical volume is then encrypted using dm-crypt. The test
> > case was the following command:
> > 
> > $ dd if=/dev/mapper/vg0-root_crypt of=/dev/zero iflag=direct bs=8k count=131072
> > 
> > 2.6.34 results in ~16 MB/s,
> > 2.6.35 results in ~ 3.1 MB/s
> > 
> > The regression was bisected down to the following commit:
> > 
> > commit c2c6ca417e2db7a519e6e92c82f4a933d940d076
> > Author: Josef Bacik <josef@redhat.com>
> > Date:   Sun May 23 11:00:55 2010 -0400
> > 
> >     direct-io: do not merge logically non-contiguous requests
> >     
> > ...
> > 
> > How to fix this? I do not use btrfs, but ext3 (and the access was done on
> > the block level, not on the fs level, so this btrfs-related commit should
> > not cause such a regression).
> 
> Well, you've already bisected down to an offending if statement, that's
> a huge help.  I'll try to reproduce this and fix it up today.
> 
> But, I'm surprised your drive is doing 8K dio reads at 16MB/s, that
> seems a little high.  
>

Hrm, I made sure there were no perf regressions when I was testing this stuff,
though I think I only tested xfs and ext4.  Originally I had a check for
whether we provided our own submit_io, so maybe as a workaround just make

if (dio->final_block_in_bio != dio->cur_page_block ||
                    cur_offset != bio_next_offset) 

look like this

if (dio->final_block_in_bio != dio->cur_page_block ||
    (dio->submit_io && cur_offset != bio_next_offset))

and that should limit my change to only btrfs.  I know why it could cause a
problem, but this change shouldn't be causing a 400% regression.  I suspect
something else is afoot here.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: direct-io regression [Was: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs]
  2010-08-05 11:32         ` Chris Mason
  2010-08-05 12:36           ` Josef Bacik
@ 2010-08-05 15:35           ` Dominik Brodowski
  2010-08-05 15:39             ` Chris Mason
  2010-08-05 16:35             ` Dominik Brodowski
  2010-08-05 18:58           ` direct-io regression [Was: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs] Jeff Moyer
  2 siblings, 2 replies; 23+ messages in thread
From: Dominik Brodowski @ 2010-08-05 15:35 UTC (permalink / raw)
  To: Chris Mason, josef
  Cc: Valdis.Kletnieks, Michael Monnerie, Christoph Hellwig,
	linux-raid, xfs, linux-kernel, dm-devel

Hey,

On Thu, Aug 05, 2010 at 07:32:40AM -0400, Chris Mason wrote:
> But, I'm surprised your drive is doing 8K dio reads at 16MB/s, that
> seems a little high.  

Well, that's what it does:

# $ dd if=/dev/mapper/vg0-home_crypt of=/dev/zero iflag=direct bs=8k count=131072 seek=131072
# 131072+0 records in
# 131072+0 records out
# 1073741824 bytes (1.1 GB) copied, 62.0177 s, 17.3 MB/s

On Thu, Aug 05, 2010 at 08:36:49AM -0400, Josef Bacik wrote:
> Hrm, I made sure there were no perf regressions when I was testing this stuff,
> though I think I only tested xfs and ext4.

For this test, I'm not doing dio at the filesystem level, but at the block
level (/dev/mapper/vg0-*_crypt). It seems that dm-crypt creates exactly such
offending holes, which cause this huge performance drop.

>  Originally I had a check for
> whether we provided our own submit_io, so maybe as a workaround just make
> 
> if (dio->final_block_in_bio != dio->cur_page_block ||
>                     cur_offset != bio_next_offset) 
> 
> look like this
> 
> if (dio->final_block_in_bio != dio->cur_page_block ||
>     (dio->submit_io && cur_offset != bio_next_offset))

Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>

With this fix, I get proper speeds when doing dio reads from
/dev/mapper/vg0-*_crypt; see the 17.3 MB/s above. Most strangely,
accessing /dev/mapper/vg0-* (un-encrypted) and the raw device at
/dev/sda* also speeds up (to up to 28 MB/s). I was only seeing around
16 to 18 MB/s without this patch for unencrypted access.

> I know why it could cause a problem, but this change shouldn't be
> causing a 400% regression.

Well, it seems to cause -- at least on my notebook -- a 150% regression on
unencrypted LVM2 access, and the > 400% regression on encrypted LVM2 access...

> I suspect something else is afoot here.

There is, probably. But the fix you propose helps a lot, already.

Thanks & best,

	Dominik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: direct-io regression [Was: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs]
  2010-08-05 15:35           ` Dominik Brodowski
@ 2010-08-05 15:39             ` Chris Mason
  2010-08-05 15:53               ` Dominik Brodowski
  2010-08-05 16:35             ` Dominik Brodowski
  1 sibling, 1 reply; 23+ messages in thread
From: Chris Mason @ 2010-08-05 15:39 UTC (permalink / raw)
  To: Dominik Brodowski, josef, Valdis.Kletnieks, Michael Monnerie,
	Christoph Hellwig, linux-raid, xfs, linux-kernel, dm-devel

On Thu, Aug 05, 2010 at 05:35:19PM +0200, Dominik Brodowski wrote:
> Hey,
> 
> On Thu, Aug 05, 2010 at 07:32:40AM -0400, Chris Mason wrote:
> > But, I'm surprised your drive is doing 8K dio reads at 16MB/s, that
> > seems a little high.  
> 
> Well, that's what it does:
> 
> # $ dd if=/dev/mapper/vg0-home_crypt of=/dev/zero iflag=direct bs=8k count=131072 seek=131072
> # 131072+0 records in
> # 131072+0 records out
> # 1073741824 bytes (1.1 GB) copied, 62.0177 s, 17.3 MB/s

Can I ask you to do the test directly against the real honest-to-goodness
drive?   If it were an SSD I'd be less surprised, but then the extra
submits shouldn't hurt the SSD that much either.

Thanks for testing the patch, I'll send it in.

-chris

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: direct-io regression [Was: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs]
  2010-08-05 15:39             ` Chris Mason
@ 2010-08-05 15:53               ` Dominik Brodowski
  0 siblings, 0 replies; 23+ messages in thread
From: Dominik Brodowski @ 2010-08-05 15:53 UTC (permalink / raw)
  To: Chris Mason
  Cc: josef, Valdis.Kletnieks, Michael Monnerie, Christoph Hellwig,
	linux-raid, xfs, linux-kernel, dm-devel

On Thu, Aug 05, 2010 at 11:39:43AM -0400, Chris Mason wrote:
> On Thu, Aug 05, 2010 at 05:35:19PM +0200, Dominik Brodowski wrote:
> > On Thu, Aug 05, 2010 at 07:32:40AM -0400, Chris Mason wrote:
> > > But, I'm surprised your drive is doing 8K dio reads at 16MB/s, that
> > > seems a little high.  
> > 
> > Well, that's what it does:
> > 
> > # $ dd if=/dev/mapper/vg0-home_crypt of=/dev/zero iflag=direct bs=8k count=131072 seek=131072
> > # 131072+0 records in
> > # 131072+0 records out
> > # 1073741824 bytes (1.1 GB) copied, 62.0177 s, 17.3 MB/s
> 
> Can I ask you to do the test directly to the real honest to goodness
> drive?   If it were an SSD I'd be less surprised, but then the extra
> submits shouldn't hurt the ssd that much either.

From lower in the chain up to the device:

# LANG=EN dd if=/dev/mapper/vg0-root_crypt of=/dev/zero bs=8k count=131072 seek=393300 iflag=direct
131072+0 records in
131072+0 records out
1073741824 bytes (1.1 GB) copied, 63.1217 s, 17.0 MB/s

# LANG=EN dd if=/dev/mapper/vg0-root of=/dev/zero bs=8k count=131072 seek=393300 iflag=direct
131072+0 records in
131072+0 records out
1073741824 bytes (1.1 GB) copied, 43.2335 s, 24.8 MB/s

# LANG=EN dd if=/dev/sda5 of=/dev/zero bs=8k count=131072 seek=393300 iflag=direct
131072+0 records in
131072+0 records out
1073741824 bytes (1.1 GB) copied, 42.0868 s, 25.5 MB/s

Best,
	Dominik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: direct-io regression [Was: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs]
  2010-08-05 15:35           ` Dominik Brodowski
  2010-08-05 15:39             ` Chris Mason
@ 2010-08-05 16:35             ` Dominik Brodowski
  2010-08-05 20:47               ` Performance impact of CONFIG_DEBUG? direct-io test case Dominik Brodowski
  2010-08-05 20:54               ` Performance impact of CONFIG_SCHED_MC? " Dominik Brodowski
  1 sibling, 2 replies; 23+ messages in thread
From: Dominik Brodowski @ 2010-08-05 16:35 UTC (permalink / raw)
  To: Chris Mason, josef, Valdis.Kletnieks, Michael Monnerie,
	Christoph Hellwig, linux-raid, xfs, linux-kernel, dm-devel

Small correction:

On Thu, Aug 05, 2010 at 05:35:19PM +0200, Dominik Brodowski wrote:
> With this fix, I get proper speeds when doing dio reads from
> /dev/mapper/vg0-*_crypt; see the 17.3 MB/s above. Most strangely,
> accessing /dev/mapper/vg0-* (un-encrypted) and the raw device at
> /dev/sda* also speeds up (to up to 28 MB/s). I was only seeing around
> 16 to 18 MB/s without this patch for unencrypted access.

The speed-up of the unencrypted access (18 -> 28 MB/s) is caused by using a
different configuration for kernel 2.6.35, and seems to be unrelated to your
patch. I will try to track down which config option is the culprit.


kernel, dm-crypt?        | good config   | bad config
------------------------------------------------------
patched 2.6.35, dm-crypt |   ~ 18 MB/s   |  ~ 13 MB/s
patched 2.6.35           |   ~ 28 MB/s   |  ~ 18 MB/s
------------------------------------------------------
plain 2.6.35, dm-crypt   |    ~ 3 MB/s   |   ~ 3 MB/s
plain 2.6.35             |  <not tested> |  ~ 16 MB/s


Best,
	Dominik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: direct-io regression [Was: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs]
  2010-08-05 11:32         ` Chris Mason
  2010-08-05 12:36           ` Josef Bacik
  2010-08-05 15:35           ` Dominik Brodowski
@ 2010-08-05 18:58           ` Jeff Moyer
  2010-08-05 19:01             ` Chris Mason
  2 siblings, 1 reply; 23+ messages in thread
From: Jeff Moyer @ 2010-08-05 18:58 UTC (permalink / raw)
  To: Chris Mason
  Cc: Dominik Brodowski, Valdis.Kletnieks, josef, Michael Monnerie,
	Christoph Hellwig, linux-raid, xfs, linux-kernel, dm-devel

Chris Mason <chris.mason@oracle.com> writes:

> But, I'm surprised your drive is doing 8K dio reads at 16MB/s, that
> seems a little high.  

I'm not sure why you think that.  We're talking about a plain old SATA
disk, right?  I can get 40-50MB/s on my systems for 8KB O_DIRECT reads.
What am I missing?

Cheers,
Jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: direct-io regression [Was: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs]
  2010-08-05 18:58           ` direct-io regression [Was: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs] Jeff Moyer
@ 2010-08-05 19:01             ` Chris Mason
  0 siblings, 0 replies; 23+ messages in thread
From: Chris Mason @ 2010-08-05 19:01 UTC (permalink / raw)
  To: Jeff Moyer
  Cc: Dominik Brodowski, Valdis.Kletnieks, josef, Michael Monnerie,
	Christoph Hellwig, linux-raid, xfs, linux-kernel, dm-devel

On Thu, Aug 05, 2010 at 02:58:37PM -0400, Jeff Moyer wrote:
> Chris Mason <chris.mason@oracle.com> writes:
> 
> > But, I'm surprised your drive is doing 8K dio reads at 16MB/s, that
> > seems a little high.  
> 
> I'm not sure why you think that.  We're talking about a plain old SATA
> disk, right?  I can get 40-50MB/s on my systems for 8KB O_DIRECT reads.
> What am I missing?

Clearly I'm wrong, his drive is going much faster ;)  I expect the
smaller reads to be slower but the drive's internal cache is doing well.

-chris


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Performance impact of CONFIG_DEBUG? direct-io test case
  2010-08-05 16:35             ` Dominik Brodowski
@ 2010-08-05 20:47               ` Dominik Brodowski
  2010-08-05 20:54               ` Performance impact of CONFIG_SCHED_MC? " Dominik Brodowski
  1 sibling, 0 replies; 23+ messages in thread
From: Dominik Brodowski @ 2010-08-05 20:47 UTC (permalink / raw)
  To: linux-kernel, mingo, peterz
  Cc: Chris Mason, josef, Valdis.Kletnieks, Michael Monnerie,
	Christoph Hellwig, linux-raid, xfs, dm-devel


How large is the performance impact of CONFIG_DEBUG? Well, for the test
workload I've been working with lately,

dd if=<device> of=/dev/zero bs=8k count=100000 iflag=direct

where <device> is a dm-crypted LVM volume consisting of several
partitions on a notebook PATA hard disk, I get the following results:

1) best results are ~ 28 MB/s

2) Enabling CONFIG_DEBUG_LOCK_ALLOC, which also means CONFIG_LOCKDEP
   being enabled, causes the transfer rate to decrease by ~ 1.2 MB/s

3) Enabling CONFIG_DEBUG_SPINLOCK && CONFIG_DEBUG_MUTEXES or
   CONFIG_DEBUG_SPINLOCK_SLEEP=y costs ~ 0.4 MB/s each

4) Enabling all of the following options:
	CONFIG_DEBUG_RT_MUTEXES
	CONFIG_DEBUG_PI_LIST
	CONFIG_PROVE_LOCKING
	CONFIG_LOCK_STAT
	CONFIG_DEBUG_LOCKDEP
   costs another ~ 5 MB/s.

So, for this test case, the performance impact of (some) CONFIG_DEBUG
options is highly significant, here about 25 %.

Best,
	Dominik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Performance impact of CONFIG_SCHED_MC? direct-io test case
  2010-08-05 16:35             ` Dominik Brodowski
  2010-08-05 20:47               ` Performance impact of CONFIG_DEBUG? direct-io test case Dominik Brodowski
@ 2010-08-05 20:54               ` Dominik Brodowski
  1 sibling, 0 replies; 23+ messages in thread
From: Dominik Brodowski @ 2010-08-05 20:54 UTC (permalink / raw)
  To: linux-kernel, mingo, peterz
  Cc: Chris Mason, josef, Valdis.Kletnieks, Michael Monnerie,
	Christoph Hellwig, linux-raid, xfs, dm-devel


How large is the performance impact of CONFIG_SCHED_MC -- for which there
is a warning that it comes "at a cost of slightly increased overhead in
some places."? Well, for the test workload I've been working with lately,

dd if=<device> of=/dev/zero bs=8k count=100000 iflag=direct

where <device> is a dm-crypted LVM volume consisting of several
partitions on a notebook PATA hard disk, and all this runs on a Core 2 Duo,
I get a ~ 10 % performance reduction if CONFIG_SCHED_MC is enabled.

Combined with the CONFIG_DEBUG performance reduction mentioned in the other
message, all of the reduction from 28 MB/s to 18 MB/s is accounted for.

Best,
	Dominik

PS: Ingo: you got both mingo@elte.hu and mingo@redhat.com in MAINTAINERS,
I suppose both are valid?

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2010-08-05 20:55 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-08-04  7:35 How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs Dominik Brodowski
2010-08-04  8:50 ` Christoph Hellwig
2010-08-04  9:13   ` Dominik Brodowski
2010-08-04  9:21     ` Christoph Hellwig
2010-08-04  9:16 ` Michael Monnerie
2010-08-04 10:25   ` Dominik Brodowski
2010-08-04 11:18     ` Christoph Hellwig
2010-08-04 11:24       ` Dominik Brodowski
2010-08-04 11:53       ` Mikael Abrahamsson
2010-08-04 12:56         ` Mike Snitzer
2010-08-04 22:24         ` Neil Brown
2010-08-04 20:33     ` Valdis.Kletnieks
2010-08-05  9:31       ` direct-io regression [Was: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs] Dominik Brodowski
2010-08-05 11:32         ` Chris Mason
2010-08-05 12:36           ` Josef Bacik
2010-08-05 15:35           ` Dominik Brodowski
2010-08-05 15:39             ` Chris Mason
2010-08-05 15:53               ` Dominik Brodowski
2010-08-05 16:35             ` Dominik Brodowski
2010-08-05 20:47               ` Performance impact of CONFIG_DEBUG? direct-io test case Dominik Brodowski
2010-08-05 20:54               ` Performance impact of CONFIG_SCHED_MC? " Dominik Brodowski
2010-08-05 18:58           ` direct-io regression [Was: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs] Jeff Moyer
2010-08-05 19:01             ` Chris Mason

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).