* Re: *Really* bad I/O latency with md raid5+dm-crypt+lvm
@ 2009-10-12 14:48 Tomasz Chmielewski
  2009-10-12 17:37 ` Mike Galbraith
  0 siblings, 1 reply; 5+ messages in thread
From: Tomasz Chmielewski @ 2009-10-12 14:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: pernegger, arjan

> Summary: I was hoping to use a layered storage setup, namely lvm on
> dm-crypt on md raid5 for a new box I'm setting up, but that isn't
> looking so good since a single heavyish writer will monopolise any and
> all I/O on the "device". F. ex. while cp'ing a few GB of data from an
> external disk to the array it takes ~10sec to run ls and ~2min to
> start aptitude. Clueless attempts at a diagnosis below.

Did you try running strace to see where ls pauses?

Did you try running latencytop (and generally, top/htop while doing your 
tests)?
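
For example (the mount point below is just a placeholder):

    # -T logs the time spent in each syscall
    strace -T -o /tmp/ls.trace ls /mnt/array
    # the <seconds> value at the end of each line in /tmp/ls.trace
    # shows which syscall stalls

    latencytop    # needs CONFIG_LATENCYTOP=y in the kernel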


(...)

> Anyway, as soon as I copy something to the array or create a larger
> (upwards of a few hundred MiB) tar archive the box becomes utterly
> unresponsive until that job is finished. Even on the local console the
> completion time for a simple ls or cat is of the order of tens of
> seconds, just forget about launching emacs.
> Now I know that people have been ranting about desktop responsiveness
> for a while but that was very much an abstract thing for me until now.

I think the above (large latency while doing heavier I/O) is a general 
Linux problem.

I see similar behaviour on quite powerful hardware, e.g. a Core i7, 8 GB 
RAM, 2x HDD in a software RAID-1 array (no dm-crypt): when tarring 
something big or writing with dd if=/dev/zero of=/home/me/bigfile, doing 
ls in another terminal or just starting top can take up to a minute.
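
An easy way to reproduce it (file names are just examples):

    # terminal 1: heavy streaming write
    dd if=/dev/zero of=/home/me/bigfile bs=1M count=8192

    # terminal 2: meanwhile, time a trivial read
    time ls /home/me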

Quite interestingly, background RAID synchronization has almost no 
effect on latency.

(...)


> According to openssl speed aes-256-cbc the CPU's encryption speed is
> ~113 MiB/s (single core, est. for 512b blocks). Obviously the array is
> much faster than that. I can't find the benchmarks ATM but the numbers
> seemed plausible for 70 MiB/s (optimistic est. for sequential access)
> disks at the time.

You can find some dm-crypt benchmarks e.g. here:

http://blog.wpkg.org/2009/04/23/cipher-benchmark-for-dm-crypt-luks/

Obviously, they will not match your hardware.

Also note that dm-crypt is not "SMP-ready", so whatever hardware you 
have, it will only use one CPU - this may seriously limit the 
performance, depending on your usage and hardware.
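
You can see this on a multi-core box: during a heavy write to the
encrypted device a single kcryptd thread saturates one core, e.g.
(exact output varies with tool versions):

    top -H           # show kernel threads individually; watch kcryptd
    mpstat -P ALL 1  # per-CPU utilisation; one core should max out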


-- 
Tomasz Chmielewski




* Re: *Really* bad I/O latency with md raid5+dm-crypt+lvm
  2009-10-12 14:48 *Really* bad I/O latency with md raid5+dm-crypt+lvm Tomasz Chmielewski
@ 2009-10-12 17:37 ` Mike Galbraith
  0 siblings, 0 replies; 5+ messages in thread
From: Mike Galbraith @ 2009-10-12 17:37 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: linux-kernel, pernegger, arjan

On Mon, 2009-10-12 at 16:48 +0200, Tomasz Chmielewski wrote:
> > Summary: I was hoping to use a layered storage setup, namely lvm on
> > dm-crypt on md raid5 for a new box I'm setting up, but that isn't
> > looking so good since a single heavyish writer will monopolise any and
> > all I/O on the "device". F. ex. while cp'ing a few GB of data from an
> > external disk to the array it takes ~10sec to run ls and ~2min to
> > start aptitude. Clueless attempts at a diagnosis below.
> 
> Did you try running strace to see where ls pauses?
> 
> Did you try running latencytop (and generally, top/htop while doing your 
> tests)?
> 
> 
> (...)
> 
> > Anyway, as soon as I copy something to the array or create a larger
> > (upwards of a few hundred MiB) tar archive the box becomes utterly
> > unresponsive until that job is finished. Even on the local console the
> > completion time for a simple ls or cat is of the order of tens of
> > seconds, just forget about launching emacs.
> > Now I know that people have been ranting about desktop responsiveness
> > for a while but that was very much an abstract thing for me until now.
> 
> I think the above (large latency while doing heavier I/O) is a general 
> Linux problem.

It would be interesting to test the latest -rc.  Though it may prove to
be unrelated, the symptoms sound very much like a recent thread wrt
writers starving readers.

	-Mike



* Re: *Really* bad I/O latency with md raid5+dm-crypt+lvm
  2009-10-12 14:26 ` Arjan van de Ven
@ 2009-10-12 19:05   ` Christian Pernegger
  0 siblings, 0 replies; 5+ messages in thread
From: Christian Pernegger @ 2009-10-12 19:05 UTC (permalink / raw)
  To: linux-kernel

>> [Please keep me CCed as I'm not subscribed to LKML]
>>
>> Summary: I was hoping to use a layered storage setup, namely lvm on
>> dm-crypt on md raid5 for a new box I'm setting up, but that isn't
>> looking so good since a single heavyish writer will monopolise any and
>> all I/O on the "device". F. ex. while cp'ing a few GB of data from an
>> external disk to the array it takes ~10sec to run ls and ~2min to
>> start aptitude. Clueless attempts at a diagnosis below.

> Also note that dm-crypt is not "SMP-ready", so whatever hardware you have,
> it will only use one CPU - this may seriously limit the performance,
> depending on your usage and hardware.

The crypto performance itself is fine. Yes, it limits throughput to a
little over 100MiB/s but so what, that's plenty. Multi-core support
will come in time, I can wait. What I can't live with is a single
streaming write singlehandedly starving all reads. Linux has never
been great at this and it has been getting worse since ~2.6.18 but it
was never more than a nuisance (say <1sec delay).

It's as if the I/O scheduler weren't there.

> [latencytop? regular top?]

Actually I hadn't heard of latencytop but it looks nifty. Will have to
compile a custom kernel for it, though, since Debian kernels don't
have CONFIG_LATENCYTOP set.
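
(One way to check, assuming the kernel package ships its config under
/boot:

    grep LATENCYTOP /boot/config-$(uname -r)
    # => "# CONFIG_LATENCYTOP is not set" on the stock Debian kernel
)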

Regular top has kcryptd (67%), mv (36%), md1_raid5 (36%), pdflush
(7%), kjournald (5%) at the top. Seems a bit much for md, doesn't it?
This is while mv'ing in some data from an additional SATA disk; latency
isn't *too* bad, ~3s for an ls. According to iostat, mv is writing to
the array at 50-60 MB/s. The fun part: it's doing ~15000 tps, averaging
out to 4k per transaction, as observed via btrace.
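
(For reference, roughly what I was watching; the dm-crypt device name
is the one from my setup:

    iostat -x 1                    # per-device throughput and tps
    btrace /dev/mapper/md1_crypt   # per-request trace
)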

That can't be normal, can it?

Thanks,

C.


* Re: *Really* bad I/O latency with md raid5+dm-crypt+lvm
  2009-10-12 14:01 Christian Pernegger
@ 2009-10-12 14:26 ` Arjan van de Ven
  2009-10-12 19:05   ` Christian Pernegger
  0 siblings, 1 reply; 5+ messages in thread
From: Arjan van de Ven @ 2009-10-12 14:26 UTC (permalink / raw)
  To: Christian Pernegger; +Cc: linux-kernel

On Mon, 12 Oct 2009 16:01:58 +0200
Christian Pernegger <pernegger@gmail.com> wrote:

> [Please keep me CCed as I'm not subscribed to LKML]
> 
> Summary: I was hoping to use a layered storage setup, namely lvm on
> dm-crypt on md raid5 for a new box I'm setting up, but that isn't
> looking so good since a single heavyish writer will monopolise any and
> all I/O on the "device". F. ex. while cp'ing a few GB of data from an
> external disk to the array it takes ~10sec to run ls and ~2min to
> start aptitude. Clueless attempts at a diagnosis below.


have you run latencytop?


* *Really* bad I/O latency with md raid5+dm-crypt+lvm
@ 2009-10-12 14:01 Christian Pernegger
  2009-10-12 14:26 ` Arjan van de Ven
  0 siblings, 1 reply; 5+ messages in thread
From: Christian Pernegger @ 2009-10-12 14:01 UTC (permalink / raw)
  To: linux-kernel

[Please keep me CCed as I'm not subscribed to LKML]

Summary: I was hoping to use a layered storage setup, namely lvm on
dm-crypt on md raid5 for a new box I'm setting up, but that isn't
looking so good since a single heavyish writer will monopolise any and
all I/O on the "device". F. ex. while cp'ing a few GB of data from an
external disk to the array it takes ~10sec to run ls and ~2min to
start aptitude. Clueless attempts at a diagnosis below.

Hardware:
AMD Athlon II X2 250
2GB Crucial DDR2-ECC RAM (more after testing)
ASUS M4A785D-M PRO
4x WD1000FYPS
        connected to onboard SATA controller (AMD SB710 / ahci)

Software:
Debian 5.0.3 (lenny/stable)
Kernel: linux-image-2.6.30-bpo.2-amd64 (based on 2.6.30.5 it seems)

The 4 disks are each partitioned into a 256MB sdX1 and a $REST sdX2.
The sdX1s make up md0, a raid1 w/ 1.0 superblock for /boot.
The sdX2s make up md1, a raid5 w/ 1.1 superblock, 1MiB chunk size and
stripe_cache_size = 8192.
On top of md1 sits md1_crypt, a dm-crypt/LUKS layer using
aes-cbc-essiv:sha256 and a 256-bit key. It's aligned to 6144 sectors
(= 3MiB / 1 stripe).
The whole of md1_crypt is an lvm PV with a metadatasize of 3008KiB.
(That's the poor man's way of aligning the data to 3MiB / 1 stripe;
the lvm tools in stable are too old for proper alignment options.)
The VG consisting of md1_crypt has 16GiB root, 4GiB swap, 200GiB home
and $REST data LVs.
All filesystems are ext3 with stride=256 and stripe-width=768. home is
mounted acl,user_xattr, data acl,user_xattr,noatime. Readahead on the
LVs is at 6MiB (2 stripes).
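
(For completeness, the stack was created roughly like this; reconstructed
from memory, VG/LV names are just examples:

    # md1: raid5, 1.1 superblock, 1MiB chunk
    mdadm --create /dev/md1 --level=5 --raid-devices=4 \
        --metadata=1.1 --chunk=1024 /dev/sd[abcd]2
    echo 8192 > /sys/block/md1/md/stripe_cache_size

    # dm-crypt/LUKS, payload aligned to 6144 sectors (= 3MiB)
    cryptsetup luksFormat -c aes-cbc-essiv:sha256 -s 256 \
        --align-payload=6144 /dev/md1
    cryptsetup luksOpen /dev/md1 md1_crypt

    # PV; the 3008KiB metadata area puts the first extent at 3MiB
    pvcreate --metadatasize 3008k /dev/mapper/md1_crypt
    vgcreate vg0 /dev/mapper/md1_crypt
    lvcreate -L 200G -n home vg0

    # ext3 aligned to the array: 4k blocks, stride = 1MiB chunk,
    # stripe-width = 3 data disks * 256
    mkfs.ext3 -E stride=256,stripe-width=768 /dev/vg0/home
    # readahead: 6MiB = 12288 512-byte sectors
    blockdev --setra 12288 /dev/vg0/home
)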

So, first question: should this kind of setup work at all or am I
doing something pathological in the first place?

Anyway, as soon as I copy something to the array or create a larger
(upwards of a few hundred MiB) tar archive the box becomes utterly
unresponsive until that job is finished. Even on the local console the
completion time for a simple ls or cat is of the order of tens of
seconds, just forget about launching emacs.

Now I know that people have been ranting about desktop responsiveness
for a while but that was very much an abstract thing for me until now.
I'd never have thought it would hit me on a personal streaming media /
backups / multi-user general purpose server. Well, at the moment it's
single-user, single-job ... :-(

Here's what I tried (commands sketched below):
changing scheduler from cfq to deadline (no effect)
tuning /proc/sys/vm/dirty*ratio way down (no effect)
turning off NCQ (some effect, maybe)
raising queue/nr_requests really high, e.g. 1000000 (helps
noticeably, especially when NCQ is off)
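
(With sdX standing for each component disk, that was roughly:

    echo deadline > /sys/block/sdX/queue/scheduler
    echo 1 > /proc/sys/vm/dirty_background_ratio
    echo 2 > /proc/sys/vm/dirty_ratio
    echo 1 > /sys/block/sdX/device/queue_depth    # NCQ off
    echo 1000000 > /sys/block/sdX/queue/nr_requests
)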

Ideas:
According to openssl speed aes-256-cbc the CPU's encryption speed is
~113 MiB/s (single core, est. for 512b blocks). Obviously the array is
much faster than that. I can't find the benchmarks ATM but the numbers
seemed plausible for 70 MiB/s (optimistic est. for sequential access)
disks at the time. So let's say the array is at least 50% faster than
the encryption. Wouldn't this move the bottleneck for requests away
from the scheduler queue, thus rendering it ineffective?
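
(The figure comes from something like:

    openssl speed aes-256-cbc
    # dm-crypt encrypts 512-byte sectors, so I interpolated between
    # the 256- and 1024-byte-block columns of the output
)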

Also, running btrace on the various block device layers I never see
>4k writes, even when using dd with a blocksize of 3 MiB. Is this
normal? btrace on (one of) the component disks shows some merged
requests at least. Am I wrong, or would effectively scheduling/merging
lots and lots of 4k blocks take an *insane* queue length?
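
(Roughly what I see; the field layout may differ between blktrace
versions:

    btrace /dev/mapper/md1_crypt
    # queued writes show up as lines like "... Q W <sector> + 8 [mv]";
    # "+ 8" means 8 sectors = 4k, and I never see a larger count here
)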

All comments and suggestions welcome

Thank you,

Chris
