From: Kaul
Subject: Re: Substantial performance difference when reading/writing to device-mapper vs. the individual device
Date: Wed, 24 Jul 2013 14:49:05 +0300
Reply-To: device-mapper development
To: dm-devel@redhat.com
List-Id: dm-devel.ids

Reply to self:

Could it be explained by the difference in max_segments between the individual devices and the dm device?
Sounds like https://bugzilla.redhat.com/show_bug.cgi?id=755046, which is supposed to be fixed in 6.4, I reckon:

3514f0c5615a00003 dm-3 XtremIO,XtremApp
size=1.0T features='0' hwhandler='0' wp=rw
`-+- policy='queue-length 0' prio=1 status=active
  |- 0:0:2:2 sdi  8:128  active ready running
  |- 0:0:3:2 sdl  8:176  active ready running
  |- 0:0:1:2 sdf  8:80   active ready running
  |- 0:0:0:2 sdc  8:32   active ready running
  |- 1:0:0:2 sds  65:32  active ready running
  |- 1:0:3:2 sdab 65:176 active ready running
  |- 1:0:2:2 sdy  65:128 active ready running
  `- 1:0:1:2 sdv  65:80  active ready running

[root@lg545 ~]# cat /sys/class/block/dm-3/queue/max_segments
128
[root@lg545 ~]# cat /sys/class/block/sdi/queue/max_segments
1024
[root@lg545 ~]# cat /sys/class/block/sdl/queue/max_segments
1024
[root@lg545 ~]# cat /sys/class/block/sdf/queue/max_segments
1024
[root@lg545 ~]# cat /sys/class/block/sdc/queue/max_segments
1024
[root@lg545 ~]# cat /sys/class/block/sds/queue/max_segments
1024
[root@lg545 ~]# cat /sys/class/block/sdab/queue/max_segments
1024
[root@lg545 ~]# cat /sys/class/block/sdy/queue/max_segments
1024
[root@lg545 ~]# cat /sys/class/block/sdv/queue/max_segments
1024
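
Incidentally, a quick way to repeat this check for the dm device and all of its
paths in one go (a rough sketch, assuming the same dm-3 name as above; it relies
on the slaves/ directory that sysfs exposes under a dm device):

# Sketch: print max_segments for dm-3 and each of its underlying path devices.
for dev in dm-3 $(ls /sys/class/block/dm-3/slaves/); do
    echo "$dev: $(cat /sys/class/block/$dev/queue/max_segments)"
done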



On Mon, Jul 22, 2013 at 2:47 PM, Kaul <mykaul@gmail.com> wrote:
> We are seeing a substantial difference in performance when we perform a
> read/write to /dev/mapper/... vs. the specific device (/dev/sdXX).
> What can we do to further isolate the issue?
>
> We are using CentOS 6.4, with all updates, 2 CPUs, 4 FC ports.
> Here's a table comparing the results:

> # of LUNs  Paths per LUN  Native multipath  IO pattern  IOPS       Latency (us)  BW (KB/s)
> 4          16             No                100% Read   605,661.4  3,381         2,420,736
> 4          16             No                100% Write   477,515.1  4,288         1,908,736
> 8          16             No                100% Read   663,339.4  6,174         2,650,112
> 8          16             No                100% Write   536,936.9  7,628         2,146,304
> 4          16             Yes               100% Read   456,108.9  1,122         1,824,256
> 4          16             Yes               100% Write   371,665.8  1,377         1,486,336
> 8          16             Yes               100% Read   519,450.2  1,971         2,077,696
> 8          16             Yes               100% Write   448,840.4  2,281         1,795,072
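
To make the dm-vs-path comparison easy to repeat, one option is to run an
identical direct-I/O job once against the multipath device and once against a
single path, changing nothing else. The fio invocation below is only a sketch
(the original post does not say which tool or job parameters produced the
numbers above); the device names are taken from the multipath listing earlier
in this thread:

# Assumption: fio as the load generator. Same job twice, only --filename differs.
fio --name=dmtest --filename=/dev/mapper/3514f0c5615a00003 --rw=randread \
    --bs=4k --direct=1 --ioengine=libaio --iodepth=32 --numjobs=4 \
    --time_based --runtime=60 --group_reporting
fio --name=sdtest --filename=/dev/sdi --rw=randread \
    --bs=4k --direct=1 --ioengine=libaio --iodepth=32 --numjobs=4 \
    --time_based --runtime=60 --group_reporting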

