* waiting for sub ops
@ 2012-05-11  7:02 Stefan Priebe - Profihost AG
  2012-05-11 13:25 ` Stefan Priebe - Profihost AG
  2012-05-13  0:15 ` Sage Weil
  0 siblings, 2 replies; 7+ messages in thread
From: Stefan Priebe - Profihost AG @ 2012-05-11  7:02 UTC (permalink / raw)
  To: ceph-devel

Hi,

while doing some stress testing with bonnie++ I'm constantly seeing these
messages across all OSDs; here is just an example for osd.2.

All machines are connected with 2x 1 Gbit/s bonding (mode 6, balance-alb)
to an HP switch.

2012-05-11 08:58:52.752057 osd.2 192.168.0.102:6801/4798 3218 : [WRN]
old request osd_op(mds.0.1:49546 200.00000027 [write 1643206~13552]
1.379675e3) v4 received at 2012-05-11 08:58:22.044033 currently waiting
for sub ops
2012-05-11 08:58:52.752059 osd.2 192.168.0.102:6801/4798 3219 : [WRN]
old request osd_op(client.4108.1:28755 10000008004.00000266 [write
0~4194304] 0.79b497c7 snapc 1=[]) received at 2012-05-11 08:58:22.026435
currently waiting for sub ops
2012-05-11 08:58:52.752061 osd.2 192.168.0.102:6801/4798 3220 : [WRN]
old request osd_op(client.4108.1:28762 10000008004.0000026d [write
0~4194304] 0.493cb0d4 snapc 1=[]) received at 2012-05-11 08:58:22.191536
currently waiting for sub ops
2012-05-11 08:58:52.752064 osd.2 192.168.0.102:6801/4798 3221 : [WRN]
old request osd_op(client.4108.1:28765 10000008004.00000270 [write
0~4194304] 0.b2f1b144 snapc 1=[]) received at 2012-05-11 08:58:22.290388
currently waiting for sub ops
2012-05-11 08:58:52.752066 osd.2 192.168.0.102:6801/4798 3222 : [WRN]
old request osd_op(client.4108.1:28774 10000008004.00000279 [write
0~4194304] 0.2373c652 snapc 1=[]) received at 2012-05-11 08:58:22.392604
currently waiting for sub ops
2012-05-11 08:58:52.752068 osd.2 192.168.0.102:6801/4798 3223 : [WRN]
old request osd_op(client.4108.1:28775 10000008004.0000027a [write
0~4194304] 0.24a58617 snapc 1=[]) received at 2012-05-11 08:58:22.455618
currently waiting for sub ops
2012-05-11 08:58:52.752070 osd.2 192.168.0.102:6801/4798 3224 : [WRN]
old request osd_op(client.4108.1:28777 10000008004.0000027c [write
0~4194304] 0.1c900a76 snapc 1=[]) received at 2012-05-11 08:58:22.527086
currently waiting for sub ops
2012-05-11 08:58:53.752212 osd.2 192.168.0.102:6801/4798 3225 : [WRN]
old request osd_op(client.4108.1:28808 10000008004.0000029b [write
0~4194304] 0.46c529aa snapc 1=[]) received at 2012-05-11 08:58:23.252258
currently waiting for sub ops
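
For reference, the age of each request is the warning timestamp minus the
"received at" timestamp. A minimal Python sketch (illustrative only, not
part of Ceph) using the first warning above:

from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
logged   = datetime.strptime("2012-05-11 08:58:52.752057", FMT)
received = datetime.strptime("2012-05-11 08:58:22.044033", FMT)

# prints ~30.7: the request had been pending for just over 30 seconds
print((logged - received).total_seconds())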

Stefan


* Re: waiting for sub ops
  2012-05-11  7:02 waiting for sub ops Stefan Priebe - Profihost AG
@ 2012-05-11 13:25 ` Stefan Priebe - Profihost AG
  2012-05-13  0:13   ` Sage Weil
  2012-05-13  0:15 ` Sage Weil
  1 sibling, 1 reply; 7+ messages in thread
From: Stefan Priebe - Profihost AG @ 2012-05-11 13:25 UTC (permalink / raw)
  To: ceph-devel

Hi,

all the messages keep talking about snapc? I've no idea what this is.

Stefan

On 11.05.2012 09:02, Stefan Priebe - Profihost AG wrote:
> Hi,
> 
> while doing some stress testing with bonnie++ I'm constantly seeing these
> messages across all OSDs; here is just an example for osd.2.
> 
> All machines are connected with 2x 1 Gbit/s bonding (mode 6, balance-alb)
> to an HP switch.
> 
> 2012-05-11 08:58:52.752059 osd.2 192.168.0.102:6801/4798 3219 : [WRN]
> old request osd_op(client.4108.1:28755 10000008004.00000266 [write
> 0~4194304] 0.79b497c7 snapc 1=[]) received at 2012-05-11 08:58:22.026435
> currently waiting for sub ops
> [remaining identical log lines snipped]
> 
> Stefan


* Re: waiting for sub ops
  2012-05-11 13:25 ` Stefan Priebe - Profihost AG
@ 2012-05-13  0:13   ` Sage Weil
  0 siblings, 0 replies; 7+ messages in thread
From: Sage Weil @ 2012-05-13  0:13 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: ceph-devel

On Fri, 11 May 2012, Stefan Priebe - Profihost AG wrote:
> Hi,
> 
> all the messages keep talking about snapc? I've no idea what this is.

snapc = "snap context", something that is attached to every write 
operation to inform copy-on-write for snapshotting in the object store.  
In this case it is irrelevant.
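
Concretely, the "snapc 1=[]" in the log lines is the snap context printed
as <seq>=[<snap ids>]: sequence number 1, empty snap list, so there is
nothing to clone. A tiny illustrative parser (my own sketch, not Ceph
code):

def parse_snapc(field):
    # "1=[]" -> seq 1, no snapshot ids: a plain write, nothing to clone
    seq, _, snaps = field.partition("=")
    ids = snaps.strip("[]")
    return int(seq), [int(s) for s in ids.split(",") if s]

print(parse_snapc("1=[]"))   # -> (1, [])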

sage


> 
> Stefan
> 
> On 11.05.2012 09:02, Stefan Priebe - Profihost AG wrote:
> > Hi,
> > 
> > while doing some stress testing with bonnie++ I'm constantly seeing these
> > messages across all OSDs; here is just an example for osd.2.
> > 
> > All machines are connected with 2x 1 Gbit/s bonding (mode 6, balance-alb)
> > to an HP switch.
> > 
> > [log excerpt snipped]
> > 
> > Stefan

* Re: waiting for sub ops
  2012-05-11  7:02 waiting for sub ops Stefan Priebe - Profihost AG
  2012-05-11 13:25 ` Stefan Priebe - Profihost AG
@ 2012-05-13  0:15 ` Sage Weil
  2012-05-13 18:04   ` Stefan Priebe
  1 sibling, 1 reply; 7+ messages in thread
From: Sage Weil @ 2012-05-13  0:15 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: ceph-devel

On Fri, 11 May 2012, Stefan Priebe - Profihost AG wrote:
> Hi,
> 
> while doing some stress testing with bonnie++ I'm constantly seeing these
> messages across all OSDs; here is just an example for osd.2.
> 
> All machines are connected with 2x 1 Gbit/s bonding (mode 6, balance-alb)
> to an HP switch.

These are just telling you that some operations are taking > 30 seconds.  
The 'waiting for sub ops' means that it is waiting for the write/update to 
be acked by other replicas.  Either there is some load imbalance (some 
osds are more busy than others), or everyone is similarly loaded and the 
request queues are just long across the board.
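
A quick way to tell the two cases apart is to count these warnings per
OSD across the cluster log. A minimal Python sketch (my own illustration;
it assumes each warning is on a single line, as in the raw log, and that
you've collected the logs into one file):

import re
from collections import Counter

WRN = re.compile(r"(osd\.\d+) .*\[WRN\] old request .*waiting for sub ops")

counts = Counter()
with open("ceph.log") as f:    # assumed path to the collected log
    for line in f:
        m = WRN.search(line)
        if m:
            counts[m.group(1)] += 1

# heavily skewed counts suggest imbalance; even counts suggest the
# request queues are simply long everywhere
for osd, n in counts.most_common():
    print("%s %d" % (osd, n))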

sage


> 
> [log excerpt snipped]
> 
> Stefan

* Re: waiting for sub ops
  2012-05-13  0:15 ` Sage Weil
@ 2012-05-13 18:04   ` Stefan Priebe
  2012-05-14  4:28     ` Sage Weil
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Priebe @ 2012-05-13 18:04 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

Hi Sage,

On 13.05.2012 02:15, Sage Weil wrote:
> On Fri, 11 May 2012, Stefan Priebe - Profihost AG wrote:
>> Hi,
>>
>> while doing some stress testing with bonnie++ I'm constantly seeing these
>> messages across all OSDs; here is just an example for osd.2.
>>
>> All machines are connected with 2x 1 Gbit/s bonding (mode 6, balance-alb)
>> to an HP switch.
>
> These are just telling you that some operations are taking > 30 seconds.
> The 'waiting for sub ops' means that it is waiting for the write/update to
> be acked by other replicas.  Either there is some load imbalance (some
> osds are more busy than others), or everyone is similarly loaded and the
> request queues are just long across the board.

Mhm, but there must be something wrong in my test setup.

1.) the osd bench shows 150 MB/s per OSD
2.) iperf shows a constant 930 Mbit/s per interface
3.) when I write 16 GB with dd to the ceph mount I see spikes to 450 Mbit/s
and drops to 90 kB/s for long periods of time. The overall dd speed is
then 40 Mbit/s
4.) the speed drops to 90 kB/s while I'm seeing the
"[WRN] slow request received at 2012-05-13 20:01:55.227811:
osd_op(client.4102.1:38432 10000000004.000003ae [write 0~4194304]
0.5f4dfca8 snapc 1=[]) currently waiting for sub ops"
messages.

The client shows this in dmesg:
[2012-05-13 19:55:26]  libceph:  tid 38132 timed out on osd2, will reset osd
[2012-05-13 19:55:46]  libceph:  tid 38400 timed out on osd0, will reset osd
[2012-05-13 19:56:31]  libceph:  tid 38886 timed out on osd2, will reset osd
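
To line the stalls up with the OSD warnings I could log per-second
throughput with timestamps; a minimal Python sampler of /proc/net/dev
(my own sketch; the interface name is an assumption):

import time

IFACE = "bond0"   # assumed bonding interface name, adjust as needed

def tx_bytes(iface):
    with open("/proc/net/dev") as f:
        for line in f:
            if ":" not in line:
                continue
            name, data = line.split(":", 1)
            if name.strip() == iface:
                # fields after the colon: 8 RX counters, then TX bytes
                return int(data.split()[8])
    raise RuntimeError("interface not found: " + iface)

prev = tx_bytes(IFACE)
while True:
    time.sleep(1)
    cur = tx_bytes(IFACE)
    # stalls show up as near-zero samples
    print("%s %.1f Mbit/s" % (time.strftime("%H:%M:%S"),
                              (cur - prev) * 8 / 1e6))
    prev = cur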

greets and thanks
Stefan


* Re: waiting for sub ops
  2012-05-13 18:04   ` Stefan Priebe
@ 2012-05-14  4:28     ` Sage Weil
  2012-05-14  6:36       ` Stefan Priebe - Profihost AG
  0 siblings, 1 reply; 7+ messages in thread
From: Sage Weil @ 2012-05-14  4:28 UTC (permalink / raw)
  To: Stefan Priebe; +Cc: Sage Weil, ceph-devel

On Sun, 13 May 2012, Stefan Priebe wrote:
> Hi Sage,
> 
> On 13.05.2012 02:15, Sage Weil wrote:
> > On Fri, 11 May 2012, Stefan Priebe - Profihost AG wrote:
> > > Hi,
> > > 
> > > while doing some stress testing with bonnie++ I'm constantly seeing these
> > > messages across all OSDs; here is just an example for osd.2.
> > > 
> > > All machines are connected with 2x 1 Gbit/s bonding (mode 6, balance-alb)
> > > to an HP switch.
> > 
> > These are just telling you that some operations are taking > 30 seconds.
> > The 'waiting for sub ops' means that it is waiting for the write/update to
> > be acked by other replicas.  Either there is some load imbalance (some
> > osds are more busy than others), or everyone is similarly loaded and the
> > request queues are just long across the board.
> 
> Mhm, but there must be something wrong in my test setup.
> 
> 1.) the osd bench shows 150 MB/s per OSD
> 2.) iperf shows a constant 930 Mbit/s per interface
> 3.) when I write 16 GB with dd to the ceph mount I see spikes to 450 Mbit/s
> and drops to 90 kB/s for long periods of time. The overall dd speed is
> then 40 Mbit/s

We were doing some btrfs performance testing this week and seeing similar 
bursty behavior.  (See my May 7th email on btrfs performance for some 
out-of-context detail.)  It currently looks like the workload we're 
presenting to the fs is resulting in non-optimal writeout, but we haven't 
figured out yet how we can improve that or nailed down the source of the 
burstiness.  Now that some of the Inktank launch stuff is out of the way 
we'll pick it up again and continue working on it this week.

If you're interested in getting involved, ping nhm in #ceph.  It would be 
interesting to capture a block trace from your environment and see if 
what you're seeing is what we're seeing!
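
For example, something along these lines on one of the OSD nodes while
the dd test is running (a rough sketch; the device name is an assumption,
and blktrace/blkparse must be installed):

import subprocess

DEV = "/dev/sda"   # assumed OSD data disk, adjust to your layout

# record 60 seconds of block-layer events, then render them readable
subprocess.check_call(["blktrace", "-d", DEV, "-o", "trace", "-w", "60"])
subprocess.check_call(["blkparse", "-i", "trace", "-o", "trace.txt"])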

sage


> 4.) the speed drops to 90 kB/s while I'm seeing the
> "[WRN] slow request received at 2012-05-13 20:01:55.227811:
> osd_op(client.4102.1:38432 10000000004.000003ae [write 0~4194304]
> 0.5f4dfca8 snapc 1=[]) currently waiting for sub ops"
> messages.
> 
> The client shows this in dmesg:
> [2012-05-13 19:55:26]  libceph:  tid 38132 timed out on osd2, will reset osd
> [2012-05-13 19:55:46]  libceph:  tid 38400 timed out on osd0, will reset osd
> [2012-05-13 19:56:31]  libceph:  tid 38886 timed out on osd2, will reset osd
> 
> greets and thanks
> Stefan

* Re: waiting for sub ops
  2012-05-14  4:28     ` Sage Weil
@ 2012-05-14  6:36       ` Stefan Priebe - Profihost AG
  0 siblings, 0 replies; 7+ messages in thread
From: Stefan Priebe - Profihost AG @ 2012-05-14  6:36 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel


>> 1.) the osd bench shows 150 MB/s per OSD
>> 2.) iperf shows a constant 930 Mbit/s per interface
>> 3.) when I write 16 GB with dd to the ceph mount I see spikes to 450 Mbit/s
>> and drops to 90 kB/s for long periods of time. The overall dd speed is
>> then 40 Mbit/s
> 
> We were doing some btrfs performance testing this week and seeing similar 
> bursty behavior.  (See my May 7th email on btrfs performance for some 
> out-of-context detail.)  It currently looks like the workload we're 
> presenting to the fs is resulting in non-optimal writeout, but we haven't 
> figured out yet how we can improve that or nailed down the source of the 
> burstiness.  Now that some of the Inktank launch stuff is out of the way 
> we'll pick it up again and continue working on it this week.
> 
> If you're interested in getting involved, ping nhm in #ceph.  It would be 
> interesting to capture a block trace from your environment and see if 
> what you're seeing is what we're seeing!
Thanks for the information - we have holidays in Germany from Thursday on
and I won't be available, so thanks anyway for the offer.

But if you want I can provide a trace. Just tell me how to produce one.

Thanks!

Stefan

