All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jim Schutt" <jaschut@sandia.gov>
To: Sage Weil <sage@newdream.net>
Cc: Gregory Farnum <gregory.farnum@dreamhost.com>,
	"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: cosd multi-second stalls cause "wrongly marked me down"
Date: Fri, 11 Mar 2011 11:26:28 -0700	[thread overview]
Message-ID: <1299867988.4750.394.camel@sale659.sandia.gov> (raw)
In-Reply-To: <Pine.LNX.4.64.1103101538100.20133@cobra.newdream.net>


On Thu, 2011-03-10 at 16:40 -0700, Sage Weil wrote:
> > > Hmm, maybe.  I wouldn't expect this behavior from any allocator, though!
> > >
> > > Can you drill down a bit further and see if either of these is
> > > responsible?
> > >
> > >   virtual ~Message() {
> > >     assert(nref.read() == 0);
> > >     if (connection)
> > >       connection->put();
> > >     if (throttler)
> > >       throttler->put(payload.length() + middle.length() + data.length());
> > >   }
> > >
> > > (msg/Message.h)
> >
> > Hmmm, this is the patch I'm running to produce above.
> > It seems pretty definitive to me; am I missing something?
> 
>         delete this;
> 
> is calling the virtual destructor ~MOSDPing(), and then ~Message(), and
> only then releasing the memory to the allocator.  ~MOSDPing doesn't do
> anything, but ~Message adjusts the throttler (which involves a mutex that
> *shouldn't* be contended :) and a connection->put(), which calls
> ~Connection() and then releases memory.
> 
> My money is on the throttler, but let's see!

I added this patch:

--- a/src/msg/Message.h
+++ b/src/msg/Message.h
@@ -158,8 +158,8 @@ struct RefCountedObject {
       utime_t s = g_clock.now();
       delete this;
       utime_t e = g_clock.now();
-      if (e - s > 0.5) {
-       generic_dout(1) << "RefCountedObject::put delete " << this << " took " << e - s << " secs!" << dendl;
+      if (double(e - s) > 0.5) {
+       generic_dout(1) << "RefCountedObject::put delete " << this << " took " << double(e - s) << " secs!" << dendl;
       }
     }
   }
@@ -304,10 +304,22 @@ public:
 protected:
   virtual ~Message() { 
     assert(nref.read() == 0);
-    if (connection)
+    if (connection) {
+      utime_t s = g_clock.now();
       connection->put();
-    if (throttler)
+      utime_t e = g_clock.now();
+      if (double(e - s) > 0.25) {
+       generic_dout(1) << "~Message() " << this << " connection->put took " << double(e - s) << " secs!" << dendl;
+      }
+    }
+    if (throttler) {
+      utime_t s = g_clock.now();
       throttler->put(payload.length() + middle.length() + data.length());
+      utime_t e = g_clock.now();
+      if (double(e - s) > 0.25) {
+       generic_dout(1) << "~Message() " << this << " throttler->put took " << double(e - s) << " secs!" << dendl;
+      }
+    }
   }
 public:
   Connection *get_connection() { return connection; }



I got these hits:
# egrep -Hn --color -e "throttler->put took" osd.*.log
osd.16.log:1029077:2011-03-11 10:56:05.898264 7f30a7398940 ~Message() 0x2214ea0 throttler->put took 0.39543 secs!
osd.41.log:1401766:2011-03-11 10:56:51.110989 7ffad771e940 ~Message() 0x7ffaa7c12c40 throttler->put took 0.28708 secs!
osd.41.log:1494336:2011-03-11 10:57:07.748022 7ffad9123940 ~Message() 0x1adc8c0 throttler->put took 0.87857 secs!

Here are the corresponding messages:

osd.16.log:1011274:2011-03-11 10:55:48.805427 7f3089c9e940 -- 172.17.40.23:6800/22969 >> 172.17.40.68:0/1519385907 pipe(0x20c9930 sd=161 pgs=103 cs=1 l=1).reader got message 1 0x2214ea0 osd_op(client4212.1:39 10000004a4b.00000026 [write 0~4194304 [1@-1]] 0.2815 RETRY snapc 1=[])
osd.16.log:1027713:2011-03-11 10:56:05.679653 7f30a9d9f940 -- 172.17.40.23:6800/22969 dispatch_entry pipe 0x20c9930 dequeued 0x2214ea0
osd.16.log:1027714:2011-03-11 10:56:05.679666 7f30a9d9f940 -- 172.17.40.23:6800/22969 <== client4212 172.17.40.68:0/1519385907 1 ==== osd_op(client4212.1:39 10000004a4b.00000026 [write 0~4194304 [1@-1]] 0.2815 RETRY snapc 1=[]) ==== 128+0+4194304 (4010875753 0 0) 0x2214ea0 con 0x28b9e00
osd.16.log:1027717:2011-03-11 10:56:05.679688 7f30a9d9f940 osd16 14 _dispatch 0x2214ea0 osd_op(client4212.1:39 10000004a4b.00000026 [write 0~4194304 [1@-1]] 0.2815 RETRY snapc 1=[])
osd.16.log:1027718:2011-03-11 10:56:05.679703 7f30a9d9f940 osd16 14 require_same_or_newer_map 14 (i am 14) 0x2214ea0
osd.16.log:1027722:2011-03-11 10:56:05.679768 7f30a9d9f940 osd16 14 pg[0.815( v 14'6 (0'0,14'6] n=6 ec=3 les=6 3/3/3) [16,83] r=0 luod=14'5 lcod 14'5 mlcod 0'0 active+clean] enqueue_op 0x2214ea0 osd_op(client4212.1:39 10000004a4b.00000026 [write 0~4194304 [1@-1]] 0.2815 RETRY snapc 1=[])
osd.16.log:1027725:2011-03-11 10:56:05.679793 7f30a9d9f940 -- 172.17.40.23:6800/22969 dispatch_entry done with 0x2214ea0 que_et 16.874237 op_et 0.000117 tot_et 16.874354
osd.16.log:1029077:2011-03-11 10:56:05.898264 7f30a7398940 ~Message() 0x2214ea0 throttler->put took 0.39543 secs!
osd.16.log:1029078:2011-03-11 10:56:05.898287 7f30a7398940 osd16 14 dequeue_op 0x2214ea0 finish

osd.41.log:1400466:2011-03-11 10:56:50.926741 7ffab79cb940 -- 172.17.40.27:6803/25508 >> 172.17.40.97:0/893403283 pipe(0x7ffabc6eba50 sd=156 pgs=194 cs=1 l=1).reader got message 2 0x7ffaa7c12c40 osd_op(client4251.1:250 1000000cf3d.000000f4 [write 0~4194304 [1@-1]] 0.2c5b RETRY snapc 1=[])
osd.41.log:1401371:2011-03-11 10:56:51.069569 7ffad9924940 -- 172.17.40.27:6803/25508 dispatch_entry pipe 0x7ffabc6eba50 dequeued 0x7ffaa7c12c40
osd.41.log:1401372:2011-03-11 10:56:51.069584 7ffad9924940 -- 172.17.40.27:6803/25508 <== client4251 172.17.40.97:0/893403283 2 ==== osd_op(client4251.1:250 1000000cf3d.000000f4 [write 0~4194304 [1@-1]] 0.2c5b RETRY snapc 1=[]) ==== 128+0+4194304 (4206015051 0 0) 0x7ffaa7c12c40 con 0x7ffabc1c6640
osd.41.log:1401375:2011-03-11 10:56:51.069608 7ffad9924940 osd41 14 _dispatch 0x7ffaa7c12c40 osd_op(client4251.1:250 1000000cf3d.000000f4 [write 0~4194304 [1@-1]] 0.2c5b RETRY snapc 1=[])
osd.41.log:1401376:2011-03-11 10:56:51.069615 7ffad9924940 osd41 14 require_same_or_newer_map 14 (i am 14) 0x7ffaa7c12c40
osd.41.log:1401380:2011-03-11 10:56:51.069675 7ffad9924940 osd41 14 pg[0.c5b( v 14'8 (14'2,14'8] n=8 ec=3 les=5 3/3/3) [41,55] r=0 mlcod 14'3 active+clean] enqueue_op 0x7ffaa7c12c40 osd_op(client4251.1:250 1000000cf3d.000000f4 [write 0~4194304 [1@-1]] 0.2c5b RETRY snapc 1=[])
osd.41.log:1401383:2011-03-11 10:56:51.069704 7ffad9924940 -- 172.17.40.27:6803/25508 dispatch_entry done with 0x7ffaa7c12c40 que_et 0.142841 op_et 0.000110 tot_et 0.142951
osd.41.log:1401766:2011-03-11 10:56:51.110989 7ffad771e940 ~Message() 0x7ffaa7c12c40 throttler->put took 0.28708 secs!
osd.41.log:1401782:2011-03-11 10:56:51.111711 7ffad771e940 osd41 14 dequeue_op 0x7ffaa7c12c40 finish

osd.41.log:1406682:2011-03-11 10:56:52.368878 7ffab55a9940 -- 172.17.40.27:6803/25508 >> 172.17.40.49:0/530169199 pipe(0x1b03a90 sd=209 pgs=137 cs=1 l=1).reader got message 6 0x1adc8c0 osd_op(client4200.1:460 10000000bbb.000001c5 [write 0~4194304 [1@-1]] 0.a5b1 RETRY snapc 1=[])
osd.41.log:1439271:2011-03-11 10:56:59.443761 7ffad9924940 -- 172.17.40.27:6803/25508 dispatch_entry pipe 0x1b03a90 dequeued 0x1adc8c0
osd.41.log:1439272:2011-03-11 10:56:59.443782 7ffad9924940 -- 172.17.40.27:6803/25508 <== client4200 172.17.40.49:0/530169199 6 ==== osd_op(client4200.1:460 10000000bbb.000001c5 [write 0~4194304 [1@-1]] 0.a5b1 RETRY snapc 1=[]) ==== 128+0+4194304 (3016272799 0 0) 0x1adc8c0 con 0x8114be0
osd.41.log:1439275:2011-03-11 10:56:59.443806 7ffad9924940 osd41 21 _dispatch 0x1adc8c0 osd_op(client4200.1:460 10000000bbb.000001c5 [write 0~4194304 [1@-1]] 0.a5b1 RETRY snapc 1=[])
osd.41.log:1439276:2011-03-11 10:56:59.443816 7ffad9924940 osd41 21 require_same_or_newer_map 15 (i am 21) 0x1adc8c0
osd.41.log:1439306:2011-03-11 10:56:59.449276 7ffad9924940 -- 172.17.40.27:6803/25508 dispatch_entry done with 0x1adc8c0 que_et 7.074903 op_et 0.005482 tot_et 7.080385
osd.41.log:1493847:2011-03-11 10:57:07.658838 7ffad9123940 osd41 28 _dispatch 0x1adc8c0 osd_op(client4200.1:460 10000000bbb.000001c5 [write 0~4194304 [1@-1]] 0.a5b1 RETRY snapc 1=[])
osd.41.log:1493848:2011-03-11 10:57:07.658848 7ffad9123940 osd41 28 require_same_or_newer_map 15 (i am 28) 0x1adc8c0
osd.41.log:1494336:2011-03-11 10:57:07.748022 7ffad9123940 ~Message() 0x1adc8c0 throttler->put took 0.87857 secs!
osd.41.log:1494337:2011-03-11 10:57:07.748043 7ffad9123940 RefCountedObject::put delete 0x1adc8c0 took 0.87917 secs!


So none of those were osd_ping messages.

But, I still had lots of delayed acks.  Here's a couple more examples:

osd.0.log:960713:2011-03-11 10:55:32.117721 7fc4cb7fe940 -- 172.17.40.21:6802/28363 --> osd74 172.17.40.31:6808/24916 -- osd_ping(e14 as_of 14) v1 -- ?+0 0x7fc4cc16f270
osd.0.log:960756:2011-03-11 10:55:32.118395 7fc4c96eb940 -- 172.17.40.21:6802/28363 >> 172.17.40.31:6808/24916 pipe(0x28fb580 sd=53 pgs=47 cs=1 l=0).writer encoding 289 0x7fc4cc16f270 osd_ping(e14 as_of 14) v1
osd.0.log:960757:2011-03-11 10:55:32.118409 7fc4c96eb940 -- 172.17.40.21:6802/28363 >> 172.17.40.31:6808/24916 pipe(0x28fb580 sd=53 pgs=47 cs=1 l=0).writer sending 289 0x7fc4cc16f270
osd.0.log:960758:2011-03-11 10:55:32.118422 7fc4c96eb940 -- 172.17.40.21:6802/28363 >> 172.17.40.31:6808/24916 pipe(0x28fb580 sd=53 pgs=47 cs=1 l=0).write_message 0x7fc4cc16f270 osd_ping(e14 as_of 14) v1
osd.0.log:963163:2011-03-11 10:55:32.941413 7fc4c61b6940 -- 172.17.40.21:6802/28363 >> 172.17.40.31:6808/24916 pipe(0x28fb580 sd=53 pgs=47 cs=1 l=0).reader got ack seq 289 >= 289 on 0x7fc4cc16f270 osd_ping(e14 as_of 14) v1
osd.0.log:964273:2011-03-11 10:55:33.447526 7fc4c61b6940 RefCountedObject::put delete 0x7fc4cc16f270 took 5.06013 secs!
osd.0.log:964274:2011-03-11 10:55:33.447538 7fc4c61b6940 -- 172.17.40.21:6802/28363 >> 172.17.40.31:6808/24916 pipe(0x28fb580 sd=53 pgs=47 cs=1 l=0).handle_ack finished put on 0x7fc4cc16f270

osd.0.log:964311:2011-03-11 10:55:33.448063 7fc4cb7fe940 -- 172.17.40.21:6802/28363 --> osd77 172.17.40.31:6817/25216 -- osd_ping(e14 as_of 14) v1 -- ?+0 0x7fc4cc2f9a90
osd.0.log:964356:2011-03-11 10:55:33.448883 7fc4ba5a5940 -- 172.17.40.21:6802/28363 >> 172.17.40.31:6817/25216 pipe(0x27abca0 sd=108 pgs=81 cs=1 l=0).writer encoding 277 0x7fc4cc2f9a90 osd_ping(e14 as_of 14) v1
osd.0.log:964357:2011-03-11 10:55:33.448896 7fc4ba5a5940 -- 172.17.40.21:6802/28363 >> 172.17.40.31:6817/25216 pipe(0x27abca0 sd=108 pgs=81 cs=1 l=0).writer sending 277 0x7fc4cc2f9a90
osd.0.log:964358:2011-03-11 10:55:33.448907 7fc4ba5a5940 -- 172.17.40.21:6802/28363 >> 172.17.40.31:6817/25216 pipe(0x27abca0 sd=108 pgs=81 cs=1 l=0).write_message 0x7fc4cc2f9a90 osd_ping(e14 as_of 14) v1
osd.0.log:972337:2011-03-11 10:55:34.976054 7fc4ba8a8940 -- 172.17.40.21:6802/28363 >> 172.17.40.31:6817/25216 pipe(0x27abca0 sd=108 pgs=81 cs=1 l=0).reader got ack seq 278 >= 277 on 0x7fc4cc2f9a90 osd_ping(e14 as_of 14) v1
osd.0.log:977785:2011-03-11 10:55:45.119599 7fc4ba8a8940 RefCountedObject::put delete 0x7fc4cc2f9a90 took 11.4353 secs!
osd.0.log:977786:2011-03-11 10:55:45.119612 7fc4ba8a8940 -- 172.17.40.21:6802/28363 >> 172.17.40.31:6817/25216 pipe(0x27abca0 sd=108 pgs=81 cs=1 l=0).handle_ack finished put on 0x7fc4cc2f9a90

Seems like this pretty much rules out anything but the memory allocator,
but maybe I'm still missing something?

-- Jim

> 
> sage
> 



  parent reply	other threads:[~2011-03-11 18:26 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-02-16 21:25 cosd multi-second stalls cause "wrongly marked me down" Jim Schutt
2011-02-16 21:37 ` Wido den Hollander
2011-02-16 21:51   ` Jim Schutt
2011-02-16 21:40 ` Gregory Farnum
2011-02-16 21:50   ` Jim Schutt
2011-02-17  0:50     ` Sage Weil
2011-02-17  0:54       ` Sage Weil
2011-02-17 15:46         ` Jim Schutt
2011-02-17 16:11           ` Sage Weil
2011-02-17 23:31             ` Jim Schutt
2011-02-18  7:13               ` Sage Weil
2011-02-18 17:04                 ` Jim Schutt
2011-02-18 17:15                 ` Gregory Farnum
2011-02-18 18:41                 ` Jim Schutt
2011-02-18 19:07                 ` Colin McCabe
2011-02-18 20:48                   ` Jim Schutt
2011-02-18 20:58                     ` Sage Weil
2011-02-18 21:09                       ` Jim Schutt
2011-03-09 16:02               ` Jim Schutt
2011-03-09 17:07                 ` Gregory Farnum
2011-03-09 18:36                   ` Jim Schutt
2011-03-09 19:37                     ` Gregory Farnum
2011-03-10 23:09                       ` Jim Schutt
2011-03-10 23:21                         ` Sage Weil
2011-03-10 23:32                           ` Jim Schutt
2011-03-10 23:40                             ` Sage Weil
2011-03-11 14:51                               ` Jim Schutt
2011-03-11 18:26                               ` Jim Schutt [this message]
2011-03-11 18:37                                 ` Jim Schutt
2011-03-11 18:37                                 ` Sage Weil
2011-03-11 18:51                                   ` Jim Schutt
2011-03-11 19:09                                     ` Gregory Farnum
2011-03-11 19:13                                       ` Yehuda Sadeh Weinraub
2011-03-11 19:17                                         ` Yehuda Sadeh Weinraub
2011-03-11 19:16                                       ` Jim Schutt
2011-03-11 21:13                                   ` Jim Schutt
2011-03-11 21:37                                     ` Sage Weil
2011-03-11 22:21                                       ` Jim Schutt
2011-03-11 22:26                                         ` Jim Schutt
2011-03-11 22:45                                           ` Sage Weil
2011-03-11 23:29                                             ` Jim Schutt
2011-03-30 21:26                                       ` Jim Schutt
2011-03-30 21:55                                         ` Sage Weil
2011-03-31 14:16                                           ` Jim Schutt
2011-03-31 16:25                                             ` Sage Weil
2011-03-31 17:00                                               ` Jim Schutt
2011-03-31 17:10                                                 ` Jim Schutt
2011-03-31 17:24                                                   ` Sage Weil
2011-03-31 18:08                                                     ` Jim Schutt
2011-03-31 18:41                                                       ` Sage Weil
2011-04-01 22:38                                                         ` Jim Schutt
2011-02-23 17:52             ` Jim Schutt
2011-02-23 18:12               ` Gregory Farnum
2011-02-23 18:54                 ` Sage Weil
2011-02-23 19:12                   ` Gregory Farnum
2011-02-23 19:23                 ` Jim Schutt
2011-02-23 20:27                   ` Gregory Farnum
2011-03-02  0:53                   ` Sage Weil
2011-03-02 15:21                     ` Jim Schutt
2011-03-02 17:10                       ` Sage Weil
2011-03-02 20:54                         ` Jim Schutt
2011-03-02 21:45                           ` Sage Weil
2011-03-02 21:59                             ` Jim Schutt
2011-03-02 22:57                               ` Jim Schutt
2011-03-02 23:20                                 ` Gregory Farnum
2011-03-02 23:25                                   ` Jim Schutt
2011-03-02 23:33                                     ` Gregory Farnum
2011-03-03  2:26                                 ` Colin McCabe
2011-03-03 20:03                                   ` Jim Schutt
2011-03-03 20:47                                     ` Jim Schutt
2011-03-03 20:55                                       ` Yehuda Sadeh Weinraub
2011-03-03 21:45                                         ` Jim Schutt
2011-03-03 22:22                                           ` Sage Weil
2011-03-03 22:34                                             ` Jim Schutt
2011-03-03 21:53                                         ` Colin McCabe
2011-03-03 23:06                                           ` Jim Schutt
2011-03-03 23:30                                             ` Colin McCabe
2011-03-03 23:37                                               ` Jim Schutt
2011-03-03  5:03                                 ` Sage Weil
2011-03-03 16:35                                   ` Jim Schutt
2011-03-03 17:28                                   ` Jim Schutt
2011-03-03 18:04                                     ` Sage Weil
2011-03-03 18:42                                       ` Jim Schutt
2011-03-03 18:51                                         ` Sage Weil
2011-03-03 19:39                                           ` Jim Schutt
2011-04-08 16:23       ` Jim Schutt
2011-04-08 20:50         ` Sage Weil
2011-04-08 22:11           ` Jim Schutt
2011-04-08 23:10             ` Colin McCabe
2011-04-11 14:41               ` Jim Schutt
2011-04-11 16:25                 ` Sage Weil
2011-04-11 20:14             ` Jim Schutt
2011-04-11 21:18             ` Jim Schutt
2011-04-11 23:23               ` Sage Weil

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1299867988.4750.394.camel@sale659.sandia.gov \
    --to=jaschut@sandia.gov \
    --cc=ceph-devel@vger.kernel.org \
    --cc=gregory.farnum@dreamhost.com \
    --cc=sage@newdream.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.