From: "Jim Schutt" <jaschut@sandia.gov>
To: Colin McCabe <cmccabe@alumni.cmu.edu>
Cc: Sage Weil <sage@newdream.net>,
	Gregory Farnum <gregory.farnum@dreamhost.com>,
	"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: cosd multi-second stalls cause "wrongly marked me down"
Date: Thu, 3 Mar 2011 13:47:14 -0700
Message-ID: <1299185234.4750.137.camel@sale659.sandia.gov>
In-Reply-To: <1299182603.4750.128.camel@sale659.sandia.gov>


On Thu, 2011-03-03 at 13:03 -0700, Jim Schutt wrote:
> > If none of that works, it's possible that someone is calling exit()
> > somewhere. You can attach a gdb to the process and put a breakpoint on
> > exit() to see if this is going on. There's a lot of "your foo is not
> > bar enough, I hate your config, exit(1)" type code that gets executed
> > while the daemon is starting up. It sounds like you should be past
> > that point, though.
> 
> I've finally gotten a little info, using a variant of
> your gdb idea: I waited until many of the OSD instances
> had died, then I attached gdb to several that were left,
> and waited.
> 
> Two of them died the same way, like this:
> 
> Program received signal SIGPIPE, Broken pipe.
> [Switching to Thread 0x7fd7888c8940 (LWP 28693)]
> 0x00007fd7a9b82f2b in sendmsg () from /lib64/libpthread.so.0
> (gdb) bt
> #0  0x00007fd7a9b82f2b in sendmsg () from /lib64/libpthread.so.0
> #1  0x0000000000672e0b in SimpleMessenger::Pipe::do_sendmsg (
>     this=0x7fd799b67c20, sd=13, msg=0x7fd7888c7f20, len=251237, more=false)
>     at msg/SimpleMessenger.cc:1994
> #2  0x00000000006739d3 in SimpleMessenger::Pipe::write_message (
>     this=0x7fd799b67c20, m=0x7fd79b2dcb70) at msg/SimpleMessenger.cc:2217
> #3  0x000000000067e74a in SimpleMessenger::Pipe::writer (this=0x7fd799b67c20)
>     at msg/SimpleMessenger.cc:1734
> #4  0x000000000066fa2b in SimpleMessenger::Pipe::Writer::entry (
>     this=0x7fd799b67e70) at msg/SimpleMessenger.h:204
> #5  0x000000000068282e in Thread::_entry_func (arg=0x7fd799b67e70)
>     at ./common/Thread.h:41
> #6  0x00007fd7a9b7b73d in start_thread (arg=<value optimized out>)
>     at pthread_create.c:301
> #7  0x00007fd7a8a91f6d in clone () from /lib64/libc.so.6
> (gdb) 
> 

Has something maybe changed in signal handling recently?

Maybe SIGPIPE used to be blocked or ignored, so sendmsg() would
just fail with EPIPE, but now it's neither blocked nor handled?

This bit in linux-2.6.git/net/core/stream.c is what made
me wonder, but maybe it's a red herring:

int sk_stream_error(struct sock *sk, int flags, int err)
{
	if (err == -EPIPE)
		err = sock_error(sk) ? : -EPIPE;
	if (err == -EPIPE && !(flags & MSG_NOSIGNAL))
		send_sig(SIGPIPE, current, 0);
	return err;
}
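
For reference, here is a minimal userspace sketch of the two usual ways
to get EPIPE back instead of a fatal SIGPIPE, matching the
MSG_NOSIGNAL check in the kernel code above. It assumes Linux and uses
an AF_UNIX socketpair as a stand-in for a TCP connection whose peer has
gone away. The names send_to_closed_peer() and demo() are made up for
illustration; this is not what SimpleMessenger does.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: send one byte on a stream socket whose peer has
 * already been closed. Returns the errno reported by send(), 0 if the
 * send unexpectedly succeeded, or -1 if the socketpair could not be made.
 */
static int send_to_closed_peer(int flags)
{
    int sv[2];
    char byte = 'x';
    ssize_t n;
    int err;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    close(sv[1]);                       /* the peer goes away */

    n = send(sv[0], &byte, 1, flags);   /* write to the broken pipe */
    err = (n < 0) ? errno : 0;

    close(sv[0]);
    return err;
}

/* Illustration only: both approaches turn the signal into an error return. */
static void demo(void)
{
    /* Per call: MSG_NOSIGNAL suppresses SIGPIPE for this send only. */
    assert(send_to_closed_peer(MSG_NOSIGNAL) == EPIPE);

    /* Process wide: ignore SIGPIPE, then even a plain send is safe. */
    signal(SIGPIPE, SIG_IGN);
    assert(send_to_closed_peer(0) == EPIPE);
}
```

Calling send_to_closed_peer(0) with SIGPIPE at its default disposition
would kill the process, which looks like what gdb caught in the
backtrace above.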

-- Jim



