From: Greg Farnum <greg@inktank.com>
To: Sam Just <sam.just@inktank.com>
Cc: Tommi Virtanen <tv@inktank.com>,
	Yann Dupont <Yann.Dupont@univ-nantes.fr>,
	ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: domino-style OSD crash
Date: Mon, 4 Jun 2012 11:34:15 -0700
Message-ID: <10879AB1820F4E288E10129A1EF14183@inktank.com>
In-Reply-To: <CA+4uBUYoDFWcYhmd_EacQgJSf+i=WcA7x-PNWZ0EerD+_fTAjg@mail.gmail.com>

This is probably the same as, or similar to, http://tracker.newdream.net/issues/2462, no? There's a log there, though I have no idea how helpful it is.


On Monday, June 4, 2012 at 10:40 AM, Sam Just wrote:

> Can you send the osd logs? The merge_log crashes are probably fixable
> if I can see the logs.
> 
> The leveldb crash is almost certainly a result of memory corruption.
> 
> Thanks
> -Sam
> 
> On Mon, Jun 4, 2012 at 9:16 AM, Tommi Virtanen <tv@inktank.com> wrote:
> > On Mon, Jun 4, 2012 at 1:44 AM, Yann Dupont <Yann.Dupont@univ-nantes.fr> wrote:
> > > Results: worked like a charm for two days, apart from btrfs warning messages,
> > > then the OSDs began to crash one after another, 'domino style'.
> > 
> > 
> > 
> > Sorry to hear that. Reading through your message, there seem to be
> > several problems; whether they are because of the same root cause, I
> > can't tell.
> > 
> > Quick triage to benefit the other devs:
> > 
> > #1: kernel crash, no details available
> > > One of the physical machines was in a kernel oops state - Nothing was remote
> > 
> > 
> > 
> > #2: leveldb corruption? may be memory corruption that started
> > elsewhere.. Sam, does this look like the leveldb issue you saw?
> > > [push] v 1438'9416 snapset=0=[]:[] snapc=0=[]) v6 currently started
> > > 0> 2012-06-03 12:55:33.088034 7ff1237f6700 -1 *** Caught signal
> > > (Aborted) **
> > 
> > 
> > ...
> > > 13: (leveldb::InternalKeyComparator::FindShortestSeparator(std::string*,
> > > leveldb::Slice const&) const+0x4d) [0x6ef69d]
> > > 14: (leveldb::TableBuilder::Add(leveldb::Slice const&, leveldb::Slice
> > > const&)+0x9f) [0x6fdd9f]
> > 
> > 
> > 
> > #3: PG::merge_log assertion while recovering from the above; Sam, any ideas?
> > > 0> 2012-06-03 13:36:48.147020 7f74f58b6700 -1 osd/PG.cc: In function
> > > 'void PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&, int)'
> > > thread 7f74f58b6700 time 2012-06-03 13:36:48.100157
> > > osd/PG.cc: 402: FAILED assert(log.head >= olog.tail && olog.head >=
> > > log.tail)
> > 
> > 
> > 
> > #4: unknown btrfs warnings; there should be an actual message above this
> > traceback; believed fixed in the latest kernel
> > > Jun 2 23:40:03 chichibu.u14.univ-nantes.prive kernel: [200652.479278]
> > > [<ffffffffa026fca5>] ? btrfs_orphan_commit_root+0x105/0x110 [btrfs]
> > > Jun 2 23:40:03 chichibu.u14.univ-nantes.prive kernel: [200652.479328]
> > > [<ffffffffa026965a>] ? commit_fs_roots.isra.22+0xaa/0x170 [btrfs]
> > > Jun 2 23:40:03 chichibu.u14.univ-nantes.prive kernel: [200652.479379]
> > > [<ffffffffa02bc9a0>] ? btrfs_scrub_pause+0xf0/0x100 [btrfs]
> > > Jun 2 23:40:03 chichibu.u14.univ-nantes.prive kernel: [200652.479415]
> > > [<ffffffffa026a6f1>] ? btrfs_commit_transaction+0x521/0x9d0 [btrfs]
> > > Jun 2 23:40:03 chichibu.u14.univ-nantes.prive kernel: [200652.479460]
> > > [<ffffffff8105a9f0>] ? add_wait_queue+0x60/0x60
> > > Jun 2 23:40:03 chichibu.u14.univ-nantes.prive kernel: [200652.479493]
> > > [<ffffffffa026aba0>] ? btrfs_commit_transaction+0x9d0/0x9d0 [btrfs]
> > > Jun 2 23:40:03 chichibu.u14.univ-nantes.prive kernel: [200652.479543]
> > > [<ffffffffa026abb1>] ? do_async_commit+0x11/0x20 [btrfs]
> > > Jun 2 23:40:03 chichibu.u14.univ-nantes.prive kernel: [200652.479572]
> > 
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> 
> 
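For readers following triage item #3: the failed assertion in PG::merge_log requires that the local log and the incoming log (olog) cover overlapping version ranges. Below is a minimal sketch of that condition only, not the Ceph implementation. It assumes versions order like (epoch, version) pairs compared lexicographically; the names EVersion, LogBounds and logs_overlap are hypothetical, and the numbers are illustrative rather than taken from the logs above.

```python
from typing import NamedTuple


class EVersion(NamedTuple):
    """Hypothetical stand-in for a log version: (epoch, version).

    Assumption: versions compare lexicographically, as tuples do.
    """
    epoch: int
    version: int


class LogBounds(NamedTuple):
    """Hypothetical stand-in for a PG log, reduced to its tail and head."""
    tail: EVersion
    head: EVersion


def logs_overlap(log: LogBounds, olog: LogBounds) -> bool:
    # Same boolean condition as the assert quoted in #3:
    #   log.head >= olog.tail && olog.head >= log.tail
    return log.head >= olog.tail and olog.head >= log.tail


# Illustrative values only.
local = LogBounds(tail=EVersion(10, 100), head=EVersion(12, 250))
peer_ok = LogBounds(tail=EVersion(11, 200), head=EVersion(13, 300))
peer_gap = LogBounds(tail=EVersion(12, 260), head=EVersion(13, 300))

print(logs_overlap(local, peer_ok))   # True: the two ranges share entries
print(logs_overlap(local, peer_gap))  # False: peer log starts after local head
```

The second case is the shape that would trip the assertion: the two logs have no version in common, so there is no point from which one could be merged into the other.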




Thread overview: 25+ messages
2012-06-04  8:44 domino-style OSD crash Yann Dupont
2012-06-04 16:16 ` Tommi Virtanen
2012-06-04 17:40   ` Sam Just
2012-06-04 18:34     ` Greg Farnum [this message]
2012-07-03  8:40     ` Yann Dupont
2012-07-03 19:42       ` Tommi Virtanen
2012-07-03 20:54         ` Yann Dupont
2012-07-03 21:38           ` Tommi Virtanen
2012-07-04  8:06             ` Yann Dupont
2012-07-04 16:21               ` Gregory Farnum
2012-07-04 17:53                 ` Yann Dupont
2012-07-05 21:32                   ` Gregory Farnum
2012-07-06  7:19                     ` Yann Dupont
2012-07-06 17:01                       ` Gregory Farnum
2012-07-07  8:19                         ` Yann Dupont
2012-07-09 17:14                           ` Samuel Just
2012-07-10  9:46                             ` Yann Dupont
2012-07-10 15:56                               ` Tommi Virtanen
2012-07-10 16:39                                 ` Yann Dupont
2012-07-10 17:11                                   ` Tommi Virtanen
2012-07-10 17:36                                     ` Yann Dupont
2012-07-10 18:16                                       ` Tommi Virtanen
2012-07-09 17:43               ` Tommi Virtanen
2012-07-09 19:05                 ` Yann Dupont
2012-07-09 19:48                   ` Tommi Virtanen
