From: Tommi Virtanen <tv@inktank.com>
To: Yann Dupont <Yann.Dupont@univ-nantes.fr>
Cc: ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: domino-style OSD crash
Date: Mon, 4 Jun 2012 09:16:19 -0700
Message-ID: <CADvuQRF1EUK-iuwd49TJibaSaTN4G6gbCHRvQ3W_e4JoOZ5ODA@mail.gmail.com>
In-Reply-To: <4FCC7573.3000704@univ-nantes.fr>

On Mon, Jun 4, 2012 at 1:44 AM, Yann Dupont <Yann.Dupont@univ-nantes.fr> wrote:
> Results : Worked like a charm during two days, apart btrfs warn messages
> then OSD begin to crash 1 after all 'domino style'.

Sorry to hear that. Reading through your message, there seem to be
several problems; whether they are because of the same root cause, I
can't tell.

Quick triage to benefit the other devs:

#1: kernel crash, no details available
> 1 of the physical machine was in kernel oops state - Nothing was remote

#2: leveldb corruption? This may be memory corruption that started
elsewhere. Sam, does this look like the leveldb issue you saw?
>  [push] v 1438'9416 snapset=0=[]:[] snapc=0=[]) v6 currently started
>     0> 2012-06-03 12:55:33.088034 7ff1237f6700 -1 *** Caught signal
> (Aborted) **
...
>  13: (leveldb::InternalKeyComparator::FindShortestSeparator(std::string*,
> leveldb::Slice const&) const+0x4d) [0x6ef69d]
>  14: (leveldb::TableBuilder::Add(leveldb::Slice const&, leveldb::Slice
> const&)+0x9f) [0x6fdd9f]

#3: PG::merge_log assertion while recovering from the above; Sam, any ideas?
>     0> 2012-06-03 13:36:48.147020 7f74f58b6700 -1 osd/PG.cc: In function
> 'void PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&, int)'
> thread 7f74f58b6700 time 2012-06-03 13:36:48.100157
> osd/PG.cc: 402: FAILED assert(log.head >= olog.tail && olog.head >=
> log.tail)
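
For readers unfamiliar with the assert quoted above: it requires that
the local log and the other log (olog) cover overlapping ranges, so
that one can be merged into the other without a gap. A minimal Python
sketch of that condition follows. The EVersion and LogRange types are
hypothetical stand-ins, not Ceph's classes, and the (epoch, version)
ordering is an assumption based on the "1438'9416" notation in the
quoted log.

```python
from typing import NamedTuple


class EVersion(NamedTuple):
    """Hypothetical log position, ordered by epoch first, then version."""
    epoch: int
    version: int


class LogRange(NamedTuple):
    """Hypothetical log extent: tail is the older end, head the newer end."""
    tail: EVersion
    head: EVersion


def ranges_overlap(log: LogRange, olog: LogRange) -> bool:
    # Same boolean condition as the quoted assert:
    #   log.head >= olog.tail && olog.head >= log.tail
    return log.head >= olog.tail and olog.head >= log.tail


# Illustrative values only; they are not taken from the crashed cluster.
local = LogRange(EVersion(1438, 9000), EVersion(1438, 9416))
peer_ok = LogRange(EVersion(1438, 9400), EVersion(1438, 9500))
peer_gap = LogRange(EVersion(1438, 9417), EVersion(1438, 9500))

print(ranges_overlap(local, peer_ok))   # the two ranges share entries
print(ranges_overlap(local, peer_gap))  # peer starts past the local head
```

In the second case the other log starts after the local head, which is
the kind of gap the assert rejects. Whether that is what happened on
this cluster cannot be told from the quoted output alone.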

#4: unknown btrfs warnings; there should be an actual message above this
traceback. Believed fixed in the latest kernel.
> Jun  2 23:40:03 chichibu.u14.univ-nantes.prive kernel: [200652.479278]
> [<ffffffffa026fca5>] ? btrfs_orphan_commit_root+0x105/0x110 [btrfs]
> Jun  2 23:40:03 chichibu.u14.univ-nantes.prive kernel: [200652.479328]
> [<ffffffffa026965a>] ? commit_fs_roots.isra.22+0xaa/0x170 [btrfs]
> Jun  2 23:40:03 chichibu.u14.univ-nantes.prive kernel: [200652.479379]
> [<ffffffffa02bc9a0>] ? btrfs_scrub_pause+0xf0/0x100 [btrfs]
> Jun  2 23:40:03 chichibu.u14.univ-nantes.prive kernel: [200652.479415]
> [<ffffffffa026a6f1>] ? btrfs_commit_transaction+0x521/0x9d0 [btrfs]
> Jun  2 23:40:03 chichibu.u14.univ-nantes.prive kernel: [200652.479460]
> [<ffffffff8105a9f0>] ? add_wait_queue+0x60/0x60
> Jun  2 23:40:03 chichibu.u14.univ-nantes.prive kernel: [200652.479493]
> [<ffffffffa026aba0>] ? btrfs_commit_transaction+0x9d0/0x9d0 [btrfs]
> Jun  2 23:40:03 chichibu.u14.univ-nantes.prive kernel: [200652.479543]
> [<ffffffffa026abb1>] ? do_async_commit+0x11/0x20 [btrfs]
> Jun  2 23:40:03 chichibu.u14.univ-nantes.prive kernel: [200652.479572]
