From: Yann Dupont <Yann.Dupont@univ-nantes.fr>
To: Gregory Farnum <greg@inktank.com>
Cc: Sam Just <sam.just@inktank.com>, ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: domino-style OSD crash
Date: Fri, 06 Jul 2012 09:19:56 +0200
Message-ID: <4FF6919C.8080201@univ-nantes.fr>
In-Reply-To: <CAPYLRzg3vJNGYUwBrZ6e9G6x-URCodAxFLfXRL10T8yOqx+wVQ@mail.gmail.com>
On 05/07/2012 23:32, Gregory Farnum wrote:
[...]
>> OK, so as all nodes were identical, I probably hit a btrfs bug (like an
>> erroneous out-of-space condition) at more or less the same time. And when
>> 1 OSD was out,
Oh, I didn't finish the sentence... When 1 OSD was out, the missing data
was copied to other nodes, probably hastening the btrfs problem on those
nodes (I suspect erroneous out-of-space conditions).
I've reformatted the OSDs with xfs. Performance is slightly worse for the
moment (well, it depends on the workload, and maybe the lack of syncfs is
to blame), but at least I hope to have a rock-solid storage layer. BTW,
I've managed to keep the faulty btrfs volumes.
[...]
>>> I wonder if maybe there's a confounding factor here — are all your nodes
>>> similar to each other,
>> Yes. I designed the cluster that way. All nodes are identical hardware
>> (PowerEdge M610, 10G Intel Ethernet + Emulex Fibre Channel attached to
>> storage: 1 array for 2 OSD nodes, 1 controller dedicated to each OSD).
> Oh, interesting. Are the broken nodes all on the same set of arrays?
No. There are 4 completely independent RAID arrays, in 4 different
locations. They are similar (same brand & model, but slightly different
disks, and 1 with different firmware), and all arrays are multipathed. I
don't think the RAID array is the problem. We have been using those
particular models for 2 to 3 years, and in the logs I don't see any
problem that could be caused by the storage itself (like SCSI or
multipath errors).
Cheers,
--
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@univ-nantes.fr