All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sage Weil <sage@inktank.com>
To: Paul Pettigrew <Paul.Pettigrew@mach.com.au>
Cc: "ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: mkcephfs failing on v0.48 "argonaut"
Date: Thu, 5 Jul 2012 15:17:56 -0700 (PDT)	[thread overview]
Message-ID: <Pine.LNX.4.64.1207051512360.21201@cobra.newdream.net> (raw)
In-Reply-To: <81C477727102DA4E9B2605AC748C495419104F51B3@exch10>

Hi Paul,

On Wed, 4 Jul 2012, Paul Pettigrew wrote:
> Firstly, well done guys on achieving this version milestone. I 
> successfully upgraded to the 0.48 format uneventfully on a live (test) 
> system.
> 
> The same system was then going through "rebuild" testing, to confirm 
> that also worked fine.
> 
> 
> Unfortunately, the mkcephfs command is failing:
> 
> root@dsanb1-coy:~# mkcephfs -c /etc/ceph/ceph.conf --allhosts --mkbtrfs -k /etc/ceph/keyring --crushmapsrc crushfile.txt -v
> temp dir is /tmp/mkcephfs.GaRCZ9i06a
> preparing monmap in /tmp/mkcephfs.GaRCZ9i06a/monmap
> /usr/bin/monmaptool --create --clobber --add alpha 10.32.0.10:6789 --add bravo 10.32.0.25:6789 --add charlie 10.32.0.11:6789 --print /tmp/mkcephfs.GaRCZ9i06a/monmap
> /usr/bin/monmaptool: monmap file /tmp/mkcephfs.GaRCZ9i06a/monmap
> /usr/bin/monmaptool: generated fsid c7202495-468c-4678-b678-115c3ee33402
> epoch 0
> fsid c7202495-468c-4678-b678-115c3ee33402
> last_changed 2012-07-04 15:02:31.732275
> created 2012-07-04 15:02:31.732275
> 0: 10.32.0.10:6789/0 mon.alpha
> 1: 10.32.0.11:6789/0 mon.charlie
> 2: 10.32.0.25:6789/0 mon.bravo
> /usr/bin/monmaptool: writing epoch 0 to /tmp/mkcephfs.GaRCZ9i06a/monmap (3 monitors)
> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "user"
> === osd.0 ===
> --- dsanb1-coy# /sbin/mkcephfs -d /tmp/mkcephfs.GaRCZ9i06a --prepare-osdfs osd.0
> umount: /srv/osd.0: not mounted
> umount: /dev/disk/by-wwn/wwn-0x50014ee601246234: not mounted
> 
> WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
> WARNING! - see http://btrfs.wiki.kernel.org before using
> 
> fs created label (null) on /dev/disk/by-wwn/wwn-0x50014ee601246234
>         nodesize 4096 leafsize 4096 sectorsize 4096 size 1.82TB
> Btrfs Btrfs v0.19
> Scanning for Btrfs filesystems
> mount: wrong fs type, bad option, bad superblock on /dev/sdc,
>        missing codepage or helper program, or other error
>        In some cases useful info is found in syslog - try
>        dmesg | tail  or so
> 
> failed: '/sbin/mkcephfs -d /tmp/mkcephfs.GaRCZ9i06a --prepare-osdfs osd.0'

Hmm.  Can you try running with -v?  That will tell us exactly which 
command it is running, and hopefully we can work backwards from there.

> dmesg/syslog is spitting out at the time of this failure:
> 
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.751945] device fsid 7de0d192-b710-4629-a201-849df1d9db17 devid 1 transid 27109 /dev/sdp
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.751987] device fsid 08fc3479-2fa2-4388-8b61-83e2a742a13e devid 1 transid 28699 /dev/sdo
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.752023] device fsid 8b4a7c43-1a05-4dcb-bbed-de2a5c933996 devid 1 transid 24346 /dev/sdn
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.752068] device fsid ba5fb1ca-c642-49b1-8a41-7f56f8e59fbd devid 1 transid 27274 /dev/sdm
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.761453] device fsid 7fe8c5cf-bf8c-4276-90f2-c3f57f5275fb devid 1 transid 28724 /dev/sdi
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.761518] device fsid 93fa3631-1202-4d42-8908-e5ef4d3e600d devid 1 transid 25201 /dev/sdh
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.761579] device fsid b9a1b5e4-3e5e-4381-a29a-33470f4b870f devid 1 transid 23375 /dev/sdg
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.761635] device fsid 280ea990-23f8-4c43-9e56-140c82340fdc devid 1 transid 25559 /dev/sdf
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.761693] device fsid 2f724cde-6de5-4262-b195-1ba3eea2256e devid 1 transid 176 /dev/sde
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.761732] device fsid a66f890f-8b08-4393-aab0-f222637ca5a4 devid 1 transid 7 /dev/sdd
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.761769] device fsid 6c181a94-697c-4e0c-af0d-05eb04d3626c devid 1 transid 7 /dev/sdc
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.775931] device fsid 6c181a94-697c-4e0c-af0d-05eb04d3626c devid 1 transid 7 /dev/sdc
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.779716] btrfs bad fsid on block 20971520
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.791594] btrfs bad fsid on block 20971520
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.803608] btrfs bad fsid on block 20971520
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.815541] btrfs bad fsid on block 20971520
> Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.815878] btrfs bad fsid on block 20971520
> Jul  4 15:02:32 dsanb1-coy kernel: [ 2306.823554] btrfs bad fsid on block 20971520
> Jul  4 15:02:32 dsanb1-coy kernel: [ 2306.823797] btrfs bad fsid on block 20971520
> Jul  4 15:02:32 dsanb1-coy kernel: [ 2306.823887] btrfs: failed to read chunk root on sdc
> Jul  4 15:02:32 dsanb1-coy kernel: [ 2306.825622] btrfs: open_ctree failed

Long shot, but is the kernel on that machine recent?

> Also fails if not forcing to use btrfs, eg:
> 
> root@dsanb1-coy:~# mkcephfs -c /etc/ceph/ceph.conf --allhosts -k /etc/ceph/keyring --crushmapsrc crushfile.txt -v
> temp dir is /tmp/mkcephfs.ZOh6tBPAH0
> preparing monmap in /tmp/mkcephfs.ZOh6tBPAH0/monmap
> /usr/bin/monmaptool --create --clobber --add alpha 10.32.0.10:6789 --add bravo 10.32.0.25:6789 --add charlie 10.32.0.11:6789 --print /tmp/mkcephfs.ZOh6tBPAH0/monmap
> /usr/bin/monmaptool: monmap file /tmp/mkcephfs.ZOh6tBPAH0/monmap
> /usr/bin/monmaptool: generated fsid adb8d65c-a823-4dc2-9415-22b0d7252699
> epoch 0
> fsid adb8d65c-a823-4dc2-9415-22b0d7252699
> last_changed 2012-07-04 15:04:17.423368
> created 2012-07-04 15:04:17.423368
> 0: 10.32.0.10:6789/0 mon.alpha
> 1: 10.32.0.11:6789/0 mon.charlie
> 2: 10.32.0.25:6789/0 mon.bravo
> /usr/bin/monmaptool: writing epoch 0 to /tmp/mkcephfs.ZOh6tBPAH0/monmap (3 monitors)
> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "user"
> === osd.0 ===
> --- dsanb1-coy# /sbin/mkcephfs -d /tmp/mkcephfs.ZOh6tBPAH0 --init-daemon osd.0
> 2012-07-04 15:04:17.789064 7fc7fadca780 -1 filestore(/srv/osd.0) limited size xattrs -- enable filestore_xattr_use_omap
> 2012-07-04 15:04:17.789120 7fc7fadca780 -1 OSD::mkfs: couldn't mount FileStore: error -95
> 2012-07-04 15:04:17.789161 7fc7fadca780 -1  ** ERROR: error creating empty object store in /srv/osd.0: (95) Operation not supported
> failed: '/sbin/mkcephfs -d /tmp/mkcephfs.ZOh6tBPAH0 --init-daemon osd.0'
> 
> 
> Confirming all this was working previously, and the crushmap, config 
> file, etc are all proven to be OK (get same failure when not specifying 
> a custom crushmap also). Also note that whilst the above is failing on 
> osd.0 creation, I have swapped disk references and still get the same 
> failure on different HDD's when they are hooked in as osd.0

The only thing that changed from v0.47 is the below.  Can you try 
replacing 'btrfs device scan || btrfsctl -a' with 'btrfs device scan ; 
btrfsctl -a'?  Maybe the btrfs tool isn't being pendantic about return 
codes...

sage


commit a414fd51c7c5ae5dbe9e3af7db6f17741a58c1a7
Author: Sage Weil <sage.weil@dreamhost.com>
Date:   Sat Feb 11 13:43:23 2012 -0800

    init-ceph, mkcephfs: try 'btrfs device scan' before 'btrfsctl -a'
    
    Fixes: #2023
    Reported-by: Wido den Hollander <wido@widodh.nl>
    Signed-off-by: Sage Weil <sage.weil@dreamhost.com>

diff --git a/src/mkcephfs.in b/src/mkcephfs.in
index 83fb932..17b6014 100644
--- a/src/mkcephfs.in
+++ b/src/mkcephfs.in
@@ -332,7 +332,7 @@ if [ -n "$prepareosdfs" ]; then
 
     modprobe btrfs || true
     mkfs.btrfs $btrfs_devs
-    btrfsctl -a
+    btrfs device scan || btrfsctl -a
     mount -t btrfs $btrfs_opt $first_dev $btrfs_path
     chown $osd_user $btrfs_path
     chmod +w $btrfs_path



  reply	other threads:[~2012-07-05 22:17 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-04  5:07 mkcephfs failing on v0.48 "argonaut" Paul Pettigrew
2012-07-05 22:17 ` Sage Weil [this message]
2012-07-06  2:39   ` Paul Pettigrew
2012-07-06  4:08     ` Sage Weil
2012-07-07  3:33       ` Paul Pettigrew
2012-07-07  3:45         ` Paul Pettigrew
2012-07-07  4:20         ` Sage Weil
2012-07-07 21:15           ` Paul Pettigrew
2012-07-10 23:34             ` Sage Weil

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.1207051512360.21201@cobra.newdream.net \
    --to=sage@inktank.com \
    --cc=Paul.Pettigrew@mach.com.au \
    --cc=ceph-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.