* unstable branch won't come online
From: Wido den Hollander @ 2010-04-15 15:25 UTC
  To: ceph-devel

[-- Attachment #1: Type: text/plain, Size: 4236 bytes --]

Hi,

Since I had some trouble with the "stable" branch regarding snapshots
and stalling writebacks, I'm trying to get the unstable branch to work.

My setup consists of 7 machines in total:

- ceph01 (mon / mds): 192.168.6.209
- ceph02 (mon / mds): 192.168.6.210
- osd1 (osd): 192.168.6.211
- osd2 (osd): 192.168.6.212
- osd3 (osd): 192.168.6.213
- osd4 (osd): 192.168.6.214
- osd5 (osd): 192.168.6.215

All these machines are running Ubuntu 9.10 (AMD64) with kernel version
2.6.34 fetched from:
http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/2010-04-05-karmic/

According to the build log these packages were built from 2.6.34-rc3.

The problem right now is that my setup is not working. I did a complete
cleanup of all 7 machines and reinstalled them to be sure they were
all clean.

I've attached my ceph.conf for reference, but imho there seems to be
a problem with the mons. This is what I see in my logfiles:

cephx keyserverdata: get_caps: name=mon.
cephx keyserverdata: get_secret: num of caps=0
cephx: verify_authorizer_reply exception in decode_decrypt with
LR1SttT3gGUDYMC5Ylp4ww== (10.04.15 16:59:52.104714)
-- 192.168.6.210:6789/0 >> 192.168.6.209:6789/0 pipe(0x25a0d40 sd=7
pgs=0 cs=0 l=0).failed verifying authorize reply
-- 192.168.6.210:6789/0 >> 192.168.6.209:6789/0 pipe(0x25a0d40 sd=-1
pgs=0 cs=0 l=0).fault first fault
cephx keyserverdata: get_caps: name=mon.
cephx keyserverdata: get_secret: num of caps=0
cephx: verify_authorizer_reply exception in decode_decrypt with
2FUG2YqeiNqm0cxFgialag== (10.04.15 16:59:52.107835)
-- 192.168.6.210:6789/0 >> 192.168.6.209:6789/0 pipe(0x25a0d40 sd=7
pgs=0 cs=0 l=0).failed verifying authorize reply
7f0406603910 -- 192.168.6.210:6789/0 >> 192.168.6.209:6789/0
pipe(0x25a40a0 sd=7 pgs=0 cs=0 l=0).accept connect_seq 0 vs existing 0
state 1
7f0406603910 -- 192.168.6.210:6789/0 >> 192.168.6.209:6789/0
pipe(0x25a40a0 sd=-1 pgs=5 cs=1 l=0).fault initiating reconnect
cephx keyserverdata: get_caps: name=mon.
cephx keyserverdata: get_secret: num of caps=0
cephx: verify_authorizer_reply exception in decode_decrypt with
U9X7DcmEIlSzQ5TIZgUT+g== (10.04.15 16:59:52.235836)
-- 192.168.6.210:6789/0 >> 192.168.6.209:6789/0 pipe(0x25a40a0 sd=7
pgs=5 cs=2 l=0).failed verifying authorize reply
-- 192.168.6.210:6789/0 >> 192.168.6.209:6789/0 pipe(0x25a40a0 sd=-1
pgs=5 cs=2 l=0).fault first fault
cephx keyserverdata: get_caps: name=mon.
cephx keyserverdata: get_secret: num of caps=0
cephx: verify_authorizer_reply exception in decode_decrypt with
PBYNd7534Bft/AoQ0xRObw== (10.04.15 16:59:52.243187)
-- 192.168.6.210:6789/0 >> 192.168.6.209:6789/0 pipe(0x25a40a0 sd=7
pgs=5 cs=2 l=0).failed verifying authorize reply

The messages are the same on mon0 and mon1.
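
A log excerpt like the one above can be summarized by tallying the failed authorizer replies per connection. The following is only a sketch: the function name count_auth_failures is hypothetical, and the pattern is derived solely from the lines quoted here, so other messenger log formats may need a different expression.

```python
import re
from collections import Counter

# Matches "<local> >> <peer> pipe(...).failed verifying authorize reply".
# The pattern is an assumption based only on the log lines quoted in this mail.
FAILURE = re.compile(
    r"(\S+) >> (\S+) pipe\([^)]*\)\.failed verifying authorize reply"
)


def count_auth_failures(log_text):
    """Count failed authorizer verifications per (local, peer) address pair."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return Counter(FAILURE.findall(flat))


sample = """\
-- 192.168.6.210:6789/0 >> 192.168.6.209:6789/0 pipe(0x25a0d40 sd=7
pgs=0 cs=0 l=0).failed verifying authorize reply
-- 192.168.6.210:6789/0 >> 192.168.6.209:6789/0 pipe(0x25a0d40 sd=-1
pgs=0 cs=0 l=0).fault first fault
-- 192.168.6.210:6789/0 >> 192.168.6.209:6789/0 pipe(0x25a40a0 sd=7
pgs=5 cs=2 l=0).failed verifying authorize reply
"""

for (local, peer), n in count_auth_failures(sample).items():
    print(f"{local} -> {peer}: {n} failed authorizer replies")
```

Running the same function over the logs of both monitors would show whether the failures are symmetric.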

On my OSDs I see:

10.04.15 17:11:26.791702 7ff980706710 -- 0.0.0.0:6800/24247 --> mon1
192.168.6.210:6789/0 -- auth(proto 0 26 bytes) -- ?+0 0xd0f8e0
10.04.15 17:11:26.800456 7ff978811910 -- 192.168.6.211:6800/24247
learned my addr 192.168.6.211:6800/24247
os/FileStore.cc: In function 'void FileStore::sync_entry()':
os/FileStore.cc:1628: FAILED assert(r == 0)
 1: (FileStore::SyncThread::entry()+0xd) [0x4e8e5d]
 2: (Thread::_entry_func(void*)+0x7) [0x46a057]
 3: /lib/libpthread.so.0 [0x7ff9800dda04]
 4: (clone()+0x6d) [0x7ff97f31580d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
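
When comparing crashes across several OSDs it can help to pull the assertion location and backtrace out of each log. This is a minimal sketch assuming the "FAILED assert(...)" layout shown above; parse_assert and both regular expressions are illustrative names, not part of Ceph.

```python
import re

# "<file>:<line>: FAILED assert(<expr>)", as printed in the log above.
ASSERT_RE = re.compile(
    r"^(?P<file>\S+):(?P<line>\d+): FAILED assert\((?P<expr>.*)\)$",
    re.MULTILINE,
)
# Backtrace frames look like " 1: (Symbol+0xd) [0x4e8e5d]".
FRAME_RE = re.compile(r"^\s*\d+: (?P<frame>.+)$", re.MULTILINE)


def parse_assert(log_text):
    """Return the first failed assertion and its frames, or None if absent."""
    m = ASSERT_RE.search(log_text)
    if m is None:
        return None
    # Only look for frames after the assertion line itself.
    frames = [f.group("frame") for f in FRAME_RE.finditer(log_text, m.end())]
    return {
        "file": m.group("file"),
        "line": int(m.group("line")),
        "expr": m.group("expr"),
        "frames": frames,
    }


sample = """\
os/FileStore.cc: In function 'void FileStore::sync_entry()':
os/FileStore.cc:1628: FAILED assert(r == 0)
 1: (FileStore::SyncThread::entry()+0xd) [0x4e8e5d]
 2: (Thread::_entry_func(void*)+0x7) [0x46a057]
 3: /lib/libpthread.so.0 [0x7ff9800dda04]
 4: (clone()+0x6d) [0x7ff97f31580d]
"""

info = parse_assert(sample)
print(info["file"], info["line"], info["expr"])
for frame in info["frames"]:
    print("  ", frame)
```

The assertion alone only says that some return value r was non-zero; the actual error code has to come from a more verbose log.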

To be sure, my git branch is "unstable"

root@osd1:/usr/src/ceph# git status|head -n 2
# On branch unstable
# Changed but not updated:
root@osd1:/usr/src/ceph#

I used "dpkg-buildpackage" on osd1 (fastest CPU) to build .debs for
unstable, because I wanted to test the new rbd block driver and for
that I needed rbdtool.

So to me there seem to be two issues:

1. The mons can't authenticate with each other
2. The OSDs are crashing

Any ideas? If needed, I can arrange access to these machines.

-- 
Kind regards,

Wido den Hollander
Head of System Administration / CSO
Phone Support Netherlands: 0900 9633 (45 cpm)
Phone Support Belgium: 0900 70312 (45 cpm)
Phone Direct: (+31) (0)20 50 60 104
Fax: +31 (0)20 50 60 111
E-mail: support@pcextreme.nl
Website: http://www.pcextreme.nl
Knowledge base: http://support.pcextreme.nl/
Network status: http://nmc.pcextreme.nl



[-- Attachment #2: ceph.conf --]
[-- Type: text/plain, Size: 812 bytes --]

[global]
       pid file = /var/run/ceph/$name.pid
       debug ms = 1

[mon]
       mon data = /srv/ceph/mon

[mon0]
       host = ceph01
       mon addr = 192.168.6.209:6789

[mon1]
       host = ceph02
       mon addr = 192.168.6.210:6789

[mds]

[mds0]
       host = ceph01

[mds1]
       host = ceph02

[osd]
       sudo = true
       osd data = /srv/ceph/osd

[osd1]
       host = osd1
       btrfs devs = /dev/sda6

[osd2]
       host = osd2
       btrfs devs = /dev/sda6

[osd3]
       host = osd3
       btrfs devs = /dev/sda6

[osd4]
      host = osd4
      btrfs devs = /dev/sda6

[osd5]
     host = osd5
     btrfs devs = /dev/sda6

[group everyone]
       addr = 192.168.6.0/24

[mount /]
       allow = %everyone

[mount /mail]
       allow = %everyone

[mount /vhosting]
       allow = %everyone
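
A configuration file in this style can be sanity-checked for lines that are neither a section header nor a "key = value" pair, for example a header written with the keyword outside the brackets. The checker below is a sketch under assumptions: malformed_lines is a hypothetical helper, it treats "#" and ";" as comment markers, and it does not reproduce how the Ceph daemons themselves parse the file.

```python
import re

# A well-formed section header: one pair of brackets around the whole name.
SECTION = re.compile(r"^\[[^\[\]]+\]$")


def malformed_lines(conf_text):
    """Return (line_number, text) for lines that fit no expected shape."""
    bad = []
    for lineno, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment (comment markers are assumed)
        if SECTION.match(line):
            continue  # section header such as "[mount /mail]"
        if "=" in line and not line.startswith("["):
            continue  # option such as "allow = %everyone"
        bad.append((lineno, line))
    return bad


sample = """\
[mount /mail]
       allow = %everyone

mount [/vhosting]
       allow = %everyone
"""

for lineno, text in malformed_lines(sample):
    print(f"line {lineno}: unexpected line {text!r}")
```

A line flagged this way would most likely mean that the options following it end up in the previous section.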


* Re: unstable branch won't come online
From: Sage Weil @ 2010-04-15 17:23 UTC
  To: Wido den Hollander; +Cc: ceph-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 5033 bytes --]

Hi Wido,

A fix for the monitor problem is pushed to the unstable branch.  We've 
been testing with cephx authentication enabled, and in this case the 
monitor was getting confused when authentication was off.

I can't tell from the assertion alone why the OSD is crashing.  Can you 
reproduce with the osd log turned on (debug osd = 20 in ceph.conf), and 
let us know what error code it's getting?

Thanks!
sage




* Re: unstable branch won't come online
From: Wido den Hollander @ 2010-04-15 17:56 UTC
  To: Sage Weil; +Cc: ceph-devel

[-- Attachment #1: Type: text/plain, Size: 912 bytes --]

Hi Sage,

I've placed "debug osd = 20" in [global], [osd] and [osd1], but it
didn't seem to change the log much.

Attached you'll find the log of "osd1".

About 3 seconds after writing "send_pg_stats" to the logfile, "cosd"
crashes.

FYI, libc6 version 2.10.1-0ubuntu16 is installed on my system (I don't
know if this is of any use).

I'm building a new version of my .debs right now from the latest git
checkout; I'll know whether the monitors work by tomorrow.


[-- Attachment #2: osd1.log --]
[-- Type: text/x-log, Size: 3224 bytes --]

10.04.15 19:50:32.092320 --- opened log /var/log/ceph/osd1.1954 ---
ceph version 0.19.1 (198d7106818e91c655740f3a41c2390af36e069e)
10.04.15 19:50:32.092472 7f0ca4873710 ---- renamed symlink /var/log/ceph/osd1 -> /var/log/ceph/osd1.0 ----
10.04.15 19:50:32.092509 7f0ca4873710 ---- created symlink /var/log/ceph/osd1 -> osd1.1954 ----
10.04.15 19:50:32.092822 7f0ca4873710 -- 0.0.0.0:6800/1954 accepter.bind ms_addr is 0.0.0.0:6800/1954 need_addr=1
10.04.15 19:50:32.092874 7f0ca4873710 -- 0.0.0.0:6801/1954 accepter.bind ms_addr is 0.0.0.0:6801/1954 need_addr=1
10.04.15 19:50:32.093044 7f0ca4873710 -- 0.0.0.0:6800/1954 messenger.start
10.04.15 19:50:32.093058 7f0ca4873710 -- 0.0.0.0:6800/1954 messenger.start daemonizing
10.04.15 19:50:32.093523 7f0ca4873710 ---- renamed log /var/log/ceph/osd1.1954 -> /var/log/ceph/osd1.1955 ----
10.04.15 19:50:32.093742 7f0ca4873710 -- 0.0.0.0:6800/1954 accepter.start
10.04.15 19:50:32.093888 7f0ca4873710 -- 0.0.0.0:6801/1954 messenger.start
10.04.15 19:50:32.093927 7f0ca4873710 -- 0.0.0.0:6801/1954 accepter.start
10.04.15 19:50:32.093981 7f0ca4873710 osd1 0 mounting /srv/ceph/osd 
10.04.15 19:50:32.094126 7f0ca4873710 filestore(/srv/ceph/osd) mount detected btrfs
10.04.15 19:50:32.094180 7f0ca4873710 filestore(/srv/ceph/osd) mount btrfs CLONE_RANGE ioctl is supported
10.04.15 19:50:32.138890 7f0ca4873710 filestore(/srv/ceph/osd) mount btrfs SNAP_CREATE is supported
10.04.15 19:50:32.301309 7f0ca4873710 filestore(/srv/ceph/osd) mount btrfs SNAP_DESTROY is supported
10.04.15 19:50:32.301514 7f0ca4873710 filestore(/srv/ceph/osd) mount found snaps <1>
10.04.15 19:50:32.663982 7f0ca4873710 filestore(/srv/ceph/osd) mount WARNING: no journal
10.04.15 19:50:32.664053 7f0ca4873710 osd1 0 boot
10.04.15 19:50:32.664148 7f0ca4873710 osd1 0 read_superblock sb(27aff7af-36da-d776-dc68-9a46644555fd osd1 e0 [0,0] lci=[0,0])
10.04.15 19:50:32.664221 7f0ca4873710 osd1 0 load_pgs
10.04.15 19:50:32.664252 7f0ca4873710 filestore(/srv/ceph/osd) parse . -> 0.0p0_0 = 0
10.04.15 19:50:32.664268 7f0ca4873710 filestore(/srv/ceph/osd) parse .. -> 0.0p0_0 = 0
10.04.15 19:50:32.664278 7f0ca4873710 filestore(/srv/ceph/osd) parse commit_op_seq -> 0.0p0_0 = 0
10.04.15 19:50:32.664288 7f0ca4873710 filestore(/srv/ceph/osd) parse meta -> 0.0p0_0 = 1
10.04.15 19:50:32.664303 7f0ca4873710 osd1 0 superblock: i am osd1
10.04.15 19:50:32.664694 7f0ca4873710 -- 0.0.0.0:6800/1954 --> mon1 192.168.6.210:6789/0 -- auth(proto 0 26 bytes) -- ?+0 0x16134f0
10.04.15 19:50:32.666824 7f0c9c97f910 -- 192.168.6.211:6800/1954 learned my addr 192.168.6.211:6800/1954
10.04.15 19:50:32.666875 7f0c9c97f910 osd1 0 OSD::ms_get_authorizer type=mon
10.04.15 19:50:32.673342 7f0c9e182910 osd1 0 ms_handle_connect on mon
10.04.15 19:50:32.673396 7f0c9e182910 osd1 0 send_boot
10.04.15 19:50:32.673416 7f0c9e182910 osd1 0 send_pg_stats
os/FileStore.cc: In function 'void FileStore::sync_entry()':
os/FileStore.cc:1628: FAILED assert(r == 0)
 1: (FileStore::SyncThread::entry()+0xd) [0x4e8e5d]
 2: (Thread::_entry_func(void*)+0x7) [0x46a057]
 3: /lib/libpthread.so.0 [0x7f0ca424ba04]
 4: (clone()+0x6d) [0x7f0ca348380d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
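
To measure how much time passes between two events in a log like this, the leading timestamps can be parsed and subtracted. The sketch below assumes the stamps are in yy.mm.dd order (consistent with the 2010-04-15 date of this thread); parse_stamp is a hypothetical helper name.

```python
from datetime import datetime

# Assumed layout of the leading timestamp: yy.mm.dd hh:mm:ss.microseconds
STAMP_FORMAT = "%y.%m.%d %H:%M:%S.%f"


def parse_stamp(line):
    """Parse the timestamp formed by the first two fields of a log line."""
    date, time = line.split()[:2]
    return datetime.strptime(f"{date} {time}", STAMP_FORMAT)


first = "10.04.15 19:50:32.092320 --- opened log /var/log/ceph/osd1.1954 ---"
last = "10.04.15 19:50:32.673416 7f0c9e182910 osd1 0 send_pg_stats"

elapsed = parse_stamp(last) - parse_stamp(first)
print(f"{elapsed.total_seconds():.6f} seconds from log open to send_pg_stats")
```

The assertion output itself carries no timestamp, so the delay before the crash cannot be read from this log directly.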


* Re: unstable branch won't come online
From: Sage Weil @ 2010-04-15 18:02 UTC
  To: Wido den Hollander; +Cc: ceph-devel

On Thu, 15 Apr 2010, Wido den Hollander wrote:
> I've placed "debug osd = 20" in [global], [osd] and [osd1], but it
> didn't seem to change much to the log itself.
> 
> Attached you'll find the log of "osd1"

My bad, you actually need 'debug filestore = 20' to get the error message 
I'm after!

sage
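
For anyone scripting this across several hosts, the debug setting can be added to the configuration text programmatically. This is a sketch, not an official tool: set_option is a hypothetical helper, the choice of the [osd] section is an assumption, and it does not check whether the key is already present.

```python
def set_option(conf_text, section, key, value):
    """Return conf_text with "key = value" added right after [section].

    If the section does not exist, it is appended at the end.
    """
    out = []
    found = False
    for line in conf_text.splitlines():
        out.append(line)
        if line.strip() == f"[{section}]":
            # Indentation mirrors the style of the ceph.conf in this thread.
            out.append(f"       {key} = {value}")
            found = True
    if not found:
        out += [f"[{section}]", f"       {key} = {value}"]
    return "\n".join(out) + "\n"


sample = """\
[osd]
       sudo = true
       osd data = /srv/ceph/osd
"""

print(set_option(sample, "osd", "debug filestore", "20"))
```

The daemon has to be restarted afterwards for the new log level to show up in its log.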

* Re: unstable branch won't come online
From: Wido den Hollander @ 2010-04-15 21:11 UTC
  To: Sage Weil; +Cc: ceph-devel

[-- Attachment #1: Type: text/plain, Size: 602 bytes --]

Hi Sage,

I tried "debug filestore = 20" but it didn't seem to change much; see
the attached log.

I've also attached my ceph.conf. Are there any errors on my side?


[-- Attachment #2: ceph.conf --]
[-- Type: application/x-extension-conf, Size: 896 bytes --]

[-- Attachment #3: osd1.log --]
[-- Type: text/x-log, Size: 4140 bytes --]

10.04.15 23:09:11.103346 --- opened log /var/log/ceph/osd1.12962 ---
ceph version 0.19.1 (198d7106818e91c655740f3a41c2390af36e069e)
10.04.15 23:09:11.103496 7fe91e1ce710 ---- renamed symlink /var/log/ceph/osd1 -> /var/log/ceph/osd1.0 ----
10.04.15 23:09:11.103533 7fe91e1ce710 ---- created symlink /var/log/ceph/osd1 -> osd1.12962 ----
10.04.15 23:09:11.103891 7fe91e1ce710 -- 0.0.0.0:6800/12962 accepter.bind ms_addr is 0.0.0.0:6800/12962 need_addr=1
10.04.15 23:09:11.103944 7fe91e1ce710 -- 0.0.0.0:6801/12962 accepter.bind ms_addr is 0.0.0.0:6801/12962 need_addr=1
10.04.15 23:09:11.104084 7fe91e1ce710 filestore(/srv/ceph/osd) test_mount basedir /srv/ceph/osd journal 
10.04.15 23:09:11.104113 7fe91e1ce710 -- 0.0.0.0:6800/12962 messenger.start
10.04.15 23:09:11.104136 7fe91e1ce710 -- 0.0.0.0:6800/12962 messenger.start daemonizing
10.04.15 23:09:11.104598 7fe91e1ce710 ---- renamed log /var/log/ceph/osd1.12962 -> /var/log/ceph/osd1.12963 ----
10.04.15 23:09:11.104818 7fe91e1ce710 -- 0.0.0.0:6800/12962 accepter.start
10.04.15 23:09:11.104965 7fe91e1ce710 -- 0.0.0.0:6801/12962 messenger.start
10.04.15 23:09:11.105004 7fe91e1ce710 -- 0.0.0.0:6801/12962 accepter.start
10.04.15 23:09:11.105176 7fe91e1ce710 filestore(/srv/ceph/osd) basedir /srv/ceph/osd journal 
10.04.15 23:09:11.105346 7fe91e1ce710 filestore(/srv/ceph/osd) mount detected btrfs
10.04.15 23:09:11.105383 7fe91e1ce710 filestore(/srv/ceph/osd) _do_clone_range 0~1
10.04.15 23:09:11.105408 7fe91e1ce710 filestore(/srv/ceph/osd) mount btrfs CLONE_RANGE ioctl is supported
10.04.15 23:09:11.152443 7fe91e1ce710 filestore(/srv/ceph/osd) mount btrfs SNAP_CREATE is supported
10.04.15 23:09:11.332557 7fe91e1ce710 filestore(/srv/ceph/osd) mount btrfs SNAP_DESTROY is supported
10.04.15 23:09:11.332657 7fe91e1ce710 filestore(/srv/ceph/osd) mount fsid is 321622132
10.04.15 23:09:11.332771 7fe91e1ce710 filestore(/srv/ceph/osd) mount found snaps <1>
10.04.15 23:09:11.660421 7fe91e1ce710 filestore(/srv/ceph/osd) mount rolled back to consistent snap 1
10.04.15 23:09:11.660518 7fe91e1ce710 filestore(/srv/ceph/osd) mount op_seq is 1
10.04.15 23:09:11.660676 7fe91b2e3910 filestore(/srv/ceph/osd) sync_entry waiting for max_interval 5.000000
10.04.15 23:09:11.660775 7fe919ae0910 filestore(/srv/ceph/osd) flusher_entry start
10.04.15 23:09:11.660801 7fe919ae0910 filestore(/srv/ceph/osd) flusher_entry sleeping
10.04.15 23:09:11.660824 7fe91e1ce710 filestore(/srv/ceph/osd) mount WARNING: no journal
10.04.15 23:09:11.660901 7fe91e1ce710 filestore(/srv/ceph/osd) read /srv/ceph/osd/current/meta/osd_superblock_0 0~0
10.04.15 23:09:11.660963 7fe91e1ce710 filestore(/srv/ceph/osd) read /srv/ceph/osd/current/meta/osd_superblock_0 0~85 = 85
10.04.15 23:09:11.661040 7fe91e1ce710 filestore(/srv/ceph/osd) list_collections
10.04.15 23:09:11.661075 7fe91e1ce710 filestore(/srv/ceph/osd) parse . -> 0.0p0_0 = 0
10.04.15 23:09:11.661090 7fe91e1ce710 filestore(/srv/ceph/osd) parse .. -> 0.0p0_0 = 0
10.04.15 23:09:11.661101 7fe91e1ce710 filestore(/srv/ceph/osd) parse commit_op_seq -> 0.0p0_0 = 0
10.04.15 23:09:11.661110 7fe91e1ce710 filestore(/srv/ceph/osd) parse meta -> 0.0p0_0 = 1
10.04.15 23:09:11.661533 7fe91e1ce710 -- 0.0.0.0:6800/12962 --> mon0 192.168.6.209:6789/0 -- auth(proto 0 26 bytes) -- ?+0 0x27fd4e0
10.04.15 23:09:11.663643 7fe9162d9910 -- 192.168.6.211:6800/12962 learned my addr 192.168.6.211:6800/12962
10.04.15 23:09:16.660751 7fe91b2e3910 filestore(/srv/ceph/osd) sync_entry woke after 5.000057
10.04.15 23:09:16.660779 7fe91b2e3910 filestore(/srv/ceph/osd) sync_entry committing 1 sync_epoch 1
10.04.15 23:09:16.660854 7fe91b2e3910 filestore(/srv/ceph/osd) taking snap 'snap_1'
10.04.15 23:09:16.660898 7fe91b2e3910 filestore(/srv/ceph/osd) snap create 'snap_1' got -1 File exists
os/FileStore.cc: In function 'void FileStore::sync_entry()':
os/FileStore.cc:1628: FAILED assert(r == 0)
 1: (FileStore::SyncThread::entry()+0xd) [0x4e8e5d]
 2: (Thread::_entry_func(void*)+0x7) [0x46a057]
 3: /lib/libpthread.so.0 [0x7fe91dba5a04]
 4: (clone()+0x6d) [0x7fe91cddd80d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
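
With filestore debugging on, the line just before the assertion carries the actual error. A small filter can pick such lines out of a longer log. This is a sketch assuming the "snap create '<name>' got <code> <message>" wording seen above; snap_create_errors is a hypothetical name, and treating only non-zero codes as errors is an assumption.

```python
import re

# Wording taken from the log line "snap create 'snap_1' got -1 File exists".
SNAP_ERROR = re.compile(
    r"snap create '(?P<name>[^']+)' got (?P<code>-?\d+) (?P<msg>.+)$"
)


def snap_create_errors(log_text):
    """Return (snapshot name, return code, message) for failed snap creates."""
    errors = []
    for line in log_text.splitlines():
        m = SNAP_ERROR.search(line)
        # Assumption: a non-zero return code indicates a failure.
        if m and int(m.group("code")) != 0:
            errors.append((m.group("name"), int(m.group("code")), m.group("msg")))
    return errors


sample = """\
10.04.15 23:09:16.660854 7fe91b2e3910 filestore(/srv/ceph/osd) taking snap 'snap_1'
10.04.15 23:09:16.660898 7fe91b2e3910 filestore(/srv/ceph/osd) snap create 'snap_1' got -1 File exists
os/FileStore.cc:1628: FAILED assert(r == 0)
"""

for name, code, msg in snap_create_errors(sample):
    print(f"snapshot {name}: returned {code} ({msg})")
```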

* Re: unstable branch won't come online
From: Sage Weil @ 2010-04-15 21:49 UTC
  To: Wido den Hollander; +Cc: ceph-devel

On Thu, 15 Apr 2010, Wido den Hollander wrote:
> Hi Sage,
> 
> I tried "debug filestore = 20" but it didn't seem to change much, see
> that attached log.
> 
> I also attach my ceph.conf, any errors on my side?

Ok, the problem is that snap_1 already exists (probably others, too).  I 
just pushed a fix to unstable that makes the osd more carefully remove old 
directories or subvols it finds during the mkfs process which should fix 
your problem.  Alternatively, you can mkfs.btrfs beforehand to make sure 
there's nothing old in the osd data dir, or pass --mkbtrfs to mkcephfs and 
let it do the mkfs.btrfs for you.

sage
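
Before re-creating the filesystem, one way to confirm this diagnosis is to look for leftover snapshot entries in the osd data dir. The sketch below rests on an assumption the thread does not confirm: that old snapshots show up as directories named snap_<number> directly under the data dir. leftover_snaps is a hypothetical helper, and the demo runs against a temporary directory rather than /srv/ceph/osd.

```python
import re
import tempfile
from pathlib import Path

# Assumed naming scheme, based on the 'snap_1' name in the osd log.
SNAP_NAME = re.compile(r"^snap_\d+$")


def leftover_snaps(osd_data_dir):
    """List directory entries under osd_data_dir that look like old snaps."""
    return sorted(
        p.name
        for p in Path(osd_data_dir).iterdir()
        if p.is_dir() and SNAP_NAME.match(p.name)
    )


# Demo on a throwaway directory that mimics a stale osd data dir.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("current", "snap_1"):
        (Path(tmp) / name).mkdir()
    print(leftover_snaps(tmp))
```

An empty result on a real data dir would suggest the "File exists" error has a different cause.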

* Re: unstable branch won't come online
From: Wido den Hollander @ 2010-04-16  9:37 UTC

Hi Sage,

The new version didn't fix it, but running mkcephfs seems to have; my
cluster is now online and working.
