* Failed on starting osd-daemon after upgrade giant-0.87.1 to hammer-0.94.3
@ 2015-09-09 12:15 王锐
  2015-09-10 22:23 ` Sage Weil
  0 siblings, 1 reply; 7+ messages in thread
From: 王锐 @ 2015-09-09 12:15 UTC (permalink / raw)
  To: ceph-devel

[-- Attachment #1: Type: text/plain, Size: 900 bytes --]

Hi all:

I got an error after upgrading my ceph cluster from giant-0.87.1 to hammer-0.94.3. My local environment is:
CentOS 6.7 x86_64
Kernel 3.10.86-1.el6.elrepo.x86_64
HDD: XFS, 2TB
Installed packages: official ceph.com RPMs, x86_64

Step 1:
Upgraded the MON servers from 0.87.1 to 0.94.3; everything went fine.

Step 2:
Upgraded the OSD servers from 0.87.1 to 0.94.3. I have only upgraded two servers so far, and noticed that some OSDs cannot start:
server-1 has 4 OSDs; none of them can start.
server-2 has 3 OSDs; 2 of them cannot start, but 1 started successfully and is working fine.

Error log 1 (starting via the init script):
service ceph start osd.4
See /var/log/ceph/ceph-osd.24.log (attachment file: ceph.24.log)

Error log 2 (running the daemon in the foreground):
/usr/bin/ceph-osd -c /etc/ceph/ceph.conf -i 4 -f
(attachment file: cli.24.log)

---------------------
It looks like some kind of on-disk data format/version error. How can I repair my OSDs?

Thank you!
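If a more verbose log would help, I can re-run one of the failing daemons in the foreground with higher subsystem debug levels, roughly like this (the debug values and log path here are just my guess at what would be useful):

```shell
# Run the failing OSD in the foreground with verbose logging for the
# subsystems involved in the crash (load_pgs / filestore / journal).
/usr/bin/ceph-osd -c /etc/ceph/ceph.conf -i 4 -f \
    --debug-osd 20 --debug-filestore 20 --debug-journal 20 \
    > /tmp/ceph-osd.4.debug.log 2>&1
```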

[-- Attachment #2: ceph.24.log --]
[-- Type: application/octet-stream, Size: 36620 bytes --]

2015-09-09 12:21:14.465828 7f45fc52e800  0 ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b), process ceph-osd, pid 21383
2015-09-09 12:21:14.566343 7f45fc52e800  0 filestore(/ceph/data24) backend xfs (magic 0x58465342)
2015-09-09 12:21:14.570087 7f45fc52e800  0 genericfilestorebackend(/ceph/data24) detect_features: FIEMAP ioctl is supported and appears to work
2015-09-09 12:21:14.570106 7f45fc52e800  0 genericfilestorebackend(/ceph/data24) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2015-09-09 12:21:14.572116 7f45fc52e800  0 genericfilestorebackend(/ceph/data24) detect_features: syscall(SYS_syncfs, fd) fully supported
2015-09-09 12:21:14.573329 7f45fc52e800  0 xfsfilestorebackend(/ceph/data24) detect_feature: extsize is supported and kernel 3.10.86-1.el6.elrepo.x86_64 >= 3.5
2015-09-09 12:21:14.583689 7f45fc52e800  0 filestore(/ceph/data24) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2015-09-09 12:21:14.610473 7f45fc52e800 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
2015-09-09 12:21:14.610499 7f45fc52e800  1 journal _open /opt/ceph/osd/24/journal fd 19: 1073741824 bytes, block size 4096 bytes, directio = 1, aio = 0
2015-09-09 12:21:14.624277 7f45fc52e800  1 journal _open /opt/ceph/osd/24/journal fd 19: 1073741824 bytes, block size 4096 bytes, directio = 1, aio = 0
2015-09-09 12:21:14.636510 7f45fc52e800  0 <cls> cls/hello/cls_hello.cc:271: loading cls_hello
2015-09-09 12:21:14.637908 7f45fc52e800  0 osd.24 27296 crush map has features 104186773504, adjusting msgr requires for clients
2015-09-09 12:21:14.637987 7f45fc52e800  0 osd.24 27296 crush map has features 379064680448 was 8705, adjusting msgr requires for mons
2015-09-09 12:21:14.638001 7f45fc52e800  0 osd.24 27296 crush map has features 379064680448, adjusting msgr requires for osds
2015-09-09 12:21:14.638026 7f45fc52e800  0 osd.24 27296 load_pgs
2015-09-09 12:21:15.554914 7f45fc52e800 -1 osd/PG.cc: In function 'static epoch_t PG::peek_map_epoch(ObjectStore*, spg_t, ceph::bufferlist*)' thread 7f45fc52e800 time 2015-09-09 12:21:15.553449
osd/PG.cc: 2864: FAILED assert(values.size() == 1)

 ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
 1: (PG::peek_map_epoch(ObjectStore*, spg_t, ceph::buffer::list*)+0x803) [0x826d73]
 2: (OSD::load_pgs()+0x1506) [0x6697a6]
 3: (OSD::init()+0x174e) [0x68a89e]
 4: (main()+0x384f) [0x62e2cf]
 5: (__libc_start_main()+0xfd) [0x3dbd81ed5d]
 6: /usr/bin/ceph-osd() [0x6299d9]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
  -183> 2015-09-09 12:21:14.460422 7f45fc52e800  5 asok(0x4d90000) register_command perfcounters_dump hook 0x4d00050
  -182> 2015-09-09 12:21:14.460484 7f45fc52e800  5 asok(0x4d90000) register_command 1 hook 0x4d00050
  -181> 2015-09-09 12:21:14.460495 7f45fc52e800  5 asok(0x4d90000) register_command perf dump hook 0x4d00050
  -180> 2015-09-09 12:21:14.460513 7f45fc52e800  5 asok(0x4d90000) register_command perfcounters_schema hook 0x4d00050
  -179> 2015-09-09 12:21:14.460527 7f45fc52e800  5 asok(0x4d90000) register_command 2 hook 0x4d00050
  -178> 2015-09-09 12:21:14.460534 7f45fc52e800  5 asok(0x4d90000) register_command perf schema hook 0x4d00050
  -177> 2015-09-09 12:21:14.460543 7f45fc52e800  5 asok(0x4d90000) register_command perf reset hook 0x4d00050
  -176> 2015-09-09 12:21:14.460548 7f45fc52e800  5 asok(0x4d90000) register_command config show hook 0x4d00050
  -175> 2015-09-09 12:21:14.460559 7f45fc52e800  5 asok(0x4d90000) register_command config set hook 0x4d00050
  -174> 2015-09-09 12:21:14.460564 7f45fc52e800  5 asok(0x4d90000) register_command config get hook 0x4d00050
  -173> 2015-09-09 12:21:14.460573 7f45fc52e800  5 asok(0x4d90000) register_command config diff hook 0x4d00050
  -172> 2015-09-09 12:21:14.460578 7f45fc52e800  5 asok(0x4d90000) register_command log flush hook 0x4d00050
  -171> 2015-09-09 12:21:14.460583 7f45fc52e800  5 asok(0x4d90000) register_command log dump hook 0x4d00050
  -170> 2015-09-09 12:21:14.460588 7f45fc52e800  5 asok(0x4d90000) register_command log reopen hook 0x4d00050
  -169> 2015-09-09 12:21:14.465828 7f45fc52e800  0 ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b), process ceph-osd, pid 21383
  -168> 2015-09-09 12:21:14.467572 7f45fc52e800  1 -- 10.20.20.57:0/0 learned my addr 10.20.20.57:0/0
  -167> 2015-09-09 12:21:14.467598 7f45fc52e800  1 accepter.accepter.bind my_inst.addr is 10.20.20.57:6800/21383 need_addr=0
  -166> 2015-09-09 12:21:14.467694 7f45fc52e800  1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6801/21383 need_addr=1
  -165> 2015-09-09 12:21:14.467733 7f45fc52e800  1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6802/21383 need_addr=1
  -164> 2015-09-09 12:21:14.467757 7f45fc52e800  1 -- 10.20.20.57:0/0 learned my addr 10.20.20.57:0/0
  -163> 2015-09-09 12:21:14.467765 7f45fc52e800  1 accepter.accepter.bind my_inst.addr is 10.20.20.57:6803/21383 need_addr=0
  -162> 2015-09-09 12:21:14.469197 7f45fc52e800  1 finished global_init_daemonize
  -161> 2015-09-09 12:21:14.546672 7f45fc52e800  5 asok(0x4d90000) init /var/run/ceph/ceph-osd.24.asok
  -160> 2015-09-09 12:21:14.546785 7f45fc52e800  5 asok(0x4d90000) bind_and_listen /var/run/ceph/ceph-osd.24.asok
  -159> 2015-09-09 12:21:14.547053 7f45fc52e800  5 asok(0x4d90000) register_command 0 hook 0x4cf80a8
  -158> 2015-09-09 12:21:14.547144 7f45fc52e800  5 asok(0x4d90000) register_command version hook 0x4cf80a8
  -157> 2015-09-09 12:21:14.547167 7f45fc52e800  5 asok(0x4d90000) register_command git_version hook 0x4cf80a8
  -156> 2015-09-09 12:21:14.547173 7f45fc52e800  5 asok(0x4d90000) register_command help hook 0x4d00150
  -155> 2015-09-09 12:21:14.547188 7f45fc52e800  5 asok(0x4d90000) register_command get_command_descriptions hook 0x4d00140
  -154> 2015-09-09 12:21:14.547242 7f45fc52e800 10 monclient(hunting): build_initial_monmap
  -153> 2015-09-09 12:21:14.547240 7f45fb524700  5 asok(0x4d90000) entry start
  -152> 2015-09-09 12:21:14.565193 7f45fc52e800  5 adding auth protocol: cephx
  -151> 2015-09-09 12:21:14.565207 7f45fc52e800  5 adding auth protocol: cephx
  -150> 2015-09-09 12:21:14.565450 7f45fc52e800  5 asok(0x4d90000) register_command objecter_requests hook 0x4d00180
  -149> 2015-09-09 12:21:14.565629 7f45fc52e800  1 -- 10.20.20.57:6800/21383 messenger.start
  -148> 2015-09-09 12:21:14.565773 7f45fc52e800  1 -- :/0 messenger.start
  -147> 2015-09-09 12:21:14.565908 7f45fc52e800  1 -- 10.20.20.57:6803/21383 messenger.start
  -146> 2015-09-09 12:21:14.565953 7f45fc52e800  1 -- 0.0.0.0:6802/21383 messenger.start
  -145> 2015-09-09 12:21:14.565992 7f45fc52e800  1 -- 0.0.0.0:6801/21383 messenger.start
  -144> 2015-09-09 12:21:14.566040 7f45fc52e800  1 -- :/0 messenger.start
  -143> 2015-09-09 12:21:14.566150 7f45fc52e800  2 osd.24 0 mounting /ceph/data24 /opt/ceph/osd/24/journal
  -142> 2015-09-09 12:21:14.566343 7f45fc52e800  0 filestore(/ceph/data24) backend xfs (magic 0x58465342)
  -141> 2015-09-09 12:21:14.570087 7f45fc52e800  0 genericfilestorebackend(/ceph/data24) detect_features: FIEMAP ioctl is supported and appears to work
  -140> 2015-09-09 12:21:14.570106 7f45fc52e800  0 genericfilestorebackend(/ceph/data24) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
  -139> 2015-09-09 12:21:14.572116 7f45fc52e800  0 genericfilestorebackend(/ceph/data24) detect_features: syscall(SYS_syncfs, fd) fully supported
  -138> 2015-09-09 12:21:14.573329 7f45fc52e800  0 xfsfilestorebackend(/ceph/data24) detect_feature: extsize is supported and kernel 3.10.86-1.el6.elrepo.x86_64 >= 3.5
  -137> 2015-09-09 12:21:14.583689 7f45fc52e800  0 filestore(/ceph/data24) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
  -136> 2015-09-09 12:21:14.610357 7f45fc52e800  2 journal open /opt/ceph/osd/24/journal fsid e36e6116-6cb1-42ba-b1d8-3575a521b688 fs_op_seq 29502638
  -135> 2015-09-09 12:21:14.610473 7f45fc52e800 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
  -134> 2015-09-09 12:21:14.610499 7f45fc52e800  1 journal _open /opt/ceph/osd/24/journal fd 19: 1073741824 bytes, block size 4096 bytes, directio = 1, aio = 0
  -133> 2015-09-09 12:21:14.624089 7f45fc52e800  2 journal open advancing committed_seq 29502637 to fs op_seq 29502638
  -132> 2015-09-09 12:21:14.624216 7f45fc52e800  2 journal read_entry 1034997760 : seq 29502638 261 bytes
  -131> 2015-09-09 12:21:14.624241 7f45fc52e800  2 journal No further valid entries found, journal is most likely valid
  -130> 2015-09-09 12:21:14.624250 7f45fc52e800  2 journal No further valid entries found, journal is most likely valid
  -129> 2015-09-09 12:21:14.624254 7f45fc52e800  3 journal journal_replay: end of journal, done.
  -128> 2015-09-09 12:21:14.624277 7f45fc52e800  1 journal _open /opt/ceph/osd/24/journal fd 19: 1073741824 bytes, block size 4096 bytes, directio = 1, aio = 0
  -127> 2015-09-09 12:21:14.624744 7f45fc52e800  2 osd.24 0 boot
  -126> 2015-09-09 12:21:14.629537 7f45fc52e800  1 <cls> cls/refcount/cls_refcount.cc:231: Loaded refcount class!
  -125> 2015-09-09 12:21:14.630633 7f45fc52e800  1 <cls> cls/replica_log/cls_replica_log.cc:141: Loaded replica log class!
  -124> 2015-09-09 12:21:14.630854 7f45fc52e800  1 <cls> cls/log/cls_log.cc:312: Loaded log class!
  -123> 2015-09-09 12:21:14.631052 7f45fc52e800  1 <cls> cls/version/cls_version.cc:227: Loaded version class!
  -122> 2015-09-09 12:21:14.635552 7f45fc52e800  1 <cls> cls/rgw/cls_rgw.cc:3047: Loaded rgw class!
  -121> 2015-09-09 12:21:14.636060 7f45fc52e800  1 <cls> cls/user/cls_user.cc:367: Loaded user class!
  -120> 2015-09-09 12:21:14.636296 7f45fc52e800  1 <cls> cls/statelog/cls_statelog.cc:306: Loaded log class!
  -119> 2015-09-09 12:21:14.636510 7f45fc52e800  0 <cls> cls/hello/cls_hello.cc:271: loading cls_hello
  -118> 2015-09-09 12:21:14.637908 7f45fc52e800  0 osd.24 27296 crush map has features 104186773504, adjusting msgr requires for clients
  -117> 2015-09-09 12:21:14.637987 7f45fc52e800  0 osd.24 27296 crush map has features 379064680448 was 8705, adjusting msgr requires for mons
  -116> 2015-09-09 12:21:14.638001 7f45fc52e800  0 osd.24 27296 crush map has features 379064680448, adjusting msgr requires for osds
  -115> 2015-09-09 12:21:14.638026 7f45fc52e800  0 osd.24 27296 load_pgs
  -114> 2015-09-09 12:21:15.498721 7f45fc52e800  5 osd.24 pg_epoch: 27152 pg[0.6(unlocked)] enter Initial
  -113> 2015-09-09 12:21:15.499238 7f45fc52e800  5 osd.24 pg_epoch: 27152 pg[0.6( empty local-les=27152 n=0 ec=1 les/c 27152/27152 27129/27149/27021) [0,24,31] r=1 lpr=0 pi=21958-27148/123 crt=0'0 inactive NOTIFY] exit Initial 0.000519 0 0.000000
  -112> 2015-09-09 12:21:15.499289 7f45fc52e800  5 osd.24 pg_epoch: 27152 pg[0.6( empty local-les=27152 n=0 ec=1 les/c 27152/27152 27129/27149/27021) [0,24,31] r=1 lpr=0 pi=21958-27148/123 crt=0'0 inactive NOTIFY] enter Reset
  -111> 2015-09-09 12:21:15.500436 7f45fc52e800  5 osd.24 pg_epoch: 27157 pg[0.8(unlocked)] enter Initial
  -110> 2015-09-09 12:21:15.500774 7f45fc52e800  5 osd.24 pg_epoch: 27157 pg[0.8( empty local-les=0 n=0 ec=1 les/c 22887/22919 27157/27157/27157) [26,39] r=-1 lpr=0 pi=19317-27156/135 crt=0'0 inactive NOTIFY] exit Initial 0.000338 0 0.000000
  -109> 2015-09-09 12:21:15.500794 7f45fc52e800  5 osd.24 pg_epoch: 27157 pg[0.8( empty local-les=0 n=0 ec=1 les/c 22887/22919 27157/27157/27157) [26,39] r=-1 lpr=0 pi=19317-27156/135 crt=0'0 inactive NOTIFY] enter Reset
  -108> 2015-09-09 12:21:15.500910 7f45fc52e800  5 osd.24 pg_epoch: 27152 pg[0.9(unlocked)] enter Initial
  -107> 2015-09-09 12:21:15.501159 7f45fc52e800  5 osd.24 pg_epoch: 27152 pg[0.9( v 1011'2 (0'0,1011'2] local-les=27144 n=0 ec=1 les/c 27144/27152 27129/27129/27129) [24,22,0] r=0 lpr=0 crt=1011'2 lcod 0'0 mlcod 0'0 inactive] exit Initial 0.000249 0 0.000000
  -106> 2015-09-09 12:21:15.501180 7f45fc52e800  5 osd.24 pg_epoch: 27152 pg[0.9( v 1011'2 (0'0,1011'2] local-les=27144 n=0 ec=1 les/c 27144/27152 27129/27129/27129) [24,22,0] r=0 lpr=0 crt=1011'2 lcod 0'0 mlcod 0'0 inactive] enter Reset
  -105> 2015-09-09 12:21:15.502371 7f45fc52e800  5 osd.24 pg_epoch: 27279 pg[0.c(unlocked)] enter Initial
  -104> 2015-09-09 12:21:15.502691 7f45fc52e800  5 osd.24 pg_epoch: 27279 pg[0.c( empty local-les=21866 n=0 ec=1 les/c 21866/21956 27277/27277/26952) [23,30] r=-1 lpr=0 pi=19317-27276/128 crt=0'0 inactive NOTIFY] exit Initial 0.000319 0 0.000000
  -103> 2015-09-09 12:21:15.502711 7f45fc52e800  5 osd.24 pg_epoch: 27279 pg[0.c( empty local-les=21866 n=0 ec=1 les/c 21866/21956 27277/27277/26952) [23,30] r=-1 lpr=0 pi=19317-27276/128 crt=0'0 inactive NOTIFY] enter Reset
  -102> 2015-09-09 12:21:15.503892 7f45fc52e800  5 osd.24 pg_epoch: 27144 pg[0.d(unlocked)] enter Initial
  -101> 2015-09-09 12:21:15.504128 7f45fc52e800  5 osd.24 pg_epoch: 27144 pg[0.d( empty local-les=27144 n=0 ec=1 les/c 26377/26419 27129/27129/27129) [24,34]/[24,34,12] r=0 lpr=0 pi=26367-27128/8 crt=0'0 mlcod 0'0 inactive] exit Initial 0.000235 0 0.000000
  -100> 2015-09-09 12:21:15.504156 7f45fc52e800  5 osd.24 pg_epoch: 27144 pg[0.d( empty local-les=27144 n=0 ec=1 les/c 26377/26419 27129/27129/27129) [24,34]/[24,34,12] r=0 lpr=0 pi=26367-27128/8 crt=0'0 mlcod 0'0 inactive] enter Reset
   -99> 2015-09-09 12:21:15.505329 7f45fc52e800  5 osd.24 pg_epoch: 27242 pg[0.e(unlocked)] enter Initial
   -98> 2015-09-09 12:21:15.505694 7f45fc52e800  5 osd.24 pg_epoch: 27242 pg[0.e( empty local-les=21815 n=0 ec=1 les/c 23811/23869 27166/27240/27166) [27]/[27,17,24] r=2 lpr=0 pi=21805-27239/128 crt=0'0 inactive NOTIFY] exit Initial 0.000365 0 0.000000
   -97> 2015-09-09 12:21:15.505715 7f45fc52e800  5 osd.24 pg_epoch: 27242 pg[0.e( empty local-les=21815 n=0 ec=1 les/c 23811/23869 27166/27240/27166) [27]/[27,17,24] r=2 lpr=0 pi=21805-27239/128 crt=0'0 inactive NOTIFY] enter Reset
   -96> 2015-09-09 12:21:15.506944 7f45fc52e800  5 osd.24 pg_epoch: 27243 pg[0.16(unlocked)] enter Initial
   -95> 2015-09-09 12:21:15.507382 7f45fc52e800  5 osd.24 pg_epoch: 27243 pg[0.16( empty local-les=27132 n=0 ec=1 les/c 27132/27144 27241/27241/26870) [12,24] r=1 lpr=0 pi=1160-27240/238 crt=0'0 inactive NOTIFY] exit Initial 0.000438 0 0.000000
   -94> 2015-09-09 12:21:15.507403 7f45fc52e800  5 osd.24 pg_epoch: 27243 pg[0.16( empty local-les=27132 n=0 ec=1 les/c 27132/27144 27241/27241/26870) [12,24] r=1 lpr=0 pi=1160-27240/238 crt=0'0 inactive NOTIFY] enter Reset
   -93> 2015-09-09 12:21:15.507562 7f45fc52e800  5 osd.24 pg_epoch: 27279 pg[0.19(unlocked)] enter Initial
   -92> 2015-09-09 12:21:15.507851 7f45fc52e800  5 osd.24 pg_epoch: 27279 pg[0.19( empty local-les=27216 n=0 ec=1 les/c 27216/27222 27277/27277/27188) [33,30]/[33,0,24] r=2 lpr=0 pi=24433-27276/57 crt=0'0 inactive NOTIFY] exit Initial 0.000289 0 0.000000
   -91> 2015-09-09 12:21:15.507872 7f45fc52e800  5 osd.24 pg_epoch: 27279 pg[0.19( empty local-les=27216 n=0 ec=1 les/c 27216/27222 27277/27277/27188) [33,30]/[33,0,24] r=2 lpr=0 pi=24433-27276/57 crt=0'0 inactive NOTIFY] enter Reset
   -90> 2015-09-09 12:21:15.509122 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[0.1e(unlocked)] enter Initial
   -89> 2015-09-09 12:21:15.509434 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[0.1e( empty local-les=21679 n=0 ec=1 les/c 21679/21693 27129/27129/27129) [24,17,0] r=0 lpr=0 pi=21670-27128/114 crt=0'0 mlcod 0'0 inactive] exit Initial 0.000311 0 0.000000
   -88> 2015-09-09 12:21:15.509455 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[0.1e( empty local-les=21679 n=0 ec=1 les/c 21679/21693 27129/27129/27129) [24,17,0] r=0 lpr=0 pi=21670-27128/114 crt=0'0 mlcod 0'0 inactive] enter Reset
   -87> 2015-09-09 12:21:15.509587 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[0.20(unlocked)] enter Initial
   -86> 2015-09-09 12:21:15.510006 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[0.20( empty local-les=26353 n=0 ec=1 les/c 26353/26353 27101/27129/27101) [31]/[31,24] r=1 lpr=0 pi=12327-27128/250 crt=0'0 inactive NOTIFY] exit Initial 0.000419 0 0.000000
   -85> 2015-09-09 12:21:15.510026 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[0.20( empty local-les=26353 n=0 ec=1 les/c 26353/26353 27101/27129/27101) [31]/[31,24] r=1 lpr=0 pi=12327-27128/250 crt=0'0 inactive NOTIFY] enter Reset
   -84> 2015-09-09 12:21:15.511231 7f45fc52e800  5 osd.24 pg_epoch: 27179 pg[0.22(unlocked)] enter Initial
   -83> 2015-09-09 12:21:15.511479 7f45fc52e800  5 osd.24 pg_epoch: 27179 pg[0.22( v 1013'2 (0'0,1013'2] local-les=27179 n=0 ec=1 les/c 27179/27179 27129/27176/27176) [24,0]/[24,0,12] r=0 lpr=0 crt=1013'2 lcod 0'0 mlcod 0'0 inactive] exit Initial 0.000248 0 0.000000
   -82> 2015-09-09 12:21:15.511501 7f45fc52e800  5 osd.24 pg_epoch: 27179 pg[0.22( v 1013'2 (0'0,1013'2] local-les=27179 n=0 ec=1 les/c 27179/27179 27129/27176/27176) [24,0]/[24,0,12] r=0 lpr=0 crt=1013'2 lcod 0'0 mlcod 0'0 inactive] enter Reset
   -81> 2015-09-09 12:21:15.511619 7f45fc52e800  5 osd.24 pg_epoch: 27144 pg[0.26(unlocked)] enter Initial
   -80> 2015-09-09 12:21:15.511846 7f45fc52e800  5 osd.24 pg_epoch: 27144 pg[0.26( empty local-les=27142 n=0 ec=1 les/c 27142/27144 27129/27134/26866) [6,24]/[6,24,12] r=1 lpr=0 pi=26557-27133/11 crt=0'0 inactive NOTIFY] exit Initial 0.000227 0 0.000000
   -79> 2015-09-09 12:21:15.511866 7f45fc52e800  5 osd.24 pg_epoch: 27144 pg[0.26( empty local-les=27142 n=0 ec=1 les/c 27142/27144 27129/27134/26866) [6,24]/[6,24,12] r=1 lpr=0 pi=26557-27133/11 crt=0'0 inactive NOTIFY] enter Reset
   -78> 2015-09-09 12:21:15.513164 7f45fc52e800  5 osd.24 pg_epoch: 27185 pg[0.29(unlocked)] enter Initial
   -77> 2015-09-09 12:21:15.513418 7f45fc52e800  5 osd.24 pg_epoch: 27185 pg[0.29( empty local-les=27185 n=0 ec=1 les/c 27185/27185 27129/27184/26628) [17,24]/[17,24,27] r=1 lpr=0 pi=24339-27183/59 crt=0'0 inactive NOTIFY] exit Initial 0.000253 0 0.000000
   -76> 2015-09-09 12:21:15.513438 7f45fc52e800  5 osd.24 pg_epoch: 27185 pg[0.29( empty local-les=27185 n=0 ec=1 les/c 27185/27185 27129/27184/26628) [17,24]/[17,24,27] r=1 lpr=0 pi=24339-27183/59 crt=0'0 inactive NOTIFY] enter Reset
   -75> 2015-09-09 12:21:15.514680 7f45fc52e800  5 osd.24 pg_epoch: 27148 pg[0.2f(unlocked)] enter Initial
   -74> 2015-09-09 12:21:15.514912 7f45fc52e800  5 osd.24 pg_epoch: 27148 pg[0.2f( empty local-les=27148 n=0 ec=1 les/c 27148/27148 27129/27139/27139) [24,8,0] r=0 lpr=0 crt=0'0 mlcod 0'0 inactive] exit Initial 0.000231 0 0.000000
   -73> 2015-09-09 12:21:15.514932 7f45fc52e800  5 osd.24 pg_epoch: 27148 pg[0.2f( empty local-les=27148 n=0 ec=1 les/c 27148/27148 27129/27139/27139) [24,8,0] r=0 lpr=0 crt=0'0 mlcod 0'0 inactive] enter Reset
   -72> 2015-09-09 12:21:15.515048 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[0.37(unlocked)] enter Initial
   -71> 2015-09-09 12:21:15.515379 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[0.37( empty local-les=21025 n=0 ec=1 les/c 21025/21051 27129/27129/27129) [24,0] r=0 lpr=0 pi=21019-27128/120 crt=0'0 mlcod 0'0 inactive] exit Initial 0.000331 0 0.000000
   -70> 2015-09-09 12:21:15.515399 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[0.37( empty local-les=21025 n=0 ec=1 les/c 21025/21051 27129/27129/27129) [24,0] r=0 lpr=0 pi=21019-27128/120 crt=0'0 mlcod 0'0 inactive] enter Reset
   -69> 2015-09-09 12:21:15.515551 7f45fc52e800  5 osd.24 pg_epoch: 27148 pg[1.4(unlocked)] enter Initial
   -68> 2015-09-09 12:21:15.519086 7f45fc52e800  5 osd.24 pg_epoch: 27148 pg[1.4( v 1210'395 (0'0,1210'395] local-les=27148 n=0 ec=1 les/c 27148/27148 27129/27138/27138) [24,8,0] r=0 lpr=0 crt=1210'395 lcod 0'0 mlcod 0'0 inactive] exit Initial 0.003535 0 0.000000
   -67> 2015-09-09 12:21:15.519109 7f45fc52e800  5 osd.24 pg_epoch: 27148 pg[1.4( v 1210'395 (0'0,1210'395] local-les=27148 n=0 ec=1 les/c 27148/27148 27129/27138/27138) [24,8,0] r=0 lpr=0 crt=1210'395 lcod 0'0 mlcod 0'0 inactive] enter Reset
   -66> 2015-09-09 12:21:15.520395 7f45fc52e800  5 osd.24 pg_epoch: 27128 pg[1.5(unlocked)] enter Initial
   -65> 2015-09-09 12:21:15.520750 7f45fc52e800  5 osd.24 pg_epoch: 27128 pg[1.5( empty local-les=0 n=0 ec=1 les/c 23218/23218 27094/27094/26949) [22,8,31] r=-1 lpr=0 pi=12246-27093/173 crt=0'0 inactive NOTIFY] exit Initial 0.000354 0 0.000000
   -64> 2015-09-09 12:21:15.520770 7f45fc52e800  5 osd.24 pg_epoch: 27128 pg[1.5( empty local-les=0 n=0 ec=1 les/c 23218/23218 27094/27094/26949) [22,8,31] r=-1 lpr=0 pi=12246-27093/173 crt=0'0 inactive NOTIFY] enter Reset
   -63> 2015-09-09 12:21:15.520896 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[1.11(unlocked)] enter Initial
   -62> 2015-09-09 12:21:15.521563 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[1.11( v 1210'53 (0'0,1210'53] local-les=26683 n=0 ec=1 les/c 26683/26684 27129/27129/27129) [24,8] r=0 lpr=0 pi=26681-27128/5 crt=1210'53 lcod 0'0 mlcod 0'0 inactive] exit Initial 0.000667 0 0.000000
   -61> 2015-09-09 12:21:15.521588 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[1.11( v 1210'53 (0'0,1210'53] local-les=26683 n=0 ec=1 les/c 26683/26684 27129/27129/27129) [24,8] r=0 lpr=0 pi=26681-27128/5 crt=1210'53 lcod 0'0 mlcod 0'0 inactive] enter Reset
   -60> 2015-09-09 12:21:15.522854 7f45fc52e800  5 osd.24 pg_epoch: 27166 pg[1.18(unlocked)] enter Initial
   -59> 2015-09-09 12:21:15.523777 7f45fc52e800  5 osd.24 pg_epoch: 27166 pg[1.18( v 1210'65 (0'0,1210'65] local-les=20964 n=0 ec=1 les/c 16735/16735 27166/27166/27166) [27] r=-1 lpr=0 pi=16733-27165/120 crt=1210'65 lcod 0'0 inactive NOTIFY] exit Initial 0.000923 0 0.000000
   -58> 2015-09-09 12:21:15.523800 7f45fc52e800  5 osd.24 pg_epoch: 27166 pg[1.18( v 1210'65 (0'0,1210'65] local-les=20964 n=0 ec=1 les/c 16735/16735 27166/27166/27166) [27] r=-1 lpr=0 pi=16733-27165/120 crt=1210'65 lcod 0'0 inactive NOTIFY] enter Reset
   -57> 2015-09-09 12:21:15.523930 7f45fc52e800  5 osd.24 pg_epoch: 27148 pg[1.19(unlocked)] enter Initial
   -56> 2015-09-09 12:21:15.524708 7f45fc52e800  5 osd.24 pg_epoch: 27148 pg[1.19( v 1210'64 (0'0,1210'64] local-les=27148 n=0 ec=1 les/c 27148/27148 27129/27147/27147) [24,0]/[24,0,12] r=0 lpr=0 crt=1210'64 lcod 0'0 mlcod 0'0 inactive] exit Initial 0.000777 0 0.000000
   -55> 2015-09-09 12:21:15.524729 7f45fc52e800  5 osd.24 pg_epoch: 27148 pg[1.19( v 1210'64 (0'0,1210'64] local-les=27148 n=0 ec=1 les/c 27148/27148 27129/27147/27147) [24,0]/[24,0,12] r=0 lpr=0 crt=1210'64 lcod 0'0 mlcod 0'0 inactive] enter Reset
   -54> 2015-09-09 12:21:15.525931 7f45fc52e800  5 osd.24 pg_epoch: 27222 pg[1.26(unlocked)] enter Initial
   -53> 2015-09-09 12:21:15.526966 7f45fc52e800  5 osd.24 pg_epoch: 27222 pg[1.26( v 1210'83 (0'0,1210'83] local-les=27209 n=0 ec=1 les/c 27209/27216 27188/27188/27188) [33,6,24] r=2 lpr=0 pi=21064-27187/125 crt=1210'83 lcod 0'0 inactive NOTIFY] exit Initial 0.001035 0 0.000000
   -52> 2015-09-09 12:21:15.526989 7f45fc52e800  5 osd.24 pg_epoch: 27222 pg[1.26( v 1210'83 (0'0,1210'83] local-les=27209 n=0 ec=1 les/c 27209/27216 27188/27188/27188) [33,6,24] r=2 lpr=0 pi=21064-27187/125 crt=1210'83 lcod 0'0 inactive NOTIFY] enter Reset
   -51> 2015-09-09 12:21:15.527107 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[1.27(unlocked)] enter Initial
   -50> 2015-09-09 12:21:15.530800 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[1.27( v 1210'409 (0'0,1210'409] local-les=26419 n=0 ec=1 les/c 26419/26419 27129/27129/26973) [34,24,12] r=1 lpr=0 pi=14412-27128/126 crt=1210'409 lcod 0'0 inactive NOTIFY] exit Initial 0.003692 0 0.000000
   -49> 2015-09-09 12:21:15.530822 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[1.27( v 1210'409 (0'0,1210'409] local-les=26419 n=0 ec=1 les/c 26419/26419 27129/27129/26973) [34,24,12] r=1 lpr=0 pi=14412-27128/126 crt=1210'409 lcod 0'0 inactive NOTIFY] enter Reset
   -48> 2015-09-09 12:21:15.532098 7f45fc52e800  5 osd.24 pg_epoch: 27268 pg[1.28(unlocked)] enter Initial
   -47> 2015-09-09 12:21:15.534189 7f45fc52e800  5 osd.24 pg_epoch: 27268 pg[1.28( v 1210'212 (0'0,1210'212] local-les=26211 n=0 ec=1 les/c 26211/26235 26293/26640/26640) [] r=-1 lpr=0 pi=24339-26639/58 crt=1210'212 lcod 0'0 inactive NOTIFY] exit Initial 0.002091 0 0.000000
   -46> 2015-09-09 12:21:15.534212 7f45fc52e800  5 osd.24 pg_epoch: 27268 pg[1.28( v 1210'212 (0'0,1210'212] local-les=26211 n=0 ec=1 les/c 26211/26235 26293/26640/26640) [] r=-1 lpr=0 pi=24339-26639/58 crt=1210'212 lcod 0'0 inactive NOTIFY] enter Reset
   -45> 2015-09-09 12:21:15.534333 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[1.34(unlocked)] enter Initial
   -44> 2015-09-09 12:21:15.540416 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[1.34( v 1210'691 (0'0,1210'691] local-les=21817 n=0 ec=1 les/c 23413/23422 27129/27129/26952) [23,8,24] r=2 lpr=0 pi=20850-27128/144 crt=1210'691 lcod 0'0 inactive NOTIFY] exit Initial 0.006082 0 0.000000
   -43> 2015-09-09 12:21:15.540439 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[1.34( v 1210'691 (0'0,1210'691] local-les=21817 n=0 ec=1 les/c 23413/23422 27129/27129/26952) [23,8,24] r=2 lpr=0 pi=20850-27128/144 crt=1210'691 lcod 0'0 inactive NOTIFY] enter Reset
   -42> 2015-09-09 12:21:15.540564 7f45fc52e800  5 osd.24 pg_epoch: 27243 pg[1.3f(unlocked)] enter Initial
   -41> 2015-09-09 12:21:15.544296 7f45fc52e800  5 osd.24 pg_epoch: 27243 pg[1.3f( v 1210'401 (0'0,1210'401] local-les=21011 n=0 ec=1 les/c 21011/21026 27241/27241/27241) [24] r=0 lpr=0 pi=13893-27240/152 crt=1210'401 lcod 0'0 mlcod 0'0 inactive] exit Initial 0.003732 0 0.000000
   -40> 2015-09-09 12:21:15.544319 7f45fc52e800  5 osd.24 pg_epoch: 27243 pg[1.3f( v 1210'401 (0'0,1210'401] local-les=21011 n=0 ec=1 les/c 21011/21026 27241/27241/27241) [24] r=0 lpr=0 pi=13893-27240/152 crt=1210'401 lcod 0'0 mlcod 0'0 inactive] enter Reset
   -39> 2015-09-09 12:21:15.544440 7f45fc52e800  5 osd.24 pg_epoch: 27279 pg[2.1(unlocked)] enter Initial
   -38> 2015-09-09 12:21:15.544797 7f45fc52e800  5 osd.24 pg_epoch: 27279 pg[2.1( empty local-les=21855 n=0 ec=1 les/c 21855/21923 27277/27277/27277) [30,23] r=-1 lpr=0 pi=5097-27276/195 crt=0'0 inactive NOTIFY] exit Initial 0.000357 0 0.000000
   -37> 2015-09-09 12:21:15.544818 7f45fc52e800  5 osd.24 pg_epoch: 27279 pg[2.1( empty local-les=21855 n=0 ec=1 les/c 21855/21923 27277/27277/27277) [30,23] r=-1 lpr=0 pi=5097-27276/195 crt=0'0 inactive NOTIFY] enter Reset
   -36> 2015-09-09 12:21:15.546025 7f45fc52e800  5 osd.24 pg_epoch: 27237 pg[2.3(unlocked)] enter Initial
   -35> 2015-09-09 12:21:15.546349 7f45fc52e800  5 osd.24 pg_epoch: 27237 pg[2.3( empty local-les=27237 n=0 ec=1 les/c 27009/27031 27157/27174/27157) [26]/[26,34,24] r=2 lpr=0 pi=21189-27173/108 crt=0'0 inactive NOTIFY] exit Initial 0.000323 0 0.000000
   -34> 2015-09-09 12:21:15.546369 7f45fc52e800  5 osd.24 pg_epoch: 27237 pg[2.3( empty local-les=27237 n=0 ec=1 les/c 27009/27031 27157/27174/27157) [26]/[26,34,24] r=2 lpr=0 pi=21189-27173/108 crt=0'0 inactive NOTIFY] enter Reset
   -33> 2015-09-09 12:21:15.546482 7f45fc52e800  5 osd.24 pg_epoch: 27157 pg[2.4(unlocked)] enter Initial
   -32> 2015-09-09 12:21:15.546790 7f45fc52e800  5 osd.24 pg_epoch: 27157 pg[2.4( empty local-les=21601 n=0 ec=1 les/c 21601/21644 27157/27157/27157) [26,39] r=-1 lpr=0 pi=13846-27156/134 crt=0'0 inactive NOTIFY] exit Initial 0.000308 0 0.000000
   -31> 2015-09-09 12:21:15.546810 7f45fc52e800  5 osd.24 pg_epoch: 27157 pg[2.4( empty local-les=21601 n=0 ec=1 les/c 21601/21644 27157/27157/27157) [26,39] r=-1 lpr=0 pi=13846-27156/134 crt=0'0 inactive NOTIFY] enter Reset
   -30> 2015-09-09 12:21:15.546921 7f45fc52e800  5 osd.24 pg_epoch: 27148 pg[2.7(unlocked)] enter Initial
   -29> 2015-09-09 12:21:15.547127 7f45fc52e800  5 osd.24 pg_epoch: 27148 pg[2.7( empty local-les=27135 n=0 ec=1 les/c 27135/27148 27129/27129/27129) [24,0,8] r=0 lpr=0 crt=0'0 mlcod 0'0 inactive] exit Initial 0.000205 0 0.000000
   -28> 2015-09-09 12:21:15.547153 7f45fc52e800  5 osd.24 pg_epoch: 27148 pg[2.7( empty local-les=27135 n=0 ec=1 les/c 27135/27148 27129/27129/27129) [24,0,8] r=0 lpr=0 crt=0'0 mlcod 0'0 inactive] enter Reset
   -27> 2015-09-09 12:21:15.548351 7f45fc52e800  5 osd.24 pg_epoch: 27169 pg[2.e(unlocked)] enter Initial
   -26> 2015-09-09 12:21:15.548688 7f45fc52e800  5 osd.24 pg_epoch: 27169 pg[2.e( empty local-les=21304 n=0 ec=1 les/c 19381/19392 26748/26748/26748) [17] r=-1 lpr=0 pi=19364-26747/152 crt=0'0 inactive NOTIFY] exit Initial 0.000336 0 0.000000
   -25> 2015-09-09 12:21:15.548708 7f45fc52e800  5 osd.24 pg_epoch: 27169 pg[2.e( empty local-les=21304 n=0 ec=1 les/c 19381/19392 26748/26748/26748) [17] r=-1 lpr=0 pi=19364-26747/152 crt=0'0 inactive NOTIFY] enter Reset
   -24> 2015-09-09 12:21:15.548823 7f45fc52e800  5 osd.24 pg_epoch: 27128 pg[2.10(unlocked)] enter Initial
   -23> 2015-09-09 12:21:15.549172 7f45fc52e800  5 osd.24 pg_epoch: 27128 pg[2.10( empty local-les=21033 n=0 ec=1 les/c 21033/21033 27094/27094/26628) [17,31] r=-1 lpr=0 pi=19260-27093/148 crt=0'0 inactive NOTIFY] exit Initial 0.000349 0 0.000000
   -22> 2015-09-09 12:21:15.549192 7f45fc52e800  5 osd.24 pg_epoch: 27128 pg[2.10( empty local-les=21033 n=0 ec=1 les/c 21033/21033 27094/27094/26628) [17,31] r=-1 lpr=0 pi=19260-27093/148 crt=0'0 inactive NOTIFY] enter Reset
   -21> 2015-09-09 12:21:15.549345 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[2.12(unlocked)] enter Initial
   -20> 2015-09-09 12:21:15.549749 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[2.12( empty local-les=21887 n=0 ec=1 les/c 23820/23870 27129/27129/27129) [24,17] r=0 lpr=0 pi=5118-27128/204 crt=0'0 mlcod 0'0 inactive] exit Initial 0.000403 0 0.000000
   -19> 2015-09-09 12:21:15.549769 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[2.12( empty local-les=21887 n=0 ec=1 les/c 23820/23870 27129/27129/27129) [24,17] r=0 lpr=0 pi=5118-27128/204 crt=0'0 mlcod 0'0 inactive] enter Reset
   -18> 2015-09-09 12:21:15.549896 7f45fc52e800  5 osd.24 pg_epoch: 27279 pg[2.18(unlocked)] enter Initial
   -17> 2015-09-09 12:21:15.550294 7f45fc52e800  5 osd.24 pg_epoch: 27279 pg[2.18( empty local-les=21837 n=0 ec=1 les/c 22338/22343 27277/27277/27277) [30,17,22] r=-1 lpr=0 pi=6160-27276/181 crt=0'0 inactive NOTIFY] exit Initial 0.000398 0 0.000000
   -16> 2015-09-09 12:21:15.550315 7f45fc52e800  5 osd.24 pg_epoch: 27279 pg[2.18( empty local-les=21837 n=0 ec=1 les/c 22338/22343 27277/27277/27277) [30,17,22] r=-1 lpr=0 pi=6160-27276/181 crt=0'0 inactive NOTIFY] enter Reset
   -15> 2015-09-09 12:21:15.550434 7f45fc52e800  5 osd.24 pg_epoch: 27279 pg[2.1f(unlocked)] enter Initial
   -14> 2015-09-09 12:21:15.550647 7f45fc52e800  5 osd.24 pg_epoch: 27279 pg[2.1f( empty local-les=26684 n=0 ec=1 les/c 26317/26334 27129/27277/27129) [24]/[24,34,30] r=0 lpr=0 pi=26306-27276/8 crt=0'0 mlcod 0'0 inactive] exit Initial 0.000213 0 0.000000
   -13> 2015-09-09 12:21:15.550667 7f45fc52e800  5 osd.24 pg_epoch: 27279 pg[2.1f( empty local-les=26684 n=0 ec=1 les/c 26317/26334 27129/27277/27129) [24]/[24,34,30] r=0 lpr=0 pi=26306-27276/8 crt=0'0 mlcod 0'0 inactive] enter Reset
   -12> 2015-09-09 12:21:15.550813 7f45fc52e800  5 osd.24 pg_epoch: 27157 pg[2.2a(unlocked)] enter Initial
   -11> 2015-09-09 12:21:15.551090 7f45fc52e800  5 osd.24 pg_epoch: 27157 pg[2.2a( empty local-les=26101 n=0 ec=1 les/c 26101/26101 27157/27157/27077) [39,26,34] r=-1 lpr=0 pi=24460-27156/82 crt=0'0 inactive NOTIFY] exit Initial 0.000277 0 0.000000
   -10> 2015-09-09 12:21:15.551110 7f45fc52e800  5 osd.24 pg_epoch: 27157 pg[2.2a( empty local-les=26101 n=0 ec=1 les/c 26101/26101 27157/27157/27077) [39,26,34] r=-1 lpr=0 pi=24460-27156/82 crt=0'0 inactive NOTIFY] enter Reset
    -9> 2015-09-09 12:21:15.551260 7f45fc52e800  5 osd.24 pg_epoch: 27268 pg[2.2d(unlocked)] enter Initial
    -8> 2015-09-09 12:21:15.551463 7f45fc52e800  5 osd.24 pg_epoch: 27268 pg[2.2d( empty local-les=27266 n=0 ec=1 les/c 27266/27268 27241/27260/27241) [24]/[24,12,27] r=0 lpr=0 crt=0'0 mlcod 0'0 inactive] exit Initial 0.000203 0 0.000000
    -7> 2015-09-09 12:21:15.551481 7f45fc52e800  5 osd.24 pg_epoch: 27268 pg[2.2d( empty local-les=27266 n=0 ec=1 les/c 27266/27268 27241/27260/27241) [24]/[24,12,27] r=0 lpr=0 crt=0'0 mlcod 0'0 inactive] enter Reset
    -6> 2015-09-09 12:21:15.551595 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[2.2e(unlocked)] enter Initial
    -5> 2015-09-09 12:21:15.551882 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[2.2e( empty local-les=21918 n=0 ec=1 les/c 22911/22919 27129/27129/27129) [24,6]/[24,6,31] r=0 lpr=0 pi=21916-27128/113 crt=0'0 mlcod 0'0 inactive] exit Initial 0.000287 0 0.000000
    -4> 2015-09-09 12:21:15.551903 7f45fc52e800  5 osd.24 pg_epoch: 27129 pg[2.2e( empty local-les=21918 n=0 ec=1 les/c 22911/22919 27129/27129/27129) [24,6]/[24,6,31] r=0 lpr=0 pi=21916-27128/113 crt=0'0 mlcod 0'0 inactive] enter Reset
    -3> 2015-09-09 12:21:15.553077 7f45fc52e800  5 osd.24 pg_epoch: 27283 pg[2.37(unlocked)] enter Initial
    -2> 2015-09-09 12:21:15.553328 7f45fc52e800  5 osd.24 pg_epoch: 27283 pg[2.37( empty local-les=26684 n=0 ec=1 les/c 26317/26334 27277/27277/27129) [24,30]/[24,30,34] r=0 lpr=0 pi=26310-27276/8 crt=0'0 mlcod 0'0 inactive] exit Initial 0.000251 0 0.000000
    -1> 2015-09-09 12:21:15.553349 7f45fc52e800  5 osd.24 pg_epoch: 27283 pg[2.37( empty local-les=26684 n=0 ec=1 les/c 26317/26334 27277/27277/27129) [24,30]/[24,30,34] r=0 lpr=0 pi=26310-27276/8 crt=0'0 mlcod 0'0 inactive] enter Reset
     0> 2015-09-09 12:21:15.554914 7f45fc52e800 -1 osd/PG.cc: In function 'static epoch_t PG::peek_map_epoch(ObjectStore*, spg_t, ceph::bufferlist*)' thread 7f45fc52e800 time 2015-09-09 12:21:15.553449
osd/PG.cc: 2864: FAILED assert(values.size() == 1)

 ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
 1: (PG::peek_map_epoch(ObjectStore*, spg_t, ceph::buffer::list*)+0x803) [0x826d73]
 2: (OSD::load_pgs()+0x1506) [0x6697a6]
 3: (OSD::init()+0x174e) [0x68a89e]
 4: (main()+0x384f) [0x62e2cf]
 5: (__libc_start_main()+0xfd) [0x3dbd81ed5d]
 6: /usr/bin/ceph-osd() [0x6299d9]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_replay
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 keyvaluestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 xio
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.24.log
--- end dump of recent events ---
2015-09-09 12:21:15.559266 7f45fc52e800 -1 *** Caught signal (Aborted) **
 in thread 7f45fc52e800

 ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
 1: /usr/bin/ceph-osd() [0xa48445]
 2: /lib64/libpthread.so.0() [0x3dbdc0f710]
 3: (gsignal()+0x35) [0x3dbd832625]
 4: (abort()+0x175) [0x3dbd833e05]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x3fdf2bea7d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
     0> 2015-09-09 12:21:15.559266 7f45fc52e800 -1 *** Caught signal (Aborted) **
 in thread 7f45fc52e800

 ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
 1: /usr/bin/ceph-osd() [0xa48445]
 2: /lib64/libpthread.so.0() [0x3dbdc0f710]
 3: (gsignal()+0x35) [0x3dbd832625]
 4: (abort()+0x175) [0x3dbd833e05]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x3fdf2bea7d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_replay
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 keyvaluestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 xio
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.24.log
--- end dump of recent events ---

[-- Attachment #3: cli.24.log --]
[-- Type: application/octet-stream, Size: 4234 bytes --]

starting osd.24 at :/0 osd_data /ceph/data24 /opt/ceph/osd/24/journal
2015-09-09 12:28:30.146367 7fc36bee4800 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
osd/PG.cc: In function 'static epoch_t PG::peek_map_epoch(ObjectStore*, spg_t, ceph::bufferlist*)' thread 7fc36bee4800 time 2015-09-09 12:28:31.064297
osd/PG.cc: 2864: FAILED assert(values.size() == 1)
 ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
 1: (PG::peek_map_epoch(ObjectStore*, spg_t, ceph::buffer::list*)+0x803) [0x826d73]
 2: (OSD::load_pgs()+0x1506) [0x6697a6]
 3: (OSD::init()+0x174e) [0x68a89e]
 4: (main()+0x384f) [0x62e2cf]
 5: (__libc_start_main()+0xfd) [0x3dbd81ed5d]
 6: /usr/bin/ceph-osd() [0x6299d9]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2015-09-09 12:28:31.065935 7fc36bee4800 -1 osd/PG.cc: In function 'static epoch_t PG::peek_map_epoch(ObjectStore*, spg_t, ceph::bufferlist*)' thread 7fc36bee4800 time 2015-09-09 12:28:31.064297
osd/PG.cc: 2864: FAILED assert(values.size() == 1)

 ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
 1: (PG::peek_map_epoch(ObjectStore*, spg_t, ceph::buffer::list*)+0x803) [0x826d73]
 2: (OSD::load_pgs()+0x1506) [0x6697a6]
 3: (OSD::init()+0x174e) [0x68a89e]
 4: (main()+0x384f) [0x62e2cf]
 5: (__libc_start_main()+0xfd) [0x3dbd81ed5d]
 6: /usr/bin/ceph-osd() [0x6299d9]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

  -135> 2015-09-09 12:28:30.146367 7fc36bee4800 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
     0> 2015-09-09 12:28:31.065935 7fc36bee4800 -1 osd/PG.cc: In function 'static epoch_t PG::peek_map_epoch(ObjectStore*, spg_t, ceph::bufferlist*)' thread 7fc36bee4800 time 2015-09-09 12:28:31.064297
osd/PG.cc: 2864: FAILED assert(values.size() == 1)

 ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
 1: (PG::peek_map_epoch(ObjectStore*, spg_t, ceph::buffer::list*)+0x803) [0x826d73]
 2: (OSD::load_pgs()+0x1506) [0x6697a6]
 3: (OSD::init()+0x174e) [0x68a89e]
 4: (main()+0x384f) [0x62e2cf]
 5: (__libc_start_main()+0xfd) [0x3dbd81ed5d]
 6: /usr/bin/ceph-osd() [0x6299d9]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

terminate called after throwing an instance of 'ceph::FailedAssertion'
*** Caught signal (Aborted) **
 in thread 7fc36bee4800
tcmalloc: large alloc 0 bytes == (nil) @  0x3fdfe36c74 0xafceab 0xa485d9 0x3dbdc0f710 0x3dbd832625 0x3dbd833e05 0x3fdf2bea7d 0x3fdf2bcbd6 0x3fdf2bcc03 0x3fdf2bcd22 0xb1b19a 0x826d73 0x6697a6 0x68a89e 0x62e2cf 0x3dbd81ed5d 0x6299d9
 ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
 1: /usr/bin/ceph-osd() [0xa48445]
 2: /lib64/libpthread.so.0() [0x3dbdc0f710]
 3: (gsignal()+0x35) [0x3dbd832625]
 4: (abort()+0x175) [0x3dbd833e05]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x3fdf2bea7d]
tcmalloc: large alloc 0 bytes == (nil) @  0x3fdfe36c74 0xafceab 0xa487ef 0x3dbdc0f710 0x3dbd832625 0x3dbd833e05 0x3fdf2bea7d 0x3fdf2bcbd6 0x3fdf2bcc03 0x3fdf2bcd22 0xb1b19a 0x826d73 0x6697a6 0x68a89e 0x62e2cf 0x3dbd81ed5d 0x6299d9
2015-09-09 12:28:31.070844 7fc36bee4800 -1 *** Caught signal (Aborted) **
 in thread 7fc36bee4800

 ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
 1: /usr/bin/ceph-osd() [0xa48445]
 2: /lib64/libpthread.so.0() [0x3dbdc0f710]
 3: (gsignal()+0x35) [0x3dbd832625]
 4: (abort()+0x175) [0x3dbd833e05]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x3fdf2bea7d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

     0> 2015-09-09 12:28:31.070844 7fc36bee4800 -1 *** Caught signal (Aborted) **
 in thread 7fc36bee4800

 ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
 1: /usr/bin/ceph-osd() [0xa48445]
 2: /lib64/libpthread.so.0() [0x3dbdc0f710]
 3: (gsignal()+0x35) [0x3dbd832625]
 4: (abort()+0x175) [0x3dbd833e05]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x3fdf2bea7d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Aborted (core dumped)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Failed on starting osd-daemon after upgrade giant-0.87.1 to hammer-0.94.3
  2015-09-09 12:15 Failed on starting osd-daemon after upgrade giant-0.87.1 to hammer-0.94.3 王锐
@ 2015-09-10 22:23 ` Sage Weil
       [not found]   ` <tencent_7EDD2DF72EF83A9470528405@qq.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Sage Weil @ 2015-09-10 22:23 UTC (permalink / raw)
  To: 王锐; +Cc: ceph-devel

Hi!

On Wed, 9 Sep 2015, 王锐 wrote:
> Hi all:
> 
> I got an error after upgrading my ceph cluster from giant-0.87.2 to hammer-0.94.3; my local environment is:
> CentOS 6.7 x86_64
> Kernel 3.10.86-1.el6.elrepo.x86_64
> HDD: XFS, 2TB
> Install Package: ceph.com official RPMs x86_64
> 
> step 1: 
> Upgrade MON server from 0.87.1 to 0.94.3, all is fine!
> 
> step 2: 
> Upgrade OSD servers from 0.87.1 to 0.94.3. I upgraded just two servers and noticed that some osds cannot be started!
> server-1 has 4 osds, none of them can be started;
> server-2 has 3 osds, 2 of them cannot be started, but 1 of them started successfully and works fine.
> 
> Error log 1:
> service ceph start osd.4
> /var/log/ceph/ceph-osd.24.log 
> (attachment file: ceph.24.log)
> 
> Error log 2:
> /usr/bin/ceph-osd -c /etc/ceph/ceph.conf -i 4 -f
>  (attachment file: cli.24.log)

This looks a lot like a problem with a stray directory that older versions 
did not clean up (#11429)... but not quite.  Have you deleted pools in the 
past? (Can you attach a 'ceph osd dump'?)  Also, if you start the osd 
with 'debug osd = 20' and 'debug filestore = 20', we can see which PG is 
problematic.  If you install the 'ceph-test' package which contains 
ceph-kvstore-tool, the output of 

 ceph-kvstore-tool /var/lib/ceph/osd/ceph-$id/current/db list

would also be helpful.
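
The two debug settings can also be captured as a small ceph.conf fragment. A hedged sketch: it stages the overrides in a scratch file rather than touching the live /etc/ceph/ceph.conf, and the osd id 24 is just the one from this thread.

```shell
# Sketch only: stage the debug overrides in a scratch file; on the affected
# host you would merge these lines into /etc/ceph/ceph.conf, or pass
# --debug_osd 20 --debug_filestore 20 on the ceph-osd command line.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
[osd]
    debug osd = 20
    debug filestore = 20
EOF
# Then reproduce the crash in the foreground to capture the verbose log,
# e.g. (not executed here):
#   /usr/bin/ceph-osd -c /etc/ceph/ceph.conf -i 24 -f
grep -c 'debug' "$CONF"    # prints 2
```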

Thanks!
sage

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Failed on starting osd-daemon after upgrade giant-0.87.1 tohammer-0.94.3
       [not found]   ` <tencent_7EDD2DF72EF83A9470528405@qq.com>
@ 2015-09-11 12:56     ` Sage Weil
  2015-09-11 13:53       ` Haomai Wang
       [not found]       ` <CACJqLyYMw60oWcX-GQKTQ6d4Rz8f5gNsXeANOrHCcsWJnzEpkQ@mail.gmail.com>
  0 siblings, 2 replies; 7+ messages in thread
From: Sage Weil @ 2015-09-11 12:56 UTC (permalink / raw)
  To: 王锐; +Cc: ceph-devel

On Fri, 11 Sep 2015, 王锐 wrote:
> Thank Sage Weil:
> 
> 1. I deleted some testing pools in the past, but that was a long time ago (maybe 2 months ago); no pools were deleted around the recent upgrade.
> 2. For 'ceph osd dump', please see the attachment file ceph.osd.dump.log.
> 3. With 'debug osd = 20' and 'debug filestore = 20', see the attachment file ceph.osd.5.log.tar.gz.

This one is failing on pool 54, which has been deleted.  In this case you 
can work around it by renaming current/54.* out of the way.
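
That rename can be scripted. A hedged sketch of the workaround, exercised here against a throwaway directory standing in for the real osd_data path (on a live OSD you would stop the daemon first and point OSD_DATA at e.g. /ceph/data5; the `_head` directory names are only illustrative filestore-style examples):

```shell
# Demo on a scratch tree: everything matching current/54.* (the deleted
# pool's leftover PG dirs) moves aside so load_pgs no longer sees it,
# while PGs of live pools stay put.
OSD_DATA=$(mktemp -d)                    # stand-in for the real osd_data mount
mkdir -p "$OSD_DATA"/current/54.0_head \
         "$OSD_DATA"/current/54.1_head \
         "$OSD_DATA"/current/2.10_head   # a live pool's PG, must remain

STRAY="$OSD_DATA"/stray-pgs
mkdir -p "$STRAY"
for d in "$OSD_DATA"/current/54.*; do    # the current/54.* glob from above
    [ -e "$d" ] && mv "$d" "$STRAY"/
done

ls "$OSD_DATA"/current                   # prints only 2.10_head
```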

> 4. I installed the ceph-test package, but got an error:
> ceph-kvstore-tool /ceph/data5/current/db list 
> Invalid argument: /ceph/data5/current/db: does not exist (create_if_missing is false)

Sorry, I should have said current/omap, not current/db.  I'm still curious 
to see the key dump.  I'm not sure why the leveldb key for these pgs is 
missing...
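
A hedged sketch of that corrected invocation (the /ceph/data5 path is taken from this thread; the guard lets the snippet degrade gracefully on a machine without the ceph-test package installed):

```shell
# List the omap leveldb keys using the corrected current/omap path.
# ceph-kvstore-tool ships in the ceph-test package; when the tool or the
# omap directory is absent we record that instead of failing.
OSD_DATA="${OSD_DATA:-/ceph/data5}"      # assumption: layout from this thread
OUT=$(mktemp)
if command -v ceph-kvstore-tool >/dev/null 2>&1 \
        && [ -d "$OSD_DATA/current/omap" ]; then
    ceph-kvstore-tool "$OSD_DATA/current/omap" list > "$OUT"
else
    echo "ceph-kvstore-tool or $OSD_DATA/current/omap not available" > "$OUT"
fi
cat "$OUT"                               # the key dump, or the fallback note
```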

Thanks!
sage


> 
> ls -l /ceph/data5/current/db
> total 0
> -rw-r--r-- 1 root root 0 Sep 11 09:41 LOCK
> -rw-r--r-- 1 root root 0 Sep 11 09:54 LOG
> -rw-r--r-- 1 root root 0 Sep 11 09:54 LOG.old
> 
> Thanks very much!
> Wang Rui 
>  
> ------------------ Original ------------------
> From:  "Sage Weil"<sage@newdream.net>;
> Date:  Fri, Sep 11, 2015 06:23 AM
> To:  "王锐"<wangrui@tvmining.com>;
> Cc:  "ceph-devel"<ceph-devel@vger.kernel.org>;
> Subject:  Re: Failed on starting osd-daemon after upgrade giant-0.87.1 tohammer-0.94.3
>  
> Hi!
> 
> On Wed, 9 Sep 2015, 王锐 wrote:
> > Hi all:
> > 
> > I got an error after upgrading my ceph cluster from giant-0.87.2 to hammer-0.94.3; my local environment is:
> > CentOS 6.7 x86_64
> > Kernel 3.10.86-1.el6.elrepo.x86_64
> > HDD: XFS, 2TB
> > Install Package: ceph.com official RPMs x86_64
> > 
> > step 1: 
> > Upgrade MON server from 0.87.1 to 0.94.3, all is fine!
> > 
> > step 2: 
> > Upgrade OSD servers from 0.87.1 to 0.94.3. I upgraded just two servers and noticed that some osds cannot be started!
> > server-1 has 4 osds, none of them can be started;
> > server-2 has 3 osds, 2 of them cannot be started, but 1 of them started successfully and works fine.
> > 
> > Error log 1:
> > service ceph start osd.4
> > /var/log/ceph/ceph-osd.24.log 
> > (attachment file: ceph.24.log)
> > 
> > Error log 2:
> > /usr/bin/ceph-osd -c /etc/ceph/ceph.conf -i 4 -f
> >  (attachment file: cli.24.log)
> 
> This looks a lot like a problem with a stray directory that older versions 
> did not clean up (#11429)... but not quite.  Have you deleted pools in the 
> past? (Can you attach a 'ceph osd dump'?)  Also, if you start the osd 
> with 'debug osd = 20' and 'debug filestore = 20', we can see which PG is 
> problematic.  If you install the 'ceph-test' package which contains 
> ceph-kvstore-tool, the output of 
> 
>  ceph-kvstore-tool /var/lib/ceph/osd/ceph-$id/current/db list
> 
> would also be helpful.
> 
> Thanks!
> sage

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Failed on starting osd-daemon after upgrade giant-0.87.1 tohammer-0.94.3
  2015-09-11 12:56     ` Failed on starting osd-daemon after upgrade giant-0.87.1 tohammer-0.94.3 Sage Weil
@ 2015-09-11 13:53       ` Haomai Wang
       [not found]       ` <CACJqLyYMw60oWcX-GQKTQ6d4Rz8f5gNsXeANOrHCcsWJnzEpkQ@mail.gmail.com>
  1 sibling, 0 replies; 7+ messages in thread
From: Haomai Wang @ 2015-09-11 13:53 UTC (permalink / raw)
  To: Sage Weil; +Cc: 王锐, ceph-devel

Yesterday I had a chat with wangrui; the reason is that the "infos" (legacy
oid) object is missing. I'm not sure why it is missing.

PS: resend again because of plain text

On Fri, Sep 11, 2015 at 8:56 PM, Sage Weil <sage@newdream.net> wrote:
> On Fri, 11 Sep 2015, 王锐 wrote:
>> Thank Sage Weil:
>>
>> 1. I deleted some testing pools in the past, but that was a long time ago (maybe 2 months ago); no pools were deleted around the recent upgrade.
>> 2. For 'ceph osd dump', please see the attachment file ceph.osd.dump.log.
>> 3. With 'debug osd = 20' and 'debug filestore = 20', see the attachment file ceph.osd.5.log.tar.gz.
>
> This one is failing on pool 54, which has been deleted.  In this case you
> can work around it by renaming current/54.* out of the way.
>
>> 4. I installed the ceph-test package, but got an error:
>> ceph-kvstore-tool /ceph/data5/current/db list
>> Invalid argument: /ceph/data5/current/db: does not exist (create_if_missing is false)
>
> Sorry, I should have said current/omap, not current/db.  I'm still curious
> to see the key dump.  I'm not sure why the leveldb key for these pgs is
> missing...
>
> Thanks!
> sage
>
>
>>
>> ls -l /ceph/data5/current/db
>> total 0
>> -rw-r--r-- 1 root root 0 Sep 11 09:41 LOCK
>> -rw-r--r-- 1 root root 0 Sep 11 09:54 LOG
>> -rw-r--r-- 1 root root 0 Sep 11 09:54 LOG.old
>>
>> Thanks very much!
>> Wang Rui
>>
>> ------------------ Original ------------------
>> From:  "Sage Weil"<sage@newdream.net>;
>> Date:  Fri, Sep 11, 2015 06:23 AM
>> To:  "王锐"<wangrui@tvmining.com>;
>> Cc:  "ceph-devel"<ceph-devel@vger.kernel.org>;
>> Subject:  Re: Failed on starting osd-daemon after upgrade giant-0.87.1 tohammer-0.94.3
>>
>> Hi!
>>
>> On Wed, 9 Sep 2015, 王锐 wrote:
>> > Hi all:
>> >
>> > I got an error after upgrading my ceph cluster from giant-0.87.2 to hammer-0.94.3; my local environment is:
>> > CentOS 6.7 x86_64
>> > Kernel 3.10.86-1.el6.elrepo.x86_64
>> > HDD: XFS, 2TB
>> > Install Package: ceph.com official RPMs x86_64
>> >
>> > step 1:
>> > Upgrade MON server from 0.87.1 to 0.94.3, all is fine!
>> >
>> > step 2:
>> > Upgrade OSD servers from 0.87.1 to 0.94.3. I upgraded just two servers and noticed that some osds cannot be started!
>> > server-1 has 4 osds, none of them can be started;
>> > server-2 has 3 osds, 2 of them cannot be started, but 1 of them started successfully and works fine.
>> >
>> > Error log 1:
>> > service ceph start osd.4
>> > /var/log/ceph/ceph-osd.24.log
>> > (attachment file: ceph.24.log)
>> >
>> > Error log 2:
>> > /usr/bin/ceph-osd -c /etc/ceph/ceph.conf -i 4 -f
>> >  (attachment file: cli.24.log)
>>
>> This looks a lot like a problem with a stray directory that older versions
>> did not clean up (#11429)... but not quite.  Have you deleted pools in the
>> past? (Can you attach a 'ceph osd dump'?)  Also, if you start the osd
>> with 'debug osd = 20' and 'debug filestore = 20', we can see which PG is
>> problematic.  If you install the 'ceph-test' package which contains
>> ceph-kvstore-tool, the output of
>>
>>  ceph-kvstore-tool /var/lib/ceph/osd/ceph-$id/current/db list
>>
>> would also be helpful.
>>
>> Thanks!
>> sage
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Best Regards,

Wheat

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Failed on starting osd-daemon after upgrade giant-0.87.1 tohammer-0.94.3
       [not found]       ` <CACJqLyYMw60oWcX-GQKTQ6d4Rz8f5gNsXeANOrHCcsWJnzEpkQ@mail.gmail.com>
@ 2015-09-11 14:09         ` Sage Weil
  2015-09-11 14:57           ` Haomai Wang
  0 siblings, 1 reply; 7+ messages in thread
From: Sage Weil @ 2015-09-11 14:09 UTC (permalink / raw)
  To: Haomai Wang; +Cc: 王锐, ceph-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 5108 bytes --]

On Fri, 11 Sep 2015, Haomai Wang wrote:
> On Fri, Sep 11, 2015 at 8:56 PM, Sage Weil <sage@newdream.net> wrote:
>       On Fri, 11 Sep 2015, 王锐 wrote:
>       > Thank Sage Weil:
>       >
>       > 1. I deleted some testing pools in the past, but that was a long
>       time ago (maybe 2 months ago); no pools were deleted around the
>       recent upgrade.
>       > 2. For 'ceph osd dump', please see the attachment file
>       ceph.osd.dump.log.
>       > 3. With 'debug osd = 20' and 'debug filestore = 20', see the
>       attachment file ceph.osd.5.log.tar.gz.
> 
>       This one is failing on pool 54, which has been deleted.  In this
>       case you
>       can work around it by renaming current/54.* out of the way.
> 
>       > 4. I installed the ceph-test package, but got an error:
>       > ceph-kvstore-tool /ceph/data5/current/db list
>       > Invalid argument: /ceph/data5/current/db: does not exist
>       (create_if_missing is false)
> 
>       Sorry, I should have said current/omap, not current/db.  I'm
>       still curious
>       to see the key dump.  I'm not sure why the leveldb key for these
>       pgs is
>       missing...
> 
> 
> Yesterday I had a chat with wangrui; the reason is that the "infos" (legacy
> oid) object is missing. I'm not sure why it is missing.

Probably

https://github.com/ceph/ceph/blob/hammer/src/osd/OSD.cc#L2908

Oh, I think I see what happened:

 - the pg removal was aborted pre-hammer.  On pre-hammer, this means that 
load_pgs skips it here:

 https://github.com/ceph/ceph/blob/firefly/src/osd/OSD.cc#L2121

 - we upgrade to hammer.  we skip this pg (same reason), don't upgrade it, 
but delete the legacy infos object

 https://github.com/ceph/ceph/blob/hammer/src/osd/OSD.cc#L2908

 - now we see this crash...

I think the fix is, in hammer, to bail out of peek_map_epoch if the infos 
object isn't present, here

 https://github.com/ceph/ceph/blob/hammer/src/osd/PG.cc#L2867

Probably we should restructure so we can return a 'fail' value 
instead of a magic epoch_t meaning the same...

This is similar to the bug I'm fixing on master (and I think I just 
realized what I was doing wrong there).

Thanks!
sage



>  
> 
>       Thanks!
>       sage
> 
> 
>       >
>       > ls -l /ceph/data5/current/db
>       > total 0
>       > -rw-r--r-- 1 root root 0 Sep 11 09:41 LOCK
>       > -rw-r--r-- 1 root root 0 Sep 11 09:54 LOG
>       > -rw-r--r-- 1 root root 0 Sep 11 09:54 LOG.old
>       >
>       > Thanks very much!
>       > Wang Rui
>       >
>       > ------------------ Original ------------------
>       > From:  "Sage Weil"<sage@newdream.net>;
>       > Date:  Fri, Sep 11, 2015 06:23 AM
>       > To:  "王锐"<wangrui@tvmining.com>;
>       > Cc:  "ceph-devel"<ceph-devel@vger.kernel.org>;
>       > Subject:  Re: Failed on starting osd-daemon after upgrade
>       giant-0.87.1 tohammer-0.94.3
>       >
>       > Hi!
>       >
>       > On Wed, 9 Sep 2015, 王锐 wrote:
>       > > Hi all:
>       > >
>       > > I got an error after upgrading my ceph cluster from
>       giant-0.87.2 to hammer-0.94.3, my local environment is:
>       > > CentOS 6.7 x86_64
>       > > Kernel 3.10.86-1.el6.elrepo.x86_64
>       > > HDD: XFS, 2TB
>       > > Install Package: ceph.com official RPMs x86_64
>       > >
>       > > step 1:
>       > > Upgrade MON server from 0.87.1 to 0.94.3, all is fine!
>       > >
>       > > step 2:
>       > > Upgrade OSD servers from 0.87.1 to 0.94.3. I upgraded just two
>       servers and noticed that some osds cannot be started!
>       > > server-1 has 4 osds, none of them can be started;
>       > > server-2 has 3 osds, 2 of them cannot be started, but 1 of
>       them started successfully and works fine.
>       > >
>       > > Error log 1:
>       > > service ceph start osd.4
>       > > /var/log/ceph/ceph-osd.24.log
>       > > (attachment file: ceph.24.log)
>       > >
>       > > Error log 2:
>       > > /usr/bin/ceph-osd -c /etc/ceph/ceph.conf -i 4 -f
>       > >  (attachment file: cli.24.log)
>       >
>       > This looks a lot like a problem with a stray directory that
>       older versions
>       > did not clean up (#11429)... but not quite.  Have you deleted
>       pools in the
>       > past? (Can you attach a 'ceph osd dump'?)  Also, if you start
>       the osd
>       > with 'debug osd = 20' and 'debug filestore = 20' we can see
>       which PG is
>       > problematic.  If you install the 'ceph-test' package which
>       contains
>       > ceph-kvstore-tool, the output of
>       >
>       >  ceph-kvstore-tool /var/lib/ceph/osd/ceph-$id/current/db list
>       >
>       > would also be helpful.
>       >
>       > Thanks!
>       > sage
> 
> 
> 
> 
> --
> 
> Best Regards,
> 
> Wheat
> 
> 
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Failed on starting osd-daemon after upgrade giant-0.87.1 tohammer-0.94.3
  2015-09-11 14:09         ` Sage Weil
@ 2015-09-11 14:57           ` Haomai Wang
  2015-09-11 15:10             ` Sage Weil
  0 siblings, 1 reply; 7+ messages in thread
From: Haomai Wang @ 2015-09-11 14:57 UTC (permalink / raw)
  To: Sage Weil; +Cc: 王锐, ceph-devel

On Fri, Sep 11, 2015 at 10:09 PM, Sage Weil <sage@newdream.net> wrote:
> On Fri, 11 Sep 2015, Haomai Wang wrote:
>> On Fri, Sep 11, 2015 at 8:56 PM, Sage Weil <sage@newdream.net> wrote:
> >>       On Fri, 11 Sep 2015, 王锐 wrote:
>>       > Thank Sage Weil:
>>       >
> >>       > 1. I deleted some testing pools in the past, but that was a long
> >>       time ago (maybe 2 months ago); no pools were deleted around the
> >>       recent upgrade.
> >>       > 2. For 'ceph osd dump', please see the attachment file
> >>       ceph.osd.dump.log.
> >>       > 3. With 'debug osd = 20' and 'debug filestore = 20', see the
> >>       attachment file ceph.osd.5.log.tar.gz.
>>
>>       This one is failing on pool 54, which has been deleted.  In this
>>       case you
>>       can work around it by renaming current/54.* out of the way.
>>
>>       > 4. I installed the ceph-test package, but got an error:
>>       > ceph-kvstore-tool /ceph/data5/current/db list
>>       > Invalid argument: /ceph/data5/current/db: does not exist
>>       (create_if_missing is false)
>>
>>       Sorry, I should have said current/omap, not current/db.  I'm
>>       still curious
>>       to see the key dump.  I'm not sure why the leveldb key for these
>>       pgs is
>>       missing...
>>
>>
>> Yesterday I had a chat with wangrui; the reason is that the "infos" (legacy
>> oid) object is missing. I'm not sure why it is missing.
>
> Probably
>
> https://github.com/ceph/ceph/blob/hammer/src/osd/OSD.cc#L2908
>
> Oh, I think I see what happened:
>
> >  - the pg removal was aborted pre-hammer.  On pre-hammer, this means that
> load_pgs skips it here:
>
>  https://github.com/ceph/ceph/blob/firefly/src/osd/OSD.cc#L2121
>
>  - we upgrade to hammer.  we skip this pg (same reason), don't upgrade it,
> > but delete the legacy infos object
>
>  https://github.com/ceph/ceph/blob/hammer/src/osd/OSD.cc#L2908
>
>  - now we see this crash...
>
> I think the fix is, in hammer, to bail out of peek_map_epoch if the infos
> object isn't present, here
>
>  https://github.com/ceph/ceph/blob/hammer/src/osd/PG.cc#L2867
>
> Probably we should restructure so we can return a 'fail' value
> instead of a magic epoch_t meaning the same...
>
> This is similar to the bug I'm fixing on master (and I think I just
> realized what I was doing wrong there).

Hmm, I got it. So we could skip this assert, or, as load_pgs does, check
whether the pool still exists?

I think this is an urgent bug, because I remember several people showing
me a similar crash.


>
> Thanks!
> sage
>
>
>
>>
>>
>>       Thanks!
>>       sage
>>
>>
>>       >
>>       > ls -l /ceph/data5/current/db
>>       > total 0
>>       > -rw-r--r-- 1 root root 0 Sep 11 09:41 LOCK
>>       > -rw-r--r-- 1 root root 0 Sep 11 09:54 LOG
>>       > -rw-r--r-- 1 root root 0 Sep 11 09:54 LOG.old
>>       >
>>       > Thanks very much!
>>       > Wang Rui
>>       >
>>       > ------------------ Original ------------------
>>       > From:  "Sage Weil"<sage@newdream.net>;
>>       > Date:  Fri, Sep 11, 2015 06:23 AM
>>       > To:  "王锐"<wangrui@tvmining.com>;
>>       > Cc:  "ceph-devel"<ceph-devel@vger.kernel.org>;
>>       > Subject:  Re: Failed on starting osd-daemon after upgrade
>>       giant-0.87.1 tohammer-0.94.3
>>       >
>>       > Hi!
>>       >
>>       > On Wed, 9 Sep 2015, 王锐 wrote:
>>       > > Hi all:
>>       > >
>>       > > I got an error after upgrading my ceph cluster from
>>       giant-0.87.2 to hammer-0.94.3, my local environment is:
>>       > > CentOS 6.7 x86_64
>>       > > Kernel 3.10.86-1.el6.elrepo.x86_64
>>       > > HDD: XFS, 2TB
>>       > > Install Package: ceph.com official RPMs x86_64
>>       > >
>>       > > step 1:
>>       > > Upgrade MON server from 0.87.1 to 0.94.3, all is fine!
>>       > >
>>       > > step 2:
>>       > > Upgrade OSD servers from 0.87.1 to 0.94.3. I upgraded just two
>>       servers and noticed that some osds cannot be started!
>>       > > server-1 has 4 osds, none of them can be started;
>>       > > server-2 has 3 osds, 2 of them cannot be started, but 1 of
>>       them started successfully and works fine.
>>       > >
>>       > > Error log 1:
>>       > > service ceph start osd.4
>>       > > /var/log/ceph/ceph-osd.24.log
>>       > > (attachment file: ceph.24.log)
>>       > >
>>       > > Error log 2:
>>       > > /usr/bin/ceph-osd -c /etc/ceph/ceph.conf -i 4 -f
>>       > >  (attachment file: cli.24.log)
>>       >
>>       > This looks a lot like a problem with a stray directory that
>>       older versions
>>       > did not clean up (#11429)... but not quite.  Have you deleted
>>       pools in the
>>       > past? (Can you attach a 'ceph osd dump'?)  Also, if you start
>>       the osd
>>       > with 'debug osd = 20' and 'debug filestore = 20' we can see
>>       which PG is
>>       > problematic.  If you install the 'ceph-test' package which
>>       contains
>>       > ceph-kvstore-tool, the output of
>>       >
>>       >  ceph-kvstore-tool /var/lib/ceph/osd/ceph-$id/current/db list
>>       >
>>       > would also be helpful.
>>       >
>>       > Thanks!
>>       > sage
>>
>>
>>
>>
>> --
>>
>> Best Regards,
>>
>> Wheat
>>
>>
>>



-- 
Best Regards,

Wheat

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Failed on starting osd-daemon after upgrade giant-0.87.1 tohammer-0.94.3
  2015-09-11 14:57           ` Haomai Wang
@ 2015-09-11 15:10             ` Sage Weil
  0 siblings, 0 replies; 7+ messages in thread
From: Sage Weil @ 2015-09-11 15:10 UTC (permalink / raw)
  To: Haomai Wang; +Cc: 王锐, ceph-devel

On Fri, 11 Sep 2015, Haomai Wang wrote:
> On Fri, Sep 11, 2015 at 10:09 PM, Sage Weil <sage@newdream.net> wrote:
> > On Fri, 11 Sep 2015, Haomai Wang wrote:
> >> On Fri, Sep 11, 2015 at 8:56 PM, Sage Weil <sage@newdream.net> wrote:
> >>       On Fri, 11 Sep 2015, 王锐 wrote:
> >>       > Thank Sage Weil:
> >>       >
> >>       > 1. I deleted some testing pools in the past, but that was a long
> >>       time ago (maybe 2 months ago); no pools were deleted around the
> >>       recent upgrade.
> >>       > 2. For 'ceph osd dump', please see the attachment file
> >>       ceph.osd.dump.log.
> >>       > 3. With 'debug osd = 20' and 'debug filestore = 20', see the
> >>       attachment file ceph.osd.5.log.tar.gz.
> >>
> >>       This one is failing on pool 54, which has been deleted.  In this
> >>       case you
> >>       can work around it by renaming current/54.* out of the way.
> >>
> >>       > 4. I installed the ceph-test package, but got an error:
> >>       > ceph-kvstore-tool /ceph/data5/current/db list
> >>       > Invalid argument: /ceph/data5/current/db: does not exist
> >>       (create_if_missing is false)
> >>
> >>       Sorry, I should have said current/omap, not current/db.  I'm
> >>       still curious
> >>       to see the key dump.  I'm not sure why the leveldb key for these
> >>       pgs is
> >>       missing...
> >>
> >>
> >> Yesterday I had a chat with wangrui, and the reason is that the "infos"
> >> object (legacy oid) is missing. I'm not sure why it's missing.
> >
> > Probably
> >
> > https://github.com/ceph/ceph/blob/hammer/src/osd/OSD.cc#L2908
> >
> > Oh, I think I see what happened:
> >
> >  - the pg removal was aborted pre-hammer.  On pre-hammer, this means that
> > load_pgs skips it here:
> >
> >  https://github.com/ceph/ceph/blob/firefly/src/osd/OSD.cc#L2121
> >
> >  - we upgrade to hammer.  We skip this pg (same reason), don't upgrade it,
> > but delete the legacy infos object
> >
> >  https://github.com/ceph/ceph/blob/hammer/src/osd/OSD.cc#L2908
> >
> >  - now we see this crash...
> >
> > I think the fix is, in hammer, to bail out of peek_map_epoch if the infos
> > object isn't present, here
> >
> >  https://github.com/ceph/ceph/blob/hammer/src/osd/PG.cc#L2867
> >
> > Probably we should restructure so we can return a 'fail' value
> > instead of a magic epoch_t meaning the same...
> >
> > This is similar to the bug I'm fixing on master (and I think I just
> > realized what I was doing wrong there).
> 
> Hmm, I got it. So we could either skip this assert, or check whether the
> pool exists, as load_pgs does?
> 
> I think it's an urgent bug, because I remember several people have shown
> me a similar crash.

Yeah.. take a look at https://github.com/ceph/ceph/pull/5892

Does that look right to you?  Packages are building now...

sage


end of thread, other threads:[~2015-09-11 15:10 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-09 12:15 Failed on starting osd-daemon after upgrade giant-0.87.1 to hammer-0.94.3 王锐
2015-09-10 22:23 ` Sage Weil
     [not found]   ` <tencent_7EDD2DF72EF83A9470528405@qq.com>
2015-09-11 12:56     ` Failed on starting osd-daemon after upgrade giant-0.87.1 to hammer-0.94.3 Sage Weil
2015-09-11 13:53       ` Haomai Wang
     [not found]       ` <CACJqLyYMw60oWcX-GQKTQ6d4Rz8f5gNsXeANOrHCcsWJnzEpkQ@mail.gmail.com>
2015-09-11 14:09         ` Sage Weil
2015-09-11 14:57           ` Haomai Wang
2015-09-11 15:10             ` Sage Weil
