OSD crash when one of the PGs have problem

* OSD crash when one of the PGs have problem
@ 2016-01-20 14:10 Varada Kari
  2016-01-20 15:33 ` Sage Weil
  0 siblings, 1 reply; 9+ messages in thread
From: Varada Kari @ 2016-01-20 14:10 UTC (permalink / raw)
  To: ceph-devel

Hi all,

Sorry for the long mail. Wanted to give more information in the mail. 

We are seeing an issue in our test cluster when an osd is stopped gracefully and restarted after the maintenance. 
Please note, haven't used any no out option for the cluster. 
After the reboot, not able to one of the PG, and no drive problems are read errors are logged in dmesg/kernel logs. But we are not able to read the head object of PG for some reason, sorry of lack of logs we are running with debugs with 0/0 and build with jemalloc enabled for testing. Hence a customized version.

<snip>
2015-12-27 13:14:39.684307 7f5ae47ff700 -1 osd.1 405 *** Got signal Terminated ***
2015-12-27 13:14:39.779307 7f5ae47ff700  0 osd.1 405 prepare_to_stop telling mon we are shutting down
2016-01-04 10:46:13.966959 7fad39d1e800  0 ceph version 10.0.0(7eb76e73fd5b6e990ceba9d45e5a1fec92145fa5), process ceph-osd, pid 6975
2016-01-04 10:46:14.007516 7fad39d1e800  0 filestore(/var/lib/ceph/osd/ceph-1) backend xfs (magic 0x58465342)
2016-01-04 10:46:14.008565 7fad39d1e800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2016-01-04 10:46:14.008573 7fad39d1e800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2016-01-04 10:46:14.008601 7fad39d1e800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: splice is supported
2016-01-04 10:46:14.008882 7fad39d1e800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2016-01-04 10:46:14.008921 7fad39d1e800  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: extsize is supported and your kernel >= 3.5
2016-01-04 10:46:14.339253 7fad39d1e800  0 filestore(/var/lib/ceph/osd/ceph-1) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2016-01-04 10:46:14.532510 7fad39d1e800  0 <cls> cls/hello/cls_hello.cc:305: loading cls_hello
2016-01-04 10:46:14.534168 7fad39d1e800  0 <cls> cls/cephfs/cls_cephfs.cc:136: loading cephfs_size_scan
2016-01-04 10:46:14.535626 7fad39d1e800  0 osd.1 405 crush map has features 1107558400, adjusting msgr requires for clients
2016-01-04 10:46:14.535634 7fad39d1e800  0 osd.1 405 crush map has features 1107558400 was 8705, adjusting msgr requires for mons
2016-01-04 10:46:14.535639 7fad39d1e800  0 osd.1 405 crush map has features 1107558400, adjusting msgr requires for osds
2016-01-04 10:46:14.954060 7fad39d1e800  0 osd.1 405 load_pgs
2016-01-04 10:46:15.634892 7fad39d1e800 -1 osd.1 405 load_pgs unable to peek at 1.29e metadata, skipping
2016-01-04 10:46:16.334253 7fad39d1e800  0 osd.1 405 load_pgs opened 158 pgs
2016-01-04 10:46:16.336960 7fad39d1e800 -1 osd.1 405 log_to_monitors {default=true}
2016-01-04 10:46:16.702999 7fad273be700  0 filestore(/var/lib/ceph/osd/ceph-1)  error (17) File exists not handled on operation 0x7fad2c83f1c0 (201318686.0.0, or op 0, counting from 0)
2016-01-04 10:46:16.703014 7fad273be700  0 filestore(/var/lib/ceph/osd/ceph-1) unexpected error code
2016-01-04 10:46:16.703017 7fad273be700  0 filestore(/var/lib/ceph/osd/ceph-1)  transaction dump:
{
    "ops": [
        {
            "op_num": 0,
            "op_name": "mkcoll",
            "collection": "1.29e_head"
        },
        {
            "op_num": 1,
            "op_name": "coll_hint",
            "collection": "1.29e_head",
            "type": 1,
            "pg_num": 800,
            "expected_num_objects": 0
        },
        {
            "op_num": 2,
            "op_name": "touch",
            "collection": "1.29e_head",
            "oid": "1\/0000029e\/\/head"
        },
        {
            "op_num": 3,
            "op_name": "omap_setkeys",
            "collection": "1.29e_head",
            "oid": "1\/0000029e\/\/head",
            "attr_lens": {
                "_infover": 1
            }
        },
        {
            "op_num": 4,
            "op_name": "touch",
            "collection": "1.29e_head",
            "oid": "1\/0000029e\/\/head"
        },
        {
            "op_num": 5,
            "op_name": "omap_setkeys",
            "collection": "1.29e_head",
            "oid": "1\/0000029e\/\/head",
            "attr_lens": {
                "_biginfo": 196,
                "_epoch": 4,
                "_info": 847,
                "can_rollback_to": 12,
                "rollback_info_trimmed_to": 12
            }
        }
    ]
}

2016-01-04 10:46:16.731323 7fad273be700 -1 os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7fad273be700 time 2016-01-04 10:46:16.703103
os/FileStore.cc: 3017: FAILED assert(0 == "unexpected error")

 ceph version 10.0.0 (7eb76e73fd5b6e990ceba9d45e5a1fec92145fa5)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7fad39846a6b]
 2: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0xa48) [0x7fad3968bf58]
 3: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long, ThreadPool::TPHandle*)+0x64) [0x7fad396932a4]
 4: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x2c3) [0x7fad39693583]
 5: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa56) [0x7fad398384e6]
 6: (ThreadPool::WorkThread::entry()+0x10) [0x7fad398393b0]
 7: (()+0x8182) [0x7fad38609182]
 8: (clone()+0x6d) [0x7fad367fc47d]
<snip>

 On the reboot, because of the journal reply, we didn't see any transaction dump issues this time, but we are seeing the following problem.

 <snip>
2016-01-04 10:47:09.302400 7f8190a3c800  0 ceph version 10.0.0 (7eb76e73fd5b6e990ceba9d45e5a1fec92145fa5), process ceph-osd, pid 9488
2016-01-04 10:47:09.310055 7f8190a3c800  0 filestore(/var/lib/ceph/osd/ceph-1) backend xfs (magic 0x58465342)
2016-01-04 10:47:09.310379 7f8190a3c800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2016-01-04 10:47:09.310384 7f8190a3c800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2016-01-04 10:47:09.310398 7f8190a3c800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: splice is supported
2016-01-04 10:47:09.310869 7f8190a3c800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2016-01-04 10:47:09.310935 7f8190a3c800  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: extsize is supported and your kernel >= 3.5
2016-01-04 10:47:09.322795 7f8190a3c800  0 filestore(/var/lib/ceph/osd/ceph-1) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2016-01-04 10:47:09.322828 7f8190a3c800  0 filestore(/var/lib/ceph/osd/ceph-1) disabling wbthrottle( filestore_wbthrottle_enable = 0 ) since do_fast_sync is enabled
2016-01-04 10:47:09.327861 7f8183ee0700  0 filestore(/var/lib/ceph/osd/ceph-1) ## Fast sync is enabled ##
2016-01-04 10:47:09.392333 7f8190a3c800  0 <cls> cls/hello/cls_hello.cc:305: loading cls_hello
2016-01-04 10:47:09.393968 7f8190a3c800  0 <cls> cls/cephfs/cls_cephfs.cc:136: loading cephfs_size_scan
2016-01-04 10:47:09.394528 7f8190a3c800  0 osd.1 410 crush map has features 1107558400, adjusting msgr requires for clients
2016-01-04 10:47:09.394536 7f8190a3c800  0 osd.1 410 crush map has features 1107558400 was 8705, adjusting msgr requires for mons
2016-01-04 10:47:09.394552 7f8190a3c800  0 osd.1 410 crush map has features 1107558400, adjusting msgr requires for osds
2016-01-04 10:47:09.458215 7f8190a3c800  0 osd.1 410 load_pgs
2016-01-04 10:47:10.935841 7f8190a3c800  0 osd.1 410 load_pgs opened 159 pgs
2016-01-04 10:47:10.936696 7f8190a3c800 -1 osd.1 410 log_to_monitors {default=true}
2016-01-04 10:47:13.942139 7f8190a3c800  0 osd.1 410 done with init, starting boot process
2016-01-04 10:47:14.117463 7f817d1f3700  0 -- 10.242.43.141:6825/9488 >> 10.242.43.143:6821/40041 pipe(0x7f8141c0e000 sd=83 :6825 s=0 pgs=0 cs=0 l=0 c=0x7f8141c14180).accept connect_seq 0 vs existing 0 state connecting
2016-01-04 10:47:15.649791 7f812eefe700  0 -- 10.242.43.141:6825/9488 >> 10.242.43.141:6801/6106 pipe(0x7f8141c0e000 sd=93 :6825 s=0 pgs=0 cs=0 l=0 c=0x7f8141c14480).accept connect_seq 0 vs existing 0 state connecting
2016-01-04 10:47:15.649826 7f812edfd700  0 -- 10.242.43.141:6825/9488 >> 10.242.43.141:6821/6925 pipe(0x7f8141c1c000 sd=137 :6825 s=0 pgs=0 cs=0 l=0 c=0x7f8141c14780).accept connect_seq 0 vs existing 0 state connecting
2016-01-04 10:47:15.924059 7f81527ed700 -1 osd/PGLog.h: In function 'void PGLog::IndexedLog::claim_log_and_clear_rollback_info(const pg_log_t&)' thread 7f81527ed700 time 2016-01-04 10:47:15.906256
osd/PGLog.h: 89: FAILED assert(rollback_info_trimmed_to_riter == log.rbegin())

 ceph version 10.0.0 (7eb76e73fd5b6e990ceba9d45e5a1fec92145fa5)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7f8190564a6b]
 2: (PG::RecoveryState::Stray::react(PG::MLogRec const&)+0xa21) [0x7f8190173971]
 3: (boost::statechart::simple_state<PG::RecoveryState::Stray, PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x1f4) [0x7f81901ad584]
 4: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base const&)+0x5b) [0x7f8190197bab]
 5: (PG::handle_peering_event(std::shared_ptr<PG::CephPeeringEvt>, PG::RecoveryCtx*)+0x1ce) [0x7f81901412ce]
 6: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x267) [0x7f81900cb677]
 7: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x18) [0x7f819010fed8]
 8: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa56) [0x7f81905564e6]
 9: (ThreadPool::WorkThread::entry()+0x10) [0x7f81905573b0]
 10: (()+0x8182) [0x7f818f327182]
 11: (clone()+0x6d) [0x7f818d51a47d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
<snip>

 Whenever we restart the osd, we are hitting the problem and not able to get the osd up. Tried pg repair, which bails saying primary osd down.
 Are there any repair techniques available to get this corrected? 

 Thanks,
 Varada

^ permalink raw reply	[flat|nested] 9+ messages in thread