From: Haomai Wang
Subject: Re: xattr spillout appears broken :(
Date: Sat, 7 Jun 2014 02:55:20 +0800
To: Gregory Farnum, "ceph-devel@vger.kernel.org"

The fix should make the clone method copy "cephos"-prefixed xattrs.

On Sat, Jun 7, 2014 at 2:54 AM, Haomai Wang wrote:
> Hi Greg,
>
> I have found the reason. "user.cephos.spill_out" isn't applied to the new
> object when calling the clone method. So if the original object is spilled
> out, the new object still has no spill-out marker.
>
> I pushed a commit (https://github.com/ceph/ceph/pull/1932) to help
> cover this situation.
>
>
> On Tue, Jun 3, 2014 at 10:33 PM, Haomai Wang wrote:
>> /familiar/not familiar/
>>
>> On Tue, Jun 3, 2014 at 10:33 PM, Haomai Wang wrote:
>>> Hi Gregory,
>>>
>>> I checked each changed line of the spill-out code again and again,
>>> but still failed to find anything wrong.
>>>
>>> I ran "ceph_test_rados" and then activated the scrub process several times
>>> locally; nothing unusual. I'm familiar with teuthology jobs, maybe we
>>> can find the common thing among the failed jobs.
>>>
>>>
>>> On Sat, May 31, 2014 at 2:06 AM, Gregory Farnum wrote:
>>>> On Fri, May 30, 2014 at 2:18 AM, Haomai Wang wrote:
>>>>> Hi Gregory,
>>>>>
>>>>> I tried to reproduce the bug on my local machine but failed.
>>>>>
>>>>> My test cmdline:
>>>>> ./ceph_test_rados --op read 100 --op write 100 --op delete 50
>>>>> --max-ops 400000 --objects 1024 --max-in-flight 64 --size 4000000
>>>>> --min-stride-size 400000 --max-stride-size 800000 --max-seconds 600
>>>>> --op copy_from 50 --op snap_create 50 --op snap_remove 50 --op
>>>>> rollback 50 --op setattr 25 --op rmattr 25 --pool unique_pool_0
>>>>>
>>>>> Is there any tip to reproduce it?
>>>>
>>>> Hmm. I've been directly running the teuthology tests that failed; that
>>>> is the command line it's running though so I think the only difference
>>>> would be that I've got OSD thrashing (ie, recovery) happening while
>>>> the test is running.
>>>> Most of the other failures were scrub turning up inconsistencies in
>>>> the xattrs at each replica/shard of an object. I didn't see any
>>>> obvious mechanism by which storing values in xattrs versus leveldb
>>>> would impact these higher-level primitives, but maybe you have some
>>>> idea?
>>>> -Greg
>>>
>>>
>>> --
>>> Best Regards,
>>>
>>> Wheat
>>
>>
>> --
>> Best Regards,
>>
>> Wheat
>
>
> --
> Best Regards,
>
> Wheat

--
Best Regards,

Wheat
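
For readers following the thread, the failure mode described above (clone not carrying over the "user.cephos.spill_out" marker) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the Ceph code or the content of the pull request: ToyStore, clone_buggy, clone_fixed and is_spilled_out are hypothetical names, a plain dict stands in for the object store, and the exact prefix string "user.cephos." is assumed from the marker name quoted in the thread.

```python
# Toy model only: none of these names are Ceph APIs.
SPILL_OUT_XATTR = "user.cephos.spill_out"  # marker name quoted in the thread
INTERNAL_PREFIX = "user.cephos."           # assumed form of the "cephos" prefix


class ToyStore:
    """In-memory stand-in for an object store that keeps xattrs per object."""

    def __init__(self):
        self.xattrs = {}  # object name -> {xattr name: value}

    def set_xattr(self, obj, name, value):
        self.xattrs.setdefault(obj, {})[name] = value

    def clone_buggy(self, src, dst):
        # Copies only the non-"cephos" xattrs, so the spill-out marker is lost.
        self.xattrs[dst] = {
            name: value
            for name, value in self.xattrs.get(src, {}).items()
            if not name.startswith(INTERNAL_PREFIX)
        }

    def clone_fixed(self, src, dst):
        # Copies every xattr, including the "cephos"-prefixed ones.
        self.xattrs[dst] = dict(self.xattrs.get(src, {}))

    def is_spilled_out(self, obj):
        return SPILL_OUT_XATTR in self.xattrs.get(obj, {})


if __name__ == "__main__":
    store = ToyStore()
    store.set_xattr("origin", SPILL_OUT_XATTR, b"1")
    store.set_xattr("origin", "user.example.attr", b"payload")  # made-up attr

    store.clone_buggy("origin", "clone_a")
    store.clone_fixed("origin", "clone_b")

    print(store.is_spilled_out("clone_a"))  # False: marker was dropped
    print(store.is_spilled_out("clone_b"))  # True: marker was copied
```

In this model the buggy clone leaves the source and the copy disagreeing on whether they are spilled out, which is the situation described in the thread. Whether this is also what scrub flagged as xattr inconsistencies is not established here.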