* My favorite question
@ 2011-07-19 11:03 Fyodor Ustinov
  2011-07-19 14:29 ` Sage Weil
  0 siblings, 1 reply; 15+ messages in thread
From: Fyodor Ustinov @ 2011-07-19 11:03 UTC (permalink / raw)
  To: ceph-devel

Hi!


2011-07-19 14:00:39.509391   log 2011-07-19 14:00:34.185718 osd5 
10.5.51.145:6803/19737 1563 : [ERR] 0.f8 scrub stat mismatch, got 
2074/2074 objects, 0/0 clones, 8624991435/8590119115 bytes, 
8422853/8388798 kb.
2011-07-19 14:00:39.509391   log 2011-07-19 14:00:34.185732 osd5 
10.5.51.145:6803/19737 1564 : [ERR] 0.f8 scrub 1 errors
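
As a rough sketch, the pairs in a stat-mismatch message like the one above can be pulled apart and compared with a few lines of Python. The regular expression and the name `parse_stat_mismatch` are assumptions made for illustration. The thread does not say what the first and second number of each pair stand for, so the code only calls them "first" and "second".

```python
import re

# Assumed shape of the message, based only on the log text quoted in this thread.
PATTERN = re.compile(
    r"(?P<pg>\S+) (?:scrub|repair) stat mismatch, got "
    r"(?P<objects_a>\d+)/(?P<objects_b>\d+) objects, "
    r"(?P<clones_a>\d+)/(?P<clones_b>\d+) clones, "
    r"(?P<bytes_a>\d+)/(?P<bytes_b>\d+) bytes, "
    r"(?P<kb_a>\d+)/(?P<kb_b>\d+) kb"
)


def parse_stat_mismatch(text):
    """Return {'pg': ..., field: (first, second, first - second)} or None."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(text.split())
    match = PATTERN.search(flat)
    if match is None:
        return None
    result = {"pg": match.group("pg")}
    for field in ("objects", "clones", "bytes", "kb"):
        first = int(match.group(field + "_a"))
        second = int(match.group(field + "_b"))
        result[field] = (first, second, first - second)
    return result


SAMPLE = """2011-07-19 14:00:39.509391   log 2011-07-19 14:00:34.185718 osd5
10.5.51.145:6803/19737 1563 : [ERR] 0.f8 scrub stat mismatch, got
2074/2074 objects, 0/0 clones, 8624991435/8590119115 bytes,
8422853/8388798 kb."""

info = parse_stat_mismatch(SAMPLE)
for field in ("objects", "clones", "bytes", "kb"):
    print(field, info[field])
```

For this message the object and clone counts agree, the byte values differ by 34872320, and the kb values differ by 34055 (34055 × 1024 = 34872320).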

How should I respond to this message?

WBR,
     Fyodor.



* Re: My favorite question
  2011-07-19 11:03 My favorite question Fyodor Ustinov
@ 2011-07-19 14:29 ` Sage Weil
  2011-07-19 14:32   ` Fyodor Ustinov
  2011-07-22 15:27   ` Fyodor Ustinov
  0 siblings, 2 replies; 15+ messages in thread
From: Sage Weil @ 2011-07-19 14:29 UTC (permalink / raw)
  To: Fyodor Ustinov; +Cc: ceph-devel

On Tue, 19 Jul 2011, Fyodor Ustinov wrote:
> Hi!
> 
> 2011-07-19 14:00:39.509391   log 2011-07-19 14:00:34.185718 osd5
> 10.5.51.145:6803/19737 1563 : [ERR] 0.f8 scrub stat mismatch, got 2074/2074
> objects, 0/0 clones, 8624991435/8590119115 bytes, 8422853/8388798 kb.
> 2011-07-19 14:00:39.509391   log 2011-07-19 14:00:34.185732 osd5
> 10.5.51.145:6803/19737 1564 : [ERR] 0.f8 scrub 1 errors
> 
> How should I respond to this message?

$ ceph pg repair 0.f8

should do the trick!

sage


* Re: My favorite question
  2011-07-19 14:29 ` Sage Weil
@ 2011-07-19 14:32   ` Fyodor Ustinov
  2011-07-22 15:27   ` Fyodor Ustinov
  1 sibling, 0 replies; 15+ messages in thread
From: Fyodor Ustinov @ 2011-07-19 14:32 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On 07/19/2011 05:29 PM, Sage Weil wrote:
> ceph pg repair 0.f8
Thanks!

WBR,
     Fyodor.


* Re: My favorite question
  2011-07-19 14:29 ` Sage Weil
  2011-07-19 14:32   ` Fyodor Ustinov
@ 2011-07-22 15:27   ` Fyodor Ustinov
  2011-07-25 20:54     ` Gregory Farnum
  1 sibling, 1 reply; 15+ messages in thread
From: Fyodor Ustinov @ 2011-07-22 15:27 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On 07/19/2011 05:29 PM, Sage Weil wrote:
> $ ceph pg repair 0.f8
>
> should do the trick!
>
> sage
Hmm... But today I saw it again:

2011-07-22 17:26:51.134596   log 2011-07-22 17:26:51.010146 osd5 
10.5.51.145:6803/19737 2120 : [ERR] 0.f8 scrub stat mismatch, got 
2095/2095 objects, 0/0 clones, 8678199499/8713071819 bytes, 
8474814/8508869 kb.
2011-07-22 17:26:51.134596   log 2011-07-22 17:26:51.010159 osd5 
10.5.51.145:6803/19737 2121 : [ERR] 0.f8 scrub 1 errors

And 'ceph pg repair 0.f8' fixed it again:

2011-07-22 18:18:31.158248   log 2011-07-22 18:18:24.316646 osd5 
10.5.51.145:6803/19737 2122 : [ERR] 0.f8 repair stat mismatch, got 
2095/2095 objects, 0/0 clones, 8678199499/8713071819 bytes, 
8474814/8508869 kb.
2011-07-22 18:18:31.158248   log 2011-07-22 18:18:24.316680 osd5 
10.5.51.145:6803/19737 2123 : [ERR] 0.f8 repair 1 errors, 1 fixed

WBR,
     Fyodor.


* Re: My favorite question
  2011-07-22 15:27   ` Fyodor Ustinov
@ 2011-07-25 20:54     ` Gregory Farnum
  2011-07-25 21:16       ` Fyodor Ustinov
  0 siblings, 1 reply; 15+ messages in thread
From: Gregory Farnum @ 2011-07-25 20:54 UTC (permalink / raw)
  To: Fyodor Ustinov; +Cc: ceph-devel

On Fri, Jul 22, 2011 at 8:27 AM, Fyodor Ustinov <ufm@ufm.su> wrote:
>
> Hmm... But today I saw it again:
>
> 2011-07-22 17:26:51.134596   log 2011-07-22 17:26:51.010146 osd5
> 10.5.51.145:6803/19737 2120 : [ERR] 0.f8 scrub stat mismatch, got 2095/2095
> objects, 0/0 clones, 8678199499/8713071819 bytes, 8474814/8508869 kb.
> 2011-07-22 17:26:51.134596   log 2011-07-22 17:26:51.010159 osd5
> 10.5.51.145:6803/19737 2121 : [ERR] 0.f8 scrub 1 errors
>
> And 'ceph pg repair 0.f8' fixed it again:
>
> 2011-07-22 18:18:31.158248   log 2011-07-22 18:18:24.316646 osd5
> 10.5.51.145:6803/19737 2122 : [ERR] 0.f8 repair stat mismatch, got 2095/2095
> objects, 0/0 clones, 8678199499/8713071819 bytes, 8474814/8508869 kb.
> 2011-07-22 18:18:31.158248   log 2011-07-22 18:18:24.316680 osd5
> 10.5.51.145:6803/19737 2123 : [ERR] 0.f8 repair 1 errors, 1 fixed
>
> WBR,
>    Fyodor.

Hmmm, what are you doing with this PG? Have you been doing any snapshots?
I discussed this issue with Sam and it turns out that there's actually
a bug in the scrubbing code because it doesn't handle cloned data
well. That's probably what's happening here; I've created bug
http://tracker.newdream.net/issues/1338 to deal with it.
-Greg
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: My favorite question
  2011-07-25 20:54     ` Gregory Farnum
@ 2011-07-25 21:16       ` Fyodor Ustinov
  2011-07-25 22:55         ` Gregory Farnum
  0 siblings, 1 reply; 15+ messages in thread
From: Fyodor Ustinov @ 2011-07-25 21:16 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel

On 07/25/2011 11:54 PM, Gregory Farnum wrote:
>
> Hmmm, what are you doing with this PG? Have you been doing any snapshots?
I don't use snapshots in Ceph. I only use it as a simple file system. And I'm not
doing anything "special" or "unnatural" :)

> I discussed this issue with Sam and it turns out that there's actually
> a bug in the scrubbing code because it doesn't handle cloned data
> well. That's probably what's happening here; I've created bug
> http://tracker.newdream.net/issues/1338 to deal with it.
OK, thanks!

WBR,
     Fyodor.



* Re: My favorite question
  2011-07-25 21:16       ` Fyodor Ustinov
@ 2011-07-25 22:55         ` Gregory Farnum
  2011-07-26  0:05           ` Fyodor Ustinov
  0 siblings, 1 reply; 15+ messages in thread
From: Gregory Farnum @ 2011-07-25 22:55 UTC (permalink / raw)
  To: Fyodor Ustinov; +Cc: ceph-devel

On Mon, Jul 25, 2011 at 2:16 PM, Fyodor Ustinov <ufm@ufm.su> wrote:
> On 07/25/2011 11:54 PM, Gregory Farnum wrote:
>>
>> Hmmm, what are you doing with this PG? Have you been doing any snapshots?
>
> I don't use snapshots in Ceph. I only use it as a simple file system. And I'm not
> doing anything "special" or "unnatural" :)
Drat. In that case this isn't the bug we hoped/thought it was.
How old is this filesystem, what version of the code are you running
now, and do you have OSD logging?


* Re: My favorite question
  2011-07-25 22:55         ` Gregory Farnum
@ 2011-07-26  0:05           ` Fyodor Ustinov
  2011-07-26 18:07             ` Gregory Farnum
  0 siblings, 1 reply; 15+ messages in thread
From: Fyodor Ustinov @ 2011-07-26  0:05 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel

On 07/26/2011 01:55 AM, Gregory Farnum wrote:
> On Mon, Jul 25, 2011 at 2:16 PM, Fyodor Ustinov<ufm@ufm.su>  wrote:
>> On 07/25/2011 11:54 PM, Gregory Farnum wrote:
>>> Hmmm, what are you doing with this PG? Have you been doing any snapshots?
>> I don't use snapshots in Ceph. I only use it as a simple file system. And I'm not
>> doing anything "special" or "unnatural" :)
> Drat. In that case this isn't the bug we hoped/thought it was.
> How old is this filesystem, what version of the code are you running
> now, and do you have OSD logging?

I'm using the latest official version available for Ubuntu (currently 0.31, looking forward to 0.32 :) from your APT repository.
The FS was created on 2011-06-09 (I do not remember the version number).

I do not have a "full" log.


osd.5.log.7.gz:2011-07-18 14:00:25.520178 7f3b30884700 log [ERR] : 0.f8 scrub stat mismatch, got 2067/2067 objects, 0/0 clones, 8595631307/8560758987 bytes, 8394181/8360126 kb.
osd.5.log.7.gz:2011-07-18 14:00:25.520229 7f3b30884700 log [ERR] : 0.f8 scrub 1 errors
osd.5.log.6.gz:2011-07-19 14:00:34.185655 7f3b31085700 log [ERR] : 0.f8 scrub stat mismatch, got 2074/2074 objects, 0/0 clones, 8624991435/8590119115 bytes, 8422853/8388798 kb.
osd.5.log.6.gz:2011-07-19 14:00:34.185726 7f3b31085700 log [ERR] : 0.f8 scrub 1 errors

mon.0.log.6.gz:2011-07-19 17:26:38.478032 7fb3877d4700 mon.0@0(leader) e1 handle_command mon_command(pg repair 0.f8 v 0) v1

osd.5.log.6.gz:2011-07-19 17:26:38.588817 7f3b30884700 log [ERR] : 0.f8 repair stat mismatch, got 2074/2074 objects, 0/0 clones, 8624991435/8590119115 bytes, 8422853/8388798 kb.
osd.5.log.6.gz:2011-07-19 17:26:38.588900 7f3b30884700 log [ERR] : 0.f8 repair 1 errors, 1 fixed

osd.5.log.5.gz:2011-07-20 17:26:44.616686 7f3b30884700 log [ERR] : 0.f8 scrub stat mismatch, got 2082/2082 objects, 0/0 clones, 8623673547/8658545867 bytes, 8421566/8455621 kb.
osd.5.log.5.gz:2011-07-20 17:26:44.616740 7f3b30884700 log [ERR] : 0.f8 scrub 1 errors
osd.5.log.5.gz:2011-07-20 17:26:44.616686 7f3b30884700 log [ERR] : 0.f8 scrub stat mismatch, got 2082/2082 objects, 0/0 clones, 8623673547/8658545867 bytes, 8421566/8455621 kb.
osd.5.log.5.gz:2011-07-20 17:26:44.616740 7f3b30884700 log [ERR] : 0.f8 scrub 1 errors
osd.5.log.3.gz:2011-07-22 17:26:51.010097 7f3b30884700 log [ERR] : 0.f8 scrub stat mismatch, got 2095/2095 objects, 0/0 clones, 8678199499/8713071819 bytes, 8474814/8508869 kb.
osd.5.log.3.gz:2011-07-22 17:26:51.010153 7f3b30884700 log [ERR] : 0.f8 scrub 1 errors

mon.0.log.3.gz:2011-07-22 18:18:24.207401 7fb3877d4700 mon.0@0(leader) e1 handle_command mon_command(pg repair 0.f8 v 0) v1

osd.5.log.3.gz:2011-07-22 18:18:24.316597 7f3b30884700 log [ERR] : 0.f8 repair stat mismatch, got 2095/2095 objects, 0/0 clones, 8678199499/8713071819 bytes, 8474814/8508869 kb.
osd.5.log.3.gz:2011-07-22 18:18:24.316673 7f3b30884700 log [ERR] : 0.f8 repair 1 errors, 1 fixed

mon.0.log.3.gz:2011-07-22 18:26:00.897319 7fb3877d4700 mon.0@0(leader) e1 handle_command mon_command(pg repair 0.f8 v 0) v1

osd.5.log.3.gz:2011-07-22 18:26:00.997632 7f3b30884700 log [INF] : 0.f8 repair ok, 0 fixed

osd.5.log.2.gz:2011-07-23 08:25:44.516668 7f3b31085700 osd5 705 pg[0.f8( v 705'2782 (705'2772,705'2782] n=2099 ec=2 les/c 705/705 704/704/700) [5,4] r=0 mlcod 705'2772 !hml active+clean]  sending commit on repgather(0x7f3b1d12fe80 applied 705'2774 rep_tid=58199 wfack= wfdisk= op=osd_op(client4814.0:525313 1000000b013.00000007 [write 0~2203648] 0.f8 snapc 1=[]) v2) 0x7f3b1c7cdd30
osd.5.log.2.gz:2011-07-23 08:25:44.516939 7f3b30884700 osd5 705 pg[0.f8( v 705'2782 (705'2772,705'2782] n=2099 ec=2 les/c 705/705 704/704/700) [5,4] r=0 mlcod 705'2773 !hml active+clean]  sending commit on repgather(0x7f3b0d5d9270 applied 705'2775 rep_tid=58200 wfack= wfdisk= op=osd_op(client4814.0:525314 1000000b013.00000007 [write 2203648~6144] 0.f8 snapc 1=[]) v2) 0x7f3b0d5d94b0
osd.5.log.2.gz:2011-07-23 08:25:44.517044 7f3b31085700 osd5 705 pg[0.f8( v 705'2782 (705'2772,705'2782] n=2099 ec=2 les/c 705/705 704/704/700) [5,4] r=0 mlcod 705'2774 !hml active+clean]  sending commit on repgather(0x7f3b1c071ed0 applied 705'2776 rep_tid=58201 wfack= wfdisk= op=osd_op(client4814.0:525315 1000000b013.00000007 [write 2209792~288768] 0.f8 snapc 1=[]) v2) 0x7f3b1c351c10
osd.5.log.2.gz:2011-07-23 08:25:44.517144 7f3b31085700 osd5 705 pg[0.f8( v 705'2782 (705'2772,705'2782] n=2099 ec=2 les/c 705/705 704/704/700) [5,4] r=0 mlcod 705'2775 !hml active+clean]  sending commit on repgather(0x7f3b0ce8db70 applied 705'2777 rep_tid=58202 wfack= wfdisk= op=osd_op(client4814.0:525316 1000000b013.00000007 [write 2498560~8192] 0.f8 snapc 1=[]) v2) 0x7f3b0d5d9840
osd.5.log.2.gz:2011-07-23 08:25:44.517596 7f3b30884700 osd5 705 pg[0.f8( v 705'2782 (705'2772,705'2782] n=2099 ec=2 les/c 705/705 704/704/700) [5,4] r=0 mlcod 705'2776 !hml active+clean]  sending commit on repgather(0x7f3b1ceab010 applied 705'2778 rep_tid=58203 wfack= wfdisk= op=osd_op(client4814.0:525317 1000000b013.00000007 [write 2506752~8192] 0.f8 snapc 1=[]) v2) 0x7f3b1c351980
osd.5.log.2.gz:2011-07-23 08:25:44.517666 7f3b31085700 osd5 705 pg[0.f8( v 705'2782 (705'2772,705'2782] n=2099 ec=2 les/c 705/705 704/704/700) [5,4] r=0 mlcod 705'2777 !hml active+clean]  sending commit on repgather(0x7f3b0d5dac70 applied 705'2779 rep_tid=58204 wfack= wfdisk= op=osd_op(client4814.0:525318 1000000b013.00000007 [write 2514944~8192] 0.f8 snapc 1=[]) v2) 0x7f3b0ceaeb50
osd.5.log.2.gz:2011-07-23 08:25:44.517731 7f3b30884700 osd5 705 pg[0.f8( v 705'2782 (705'2772,705'2782] n=2099 ec=2 les/c 705/705 704/704/700) [5,4] r=0 mlcod 705'2778 !hml active+clean]  sending commit on repgather(0x7f3b1cc216a0 applied 705'2780 rep_tid=58205 wfack= wfdisk= op=osd_op(client4814.0:525319 1000000b013.00000007 [write 2523136~16384] 0.f8 snapc 1=[]) v2) 0x7f3b1c79dd50
osd.5.log.2.gz:2011-07-23 08:25:44.517794 7f3b31085700 osd5 705 pg[0.f8( v 705'2782 (705'2772,705'2782] n=2099 ec=2 les/c 705/705 704/704/700) [5,4] r=0 mlcod 705'2779 !hml active+clean]  sending commit on repgather(0x7f3b0d5e2370 applied 705'2781 rep_tid=58206 wfack= wfdisk= op=osd_op(client4814.0:525320 1000000b013.00000007 [write 2539520~16384] 0.f8 snapc 1=[]) v2) 0x7f3b0ce917f0
osd.5.log.2.gz:2011-07-23 08:25:44.517887 7f3b30884700 osd5 705 pg[0.f8( v 705'2782 (705'2772,705'2782] n=2099 ec=2 les/c 705/705 704/704/700) [5,4] r=0 mlcod 705'2780 !hml active+clean]  sending commit on repgather(0x7f3b1dcdb3f0 applied 705'2782 rep_tid=58207 wfack= wfdisk= op=osd_op(client4814.0:525321 1000000b013.00000007 [write 2555904~16384] 0.f8 snapc 1=[]) v2) 0x7f3b1c356ad0
osd.5.log.2.gz:2011-07-23 08:25:44.560037 7f3b30884700 osd5 705 pg[0.f8( v 705'2783 (705'2781,705'2783] n=2099 ec=2 les/c 705/705 704/704/700) [5,4] r=0 mlcod 705'2781 !hml active+clean]  sending commit on repgather(0x7f3b0d5d9d90 applied 705'2783 rep_tid=58208 wfack= wfdisk= op=osd_op(client4814.0:525322 1000000b013.00000007 [write 2572288~1622016] 0.f8 snapc 1=[]) v2) 0x7f3b0ceb2050
osd.5.log.2.gz:2011-07-23 18:26:07.421820 7f3b31085700 log [INF] : 0.f8 scrub ok
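
A grep-style excerpt like the one above can be reduced to a short per-day timeline. The following is only a sketch: the line layout is inferred from the excerpt itself, and the names `LINE`, `BYTES` and `timeline` are made up for illustration. Lines that do not look like a scrub or repair result (monitor commands, per-op debug output) are skipped.

```python
import re

# Layout assumed from the excerpt: "<file>:<date> <time> <thread> log [LEVEL] : <pg> <kind> <rest>"
LINE = re.compile(
    r"^[^:]+:(?P<date>\d{4}-\d{2}-\d{2}) \S+ \S+ log \[(?P<level>[A-Z]+)\] : "
    r"(?P<pg>\S+) (?P<kind>scrub|repair) (?P<rest>.*)$"
)
BYTES = re.compile(r"(\d+)/(\d+) bytes")


def timeline(lines):
    """Return (date, kind, level, byte_delta) tuples; byte_delta is None if absent."""
    events = []
    for line in lines:
        match = LINE.match(line.strip())
        if match is None:
            continue  # not a scrub/repair result line
        found = BYTES.search(match.group("rest"))
        delta = int(found.group(1)) - int(found.group(2)) if found else None
        events.append((match.group("date"), match.group("kind"), match.group("level"), delta))
    return events


# A few lines copied from the excerpt above.
SAMPLE = [
    "osd.5.log.7.gz:2011-07-18 14:00:25.520178 7f3b30884700 log [ERR] : 0.f8 scrub stat mismatch, "
    "got 2067/2067 objects, 0/0 clones, 8595631307/8560758987 bytes, 8394181/8360126 kb.",
    "mon.0.log.6.gz:2011-07-19 17:26:38.478032 7fb3877d4700 mon.0@0(leader) e1 handle_command "
    "mon_command(pg repair 0.f8 v 0) v1",
    "osd.5.log.5.gz:2011-07-20 17:26:44.616686 7f3b30884700 log [ERR] : 0.f8 scrub stat mismatch, "
    "got 2082/2082 objects, 0/0 clones, 8623673547/8658545867 bytes, 8421566/8455621 kb.",
    "osd.5.log.3.gz:2011-07-22 18:26:00.997632 7f3b30884700 log [INF] : 0.f8 repair ok, 0 fixed",
]

events = timeline(SAMPLE)
for event in events:
    print(event)
```

Run over the whole excerpt, the byte difference (first value minus second) is +34872320 on 2011-07-18 and 2011-07-19, and -34872320 on 2011-07-20 and 2011-07-22.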


WBR,
     Fyodor.




* Re: My favorite question
  2011-07-26  0:05           ` Fyodor Ustinov
@ 2011-07-26 18:07             ` Gregory Farnum
  2011-07-26 18:37               ` Fyodor Ustinov
  0 siblings, 1 reply; 15+ messages in thread
From: Gregory Farnum @ 2011-07-26 18:07 UTC (permalink / raw)
  To: Fyodor Ustinov; +Cc: ceph-devel

The log is giving me some very strange information. Can you attach all
the logs that you have to tracker issue #1340?
(http://tracker.newdream.net/issues/1340)
Thanks!
-Greg


* Re: My favorite question
  2011-07-26 18:07             ` Gregory Farnum
@ 2011-07-26 18:37               ` Fyodor Ustinov
  2011-07-26 18:40                 ` Gregory Farnum
  0 siblings, 1 reply; 15+ messages in thread
From: Fyodor Ustinov @ 2011-07-26 18:37 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel

On 07/26/2011 09:07 PM, Gregory Farnum wrote:
> The log is giving me some very strange information. Can you attach all
> the logs that you have to tracker issue #1340?
> (http://tracker.newdream.net/issues/1340)
> Thanks!
> -Greg
>
"All" - all available? Or only osd5?

WBR,
     Fyodor.



* Re: My favorite question
  2011-07-26 18:37               ` Fyodor Ustinov
@ 2011-07-26 18:40                 ` Gregory Farnum
  2011-07-26 19:19                   ` Fyodor Ustinov
  0 siblings, 1 reply; 15+ messages in thread
From: Gregory Farnum @ 2011-07-26 18:40 UTC (permalink / raw)
  To: Fyodor Ustinov; +Cc: ceph-devel

On Tue, Jul 26, 2011 at 11:37 AM, Fyodor Ustinov <ufm@ufm.su> wrote:
> On 07/26/2011 09:07 PM, Gregory Farnum wrote:
>>
>> The log is giving me some very strange information. Can you attach all
>> the logs that you have to tracker issue #1340?
>> (http://tracker.newdream.net/issues/1340)
>> Thanks!
>> -Greg
>>
> "All" - all available? Or only osd5?
osd5 for now. Maybe the other daemon logs later. :)


* Re: My favorite question
  2011-07-26 18:40                 ` Gregory Farnum
@ 2011-07-26 19:19                   ` Fyodor Ustinov
  2011-07-26 22:24                     ` Gregory Farnum
  0 siblings, 1 reply; 15+ messages in thread
From: Fyodor Ustinov @ 2011-07-26 19:19 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel

On 07/26/2011 09:40 PM, Gregory Farnum wrote:
> On Tue, Jul 26, 2011 at 11:37 AM, Fyodor Ustinov<ufm@ufm.su>  wrote:
>> On 07/26/2011 09:07 PM, Gregory Farnum wrote:
>>> The log is giving me some very strange information. Can you attach all
>>> the logs that you have to tracker issue #1340?
>>> (http://tracker.newdream.net/issues/1340)
>>> Thanks!
>>> -Greg
>>>
>> "All" - all available? Or only osd5?
> osd5 for now. Maybe the other daemon logs later. :)

Done!

WBR,
     Fyodor.



* Re: My favorite question
  2011-07-26 19:19                   ` Fyodor Ustinov
@ 2011-07-26 22:24                     ` Gregory Farnum
  2011-07-27 10:05                       ` Fyodor Ustinov
  0 siblings, 1 reply; 15+ messages in thread
From: Gregory Farnum @ 2011-07-26 22:24 UTC (permalink / raw)
  To: Fyodor Ustinov; +Cc: ceph-devel

Hmm, I'm afraid I can't see anything useful in these logs right now --
just confusion!

The only thing I can think of that might provide useful information
now is to turn off osd5, update its config file to have heavy
debugging information (debug osd = 20, at a minimum), and then start
the OSD back up again. Then we can see if pg 0.256 is still having any
issues (it is in the logs you sent me).
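
One way to script the config change is sketched below with Python's standard `configparser`. Only the setting `debug osd = 20` comes from this message. The section name `osd.5`, the sample `host` entry and the function name `with_osd_debug` are assumptions for illustration, and the real config file may be laid out differently.

```python
import configparser
import io


def with_osd_debug(conf_text, section="osd.5", level=20):
    """Return conf_text with 'debug osd = <level>' set in the given section."""
    # Use '=' as the only delimiter and disable interpolation, so values that
    # contain ':' or '%' pass through unchanged.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    if not parser.has_section(section):
        parser.add_section(section)  # section name is an assumption, see above
    parser.set(section, "debug osd", str(level))
    out = io.StringIO()
    parser.write(out)  # note: comments in the input are not preserved
    return out.getvalue()


# Hypothetical starting config; the host name is made up.
EXAMPLE = "[osd.5]\nhost = node5\n"
print(with_osd_debug(EXAMPLE))
```

Because rewriting the file this way drops comments, editing the one line by hand may be the safer choice on a real cluster. The OSD still has to be stopped before the change and started again afterwards.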

If not I'll have to mark this as "Can't reproduce" and hope we see it
again from somebody with higher log levels on. :(
-Greg


* Re: My favorite question
  2011-07-26 22:24                     ` Gregory Farnum
@ 2011-07-27 10:05                       ` Fyodor Ustinov
  2011-07-27 15:57                         ` Gregory Farnum
  0 siblings, 1 reply; 15+ messages in thread
From: Fyodor Ustinov @ 2011-07-27 10:05 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel

On 07/27/2011 01:24 AM, Gregory Farnum wrote:
> Hmm, I'm afraid I can't see anything useful in these logs right now --
> just confusion!
>
> The only thing I can think of that might provide useful information
> now is to turn off osd5, update its config file to have heavy
> debugging information (debug osd = 20, at a minimum), and then start
> the OSD back up again. Then we can see if pg 0.256 is still having any
> issues (it is in the logs you sent me).
>
> If not I'll have to mark this as "Can't reproduce" and hope we see it
> again from somebody with higher log levels on. :(
> -Greg

I started a manual scrub on osd5 and got the following message:

2011-07-27 11:38:30.360812   log 2011-07-27 11:38:28.540559 osd5 
10.5.51.145:6803/19737 2842 : [ERR] 0.256 scrub stat mismatch, got 
2196/2196 objects, 0/0 clones, 9046200695/9046200817 bytes, 
8834197/8834197 kb.
2011-07-27 11:38:30.360812   log 2011-07-27 11:38:28.540571 osd5 
10.5.51.145:6803/19737 2843 : [ERR] 0.256 scrub 1 errors

"Hmmm" I said to myself.

stop osd5
fsck -fy /dev/sdb1

And got:

Inode 230558734, i_blocks is 8, should be 16.  Fix? yes
Inode 230559279, i_blocks is 8208, should be 8200.  Fix? yes
Inode 230559410, i_blocks is 40, should be 32.  Fix? yes
Inode 230559527, i_blocks is 24, should be 32.  Fix? yes
Inode 231342174, i_blocks is 40, should be 32.  Fix? yes
Inode 231342209, i_blocks is 24, should be 32.  Fix? yes
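
The size of each correction is easy to tabulate. This is only a sketch: the regular expression matches the fsck lines exactly as they appear above, and the name `i_blocks_fixes` is made up for illustration.

```python
import re

# Matches lines of the form shown above.
FIX = re.compile(r"Inode (\d+), i_blocks is (\d+), should be (\d+)\.")


def i_blocks_fixes(output):
    """Return (inode, old, new, new - old) for every i_blocks correction."""
    return [
        (int(inode), int(old), int(new), int(new) - int(old))
        for inode, old, new in FIX.findall(output)
    ]


SAMPLE = """Inode 230558734, i_blocks is 8, should be 16.  Fix? yes
Inode 230559279, i_blocks is 8208, should be 8200.  Fix? yes
Inode 230559410, i_blocks is 40, should be 32.  Fix? yes
Inode 230559527, i_blocks is 24, should be 32.  Fix? yes
Inode 231342174, i_blocks is 40, should be 32.  Fix? yes
Inode 231342209, i_blocks is 24, should be 32.  Fix? yes"""

fixes = i_blocks_fixes(SAMPLE)
for inode, old, new, delta in fixes:
    print(inode, old, new, delta)
```

Every correction here is off by exactly 8 in one direction or the other. If i_blocks is counted in 512-byte units (an assumption, not something stated in this thread), that is 4096 bytes per inode.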

Oops. A familiar message. :(

I logged in to a different node and got the same result.

Greg, I think there is no point in trying to catch any errors. Kernel 2.6.39 has a
broken ext4. I will try 3.0.0 (or 3.1.0) and report the result.

WBR,
     Fyodor.


* Re: My favorite question
  2011-07-27 10:05                       ` Fyodor Ustinov
@ 2011-07-27 15:57                         ` Gregory Farnum
  0 siblings, 0 replies; 15+ messages in thread
From: Gregory Farnum @ 2011-07-27 15:57 UTC (permalink / raw)
  To: Fyodor Ustinov; +Cc: ceph-devel

On Wed, Jul 27, 2011 at 3:05 AM, Fyodor Ustinov <ufm@ufm.su> wrote:
> Greg, I think there is no point in trying to catch any errors. Kernel 2.6.39 has a
> broken ext4. I will try 3.0.0 (or 3.1.0) and report the result.
Ah, I think that would explain the errors. We'll have to get better at
identifying such things!
-Greg
