* FAILED assert(peer_missing.count(fromshard))
@ 2015-01-16 16:39 Loic Dachary
From: Loic Dachary @ 2015-01-16 16:39 UTC (permalink / raw)
  To: Samuel Just; +Cc: Ceph Development

Hi Sam,

In the context of http://tracker.ceph.com/issues/10524 (FAILED assert(peer_missing.count(fromshard))), I propose adding some debugging information for when it happens:

https://github.com/ceph/ceph/pull/3389

If what really happens is that a bad peer ends up being added in missing_loc.add_location, that will be useful information. I tried a number of scenarios but could not find the right conditions to reproduce the problem locally. Hopefully this additional information will show me where to go :-)

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre



* Re: FAILED assert(peer_missing.count(fromshard))
@ 2015-01-16 18:10 ` Samuel Just
From: Samuel Just @ 2015-01-16 18:10 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Ceph Development

1) The part where you add the operator<< and change the debug output looks good.
2) The other part looks like it should be an assert?  Or it should
complain to the central log so that it causes the test to fail at
least?

1 and 2 should be separate commits.
-Sam


* Re: FAILED assert(peer_missing.count(fromshard))
@ 2015-01-16 18:16   ` Loic Dachary
From: Loic Dachary @ 2015-01-16 18:16 UTC (permalink / raw)
  To: sjust; +Cc: Ceph Development

On 16/01/2015 19:10, Samuel Just wrote:
> 1) The part where you add the operator<< and change the debug output looks good.
> 2) The other part looks like it should be an assert?  Or it should
> complain to the central log so that it causes the test to fail at
> least?

Yes.

I'd rather have it report to the central log for now instead of asserting. If it asserts, it will be impossible to know whether it is the source of the problem: an assert would only tell us that a bad peer is sometimes among the good peers, not necessarily that this is what triggers the peer_missing assert. If it does not assert and the problem no longer shows up, it will mean that the origin of this specific problem is a bad peer among the good peers. If it does not assert and the problem persists, it will mean that we have two separate issues: a bad peer among the good peers, and the peer_missing assert.

Does that make sense?
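
To make the trade-off concrete, here is a minimal C++ sketch of the "report instead of assert" option. It is not the code from the pull request: Shard, CentralLog and MissingLoc are hypothetical stand-ins (not Ceph's pg_shard_t, clog or missing_loc), and the sketch assumes the check skips the unknown peer rather than adding it, which is what would let the later peer_missing assert stop firing if a bad peer is the cause.

```cpp
#include <cassert>
#include <map>
#include <ostream>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a shard identifier (not Ceph's pg_shard_t).
struct Shard {
  int osd;
  int shard;
  bool operator<(const Shard& o) const {
    return osd != o.osd ? osd < o.osd : shard < o.shard;
  }
};

// Printing helper so a shard can appear directly in debug output.
std::ostream& operator<<(std::ostream& out, const Shard& s) {
  return out << s.osd << "(" << s.shard << ")";
}

// Hypothetical stand-in for the central log: it only records messages.
struct CentralLog {
  std::vector<std::string> errors;
  void error(const std::string& msg) { errors.push_back(msg); }
};

// Toy model of the bookkeeping discussed in the thread.
struct MissingLoc {
  std::set<Shard> known_peers;  // stands in for the keys of peer_missing
  std::map<std::string, std::set<Shard>> locations;
  CentralLog* clog = nullptr;

  // Records `from` as a location for `oid` only if it is a known peer.
  // An unknown peer is reported to the central log and skipped, so the
  // process keeps running instead of aborting on an assert.
  bool add_location(const std::string& oid, const Shard& from) {
    assert(clog != nullptr);  // programming error in this sketch, not a cluster event
    if (known_peers.count(from) == 0) {
      std::ostringstream ss;
      ss << "add_location: " << oid << " offered by unknown peer " << from;
      clog->error(ss.str());
      return false;
    }
    locations[oid].insert(from);
    return true;
  }
};
```

With this shape, a test harness that fails on any central log error would still flag the bad peer, while the run continues far enough to show whether the peer_missing assert fires as well.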

> 1 and 2 should be separate commits.

Ok.

-- 
Loïc Dachary, Artisan Logiciel Libre

