* full osdmaps in mon txns
@ 2014-12-23 21:10 Sage Weil
  2015-01-01 15:50 ` Joao Eduardo Luis
       [not found] ` <alpine.DEB.2.00.1412231306170.28630-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
  0 siblings, 2 replies; 4+ messages in thread
From: Sage Weil @ 2014-12-23 21:10 UTC (permalink / raw)
  To: joao, gfarnum; +Cc: ceph-devel

This fun issue came up again in the form of 10422:

	http://tracker.ceph.com/issues/10422

I think we have 3 main options:

1. Ask users to do a mon scrub prior to upgrade to 
ensure it is safe.  If a mon is out of sync, manually kick it out, blow it 
away, and resync.

2. Do a one-time broadcast of the full osdmap across mons to ensure they 
are consistent after upgrade.  Bleh.

3. Include full encoded OSDMap in txns on updates going forward.

I like 3 because it solves this and all related problems going forward.  
The local encoding of full osdmaps has proven to be a huge headache.  
And, the patch to do it is remarkably simple

	https://github.com/ceph/ceph/pull/3247/files

and dovetails well with the new CRC.
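
To make the difference between local re-encoding and option 3 concrete, here is a toy Python model. It is only a sketch under loud assumptions: none of this is Ceph code, JSON stands in for the real binary OSDMap encoding, zlib.crc32 stands in for whatever checksum the mons actually use, and every function name (encode_v1, encode_v2, apply_incremental, leader_txn, peon_commit) is made up for illustration.

```python
import json
import zlib


def encode_v1(osdmap: dict) -> bytes:
    # Hypothetical "old release" encoder: fields in insertion order.
    return json.dumps(osdmap).encode()


def encode_v2(osdmap: dict) -> bytes:
    # Hypothetical "new release" encoder: same content, sorted keys.
    return json.dumps(osdmap, sort_keys=True).encode()


def apply_incremental(osdmap: dict, inc: dict) -> dict:
    # Apply an incremental update and bump the epoch.
    new = dict(osdmap)
    new.update(inc)
    new["epoch"] = osdmap["epoch"] + 1
    return new


def leader_txn(osdmap: dict, inc: dict, encode) -> dict:
    # Option 3: the leader encodes the full map once and ships the bytes
    # (plus a checksum) in the transaction alongside the incremental.
    blob = encode(apply_incremental(osdmap, inc))
    return {"inc": inc, "full": blob, "crc": zlib.crc32(blob)}


def peon_commit(txn: dict) -> bytes:
    # A peon stores the leader's bytes verbatim instead of re-encoding.
    blob = txn["full"]
    if zlib.crc32(blob) != txn["crc"]:
        raise ValueError("full map crc mismatch")
    return blob


base = {"pool_count": 3, "epoch": 10, "max_osd": 208}
inc = {"max_osd": 210}

# Local re-encoding: two mons on different releases apply the same
# incremental but end up storing different bytes for the same epoch.
full = apply_incremental(base, inc)
old_mon_bytes = encode_v1(full)
new_mon_bytes = encode_v2(full)
print("local re-encode identical:", old_mon_bytes == new_mon_bytes)

# Option 3: every mon stores exactly what the leader encoded.
txn = leader_txn(base, inc, encode_v2)
stored = [peon_commit(txn) for _ in range(3)]
print("shipped full map identical:", all(s == txn["full"] for s in stored))
```

In this model the stored full maps can only diverge when each mon encodes for itself; once the encoded bytes travel in the transaction, the encoder version on the receiving mon no longer matters.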

What do you think?
sage


* Re: full osdmaps in mon txns
  2014-12-23 21:10 full osdmaps in mon txns Sage Weil
@ 2015-01-01 15:50 ` Joao Eduardo Luis
       [not found] ` <alpine.DEB.2.00.1412231306170.28630-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
  1 sibling, 0 replies; 4+ messages in thread
From: Joao Eduardo Luis @ 2015-01-01 15:50 UTC (permalink / raw)
  To: Sage Weil, gfarnum; +Cc: ceph-devel

On 12/23/2014 09:10 PM, Sage Weil wrote:
> This fun issue came up again in the form of 10422:
>
> 	http://tracker.ceph.com/issues/10422
>
> I think we have 3 main options:
>
> 1. Ask users to do a mon scrub prior to upgrade to
> ensure it is safe.  If a mon is out of sync, manually kick it out, blow it
> away, and resync.
>
> 2. Do a one-time broadcast of the full osdmap across mons to ensure they
> are consistent after upgrade.  Bleh.
>
> 3. Include full encoded OSDMap in txns on updates going forward.
>
> I like 3 because it solves this and all related problems going forward.
> The local encoding of full osdmaps has proven to be a huge headache.
> And, the patch to do it is remarkably simple
>
> 	https://github.com/ceph/ceph/pull/3247/files
>
> and dovetails well with the new CRC.

I prefer 3 as well.  Below is my reply on the pull request, which I 
wrote before addressing this email, and I shall leave it here for posterity!

(Also, I think the approach in the pull request is correct)

As far as I can tell, the idea of relying solely on incrementals to 
build full osdmaps locally goes as far back as a5e2dcb. This leads me to 
believe that, while the idea may have seemed good at the time, it may not 
have been based on a real issue.

Anyway, relaying a few MB's worth of osdmap (if it gets to that) over 
the wire doesn't concern me particularly -- the one thing that may be 
annoying is writing them to leveldb.

I fear that writing a big enough map to leveldb may cause a hang; 
while we do now have the async mechanism to handle this, we may still 
end up waiting for a big transaction to be applied to leveldb before 
accepting the value (e.g., in Paxos::handle_begin() we wait for the 
value to be applied to the store before we send out 
MMonPaxos::OP_ACCEPT). Then again, this should be easy to work around 
by adjusting timeouts if we ever hit it.


   -Joao

>
> What do you think?
> sage
>


-- 
Joao Eduardo Luis
Software Engineer | http://ceph.com


* Re: full osdmaps in mon txns
       [not found] ` <alpine.DEB.2.00.1412231306170.28630-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
@ 2015-01-05  9:12   ` Dan van der Ster
  2015-01-06  8:49     ` Dan van der Ster
  0 siblings, 1 reply; 4+ messages in thread
From: Dan van der Ster @ 2015-01-05  9:12 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel, gfarnum, ceph-users

Hi Sage,

On Tue, Dec 23, 2014 at 10:10 PM, Sage Weil <sweil@redhat.com> wrote:
>
> This fun issue came up again in the form of 10422:
>
>         http://tracker.ceph.com/issues/10422
>
> I think we have 3 main options:
>
> 1. Ask users to do a mon scrub prior to upgrade to
> ensure it is safe.  If a mon is out of sync, manually kick it out, blow it
> away, and resync.
>

I just tried ceph scrub on a cluster we already upgraded from dumpling
to firefly:

2015-01-05 09:44:32.811194 mon.0 137.138.13.136:6789/0 622509 : [ERR]
scrub mismatch
2015-01-05 09:44:32.811289 mon.0 137.138.13.136:6789/0 622510 : [ERR]
mon.0 ScrubResult(keys
{auth=90,logm=1163,mdsmap=3,monmap=5,osdmap=1541,pgmap=645,pgmap_meta=6,pgmap_osd=208,pgmap_pg=24704}
crc {auth=2819756818,logm=2322364559,mdsmap=183360962,monmap=2103930516,osdmap=522825856,pgmap=2035077624,pgmap_meta=951037946,pgmap_osd=3980602023,pgmap_pg=502660406})
2015-01-05 09:44:32.811414 mon.0 137.138.13.136:6789/0 622511 : [ERR]
mon.2 ScrubResult(keys
{auth=90,logm=1163,mdsmap=3,monmap=5,osdmap=1541,pgmap=645,pgmap_meta=6,pgmap_osd=208,pgmap_pg=24704}
crc {auth=2819756818,logm=2322364559,mdsmap=183360962,monmap=2103930516,osdmap=3364575095,pgmap=2035077624,pgmap_meta=951037946,pgmap_osd=3980602023,pgmap_pg=502660406})

Is this the same issue as #10422? What would you do to recover?
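
For anyone comparing scrub results like the two above by eye, a small Python sketch can pick out which key prefix disagrees. The helper names (parse_scrub_crcs, mismatched_prefixes) are hypothetical and not part of any Ceph tool; the parsing assumes only the "crc {name=value,...}" layout visible in the log lines quoted here, with the wrapped lines joined back together.

```python
import re


def parse_scrub_crcs(line: str) -> dict:
    """Return {prefix: crc} from the 'crc {...}' part of a ScrubResult line."""
    m = re.search(r"crc\s*\{([^}]*)\}", line)
    if m is None:
        raise ValueError("no crc map found in line")
    pairs = (item.split("=") for item in m.group(1).split(","))
    return {name: int(value) for name, value in pairs}


def mismatched_prefixes(a: str, b: str) -> list:
    """List the prefixes whose crc differs between two ScrubResult lines."""
    ca, cb = parse_scrub_crcs(a), parse_scrub_crcs(b)
    return sorted(k for k in ca.keys() | cb.keys() if ca.get(k) != cb.get(k))


# The two results from the scrub output above, unwrapped onto single lines.
mon0 = (
    "mon.0 ScrubResult(keys {auth=90,logm=1163,mdsmap=3,monmap=5,osdmap=1541,"
    "pgmap=645,pgmap_meta=6,pgmap_osd=208,pgmap_pg=24704} "
    "crc {auth=2819756818,logm=2322364559,mdsmap=183360962,monmap=2103930516,"
    "osdmap=522825856,pgmap=2035077624,pgmap_meta=951037946,"
    "pgmap_osd=3980602023,pgmap_pg=502660406})"
)
mon2 = (
    "mon.2 ScrubResult(keys {auth=90,logm=1163,mdsmap=3,monmap=5,osdmap=1541,"
    "pgmap=645,pgmap_meta=6,pgmap_osd=208,pgmap_pg=24704} "
    "crc {auth=2819756818,logm=2322364559,mdsmap=183360962,monmap=2103930516,"
    "osdmap=3364575095,pgmap=2035077624,pgmap_meta=951037946,"
    "pgmap_osd=3980602023,pgmap_pg=502660406})"
)

print(mismatched_prefixes(mon0, mon2))  # → ['osdmap']
```

Only the osdmap prefix differs between mon.0 and mon.2; the key counts and every other crc agree.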

Cheers, and Happy New Year!

Dan


>
> 2. Do a one-time broadcast of the full osdmap across mons to ensure they
> are consistent after upgrade.  Bleh.
>
> 3. Include full encoded OSDMap in txns on updates going forward.
>
> I like 3 because it solves this and all related problems going forward.
> The local encoding of full osdmaps has proven to be a huge headache.
> And, the patch to do it is remarkably simple
>
>         https://github.com/ceph/ceph/pull/3247/files
>
> and dovetails well with the new CRC.
>
> What do you think?
> sage


* Re: full osdmaps in mon txns
  2015-01-05  9:12   ` Dan van der Ster
@ 2015-01-06  8:49     ` Dan van der Ster
  0 siblings, 0 replies; 4+ messages in thread
From: Dan van der Ster @ 2015-01-06  8:49 UTC (permalink / raw)
  To: Sage Weil; +Cc: joao, gfarnum, ceph-devel, ceph-users

On Mon, Jan 5, 2015 at 10:12 AM, Dan van der Ster
<daniel.vanderster@cern.ch> wrote:
> Hi Sage,
>
> On Tue, Dec 23, 2014 at 10:10 PM, Sage Weil <sweil@redhat.com> wrote:
>>
>> This fun issue came up again in the form of 10422:
>>
>>         http://tracker.ceph.com/issues/10422
>>
>> I think we have 3 main options:
>>
>> 1. Ask users to do a mon scrub prior to upgrade to
>> ensure it is safe.  If a mon is out of sync, manually kick it out, blow it
>> away, and resync.
>>
>
> I just tried ceph scrub on a cluster we already upgraded from dumpling
> to firefly:
>
> 2015-01-05 09:44:32.811194 mon.0 137.138.13.136:6789/0 622509 : [ERR]
> scrub mismatch
> 2015-01-05 09:44:32.811289 mon.0 137.138.13.136:6789/0 622510 : [ERR]
> mon.0 ScrubResult(keys
> {auth=90,logm=1163,mdsmap=3,monmap=5,osdmap=1541,pgmap=645,pgmap_meta=6,pgmap_osd=208,pgmap_pg=24704}
> crc {auth=2819756818,logm=2322364559,mdsmap=183360962,monmap=2103930516,osdmap=522825856,pgmap=2035077624,pgmap_meta=951037946,pgmap_osd=3980602023,pgmap_pg=502660406})
> 2015-01-05 09:44:32.811414 mon.0 137.138.13.136:6789/0 622511 : [ERR]
> mon.2 ScrubResult(keys
> {auth=90,logm=1163,mdsmap=3,monmap=5,osdmap=1541,pgmap=645,pgmap_meta=6,pgmap_osd=208,pgmap_pg=24704}
> crc {auth=2819756818,logm=2322364559,mdsmap=183360962,monmap=2103930516,osdmap=3364575095,pgmap=2035077624,pgmap_meta=951037946,pgmap_osd=3980602023,pgmap_pg=502660406})
>
> Is this the same issue as #10422? What would you do to recover?


For posterity, here is how I fixed this:

# service ceph stop
=== mon.2 ===
Stopping Ceph mon.2 on p05153026953834...kill 30708...done
# mv /var/lib/ceph/mon/mon.2/ /tmp/mon.2.old
# mkdir  /var/lib/ceph/mon/mon.2
# ceph-mon --mkfs -i 2 --keyring /var/lib/ceph/tmp/keyring.mon.2
ceph-mon: set fsid to dd535a7e-4647-4bee-853d-f34112615f81
ceph-mon: created monfs at /var/lib/ceph/mon/mon.2 for mon.2
#

Then I ran service ceph start and let mon.2 resync. Running ceph scrub
afterwards gives:

2015-01-06 09:43:06.431919 mon.0 137.138.13.136:6789/0 63 : [INF]
scrub ok on 0,1,2: ScrubResult(keys
{auth=69,logm=1455,mdsmap=3,monmap=5,osdmap=1005,pgmap=695,pgmap_meta=6,pgmap_osd=208,pgmap_pg=24704}
crc {auth=2072984173,logm=1432725577,mdsmap=183360962,monmap=2103930516,osdmap=731548464,pgmap=2718145611,pgmap_meta=990631329,pgmap_osd=31335898,pgmap_pg=4049607692})


Cheers, Dan
