From: Sage Weil <sage@newdream.net>
To: Alexandre Oliva <oliva@gnu.org>
Cc: sam.just@inktank.com, ceph-devel@vger.kernel.org
Subject: Re: [PATCH] reinstate ceph cluster_snap support
Date: Mon, 27 Oct 2014 14:00:57 -0700 (PDT)
Message-ID: <alpine.DEB.2.00.1410271357160.15740@cobra.newdream.net>
In-Reply-To: <orbnp6uofj.fsf@free.home>

On Tue, 21 Oct 2014, Alexandre Oliva wrote:
> On Aug 27, 2013, Sage Weil <sage@inktank.com> wrote:
> 
> > Finally, eventually we should make this do a checkpoint on the mons too.  
> > We can add the osd snapping back in first, but before this can/should 
> > really be used the mons need to be snapshotted as well.  Probably that's 
> > just adding in a snapshot() method to MonitorStore.h and doing either a 
> > leveldb snap or making a full copy of store.db... I forget what leveldb is 
> > capable of here.
> 
> I suppose it might be a bit too late for Giant, but I finally got 'round
> to implementing this.  I attach the patch that implements it, to be
> applied on top of the updated version of the patch I posted before, also
> attached.
> 
> I have a backport to Firefly too, if there's interest.
> 
> I have tested both methods: btrfs snapshotting of store.db (I've
> manually turned store.db into a btrfs subvolume), and creating a new db
> with all (prefix,key,value) triples.  I'm undecided about inserting
> multiple transaction commits for the latter case; the mon mem use grew
> up a lot as it was, and in a few tests the snapshotting ran twice, but
> in the end a dump of all the data in the database created by btrfs
> snapshotting was identical to that created by explicit copying.  So, the
> former is preferred, since it's so incredibly more efficient.  I also
> considered hardlinking all files in store.db into a separate tree, but I
> didn't like the idea of coding that in C+-, :-) and I figured it might
> not work with other db backends, and maybe even not be guaranteed to
> work with leveldb.  It's probably not worth much more effort.

This looks pretty reasonable!

I think we definitely need to limit the size of the transaction when doing 
the snap.  The attached patch seems to try to do it all in one go, which 
is not going to work for large clusters.  Either re-use an existing 
tunable like the sync chunk size or add a new one?
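
To illustrate the kind of bounded copy meant here, a minimal sketch follows. It is not the attached patch and does not use the real mon store API: the store is modeled as a plain dict, and copy_in_chunks and max_bytes are hypothetical names standing in for whatever tunable ends up being used.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, bytes]  # (prefix, key, value)


def copy_in_chunks(src: Iterable[Triple],
                   dst: Dict[Tuple[str, str], bytes],
                   max_bytes: int) -> int:
    """Copy triples into dst, committing whenever a batch reaches max_bytes.

    dst is a dict standing in for the destination store (an assumption).
    Returns the number of commits performed.
    """
    batch: List[Triple] = []
    batch_bytes = 0
    commits = 0

    def commit() -> None:
        # Apply the pending batch as one "transaction", then reset it.
        nonlocal batch, batch_bytes, commits
        for prefix, key, value in batch:
            dst[(prefix, key)] = value
        batch = []
        batch_bytes = 0
        commits += 1

    for prefix, key, value in src:
        batch.append((prefix, key, value))
        batch_bytes += len(prefix) + len(key) + len(value)
        if batch_bytes >= max_bytes:
            commit()
    if batch:
        # Flush whatever is left after the last full batch.
        commit()
    return commits
```

With a limit like this, memory held by a pending transaction is bounded by roughly max_bytes plus one entry, regardless of how large the store is.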

Thanks!
sage

Thread overview: 15+ messages
2013-08-22  9:10 [PATCH] reinstate ceph cluster_snap support Alexandre Oliva
2013-08-24  0:17 ` Sage Weil
2013-08-24 14:56   ` Alexandre Oliva
2013-08-27 22:21     ` Sage Weil
2013-08-28  0:54       ` Yan, Zheng
2013-08-28  4:34         ` Sage Weil
2013-12-17 12:14       ` Alexandre Oliva
2013-12-17 13:50         ` Alexandre Oliva
2013-12-17 14:22           ` Alexandre Oliva
2013-12-18 19:35             ` Gregory Farnum
2013-12-19  8:22               ` Alexandre Oliva
2014-10-21  2:49       ` Alexandre Oliva
2014-10-27 21:00         ` Sage Weil [this message]
2014-11-03 19:57           ` Alexandre Oliva
2014-11-13 18:02             ` Sage Weil