From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jason Dillaman Subject: Re: Snapshots of consistency groups Date: Wed, 11 Jan 2017 20:36:56 -0500 Message-ID: References: <20160819124813.GB17630@gmail.com> <20160820162713.GA5997@gmail.com> Reply-To: dillaman@redhat.com Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Return-path: Received: from mail-lf0-f45.google.com ([209.85.215.45]:33195 "EHLO mail-lf0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752741AbdALBg7 (ORCPT ); Wed, 11 Jan 2017 20:36:59 -0500 Received: by mail-lf0-f45.google.com with SMTP id k86so3445626lfi.0 for ; Wed, 11 Jan 2017 17:36:58 -0800 (PST) In-Reply-To: Sender: ceph-devel-owner@vger.kernel.org List-ID: To: Victor Denisov Cc: Mykola Golub , ceph-devel Given that specific use-case, yes, you can switch into an out of a snapshot whenever you please using the snap_set API. Of course, the snapshot API methods will need to be expanded to support namespaces. On Wed, Jan 11, 2017 at 8:10 PM, Victor Denisov wrote: > If I understand correctly I can't open required snapshot immediately > if I know only its namespace. > > I need to get the whole list of snapshots first and then find mine > using its namespace. > After that I want to redirect the opened image to my snapshot. > > On Wed, Jan 11, 2017 at 4:18 PM, Jason Dillaman wrote: >> You can specify the snapshot at image open as well. There is no need to refresh. >> >> On Wed, Jan 11, 2017 at 7:14 PM, Victor Denisov wrote: >>> Can I open an image and then redirect it to one of its snapshots? >>> >>> I see that there is a snap_set method. >>> Do I understand correctly that I need to: >>> open the image >>> invoke snap_set method >>> invoke refresh method >>> >>> Thanks in advance. >>> V. >>> >>> On Wed, Jan 4, 2017 at 10:39 AM, Victor Denisov wrote: >>>> I understood. Thanks! >>>> >>>> On Wed, Jan 4, 2017 at 10:34 AM, Jason Dillaman wrote: >>>>> The loop isn't that big of a deal -- but you could eliminate it >>>>> entirely if you just index the in-memory snapshot table via the >>>>> SnapshotNamespace variant instead of just indexing snapshots by name >>>>> (e.g. ImageCtx::snap_ids key switches from a string to a namespace). >>>>> This would be required anyway since you might otherwise have duplicate >>>>> names between namespaces. >>>>> >>>>> On Wed, Jan 4, 2017 at 1:29 PM, Victor Denisov wrote: >>>>>> It looks like next CDM is only next month. Let's try to figure it out in email. >>>>>> >>>>>>> Since you know which images are linked to the group and you know which >>>>>>> snapshots are in the group and which group snapshots are in the image, >>>>>>> you can reconcile any issues using the details in the >>>>>>> GroupSnapshotNamespace -- there shouldn't be any need to depend on the >>>>>>> actual snapshot name (it could technically just be a randomly assigned >>>>>>> UUID). >>>>>> >>>>>> Let's say I have a consistency group and a snapshot of this group: CG >>>>>> and CGSNAP. >>>>>> Images in this snapshot I'll define as: >>>>>> CG.images[0] - image1 >>>>>> CG.images[1] - image2 >>>>>> CG.images[2] - image3 >>>>>> >>>>>> Image snapshots in cg snapshot will be: >>>>>> CG.CGSNAP.snaps[0] - reference to snapshot of image 1 >>>>>> CG.CGSNAP.snaps[1] - reference to snapshot of image 2 >>>>>> >>>>>> Imagine that this snapshot was created, but wasn't finalized. >>>>>> CG.CGSNAP.state == PENDING. >>>>>> CG.CGSNAP.snaps.length == 0; >>>>>> I'll be writing in pseudo code. >>>>>> >>>>>> Now, let's say we want to remove this pending CGSNAP. This is the code >>>>>> how it's currently implemented: >>>>>> >>>>>> for (image: CG.images) { >>>>>> snap_name = image.id + "_" + CG.CGSNAP.id + "_" + CG.id // This name >>>>>> is unique because of uniqueness of the tuple (image.id, CG.CGSNAP.id, >>>>>> CG.id) >>>>>> remove_image_snapshot(snap_name); >>>>>> } >>>>>> remove_cg_snap(CGSNAP); >>>>>> >>>>>> However, if we don't rely on the name then this is how I envision the code: >>>>>> >>>>>> for (image: CG.images) { >>>>>> for (snap: image.snaps) { >>>>>> if (snap.namespace.cg_id == CG.id && snap.namespace.cg_snap_id == >>>>>> CG.CGSNAP.id) { // this is our snapshot >>>>>> remove_image_snapshot(snap.name); >>>>>> } >>>>>> } >>>>>> } >>>>>> remove_cg_snap(CGSNAP); >>>>>> >>>>>> In this solution I don't like the internal loop. >>>>>> What do you think? >>>>>> >>>>>> Thanks, >>>>>> Victor. >>>>>> >>>>>> >>>>>> On Tue, Jan 3, 2017 at 6:56 PM, Jason Dillaman wrote: >>>>>>> After starting the process of creating a group snapshot, you will >>>>>>> already have all the necessary data for the group snapshot namespace >>>>>>> [1] (group pool, group id, and group snapshot id) and the group >>>>>>> snapshot should be persistently recorded to disk as >>>>>>> GROUP_SNAPSHOT_STATE_PENDING. >>>>>>> >>>>>>> Looking at the snapshot create state machine [2], I don't see any >>>>>>> place that a crash (or similar failure) would matter before the actual >>>>>>> image snapshot record is created atomically. You would pass the fully >>>>>>> populated GroupSnapshotNamespace to snap_create, and if the snapshot >>>>>>> is created, it's linked to the group via that namespace and any >>>>>>> failures afterwards don't matter since they are linked -- if the >>>>>>> snapshot fails to be created, it isn't linked to the group but the >>>>>>> snapshot doesn't exist either so there isn't anything to clean up. >>>>>>> >>>>>>> Since you know which images are linked to the group and you know which >>>>>>> snapshots are in the group and which group snapshots are in the image, >>>>>>> you can reconcile any issues using the details in the >>>>>>> GroupSnapshotNamespace -- there shouldn't be any need to depend on the >>>>>>> actual snapshot name (it could technically just be a randomly assigned >>>>>>> UUID). >>>>>>> >>>>>>> Perhaps we could talk about this at a future RBD standup meeting that >>>>>>> you are able to join (or the next CDM). >>>>>>> >>>>>>> [1] https://github.com/ceph/ceph/blob/master/src/cls/rbd/cls_rbd_types.h#L249 >>>>>>> [2] https://github.com/ceph/ceph/blob/master/src/librbd/operation/SnapshotCreateRequest.h#L28 >>>>>>> >>>>>>> On Tue, Jan 3, 2017 at 7:40 PM, Victor Denisov wrote: >>>>>>>> Let's say we start creating a group snapshot. >>>>>>>> We invoke async snap_create method in Operations class. >>>>>>>> When we invoke this method we provide it with the snapshot name. >>>>>>>> >>>>>>>> While we are wating for the response we can be aborted. >>>>>>>> As a result we will be able to find the exact image snapshot using only its name >>>>>>>> as this was the only information we had at the time of running >>>>>>>> snap_create method. >>>>>>>> >>>>>>>> If snap_create was successful we will be able to find the snapshot >>>>>>>> otherwise we will not. >>>>>>>> However if we allow renaming snapshots from GroupSnapshotNamespace >>>>>>>> then we may not find the snapshot even if it >>>>>>>> was created successfully. >>>>>>>> >>>>>>>> On Fri, Dec 16, 2016 at 6:53 AM, Jason Dillaman wrote: >>>>>>>>> Can you give a little background on this specific inconsistent case >>>>>>>>> you are referring to? >>>>>>>>> >>>>>>>>> On Thu, Dec 15, 2016 at 7:05 PM, Victor Denisov wrote: >>>>>>>>>> Yes, but if image's snapshot is renamed then I'm not able to find this >>>>>>>>>> snapshot having only group's snapshot in an inconsistent state for >>>>>>>>>> example. >>>>>>>>>> >>>>>>>>>> V. >>>>>>>>>> >>>>>>>>>> On Thu, Dec 15, 2016 at 7:10 AM, Jason Dillaman wrote: >>>>>>>>>>> I think I might be confused. When creating a group snapshot, we have >>>>>>>>>>> the ConsistencyGroupSnapshot that allows you to store the necessary >>>>>>>>>>> linkage between the image's snapshot and its associated group snapshot >>>>>>>>>>> [1]. Why not just name the image's snapshots to the same name as the >>>>>>>>>>> parent group snapshot name and search the snapshot's metadata to get >>>>>>>>>>> the linkage? >>>>>>>>>>> >>>>>>>>>>> [1] https://github.com/ceph/ceph/blob/master/src/cls/rbd/cls_rbd_types.h#L255 >>>>>>>>>>> >>>>>>>>>>> On Tue, Dec 13, 2016 at 8:03 PM, Victor Denisov wrote: >>>>>>>>>>>> Jason, >>>>>>>>>>>> >>>>>>>>>>>> My current implementation of consistency group snapshot feature names >>>>>>>>>>>> image snapshots like: __ >>>>>>>>>>>> I rely on this fact when I need to remove a consistency group. It's >>>>>>>>>>>> necessary because if some of image snapshots were created, but the >>>>>>>>>>>> whole group snapshot operation failed, >>>>>>>>>>>> then the only way to find those dangling image snapshots is by this name. >>>>>>>>>>>> >>>>>>>>>>>> It means that we should forbid renaming snapshots from >>>>>>>>>>>> ConsistencyGroupSnapshot namespace. >>>>>>>>>>>> >>>>>>>>>>>> Another option is to allocate image snapshot ids during the creation >>>>>>>>>>>> of group snapshot, but this requires a major rewrite of the whole >>>>>>>>>>>> process of snapshot creation for images. >>>>>>>>>>>> >>>>>>>>>>>> What is your opinion on this? >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> V. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Nov 1, 2016 at 8:01 AM, Jason Dillaman wrote: >>>>>>>>>>>>> I chatted with Xing on IRC this morning re: Cinder generic groups. It >>>>>>>>>>>>> sounds like RBD will need to support preserving the image's >>>>>>>>>>>>> consistency group snapshots even if the image is removed from the >>>>>>>>>>>>> group. In the OpenStack case, you won't have to worry about the image >>>>>>>>>>>>> being deleted while it still has associated group snapshots. >>>>>>>>>>>>> >>>>>>>>>>>>> We will also want to support being able to clone child images from a >>>>>>>>>>>>> group snapshot to ensure that we can thin provision new groups volumes >>>>>>>>>>>>> when creating a new group from a group snapshot. This means that the >>>>>>>>>>>>> group snapshots should be able to be protected/unprotected just like >>>>>>>>>>>>> standard user snapshots. >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Oct 31, 2016 at 9:07 AM, Jason Dillaman wrote: >>>>>>>>>>>>>> Looking at the Cinder codebase, I don't see any such restriction that >>>>>>>>>>>>>> would prevent you from removing a volume from a consistency group that >>>>>>>>>>>>>> has associated snapshots. I would double-check on the OpenStack >>>>>>>>>>>>>> development mailing list if this is correct and is the intent. Worst >>>>>>>>>>>>>> case, the RBD driver could raise an exception if there are still >>>>>>>>>>>>>> consistency group snapshots associated to the image. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Oct 28, 2016 at 6:41 PM, Victor Denisov wrote: >>>>>>>>>>>>>>> Another thing that bothers me. When we remove an image from a consistency group. >>>>>>>>>>>>>>> Should we remove all snapshots of this image that were created as part >>>>>>>>>>>>>>> of a consistency group snapshot? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The easiest solution would be to remove all snapshots that are in >>>>>>>>>>>>>>> GroupSnapshotNamespace and reference this consistency group. >>>>>>>>>>>>>>> I looked into cinder docs for this feature: >>>>>>>>>>>>>>> http://docs.openstack.org/admin-guide/blockstorage-consistency-groups.html >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> But it's not clear to me which behavior cinder expects. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> V. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Wed, Oct 26, 2016 at 6:16 AM, Jason Dillaman wrote: >>>>>>>>>>>>>>>> In a perfect world, it would be nice to add a new optional to "rbd >>>>>>>>>>>>>>>> snap ls" to show all snapshots (with a new column to indicate the >>>>>>>>>>>>>>>> associated namespace). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Tue, Oct 25, 2016 at 11:07 PM, Victor Denisov wrote: >>>>>>>>>>>>>>>>> Question. When we print out snapshots of an image, should the group >>>>>>>>>>>>>>>>> snapshots be listed, or should they be marked as special snapshots? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> V. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Mon, Oct 10, 2016 at 3:14 PM, Victor Denisov wrote: >>>>>>>>>>>>>>>>>> Ok. I didn't have any intention to throw exceptions. >>>>>>>>>>>>>>>>>> I was more concerned about whether it's ok to allocate and delete >>>>>>>>>>>>>>>>>> objects or I should use smart pointers. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Mon, Oct 10, 2016 at 7:18 AM, Jason Dillaman wrote: >>>>>>>>>>>>>>>>>>> The only place exceptions are routinely used is within the "::decode" >>>>>>>>>>>>>>>>>>> functions. I would prefer to see the code not throwing new exceptions >>>>>>>>>>>>>>>>>>> on purpose. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Fri, Oct 7, 2016 at 2:26 PM, Victor Denisov wrote: >>>>>>>>>>>>>>>>>>>> Are any exceptions used in librbd code? Should the code be exception safe? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> V. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Fri, Sep 16, 2016 at 10:37 AM, Jason Dillaman wrote: >>>>>>>>>>>>>>>>>>>>> On Thu, Sep 15, 2016 at 7:17 PM, Victor Denisov wrote: >>>>>>>>>>>>>>>>>>>>>> if (struct_v >= 5) { >>>>>>>>>>>>>>>>>>>>>> ::decode(snapshot_namespace, p); >>>>>>>>>>>>>>>>>>>>>> } else { >>>>>>>>>>>>>>>>>>>>>> snapshot_namespace = cls::rbd::UserSnapshotNamespace(); >>>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> then code for ::encode function of cls_rbd_snap would change accordingly: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> instead of >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> boost::apply_visitor(cls::rbd::EncodeSnapshotTypeVisitor(bl), >>>>>>>>>>>>>>>>>>>>>> snapshot_namespace); >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I would do: >>>>>>>>>>>>>>>>>>>>>> ::encode(snapshot_namespace, bl); >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> +1 -- looks good to me >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>> Jason >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>> Jason >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> Jason >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> Jason >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> Jason >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Jason >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Jason >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Jason >>>>> >>>>> >>>>> >>>>> -- >>>>> Jason >> >> >> >> -- >> Jason -- Jason