From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53991) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aJhcc-0000xH-Ow for qemu-devel@nongnu.org; Thu, 14 Jan 2016 08:07:31 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aJhcY-0001Br-DD for qemu-devel@nongnu.org; Thu, 14 Jan 2016 08:07:26 -0500 From: Markus Armbruster References: <1450802786-20893-1-git-send-email-kwolf@redhat.com> <1450802786-20893-7-git-send-email-kwolf@redhat.com> <5679BB6A.6080902@redhat.com> <87twmkt8u9.fsf@blackfin.pond.sub.org> <20160111160533.GF9454@noname.redhat.com> <87si22su2o.fsf@blackfin.pond.sub.org> <20160112173637.GK4841@noname.redhat.com> <87wprdooll.fsf@blackfin.pond.sub.org> <20160113141958.GA4356@noname.redhat.com> Date: Thu, 14 Jan 2016 14:07:12 +0100 In-Reply-To: <20160113141958.GA4356@noname.redhat.com> (Kevin Wolf's message of "Wed, 13 Jan 2016 15:19:58 +0100") Message-ID: <878u3scnsv.fsf@blackfin.pond.sub.org> MIME-Version: 1.0 Content-Type: text/plain Subject: Re: [Qemu-devel] [PATCH 06/10] qemu-img: Prepare for locked images List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Kevin Wolf Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, mreitz@redhat.com Kevin Wolf writes: > Am 13.01.2016 um 09:44 hat Markus Armbruster geschrieben: >> Kevin Wolf writes: >> >> > Am 12.01.2016 um 16:20 hat Markus Armbruster geschrieben: >> >> Kevin Wolf writes: >> >> >> >> > Am 11.01.2016 um 16:49 hat Markus Armbruster geschrieben: >> >> >> Eric Blake writes: >> >> >> >> >> >> > On 12/22/2015 09:46 AM, Kevin Wolf wrote: >> >> >> >> This patch extends qemu-img for working with locked >> >> >> >> images. It prints a >> >> >> >> helpful error message when trying to access a locked image >> >> >> >> read-write, >> >> >> >> and adds a 'qemu-img force-unlock' command as well as a >> >> >> >> 'qemu-img check >> >> >> >> -r all --force' option in order to override a lock left >> >> >> >> behind after a >> >> >> >> qemu crash. >> >> >> >> >> >> >> >> Signed-off-by: Kevin Wolf >> >> >> >> --- >> >> >> >> include/block/block.h | 1 + >> >> >> >> include/qapi/error.h | 1 + >> >> >> >> qapi/common.json | 3 +- >> >> >> >> qemu-img-cmds.hx | 10 ++++-- >> >> >> >> qemu-img.c | 96 >> >> >> >> +++++++++++++++++++++++++++++++++++++++++++-------- >> >> >> >> qemu-img.texi | 20 ++++++++++- >> >> >> >> 6 files changed, 113 insertions(+), 18 deletions(-) >> >> >> >> >> >> >> > >> >> >> >> +++ b/include/qapi/error.h >> >> >> >> @@ -102,6 +102,7 @@ typedef enum ErrorClass { >> >> >> >> ERROR_CLASS_DEVICE_NOT_ACTIVE = QAPI_ERROR_CLASS_DEVICENOTACTIVE, >> >> >> >> ERROR_CLASS_DEVICE_NOT_FOUND = QAPI_ERROR_CLASS_DEVICENOTFOUND, >> >> >> >> ERROR_CLASS_KVM_MISSING_CAP = QAPI_ERROR_CLASS_KVMMISSINGCAP, >> >> >> >> + ERROR_CLASS_IMAGE_FILE_LOCKED = QAPI_ERROR_CLASS_IMAGEFILELOCKED, >> >> >> >> } ErrorClass; >> >> >> > >> >> >> > Wow - a new ErrorClass. It's been a while since we could >> >> >> > justify one of >> >> >> > these, but I think you might have found a case. >> >> >> >> >> >> Spell out the rationale for the new ErrorClass, please. >> >> > >> >> > Action to be taken for this error class: Decide whether the lock is a >> >> > leftover from a previous qemu run that ended in an unclean shutdown. If >> >> > so, retry with overriding the lock. >> >> > >> >> > Currently used by qemu-img when ordered to override a lock. libvirt >> >> > will need to do the same. >> >> >> >> Let's see whether I understand the intended use: >> >> >> >> open image >> >> if open fails with ImageFileLocked: >> >> guess whether the lock is stale >> >> if guessing not stale: >> >> error out >> >> open image with lock override >> >> >> >> Correct? >> > >> > Yes. Where "guess" is more or less "check whether the management tool >> > started qemu with this image, but didn't cleanly shut it down". This can >> > guess wrong if, and only if, some other user used a different algorithm >> > and forced an unlock even though the image didn't belong to them before >> > the crash. >> > >> >> Obvious troublespots: >> >> >> >> 1. If you guess wrong, you destroy the image. No worse than before, so >> >> okay, declare documentation problem. >> >> >> >> 2. TOCTTOU open to open with lock override >> >> [...] >> >> >> >> 3. TOCTTOU within open (hypothetical, haven't read your code) >> >> [...] >> > >> > Yes, these exist in theory. The question is what scenarios you want to >> > protect against and whether improving the mechanism to cover these cases >> > is worth the effort. >> > >> > The answer for what I wanted to protect is a manual action on an image >> > that is already in use. The user isn't quick enough to manually let two >> > processes open the same image at the same time, so I didn't consider >> > that scenario relevant. >> > >> > But assuming that everyone (including the human user) follows the above >> > protocol (force-unlock only what was yours before the crash), at least >> > cases 1 and 2 don't happen anyway. >> >> "Force-unlock only what you locked yourself" is easier to stipulate than >> to adhere to when the tools can't give you a hint on who did the >> locking. This is particularly true when "you" is a human, with human >> imperfect memory. >> >> I understand that this locking can't provide complete protection, and >> merely aims to catch certain common accidents. >> >> However, to avoid a false sense of security, its limitations need to be >> clearly documented. This very much includes the rule "force-unlock only >> what you locked yourself". In my opinion, it should also include the >> raciness. >> >> Sometimes, solving a problem is easier than documenting it. > > Maybe I'll just merge the migration fixes and shelve the rest of the > series until I'm bored enough to implement the "real thing" with an > incompatible feature flag, lock IDs with an autogenerated part and > another part from the user, saved host name and PID, and a qcow2 driver > that refuses to write anything to an image it doesn't hold the lock for > even in corner cases. For now, I've already used more time for this than > I intended (didn't expect all that live migration fun initially...). > > And after all, it's not my VMs that get corrupted. > > If you think that solving the problem "for real" is too easy to take > shortcuts, I'll happily create a BZ and assign it to you if you'd like > me to. I'm not sure what to do with this message. Did I tick you off? I took the trouble to understand what you're trying to do, in particular the limitations. I also explored a possible alternative that might be less limited. I didn't ask you to adopt this alternative. I didn't call your patch a shortcut. I did point out that its limitations need to be documented, and encouraged you to weigh documentation effort against implementation effort. Additionally, I wondered how this affects libvirt, and whether we need to cooperate more closely on locking. If that's review gone too far, then I fail to see why we bother with it.