From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:47960) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1btWOi-0006UP-Bm for qemu-devel@nongnu.org; Mon, 10 Oct 2016 04:57:25 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1btWOh-0004Al-A0 for qemu-devel@nongnu.org; Mon, 10 Oct 2016 04:57:24 -0400 Date: Mon, 10 Oct 2016 10:57:13 +0200 From: Kevin Wolf Message-ID: <20161010085713.GA6775@noname.redhat.com> References: <1475272849-19990-1-git-send-email-jsnow@redhat.com> <1475272849-19990-11-git-send-email-jsnow@redhat.com> <1cf8db4a-8527-57ce-93be-d693064344bd@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1cf8db4a-8527-57ce-93be-d693064344bd@redhat.com> Subject: Re: [Qemu-devel] [PATCH v2 10/11] blockjob: refactor backup_start as backup_job_create List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: John Snow Cc: qemu-block@nongnu.org, vsementsov@virtuozzo.com, famz@redhat.com, jcody@redhat.com, qemu-devel@nongnu.org, stefanha@redhat.com Am 07.10.2016 um 20:39 hat John Snow geschrieben: > On 09/30/2016 06:00 PM, John Snow wrote: > >Refactor backup_start as backup_job_create, which only creates the job, > >but does not automatically start it. The old interface, 'backup_start', > >is not kept in favor of limiting the number of nearly-identical iterfaces > >that would have to be edited to keep up with QAPI changes in the future. > > > >Callers that wish to synchronously start the backup_block_job can > >instead just call block_job_start immediately after calling > >backup_job_create. > > > >Transactions are updated to use the new interface, calling block_job_start > >only during the .commit phase, which helps prevent race conditions where > >jobs may finish before we even finish building the transaction. This may > >happen, for instance, during empty block backup jobs. > > > > Sadly for me, I realized this patch has a potential problem. When we > were adding the bitmap operations, it became clear that the > atomicity point was during .prepare, not .commit. > > e.g. the bitmap is cleared or created during prepare, and backup_run > installs its Write Notifier at that point in time, too. Strictly speaking that's wrong then. The write notifier doesn't really hurt because it is never triggered between prepare and commit (we're holding the lock) and it can just be removed again. Clearing the bitmap is a bug because the caller could expect that the bitmap is in its original state if the transaction fails. I doubt this is a problem in practice, but we should fix it anyway. By the way, why did we allow to add a 'bitmap' option for DriveBackup without adding it to BlockdevBackup at the same time? > By changing BlockJobs to only run on commit, we've severed the > atomicity point such that some actions will take effect during > prepare, and others at commit. > > I still think it's the correct thing to do to delay the BlockJobs > until the commit phase, so I will start auditing the code to see how > hard it is to shift the atomicity point to commit instead. If it's > possible to do that, I think from the POV of the managing > application, having the atomicity point be > > Feel free to chime in with suggestions and counterpoints until then. I agree that jobs have to be started only at commit. There may be other things that are currently happening in prepare that really should be moved as well, but unless moving one thing but not the other doesn't break anything that was working, we can fix one thing at a time. Kevin