From: Stefan Hajnoczi <stefanha@gmail.com>
To: John Snow <jsnow@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
"qemu-block@nongnu.org" <qemu-block@nongnu.org>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"qemu-stable@nongnu.org" <qemu-stable@nongnu.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH] block/backup: install notifier during creation
Date: Tue, 10 Sep 2019 10:19:42 +0200
Message-ID: <20190910081942.GA23976@stefanha-x1.localdomain>
In-Reply-To: <154bc276-d782-443f-3db6-38d87992d609@redhat.com>
On Wed, Aug 21, 2019 at 04:01:52PM -0400, John Snow wrote:
>
>
> On 8/21/19 10:41 AM, Vladimir Sementsov-Ogievskiy wrote:
> > 09.08.2019 23:13, John Snow wrote:
> >> Backup jobs may yield prior to installing their handler, because of the
> >> job_co_entry shim which guarantees that a job won't begin work until
> >> we are ready to start an entire transaction.
> >>
> >> Unfortunately, this makes proving correctness about transactional
> >> points-in-time for backup hard to reason about. Make it explicitly clear
> >> by moving the handler registration to creation time, and changing the
> >> write notifier to a no-op until the job is started.
> >>
> >> Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> >> Signed-off-by: John Snow <jsnow@redhat.com>
> >> ---
> >> block/backup.c | 32 +++++++++++++++++++++++---------
> >> include/qemu/job.h | 5 +++++
> >> job.c | 2 +-
> >> 3 files changed, 29 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/block/backup.c b/block/backup.c
> >> index 07d751aea4..4df5b95415 100644
> >> --- a/block/backup.c
> >> +++ b/block/backup.c
> >> @@ -344,6 +344,13 @@ static int coroutine_fn backup_before_write_notify(
> >>      assert(QEMU_IS_ALIGNED(req->offset, BDRV_SECTOR_SIZE));
> >>      assert(QEMU_IS_ALIGNED(req->bytes, BDRV_SECTOR_SIZE));
> >>
> >> +    /* The handler is installed at creation time; the actual point-in-time
> >> +     * starts at job_start(). Transactions guarantee those two points are
> >> +     * the same point in time. */
> >> +    if (!job_started(&job->common.job)) {
> >> +        return 0;
> >> +    }
> >
> > Hmm, sorry if it is a stupid question, I'm not good in multiprocessing and in
> > Qemu iothreads..
> >
> > job_started just reads job->co. If bs runs in iothread, and therefore write-notifier
> > is in iothread, when job_start is called from main thread.. Is it guaranteed that
> > write-notifier will see job->co variable change early enough to not miss guest write?
> > Should not job->co be volatile for example or something like this?
> >
> > If not think about this patch looks good for me.
> >
>
> You know, it's a really good question.
> So good, in fact, that I have no idea.
>
> ¯\_(ツ)_/¯
>
> I'm fairly certain that IO will not come in until the .clean phase of a
> qmp_transaction, because bdrv_drained_begin(bs) is called during
> .prepare, and we activate the handler (by starting the job) in .commit.
> We do not end the drained section until .clean.
>
> I'm not fully clear on what threading guarantees we have otherwise,
> though; is it possible that "Thread A" would somehow lift the bdrv_drain
> on an IO thread ("Thread B") and, after that, "Thread B" would somehow
> still be able to see an outdated version of job->co that was set by
> "Thread A"?
>
> I doubt it; but I can't prove it.
In the qmp_drive_backup() case (not qmp_transaction()) there is:

void qmp_drive_backup(DriveBackup *arg, Error **errp)
{
    BlockJob *job;
    job = do_drive_backup(arg, NULL, errp);
    if (job) {
        job_start(&job->job);
    }
}

job_start() is called without any thread synchronization, which is
usually fine because the coroutine doesn't run until job_start() calls
aio_co_enter().

Now that the before write notifier has been installed early, there is
indeed a race between job_start() and the write notifier accessing
job->co from an IOThread.

The before write notifier might see job->co != NULL before job_start()
has finished. This could lead to issues if job_*() APIs are invoked by
the write notifier and access an in-between job state.

A safer approach is to set a BackupBlockJob variable at the beginning of
backup_run() and check it from the before write notifier.
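
As a rough illustration of that suggestion, here is a standalone C11
sketch, not QEMU code. ModelBackupJob, its "active" flag, and the
model_*() functions are hypothetical names made up for this example, and
C11 <stdatomic.h> stands in for whatever synchronization QEMU itself would
use. The point is only that the notifier checks a flag owned by the backup
job, set as the first step of the run function, instead of peeking at
job->co while job_start() may still be in progress.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for BackupBlockJob; only the flag matters here. */
typedef struct ModelBackupJob {
    atomic_bool active;     /* assumed name; set once when the job runs */
    uint64_t cow_requests;  /* counts writes the notifier acted on */
} ModelBackupJob;

/* Models job creation: the notifier is installed, the job is not running. */
static void model_job_create(ModelBackupJob *job)
{
    atomic_init(&job->active, false);
    job->cow_requests = 0;
}

/* Models the first statement of backup_run(). */
static void model_backup_run_begin(ModelBackupJob *job)
{
    assert(!atomic_load(&job->active));
    /* Release: everything set up before this store is visible to a
     * notifier that observes active == true with an acquire load. */
    atomic_store_explicit(&job->active, true, memory_order_release);
}

/* Models backup_before_write_notify(); returns 0 in both cases. */
static int model_before_write_notify(ModelBackupJob *job)
{
    if (!atomic_load_explicit(&job->active, memory_order_acquire)) {
        return 0; /* created but not started: behave as a no-op */
    }
    job->cow_requests++; /* placeholder for the copy-before-write work */
    return 0;
}
```

A guest write that arrives before model_backup_run_begin() leaves
cow_requests at 0; a write that arrives afterwards increments it. Whether
the drained section in the transaction path already provides the needed
ordering is a separate question that this sketch does not answer.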

That said, I don't understand the benefit of this patch and IMO it makes
the code harder to understand because now we need to think about the
created but not started state too.

Stefan