All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Snow <jsnow@redhat.com>
To: Jeff Cody <jcody@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	vsementsov@virtuozzo.com, qemu-block@nongnu.org,
	qemu-devel@nongnu.org, stefanha@redhat.com, pbonzini@redhat.com
Subject: Re: [Qemu-devel] [PATCH v3 4/6] blockjob: add block_job_start
Date: Mon, 7 Nov 2016 21:20:32 -0500	[thread overview]
Message-ID: <33ccab69-7bda-a082-7758-79ff40290804@redhat.com> (raw)
In-Reply-To: <20161108020528.GC8956@localhost.localdomain>



On 11/07/2016 09:05 PM, Jeff Cody wrote:
> On Mon, Nov 07, 2016 at 09:02:14PM -0500, John Snow wrote:
>>
>>
>> On 11/03/2016 08:17 AM, Kevin Wolf wrote:
>>> Am 02.11.2016 um 18:50 hat John Snow geschrieben:
>>>> Instead of automatically starting jobs at creation time via backup_start
>>>> et al, we'd like to return a job object pointer that can be started
>>>> manually at later point in time.
>>>>
>>>> For now, add the block_job_start mechanism and start the jobs
>>>> automatically as we have been doing, with conversions job-by-job coming
>>>> in later patches.
>>>>
>>>> Of note: cancellation of unstarted jobs will perform all the normal
>>>> cleanup as if the job had started, particularly abort and clean. The
>>>> only difference is that we will not emit any events, because the job
>>>> never actually started.
>>>>
>>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>>
>>>> diff --git a/block/commit.c b/block/commit.c
>>>> index 20d27e2..5b7c454 100644
>>>> --- a/block/commit.c
>>>> +++ b/block/commit.c
>>>> @@ -289,10 +289,9 @@ void commit_start(const char *job_id, BlockDriverState *bs,
>>>>     s->backing_file_str = g_strdup(backing_file_str);
>>>>
>>>>     s->on_error = on_error;
>>>> -    s->common.co = qemu_coroutine_create(s->common.driver->start, s);
>>>>
>>>>     trace_commit_start(bs, base, top, s, s->common.co);
>>>
>>> s->common.co is now uninitialised and should probably be removed from
>>> the tracepoint arguments. The same is true for mirror and stream.
>>>
>>>> -    qemu_coroutine_enter(s->common.co);
>>>> +    block_job_start(&s->common);
>>>> }
>>>
>>>> diff --git a/blockjob.c b/blockjob.c
>>>> index e3c458c..16c5159 100644
>>>> --- a/blockjob.c
>>>> +++ b/blockjob.c
>>>> @@ -174,7 +174,8 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
>>>>     job->blk           = blk;
>>>>     job->cb            = cb;
>>>>     job->opaque        = opaque;
>>>> -    job->busy          = true;
>>>> +    job->busy          = false;
>>>> +    job->paused        = true;
>>>>     job->refcnt        = 1;
>>>>     bs->job = job;
>>>>
>>>> @@ -202,6 +203,21 @@ bool block_job_is_internal(BlockJob *job)
>>>>     return (job->id == NULL);
>>>> }
>>>>
>>>> +static bool block_job_started(BlockJob *job)
>>>> +{
>>>> +    return job->co;
>>>> +}
>>>> +
>>>> +void block_job_start(BlockJob *job)
>>>> +{
>>>> +    assert(job && !block_job_started(job) && job->paused &&
>>>> +           !job->busy && job->driver->start);
>>>> +    job->paused = false;
>>>> +    job->busy = true;
>>>> +    job->co = qemu_coroutine_create(job->driver->start, job);
>>>> +    qemu_coroutine_enter(job->co);
>>>> +}
>>>
>>> We allow the user to pause a job while it's not started yet. You
>>> classified this as "harmless". But if we accept this, can we really
>>> unconditionally enter the coroutine even if the job has been paused?
>>> Can't a user expect that a job remains in paused state when they
>>> explicitly requested a pause and the job was already internally paused,
>>> like in this case by block_job_create()?
>>>
>>
>> What will end up happening is that we'll enter the job, and then it'll pause
>> immediately upon entrance. Is that a problem?
>>
>> If the jobs themselves are not checking their pause state fastidiously, it
>> could be (but block/backup does -- after it creates a write notifier.)
>>
>> Do we want a stronger guarantee here?
>>
>> Naively I think it's OK as-is, but I could add a stronger boolean in that
>> lets us know if it's okay to start or not, and we could delay the actual
>> creation and start until the 'resume' comes in if you'd like.
>>
>> I'd like to avoid the complexity if we can help it, but perhaps I'm not
>> thinking carefully enough about the existing edge cases.
>>
>
> Is there any reason we can't just use job->pause_count here?  When the job
> is created, set job->paused = true, and job->pause_count = 1.  In the
> block_job_start(), check the pause_count prior to qemu_coroutine_enter():
>
>     void block_job_start(BlockJob *job)
>     {
>         assert(job && !block_job_started(job) && job->paused &&
>               !job->busy && job->driver->start);
>         job->co = qemu_coroutine_create(job->driver->start, job);
>         job->paused = --job->pause_count > 0;
>         if (!job->paused) {
>             job->busy = true;
>             qemu_coroutine_enter(job->co);
>         }
>     }
>

Solid point. Let's do it this way.
Thanks!

>
>>> The same probably also applies to the internal job pausing during
>>> bdrv_drain_all_begin/end, though as you know there is a larger problem
>>> with starting jobs under drain_all anyway. For now, we just need to keep
>>> in mind that we can neither create nor start a job in such sections.
>>>
>>
>> Yeah, there are deeper problems there. As long as the existing critical
>> sections don't allow us to create jobs (started or not) I think we're
>> probably already OK.
>>
>>> Kevin
>>>

  reply	other threads:[~2016-11-08  2:20 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-02 17:50 [Qemu-devel] [PATCH v3 0/6] jobs: fix transactional race condition John Snow
2016-11-02 17:50 ` [Qemu-devel] [PATCH v3 1/6] blockjob: fix dead pointer in txn list John Snow
2016-11-08  2:47   ` Jeff Cody
2016-11-02 17:50 ` [Qemu-devel] [PATCH v3 2/6] blockjob: add .clean property John Snow
2016-11-08  2:51   ` Jeff Cody
2016-11-02 17:50 ` [Qemu-devel] [PATCH v3 3/6] blockjob: add .start field John Snow
2016-11-08  2:58   ` Jeff Cody
2016-11-02 17:50 ` [Qemu-devel] [PATCH v3 4/6] blockjob: add block_job_start John Snow
2016-11-03 12:17   ` Kevin Wolf
2016-11-08  2:02     ` John Snow
2016-11-08  2:05       ` Jeff Cody
2016-11-08  2:20         ` John Snow [this message]
2016-11-08  9:16         ` Kevin Wolf
2016-11-02 17:50 ` [Qemu-devel] [PATCH v3 5/6] blockjob: refactor backup_start as backup_job_create John Snow
2016-11-03 13:17   ` Kevin Wolf
2016-11-08  5:41     ` John Snow
2016-11-08  9:11       ` Kevin Wolf
2016-11-08 15:24         ` John Snow
2016-11-08 18:30           ` Jeff Cody
2016-11-08  3:14   ` Jeff Cody
2016-11-02 17:50 ` [Qemu-devel] [PATCH v3 6/6] iotests: add transactional failure race test John Snow
2016-11-03 13:21 ` [Qemu-devel] [PATCH v3 0/6] jobs: fix transactional race condition Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=33ccab69-7bda-a082-7758-79ff40290804@redhat.com \
    --to=jsnow@redhat.com \
    --cc=jcody@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.