qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: Max Reitz <mreitz@redhat.com>, qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, Kevin Wolf <kwolf@redhat.com>,
	John Snow <jsnow@redhat.com>
Subject: Re: [PATCH for-6.1? v2 5/7] job: Add job_cancel_requested()
Date: Tue, 3 Aug 2021 15:35:50 +0300	[thread overview]
Message-ID: <d370135d-9a34-1863-61a8-309c21e9e368@virtuozzo.com> (raw)
In-Reply-To: <6ae8d39e-62cc-bd8c-c78e-d2747eb83715@redhat.com>

02.08.2021 13:23, Max Reitz wrote:
> On 27.07.21 17:47, Vladimir Sementsov-Ogievskiy wrote:
>> 27.07.2021 18:39, Max Reitz wrote:
>>> On 27.07.21 15:04, Vladimir Sementsov-Ogievskiy wrote:
>>>> 26.07.2021 17:46, Max Reitz wrote:
>>>>> Most callers of job_is_cancelled() actually want to know whether the job
>>>>> is on its way to immediate termination.  For example, we refuse to pause
>>>>> jobs that are cancelled; but this only makes sense for jobs that are
>>>>> really actually cancelled.
>>>>>
>>>>> A mirror job that is cancelled during READY with force=false should
>>>>> absolutely be allowed to pause.  This "cancellation" (which is actually
>>>>> a kind of completion) may take an indefinite amount of time, and so
>>>>> should behave like any job during normal operation.  For example, with
>>>>> on-target-error=stop, the job should stop on write errors. (In
>>>>> contrast, force-cancelled jobs should not get write errors, as they
>>>>> should just terminate and not do further I/O.)
>>>>>
>>>>> Therefore, redefine job_is_cancelled() to only return true for jobs that
>>>>> are force-cancelled (which as of HEAD^ means any job that interprets the
>>>>> cancellation request as a request for immediate termination), and add
>>>>> job_cancel_request() as the general variant, which returns true for any
>>>>
>>>> job_cancel_requested()
>>>>
>>>>> jobs which have been requested to be cancelled, whether it be
>>>>> immediately or after an arbitrarily long completion phase.
>>>>>
>>>>> Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
>>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>>> ---
>>>>>   include/qemu/job.h |  8 +++++++-
>>>>>   block/mirror.c     | 10 ++++------
>>>>>   job.c              |  7 ++++++-
>>>>>   3 files changed, 17 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/include/qemu/job.h b/include/qemu/job.h
>>>>> index 8aa90f7395..032edf3c5f 100644
>>>>> --- a/include/qemu/job.h
>>>>> +++ b/include/qemu/job.h
>>>>> @@ -436,9 +436,15 @@ const char *job_type_str(const Job *job);
>>>>>   /** Returns true if the job should not be visible to the management layer. */
>>>>>   bool job_is_internal(Job *job);
>>>>>   -/** Returns whether the job is scheduled for cancellation. */
>>>>> +/** Returns whether the job is being cancelled. */
>>>>>   bool job_is_cancelled(Job *job);
>>>>>   +/**
>>>>> + * Returns whether the job is scheduled for cancellation (at an
>>>>> + * indefinite point).
>>>>> + */
>>>>> +bool job_cancel_requested(Job *job);
>>>>> +
>>>>>   /** Returns whether the job is in a completed state. */
>>>>>   bool job_is_completed(Job *job);
>>>>>   diff --git a/block/mirror.c b/block/mirror.c
>>>>> index e93631a9f6..72e02fa34e 100644
>>>>> --- a/block/mirror.c
>>>>> +++ b/block/mirror.c
>>>>> @@ -936,7 +936,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
>>>>>           /* Transition to the READY state and wait for complete. */
>>>>>           job_transition_to_ready(&s->common.job);
>>>>>           s->actively_synced = true;
>>>>> -        while (!job_is_cancelled(&s->common.job) && !s->should_complete) {
>>>>> +        while (!job_cancel_requested(&s->common.job) && !s->should_complete) {
>>>>>               job_yield(&s->common.job);
>>>>>           }
>>>>>           s->common.job.cancelled = false;
>>>>> @@ -1043,7 +1043,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
>>>>>               }
>>>>>                 should_complete = s->should_complete ||
>>>>> -                job_is_cancelled(&s->common.job);
>>>>> + job_cancel_requested(&s->common.job);
>>>>>               cnt = bdrv_get_dirty_count(s->dirty_bitmap);
>>>>>           }
>>>>>   @@ -1087,7 +1087,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
>>>>>           trace_mirror_before_sleep(s, cnt, job_is_ready(&s->common.job),
>>>>>                                     delay_ns);
>>>>>           job_sleep_ns(&s->common.job, delay_ns);
>>>>> -        if (job_is_cancelled(&s->common.job) && s->common.job.force_cancel) {
>>>>> +        if (job_is_cancelled(&s->common.job)) {
>>>>>               break;
>>>>>           }
>>>>>           s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>>>>> @@ -1099,9 +1099,7 @@ immediate_exit:
>>>>>            * or it was cancelled prematurely so that we do not guarantee that
>>>>>            * the target is a copy of the source.
>>>>>            */
>>>>> -        assert(ret < 0 ||
>>>>> -               (s->common.job.force_cancel &&
>>>>> -                job_is_cancelled(&s->common.job)));
>>>>> +        assert(ret < 0 || job_is_cancelled(&s->common.job));
>>>
>>> (As a note, I hope this does the job regarding your suggestions for patch 4. :))
>>>
>>>>>           assert(need_drain);
>>>>>           mirror_wait_for_all_io(s);
>>>>>       }
>>>>> diff --git a/job.c b/job.c
>>>>> index e78d893a9c..dba17a680f 100644
>>>>> --- a/job.c
>>>>> +++ b/job.c
>>>>> @@ -216,6 +216,11 @@ const char *job_type_str(const Job *job)
>>>>>   }
>>>>>     bool job_is_cancelled(Job *job)
>>>>> +{
>>>>> +    return job->cancelled && job->force_cancel;
>>>>
>>>> can job->cancelled be false when job->force_cancel is true ? I think not and worth an assertion here. Something like
>>>>
>>>> if (job->force_cancel) {
>>>>    assert(job->cancelled);
>>>>    return true;
>>>> }
>>>>
>>>> return false;
>>>
>>> Sounds good, why not.
>>>
>>>>
>>>>> +}
>>>>> +
>>>>> +bool job_cancel_requested(Job *job)
>>>>>   {
>>>>>       return job->cancelled;
>>>>>   }
>>>>> @@ -1015,7 +1020,7 @@ void job_complete(Job *job, Error **errp)
>>>>>       if (job_apply_verb(job, JOB_VERB_COMPLETE, errp)) {
>>>>>           return;
>>>>>       }
>>>>> -    if (job_is_cancelled(job) || !job->driver->complete) {
>>>>> +    if (job_cancel_requested(job) || !job->driver->complete) {
>>>>>           error_setg(errp, "The active block job '%s' cannot be completed",
>>>>>                      job->id);
>>>>>           return;
>>>>>
>>>>
>>>> I think it's a correct change, although there may be unexpected side-effects, it's hard to imagine all consequences of changing job_is_cancelled() semantics called in several places in job.c.
>>
>> For example: so we now don't set -ECANCELED in job_update_rc for soft-cancel..
>>
>> This mean that job_finalize_single() will call job_commit instead of job_abort, and job_commit may do some graph changes, which shouldn't happen for soft-cancel
> 
> So the question is when these two conditions come into play.
> 
> There are two places that set job->ret to ECANCELED if the job is cancelled, namely job_update_rc(), and job_finish_sync().
> 
> job_finish_sync() will do so only after the job has been completed, which requires the job to either have been aborted (i.e. ret is non-zero anyway) or job_completed() to have been called. job_completed() is called by job_exit(), which is run after the job’s main loop has exited.  If mirror is soft-cancelled, mirror_run() will clear s->common.job.cancelled before returning, so job_finish_sync() will not see the job as cancelled.
> 
> job_update_rc() is called from three places:
> 
> job_finalize_single(): Asserts that job_is_completed(), so the same reasoning as for job_finish_sync() applies.
> 
> job_prepare(): Called by job_do_finalize(), which can only happen when the job is completed.  (JobVerbTable only allows finalization when the job is PENDING, which is a state where job_is_completed() is true, i.e. after mirror_run().)
> 
> job_completed(): Same reasoning as for job_finish_sync().
> 
> 
> So it looks to me like these places that set job->ret to ECANCELED if the job has been cancelled do not consider a soft-cancelled mirror job to have been cancelled, which makes using job_is_cancelled() instead of job_cancel_requested() correct there. (And most likely, we can drop the `.cancelled = false` statements from the mirror job in turn.)
> 


Hm, reasonable. OK than, thanks for explanation.


-- 
Best regards,
Vladimir


  reply	other threads:[~2021-08-03 12:36 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-26 14:46 [PATCH for-6.1? v2 0/7] mirror: Handle errors after READY cancel Max Reitz
2021-07-26 14:46 ` [PATCH for-6.1? v2 1/7] mirror: Keep s->synced on error Max Reitz
2021-07-26 14:46 ` [PATCH for-6.1? v2 2/7] mirror: Drop s->synced Max Reitz
2021-07-26 15:26   ` Eric Blake
2021-07-27 11:41   ` Vladimir Sementsov-Ogievskiy
2021-07-26 14:46 ` [PATCH for-6.1? v2 3/7] job: @force parameter for job_cancel_sync{, _all}() Max Reitz
2021-08-03 14:00   ` [PATCH for-6.1? v2 3/7] job: @force parameter for job_cancel_sync{,_all}() Kevin Wolf
2021-07-26 14:46 ` [PATCH for-6.1? v2 4/7] jobs: Give Job.force_cancel more meaning Max Reitz
2021-07-26 21:38   ` Eric Blake
2021-07-27 12:40   ` Vladimir Sementsov-Ogievskiy
2021-07-26 14:46 ` [PATCH for-6.1? v2 5/7] job: Add job_cancel_requested() Max Reitz
2021-07-27 13:04   ` Vladimir Sementsov-Ogievskiy
2021-07-27 15:39     ` Max Reitz
2021-07-27 15:47       ` Vladimir Sementsov-Ogievskiy
2021-08-02 10:23         ` Max Reitz
2021-08-03 12:35           ` Vladimir Sementsov-Ogievskiy [this message]
2021-08-03 14:25   ` Kevin Wolf
2021-08-04  8:07     ` Max Reitz
2021-08-04 10:34       ` Kevin Wolf
2021-08-04 11:00         ` Peter Krempa
2021-08-04 14:15         ` Max Reitz
2021-07-26 14:46 ` [PATCH for-6.1? v2 6/7] mirror: Check job_is_cancelled() earlier Max Reitz
2021-07-27 13:13   ` Vladimir Sementsov-Ogievskiy
2021-07-27 15:40     ` Max Reitz
2021-08-03 14:34   ` Kevin Wolf
2021-08-04  8:25     ` Max Reitz
2021-08-04  9:48       ` Kevin Wolf
2021-08-04 10:12         ` Max Reitz
2021-07-26 14:46 ` [PATCH for-6.1? v2 7/7] iotests: Add mirror-ready-cancel-error test Max Reitz
2021-07-27 13:24   ` Vladimir Sementsov-Ogievskiy
2021-08-03 14:36 ` [PATCH for-6.1? v2 0/7] mirror: Handle errors after READY cancel Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d370135d-9a34-1863-61a8-309c21e9e368@virtuozzo.com \
    --to=vsementsov@virtuozzo.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).