[RFC PATCH 0/6] job: replace AioContext lock with job_mutex

* [RFC PATCH 0/6] job: replace AioContext lock with job_mutex
@ 2021-07-07 16:58 Emanuele Giuseppe Esposito
  2021-07-07 16:58 ` [RFC PATCH 1/6] job: use getter/setters instead of accessing the Job fields directly Emanuele Giuseppe Esposito
                   ` (7 more replies)
  0 siblings, 8 replies; 33+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-07-07 16:58 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel,
	Wen Congyang, Xie Changlong, Markus Armbruster, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, John Snow

This is a continuation on the work to reduce (and possibly get rid of) the usage of AioContext lock, by introducing smaller granularity locks to keep the thread safety.

This series aims to:
1) remove the aiocontext lock and substitute it with the already existing
   global job_mutex
2) fix what it looks like to be an oversight when moving the blockjob.c logic
   into the more generic job.c: job_mutex was introduced especially to
   protect job->busy flag, but it seems that it was not used in successive
   patches, because there are multiple code sections that directly
   access the field without any locking.
3) use job_mutex instead of the aiocontext_lock
4) extend the reach of the job_mutex to protect all shared fields
   that the job structure has.

The reason why we propose to use the existing job_mutex and not make one for
each job is to keep things as simple as possible for now, and because
the jobs are not in the execution critical path, so we can affort
some delays.
Having a lock per job would increase overall complexity and
increase the chances of deadlocks (one good example could be the job
transactions, where multiple jobs are grouped together).
Anyways, the per-job mutex can always be added in the future.

Patch 1-4 are in preparation for patch 5. They try to simplify and clarify
the job_mutex usage. Patch 5 tries to add proper syncronization to the job
structure, replacing the AioContext lock when necessary.
Patch 6 just removes unnecessary AioContext locks that are now unneeded.

RFC: I am not sure the way I layed out the locks is ideal.
But their usage should not make deadlocks. I also made sure
the series passess all qemu_iotests.

What is very clear from this patch is that it
is strictly related to the brdv_* and lower level calls, because
they also internally check or even use the aiocontext lock.
Therefore, in order to make it work, I temporarly added some
aiocontext_acquire/release pair around the function that
still assert for them or assume they are hold and temporarly
unlock (unlock() - lock()).

I also apologize for the amount of changes in patch 5, any suggestion on
how to improve the patch layout is also very much appreciated.

Emanuele Giuseppe Esposito (6):
  job: use getter/setters instead of accessing the Job fields directly
  job: _locked functions and public job_lock/unlock for next patch
  job: minor changes to simplify locking
  job.h: categorize job fields
  job: use global job_mutex to protect struct Job
  jobs: remove unnecessary AioContext aquire/release pairs

 include/block/blockjob_int.h   |   1 +
 include/qemu/job.h             | 159 ++++++++++--
 block.c                        |   2 +-
 block/backup.c                 |   4 +
 block/commit.c                 |   4 +-
 block/mirror.c                 |  30 ++-
 block/monitor/block-hmp-cmds.c |   6 -
 block/replication.c            |   3 +-
 blockdev.c                     | 235 ++++++------------
 blockjob.c                     | 140 +++++++----
 job-qmp.c                      |  65 +++--
 job.c                          | 432 ++++++++++++++++++++++++++-------
 qemu-img.c                     |  19 +-
 13 files changed, 724 insertions(+), 376 deletions(-)

-- 
2.31.1

^ permalink raw reply	[flat|nested] 33+ messages in thread