On Thu, Mar 25, 2021 at 12:29:39PM +0100, Paolo Bonzini wrote:
> An invariant of the current rwlock is that if multiple coroutines hold a
> reader lock, all must be runnable. The unlock implementation relies on
> this, choosing to wake a single coroutine when the final read lock
> holder exits the critical section, assuming that it will wake a
> coroutine attempting to acquire a write lock.
> 
> The downgrade implementation violates this assumption by creating a
> read lock owning coroutine that is exclusively runnable - any other
> coroutines that are waiting to acquire a read lock are *not* made
> runnable when the write lock holder converts its ownership to read
> only.
> 
> More in general, the old implementation had lots of other fairness bugs.
> The root cause of the bugs was that CoQueue would wake up readers even
> if there were pending writers, and would wake up writers even if there
> were readers.  In that case, the coroutine would go back to sleep *at
> the end* of the CoQueue, losing its place at the head of the line.
> 
> To fix this, keep the queue of waiters explicitly in the CoRwlock
> instead of using CoQueue, and store for each whether it is a
> potential reader or a writer.  This way, downgrade can look at the
> first queued coroutines and wake it only if it is a reader, causing
> all other readers in line to be released in turn.
> 
> Reported-by: David Edmondson <david.edmondson@oracle.com>
> Reviewed-by: David Edmondson <david.edmondson@oracle.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> v3->v4: clean up the code and fix upgrade logic.  Fix upgrade comment too.
> 
>  include/qemu/coroutine.h   |  17 +++--
>  util/qemu-coroutine-lock.c | 148 ++++++++++++++++++++++++-------------
>  2 files changed, 106 insertions(+), 59 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>