kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <sean.j.christopherson@intel.com>
To: Ben Gardon <bgardon@google.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Paolo Bonzini <pbonzini@redhat.com>,
	Peter Shier <pshier@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>,
	Waiman Long <longman@redhat.com>,
	Davidlohr Bueso <dave@stgolabs.net>
Subject: Re: [PATCH 3/3] sched: Add cond_resched_rwlock
Date: Tue, 27 Oct 2020 10:56:36 -0700	[thread overview]
Message-ID: <20201027175634.GI1021@linux.intel.com> (raw)
In-Reply-To: <20201027164950.1057601-3-bgardon@google.com>

On Tue, Oct 27, 2020 at 09:49:50AM -0700, Ben Gardon wrote:
> Rescheduling while holding a spin lock is essential for keeping long
> running kernel operations running smoothly. Add the facility to
> cond_resched rwlocks.

This adds two new exports and two new macros without any in-tree users, which
is generally frowned upon.  You and I know these will be used by KVM's new
TDP MMU, but the non-KVM folks, and more importantly the maintainers of this
code, are undoubtedly going to ask "why".  I.e. these patches probably belong
in the KVM series to switch to a rwlock for the TDP MMU.

Regarding the code, it's all copy-pasted from the spinlock code and darn near
identical.  It might be worth adding builder macros for these.

> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  include/linux/sched.h | 12 ++++++++++++
>  kernel/sched/core.c   | 40 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 52 insertions(+)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 77179160ec3ab..2eb0c53fce115 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1841,12 +1841,24 @@ static inline int _cond_resched(void) { return 0; }
>  })
>  
>  extern int __cond_resched_lock(spinlock_t *lock);
> +extern int __cond_resched_rwlock_read(rwlock_t *lock);
> +extern int __cond_resched_rwlock_write(rwlock_t *lock);
>  
>  #define cond_resched_lock(lock) ({				\
>  	___might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);\
>  	__cond_resched_lock(lock);				\
>  })
>  
> +#define cond_resched_rwlock_read(lock) ({			\
> +	__might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);	\
> +	__cond_resched_rwlock_read(lock);			\
> +})
> +
> +#define cond_resched_rwlock_write(lock) ({			\
> +	__might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);	\
> +	__cond_resched_rwlock_write(lock);			\
> +})
> +
>  static inline void cond_resched_rcu(void)
>  {
>  #if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index d2003a7d5ab55..ac58e7829a063 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6152,6 +6152,46 @@ int __cond_resched_lock(spinlock_t *lock)
>  }
>  EXPORT_SYMBOL(__cond_resched_lock);
>  
> +int __cond_resched_rwlock_read(rwlock_t *lock)
> +{
> +	int resched = should_resched(PREEMPT_LOCK_OFFSET);
> +	int ret = 0;
> +
> +	lockdep_assert_held(lock);
> +
> +	if (rwlock_needbreak(lock) || resched) {
> +		read_unlock(lock);
> +		if (resched)
> +			preempt_schedule_common();
> +		else
> +			cpu_relax();
> +		ret = 1;

AFAICT, this rather odd code flow from __cond_resched_lock() is an artifact of
code changes over the years and not intentionally weird.  IMO, it would be
cleaner and easier to read as:

	int resched = should_resched(PREEMPT_LOCK_OFFSET);

	lockdep_assert_held(lock);

	if (!rwlock_needbreak(lock) && !resched)
		return 0;

	read_unlock(lock);
	if (resched)
		preempt_schedule_common();
	else
		cpu_relax();
	read_lock(lock)
	return 1;


> +		read_lock(lock);
> +	}
> +	return ret;
> +}
> +EXPORT_SYMBOL(__cond_resched_rwlock_read);
> +
> +int __cond_resched_rwlock_write(rwlock_t *lock)
> +{
> +	int resched = should_resched(PREEMPT_LOCK_OFFSET);
> +	int ret = 0;
> +
> +	lockdep_assert_held(lock);

This shoulid be lockdep_assert_held_write.

> +
> +	if (rwlock_needbreak(lock) || resched) {
> +		write_unlock(lock);
> +		if (resched)
> +			preempt_schedule_common();
> +		else
> +			cpu_relax();
> +		ret = 1;
> +		write_lock(lock);
> +	}
> +	return ret;
> +}
> +EXPORT_SYMBOL(__cond_resched_rwlock_write);
> +
>  /**
>   * yield - yield the current processor to other threads.
>   *
> -- 
> 2.29.0.rc2.309.g374f81d7ae-goog
> 

  reply	other threads:[~2020-10-27 18:01 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-27 16:49 [PATCH 1/3] locking/rwlocks: Add contention detection for rwlocks Ben Gardon
2020-10-27 16:49 ` [PATCH 2/3] sched: Add needbreak " Ben Gardon
2020-10-27 16:49 ` [PATCH 3/3] sched: Add cond_resched_rwlock Ben Gardon
2020-10-27 17:56   ` Sean Christopherson [this message]
2020-10-27 19:17     ` Peter Zijlstra
2020-10-27 20:17     ` Davidlohr Bueso
2020-10-27 18:44   ` Peter Zijlstra
2020-10-27 18:50     ` Paolo Bonzini
2020-10-30 17:09   ` Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201027175634.GI1021@linux.intel.com \
    --to=sean.j.christopherson@intel.com \
    --cc=bgardon@google.com \
    --cc=dave@stgolabs.net \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=pshier@google.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).