linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jean-Philippe Brucker <jean-philippe@linaro.org>
To: Jacob Pan <jacob.pan.linux@gmail.com>
Cc: iommu@lists.linux-foundation.org,
	LKML <linux-kernel@vger.kernel.org>,
	Joerg Roedel <joro@8bytes.org>,
	Alex Williamson <alex.williamson@redhat.com>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Jonathan Corbet <corbet@lwn.net>,
	linux-api@vger.kernel.org,
	Jean-Philippe Brucker <jean-philippe@linaro.com>,
	Eric Auger <eric.auger@redhat.com>,
	Jacob Pan <jacob.jun.pan@linux.intel.com>,
	Yi Liu <yi.l.liu@intel.com>, "Tian, Kevin" <kevin.tian@intel.com>,
	Raj Ashok <ashok.raj@intel.com>, Wu Hao <hao.wu@intel.com>,
	Yi Sun <yi.y.sun@intel.com>, Dave Jiang <dave.jiang@intel.com>,
	Randy Dunlap <rdunlap@infradead.org>
Subject: Re: [PATCH v3 11/14] iommu/ioasid: Support mm type ioasid_set notifications
Date: Fri, 23 Oct 2020 13:43:32 +0200	[thread overview]
Message-ID: <20201023114332.GD2265982@myrica> (raw)
In-Reply-To: <1601329121-36979-12-git-send-email-jacob.jun.pan@linux.intel.com>

On Mon, Sep 28, 2020 at 02:38:38PM -0700, Jacob Pan wrote:
> As a system-wide resource, IOASID is often shared by multiple kernel
> subsystems that are independent of each other. However, at the
> ioasid_set level, these kernel subsystems must communicate with each
> other for ownership checking, event notifications, etc. For example, on
> Intel Scalable IO Virtualization (SIOV) enabled platforms, KVM and VFIO
> instances under the same process/guest must be aware of a shared IOASID
> set.
> IOASID_SET_TYPE_MM token type was introduced to explicitly mark an
> IOASID set that belongs to a process, thus use the same mm_struct
> pointer as a token. Users of the same process can then identify with
> each other based on this token.
> 
> This patch introduces MM token specific event registration APIs. Event
> subscribers such as KVM instances can register IOASID event handler
> without the knowledge of its ioasid_set. Event handlers are registered
> based on its mm_struct pointer as a token. In case when subscribers
> register handler *prior* to the creation of the ioasid_set, the
> handler’s notification block is stored in a pending list within IOASID
> core. Once the ioasid_set of the MM token is created, the notification
> block will be registered by the IOASID core.
> 
> Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> Signed-off-by: Wu Hao <hao.wu@intel.com>
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
>  drivers/iommu/ioasid.c | 117 +++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/ioasid.h |  15 +++++++
>  2 files changed, 132 insertions(+)
> 
> diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
> index 894b17c06ead..d5faeb559a43 100644
> --- a/drivers/iommu/ioasid.c
> +++ b/drivers/iommu/ioasid.c
> @@ -889,6 +889,29 @@ void ioasid_set_put(struct ioasid_set *set)
>  }
>  EXPORT_SYMBOL_GPL(ioasid_set_put);
>  
> +/*
> + * ioasid_find_mm_set - Retrieve IOASID set with mm token
> + * Take a reference of the set if found.
> + */
> +static struct ioasid_set *ioasid_find_mm_set(struct mm_struct *token)
> +{
> +	struct ioasid_set *set;
> +	unsigned long index;
> +
> +	spin_lock(&ioasid_allocator_lock);

This could deadlock since ioasid_set_put() takes the nb_lock while holding
the allocator_lock.

> +
> +	xa_for_each(&ioasid_sets, index, set) {
> +		if (set->type == IOASID_SET_TYPE_MM && set->token == token) {
> +			refcount_inc(&set->ref);
> +			goto exit_unlock;
> +		}
> +	}
> +	set = NULL;
> +exit_unlock:
> +	spin_unlock(&ioasid_allocator_lock);
> +	return set;
> +}
> +
>  /**
>   * ioasid_adjust_set - Adjust the quota of an IOASID set
>   * @set:	IOASID set to be assigned
> @@ -1121,6 +1144,100 @@ void ioasid_unregister_notifier(struct ioasid_set *set,
>  }
>  EXPORT_SYMBOL_GPL(ioasid_unregister_notifier);
>  
> +/**
> + * ioasid_register_notifier_mm - Register a notifier block on the IOASID set
> + *                               created by the mm_struct pointer as the token
> + *
> + * @mm: the mm_struct token of the ioasid_set
> + * @nb: notfier block to be registered on the ioasid_set

Maybe add that the priority in @nb needs to be set by the caller using a
ioasid_notifier_priority value

> + *
> + * This a variant of ioasid_register_notifier() where the caller intends to
> + * listen to IOASID events belong the ioasid_set created under the same

                         that belong to

> + * process. Caller is not aware of the ioasid_set, no need to hold reference
> + * of the ioasid_set.
> + */
> +int ioasid_register_notifier_mm(struct mm_struct *mm, struct notifier_block *nb)
> +{
> +	struct ioasid_set_nb *curr;
> +	struct ioasid_set *set;
> +	int ret = 0;
> +
> +	if (!mm)
> +		return -EINVAL;
> +
> +	spin_lock(&ioasid_nb_lock);
> +
> +	/* Check for duplicates, nb is unique per set */
> +	list_for_each_entry(curr, &ioasid_nb_pending_list, list) {
> +		if (curr->token == mm && curr->nb == nb) {
> +			ret = -EBUSY;
> +			goto exit_unlock;
> +		}
> +	}
> +
> +	/* Check if the token has an existing set */
> +	set = ioasid_find_mm_set(mm);
> +	if (!set) {
> +		/* Add to the rsvd list as inactive */
> +		curr->active = false;

curr is invalid here

> +	} else {
> +		/* REVISIT: Only register empty set for now. Can add an option
> +		 * in the future to playback existing PASIDs.
> +		 */

You can drop this comment

> +		if (set->nr_ioasids) {
> +			pr_warn("IOASID set %d not empty\n", set->id);
> +			ret = -EBUSY;
> +			goto exit_unlock;
> +		}
> +		curr = kzalloc(sizeof(*curr), GFP_ATOMIC);

Needs to be in the common path, before this block

> +		if (!curr) {
> +			ret = -ENOMEM;
> +			goto exit_unlock;
> +		}
> +		curr->token = mm;
> +		curr->nb = nb;
> +		curr->active = true;
> +		curr->set = set;
> +
> +		/* Set already created, add to the notifier chain */
> +		atomic_notifier_chain_register(&set->nh, nb);
> +		/*
> +		 * Do not hold a reference, if the set gets destroyed, the nb
> +		 * entry will be marked inactive.
> +		 */
> +		ioasid_set_put(set);
> +	}
> +
> +	list_add(&curr->list, &ioasid_nb_pending_list);
> +
> +exit_unlock:
> +	spin_unlock(&ioasid_nb_lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(ioasid_register_notifier_mm);
> +
> +void ioasid_unregister_notifier_mm(struct mm_struct *mm, struct notifier_block *nb)
> +{
> +	struct ioasid_set_nb *curr;
> +
> +	spin_lock(&ioasid_nb_lock);
> +	list_for_each_entry(curr, &ioasid_nb_pending_list, list) {
> +		if (curr->token == mm && curr->nb == nb) {
> +			list_del(&curr->list);
> +			spin_unlock(&ioasid_nb_lock);
> +			if (curr->active) {
> +				atomic_notifier_chain_unregister(&curr->set->nh,
> +								 nb);

I think it's possible for ioasid_set_put_locked() to free the set right
after we release the lock, then this unregister() will be use-after-free.
Best keep holding the lock for this.

> +			}
> +			kfree(curr);
> +			return;
> +		}
> +	}
> +	pr_warn("No ioasid set found for mm token %llx\n",  (u64)mm);

Use %p

Overall I think error messages in this series could be demoted to
pr_debug, since they're useful when developing users to this API, but
figuring out whether they can be user-triggered and need care is too much
work.

Thanks,
Jean

> +	spin_unlock(&ioasid_nb_lock);
> +}
> +EXPORT_SYMBOL_GPL(ioasid_unregister_notifier_mm);
> +
>  MODULE_AUTHOR("Jean-Philippe Brucker <jean-philippe.brucker@arm.com>");
>  MODULE_AUTHOR("Jacob Pan <jacob.jun.pan@linux.intel.com>");
>  MODULE_DESCRIPTION("IO Address Space ID (IOASID) allocator");
> diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
> index 1b551c99d568..c6cc855aadb6 100644
> --- a/include/linux/ioasid.h
> +++ b/include/linux/ioasid.h
> @@ -132,6 +132,9 @@ void ioasid_unregister_notifier(struct ioasid_set *set,
>  void ioasid_set_for_each_ioasid(struct ioasid_set *sdata,
>  				void (*fn)(ioasid_t id, void *data),
>  				void *data);
> +int ioasid_register_notifier_mm(struct mm_struct *mm, struct notifier_block *nb);
> +void ioasid_unregister_notifier_mm(struct mm_struct *mm, struct notifier_block *nb);
> +
>  #else /* !CONFIG_IOASID */
>  static inline void ioasid_install_capacity(ioasid_t total)
>  {
> @@ -238,5 +241,17 @@ static inline void ioasid_set_for_each_ioasid(struct ioasid_set *sdata,
>  					      void *data)
>  {
>  }
> +
> +static inline int ioasid_register_notifier_mm(struct mm_struct *mm,
> +					      struct notifier_block *nb)
> +{
> +	return -ENOTSUPP;
> +}
> +
> +static inline void ioasid_unregister_notifier_mm(struct mm_struct *mm,
> +						 struct notifier_block *nb)
> +{
> +}
> +
>  #endif /* CONFIG_IOASID */
>  #endif /* __LINUX_IOASID_H */
> -- 
> 2.7.4
> 

  reply	other threads:[~2020-10-23 11:43 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-28 21:38 [PATCH v3 00/14] IOASID extensions for guest SVA Jacob Pan
2020-09-28 21:38 ` [PATCH v3 01/14] docs: Document IO Address Space ID (IOASID) APIs Jacob Pan
2020-10-20 13:58   ` Jean-Philippe Brucker
2020-10-26 21:05     ` Jacob Pan
2020-10-30 10:18       ` Jean-Philippe Brucker
2020-11-02 23:41         ` Jacob Pan
2020-09-28 21:38 ` [PATCH v3 02/14] iommu/ioasid: Rename ioasid_set_data() Jacob Pan
2020-09-28 21:38 ` [PATCH v3 03/14] iommu/ioasid: Add a separate function for detach data Jacob Pan
2020-10-21 14:54   ` Jean-Philippe Brucker
2020-09-28 21:38 ` [PATCH v3 04/14] iommu/ioasid: Support setting system-wide capacity Jacob Pan
2020-10-21 14:54   ` Jean-Philippe Brucker
2020-09-28 21:38 ` [PATCH v3 05/14] iommu/ioasid: Redefine IOASID set and allocation APIs Jacob Pan
2020-10-21 14:55   ` Jean-Philippe Brucker
2020-09-28 21:38 ` [PATCH v3 06/14] iommu/ioasid: Introduce API to adjust the quota of an ioasid_set Jacob Pan
2020-10-21 14:57   ` Jean-Philippe Brucker
2020-09-28 21:38 ` [PATCH v3 07/14] iommu/ioasid: Add an iterator API for ioasid_set Jacob Pan
2020-10-21 14:57   ` Jean-Philippe Brucker
2020-09-28 21:38 ` [PATCH v3 08/14] iommu/ioasid: Add reference couting functions Jacob Pan
2020-10-21 14:58   ` Jean-Philippe Brucker
2020-09-28 21:38 ` [PATCH v3 09/14] iommu/ioasid: Introduce ioasid_set private ID Jacob Pan
2020-10-23 11:39   ` Jean-Philippe Brucker
2020-09-28 21:38 ` [PATCH v3 10/14] iommu/ioasid: Introduce notification APIs Jacob Pan
2020-10-23 11:41   ` Jean-Philippe Brucker
2020-09-28 21:38 ` [PATCH v3 11/14] iommu/ioasid: Support mm type ioasid_set notifications Jacob Pan
2020-10-23 11:43   ` Jean-Philippe Brucker [this message]
2020-09-28 21:38 ` [PATCH v3 12/14] iommu/vt-d: Remove mm reference for guest SVA Jacob Pan
2020-09-28 21:38 ` [PATCH v3 13/14] iommu/vt-d: Listen to IOASID notifications Jacob Pan
2020-09-28 21:38 ` [PATCH v3 14/14] iommu/vt-d: Store guest PASID during bind Jacob Pan
2020-10-19 16:51 ` [PATCH v3 00/14] IOASID extensions for guest SVA Jacob Pan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201023114332.GD2265982@myrica \
    --to=jean-philippe@linaro.org \
    --cc=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=corbet@lwn.net \
    --cc=dave.jiang@intel.com \
    --cc=dwmw2@infradead.org \
    --cc=eric.auger@redhat.com \
    --cc=hao.wu@intel.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jacob.jun.pan@linux.intel.com \
    --cc=jacob.pan.linux@gmail.com \
    --cc=jean-philippe@linaro.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rdunlap@infradead.org \
    --cc=yi.l.liu@intel.com \
    --cc=yi.y.sun@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).