All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: "Tian, Kevin" <kevin.tian@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	Jonathan Corbet <corbet@lwn.net>,
	David Woodhouse <dwmw2@infradead.org>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	Joerg Roedel <joro@8bytes.org>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"linux-kselftest@vger.kernel.org"
	<linux-kselftest@vger.kernel.org>,
	"llvm@lists.linux.dev" <llvm@lists.linux.dev>,
	Nathan Chancellor <nathan@kernel.org>,
	Nick Desaulniers <ndesaulniers@google.com>,
	Miguel Ojeda <ojeda@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Shuah Khan <shuah@kernel.org>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Tom Rix <trix@redhat.com>, Will Deacon <will@kernel.org>,
	Alex Williamson <alex.williamson@redhat.com>,
	Chaitanya Kulkarni <chaitanyak@nvidia.com>,
	Cornelia Huck <cohuck@redhat.com>,
	Daniel Jordan <daniel.m.jordan@oracle.com>,
	David Gibson <david@gibson.dropbear.id.au>,
	Eric Auger <eric.auger@redhat.com>,
	Eric Farman <farman@linux.ibm.com>,
	Jason Wang <jasowang@redhat.com>,
	Jean-Philippe Brucker <jean-philippe@linaro.org>,
	"Martins, Joao" <joao.m.martins@oracle.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Nicolin Chen <nicolinc@nvidia.com>,
	Niklas Schnelle <schnelle@linux.ibm.com>,
	Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
	"Liu, Yi L" <yi.l.liu@intel.com>,
	Keqian Zhu <zhukeqian1@huawei.com>
Subject: Re: [PATCH v3 10/15] iommufd: IOCTLs for the io_pagetable
Date: Mon, 7 Nov 2022 11:02:20 -0400	[thread overview]
Message-ID: <Y2kd/Ptt0iR6SGsh@nvidia.com> (raw)
In-Reply-To: <BN9PR11MB52765289F880B8A7297077318C3B9@BN9PR11MB5276.namprd11.prod.outlook.com>

On Fri, Nov 04, 2022 at 08:32:30AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, October 26, 2022 2:12 AM
> > 
> > +int iommufd_ioas_allow_iovas(struct iommufd_ucmd *ucmd)
> > +{
> > +	struct iommu_ioas_allow_iovas *cmd = ucmd->cmd;
> > +	struct rb_root_cached allowed_iova = RB_ROOT_CACHED;
> > +	struct interval_tree_node *node;
> > +	struct iommufd_ioas *ioas;
> > +	struct io_pagetable *iopt;
> > +	int rc = 0;
> > +
> > +	ioas = iommufd_get_ioas(ucmd, cmd->ioas_id);
> > +	if (IS_ERR(ioas))
> > +		return PTR_ERR(ioas);
> > +	iopt = &ioas->iopt;
> 
> Missed the check of __reserved field

Done

> > +
> > +int iommufd_ioas_copy(struct iommufd_ucmd *ucmd)
> > +{
> > +	struct iommu_ioas_copy *cmd = ucmd->cmd;
> > +	struct iommufd_ioas *src_ioas;
> > +	struct iommufd_ioas *dst_ioas;
> > +	unsigned int flags = 0;
> > +	LIST_HEAD(pages_list);
> > +	unsigned long iova;
> > +	int rc;
> > +
> > +	if ((cmd->flags &
> > +	     ~(IOMMU_IOAS_MAP_FIXED_IOVA |
> > IOMMU_IOAS_MAP_WRITEABLE |
> > +	       IOMMU_IOAS_MAP_READABLE)))
> > +		return -EOPNOTSUPP;
> > +	if (cmd->length >= ULONG_MAX)
> > +		return -EOVERFLOW;
> 
> and overflow on cmd->dest_iova/src_iova

Yep

> > +
> > +	src_ioas = iommufd_get_ioas(ucmd, cmd->src_ioas_id);
> > +	if (IS_ERR(src_ioas))
> > +		return PTR_ERR(src_ioas);
> > +	rc = iopt_get_pages(&src_ioas->iopt, cmd->src_iova, cmd->length,
> > +			    &pages_list);
> > +	iommufd_put_object(&src_ioas->obj);
> > +	if (rc)
> > +		goto out_pages;
> 
> direct return given iopt_get_pages() already called
> iopt_free_pages_list() upon error.

Ok

> > +int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd)
> > +{
> > +	struct iommu_ioas_unmap *cmd = ucmd->cmd;
> > +	struct iommufd_ioas *ioas;
> > +	unsigned long unmapped = 0;
> > +	int rc;
> > +
> > +	ioas = iommufd_get_ioas(ucmd, cmd->ioas_id);
> > +	if (IS_ERR(ioas))
> > +		return PTR_ERR(ioas);
> > +
> > +	if (cmd->iova == 0 && cmd->length == U64_MAX) {
> > +		rc = iopt_unmap_all(&ioas->iopt, &unmapped);
> > +		if (rc)
> > +			goto out_put;
> > +	} else {
> > +		if (cmd->iova >= ULONG_MAX || cmd->length >=
> > ULONG_MAX) {
> > +			rc = -EOVERFLOW;
> > +			goto out_put;
> > +		}
> 
> Above check can be moved before iommufd_get_ioas().

They have to be after this:

	if (cmd->iova == 0 && cmd->length == U64_MAX) {

Or it will false trigger


> > +static int iommufd_option(struct iommufd_ucmd *ucmd)
> > +{
> > +	struct iommu_option *cmd = ucmd->cmd;
> > +	int rc;
> > +
> 
> lack of __reserved check

Done

> >  static struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
> >  	IOCTL_OP(IOMMU_DESTROY, iommufd_destroy, struct
> > iommu_destroy, id),
> > +	IOCTL_OP(IOMMU_IOAS_ALLOC, iommufd_ioas_alloc_ioctl,
> > +		 struct iommu_ioas_alloc, out_ioas_id),
> > +	IOCTL_OP(IOMMU_IOAS_ALLOW_IOVAS, iommufd_ioas_allow_iovas,
> > +		 struct iommu_ioas_allow_iovas, allowed_iovas),
> > +	IOCTL_OP(IOMMU_IOAS_COPY, iommufd_ioas_copy, struct
> > iommu_ioas_copy,
> > +		 src_iova),
> > +	IOCTL_OP(IOMMU_IOAS_IOVA_RANGES, iommufd_ioas_iova_ranges,
> > +		 struct iommu_ioas_iova_ranges, out_iova_alignment),
> > +	IOCTL_OP(IOMMU_IOAS_MAP, iommufd_ioas_map, struct
> > iommu_ioas_map,
> > +		 __reserved),
> > +	IOCTL_OP(IOMMU_IOAS_UNMAP, iommufd_ioas_unmap, struct
> > iommu_ioas_unmap,
> > +		 length),
> > +	IOCTL_OP(IOMMU_OPTION, iommufd_option, struct iommu_option,
> > +		 val64),
> >  };
> 
> Just personal preference - it reads better to me if the above order (and
> the enum definition in iommufd.h) can be same as how those commands
> are defined/explained in iommufd.h.

I prefer "keep sorted" for these kinds of lists, it is much easier to
maintain in the long run

> > + * Query an IOAS for ranges of allowed IOVAs. Mapping IOVA outside these
> > ranges
> > + * is not allowed. out_num_iovas will be set to the total number of iovas
> > and
> > + * the out_valid_iovas[] will be filled in as space permits.
> 
> out_num_iovas and out_valid_iovas[] are stale.

Done

> > + *
> > + * The allowed ranges are dependent on the HW path the DMA operation
> > takes, and
> > + * can change during the lifetime of the IOAS. A fresh empty IOAS will have a
> > + * full range, and each attached device will narrow the ranges based on that
> > + * devices HW restrictions. Detatching a device can widen the ranges.
> 
> devices -> device's

Ok

> > +/**
> > + * struct iommu_ioas_allow_iovas - ioctl(IOMMU_IOAS_ALLOW_IOVAS)
> > + * @size: sizeof(struct iommu_ioas_allow_iovas)
> > + * @ioas_id: IOAS ID to allow IOVAs from
> 
> missed num_iovas and __reserved

Hurm. how do I get make W=1 to cover the header files?

> > + * @allowed_iovas: Pointer to array of struct iommu_iova_range
> > + *
> > + * Ensure a range of IOVAs are always available for allocation. If this call
> > + * succeeds then IOMMU_IOAS_IOVA_RANGES will never return a list of
> > IOVA ranges
> > + * that are narrower than the ranges provided here. This call will fail if
> > + * IOMMU_IOAS_IOVA_RANGES is currently narrower than the given ranges.
> > + *
> > + * When an IOAS is first created the IOVA_RANGES will be maximally sized,
> > and as
> > + * devices are attached the IOVA will narrow based on the device
> > restrictions.
> > + * When an allowed range is specified any narrowing will be refused, ie
> > device
> > + * attachment can fail if the device requires limiting within the allowed
> > range.
> > + *
> > + * Automatic IOVA allocation is also impacted by this call. MAP will only
> > + * allocate within the allowed IOVAs if they are present.
> 
> According to iopt_check_iova() FIXED_IOVA can specify an iova which
> is not in allowed list but in the list of reported IOVA_RANGES. Is it
> correct or make more sense to have FIXED_IOVA also under guard of
> the allowed list (if violating then fail the map call)?

The concept was the allow list only really impacts domain
attachment. When a user uses FIXED they have to know what they are
doing. There is not a good reason to deny the user to use any IOVA
that is not restricted by the reserved list.

> > + * @length: Number of bytes to unmap, and return back the bytes
> > unmapped
> > + *
> > + * Unmap an IOVA range. The iova/length must be a superset of a
> > previously
> > + * mapped range used with IOMMU_IOAS_PAGETABLE_MAP or COPY.
> 
> remove 'PAGETABLE'

Done

> 
> > +/**
> > + * enum iommufd_option
> > + * @IOMMU_OPTION_RLIMIT_MODE:
> > + *    Change how RLIMIT_MEMLOCK accounting works. The caller must have
> > privilege
> > + *    to invoke this. Value 0 (default) is user based accouting, 1 uses process
> > + *    based accounting. Global option, object_id must be 0
> > + * @IOMMU_OPTION_HUGE_PAGES:
> > + *    Value 1 (default) allows contiguous pages to be combined when
> > generating
> > + *    iommu mappings. Value 0 disables combining, everything is mapped to
> > + *    PAGE_SIZE. This can be useful for benchmarking.  This is a per-IOAS
> > + *    option, the object_id must be the IOAS ID.
> 
> What about HWPT ID? Is there value of supporting HWPT's with different
> mapping size attached to the same IOAS?

This is a global IOAS flag, it just makes everything use PAGE_SIZE. It
is really only interesting for debugging and benchmarking. The test
suite and syzkaller have both made use of this to improve coverage.

> > +/**
> > + * @size: sizeof(struct iommu_option)
> > + * @option_id: One of enum iommufd_option
> > + * @op: One of enum iommufd_option_ops
> > + * @__reserved: Must be 0
> > + * @object_id: ID of the object if required
> > + * @val64: Option value to set or value returned on get
> > + *
> > + * Change a simple option value. This multiplexor allows controlling a
> > options
> > + * on objects. IOMMU_OPTION_OP_SET will load an option and
> > IOMMU_OPTION_OP_GET
> > + * will return the current value.
> > + */
> 
> This is quite generic. Does it imply that future device capability reporting
> can be also implemented based on this cmd, i.e. have OP_GET on a
> device object?

I don't view this as a way to do capabilities. I think we will have a
capability ioctl as well. This is really for something that can be
get & set.

Thanks,
Jason

  reply	other threads:[~2022-11-07 15:02 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-25 18:12 [PATCH v3 00/15] IOMMUFD Generic interface Jason Gunthorpe
2022-10-25 18:12 ` [PATCH v3 01/15] iommu: Add IOMMU_CAP_ENFORCE_CACHE_COHERENCY Jason Gunthorpe
2022-10-26 12:45   ` Baolu Lu
2022-11-03  5:03   ` Tian, Kevin
2022-11-04 19:25     ` Jason Gunthorpe
2022-10-25 18:12 ` [PATCH v3 02/15] iommu: Add device-centric DMA ownership interfaces Jason Gunthorpe
2022-11-03  5:11   ` Tian, Kevin
2022-11-04 19:32     ` Jason Gunthorpe
2022-10-25 18:12 ` [PATCH v3 03/15] interval-tree: Add a utility to iterate over spans in an interval tree Jason Gunthorpe
2022-11-03  5:31   ` Tian, Kevin
2022-11-04 19:38     ` Jason Gunthorpe
2022-11-05  1:32       ` Tian, Kevin
2022-11-05  1:48       ` Matthew Wilcox
2022-11-07 14:38         ` Jason Gunthorpe
2022-10-25 18:12 ` [PATCH v3 04/15] iommufd: Overview documentation Jason Gunthorpe
2022-10-26  4:17   ` Bagas Sanjaya
2022-10-28 19:09     ` Jason Gunthorpe
2022-10-25 18:12 ` [PATCH v3 05/15] iommufd: File descriptor, context, kconfig and makefiles Jason Gunthorpe
2022-10-26 12:58   ` Baolu Lu
2022-10-26 17:14     ` Jason Gunthorpe
2022-10-29  3:43       ` Baolu Lu
2022-11-03  7:22   ` Tian, Kevin
2022-11-07 17:00     ` Jason Gunthorpe
2022-10-25 18:12 ` [PATCH v3 06/15] kernel/user: Allow user::locked_vm to be usable for iommufd Jason Gunthorpe
2022-11-03  7:23   ` Tian, Kevin
2022-10-25 18:12 ` [PATCH v3 07/15] iommufd: PFN handling for iopt_pages Jason Gunthorpe
2022-11-01 19:38   ` Nicolin Chen
2022-11-02 13:13     ` Jason Gunthorpe
2022-10-25 18:12 ` [PATCH v3 08/15] iommufd: Algorithms for PFN storage Jason Gunthorpe
2022-10-31 16:01   ` [PATCH v3 8/15] " Jason Gunthorpe
2022-11-01 16:09   ` Jason Gunthorpe
2022-11-03 20:08   ` Jason Gunthorpe
2022-11-04 16:26     ` Jason Gunthorpe
2022-11-04 16:04   ` Jason Gunthorpe
2022-10-25 18:12 ` [PATCH v3 09/15] iommufd: Data structure to provide IOVA to PFN mapping Jason Gunthorpe
2022-10-26 18:46   ` [PATCH v3 9/15] " Jason Gunthorpe
2022-10-27 11:37   ` Jason Gunthorpe
2022-10-27 13:35   ` Jason Gunthorpe
2022-10-28 18:52   ` Jason Gunthorpe
2022-11-01 19:17   ` [PATCH v3 09/15] " Nicolin Chen
2022-11-02 13:11     ` Jason Gunthorpe
2022-10-25 18:12 ` [PATCH v3 10/15] iommufd: IOCTLs for the io_pagetable Jason Gunthorpe
2022-10-26 17:01   ` Jason Gunthorpe
2022-10-26 23:17   ` Jason Gunthorpe
2022-10-29  7:25   ` Baolu Lu
2022-11-07 14:17     ` Jason Gunthorpe
2022-11-04  8:32   ` Tian, Kevin
2022-11-07 15:02     ` Jason Gunthorpe [this message]
2022-11-08  2:05       ` Tian, Kevin
2022-11-08 17:29         ` Jason Gunthorpe
2022-11-09  2:50           ` Tian, Kevin
2022-11-09 13:05             ` Jason Gunthorpe
2022-10-25 18:12 ` [PATCH v3 11/15] iommufd: Add a HW pagetable object Jason Gunthorpe
2022-11-04 10:00   ` Tian, Kevin
2022-10-25 18:12 ` [PATCH v3 12/15] iommufd: Add kAPI toward external drivers for physical devices Jason Gunthorpe
2022-10-29  7:19   ` Baolu Lu
2022-11-07 14:14     ` Jason Gunthorpe
2022-11-05  7:17   ` Tian, Kevin
2022-11-07 17:54     ` Jason Gunthorpe
2022-11-08  2:17       ` Tian, Kevin
2022-10-25 18:12 ` [PATCH v3 13/15] iommufd: Add kAPI toward external drivers for kernel access Jason Gunthorpe
2022-10-25 18:12 ` [PATCH v3 14/15] iommufd: vfio container FD ioctl compatibility Jason Gunthorpe
2022-10-27 14:12   ` Jason Gunthorpe
2022-11-01 19:45   ` Nicolin Chen
2022-11-02 13:15     ` Jason Gunthorpe
2022-11-05  0:07   ` Baolu Lu
2022-11-07 14:23     ` Jason Gunthorpe
2022-11-05  9:31   ` Tian, Kevin
2022-11-07 17:08     ` Jason Gunthorpe
2022-11-07 23:53       ` Tian, Kevin
2022-11-08  0:09         ` Jason Gunthorpe
2022-11-08  0:13           ` Tian, Kevin
2022-11-08  0:17             ` Jason Gunthorpe
2022-10-25 18:12 ` [PATCH v3 15/15] iommufd: Add a selftest Jason Gunthorpe
2022-11-01 20:32   ` Nicolin Chen
2022-11-02 13:17     ` Jason Gunthorpe
2022-11-02 18:49       ` Nathan Chancellor
2022-11-04  1:01   ` Jason Gunthorpe
2022-11-04  5:43     ` Tian, Kevin
2022-11-04 19:42       ` Jason Gunthorpe
2022-10-28 23:57 ` [PATCH v3 00/15] IOMMUFD Generic interface Nicolin Chen
2022-11-04 21:27 ` Alex Williamson
2022-11-04 22:03   ` Alex Williamson
2022-11-07 14:22     ` Jason Gunthorpe
2022-11-07 14:19   ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y2kd/Ptt0iR6SGsh@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=bpf@vger.kernel.org \
    --cc=chaitanyak@nvidia.com \
    --cc=cohuck@redhat.com \
    --cc=corbet@lwn.net \
    --cc=daniel.m.jordan@oracle.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=dwmw2@infradead.org \
    --cc=eric.auger@redhat.com \
    --cc=farman@linux.ibm.com \
    --cc=iommu@lists.linux.dev \
    --cc=jasowang@redhat.com \
    --cc=jean-philippe@linaro.org \
    --cc=joao.m.martins@oracle.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=llvm@lists.linux.dev \
    --cc=mjrosato@linux.ibm.com \
    --cc=mst@redhat.com \
    --cc=nathan@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=nicolinc@nvidia.com \
    --cc=ojeda@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=schnelle@linux.ibm.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=shuah@kernel.org \
    --cc=suravee.suthikulpanit@amd.com \
    --cc=trix@redhat.com \
    --cc=will@kernel.org \
    --cc=yi.l.liu@intel.com \
    --cc=zhukeqian1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.