All of lore.kernel.org
 help / color / mirror / Atom feed
From: Robin Murphy <robin.murphy@arm.com>
To: David Woodhouse <dwmw2@infradead.org>, joro@8bytes.org
Cc: ashok.raj@intel.com, leedom@chelsio.com, Harsh@chelsio.com,
	herbert@gondor.apana.org.au, iommu@lists.linux-foundation.org,
	linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] iommu/vt-d: Fix scatterlist offset handling
Date: Tue, 3 Oct 2017 19:05:17 +0100	[thread overview]
Message-ID: <db8c70d7-5978-20e8-d29a-91a4312f97dd@arm.com> (raw)
In-Reply-To: <1507035334.29211.105.camel@infradead.org>

On 03/10/17 13:55, David Woodhouse wrote:
> On Thu, 2017-09-28 at 15:14 +0100, Robin Murphy wrote:
>> The intel-iommu DMA ops fail to correctly handle scatterlists where
>> sg->offset is greater than PAGE_SIZE - the IOVA allocation is computed
>> appropriately based on the page-aligned portion of the offset, but the
>> mapping is set up relative to sg->page, which means it fails to actually
>> cover the whole buffer (and in the worst case doesn't cover it at all):
>>
>>     (sg->dma_address + sg->dma_len) ----+
>>     sg->dma_address ---------+          |
>>     iov_pfn------+           |          |
>>                  |           |          |
>>                  v           v          v
>> iova:   a        b        c        d        e        f
>>         |--------|--------|--------|--------|--------|
>>                           <...calculated....>
>>                  [_____mapped______]
>> pfn:    0        1        2        3        4        5
>>         |--------|--------|--------|--------|--------|
>>                  ^           ^          ^
>>                  |           |          |
>>     sg->page ----+           |          |
>>     sg->offset --------------+          |
>>     (sg->offset + sg->length) ----------+
> 
> I'd still dearly love to see some clear documentation of what it means
> for sg->offset to be outside the page referenced by sg->page.

I think the key is that for each SG segment, sg->page doesn't
necessarily represent "a" page, but the first of one or more contiguous
pages. Disregarding offsets for the moment, Here's a typical example of
a 120KB buffer from the block layer as processed by iommu_dma_map_sg():

[   16.092649] == initial (4) ==
[   16.095591]  0: virt ffff800001372000	phys 0x0000000081372000	dma 0x0000000000000000
[   16.095591] 		offset 0x00000000	length 0x0000e000	dma_len 0x00000000
[   16.109541]  1: virt ffff800001380000	phys 0x0000000081380000	dma 0x0000000000000000
[   16.109541] 		offset 0x00000000	length 0x0000d000	dma_len 0x00000000
[   16.123491]  2: virt ffff80000138e000	phys 0x000000008138e000	dma 0x0000000000000000
[   16.123491] 		offset 0x00000000	length 0x00002000	dma_len 0x00000000
[   16.137440]  3: virt ffff800001390000	phys 0x0000000081390000	dma 0x0000000000000000
[   16.137440] 		offset 0x00000000	length 0x00001000	dma_len 0x00000000
[   16.216167] == final   (2) ==
[   16.219106]  0: virt ffff800001372000	phys 0x0000000081372000	dma 0x00000000ffb60000
[   16.219106] 		offset 0x00000000	length 0x0000e000	dma_len 0x0000e000
[   16.233056]  1: virt ffff800001380000	phys 0x0000000081380000	dma 0x00000000ffb70000
[   16.233056] 		offset 0x00000000	length 0x0000d000	dma_len 0x00010000

i.e. segments of 14 pages, 13 pages, 2 pages and 1 page respectively
(and we further merge the resulting DMA-contiguous segments on top of
that).

Now, there are indeed plenty of drivers and subsystems which do work on
lists of explicitly single pages - anything doing some variant of
"addr = kmap_atomic(sg_page(sg)) + sg->offset;" is easy to spot - but I
don't think DMA API implementations are in a position to make any kind
of assumption; nearly all of them just shut up and handle sg->length
bytes from sg_phys(sg) without questioning the caller, and I reckon
that's exactly what they should be doing.

> Or is it really not "outside", and it's *only* valid for the offset to
> be > PAGE_OFFSET when it's a huge page, so we can check that with a
> BUG_ON() ? 
> 
> In particular, I'd like to know what is intended in the Xen PV case,
> where there isn't a straight correspondence between pfn and mfn. Is the
> out-of-range sg->offset intended to refer to the next *pfn* after sg-
>> page, or to the next *mfn* after sg->page? 

Logically, it should mean the same thing as whatever a length of more
than 1 page means to Xen - judging by blkif_queue_rw_req() at least,
that seems to be a BUG_ON() in both cases.

> I confess I've only followed this thread vaguely, but I haven't seen a
> *coherent* explanation except in the huge page case (in which case I
> want to see that BUG_ON in the patch) of why this isn't just totally
> bogus.

As I've said before, I'd certainly consider it a denormalised case, but
not a bogus one, and certainly not something that is the DMA API's job
to police. Having now audited every dma_map_ops::map_sg implementation I
could find, the only ones not using sg_phys()/sg_virt() or some other
construction immune to the absolute offset value (MIPS even explicitly
normalises it) are intel-iommu and arch/frv, and the latter is clearly
broken anyway as it ignores sg->length.

Robin.

  reply	other threads:[~2017-10-03 18:05 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-28 14:14 [PATCH] iommu/vt-d: Fix scatterlist offset handling Robin Murphy
2017-09-28 16:17 ` Casey Leedom
2017-09-28 13:29   ` Raj, Ashok
2017-09-28 16:59     ` Robin Murphy
2017-09-28 15:43       ` Raj, Ashok
2017-10-03 19:36         ` Raj, Ashok
2017-09-29  8:14 ` Harsh Jain
     [not found]   ` <fe25071a-18bf-e468-01e7-36515f2110e2-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>
2017-09-29 16:18     ` Casey Leedom
2017-09-29 16:18       ` Casey Leedom
     [not found]       ` <MWHPR12MB160034E91A834504FE85C07BC87E0-Gy0DoCVfaSVsWITs4OkDoAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2017-10-03 12:22         ` Harsh Jain
2017-10-03 22:22           ` Casey Leedom
     [not found] ` <644c3e01654f8bd48d669c36e424959d6ef0e27e.1506607370.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2017-10-03 12:55   ` David Woodhouse
2017-10-03 12:55     ` David Woodhouse
2017-10-03 18:05     ` Robin Murphy [this message]
2017-10-03 22:16       ` David Woodhouse
2017-10-04 11:18         ` Robin Murphy
2017-10-06 14:43       ` Joerg Roedel
2017-10-06 12:54         ` Raj, Ashok
     [not found]         ` <20171006144309.GA30803-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2017-11-06 18:47           ` Jacob Pan
2017-11-06 18:47             ` Jacob Pan
2017-11-15 23:54             ` Jacob Pan
2017-11-15 23:54               ` Jacob Pan
2017-11-16 21:32               ` Alex Williamson
     [not found]                 ` <20171116143244.2583d044-1yVPhWWZRC1BDLzU/O5InQ@public.gmane.org>
2017-11-16 21:09                   ` Raj, Ashok
2017-11-16 21:09                     ` Raj, Ashok
2017-11-17 16:18                     ` Alex Williamson
2017-11-17 16:18                       ` Alex Williamson
2017-11-17 15:48                       ` Raj, Ashok
2017-11-17 17:44                         ` Casey Leedom
2017-11-17 17:44                           ` Casey Leedom
2017-11-17 17:44                           ` Casey Leedom
     [not found]                           ` <SN1PR12MB035214EF471935B6F4220E36C82F0-z7L1TMIYDg4e2a8M8f4RFAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2017-11-17 18:09                             ` Jacob Pan
2017-11-17 18:09                               ` Jacob Pan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=db8c70d7-5978-20e8-d29a-91a4312f97dd@arm.com \
    --to=robin.murphy@arm.com \
    --cc=Harsh@chelsio.com \
    --cc=ashok.raj@intel.com \
    --cc=dwmw2@infradead.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=iommu@lists.linux-foundation.org \
    --cc=joro@8bytes.org \
    --cc=leedom@chelsio.com \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.