From: Arvind Sankar <nivedita@alum.mit.edu>
To: Christoph Hellwig <hch@lst.de>
Cc: Arvind Sankar <nivedita@alum.mit.edu>,
	Christoph Hellwig <hch@infradead.org>,
	linux-kernel@vger.kernel.org
Subject: Re: ehci-pci breakage with dma-mapping changes in 5.4-rc2
Date: Tue, 8 Oct 2019 11:47:31 -0400	[thread overview]
Message-ID: <20191008154731.GA616259@rani.riverdale.lan> (raw)
In-Reply-To: <20191008115103.GA463127@rani.riverdale.lan>

On Tue, Oct 08, 2019 at 07:51:03AM -0400, Arvind Sankar wrote:
> On Tue, Oct 08, 2019 at 09:32:10AM +0200, Christoph Hellwig wrote:
> > On Mon, Oct 07, 2019 at 07:54:02PM -0400, Arvind Sankar wrote:
> > > > Do you want me to resend the patch as its own mail, or do you just take
> > > > it with a Tested-by: from me? If the former, I assume you're ok with me
> > > > adding your Signed-off-by?
> > > > 
> > > > Thanks
> > > 
> > > A question on the original change though -- what happens if a single
> > > device (or a single IOMMU domain really) does want >4G DMA address
> > > space? Was that not previously allowed either?
> > 
> > Your EHCI device actually supports the larger addressing.  Without an
> > IOMMU (or with accidentally enabled passthrough mode as in your report)
> > that will use bounce buffers for physical addresses that are too large.
> > With an IOMMU we can just remap, and by default those remap addresses
> > are under 32-bit just to make everyone's life easier.
> > 
> > The dma_get_required_mask function is unfortunately misnamed; what it
> > really means is the optimal mask, that is, one that avoids bounce
> > buffering or other complications.
> 
> I understand that my EHCI device, even though it only supports 32-bit
> addressing, will be able to DMA into anywhere in physical RAM, whether
> below 4G or not, via the IOMMU or bounce buffering.
> 
> What I mean is, do there exist devices (which would necessarily support
> 64-bit DMA) that want to DMA using buffers bigger than 4GB? E.g. a GPU
> accelerator card with 16GB of RAM on-board that wants to map 6GB for
> DMA in one go, or 5 accelerator cards that are in one IOMMU domain and
> want to simultaneously map 1GB each.

Ok, I see that almost nothing actually uses dma_get_required_mask. So if
something did need >4GB of DMA space, the IOMMU would allocate it anyway
regardless of dma_get_required_mask.

I'm thinking this means that we actually only needed to change
dma_get_required_mask to dma_direct_get_required_mask in
iommu_need_mapping, and the rest of the change is unnecessary?
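
Roughly, the only hunk that change would need is something like the
below. This is just an untested sketch -- the context line is from my
memory of the 5.4-rc2 drivers/iommu/intel-iommu.c and may not match
exactly, and the file would presumably also need <linux/dma-direct.h>
for the dma_direct_get_required_mask declaration:

--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ static bool iommu_need_mapping(struct device *dev)
-		if (dma_mask >= dma_get_required_mask(dev))
+		if (dma_mask >= dma_direct_get_required_mask(dev))
 			return false;

That way the passthrough decision would be based on what direct mapping
of the installed physical memory actually needs, rather than on the
32-bit answer the IOMMU dma ops hand back by default.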

Below is a list of the current callers of dma_get_required_mask:

/usr/src/git/linux # find . -type f -name \*.[ch] | xargs grep -w dma_get_required_mask

./drivers/mmc/host/sdhci-of-dwcmshc.c:  extra = DIV_ROUND_UP_ULL(dma_get_required_mask(&pdev->dev), SZ_128M);
./drivers/pci/controller/vmd.c: return dma_get_required_mask(to_vmd_dev(dev));
./drivers/gpu/drm/etnaviv/etnaviv_gpu.c:                u32 dma_mask = (u32)dma_get_required_mask(gpu->dev);
./drivers/message/fusion/mptbase.c:             const uint64_t required_mask = dma_get_required_mask
./drivers/scsi/dpt_i2o.c:           dma_get_required_mask(&pDev->dev) > DMA_BIT_MASK(32) &&
./drivers/scsi/aic7xxx/aic79xx_osm_pci.c:               const u64 required_mask = dma_get_required_mask(dev);
./drivers/scsi/aic7xxx/aic7xxx_osm_pci.c:           && dma_get_required_mask(dev) > DMA_BIT_MASK(32)) {
./drivers/scsi/mpt3sas/mpt3sas_base.c:  required_mask = dma_get_required_mask(&pdev->dev);
./drivers/scsi/qla2xxx/qla_os.c:                if (MSD(dma_get_required_mask(&ha->pdev->dev)) &&
./drivers/scsi/aacraid/aachba.c:        if (dma_get_required_mask(&dev->pdev->dev) > DMA_BIT_MASK(32))
./drivers/scsi/aacraid/comminit.c:                              dma_get_required_mask(&dev->pdev->dev) >> 12;
./drivers/scsi/esas2r/esas2r_init.c:        dma_get_required_mask(&pcid->dev) > DMA_BIT_MASK(32) &&
./drivers/infiniband/sw/rxe/rxe_verbs.c:                                     dma_get_required_mask(&dev->dev));
./include/linux/dma-mapping.h:u64 dma_get_required_mask(struct device *dev);
./include/linux/dma-mapping.h:static inline u64 dma_get_required_mask(struct device *dev)
./include/linux/dma-mapping.h:                      dma_get_required_mask(dev);
./kernel/dma/mapping.c:u64 dma_get_required_mask(struct device *dev)
./kernel/dma/mapping.c:EXPORT_SYMBOL_GPL(dma_get_required_mask);

For the drivers that do currently use dma_get_required_mask, I believe
we will need to replace most of those calls with
dma_direct_get_required_mask as well to maintain passthrough
functionality -- the fusion, scsi, and infiniband drivers all seem to be
using this call to determine whether to expose the device's 64-bit DMA
capability or not. With the change they will think they only need 32-bit
DMA, which in turn will disable passthrough on them.
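
To be concrete, the shape of the check those drivers use is roughly the
below (a condensed sketch of my own, not copied from any one driver,
and example_configure_dma is a made-up name): 64-bit DMA only gets
enabled when dma_get_required_mask reports something wider than 32 bits.

#include <linux/dma-mapping.h>
#include <linux/pci.h>

static int example_configure_dma(struct pci_dev *pdev)
{
	struct device *dev = &pdev->dev;

	/* Only bother with a 64-bit mask if the platform says it is needed. */
	if (sizeof(dma_addr_t) > 4 &&
	    dma_get_required_mask(dev) > DMA_BIT_MASK(32) &&
	    dma_set_mask(dev, DMA_BIT_MASK(64)) == 0)
		return 0;

	/* Otherwise fall back to (or stay at) a 32-bit mask. */
	return dma_set_mask(dev, DMA_BIT_MASK(32));
}

If dma_get_required_mask now returns a 32-bit answer just because the
device sits behind the IOMMU, that first branch is never taken, the
driver keeps a 32-bit mask, and the device then no longer qualifies for
passthrough -- which is the problem described above.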

The etnaviv driver is doing something funky that I'm not sure about, but
I *think* that one might want the real physical range as well. The mmc
driver I think might be ok with the 32-bit range.

The vmd controller one I'm not sure about.

In dma-mapping.h, the function is used in dma_addressing_limited, which
in turn is used in a couple of AMD DRM drivers; again, I'm not sure
about these.
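
For reference, the helper I mean is roughly this (quoting the 5.4
definition from memory, so the field names may be slightly off):

static inline bool dma_addressing_limited(struct device *dev)
{
	return min_not_zero(dma_get_mask(dev), dev->bus_dma_mask) <
			    dma_get_required_mask(dev);
}

Whether comparing against the IOMMU's 32-bit answer instead of the
physical-memory span matters for those DRM users is exactly what I'm
unsure about.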


Thread overview: 29+ messages
2019-10-07  2:24 ehci-pci breakage with dma-mapping changes in 5.4-rc2 Arvind Sankar
2019-10-07  7:34 ` Christoph Hellwig
2019-10-07 17:54   ` Arvind Sankar
2019-10-07 17:55     ` Christoph Hellwig
2019-10-07 17:56       ` Christoph Hellwig
2019-10-07 17:58         ` Arvind Sankar
2019-10-07 18:32           ` Arvind Sankar
2019-10-07 18:47             ` Christoph Hellwig
2019-10-07 22:10               ` Arvind Sankar
2019-10-07 23:54                 ` Arvind Sankar
2019-10-08  7:32                   ` Christoph Hellwig
2019-10-08 11:51                     ` Arvind Sankar
2019-10-08 12:50                       ` Christoph Hellwig
2019-10-08 15:47                       ` Arvind Sankar [this message]
2019-10-09  6:50                         ` Christoph Hellwig
2019-10-09 13:48                           ` Arvind Sankar
2019-10-08  7:29                 ` Christoph Hellwig
2019-10-08 14:33                   ` [PATCH] iommu/vt-d: Return the correct dma mask when we are bypassing the IOMMU Arvind Sankar
2019-10-09  2:45                     ` Lu Baolu
2019-10-09  6:51                       ` Christoph Hellwig
2019-10-10  1:24                         ` Lu Baolu
2019-10-10  1:26                     ` Lu Baolu
2019-10-16 19:15                       ` Arvind Sankar
2019-10-17  7:08                         ` Christoph Hellwig
2019-10-17 15:55                           ` Arvind Sankar
2019-10-18  9:50                           ` Joerg Roedel
2019-10-18 15:14                             ` Christoph Hellwig
2019-10-18 15:21                               ` Joerg Roedel
2019-10-18 15:22                                 ` Christoph Hellwig
