IOMMU Archive on lore.kernel.org
 help / color / Atom feed
From: Lu Baolu <baolu.lu@linux.intel.com>
To: AM <am.online.edu@gmail.com>, iommu@lists.linux-foundation.org
Subject: Re: kernel BUG at drivers/iommu/intel-iommu.c:667!
Date: Wed, 4 Dec 2019 10:26:05 +0800
Message-ID: <b9b0cd40-560a-089d-0582-5f180f92c5c0@linux.intel.com> (raw)
In-Reply-To: <CAL20ACLnsBVG6g6BBVASP9jYBzXQJR-y7WRY-D4a8Rv+aZtWCg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4451 bytes --]

Hi,

On 12/4/19 3:22 AM, AM wrote:
> Hi Lu Baolu,
> 
> I tried kernel 4.18.0-147.6.el8.x86_64+debug and used the following API 
> sequence for mapping multiple hugepages:
> 
> get_user_pages_fast()
> sg_alloc_table_from_pages()
> // also tried sg_alloc_table() w/ sg_set_page() using 1GB size for each 
> entry
> dma_map_sg()
> 
> I'm able to DMA upto 1GB successfully and validate the data. Also, DMA 
> above 1GB completes w/o any error, but data isn't correct starting 
> immediately after 1GB i.e. second GB offset 0x40000000 starts showing 
> data mismatches.

I am not sure whether you followed the right way to build a sg list. But
I do care that Intel IOMMU does dma_map_sg() right.

Can you please try the attached patch? It can help you get more tracing
information when you call dma_map_sg().

Best regards,
baolu

> 
> I've used get_user_pages_fast() in two ways to no avail:
> 1. populate page array w/ first page of 1GB hugepage and used 
> sg_set_page() for
>     setting 1GB size of the page entry. This debugging effort uses the fact
>     that all pages following the first page of huge page start address 
> are contiguous.
>     Ideally dma_map_sg() should coalesce contiguous pages, and my 
> intention was to collect
>     more data from debugging.
> 2. populate page array w/ all pages from all hugepages
> 
> Thanks,
> -am
> 
> 
> 
> On Sun, Dec 1, 2019 at 7:33 PM Anand Misra <am.online.edu@gmail.com 
> <mailto:am.online.edu@gmail.com>> wrote:
> 
>     [+iommu_list]
> 
>     Application isn't aware of hugepage but a userspace (lower) level
>     stack is aware of the type of memory being allocated on behalf of
>     application, which in turn communicates w/ driver via ioctl. I'm
>     trying to make it more agnostic by using dma_map_sg() when multiple
>     GBs are required by application. Otherwise, I'm using
>     dmap_map_page(). Admittedly, I'm learning these concepts/APIs for
>     Linux along the way.
> 
>     Thanks,
>     -am
> 
> 
>     On Sun, Dec 1, 2019 at 7:12 PM Lu Baolu <baolu.lu@linux.intel.com
>     <mailto:baolu.lu@linux.intel.com>> wrote:
> 
>         Hi,
> 
>         On 12/2/19 11:00 AM, Anand Misra wrote:
>          > Thanks, Lu Baolu. This is the dev version we've in our
>         company. I can
>          > try on a Ubuntu with a recent kernel version. Although, do
>         you think I'm  > going in the right direction? Is it possible to
>         have multiple hugepages
>          > mapped via iommu to get contiguous mapping for DMA?
>          >
> 
>         You mentioned:
> 
>         "
>         I'm trying to use iommu for multiple hugepages (mmap'ed by
>         process and
>         pushed to driver via ioctl). The expectation is to have multiple
>         hugepages mapped via iommu with each huge page having an entry
>         in iommu
>         (i.e. minimize table walk for DMA). Is this possible?
>         "
> 
>         Currently huge page mapping is hidden in iommu driver according
>         to the
>         iommu capability and size of map range. Why do you want to be
>         aware of
>         it in driver or even application level?
> 
>         Best regards,
>         baolu
> 
>          > -am
>          >
>          >
>          > On Sun, Dec 1, 2019 at 6:24 PM Lu Baolu
>         <baolu.lu@linux.intel.com <mailto:baolu.lu@linux.intel.com>
>          > <mailto:baolu.lu@linux.intel.com
>         <mailto:baolu.lu@linux.intel.com>>> wrote:
>          >
>          >     Hi,
>          >
>          >     On 12/2/19 9:46 AM, Anand Misra wrote:
>          >      > Hello:
>          >      >
>          >      > I'm in process of adding iommu support in my driver
>         for a PCIe
>          >     device.
>          >      > The device doesn't publish ACS/ATS via its config
>         space. I've
>          >     following
>          >      > config:
>          >      >
>          >      > Linux cmdline: "intel-iommu=on iommu=pt
>          >      > vfio_iommu_type1.allow_unsafe_interrupts=1
>          >     pcie_acs_override=downstream"
>          >      > Centos kernel: 3.10.0-1062.1.2.el7.x86_64
>          >      >
>          >
>          >     Can you please use the latest kernel for test? 3.10 seems
>         to be pretty
>          >     old and there are a lot of changes after it.
>          >
>          >     Best regards,
>          >     baolu
>          >
> 

[-- Attachment #2: 0001-iommu-vt-d-trace-Extend-map_sg-trace-event.patch --]
[-- Type: text/x-patch, Size: 2855 bytes --]

From c10422b2827b3fd4141ddac2601608ed6c883cea Mon Sep 17 00:00:00 2001
From: Lu Baolu <baolu.lu@linux.intel.com>
Date: Wed, 4 Dec 2019 10:10:20 +0800
Subject: [PATCH 1/1] iommu/vt-d: trace: Extend map_sg trace event

Current map_sg stores trace message in a coarse manner. This
extends it so that more detailed messages could be traced.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c        |  4 +--
 include/trace/events/intel_iommu.h | 43 +++++++++++++++++++++++++-----
 2 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 6db6d969e31c..b47b8ba5ac0f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3769,8 +3769,8 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
 		return 0;
 	}
 
-	trace_map_sg(dev, iova_pfn << PAGE_SHIFT,
-		     sg_phys(sglist), size << VTD_PAGE_SHIFT);
+	for_each_sg(sglist, sg, nelems, i)
+		trace_map_sg(dev, i + 1, nelems, sg);
 
 	return nelems;
 }
diff --git a/include/trace/events/intel_iommu.h b/include/trace/events/intel_iommu.h
index 54e61d456cdf..8b0199d80b75 100644
--- a/include/trace/events/intel_iommu.h
+++ b/include/trace/events/intel_iommu.h
@@ -49,12 +49,6 @@ DEFINE_EVENT(dma_map, map_single,
 	TP_ARGS(dev, dev_addr, phys_addr, size)
 );
 
-DEFINE_EVENT(dma_map, map_sg,
-	TP_PROTO(struct device *dev, dma_addr_t dev_addr, phys_addr_t phys_addr,
-		 size_t size),
-	TP_ARGS(dev, dev_addr, phys_addr, size)
-);
-
 DEFINE_EVENT(dma_map, bounce_map_single,
 	TP_PROTO(struct device *dev, dma_addr_t dev_addr, phys_addr_t phys_addr,
 		 size_t size),
@@ -99,6 +93,43 @@ DEFINE_EVENT(dma_unmap, bounce_unmap_single,
 	TP_ARGS(dev, dev_addr, size)
 );
 
+DECLARE_EVENT_CLASS(dma_map_sg,
+	TP_PROTO(struct device *dev, int index, int total,
+		 struct scatterlist *sg),
+
+	TP_ARGS(dev, index, total, sg),
+
+	TP_STRUCT__entry(
+		__string(dev_name, dev_name(dev))
+		__field(dma_addr_t, dev_addr)
+		__field(phys_addr_t, phys_addr)
+		__field(size_t,	size)
+		__field(int, index)
+		__field(int, total)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev_name, dev_name(dev));
+		__entry->dev_addr = sg->dma_address;
+		__entry->phys_addr = sg_phys(sg);
+		__entry->size = sg->dma_length;
+		__entry->index = index;
+		__entry->total = total;
+	),
+
+	TP_printk("dev=%s [%d/%d] dev_addr=0x%llx phys_addr=0x%llx size=%zu",
+		  __get_str(dev_name), __entry->index, __entry->total,
+		  (unsigned long long)__entry->dev_addr,
+		  (unsigned long long)__entry->phys_addr,
+		  __entry->size)
+);
+
+DEFINE_EVENT(dma_map_sg, map_sg,
+	TP_PROTO(struct device *dev, int index, int total,
+		 struct scatterlist *sg),
+	TP_ARGS(dev, index, total, sg)
+);
+
 #endif /* _TRACE_INTEL_IOMMU_H */
 
 /* This part must be outside protection */
-- 
2.17.1


[-- Attachment #3: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

      reply index

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-02  1:46 Anand Misra
2019-12-02  1:51 ` Anand Misra
2019-12-02  2:23 ` Lu Baolu
     [not found]   ` <CAL20ACLtwjDLaPattEkPiufsgHa2k-4Wb_Dw7Urh9we0QwbJfQ@mail.gmail.com>
     [not found]     ` <da7fb26f-022b-eaad-1a91-11cf15531f8a@linux.intel.com>
2019-12-02  3:33       ` Anand Misra
2019-12-03 19:22         ` AM
2019-12-04  2:26           ` Lu Baolu [this message]

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b9b0cd40-560a-089d-0582-5f180f92c5c0@linux.intel.com \
    --to=baolu.lu@linux.intel.com \
    --cc=am.online.edu@gmail.com \
    --cc=iommu@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

IOMMU Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-iommu/0 linux-iommu/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-iommu linux-iommu/ https://lore.kernel.org/linux-iommu \
		iommu@lists.linux-foundation.org
	public-inbox-index linux-iommu

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.linux-foundation.lists.iommu


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git