iommu.lists.linux-foundation.org archive mirror
* kernel BUG at drivers/iommu/intel-iommu.c:667!
@ 2019-12-02  1:46 Anand Misra
  2019-12-02  1:51 ` Anand Misra
  2019-12-02  2:23 ` Lu Baolu
  0 siblings, 2 replies; 6+ messages in thread
From: Anand Misra @ 2019-12-02  1:46 UTC (permalink / raw)
  To: iommu



Hello:

I'm in the process of adding iommu support to my driver for a PCIe device. The
device doesn't publish ACS/ATS via its config space. I have the following config:

Linux cmdline: "intel-iommu=on iommu=pt
vfio_iommu_type1.allow_unsafe_interrupts=1 pcie_acs_override=downstream"
Centos kernel: 3.10.0-1062.1.2.el7.x86_64

I'm trying to use the iommu for multiple hugepages (mmap'ed by the process and
pushed to the driver via ioctl). The expectation is to have multiple hugepages
mapped via the iommu, with each hugepage getting its own iommu entry (i.e. to
minimize the table walk for DMA). Is this possible?

[1] The driver ioctl has the following sequence:

1. get_user_pages_fast() for each hugepage start address for one page
2. sg_alloc_table_from_pages() using the page array from #1
3. dma_map_sg() for num hugepages using the sgt from #2
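
For reference, a minimal sketch of this pin/build/map sequence (hypothetical
mydrv_map_user_buf() helper, not the original driver code; this variant pins
every page of the buffer rather than only the head page of each hugepage, and
get_user_pages_fast() takes gup_flags on recent kernels vs. a write flag on
older ones):

#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/scatterlist.h>
#include <linux/dma-mapping.h>

static int mydrv_map_user_buf(struct device *dev, unsigned long uaddr,
                              size_t len, struct sg_table *sgt,
                              struct page ***pages_out, int *npages_out)
{
        unsigned long offset = offset_in_page(uaddr);
        int npages = DIV_ROUND_UP(offset + len, PAGE_SIZE);
        struct page **pages;
        int pinned, nents, ret;

        pages = kvmalloc_array(npages, sizeof(*pages), GFP_KERNEL);
        if (!pages)
                return -ENOMEM;

        pinned = get_user_pages_fast(uaddr & PAGE_MASK, npages, FOLL_WRITE,
                                     pages);
        if (pinned != npages) {
                ret = pinned < 0 ? pinned : -EFAULT;
                goto put_pages;
        }

        /* Coalesces physically contiguous pages into fewer sg entries. */
        ret = sg_alloc_table_from_pages(sgt, pages, npages, offset, len,
                                        GFP_KERNEL);
        if (ret)
                goto put_pages;

        nents = dma_map_sg(dev, sgt->sgl, sgt->orig_nents, DMA_BIDIRECTIONAL);
        if (!nents) {
                ret = -EIO;
                goto free_table;
        }

        *pages_out = pages;
        *npages_out = npages;
        return nents;

free_table:
        sg_free_table(sgt);
put_pages:
        while (pinned > 0)
                put_page(pages[--pinned]);
        kvfree(pages);
        return ret;
}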

I'm getting a kernel crash at #3 in "domain_get_iommu+0x55/0x70":

----------------------
[148794.896405] kernel BUG at drivers/iommu/intel-iommu.c:667!
[148794.896409] invalid opcode: 0000 [#1] SMP
[148794.896414] Modules linked in: mydrv(OE) nfsv3 nfs_acl nfs lockd grace
fscache xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4
nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c
br_netfilter bridge stp llc overlay(T) ipmi_devintf ipmi_msghandler sunrpc
vfat fat iTCO_wdt mei_wdt iTCO_vendor_support sb_edac intel_powerclamp
coretemp intel_rapl iosf_mbi kvm_intel kvm snd_hda_codec_hdmi irqbypass
crc32_pclmul ghash_clmulni_intel aesni_intel snd_hda_codec_realtek lrw
gf128mul glue_helper ablk_helper cryptd dell_smbios snd_hda_codec_generic
intel_wmi_thunderbolt dcdbas dell_wmi_descriptor pcspkr snd_hda_intel
snd_hda_codec snd_hda_core snd_hwdep i2c_i801 snd_seq snd_seq_device sg
snd_pcm lpc_ich ftdi_sio snd_timer
[148794.896522]  joydev snd mei_me mei soundcore pcc_cpufreq binfmt_misc
ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic sr_mod
cdrom nouveau video mxm_wmi i2c_algo_bit drm_kms_helper crct10dif_pclmul
crct10dif_common crc32c_intel serio_raw syscopyarea sysfillrect sysimgblt
fb_sys_fops ttm ahci drm libahci ata_generic e1000e pata_acpi libata ptp
pps_core drm_panel_orientation_quirks wmi [last unloaded: mydrv]
[148794.896587] CPU: 0 PID: 6020 Comm: TestIommu Kdump: loaded Tainted: G
        OE  ------------ T 3.10.0-1062.1.2.el7.x86_64 #1
[148794.896592] Hardware name: Dell Inc. Precision Tower 5810/0HHV7N, BIOS
A25 02/02/2018
[148794.896597] task: ffff8c82b6e0d230 ti: ffff8c8ac5b6c000 task.ti:
ffff8c8ac5b6c000
[148794.896601] RIP: 0010:[<ffffffff8efff195>]  [<ffffffff8efff195>]
domain_get_iommu+0x55/0x70
[148794.896611] RSP: 0018:ffff8c8ac5b6fce8  EFLAGS: 00010202
[148794.896614] RAX: ffff8c8adbeb0b00 RBX: ffff8c8ad4ac7600 RCX:
0000000000000000
[148794.896619] RDX: 00000000fffffff0 RSI: ffff8c8ace6e5940 RDI:
ffff8c8adbeb0b00
[148794.896622] RBP: ffff8c8ac5b6fce8 R08: 000000000001f0a0 R09:
ffffffff8f00255e
[148794.896626] R10: ffff8c8bdfc1f0a0 R11: fffff941bc39b940 R12:
0000000000000001
[148794.896630] R13: ffff8c4ce6b9d098 R14: 0000000000000000 R15:
ffff8c8ac8f22a00
[148794.896635] FS:  00007f1548320740(0000) GS:ffff8c8bdfc00000(0000)
knlGS:0000000000000000
[148794.896639] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[148794.896642] CR2: 00007f1547373689 CR3: 00000036f17c8000 CR4:
00000000003607f0
[148794.896647] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[148794.896651] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[148794.896654] Call Trace:
[148794.896660]  [<ffffffff8f002ee5>] intel_map_sg+0x65/0x1e0
[...]

----------------------


[2] I've also tried using the iommu APIs directly in my driver, but I get a
"PTE Read access is not set" fault on a DMA read when attempting DMA from host
to device memory (size 1KB).

DMAR: DRHD: handling fault status reg 2
DMAR: [DMA Read] Request device [02:00.0] fault addr ffffc030b000 [fault
reason 06] PTE Read access is not set

I see the following messages after the DMA failure (and eventually a system
crash):

DMAR: DRHD: handling fault status reg 100
DMAR: DRHD: handling fault status reg 100


I've used the following sequence with iommu APIs:

iommu_init:

    iommu_group = iommu_group_get(dev)

    iommu_domain = iommu_domain_alloc(&pci_bus_type)

    init_iova_domain(&iova_domain)

    iommu_attach_group(iommu_domain, iommu_group)

iommu_map:

    iova = alloc_iova(&iova_domain, size >> shift, end >> shift, true);

    addr = iova_dma_addr(&iova_domain, iova);

    iommu_map_sg(iommu_domain, addr, sgl, sgt->nents, IOMMU_READ | IOMMU_WRITE);
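
For reference, a minimal sketch of the above init/map sequence with the error
handling spelled out (hypothetical mydrv_* names, not the original driver;
signatures match the 4.18-era kernel used later in this thread, and newer
kernels add a gfp_t argument to iommu_map_sg()). One thing worth checking
against the "PTE Read access is not set" fault is the return value of
iommu_map_sg(): it is the number of bytes actually mapped, and any IOVA range
left unmapped has no PTE for the device to read through:

#include <linux/iommu.h>
#include <linux/iova.h>
#include <linux/pci.h>
#include <linux/scatterlist.h>

struct mydrv_iommu {
        struct iommu_group  *group;
        struct iommu_domain *domain;
        struct iova_domain  iovad;
};

static int mydrv_iommu_init(struct mydrv_iommu *m, struct device *dev)
{
        /* error unwinding (group/domain release) omitted for brevity */
        m->group = iommu_group_get(dev);
        if (!m->group)
                return -ENODEV;

        m->domain = iommu_domain_alloc(&pci_bus_type);
        if (!m->domain)
                return -ENOMEM;

        /* IOVA allocator: PAGE_SIZE granule, start at pfn 1 to avoid iova 0 */
        init_iova_domain(&m->iovad, PAGE_SIZE, 1);

        return iommu_attach_group(m->domain, m->group);
}

static dma_addr_t mydrv_iommu_map(struct mydrv_iommu *m, struct sg_table *sgt,
                                  size_t len, unsigned long end_pfn)
{
        unsigned long shift = iova_shift(&m->iovad);
        struct iova *iova;
        dma_addr_t addr;
        size_t mapped;

        iova = alloc_iova(&m->iovad, iova_align(&m->iovad, len) >> shift,
                          end_pfn, true);
        if (!iova)
                return 0;
        addr = iova_dma_addr(&m->iovad, iova);

        /* iommu_map_sg() returns the number of bytes it mapped; anything
         * short of "len" leaves part of the buffer without an IOMMU PTE,
         * and a DMA touching that range faults. */
        mapped = iommu_map_sg(m->domain, addr, sgt->sgl, sgt->orig_nents,
                              IOMMU_READ | IOMMU_WRITE);
        if (mapped < len) {
                if (mapped)
                        iommu_unmap(m->domain, addr, mapped);
                __free_iova(&m->iovad, iova);
                return 0;
        }
        return addr;
}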


Thanks,
am


* Re: kernel BUG at drivers/iommu/intel-iommu.c:667!
  2019-12-02  1:46 kernel BUG at drivers/iommu/intel-iommu.c:667! Anand Misra
@ 2019-12-02  1:51 ` Anand Misra
  2019-12-02  2:23 ` Lu Baolu
  1 sibling, 0 replies; 6+ messages in thread
From: Anand Misra @ 2019-12-02  1:51 UTC (permalink / raw)
  To: iommu



Correction:

1. get_user_pages_fast() for each hugepage start address for one page
2. sg_alloc_table_from_pages()  using page array from #1
3. dma_map_sg() for num hugepages using sgt from #2




On Sun, Dec 1, 2019 at 5:46 PM Anand Misra <am.online.edu@gmail.com> wrote:

> Hello:
>
> I'm in process of adding iommu support in my driver for a PCIe device. The
> device doesn't publish ACS/ATS via its config space. I've following config:
>
> Linux cmdline: "intel-iommu=on iommu=pt
> vfio_iommu_type1.allow_unsafe_interrupts=1 pcie_acs_override=downstream"
> Centos kernel: 3.10.0-1062.1.2.el7.x86_64
>
> I'm trying to use iommu for multiple hugepages (mmap'ed by process and
> pushed to driver via ioctl). The expectation is to have multiple hugepages
> mapped via iommu with each huge page having an entry in iommu (i.e.
> minimize table walk for DMA). Is this possible?
>
> [1] The driver ioctl has the following sequence:
>
> 1. get_user_pages_fast() for each hugepage start address for one page
> 2. sg_alloc_table_from_pages() using sgt from #3
> 3. dma_map_sg() for num hugepages using sgt from #4
>
> I'm getting kernel crash at #3 for "domain_get_iommu+0x55/0x70":
>
> ----------------------
> [148794.896405] kernel BUG at drivers/iommu/intel-iommu.c:667!
> [148794.896409] invalid opcode: 0000 [#1] SMP
> [148794.896414] Modules linked in: mydrv(OE) nfsv3 nfs_acl nfs lockd grace
> fscache xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4
> nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat
> nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c
> br_netfilter bridge stp llc overlay(T) ipmi_devintf ipmi_msghandler sunrpc
> vfat fat iTCO_wdt mei_wdt iTCO_vendor_support sb_edac intel_powerclamp
> coretemp intel_rapl iosf_mbi kvm_intel kvm snd_hda_codec_hdmi irqbypass
> crc32_pclmul ghash_clmulni_intel aesni_intel snd_hda_codec_realtek lrw
> gf128mul glue_helper ablk_helper cryptd dell_smbios snd_hda_codec_generic
> intel_wmi_thunderbolt dcdbas dell_wmi_descriptor pcspkr snd_hda_intel
> snd_hda_codec snd_hda_core snd_hwdep i2c_i801 snd_seq snd_seq_device sg
> snd_pcm lpc_ich ftdi_sio snd_timer
> [148794.896522]  joydev snd mei_me mei soundcore pcc_cpufreq binfmt_misc
> ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic sr_mod
> cdrom nouveau video mxm_wmi i2c_algo_bit drm_kms_helper crct10dif_pclmul
> crct10dif_common crc32c_intel serio_raw syscopyarea sysfillrect sysimgblt
> fb_sys_fops ttm ahci drm libahci ata_generic e1000e pata_acpi libata ptp
> pps_core drm_panel_orientation_quirks wmi [last unloaded: mydrv]
> [148794.896587] CPU: 0 PID: 6020 Comm: TestIommu Kdump: loaded Tainted: G
>           OE  ------------ T 3.10.0-1062.1.2.el7.x86_64 #1
> [148794.896592] Hardware name: Dell Inc. Precision Tower 5810/0HHV7N, BIOS
> A25 02/02/2018
> [148794.896597] task: ffff8c82b6e0d230 ti: ffff8c8ac5b6c000 task.ti:
> ffff8c8ac5b6c000
> [148794.896601] RIP: 0010:[<ffffffff8efff195>]  [<ffffffff8efff195>]
> domain_get_iommu+0x55/0x70
> [148794.896611] RSP: 0018:ffff8c8ac5b6fce8  EFLAGS: 00010202
> [148794.896614] RAX: ffff8c8adbeb0b00 RBX: ffff8c8ad4ac7600 RCX:
> 0000000000000000
> [148794.896619] RDX: 00000000fffffff0 RSI: ffff8c8ace6e5940 RDI:
> ffff8c8adbeb0b00
> [148794.896622] RBP: ffff8c8ac5b6fce8 R08: 000000000001f0a0 R09:
> ffffffff8f00255e
> [148794.896626] R10: ffff8c8bdfc1f0a0 R11: fffff941bc39b940 R12:
> 0000000000000001
> [148794.896630] R13: ffff8c4ce6b9d098 R14: 0000000000000000 R15:
> ffff8c8ac8f22a00
> [148794.896635] FS:  00007f1548320740(0000) GS:ffff8c8bdfc00000(0000)
> knlGS:0000000000000000
> [148794.896639] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [148794.896642] CR2: 00007f1547373689 CR3: 00000036f17c8000 CR4:
> 00000000003607f0
> [148794.896647] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [148794.896651] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [148794.896654] Call Trace:
> [148794.896660]  [<ffffffff8f002ee5>] intel_map_sg+0x65/0x1e0
> [...]
>
> ----------------------
>
>
> [2] I've also tried using iommu APIs directly in my driver but I get "PTE
> Read access is not set" for DMA read when attempting DMA from host to
> device memory (size 1KB).
>
> DMAR: DRHD: handling fault status reg 2
> DMAR: [DMA Read] Request device [02:00.0] fault addr ffffc030b000 [fault
> reason 06] PTE Read access is not set
>
> I see the following messages after DMA failure (and eventually system
> crash):
>
> DMAR: DRHD: handling fault status reg 100
> DMAR: DRHD: handling fault status reg 100
>
>
> I've used the following sequence with iommu APIs:
>
> iommu_init:
>
>     iommu_group = iommu_group_get(dev)
>
>     iommu_domain = iommu_domain_alloc(&pci_bus_type)
>
>     init_iova_domain(&iova_domain)
>
>     iommu_attach_group(iommu_domain, iommu_group)
>
> iommu_map:
>
> iova = alloc_iova(&iova_domain, size >> shift, end >> shift, true);
>
>     addr = iova_dma_addr(&iova_domain, iova);
>
> iommu_map_sg(iommu_domain, addr, sgl, sgt->nents, IOMMU_READ |
> IOMMU_WRITE);
>
>
> Thanks,
> am
>
>


* Re: kernel BUG at drivers/iommu/intel-iommu.c:667!
  2019-12-02  1:46 kernel BUG at drivers/iommu/intel-iommu.c:667! Anand Misra
  2019-12-02  1:51 ` Anand Misra
@ 2019-12-02  2:23 ` Lu Baolu
       [not found]   ` <CAL20ACLtwjDLaPattEkPiufsgHa2k-4Wb_Dw7Urh9we0QwbJfQ@mail.gmail.com>
  1 sibling, 1 reply; 6+ messages in thread
From: Lu Baolu @ 2019-12-02  2:23 UTC (permalink / raw)
  To: Anand Misra, iommu

Hi,

On 12/2/19 9:46 AM, Anand Misra wrote:
> Hello:
> 
> I'm in process of adding iommu support in my driver for a PCIe device. 
> The device doesn't publish ACS/ATS via its config space. I've following 
> config:
> 
> Linux cmdline: "intel-iommu=on iommu=pt 
> vfio_iommu_type1.allow_unsafe_interrupts=1 pcie_acs_override=downstream"
> Centos kernel: 3.10.0-1062.1.2.el7.x86_64
> 

Can you please test with the latest kernel? 3.10 seems to be pretty old and a
lot has changed since then.

Best regards,
baolu

* Re: kernel BUG at drivers/iommu/intel-iommu.c:667!
       [not found]     ` <da7fb26f-022b-eaad-1a91-11cf15531f8a@linux.intel.com>
@ 2019-12-02  3:33       ` Anand Misra
  2019-12-03 19:22         ` AM
  0 siblings, 1 reply; 6+ messages in thread
From: Anand Misra @ 2019-12-02  3:33 UTC (permalink / raw)
  To: Lu Baolu, iommu



[+iommu_list]

The application isn't aware of hugepages, but a lower-level userspace stack is
aware of the type of memory being allocated on behalf of the application, and
it in turn communicates w/ the driver via ioctl. I'm trying to make the driver
more agnostic by using dma_map_sg() when the application needs multiple GBs;
otherwise I'm using dma_map_page(). Admittedly, I'm learning these
concepts/APIs for Linux along the way.
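
For reference, a minimal sketch of that dispatch (hypothetical
mydrv_map_request(), not the original driver code): small requests take the
single-page path, multi-GB requests take the scatterlist path described in the
first message of this thread:

#include <linux/mm.h>
#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

static int mydrv_map_request(struct device *dev, struct page *page,
                             size_t len, struct sg_table *sgt)
{
        if (len <= PAGE_SIZE) {
                dma_addr_t addr = dma_map_page(dev, page, 0, len,
                                               DMA_BIDIRECTIONAL);
                return dma_mapping_error(dev, addr) ? -EIO : 0;
        }

        /* sgt was built from the pinned hugepage-backed pages, e.g. with
         * sg_alloc_table_from_pages(); dma_map_sg() returns 0 on failure. */
        return dma_map_sg(dev, sgt->sgl, sgt->orig_nents,
                          DMA_BIDIRECTIONAL) ? 0 : -EIO;
}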

Thanks,
-am


On Sun, Dec 1, 2019 at 7:12 PM Lu Baolu <baolu.lu@linux.intel.com> wrote:

> Hi,
>
> On 12/2/19 11:00 AM, Anand Misra wrote:
> > Thanks, Lu Baolu. This is the dev version we've in our company. I can
> > try on a Ubuntu with a recent kernel version. Although, do you think
> I'm  > going in the right direction? Is it possible to have multiple
> hugepages
> > mapped via iommu to get contiguous mapping for DMA?
> >
>
> You mentioned:
>
> "
> I'm trying to use iommu for multiple hugepages (mmap'ed by process and
> pushed to driver via ioctl). The expectation is to have multiple
> hugepages mapped via iommu with each huge page having an entry in iommu
> (i.e. minimize table walk for DMA). Is this possible?
> "
>
> Currently huge page mapping is hidden in iommu driver according to the
> iommu capability and size of map range. Why do you want to be aware of
> it in driver or even application level?
>
> Best regards,
> baolu
>
> > -am
> >
> >
> > On Sun, Dec 1, 2019 at 6:24 PM Lu Baolu <baolu.lu@linux.intel.com
> > <mailto:baolu.lu@linux.intel.com>> wrote:
> >
> >     Hi,
> >
> >     On 12/2/19 9:46 AM, Anand Misra wrote:
> >      > Hello:
> >      >
> >      > I'm in process of adding iommu support in my driver for a PCIe
> >     device.
> >      > The device doesn't publish ACS/ATS via its config space. I've
> >     following
> >      > config:
> >      >
> >      > Linux cmdline: "intel-iommu=on iommu=pt
> >      > vfio_iommu_type1.allow_unsafe_interrupts=1
> >     pcie_acs_override=downstream"
> >      > Centos kernel: 3.10.0-1062.1.2.el7.x86_64
> >      >
> >
> >     Can you please use the latest kernel for test? 3.10 seems to be
> pretty
> >     old and there are a lot of changes after it.
> >
> >     Best regards,
> >     baolu
> >
>


* Re: kernel BUG at drivers/iommu/intel-iommu.c:667!
  2019-12-02  3:33       ` Anand Misra
@ 2019-12-03 19:22         ` AM
  2019-12-04  2:26           ` Lu Baolu
  0 siblings, 1 reply; 6+ messages in thread
From: AM @ 2019-12-03 19:22 UTC (permalink / raw)
  To: Lu Baolu, iommu



Hi Lu Baolu,

I tried kernel 4.18.0-147.6.el8.x86_64+debug and used the following API
sequence for mapping multiple hugepages:

get_user_pages_fast()
sg_alloc_table_from_pages()
// also tried sg_alloc_table() w/ sg_set_page() using 1GB size for each entry
dma_map_sg()

I'm able to DMA up to 1GB successfully and validate the data. DMA beyond 1GB
also completes w/o any error, but the data isn't correct starting immediately
after 1GB, i.e. the second GB (offset 0x40000000) starts showing data
mismatches.

I've used get_user_pages_fast() in two ways, to no avail (a sketch of approach
1 follows below):
1. Populate the page array w/ only the first page of each 1GB hugepage and use
   sg_set_page() to set a 1GB length for that entry. This debugging effort
   relies on the fact that all pages following the first page of a hugepage
   start address are physically contiguous. Ideally dma_map_sg() should
   coalesce contiguous pages anyway; my intention was to collect more data
   while debugging.
2. Populate the page array w/ all pages from all hugepages.
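
A minimal sketch of approach 1 (hypothetical mydrv_map_hugepages(), not the
original code): it relies on the tail pages behind each pinned head page being
physically contiguous, assumes the device's DMA segment-size limit allows 1GB
segments (see dma_set_max_seg_size()), and uses the recent-kernel
get_user_pages_fast() signature with gup_flags:

#include <linux/mm.h>
#include <linux/scatterlist.h>
#include <linux/dma-mapping.h>
#include <linux/sizes.h>

static int mydrv_map_hugepages(struct device *dev, unsigned long uaddr,
                               int nr_huge, struct sg_table *sgt)
{
        struct scatterlist *sg;
        int i, j, ret;

        ret = sg_alloc_table(sgt, nr_huge, GFP_KERNEL);
        if (ret)
                return ret;

        for_each_sg(sgt->sgl, sg, nr_huge, i) {
                struct page *head;
                int pinned;

                /* pin only the head page of the i-th 1GB hugepage */
                pinned = get_user_pages_fast(uaddr + (unsigned long)i * SZ_1G,
                                             1, FOLL_WRITE, &head);
                if (pinned != 1) {
                        ret = pinned < 0 ? pinned : -EFAULT;
                        goto err_put;
                }
                /* describe the whole hugepage with a single 1GB sg entry */
                sg_set_page(sg, head, SZ_1G, 0);
        }

        if (!dma_map_sg(dev, sgt->sgl, sgt->orig_nents, DMA_BIDIRECTIONAL)) {
                ret = -EIO;
                goto err_put;
        }
        return 0;

err_put:
        for_each_sg(sgt->sgl, sg, i, j)   /* only the entries already set */
                put_page(sg_page(sg));
        sg_free_table(sgt);
        return ret;
}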

Thanks,
-am



On Sun, Dec 1, 2019 at 7:33 PM Anand Misra <am.online.edu@gmail.com> wrote:

> [+iommu_list]
>
> Application isn't aware of hugepage but a userspace (lower) level stack is
> aware of the type of memory being allocated on behalf of application, which
> in turn communicates w/ driver via ioctl. I'm trying to make it more
> agnostic by using dma_map_sg() when multiple GBs are required by
> application. Otherwise, I'm using dmap_map_page(). Admittedly, I'm learning
> these concepts/APIs for Linux along the way.
>
> Thanks,
> -am
>
>
> On Sun, Dec 1, 2019 at 7:12 PM Lu Baolu <baolu.lu@linux.intel.com> wrote:
>
>> Hi,
>>
>> On 12/2/19 11:00 AM, Anand Misra wrote:
>> > Thanks, Lu Baolu. This is the dev version we've in our company. I can
>> > try on a Ubuntu with a recent kernel version. Although, do you think
>> I'm  > going in the right direction? Is it possible to have multiple
>> hugepages
>> > mapped via iommu to get contiguous mapping for DMA?
>> >
>>
>> You mentioned:
>>
>> "
>> I'm trying to use iommu for multiple hugepages (mmap'ed by process and
>> pushed to driver via ioctl). The expectation is to have multiple
>> hugepages mapped via iommu with each huge page having an entry in iommu
>> (i.e. minimize table walk for DMA). Is this possible?
>> "
>>
>> Currently huge page mapping is hidden in iommu driver according to the
>> iommu capability and size of map range. Why do you want to be aware of
>> it in driver or even application level?
>>
>> Best regards,
>> baolu
>>
>> > -am
>> >
>> >
>> > On Sun, Dec 1, 2019 at 6:24 PM Lu Baolu <baolu.lu@linux.intel.com
>> > <mailto:baolu.lu@linux.intel.com>> wrote:
>> >
>> >     Hi,
>> >
>> >     On 12/2/19 9:46 AM, Anand Misra wrote:
>> >      > Hello:
>> >      >
>> >      > I'm in process of adding iommu support in my driver for a PCIe
>> >     device.
>> >      > The device doesn't publish ACS/ATS via its config space. I've
>> >     following
>> >      > config:
>> >      >
>> >      > Linux cmdline: "intel-iommu=on iommu=pt
>> >      > vfio_iommu_type1.allow_unsafe_interrupts=1
>> >     pcie_acs_override=downstream"
>> >      > Centos kernel: 3.10.0-1062.1.2.el7.x86_64
>> >      >
>> >
>> >     Can you please use the latest kernel for test? 3.10 seems to be
>> pretty
>> >     old and there are a lot of changes after it.
>> >
>> >     Best regards,
>> >     baolu
>> >
>>
>


* Re: kernel BUG at drivers/iommu/intel-iommu.c:667!
  2019-12-03 19:22         ` AM
@ 2019-12-04  2:26           ` Lu Baolu
  0 siblings, 0 replies; 6+ messages in thread
From: Lu Baolu @ 2019-12-04  2:26 UTC (permalink / raw)
  To: AM, iommu


Hi,

On 12/4/19 3:22 AM, AM wrote:
> Hi Lu Baolu,
> 
> I tried kernel 4.18.0-147.6.el8.x86_64+debug and used the following API 
> sequence for mapping multiple hugepages:
> 
> get_user_pages_fast()
> sg_alloc_table_from_pages()
> // also tried sg_alloc_table() w/ sg_set_page() using 1GB size for each 
> entry
> dma_map_sg()
> 
> I'm able to DMA upto 1GB successfully and validate the data. Also, DMA 
> above 1GB completes w/o any error, but data isn't correct starting 
> immediately after 1GB i.e. second GB offset 0x40000000 starts showing 
> data mismatches.

I am not sure whether you followed the right way to build an sg list. But I do
care that the Intel IOMMU handles dma_map_sg() correctly.

Can you please try the attached patch? It can help you get more tracing
information when you call dma_map_sg().

Best regards,
baolu

> 
> I've used get_user_pages_fast() in two ways to no avail:
> 1. populate page array w/ first page of 1GB hugepage and used 
> sg_set_page() for
>     setting 1GB size of the page entry. This debugging effort uses the fact
>     that all pages following the first page of huge page start address 
> are contiguous.
>     Ideally dma_map_sg() should coalesce contiguous pages, and my 
> intention was to collect
>     more data from debugging.
> 2. populate page array w/ all pages from all hugepages
> 
> Thanks,
> -am
> 
> 
> 
> On Sun, Dec 1, 2019 at 7:33 PM Anand Misra <am.online.edu@gmail.com 
> <mailto:am.online.edu@gmail.com>> wrote:
> 
>     [+iommu_list]
> 
>     Application isn't aware of hugepage but a userspace (lower) level
>     stack is aware of the type of memory being allocated on behalf of
>     application, which in turn communicates w/ driver via ioctl. I'm
>     trying to make it more agnostic by using dma_map_sg() when multiple
>     GBs are required by application. Otherwise, I'm using
>     dmap_map_page(). Admittedly, I'm learning these concepts/APIs for
>     Linux along the way.
> 
>     Thanks,
>     -am
> 
> 
>     On Sun, Dec 1, 2019 at 7:12 PM Lu Baolu <baolu.lu@linux.intel.com
>     <mailto:baolu.lu@linux.intel.com>> wrote:
> 
>         Hi,
> 
>         On 12/2/19 11:00 AM, Anand Misra wrote:
>          > Thanks, Lu Baolu. This is the dev version we've in our
>         company. I can
>          > try on a Ubuntu with a recent kernel version. Although, do
>         you think I'm  > going in the right direction? Is it possible to
>         have multiple hugepages
>          > mapped via iommu to get contiguous mapping for DMA?
>          >
> 
>         You mentioned:
> 
>         "
>         I'm trying to use iommu for multiple hugepages (mmap'ed by
>         process and
>         pushed to driver via ioctl). The expectation is to have multiple
>         hugepages mapped via iommu with each huge page having an entry
>         in iommu
>         (i.e. minimize table walk for DMA). Is this possible?
>         "
> 
>         Currently huge page mapping is hidden in iommu driver according
>         to the
>         iommu capability and size of map range. Why do you want to be
>         aware of
>         it in driver or even application level?
> 
>         Best regards,
>         baolu
> 
>          > -am
>          >
>          >
>          > On Sun, Dec 1, 2019 at 6:24 PM Lu Baolu
>         <baolu.lu@linux.intel.com <mailto:baolu.lu@linux.intel.com>
>          > <mailto:baolu.lu@linux.intel.com
>         <mailto:baolu.lu@linux.intel.com>>> wrote:
>          >
>          >     Hi,
>          >
>          >     On 12/2/19 9:46 AM, Anand Misra wrote:
>          >      > Hello:
>          >      >
>          >      > I'm in process of adding iommu support in my driver
>         for a PCIe
>          >     device.
>          >      > The device doesn't publish ACS/ATS via its config
>         space. I've
>          >     following
>          >      > config:
>          >      >
>          >      > Linux cmdline: "intel-iommu=on iommu=pt
>          >      > vfio_iommu_type1.allow_unsafe_interrupts=1
>          >     pcie_acs_override=downstream"
>          >      > Centos kernel: 3.10.0-1062.1.2.el7.x86_64
>          >      >
>          >
>          >     Can you please use the latest kernel for test? 3.10 seems
>         to be pretty
>          >     old and there are a lot of changes after it.
>          >
>          >     Best regards,
>          >     baolu
>          >
> 

[-- Attachment #2: 0001-iommu-vt-d-trace-Extend-map_sg-trace-event.patch --]
[-- Type: text/x-patch, Size: 2855 bytes --]

From c10422b2827b3fd4141ddac2601608ed6c883cea Mon Sep 17 00:00:00 2001
From: Lu Baolu <baolu.lu@linux.intel.com>
Date: Wed, 4 Dec 2019 10:10:20 +0800
Subject: [PATCH 1/1] iommu/vt-d: trace: Extend map_sg trace event

Current map_sg stores trace message in a coarse manner. This
extends it so that more detailed messages could be traced.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c        |  4 +--
 include/trace/events/intel_iommu.h | 43 +++++++++++++++++++++++++-----
 2 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 6db6d969e31c..b47b8ba5ac0f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3769,8 +3769,8 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
 		return 0;
 	}
 
-	trace_map_sg(dev, iova_pfn << PAGE_SHIFT,
-		     sg_phys(sglist), size << VTD_PAGE_SHIFT);
+	for_each_sg(sglist, sg, nelems, i)
+		trace_map_sg(dev, i + 1, nelems, sg);
 
 	return nelems;
 }
diff --git a/include/trace/events/intel_iommu.h b/include/trace/events/intel_iommu.h
index 54e61d456cdf..8b0199d80b75 100644
--- a/include/trace/events/intel_iommu.h
+++ b/include/trace/events/intel_iommu.h
@@ -49,12 +49,6 @@ DEFINE_EVENT(dma_map, map_single,
 	TP_ARGS(dev, dev_addr, phys_addr, size)
 );
 
-DEFINE_EVENT(dma_map, map_sg,
-	TP_PROTO(struct device *dev, dma_addr_t dev_addr, phys_addr_t phys_addr,
-		 size_t size),
-	TP_ARGS(dev, dev_addr, phys_addr, size)
-);
-
 DEFINE_EVENT(dma_map, bounce_map_single,
 	TP_PROTO(struct device *dev, dma_addr_t dev_addr, phys_addr_t phys_addr,
 		 size_t size),
@@ -99,6 +93,43 @@ DEFINE_EVENT(dma_unmap, bounce_unmap_single,
 	TP_ARGS(dev, dev_addr, size)
 );
 
+DECLARE_EVENT_CLASS(dma_map_sg,
+	TP_PROTO(struct device *dev, int index, int total,
+		 struct scatterlist *sg),
+
+	TP_ARGS(dev, index, total, sg),
+
+	TP_STRUCT__entry(
+		__string(dev_name, dev_name(dev))
+		__field(dma_addr_t, dev_addr)
+		__field(phys_addr_t, phys_addr)
+		__field(size_t,	size)
+		__field(int, index)
+		__field(int, total)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev_name, dev_name(dev));
+		__entry->dev_addr = sg->dma_address;
+		__entry->phys_addr = sg_phys(sg);
+		__entry->size = sg->dma_length;
+		__entry->index = index;
+		__entry->total = total;
+	),
+
+	TP_printk("dev=%s [%d/%d] dev_addr=0x%llx phys_addr=0x%llx size=%zu",
+		  __get_str(dev_name), __entry->index, __entry->total,
+		  (unsigned long long)__entry->dev_addr,
+		  (unsigned long long)__entry->phys_addr,
+		  __entry->size)
+);
+
+DEFINE_EVENT(dma_map_sg, map_sg,
+	TP_PROTO(struct device *dev, int index, int total,
+		 struct scatterlist *sg),
+	TP_ARGS(dev, index, total, sg)
+);
+
 #endif /* _TRACE_INTEL_IOMMU_H */
 
 /* This part must be outside protection */
-- 
2.17.1


