xen-devel.lists.xenproject.org archive mirror
* Atheros WiFi - memory paging failure on driver load
@ 2016-07-12  3:59 Andrey Grodzovsky
  2016-07-15 10:04 ` Andrew Cooper
  0 siblings, 1 reply; 9+ messages in thread
From: Andrey Grodzovsky @ 2016-07-12  3:59 UTC (permalink / raw)
  To: xen-devel; +Cc: Jürgen Walter • Quattru, Jan Beulich


Hello

Some background -

We are trying to run a Qualcomm Atheros AR928X Wireless Network Adapter and
get a crash right on driver load. Our observations and questions follow.

Jurgen's observation -

" The Atheros card "Qualcomm Atheros AR928X Wireless Network Adapter
(PCI-Express) (rev 01)"  is plugged into the host system (datatron).
When I attach it to the DomU - the module "ath9k" is automatically loaded,
but it gives an exception "iowrite32+0x2b/0x30".
No idea what the issue is (tried also with another Atheros Card (ath10k) -
similar problem). When I try an Intel card, it works.
(the card also works on the Dom0 - so the Linux driver and HW is OK)."

Debugging -

After some investigation with kgdb and iommu trace on the DomU, it seems the
iomap of the PCI BAR for the device returns a mapping whose first 0x1000
bytes are read-only. That causes an access violation when trying to write
registers mapped to this area (all the regs with offset < 0x1000); why
this happens I still don't know. Register writes with offsets > 0x1000 are
fine.

Running the same driver on Dom0 is totally fine.

Below are the SIGSEGV backtrace and Xen dmesg from the DomU.

As can be seen, there is an ioremap_ of size 0x10000 starting
at ffffc900402c0000 but, as I said, I noticed that anything below
ffffc900402c1000 is not writable (from gdb using set addr = val), only
readable, while anything above this address is writable.

Question -

Please give any advice on this issue, especially on how to approach
debugging this on both DomU and Dom0, and on where in the Xen code to look
for possible issues.

Thanks.
Andrey

P.S. I was also unable to run cat /sys/kernel/debug/tracing/trace_pipe >
mydump.txt & to actually record the iommu traces, due to another paging
crash (stack at the end).

[   37.837467] ath9k 0000:00:00.0: Xen PCI mapped GSI16 to IRQ8
[   37.837473] pcifront pci-0: read dev=0000:00:00.0 - offset 3d size 1
[   37.837498] pcifront pci-0: read got back value 1
[   37.837505] pcifront pci-0: read dev=0000:00:00.0 - offset 4 size 2
[   37.840006] pcifront pci-0: read got back value 103
[   37.853341] pcifront pci-0: read dev=0000:00:00.0 - offset c size 1
[   37.853374] pcifront pci-0: read got back value 8
[   37.853383] pcifront pci-0: write dev=0000:00:00.0 - offset d size 1 val
a8
[   37.853431] pcifront pci-0: read dev=0000:00:00.0 - offset 4 size 2
[   37.853462] pcifront pci-0: read got back value 103
[   37.853472] pcifront pci-0: write dev=0000:00:00.0 - offset 4 size 2 val
107
[   37.853527] pcifront pci-0: read dev=0000:00:00.0 - offset 40 size 4
[   37.853565] pcifront pci-0: read got back value 3c25001
[   37.853574] pcifront pci-0: write dev=0000:00:00.0 - offset 40 size 4
val 3c20001
*[   37.855784] mmiotrace: ioremap_*(0xf7b00000, 0x10000) =
ffffc900402c0000*
[   37.855842] pcifront pci-0: read dev=0000:00:00.0 - offset c size 1
[   37.855879] pcifront pci-0: read got back value 8
[   38.301919] BUG: unable to handle kernel paging request at
ffffc900402c0040
[   38.301930] IP: [<ffffffff8132387b>] iowrite32+0x2b/0x30
[   38.301939] PGD 3fdf4067 PUD 3e1a8067 PMD 49d7067 PTE 80100000f7b00075
[   38.301947] Oops: 0003 [#1] SMP
[   38.301952] Modules linked in: ath9k(OE+) ath9k_common(OE) ath9k_hw(OE)
ath(OE) mac80211(E) cfg80211(E) rfkill(E) xen_pcifront(E) intel_rapl(E)
x86_pkg_temp_thermal(E) coretemp(E) crct10dif_pclmul(E) crc32_pclmul(E)
evdev(E) ghash_clmulni_intel(E) pcspkr(E) uio_netx(E) uio(E) autofs4(E)
ext4(E) ecb(E) crc16(E) mbcache(E) jbd2(E) crc32c_generic(E)
crc32c_intel(E) aesni_intel(E) xen_netfront(E) xen_blkfront(E)
aes_x86_64(E) glue_helper(E) lrw(E) gf128mul(E) ablk_helper(E) cryptd(E)
[   38.302000] CPU: 0 PID: 696 Comm: systemd-udevd Tainted: G        W  OE
  4.5.3-dbg #2
[   38.302008] task: ffff8800044880c0 ti: ffff8800048a8000 task.ti:
ffff8800048a8000
[   38.302015] RIP: e030:[<ffffffff8132387b>]  [<ffffffff8132387b>]
iowrite32+0x2b/0x30
[   38.302024] RSP: e02b:ffff8800048ab8c8  EFLAGS: 00010196
[   38.302029] RAX: 0000000000000000 RBX: 000000000000000e RCX:
ffff88003ac34018
[   38.302035] RDX: ffffc900402c0040 RSI: ffffc900402c0040 RDI:
0000000000000000
[   38.302040] RBP: ffff8800048ab928 R08: 0000000000000016 R09:
0000000000000002
[   38.302046] R10: ffffffff81b08000 R11: ffffffff81b07fc0 R12:
0000000000000016
[   38.302051] R13: ffff880004b4d098 R14: ffff8800048abec0 R15:
ffffffffc03c74c0
[   38.302060] FS:  00007ff256fb18c0(0000) GS:ffff88003f800000(0000)
knlGS:ffff88003f800000
[   38.302068] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[   38.302073] CR2: ffffc900402c0040 CR3: 0000000004a05000 CR4:
0000000000042660
[   38.302079] Stack:
[   38.302083]  ffffffffc03a70f5 0000000000000040 ffff88003ac34018
ffffffffc0339a5b
[   38.302092]  ffff88003ac34018 ffff88003ac34068 ffff88003ac81520
0000809400008090
[   38.302102]  0000000000008098 0000000000000000 0000000000000000
000000003c5869cf
[   38.302111] Call Trace:
[   38.302123]  [<ffffffffc03a70f5>] ? ath9k_iowrite32+0xde/0xf5 [ath9k]
[   38.302139]  [<ffffffffc0339a5b>] ? ath9k_hw_update_mibstats+0x62/0xd6
[ath9k_hw]
[   38.302155]  [<ffffffffc033a722>] ? ath9k_enable_mib_counters+0xc2/0x119
[ath9k_hw]
[   38.302171]  [<ffffffffc033a983>] ? ath9k_hw_ani_init+0x148/0x14b
[ath9k_hw]
[   38.302182]  [<ffffffffc0313b00>] ? ath9k_hw_post_init+0x104/0x16c
[ath9k_hw]
[   38.302195]  [<ffffffffc0313f22>] ? __ath9k_hw_init+0x377/0x3fd
[ath9k_hw]
[   38.302206]  [<ffffffffc031403e>] ? ath9k_hw_init+0x96/0xd6 [ath9k_hw]
[   38.302218]  [<ffffffffc03a877a>] ? ath9k_init_softc+0x643/0x823 [ath9k]
[   38.302229]  [<ffffffffc03a8f7d>] ? ath9k_init_device+0x43/0x21d [ath9k]
[   38.302243]  [<ffffffffc03aad2e>] ? ath9k_tasklet+0x462/0x462 [ath9k]
[   38.302255]  [<ffffffffc03c1638>] ? ath_pci_probe+0x2c8/0x38f [ath9k]
[   38.302263]  [<ffffffff815bab16>] ? _raw_spin_unlock_irqrestore+0x16/0x20
[   38.302271]  [<ffffffff814431a4>] ? __pm_runtime_resume+0x54/0x70
[   38.302278]  [<ffffffff813566ff>] ? local_pci_probe+0x3f/0x90
[   38.302284]  [<ffffffff81357ab0>] ? pci_device_probe+0x100/0x140
[   38.302290]  [<ffffffff8143853c>] ? driver_probe_device+0x21c/0x430
[   38.302296]  [<ffffffff814387cb>] ? __driver_attach+0x7b/0x80
[   38.302301]  [<ffffffff81438750>] ? driver_probe_device+0x430/0x430
[   38.302308]  [<ffffffff81436007>] ? bus_for_each_dev+0x67/0xb0
[   38.302313]  [<ffffffff8143773f>] ? bus_add_driver+0x1df/0x270
[   38.302319]  [<ffffffffc02f0000>] ? 0xffffffffc02f0000
[   38.302324]  [<ffffffff81438fe7>] ? driver_register+0x57/0xc0
[   38.302334]  [<ffffffffc03c19ad>] ? ath_pci_init+0x23/0x25 [ath9k]
[   38.302345]  [<ffffffffc02f000d>] ? ath9k_init+0xd/0x1000 [ath9k]
[   38.302352]  [<ffffffff81002122>] ? do_one_initcall+0xb2/0x200
[   38.302358]  [<ffffffff815b74d4>] ? preempt_schedule_common+0x14/0x30
[   38.302363]  [<ffffffff815b7509>] ? _cond_resched+0x19/0x20
[   38.302371]  [<ffffffff811cf032>] ? kmem_cache_alloc_trace+0x82/0x220
[   38.302378]  [<ffffffff8116d58a>] ? do_init_module+0x5b/0x1ce
[   38.302384]  [<ffffffff810fa676>] ? load_module+0x2146/0x2790
[   38.302389]  [<ffffffff810f71d0>] ? __symbol_put+0x60/0x60
[   38.302395]  [<ffffffff810f7628>] ?
copy_module_from_fd.isra.51+0xf8/0x140
[   38.302401]  [<ffffffff810faed8>] ? SYSC_finit_module+0xa8/0xd0
[   38.302408]  [<ffffffff815bb072>] ? system_call_fast_compare_end+0xc/0x67
[   38.302413] Code: 48 81 fe ff ff 03 00 48 89 f2 77 1f 48 81 fe 00 00 01
00 76 07 0f b7 d6 89 f8 ef c3 48 c7 c6 df 6a 84 81 48 89 d7 e9 85 fe ff ff
<89> 3e c3 66 90 48 81 ff ff ff 03 00 77 28 48 81 ff 00 00 01 00
[   38.302460] RIP  [<ffffffff8132387b>] iowrite32+0x2b/0x30
[   38.302466]  RSP <ffff8800048ab8c8>
[   38.302469] CR2: ffffc900402c0040
[   38.302475] ---[ end trace 092ecee276c2da16 ]---
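
A note on reading the oops above: the PTE value and the "Oops: 0003" error
code already say that the faulting page is mapped but not writable. The
sketch below decodes both. It assumes the standard x86-64 page-table bit
layout (bit 1 = writable, bit 63 = no-execute, bits 12..51 = frame address)
and the standard page-fault error-code bits; the helper names are made up
for illustration and are not kernel or Xen functions.

```python
# Values copied from the oops above.
PTE = 0x80100000F7B00075
ERROR_CODE = 0x0003
FAULT_ADDR = 0xFFFFC900402C0040   # CR2
MAP_BASE = 0xFFFFC900402C0000     # ioremap return value from the mmiotrace line

# Architecturally defined flag bits of an x86-64 4 KiB PTE.
PTE_FLAGS = {
    0: "PRESENT",
    1: "RW",
    2: "USER",
    3: "PWT",
    4: "PCD",
    5: "ACCESSED",
    6: "DIRTY",
    63: "NX",
}


def decode_pte(pte):
    """Return (list of set flag names, physical frame address)."""
    flags = [name for bit, name in PTE_FLAGS.items() if (pte >> bit) & 1]
    frame = pte & 0x000FFFFFFFFFF000  # bits 12..51
    return flags, frame


def decode_error_code(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "protection_violation": bool(code & 1),  # 1 = page was present
        "write": bool(code & 2),                 # 1 = access was a write
        "user_mode": bool(code & 4),             # 1 = fault in user mode
    }


flags, frame = decode_pte(PTE)
print("flags:", flags)
print("frame: %#x" % frame)
print("error:", decode_error_code(ERROR_CODE))
print("offset into mapping: %#x" % (FAULT_ADDR - MAP_BASE))
```

The frame decodes to 0xf7b00000, the physical address passed to ioremap,
and the RW flag is absent, which is consistent with a write to offset 0x40
faulting on a present, read-only page.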



(XEN) Xen version 4.6.0 (Debian 4.6.0-1+nmu2) (ijc@debian.org) (gcc (Debian
5.3.1-8) 5.3.1 20160205) debug=n Tue Feb  9 17:46:27 UTC 2016
(XEN) Bootloader: GRUB 2.02~beta2-36
(XEN) Command line: placeholder log_lvl=all loglvl=all guest_loglvl=all
iommu=verbose
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: none; EDID transfer time: 0 seconds
(XEN)  EDID info not retrieved because no DDC retrieval method detected
(XEN) Disc information:
(XEN)  Found 3 MBR signatures
(XEN)  Found 3 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 0000000000099c00 (usable)
(XEN)  0000000000099c00 - 00000000000a0000 (reserved)
(XEN)  00000000000e0000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 0000000020000000 (usable)
(XEN)  0000000020000000 - 0000000020200000 (reserved)
(XEN)  0000000020200000 - 0000000040004000 (usable)
(XEN)  0000000040004000 - 0000000040005000 (reserved)
(XEN)  0000000040005000 - 00000000d9d01000 (usable)
(XEN)  00000000d9d01000 - 00000000da191000 (reserved)
(XEN)  00000000da191000 - 00000000da192000 (ACPI data)
(XEN)  00000000da192000 - 00000000da2af000 (ACPI NVS)
(XEN)  00000000da2af000 - 00000000da6da000 (reserved)
(XEN)  00000000da6da000 - 00000000da6db000 (usable)
(XEN)  00000000da6db000 - 00000000da71e000 (ACPI NVS)
(XEN)  00000000da71e000 - 00000000daddf000 (usable)
(XEN)  00000000daddf000 - 00000000daff2000 (reserved)
(XEN)  00000000daff2000 - 00000000db000000 (usable)
(XEN)  00000000db800000 - 00000000dfa00000 (reserved)
(XEN)  00000000f8000000 - 00000000fc000000 (reserved)
(XEN)  00000000fec00000 - 00000000fec01000 (reserved)
(XEN)  00000000fed00000 - 00000000fed04000 (reserved)
(XEN)  00000000fed1c000 - 00000000fed20000 (reserved)
(XEN)  00000000fee00000 - 00000000fee01000 (reserved)
(XEN)  00000000ff000000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 000000041e600000 (usable)
(XEN) ACPI: RSDP 000F0490, 0024 (r2 ALASKA)
(XEN) ACPI: XSDT DA2A0080, 007C (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: FACP DA2A9C58, 00F4 (r4 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: DSDT DA2A0190, 9AC5 (r2 ALASKA    A M I        1 INTL 20051117)
(XEN) ACPI: FACS DA2ADF80, 0040
(XEN) ACPI: APIC DA2A9D50, 0092 (r3 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: FPDT DA2A9DE8, 0044 (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: MCFG DA2A9E30, 003C (r1 ALASKA    A M I  1072009 MSFT       97)
(XEN) ACPI: HPET DA2A9E70, 0038 (r1 ALASKA    A M I  1072009 AMI.        5)
(XEN) ACPI: SSDT DA2A9EA8, 036D (r1 SataRe SataTabl     1000 INTL 20091112)
(XEN) ACPI: SSDT DA2AA218, 08A2 (r1  PmRef  Cpu0Ist     3000 INTL 20051117)
(XEN) ACPI: SSDT DA2AAAC0, 0A92 (r1  PmRef    CpuPm     3000 INTL 20051117)
(XEN) ACPI: DMAR DA2AB558, 00B8 (r1 INTEL      SNB         1 INTL        1)
(XEN) ACPI: ASF! DA2AB610, 00A5 (r32 INTEL       HCG        1 TFSM    F4240)
(XEN) ACPI: BGRT DA2AB6B8, 0038 (r0 ALASKA    A M I  1072009 AMI     10013)
(XEN) System RAM: 16263MB (16653732kB)
(XEN) No NUMA configuration found
(XEN) Faking a node at 0000000000000000-000000041e600000
(XEN) Domain heap initialised
(XEN) found SMP MP-table at 000fd7e0
(XEN) DMI 2.7 present.
(XEN) Using APIC driver default
(XEN) ACPI: PM-Timer IO Port: 0x408
(XEN) ACPI: SLEEP INFO: pm1x_cnt[1:404,1:0], pm1x_evt[1:400,1:0]
(XEN) ACPI: 32/64X FACS address mismatch in FADT -
da2adf80/0000000000000000, using 32
(XEN) ACPI:             wakeup_vec[da2adf8c], vec_size[20]
(XEN) ACPI: Local APIC address 0xfee00000
(XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
(XEN) Processor #0 7:10 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
(XEN) Processor #2 7:10 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
(XEN) Processor #4 7:10 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
(XEN) Processor #6 7:10 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x05] lapic_id[0x01] enabled)
(XEN) Processor #1 7:10 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x06] lapic_id[0x03] enabled)
(XEN) Processor #3 7:10 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x07] lapic_id[0x05] enabled)
(XEN) Processor #5 7:10 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x08] lapic_id[0x07] enabled)
(XEN) Processor #7 7:10 APIC version 21
(XEN) ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
(XEN) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
(XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
(XEN) ACPI: IRQ0 used by override.
(XEN) ACPI: IRQ2 used by override.
(XEN) ACPI: IRQ9 used by override.
(XEN) Enabling APIC mode:  Flat.  Using 1 I/O APICs
(XEN) ACPI: HPET id: 0x8086a701 base: 0xfed00000
(XEN) ERST table was not found
(XEN) ACPI: BGRT: invalidating v1 image at 0xce4c7018
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) SMP: Allowing 8 CPUs (0 hotplug CPUs)
(XEN) IRQ limits: 24 GSI, 1528 MSI/MSI-X
(XEN) Switched to APIC driver x2apic_cluster.
(XEN) xstate_init: using cntxt_size: 0x340 and states: 0x7
(XEN) Intel machine check reporting enabled
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 2292.620 MHz processor.
(XEN) Initing memory sharing.
(XEN) alt table ffff82d0802bd010 -> ffff82d0802be240
(XEN) PCI: MCFG configuration 0: base f8000000 segment 0000 buses 00 - 3f
(XEN) PCI: MCFG area at f8000000 reserved in E820
(XEN) PCI: Using MCFG for segment 0000 bus 00-3f
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB.
(XEN) Intel VT-d iommu 1 supported page sizes: 4kB.
(XEN) Intel VT-d Snoop Control not enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) Intel VT-d Shared EPT tables not enabled.
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using old ACK method
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) TSC deadline timer enabled
(XEN) Platform timer is 14.318MHz HPET
(XEN) Allocated console ring of 64 KiB.
(XEN) mwait-idle: MWAIT substates: 0x21120
(XEN) mwait-idle: v0.4 model 0x3a
(XEN) mwait-idle: lapic_timer_reliable_states 0xffffffff
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN) HVM: ASIDs enabled.
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging (HAP) detected
(XEN) HVM: HAP page sizes: 4kB, 2MB
(XEN) Brought up 8 CPUs
(XEN) ACPI sleep modes: S3
(XEN) VPMU: disabled
(XEN) mcheck_poll: Machine check polling timer started.
(XEN) Dom0 has maximum 792 PIRQs
(XEN) NX (Execute Disable) protection active
(XEN) *** LOADING DOMAIN 0 ***
(XEN)  Xen  kernel: 64-bit, lsb, compat32
(XEN)  Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x1d59000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   000000040e000000->0000000410000000 (4062057 pages to
be allocated)
(XEN)  Init. ramdisk: 000000041d4d6000->000000041e5ff665
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff81000000->ffffffff81d59000
(XEN)  Init. ramdisk: 0000000000000000->0000000000000000
(XEN)  Phys-Mach map: 0000008000000000->0000008001f16498
(XEN)  Start info:    ffffffff81d59000->ffffffff81d594b4
(XEN)  Page tables:   ffffffff81d5a000->ffffffff81d6d000
(XEN)  Boot stack:    ffffffff81d6d000->ffffffff81d6e000
(XEN)  TOTAL:         ffffffff80000000->ffffffff82000000
(XEN)  ENTRY ADDRESS: ffffffff81b2b1f0
(XEN) Dom0 has maximum 8 VCPUs
(XEN) Bogus DMIBAR 0xfed18001 on 0000:00:00.0
(XEN) Scrubbing Free RAM on 1 nodes using 4 CPUs
(XEN) .................................done.
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input
to Xen)
(XEN) Freed 288kB init memory.
(XEN) Bogus DMIBAR 0xfed18001 on 0000:00:00.0
(XEN) PCI add device 0000:00:00.0
(XEN) PCI add device 0000:00:01.0
(XEN) PCI add device 0000:00:02.0
(XEN) PCI add device 0000:00:14.0
(XEN) PCI add device 0000:00:16.0
(XEN) PCI add device 0000:00:16.3
(XEN) PCI add device 0000:00:19.0
(XEN) PCI add device 0000:00:1a.0
(XEN) PCI add device 0000:00:1b.0
(XEN) PCI add device 0000:00:1c.0
(XEN) PCI add device 0000:00:1c.1
(XEN) PCI add device 0000:00:1c.2
(XEN) PCI add device 0000:00:1c.4
(XEN) PCI add device 0000:00:1d.0
(XEN) PCI add device 0000:00:1f.0
(XEN) PCI add device 0000:00:1f.2
(XEN) PCI add device 0000:00:1f.3
(XEN) PCI add device 0000:01:00.0
(XEN) PCI phantom 0000:01:00.4
(XEN) PCI add device 0000:02:00.0
(XEN) PCI add device 0000:03:00.0
(XEN) PCI add device 0000:04:00.0
(XEN) PCI add device 0000:05:00.0
(XEN) PCI add device 0000:06:00.0



Jul 12 02:40:34 debian-guest-01 kernel: [  198.172284] mmiotrace: CPU1 is
down.
Jul 12 02:40:34 debian-guest-01 kernel: [  198.172295] mmiotrace: enabled.
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361149] BUG: unable to
handle kernel NULL pointer dereference at 0000000000000058
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361166] IP:
[<ffffffff813569b5>] pci_dev_driver+0x5/0x40
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361176] PGD 3b5a3067 PUD
3a4e5067 PMD 0
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361182] Oops: 0000 [#1] SMP
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361186] Modules linked in:
intel_rapl(E) x86_pkg_temp_thermal(E) coretemp(E) crct10dif_pclmul(E)
crc32_pclmul(E) evdev(E) pcspkr(E) ghash_clmulni_intel(E) uio_netx(E)
uio(E) autofs4(E) ext4(E) ecb(E) crc16(E) mbcache(E) jbd2(E)
crc32c_generic(E) crc32c_intel(E) aesni_intel(E) xen_netfront(E)
xen_blkfront(E) aes_x86_64(E) glue_helper(E) lrw(E) gf128mul(E)
ablk_helper(E) cryptd(E)
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361218] CPU: 0 PID: 695
Comm: cat Tainted: G        W   E   4.5.3-dbg #2
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361226] task:
ffff880004440080 ti: ffff880004898000 task.ti: ffff880004898000
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361234] RIP:
e030:[<ffffffff813569b5>]  [<ffffffff813569b5>] pci_dev_driver+0x5/0x40
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361243] RSP:
e02b:ffff88000489be18  EFLAGS: 00010206
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361247] RAX:
0000000000000000 RBX: 0000000000000500 RCX: 0000000000000000
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361252] RDX:
0000000000000000 RSI: 0000000000000200 RDI: 0000000000000000
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361259] RBP:
ffff8800037ed0d0 R08: 0000000000000000 R09: 0000000000000003
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361264] R10:
0000000000000867 R11: 0000000000000246 R12: ffff8800037ec000
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361269] R13:
ffff88003c14caa0 R14: 0000000000000000 R15: 0000000000000378
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361279] FS:
 00007f1c3e8bf700(0000) GS:ffff88003f800000(0000) knlGS:0000000000000000
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361287] CS:  e033 DS: 0000
ES: 0000 CR0: 0000000080050033
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361292] CR2:
0000000000000058 CR3: 000000003ac9c000 CR4: 0000000000042660
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361298] Stack:
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361302]  ffffffff8113b766
0000000000000000 00007f1c3e89d000 0000000000020000
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361312]  00007f1c3e89d000
0000000000020000 ffff88000489bf20 ffff8800037ed0d0
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361321]  0000000000020000
0000000000000000 ffff8800037ec000 ffffffff811356df
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361330] Call Trace:
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361338]
 [<ffffffff8113b766>] ? mmio_read+0x86/0x1f0
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361345]
 [<ffffffff811356df>] ? tracing_read_pipe+0xaf/0x380
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361353]
 [<ffffffff811ec8f1>] ? vfs_read+0x81/0x120
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361359]
 [<ffffffff811ed862>] ? SyS_read+0x52/0xc0
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361366]
 [<ffffffff815bb072>] ? system_call_fast_compare_end+0xc/0x67
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361372] Code: c3 48 c7 c6 ed
ff ff ff eb cf 48 c7 c6 ea ff ff ff eb d0 e8 5e 33 d2 ff 0f 1f 40 00 66 2e
0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 <48> 8b 47 58 48 85 c0 74 02 f3 c3
48 8d 97 90 03 00 00 48 81 c7
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361421] RIP
 [<ffffffff813569b5>] pci_dev_driver+0x5/0x40
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361427]  RSP
<ffff88000489be18>
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361431] CR2: 0000000000000058
Jul 12 02:40:55 debian-guest-01 kernel: [  218.361444] ---[ end trace
c51c445d784a8027 ]---

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

* Re: Atheros WiFi - memory paging failure on driver load
  2016-07-12  3:59 Atheros WiFi - memory paging failure on driver load Andrey Grodzovsky
@ 2016-07-15 10:04 ` Andrew Cooper
  2016-07-15 10:16   ` Jürgen Walter • Quattru
  2016-07-16  3:45   ` Andrey Grodzovsky
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Cooper @ 2016-07-15 10:04 UTC (permalink / raw)
  To: Andrey Grodzovsky, xen-devel
  Cc: Jan Beulich, Jürgen Walter • Quattru


On 12/07/16 04:59, Andrey Grodzovsky wrote:
> Hello
>
> Some background -
>
> We are trying to run a Qualcomm Atheros AR928X Wireless Network Adapter
> and get a crash right on driver load. Our observations and questions
> follow.
>
> Jurgen's observation - 
>
> " The Atheros card "Qualcomm Atheros AR928X Wireless Network Adapter
> (PCI-Express) (rev 01)"  is plugged into the host system (datatron).
> When I attach it to the DomU - the module "ath9k" is automatically
> loaded, but it gives an exception "iowrite32+0x2b/0x30".
> No idea what the issue is (tried also with another Atheros Card
> (ath10k) - similar problem). When I try an Intel card, it works.
> (the card also works on the Dom0 - so the Linux driver and HW is OK)."
>
> Debugging - 
>
> After some investigation with kgdb and iommu trace on the DomU, it seems
> the iomap of the PCI BAR for the device returns a mapping whose first
> 0x1000 bytes are read-only. That causes an access violation when
> trying to write registers mapped to this area (all the regs with
> offset < 0x1000); why this happens I still don't know. Register
> writes with offsets > 0x1000 are fine.
>
> Running the same driver on Dom0 is totally fine.
>
> Below are the SIGSEGV backtrace and Xen dmesg from the DomU.
>
> As can be seen, there is an ioremap_ of size 0x10000 starting
> at ffffc900402c0000 but, as I said, I noticed that anything below
> ffffc900402c1000 is not writable (from gdb using set addr = val), only
> readable, while anything above this address is writable.
>
> Question -
>
> Please give any advice on this issue, especially on how to approach
> debugging this on both DomU and Dom0, and on where in the Xen code to
> look for possible issues.

First of all, is this a PV or HVM domU ?

Is this BAR the same BAR which has the MSI-X table in?  For safety, Xen
has to trap and emulate updates to the MSI/MSI-X configuration.  It is
possible that that logic has gone wrong.

~Andrew
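
To illustrate the question above: if the MSI-X table lives in the same BAR
as the device registers, every register that shares a 4 KiB page with the
table is affected by whatever protection is applied to that page. The
sketch below assumes (this is an assumption, not something confirmed in
this thread) that the protection is applied at 4 KiB page granularity to
the pages holding the table. The function names are hypothetical. The
sample values (table at BAR 0 offset 0, one entry) are the ones lspci
reports later in the thread.

```python
PAGE_SIZE = 0x1000
MSIX_ENTRY_SIZE = 16  # bytes per MSI-X table entry, per the PCI specification


def pages_covering(offset, length, page_size=PAGE_SIZE):
    """Page-aligned BAR offsets of every page touched by [offset, offset + length)."""
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return [page * page_size for page in range(first, last + 1)]


def shares_page_with_table(reg_offset, table_offset, entries):
    """True if a register at reg_offset sits on a page that holds MSI-X table bytes."""
    table_pages = pages_covering(table_offset, entries * MSIX_ENTRY_SIZE)
    return (reg_offset // PAGE_SIZE) * PAGE_SIZE in table_pages


# Faulting register offset from the oops (0x40) and a few others for comparison.
for reg in (0x40, 0xFFC, 0x1000, 0x8090):
    print("%#06x" % reg, shares_page_with_table(reg, table_offset=0, entries=1))
```

With a one-entry table at offset 0, only offsets below 0x1000 share a page
with the table, which lines up with the range reported as read-only.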

* Re: Atheros WiFi - memory paging failure on driver load
  2016-07-15 10:04 ` Andrew Cooper
@ 2016-07-15 10:16   ` Jürgen Walter • Quattru
  2016-07-16  3:45   ` Andrey Grodzovsky
  1 sibling, 0 replies; 9+ messages in thread
From: Jürgen Walter • Quattru @ 2016-07-15 10:16 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Andrey Grodzovsky, Jan Beulich, xen-devel

> First of all, is this a PV or HVM domU ?
it is a PV

* Re: Atheros WiFi - memory paging failure on driver load
  2016-07-15 10:04 ` Andrew Cooper
  2016-07-15 10:16   ` Jürgen Walter • Quattru
@ 2016-07-16  3:45   ` Andrey Grodzovsky
  2016-07-18  3:29     ` Andrey Grodzovsky
  1 sibling, 1 reply; 9+ messages in thread
From: Andrey Grodzovsky @ 2016-07-16  3:45 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Jan Beulich, Jürgen Walter • Quattru, xen-devel


On Fri, Jul 15, 2016 at 6:04 AM, Andrew Cooper <andrew.cooper3@citrix.com>
wrote:

> On 12/07/16 04:59, Andrey Grodzovsky wrote:
>
> Hello
>
> Some background -
>
> We are trying to run a Qualcomm Atheros AR928X Wireless Network Adapter and
> get a crash right on driver load. Our observations and questions follow.
>
> Jurgen's observation -
>
> " The Atheros card "Qualcomm Atheros AR928X Wireless Network Adapter
> (PCI-Express) (rev 01)"  is plugged into the host system (datatron).
> When I attach it to the DomU - the module "ath9k" is automatically loaded,
> but it gives an exception "iowrite32+0x2b/0x30".
> No idea what the issue is (tried also with another Atheros Card (ath10k) -
> similar problem). When I try an Intel card, it works.
> (the card also works on the Dom0 - so the Linux driver and HW is OK)."
>
> Debugging -
>
> After some investigation with kgdb and iommu trace on the DomU, it seems the
> iomap of the PCI BAR for the device returns a mapping whose first 0x1000
> bytes are read-only. That causes an access violation when trying to write
> registers mapped to this area (all the regs with offset < 0x1000); why
> this happens I still don't know. Register writes with offsets > 0x1000 are
> fine.
>
> Running the same driver on Dom0 is totally fine.
>
> Below are the SIGSEGV backtrace and Xen dmesg from the DomU.
>
> As can be seen, there is an ioremap_ of size 0x10000 starting
> at ffffc900402c0000 but, as I said, I noticed that anything below
> ffffc900402c1000 is not writable (from gdb using set addr = val), only
> readable, while anything above this address is writable.
>
> Question -
>
> Please give any advice on this issue, especially on how to approach
> debugging this on both DomU and Dom0, and on where in the Xen code to look
> for possible issues.
>
>
> First of all, is this a PV or HVM domU ?
>
> Is this BAR the same BAR which has the MSI-X table in?  For safety, Xen
> has to trap and emulate updates to the MSI/MSI-X configuration.  It is
> possible that that logic has gone wrong.
>
> ~Andrew
>

As far as I understand from looking at lspci -vvv for this device on the
guest (after attaching and the SIGSEGV), it is on the same (and only) BAR,
but MSI/MSI-X is not enabled? (the "-" next to the property)
Please take a look at the dump.

00:00.0 Network controller: Qualcomm Atheros AR928X Wireless Network
Adapter (PCI-Express) (rev 01)
Subsystem: Qualcomm Atheros AR928X Wireless Network Adapter (PCI-Express)
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 25
Region 0: Memory at f7b00000 (64-bit, non-prefetchable) [size=64K]
Capabilities: [40] Power Management version 2
Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
*Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-*
* Address: 00000000  Data: 0000*
Capabilities: [60] Express (v1) Legacy Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L1, Exit Latency L0s <512ns,
L1 <64us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt-
ABWMgmt-
*Capabilities: [90] MSI-X: Enable- Count=1 Masked-*
* Vector table: BAR=0 offset=00000000*
* PBA: BAR=0 offset=00000000*
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+
ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Capabilities: [140 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
Status: NegoPending- InProgress-
Capabilities: [160 v1] Device Serial Number 00-00-00-00-00-00-00-00
Kernel driver in use: ath9k
Kernel modules: ath9k
00: 8c 16 2a 00 07 01 10 00 01 00 80 02 08 00 00 00
10: 04 00 b0 f7 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 8c 16 99 30
30: 00 00 00 00 40 00 00 00 00 00 00 00 10 01 00 00
40: 01 50 c2 03 00 00 00 00 00 00 00 00 00 00 00 00
50: 05 60 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 10 90 11 00 c0 0c 90 05 00 20 01 00 11 38 03 00
70: 48 00 11 10 00 00 00 00 c0 03 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 11 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
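
The MSI-X capability that lspci decodes above can also be read straight
from the raw config-space dump (the row at offset 0x90). A minimal sketch,
assuming the standard MSI-X capability layout from the PCI specification
(capability ID 0x11, 16-bit message control, then the table and PBA
offset/BIR dwords); parse_msix_cap is a hypothetical helper name.

```python
import struct

# First 12 bytes of the "90:" row of the config-space dump above.
CAP_BYTES = bytes.fromhex("11 00 00 00 00 00 00 00 00 00 00 00")

MSIX_CAP_ID = 0x11


def parse_msix_cap(raw):
    """Decode an MSI-X capability structure from little-endian config bytes."""
    cap_id, next_ptr, ctrl, table, pba = struct.unpack_from("<BBHII", raw)
    if cap_id != MSIX_CAP_ID:
        raise ValueError("not an MSI-X capability: id=%#x" % cap_id)
    return {
        "next_ptr": next_ptr,
        "entries": (ctrl & 0x7FF) + 1,        # table size is encoded as N - 1
        "function_masked": bool(ctrl & 0x4000),
        "enabled": bool(ctrl & 0x8000),
        "table_bir": table & 0x7,             # which BAR holds the table
        "table_offset": table & ~0x7,
        "pba_bir": pba & 0x7,
        "pba_offset": pba & ~0x7,
    }


for key, value in parse_msix_cap(CAP_BYTES).items():
    print(key, value)
```

The decoded fields agree with the lspci text: one entry, not enabled, not
masked, with both the vector table and the PBA at BAR 0 offset 0.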


>
>

* Re: Atheros WiFi - memory paging failure on driver load
  2016-07-16  3:45   ` Andrey Grodzovsky
@ 2016-07-18  3:29     ` Andrey Grodzovsky
  2016-07-18 18:22       ` Andrew Cooper
  0 siblings, 1 reply; 9+ messages in thread
From: Andrey Grodzovsky @ 2016-07-18  3:29 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Jan Beulich, Jürgen Walter • Quattru, xen-devel


On Fri, Jul 15, 2016 at 11:45 PM, Andrey Grodzovsky <andrey2805@gmail.com>
wrote:

>
>
> On Fri, Jul 15, 2016 at 6:04 AM, Andrew Cooper <andrew.cooper3@citrix.com>
> wrote:
>
>> On 12/07/16 04:59, Andrey Grodzovsky wrote:
>>
>> Hello
>>
>> Some background -
>>
>> We are trying to run Qualcomm Atheros AR928X Wireless Network Adapter and
>> have a crash right on driver load, following are our observations and
>> questions.
>>
>> Jurgen's observation -
>>
>> " The Atheros card "Qualcomm Atheros AR928X Wireless Network Adapter
>> (PCI-Express) (rev 01)"  is plugged into the host system (datatron).
>> When I attach it to the DomU - the module "ath9k" is automatically
>> loaded, but it gives an exception "iowrite32+0x2b/0x30".
>> No idea what the issue is (tried also with another Atheros Card (ath10k)
>> - similar problem). When I try an Intel card, it works.
>> (the card also works on the Dom0 - so the Linux driver and HW is OK)."
>>
>> Debugging -
>>
>> After some investigation with kgdb and iommu trace on DomU it seems the
>> iomap of PCI BAR for the device returns a a mapping f which first 0x1000
>> bytes are read only and that causes access violation when trying to write
>> registers mapped to this area (all the regs with offset < 0x1000) - why
>> this happens i still don't know. Register writes with offsets > 0x1000 are
>> fine.
>>
>> Running same driver on Dom0 is totally fine
>>
>> Bellow the sigsev backtrace and xen dmesg from DomU
>>
>> As can be seen there is ioremap_ of size 0x10000 starting
>> at ffffc900402c0000 but as i said, i noticed that anything bellow
>> ffffc900402c*1000 *is not writable (from gdb using set addr = val) and
>> only readable while anything above this address is writeable.
>>
>> Question -
>>
>> Please give any advise on this issue and especially how to approach
>> debugging this both on Domu and Dom0 and where in xen code to look for
>> possible issues.
>>
>>
>> First of all, is this a PV or HVM domU ?
>>
>> Is this BAR the same BAR which has the MSI-X table in?  For safety, Xen
>> has to trap and emulate updates to the MSI/MSI-X configuration.  It is
>> possible that that logic has gone wrong.
>>
>> ~Andrew
>>
>
> As much as I understand from looking at lspci -vvv for this device on the
> guest (after attaching and SIGSEV) It is on the same (and and only) BAR but
> MSI/MS-X is not enabled  ? (the "-" next to the property)
> Please take a look at the dump.
>
> 00:00.0 Network controller: Qualcomm Atheros AR928X Wireless Network
> Adapter (PCI-Express) (rev 01)
> Subsystem: Qualcomm Atheros AR928X Wireless Network Adapter (PCI-Express)
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR+ FastB2B- DisINTx-
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
> <MAbort- >SERR- <PERR- INTx-
> Latency: 0, Cache Line Size: 32 bytes
> Interrupt: pin A routed to IRQ 25
> Region 0: Memory at f7b00000 (64-bit, non-prefetchable) [size=64K]
> Capabilities: [40] Power Management version 2
> Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA
> PME(D0-,D1-,D2-,D3hot-,D3cold-)
> Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> *Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-*
> * Address: 00000000  Data: 0000*
> Capabilities: [60] Express (v1) Legacy Endpoint, MSI 00
> DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
> ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
> DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> MaxPayload 128 bytes, MaxReadReq 512 bytes
> DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
> LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L1, Exit Latency L0s
> <512ns, L1 <64us
> ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
> LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk+
> ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt-
> ABWMgmt-
> *Capabilities: [90] MSI-X: Enable- Count=1 Masked-*
> * Vector table: BAR=0 offset=00000000*
> * PBA: BAR=0 offset=00000000*
> Capabilities: [100 v1] Advanced Error Reporting
> UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
> ECRC- UnsupReq- ACSViol-
> UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
> ECRC- UnsupReq- ACSViol-
> UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+
> ECRC- UnsupReq- ACSViol-
> CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> Capabilities: [140 v1] Virtual Channel
> Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
> Arb: Fixed- WRR32- WRR64- WRR128-
> Ctrl: ArbSelect=Fixed
> Status: InProgress-
> VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
> Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
> Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
> Status: NegoPending- InProgress-
> Capabilities: [160 v1] Device Serial Number 00-00-00-00-00-00-00-00
> Kernel driver in use: ath9k
> Kernel modules: ath9k
> 00: 8c 16 2a 00 07 01 10 00 01 00 80 02 08 00 00 00
> 10: 04 00 b0 f7 00 00 00 00 00 00 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 8c 16 99 30
> 30: 00 00 00 00 40 00 00 00 00 00 00 00 10 01 00 00
> 40: 01 50 c2 03 00 00 00 00 00 00 00 00 00 00 00 00
> 50: 05 60 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 60: 10 90 11 00 c0 0c 90 05 00 20 01 00 11 38 03 00
> 70: 48 00 11 10 00 00 00 00 c0 03 00 00 00 00 00 00
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 90: 11 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> a0: 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>
>
>>
>>
I created a full xenalyze trace covering the period from when I ran the
command to attach the device until after the kernel oops.
I hope it gives some clues to the root cause.

[-- Attachment #1.2: Type: text/html, Size: 12660 bytes --]

[-- Attachment #2: xenalyze.cap.zip --]
[-- Type: application/zip, Size: 162751 bytes --]


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Atheros WiFi - memory paging failure on driver load
  2016-07-18  3:29     ` Andrey Grodzovsky
@ 2016-07-18 18:22       ` Andrew Cooper
  2016-07-18 18:56         ` Andrey Grodzovsky
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Cooper @ 2016-07-18 18:22 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Jan Beulich, Jürgen Walter • Quattru, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 2071 bytes --]

On 18/07/16 04:29, Andrey Grodzovsky wrote:
>
>
> On Fri, Jul 15, 2016 at 11:45 PM, Andrey Grodzovsky
> <andrey2805@gmail.com <mailto:andrey2805@gmail.com>> wrote:
>
>
>
>     On Fri, Jul 15, 2016 at 6:04 AM, Andrew Cooper
>     <andrew.cooper3@citrix.com <mailto:andrew.cooper3@citrix.com>> wrote:
>
>         On 12/07/16 04:59, Andrey Grodzovsky wrote:
>>         Hello
>>
>>         Some background -
>>
>>         We are trying to run Qualcomm Atheros AR928X Wireless Network
>>         Adapter and have a crash right on driver load, following are
>>         our observations and questions.
>>
>>         Jurgen's observation - 
>>
>>         " The Atheros card "Qualcomm Atheros AR928X Wireless Network
>>         Adapter (PCI-Express) (rev 01)"  is plugged into the host
>>         system (datatron).
>>         When I attach it to the DomU - the module "ath9k" is
>>         automatically loaded, but it gives an exception
>>         "iowrite32+0x2b/0x30".
>>         No idea what the issue is (tried also with another Atheros
>>         Card (ath10k) - similar problem). When I try an Intel card,
>>         it works.
>>         (the card also works on the Dom0 - so the Linux driver and HW
>>         is OK)."
>>
>>         Debugging - 
>>
>>         After some investigation with kgdb and iommu trace on DomU it
>>         seems the iomap of PCI BAR for the device returns a a mapping
>>         f which first 0x1000 bytes are read only and that causes
>>         access violation when trying to write registers mapped to
>>         this area (all the regs with offset < 0x1000) - why this
>>         happens i still don't know. Register writes with offsets >
>>         0x1000 are fine.
>

Your card is not PCI spec compliant.

The spec mandates that nothing else may exist in any 4k-aligned block
covering part of the MSI-X table, precisely so that read-only tricks like
this can be used to trap and intercept MSI-X updates.
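
To make the page arithmetic concrete, here is a minimal Python sketch of which 4k pages of a BAR end up covering the MSI-X table, and whether a given register offset shares one of them. It assumes 16-byte MSI-X table entries and a 4k page size; the helper names are hypothetical and this is not how Xen itself implements the check.

```python
PAGE_SIZE = 0x1000      # 4k pages, as in the explanation above
MSIX_ENTRY_SIZE = 16    # each MSI-X table entry is 16 bytes

def msix_table_pages(table_offset, table_size, page_size=PAGE_SIZE):
    """Return the range of page indexes (within the BAR) touched by the table."""
    first = table_offset // page_size
    last = (table_offset + table_size * MSIX_ENTRY_SIZE - 1) // page_size
    return range(first, last + 1)

def shares_page_with_table(reg_offset, table_offset, table_size):
    """True if a register at reg_offset lives in a page that covers the table."""
    return reg_offset // PAGE_SIZE in msix_table_pages(table_offset, table_size)

# Values reported by lspci for this card: table in BAR 0, offset 0, one entry.
pages = list(msix_table_pages(table_offset=0, table_size=1))
print(pages)
print(shares_page_with_table(0x0FFC, 0, 1))
print(shares_page_with_table(0x1000, 0, 1))
```

For this card the table occupies only the first 16 bytes of BAR 0, yet the whole first page is affected, which matches the observation that writes to registers below offset 0x1000 fault while writes above it succeed.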



>>
>>         Running same driver on Dom0 is totally fine
>

This is curious.  Dom0 and DomU should be treated identically in this
regard.

~Andrew

[-- Attachment #1.2: Type: text/html, Size: 8985 bytes --]


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Atheros WiFi - memory paging failure on driver load
  2016-07-18 18:22       ` Andrew Cooper
@ 2016-07-18 18:56         ` Andrey Grodzovsky
  2016-07-18 19:16           ` Andrew Cooper
  2016-08-01 11:41           ` Jan Beulich
  0 siblings, 2 replies; 9+ messages in thread
From: Andrey Grodzovsky @ 2016-07-18 18:56 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Jan Beulich, Jürgen Walter • Quattru, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 2133 bytes --]

Thank you for your comments!
A question: would clearing the MSI enable bit on the card and switching
back to the legacy interrupt method resolve the issue?

On Mon, Jul 18, 2016 at 2:22 PM, Andrew Cooper <andrew.cooper3@citrix.com>
wrote:

> On 18/07/16 04:29, Andrey Grodzovsky wrote:
>
>
>
> On Fri, Jul 15, 2016 at 11:45 PM, Andrey Grodzovsky <
> <andrey2805@gmail.com>andrey2805@gmail.com> wrote:
>
>>
>>
>> On Fri, Jul 15, 2016 at 6:04 AM, Andrew Cooper <
>> <andrew.cooper3@citrix.com>andrew.cooper3@citrix.com> wrote:
>>
>>> On 12/07/16 04:59, Andrey Grodzovsky wrote:
>>>
>>> Hello
>>>
>>> Some background -
>>>
>>> We are trying to run Qualcomm Atheros AR928X Wireless Network Adapter
>>> and have a crash right on driver load, following are our observations and
>>> questions.
>>>
>>> Jurgen's observation -
>>>
>>> " The Atheros card "Qualcomm Atheros AR928X Wireless Network Adapter
>>> (PCI-Express) (rev 01)"  is plugged into the host system (datatron).
>>> When I attach it to the DomU - the module "ath9k" is automatically
>>> loaded, but it gives an exception "iowrite32+0x2b/0x30".
>>> No idea what the issue is (tried also with another Atheros Card (ath10k)
>>> - similar problem). When I try an Intel card, it works.
>>> (the card also works on the Dom0 - so the Linux driver and HW is OK)."
>>>
>>> Debugging -
>>>
>>> After some investigation with kgdb and iommu trace on DomU it seems the
>>> iomap of PCI BAR for the device returns a a mapping f which first 0x1000
>>> bytes are read only and that causes access violation when trying to write
>>> registers mapped to this area (all the regs with offset < 0x1000) - why
>>> this happens i still don't know. Register writes with offsets > 0x1000 are
>>> fine.
>>>
>>>
> Your card is not PCI spec compliant.
>
> The Spec mandates that nothing may exist in any 4k aligned block covering
> part of the MSI-X table, precisely so read-only tricks like this can be
> done trap&intercept MSI-X updates.
>
>
>
>
>>> Running same driver on Dom0 is totally fine
>>>
>>>
> This is curious.  Dom0 and DomU should be treated identically in this
> regard.
>
> ~Andrew
>

[-- Attachment #1.2: Type: text/html, Size: 8997 bytes --]


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Atheros WiFi - memory paging failure on driver load
  2016-07-18 18:56         ` Andrey Grodzovsky
@ 2016-07-18 19:16           ` Andrew Cooper
  2016-08-01 11:41           ` Jan Beulich
  1 sibling, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2016-07-18 19:16 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Jürgen Walter • Quattru, Jan Beulich, xen-devel

On 18/07/2016 19:56, Andrey Grodzovsky wrote:
> Thank you for your comments !
> As a question , will disabling MSI enable bit in the card  and
> switching back to legacy interrupt method might resolve the issue ?

I don't know.  That will be down to the driver itself.  However, the use
of legacy line interrupts is really a worst case scenario, and should be
avoided at all costs.

To investigate further, you need to identify which register domU is
attempting to access which is within the first page, and whether dom0
accesses the same register.  It could be that there is some behaviour
difference in the driver between dom0 and domU, or that there is some
behavioural difference in Xen to do with dom0 and domU accesses to this
register.
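
One way to organise that comparison, once register accesses have been logged on both sides, is sketched below in Python. It assumes each log has already been reduced to (BAR offset, is_write) pairs. How the logs are collected is left open, and the offsets shown are placeholders rather than real ath9k registers or data from this thread.

```python
PAGE_SIZE = 0x1000

def first_page_writes(accesses, page_size=PAGE_SIZE):
    """Return the sorted BAR offsets that were written inside the first page."""
    return sorted({off for off, is_write in accesses if is_write and off < page_size})

def compare_domains(dom0_accesses, domu_accesses):
    """Split first-page write offsets into common and domain-specific sets."""
    dom0 = set(first_page_writes(dom0_accesses))
    domu = set(first_page_writes(domu_accesses))
    return {
        "both": sorted(dom0 & domu),
        "dom0_only": sorted(dom0 - domu),
        "domu_only": sorted(domu - dom0),
    }

# Placeholder logs (hypothetical offsets, not taken from a real trace).
dom0_log = [(0x0008, True), (0x4028, True), (0x0024, False)]
domu_log = [(0x0008, True), (0x0014, True), (0x7000, True)]

result = compare_domains(dom0_log, domu_log)
print({k: [hex(o) for o in v] for k, v in result.items()})
```

An offset that shows up under "both" would point towards a difference in how Xen handles the access for dom0 versus domU, while an offset that appears only for domU would point towards a difference in driver behaviour.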

~Andrew


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Atheros WiFi - memory paging failure on driver load
  2016-07-18 18:56         ` Andrey Grodzovsky
  2016-07-18 19:16           ` Andrew Cooper
@ 2016-08-01 11:41           ` Jan Beulich
  1 sibling, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2016-08-01 11:41 UTC (permalink / raw)
  To: Andrey Grodzovsky; +Cc: Andrew Cooper, jw, xen-devel

>>> On 18.07.16 at 20:56, <andrey2805@gmail.com> wrote:
> As a question , will disabling MSI enable bit in the card  and switching
> back to legacy interrupt method might resolve the issue ?

No, at least not with the current implementation: MSI-X gets
"prepared" (PHYSDEVOP_prepare_msix) by the pciback driver,
i.e. the r/o marking of the respective page(s) happens outside
of the control of the targeted DomU.

Jan



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-08-01 11:41 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-12  3:59 Atheros WiFi - memory paging failure on driver load Andrey Grodzovsky
2016-07-15 10:04 ` Andrew Cooper
2016-07-15 10:16   ` Jürgen Walter • Quattru
2016-07-16  3:45   ` Andrey Grodzovsky
2016-07-18  3:29     ` Andrey Grodzovsky
2016-07-18 18:22       ` Andrew Cooper
2016-07-18 18:56         ` Andrey Grodzovsky
2016-07-18 19:16           ` Andrew Cooper
2016-08-01 11:41           ` Jan Beulich
