* PVH dom0 creation fails - the system freezes
@ 2018-07-23 11:50 bercarug
  2018-07-24  9:54 ` Jan Beulich
  0 siblings, 1 reply; 47+ messages in thread
From: bercarug @ 2018-07-23 11:50 UTC (permalink / raw)
  To: xen-devel, dwmw2, abelgun


[-- Attachment #1.1: Type: text/plain, Size: 3813 bytes --]

Hello,

For the last few days, I have been trying to get a PVH dom0 running;
however, I encountered the following problem: the system seems to
freeze after the hypervisor boots and the screen goes black. I have
tried to debug it via a serial console (using Minicom) and managed to
capture some additional Xen output after the screen turns black.
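
A typical Minicom invocation for this setup would be something like the
following (the device node is just an example and depends on the serial
adapter used; the speed matches the com1=115200,8n1 setting below):

    minicom -b 115200 -D /dev/ttyUSB0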

Note that I have tried to boot the PVH dom0 using different kernel
images (from 4.9.0 to 4.18-rc3) and different Xen versions (4.10, 4.11, 4.12).

Below I have attached my system/hypervisor configuration, as well as the
output captured through the serial console, corresponding to the latest
versions of Xen and the Linux kernel (Xen staging and the kernel from
the xen/tip tree).


OS + Distro: Linux / Debian 9 Stretch
Kernel Version: 4.17-rc5, tagged with for-linus-4.18-rc5-tag from the
xen/tip tree.
Xen Version: 4.12, commit id e3f667bc5f51d0aa44357a64ca134cd952679c81
of the Xen tree.
Host system: attached cpuinfo.log
Serial console output: attached boot.log
My grub configuration file, containing the Xen command line arguments:
attached grub.log

I can provide additional info as requested.
Any ideas why this happens? Do you have any recommendations for additional
debugging?

Here are the last few lines of the boot log. The last (separated) ones
were only visible through the serial console, since at that point the
screen was completely black.

(XEN) *** Building a PVH Dom0 ***
(XEN) [VT-D]d0:Hostbridge: skip 0000:00:00.0 map
(XEN) [VT-D]d0:PCI: map 0000:00:14.0
(XEN) [VT-D]d0:PCI: map 0000:00:14.2
(XEN) [VT-D]d0:PCI: map 0000:00:16.0
(XEN) [VT-D]d0:PCI: map 0000:00:16.1
(XEN) [VT-D]d0:PCI: map 0000:00:17.0
(XEN) [VT-D]d0:PCI: map 0000:00:1f.0
(XEN) [VT-D]d0:PCI: map 0000:00:1f.2
(XEN) [VT-D]d0:PCI: map 0000:00:1f.4
(XEN) [VT-D]d0:PCIe: map 0000:01:00.0
(XEN) [VT-D]d0:PCIe: map 0000:02:00.0
(XEN) [VT-D]d0:PCIe: map 0000:03:00.0
(XEN) [VT-D]d0:PCIe: map 0000:04:00.0
(XEN) [VT-D]iommu_enable_translation: iommu->reg = ffff82c00021b000
(XEN) WARNING: PVH is an experimental mode with limited functionality
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Scrubbing Free RAM on 1 nodes using 4 CPUs
(XEN) ...................................................................................................................................done.
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) ***************************************************
(XEN) WARNING: CONSOLE OUTPUT IS SYNCHRONOUS
(XEN) This option is intended to aid debugging of Xen by ensuring
(XEN) that all output is synchronously delivered on the serial line.
(XEN) However it can introduce SIGNIFICANT latencies and affect
(XEN) timekeeping. It is NOT recommended for production use!
(XEN) ***************************************************
(XEN) 3... 2... 1...

(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 468kB init memory
(XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
(XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
(XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault addr 8deb3000, iommu reg = ffff82c00021b000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) print_vtd_entries: iommu #0 dev 0000:00:14.0 gmfn 8deb3
(XEN) root_entry[00] = 1021c60001
(XEN) context[a0] = 2_1021d6d001
(XEN) l4[000] = 9c00001021d6c107
(XEN) l3[002] = 9c00001021d3e107
(XEN) l2[06f] = 9c000010218c0107
(XEN) l1[0b3] = 8000000000000000
(XEN) l1[0b3] not present
(XEN) Dom0 callback via changed to Direct Vector 0xf3

Thanks,
Gabriel






[-- Attachment #2: boot.log --]
[-- Type: text/x-log, Size: 12019 bytes --]

(XEN) Xen version 4.12-unstable (root@) (gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516) debug=y  Mon Jul 16 09:20:26 EDT 2018
(XEN) Latest ChangeSet: Thu Jul 12 18:48:06 2018 +0200 git:e3f667bc5f
(XEN) Console output is synchronous.
(XEN) Bootloader: GRUB 2.02~beta3-5
(XEN) Command line: placeholder dom0=pvh dom0_mem=4096M loglvl=all sync_console console_to_ring=true console=com1,vga com1=115200,8n1 iommu=debug,verbose,workaround_bios_bug iommu_inclusive_mapping=true
(XEN) Xen image load base address: 0
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: V2; EDID transfer time: 1 seconds
(XEN) Disc information:
(XEN)  Found 1 MBR signatures
(XEN)  Found 2 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 0000000000098c00 (usable)
(XEN)  0000000000098c00 - 00000000000a0000 (reserved)
(XEN)  00000000000e0000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 000000008c1c4000 (usable)
(XEN)  000000008c1c4000 - 000000008c1c5000 (ACPI NVS)
(XEN)  000000008c1c5000 - 000000008c20f000 (reserved)
(XEN)  000000008c20f000 - 000000008c281000 (usable)
(XEN)  000000008c281000 - 000000008dec1000 (reserved)
(XEN)  000000008dec1000 - 000000008df9a000 (ACPI NVS)
(XEN)  000000008df9a000 - 000000008dfff000 (ACPI data)
(XEN)  000000008dfff000 - 000000008e000000 (usable)
(XEN)  000000008e000000 - 0000000090000000 (reserved)
(XEN)  0000000094000000 - 000000009a000000 (reserved)
(XEN)  000000009df00000 - 00000000a0000000 (reserved)
(XEN)  00000000e0000000 - 00000000f0000000 (reserved)
(XEN)  00000000fd000000 - 00000000fe800000 (reserved)
(XEN)  00000000fec00000 - 00000000fec01000 (reserved)
(XEN)  00000000fed00000 - 00000000fed01000 (reserved)
(XEN)  00000000fed10000 - 00000000fed1a000 (reserved)
(XEN)  00000000fed84000 - 00000000fed85000 (reserved)
(XEN)  00000000fee00000 - 00000000fee01000 (reserved)
(XEN)  00000000ff400000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 0000001060000000 (usable)
(XEN) New Xen image base address: 0x8ba00000
(XEN) ACPI: RSDP 000F0510, 0024 (r2 INTEL )
(XEN) ACPI: XSDT 8DFB7188, 00EC (r1 INTEL  S1200SPO        0 INTL  1000013)
(XEN) ACPI: FACP 8DFF3000, 00F4 (r5 INTEL  S1200SPO        0 INTL 20091013)
(XEN) ACPI: DSDT 8DFC3000, 29241 (r2 INTEL  S1200SPO        0 INTL 20091013)
(XEN) ACPI: FACS 8DF6D000, 0040
(XEN) ACPI: HPET 8DFF2000, 0038 (r1 INTEL  S1200SPO        1 INTL 20091013)
(XEN) ACPI: APIC 8DFF1000, 00BC (r3 INTEL  S1200SPO        1 INTL 20091013)
(XEN) ACPI: MCFG 8DFF0000, 003C (r1 INTEL  S1200SPO        1 INTL 20091013)
(XEN) ACPI: SPMI 8DFEE000, 0042 (r5 INTEL  S1200SPO        0 INTL 20091013)
(XEN) ACPI: WDDT 8DFED000, 0040 (r1 INTEL  S1200SPO        0 INTL 20091013)
(XEN) ACPI: SSDT 8DFC0000, 2BAE (r2 INTEL  S1200SPO     1000 INTL 20091013)
(XEN) ACPI: SSDT 8DFBF000, 0BE3 (r2 INTEL  S1200SPO     1000 INTL 20091013)
(XEN) ACPI: SSDT 8DFBE000, 019A (r2 INTEL  S1200SPO     1000 INTL 20091013)
(XEN) ACPI: SSDT 8DFBD000, 04A3 (r2 INTEL  S1200SPO     1000 INTL 20091013)
(XEN) ACPI: TCPA 8DFFC000, 0064 (r2 INTEL  S1200SPO        2 INTL  1000013)
(XEN) ACPI: TPM2 8DFFA000, 0034 (r3 INTEL  S1200SPO        2 INTL  1000013)
(XEN) ACPI: SSDT 8DFF4000, 5328 (r2 INTEL  S1200SPO     3000 INTL 20141107)
(XEN) ACPI: SSDT 8DFBC000, 0E73 (r2 INTEL  S1200SPO     3000 INTL 20141107)
(XEN) ACPI: SSDT 8DFBA000, 0064 (r2 INTEL  S1200SPO        2 INTL 20141107)
(XEN) ACPI: DMAR 8DFB8000, 0070 (r1 INTEL  S1200SPO        1 INTL        1)
(XEN) ACPI: HEST 8DFFD000, 00A8 (r1 INTEL  S1200SPO        1 INTL        1)
(XEN) ACPI: ERST 8DFB5000, 0230 (r1 INTEL  S1200SPO        1 INTL        1)
(XEN) ACPI: SSDT 8DFFB000, 03A7 (r2 INTEL  S1200SPO     1000 INTL 20141107)
(XEN) ACPI: SSDT 8DFBB000, 0B79 (r2 INTEL  S1200SPO        2 INTL 20141107)
(XEN) ACPI: BERT 8DFB6000, 0030 (r1 INTEL  S1200SPO        1 INTL        1)
(XEN) ACPI: UEFI 8DF82000, 0042 (r1 INTEL  S1200SPO        2 INTL  1000013)
(XEN) ACPI: PRAD 8DFB9000, 0102 (r2 INTEL  S1200SPO        2 INTL 20141107)
(XEN) ACPI: EINJ 8DFB4000, 0130 (r1 INTEL  S1200SPO        1 INTL        1)
(XEN) ACPI: SPCR 8DFEF000, 0050 (r1 INTEL  S1200SPO        0 INTL 20091013)
(XEN) System RAM: 65217MB (66783036kB)
(XEN) No NUMA configuration found
(XEN) Faking a node at 0000000000000000-0000001060000000
(XEN) Domain heap initialised
(XEN) CPU Vendor: Intel, Family 6 (0x6), Model 158 (0x9e), Stepping 9 (raw 000906e9)
(XEN) DMI 2.7 present.
(XEN) Using APIC driver default
(XEN) ACPI: PM-Timer IO Port: 0x1808 (32 bits)
(XEN) ACPI: SLEEP INFO: pm1x_cnt[1:1804,1:0], pm1x_evt[1:1800,1:0]
(XEN) ACPI: 32/64X FACS address mismatch in FADT - 8df6d000/0000000000000000, using 32
(XEN) ACPI:             wakeup_vec[8df6d00c], vec_size[20]
(XEN) ACPI: Local APIC address 0xfee00000
(XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x05] lapic_id[0x01] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x06] lapic_id[0x03] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x07] lapic_id[0x05] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x08] lapic_id[0x07] enabled)
(XEN) ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x04] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x05] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x06] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x07] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x08] high edge lint[0x1])
(XEN) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
(XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-119
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
(XEN) ACPI: IRQ0 used by override.
(XEN) ACPI: IRQ2 used by override.
(XEN) ACPI: IRQ9 used by override.
(XEN) Enabling APIC mode:  Flat.  Using 1 I/O APICs
(XEN) ACPI: HPET id: 0x8086a201 base: 0xfed00000
(XEN) [VT-D]Host address width 39
(XEN) [VT-D]found ACPI_DMAR_DRHD:
(XEN) [VT-D]  dmaru->address = fed90000
(XEN) [VT-D]drhd->address = fed90000 iommu->reg = ffff82c00021b000
(XEN) [VT-D]cap = d2008c40660462 ecap = f050da
(XEN) [VT-D] IOAPIC: 0000:f0:1f.0
(XEN) [VT-D] MSI HPET: 0000:00:1f.0
(XEN) [VT-D]  flags: INCLUDE_ALL
(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D]  RMRR address range 3e2e0000..3e2fffff not in reserved memory; need "iommu_inclusive_mapping=1"?
(XEN) [VT-D] endpoint: 0000:00:14.0
(XEN) [VT-D]dmar.c:638:   RMRR region: base_addr 3e2e0000 end_addr 3e2fffff
(XEN) Xen ERST support is initialized.
(XEN) HEST: Table parsing has been initialized
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) SMP: Allowing 8 CPUs (0 hotplug CPUs)
(XEN) IRQ limits: 120 GSI, 1432 MSI/MSI-X
(XEN) Not enabling x2APIC (upon firmware request)
(XEN) xstate: size: 0x440 and states: 0x1f
(XEN) mce_intel.c:782: MCA Capability: firstbank 0, extended MCE MSR 0, BCAST, CMCI
(XEN) CPU0: Intel machine check reporting enabled
(XEN) Speculative mitigation facilities:
(XEN)   Hardware features:
(XEN)   Compiled-in support: INDIRECT_THUNK
(XEN)   Xen settings: BTI-Thunk RETPOLINE, SPEC_CTRL: No, Other:
(XEN)   Support for VMs: PV: RSB EAGER_FPU, HVM: RSB EAGER_FPU
(XEN)   XPTI (64-bit PV only): Dom0 enabled, DomU enabled
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Platform timer is 23.999MHz HPET
(XEN) Detected 3792.189 MHz processor.
(XEN) Initing memory sharing.
(XEN) alt table ffff82d080465858 -> ffff82d08046759a
(XEN) PCI: MCFG configuration 0: base e0000000 segment 0000 buses 00 - ff
(XEN) PCI: MCFG area at e0000000 reserved in E820
(XEN) PCI: Using MCFG for segment 0000 bus 00-ff
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB.
(XEN) Intel VT-d Snoop Control enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) Intel VT-d Posted Interrupt not enabled.
(XEN) Intel VT-d Shared EPT tables enabled.
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) nr_sockets: 1
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using old ACK method
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) TSC deadline timer enabled
(XEN) Allocated console ring of 64 KiB.
(XEN) mwait-idle: MWAIT substates: 0x142120
(XEN) mwait-idle: v0.4.1 model 0x9e
(XEN) mwait-idle: lapic_timer_reliable_states 0xffffffff
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN)  - VMCS shadowing
(XEN)  - VM Functions
(XEN)  - Virtualisation Exceptions
(XEN)  - Page Modification Logging
(XEN) HVM: ASIDs enabled.
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging (HAP) detected
(XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
(XEN) Brought up 8 CPUs
(XEN) build-id: 217e72f0df3db05d72d6fa8b460b643bad6c230e
(XEN) Running stub recovery selftests...
(XEN) traps.c:1570: GPF (0000): ffff82d0bffff041 [ffff82d0bffff041] -> ffff82d08037b412
(XEN) traps.c:755: Trap 12: ffff82d0bffff040 [ffff82d0bffff040] -> ffff82d08037b412
(XEN) traps.c:1097: Trap 3: ffff82d0bffff041 [ffff82d0bffff041] -> ffff82d08037b412
(XEN) ACPI sleep modes: S3
(XEN) VPMU: disabled
(XEN) mcheck_poll: Machine check polling timer started.
(XEN) Dom0 has maximum 888 PIRQs
(XEN) NX (Execute Disable) protection active
(XEN) *** Building a PVH Dom0 ***
(XEN) [VT-D]d0:Hostbridge: skip 0000:00:00.0 map
(XEN) [VT-D]d0:PCI: map 0000:00:14.0
(XEN) [VT-D]d0:PCI: map 0000:00:14.2
(XEN) [VT-D]d0:PCI: map 0000:00:16.0
(XEN) [VT-D]d0:PCI: map 0000:00:16.1
(XEN) [VT-D]d0:PCI: map 0000:00:17.0
(XEN) [VT-D]d0:PCI: map 0000:00:1f.0
(XEN) [VT-D]d0:PCI: map 0000:00:1f.2
(XEN) [VT-D]d0:PCI: map 0000:00:1f.4
(XEN) [VT-D]d0:PCIe: map 0000:01:00.0
(XEN) [VT-D]d0:PCIe: map 0000:02:00.0
(XEN) [VT-D]d0:PCIe: map 0000:03:00.0
(XEN) [VT-D]d0:PCIe: map 0000:04:00.0
(XEN) [VT-D]iommu_enable_translation: iommu->reg = ffff82c00021b000
(XEN) WARNING: PVH is an experimental mode with limited functionality
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Scrubbing Free RAM on 1 nodes using 4 CPUs
(XEN) ...................................................................................................................................done.
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) ***************************************************
(XEN) WARNING: CONSOLE OUTPUT IS SYNCHRONOUS
(XEN) This option is intended to aid debugging of Xen by ensuring
(XEN) that all output is synchronously delivered on the serial line.
(XEN) However it can introduce SIGNIFICANT latencies and affect
(XEN) timekeeping. It is NOT recommended for production use!
(XEN) ***************************************************
(XEN) 3... 2... 1... 
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 468kB init memory
(XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
(XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
(XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault addr 8deb3000, iommu reg = ffff82c00021b000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) print_vtd_entries: iommu #0 dev 0000:00:14.0 gmfn 8deb3
(XEN)     root_entry[00] = 1021c60001
(XEN)     context[a0] = 2_1021d6d001
(XEN)     l4[000] = 9c00001021d6c107
(XEN)     l3[002] = 9c00001021d3e107
(XEN)     l2[06f] = 9c000010218c0107
(XEN)     l1[0b3] = 8000000000000000
(XEN)     l1[0b3] not present
(XEN) Dom0 callback via changed to Direct Vector 0xf3

[-- Attachment #3: cpuinfo.log --]
[-- Type: text/x-log, Size: 10009 bytes --]

cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 158
model name	: Intel(R) Xeon(R) CPU E3-1270 v6 @ 3.80GHz
stepping	: 9
microcode	: 0x5e
cpu MHz		: 800.031
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass
bogomips	: 7584.00
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 158
model name	: Intel(R) Xeon(R) CPU E3-1270 v6 @ 3.80GHz
stepping	: 9
microcode	: 0x5e
cpu MHz		: 799.716
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 1
cpu cores	: 4
apicid		: 2
initial apicid	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass
bogomips	: 7584.00
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 2
vendor_id	: GenuineIntel
cpu family	: 6
model		: 158
model name	: Intel(R) Xeon(R) CPU E3-1270 v6 @ 3.80GHz
stepping	: 9
microcode	: 0x5e
cpu MHz		: 800.285
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 2
cpu cores	: 4
apicid		: 4
initial apicid	: 4
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass
bogomips	: 7584.00
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 158
model name	: Intel(R) Xeon(R) CPU E3-1270 v6 @ 3.80GHz
stepping	: 9
microcode	: 0x5e
cpu MHz		: 800.068
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 3
cpu cores	: 4
apicid		: 6
initial apicid	: 6
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass
bogomips	: 7584.00
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 4
vendor_id	: GenuineIntel
cpu family	: 6
model		: 158
model name	: Intel(R) Xeon(R) CPU E3-1270 v6 @ 3.80GHz
stepping	: 9
microcode	: 0x5e
cpu MHz		: 801.970
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 0
cpu cores	: 4
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass
bogomips	: 7584.00
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 5
vendor_id	: GenuineIntel
cpu family	: 6
model		: 158
model name	: Intel(R) Xeon(R) CPU E3-1270 v6 @ 3.80GHz
stepping	: 9
microcode	: 0x5e
cpu MHz		: 800.277
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 1
cpu cores	: 4
apicid		: 3
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass
bogomips	: 7584.00
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 6
vendor_id	: GenuineIntel
cpu family	: 6
model		: 158
model name	: Intel(R) Xeon(R) CPU E3-1270 v6 @ 3.80GHz
stepping	: 9
microcode	: 0x5e
cpu MHz		: 800.104
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 2
cpu cores	: 4
apicid		: 5
initial apicid	: 5
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass
bogomips	: 7584.00
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 158
model name	: Intel(R) Xeon(R) CPU E3-1270 v6 @ 3.80GHz
stepping	: 9
microcode	: 0x5e
cpu MHz		: 800.171
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 3
cpu cores	: 4
apicid		: 7
initial apicid	: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass
bogomips	: 7584.00
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

[-- Attachment #4: grub.log --]
[-- Type: text/x-log, Size: 1473 bytes --]

cat /etc/default/grub 
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT=""
GRUB_CMDLINE_LINUX="console=tty0,115200n8 console=tty1,115200n8 console=ttyS0,115200n8 console=ttyS1,115200n8"
GRUB_CMDLINE_XEN="dom0=pvh dom0_mem=4096M loglvl=all sync_console console_to_ring=true console=com1,vga com1=115200,8n1 iommu=debug,verbose,workaround_bios_bug iommu_inclusive_mapping=true"

# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

# Uncomment to disable graphical terminal (grub-pc only)
#GRUB_TERMINAL=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480

# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true

# Uncomment to disable generation of recovery mode menu entries
#GRUB_DISABLE_RECOVERY="true"

# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"



* Re: PVH dom0 creation fails - the system freezes
  2018-07-23 11:50 PVH dom0 creation fails - the system freezes bercarug
@ 2018-07-24  9:54 ` Jan Beulich
  2018-07-25 10:06   ` bercarug
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2018-07-24  9:54 UTC (permalink / raw)
  To: bercarug; +Cc: xen-devel, David Woodhouse, abelgun

>>> On 23.07.18 at 13:50, <bercarug@amazon.com> wrote:
> For the last few days, I have been trying to get a PVH dom0 running,
> however I encountered the following problem: the system seems to
> freeze after the hypervisor boots, the screen goes black. I have tried to
> debug it via a serial console (using Minicom) and managed to get some
> more Xen output, after the screen turns black.
> 
> I mention that I have tried to boot the PVH dom0 using different kernel
> images (from 4.9.0 to 4.18-rc3), different Xen  versions (4.10, 4.11, 4.12).
> 
> Below I attached my system / hypervisor configuration, as well as the
> output captured through the serial console, corresponding to the latest
> versions for Xen and the Linux Kernel (Xen staging and Kernel from the
> xen/tip tree).
> [...]
> (XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
> (XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault addr 8deb3000, iommu reg = ffff82c00021b000
> (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
> (XEN) print_vtd_entries: iommu #0 dev 0000:00:14.0 gmfn 8deb3
> (XEN) root_entry[00] = 1021c60001
> (XEN) context[a0] = 2_1021d6d001
> (XEN) l4[000] = 9c00001021d6c107
> (XEN) l3[002] = 9c00001021d3e107
> (XEN) l2[06f] = 9c000010218c0107
> (XEN) l1[0b3] = 8000000000000000
> (XEN) l1[0b3] not present
> (XEN) Dom0 callback via changed to Direct Vector 0xf3

This might be a hint at a missing RMRR entry in the ACPI tables, as
we've seen to be the case for a number of systems (I dare to guess
that 0000:00:14.0 is a USB controller, perhaps one with a keyboard
and/or mouse connected). You may want to play with the respective
command line option ("rmrr="). Note that "iommu_inclusive_mapping"
as you're using it does not have any meaning for PVH (see
intel_iommu_hwdom_init()).
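
For example (untested; assuming the option takes page frame numbers,
which the gmfn value in your log suggests, and covering just the
faulting page), something like

    rmrr=0x8deb3=0000:00:14.0

appended to the Xen command line. See docs/misc/xen-command-line.markdown
for the exact syntax; the real region may well be larger than one page.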

Jan





* Re: PVH dom0 creation fails - the system freezes
  2018-07-24  9:54 ` Jan Beulich
@ 2018-07-25 10:06   ` bercarug
  2018-07-25 10:22     ` Wei Liu
                       ` (2 more replies)
  0 siblings, 3 replies; 47+ messages in thread
From: bercarug @ 2018-07-25 10:06 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, David Woodhouse, abelgun

[-- Attachment #1: Type: text/plain, Size: 3510 bytes --]

On 07/24/2018 12:54 PM, Jan Beulich wrote:
>>>> On 23.07.18 at 13:50, <bercarug@amazon.com> wrote:
>> For the last few days, I have been trying to get a PVH dom0 running,
>> however I encountered the following problem: the system seems to
>> freeze after the hypervisor boots, the screen goes black. I have tried to
>> debug it via a serial console (using Minicom) and managed to get some
>> more Xen output, after the screen turns black.
>>
>> I mention that I have tried to boot the PVH dom0 using different kernel
>> images (from 4.9.0 to 4.18-rc3), different Xen  versions (4.10, 4.11, 4.12).
>>
>> Below I attached my system / hypervisor configuration, as well as the
>> output captured through the serial console, corresponding to the latest
>> versions for Xen and the Linux Kernel (Xen staging and Kernel from the
>> xen/tip tree).
>> [...]
>> (XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
>> (XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
>> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault addr 8deb3000, iommu reg = ffff82c00021b000
>> (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
>> (XEN) print_vtd_entries: iommu #0 dev 0000:00:14.0 gmfn 8deb3
>> (XEN) root_entry[00] = 1021c60001
>> (XEN) context[a0] = 2_1021d6d001
>> (XEN) l4[000] = 9c00001021d6c107
>> (XEN) l3[002] = 9c00001021d3e107
>> (XEN) l2[06f] = 9c000010218c0107
>> (XEN) l1[0b3] = 8000000000000000
>> (XEN) l1[0b3] not present
>> (XEN) Dom0 callback via changed to Direct Vector 0xf3
> This might be a hint at a missing RMRR entry in the ACPI tables, as
> we've seen to be the case for a number of systems (I dare to guess
> that 0000:00:14.0 is a USB controller, perhaps one with a keyboard
> and/or mouse connected). You may want to play with the respective
> command line option ("rmrr="). Note that "iommu_inclusive_mapping"
> as you're using it does not have any meaning for PVH (see
> intel_iommu_hwdom_init()).
>
> Jan
>
>
>
Hello,

Following Roger's advice, I rebuilt Xen (4.12) from the staging branch
and managed to get a PVH dom0 to start. However, some other problems
appeared:

1) The USB devices (keyboard and mouse) are no longer usable, so the
system is only accessible through the serial port.

2) I can run any usual command in dom0, but the ones involving xl
(except for xl info) make the system run out of memory very fast.
Eventually, when there is no more free memory available, the OOM killer
starts killing processes until the system reboots itself.

I attached a file containing the output of lsusb, as well as the output
of xl info and xl list -l.
After xl list -l, the “free -m” commands show the available memory
decreasing.
Each command has a timestamp appended, so it can be seen how fast the
available memory drops.

I removed most of the process-killing logs and kept only the last one,
since they all followed the same pattern.

Dom0 still appears to be of type PV (in the output of xl list -l);
however, during boot the following messages were displayed: “Building a
PVH Dom0” and “Booting paravirtualized kernel on Xen PVH”.

Note that I had to add “workaround_bios_bug” to the iommu= option in
GRUB_CMDLINE_XEN to get dom0 running.

What could be causing the available memory loss problem?

Thank you,
Gabriel





[-- Attachment #2: lsusb.txt --]
[-- Type: text/plain, Size: 17884 bytes --]

lsusb && date +%c
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Wed 25 Jul 2018 04:59:25 AM EDT
root@debian:/home/test# xl info && date +%c
host                   : debian
release                : 4.17.0-rc5
version                : #4 SMP Tue Jul 24 06:12:21 EDT 2018
machine                : x86_64
nr_cpus                : 8
max_cpu_id             : 7
nr_nodes               : 1
cores_per_socket       : 4
threads_per_core       : 2
cpu_mhz                : 3792.227
hw_caps                : bfebfbff:77faf3ff:2c100800:00000121:0000000f:009c6fbf:00000000:00000100
virt_caps              : hvm hvm_directio
total_memory           : 65217
free_memory            : 56242
sharing_freed_memory   : 0
sharing_used_memory    : 0
outstanding_claims     : 0
free_cpus              : 0
xen_major              : 4
xen_minor              : 12
xen_extra              : -unstable
xen_version            : 4.12-unstable
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64 
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : Thu Jun 28 10:54:01 2018 +0300 git:61bdddb821
xen_commandline        : placeholder dom0=pvh dom0_mem=8192M loglvl=all sync_console console_to_ring=true console=com1,vga com1=115200,8n1 iommu=debug,verbose,workaround_bios_bug
cc_compiler            : gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
cc_compile_by          : root
cc_compile_domain      : 
cc_compile_date        : Tue Jul 24 04:03:02 EDT 2018
build_id               : e6d3e802a6420aae9e2e25dd5941c5d24adad026
xend_config_format     : 4
Wed 25 Jul 2018 04:59:38 AM EDT
root@debian:/home/test# free -m && date +%c
              total        used        free      shared  buff/cache   available
Mem:           7977         179        7549           8         248        7560
Swap:         65120           0       65120
Wed 25 Jul 2018 04:59:44 AM EDT
root@debian:/home/test# xl list -l && date +%c
[
    {
        "domid": 0,
        "config": {
            "c_info": {
                "type": "pv",
                "name": "Domain-0"
            },
            "b_info": {
                "max_memkb": 17179869180,
                "target_memkb": 8387743,
                "sched_params": {
                    "sched": "credit",
                    "weight": 256,
                    "cap": 0
                },
                "type.pv": {

                },
                "arch_arm": {

                }
            }
        }
    }
]

Wed 25 Jul 2018 04:59:52 AM EDT
root@debian:/home/test# free -m && date +%c
              total        used        free      shared  buff/cache   available
Mem:           7129         180        6701           8         248        6711
Swap:         65120           0       65120
Wed 25 Jul 2018 04:59:53 AM EDT
root@debian:/home/test# free -m && date +%c
              total        used        free      shared  buff/cache   available
Mem:           6441         180        6012           8         248        6023
Swap:         65120           0       65120
Wed 25 Jul 2018 04:59:54 AM EDT
root@debian:/home/test# free -m && date +%c
              total        used        free      shared  buff/cache   available
Mem:           5789         180        5360           8         248        5371
Swap:         65120           0       65120
Wed 25 Jul 2018 04:59:55 AM EDT
root@debian:/home/test# free -m && date +%c
              total        used        free      shared  buff/cache   available
Mem:           5007         181        4578           8         248        4589
Swap:         65120           0       65120
Wed 25 Jul 2018 04:59:56 AM EDT
root@debian:/home/test# free -m && date +%c
              total        used        free      shared  buff/cache   available
Mem:           4317         180        3888           8         248        3899
Swap:         65120           0       65120
Wed 25 Jul 2018 04:59:57 AM EDT
root@debian:/home/test# free -m && date +%c
              total        used        free      shared  buff/cache   available
Mem:           3603         181        3174           8         248        3184
Swap:         65120           0       65120
Wed 25 Jul 2018 04:59:57 AM EDT
root@debian:/home/test# free -m && date +%c
              total        used        free      shared  buff/cache   available
Mem:           2863         181        2434           8         248        2444
Swap:         65120           0       65120
Wed 25 Jul 2018 04:59:58 AM EDT
root@debian:/home/test# free -m && date +%c
              total        used        free      shared  buff/cache   available
Mem:           2169         181        1739           8         248        1750
Swap:         65120           0       65120
Wed 25 Jul 2018 04:59:59 AM EDT
root@debian:/home/test# free -m && date +%c
              total        used        free      shared  buff/cache   available
Mem:           1495         182        1064           8         248        1075
Swap:         65120           0       65120
Wed 25 Jul 2018 05:00:00 AM EDT
root@debian:/home/test# free -m && date +%c
              total        used        free      shared  buff/cache   available
Mem:            775         182         343           8         248         354
Swap:         65120           0       65120
Wed 25 Jul 2018 05:00:00 AM EDT
root@debian:/home/test# free -m && date +%c
              total        used        free      shared  buff/cache   available
Mem:            293         151         117           1          24          46
Swap:         65120          57       65063
Wed 25 Jul 2018 05:00:01 AM EDT
root@debian:/home/test# free -m && date +%c
              total        used        free      shared  buff/cache   available
Mem:            293         139         129           1          24           4
Swap:         65120          55       65065
Wed 25 Jul 2018 05:00:02 AM EDT
root@debian:/home/test# free -m && date +%c
              total        used        free      shared  buff/cache   available
Mem:            293         139         126           1          26           2
Swap:         65120          55       65065
Wed 25 Jul 2018 05:00:03 AM EDT
root@debian:/home/test# free -m && date +%c
              total        used        free      shared  buff/cache   available
Mem:            246         110         116           0          18          43
Swap:         65120          89       65031
Wed 25 Jul 2018 05:00:04 AM EDT
root@debian:/home/test# free -m && date +%c
              total        used        free      shared  buff/cache   available
Mem:            246         111         116           0          18          42
Swap:         65120          90       65030
Wed 25 Jul 2018 05:00:05 AM EDT
root@debian:/home/test# free -m && date +%c

[...]

[  255.133877] Out of memory: Kill process 971 (systemd-cgroups) score 0 or sacrifice child
[  255.142990] Killed process 971 (systemd-cgroups) total-vm:4804kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[  255.184192] systemd invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
[  255.196535] systemd cpuset=/ mems_allowed=0
[  255.201282] CPU: 7 PID: 1 Comm: systemd Not tainted 4.17.0-rc5 #4
[  255.208161] Hardware name:  , BIOS  
[  255.212232] Call Trace:
[  255.215048]  dump_stack+0x5c/0x7b
[  255.218829]  dump_header+0x6b/0x28c
[  255.222801]  ? find_lock_task_mm+0x52/0x80
[  255.227457]  ? oom_unkillable_task+0x9b/0xc0
[  255.232304]  out_of_memory+0x328/0x480
[  255.236569]  __alloc_pages_slowpath+0xd25/0xe00
[  255.241707]  ? __do_page_cache_readahead+0x129/0x2e0
[  255.247329]  __alloc_pages_nodemask+0x212/0x250
[  255.252469]  filemap_fault+0x3a0/0x650
[  255.256737]  ? alloc_set_pte+0x39c/0x520
[  255.261195]  ? filemap_map_pages+0x182/0x330
[  255.266049]  ext4_filemap_fault+0x2c/0x40 [ext4]
[  255.271279]  __do_fault+0x1f/0xb3
[  255.275059]  __handle_mm_fault+0xbdf/0x1110
[  255.279809]  handle_mm_fault+0xfc/0x1f0
[  255.284171]  __do_page_fault+0x255/0x4f0
[  255.288632]  ? exit_to_usermode_loop+0xa3/0xc0
[  255.293672]  ? page_fault+0x8/0x30
[  255.297551]  page_fault+0x1e/0x30
[  255.301332] RIP: 0033:0x7f537ebdad50
[  255.305408] RSP: 002b:00007ffec5e9bdb8 EFLAGS: 00010202
[  255.311318] RAX: 0000000000000000 RBX: 00005601f8de19e0 RCX: 00007f537d85db00
[  255.319365] RDX: 00005601f8de19e0 RSI: 00007f537ecdd76c RDI: 00005601f8de19e0
[  255.327412] RBP: 00007f537ecdd76c R08: 00007f537d85dbb8 R09: 0000000000000060
[  255.335460] R10: 00007f537efb0940 R11: 0000000000000206 R12: 00007ffec5e9bde0
[  255.343508] R13: 00007ffec5e9bef0 R14: 0000000000000000 R15: 0000000000000009
[  255.351563] Mem-Info:
[  255.354178] active_anon:10 inactive_anon:4 isolated_anon:0
[  255.354178]  active_file:129 inactive_file:6 isolated_file:0
[  255.354178]  unevictable:0 dirty:0 writeback:0 unstable:0
[  255.354178]  slab_reclaimable:2925 slab_unreclaimable:4868
[  255.354178]  mapped:0 shmem:0 pagetables:65 bounce:0
[  255.354178]  free:26541 free_pcp:0 free_cma:0
[  255.389666] Node 0 active_anon:40kB inactive_anon:16kB active_file:516kB inactive_file:24kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
[  255.418748] Node 0 DMA free:15880kB min:132kB low:164kB high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15964kB managed:15880kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  255.447834] lowmem_reserve[]: 0 1889 7707 7707 7707
[  255.453359] Node 0 DMA32 free:39428kB min:16532kB low:20664kB high:24796kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:2279640kB managed:44552kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  255.483415] lowmem_reserve[]: 0 0 5818 5818 5818
[  255.488651] Node 0 Normal free:50856kB min:50912kB low:63640kB high:76368kB active_anon:40kB inactive_anon:16kB active_file:708kB inactive_file:104kB unevictable:0kB writepending:0kB present:6092996kB managed:139648kB mlocked:0kB kernel_stack:2240kB pagetables:260kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  255.519966] lowmem_reserve[]: 0 0 0 0 0
[  255.524326] Node 0 DMA: 0*4kB 1*8kB (U) 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15880kB
[  255.538966] Node 0 DMA32: 9*4kB (UM) 6*8kB (UM) 3*16kB (UM) 4*32kB (M) 6*64kB (M) 7*128kB (UM) 4*256kB (M) 6*512kB (M) 3*1024kB (M) 1*2048kB (U) 7*4096kB (UM) = 39428kB
[  255.555839] Node 0 Normal: 551*4kB (UME) 356*8kB (UME) 185*16kB (UME) 90*32kB (ME) 50*64kB (UME) 36*128kB (UME) 12*256kB (UM) 6*512kB (M) 8*1024kB (UM) 9*2048kB (M) 0*4096kB = 51468kB
[  255.574160] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[  255.583952] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[  255.593453] 206 total pagecache pages
[  255.597628] 6 pages in swap cache
[  255.601404] Swap cache stats: add 127798, delete 127855, find 93647/162354
[  255.609164] Free swap  = 66670076kB
[  255.613138] Total swap = 66683900kB
[  255.617115] 2097150 pages RAM
[  255.620500] 0 pages HighMem/MovableOnly
[  255.624867] 2047130 pages reserved
[  255.628746] 0 pages hwpoisoned
[  255.632230] Unreclaimable slab info:
[  255.636308] Name                      Used          Total
[  255.642418] scsi_sense_cache          41KB         56KB
[  255.648329] ip6_dst_cache              3KB         15KB
[  255.654245] RAWv6                     27KB         31KB
[  255.660161] sgpool-128                 8KB          8KB
[  255.666072] cfq_io_cq                  7KB         19KB
[  255.671988] cfq_queue                 11KB         23KB
[  255.677901] mqueue_inode_cache          0KB          3KB
[  255.683910] dnotify_struct             0KB          3KB
[  255.689828] secpath_cache              0KB          8KB
[  255.695738] RAW                       30KB         30KB
[  255.701654] hugetlbfs_inode_cache          1KB          7KB
[  255.707955] eventpoll_pwq              4KB         23KB
[  255.713872] eventpoll_epi              7KB         32KB
[  255.719785] request_queue              4KB         12KB
[  255.725698] blkdev_ioc                 6KB         15KB
[  255.731613] biovec-max                96KB         96KB
[  255.737526] biovec-128                 4KB          4KB
[  255.743442] biovec-64                293KB        328KB
[  255.749357] dmaengine-unmap-256          2KB          6KB
[  255.755466] dmaengine-unmap-128          3KB         22KB
[  255.761570] dmaengine-unmap-16          6KB          7KB
[  255.767581] dmaengine-unmap-2          0KB          3KB
[  255.773497] skbuff_fclone_cache         51KB         84KB
[  255.779604] skbuff_head_cache         74KB        148KB
[  255.785518] net_namespace              6KB          6KB
[  255.791431] shmem_inode_cache        609KB        665KB
[  255.797348] taskstats                  3KB          3KB
[  255.803259] proc_dir_entry           176KB        192KB
[  255.809176] pde_opener                 0KB          3KB
[  255.815088] seq_file                   2KB          8KB
[  255.821002] sigqueue                   7KB         11KB
[  255.826915] kernfs_node_cache       2792KB       2812KB
[  255.832832] mnt_cache                 29KB         48KB
[  255.838745] filp                      83KB        352KB
[  255.844661] names_cache               56KB         56KB
[  255.850572] vm_area_struct           108KB        566KB
[  255.856486] mm_struct                 76KB         96KB
[  255.862404] files_cache               29KB         45KB
[  255.868314] signal_cache             203KB        232KB
[  255.874227] sighand_cache            406KB        420KB
[  255.880146] task_struct              655KB        655KB
[  255.886057] cred_jar                  57KB        165KB
[  255.891972] anon_vma                  17KB        105KB
[  255.897885] pid                       49KB        276KB
[  255.903798] Acpi-Operand             590KB        606KB
[  255.909715] Acpi-Parse                 4KB         15KB
[  255.915630] Acpi-State                 5KB         19KB
[  255.921545] Acpi-Namespace           221KB        228KB
[  255.927455] numa_policy                0KB          3KB
[  255.933370] trace_event_file         114KB        126KB
[  255.939284] ftrace_event_field        148KB        159KB
[  255.945297] pool_workqueue           115KB        328KB
[  255.951211] task_group                12KB         27KB
[  255.957124] kmalloc-2097152         2048KB       2048KB
[  255.963038] kmalloc-262144           768KB        768KB
[  255.968953] kmalloc-131072           128KB        128KB
[  255.974864] kmalloc-32768            288KB        288KB
[  255.980778] kmalloc-16384            384KB        384KB
[  255.986697] kmalloc-8192             712KB        712KB
[  255.992607] kmalloc-4096             596KB        600KB
[  255.998521] kmalloc-2048            1546KB       1592KB
[  256.004436] kmalloc-1024            1158KB       1216KB
[  256.010352] kmalloc-512              522KB        604KB
[  256.016264] kmalloc-256              125KB        132KB
[  256.022179] kmalloc-192              236KB        267KB
[  256.028092] kmalloc-96               178KB        252KB
[  256.034009] kmalloc-64               298KB        360KB
[  256.039924] kmalloc-32               411KB        430KB
[  256.045839] kmalloc-128              104KB        132KB
[  256.051749] kmem_cache                33KB         40KB
[  256.057665] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[  256.067170] [  273]     0   273    11680        1   122880      363         -1000 systemd-udevd
[  256.076962] Kernel panic - not syncing: Out of memory and no killable processes...
[  256.076962] 
[  256.087232] CPU: 7 PID: 1 Comm: systemd Not tainted 4.17.0-rc5 #4
[  256.094113] Hardware name:  , BIOS  
[  256.098185] Call Trace:
[  256.100999]  dump_stack+0x5c/0x7b
[  256.104780]  panic+0xe4/0x252
[  256.108173]  ? dump_header+0x189/0x28c
[  256.112437]  out_of_memory+0x334/0x480
[  256.116705]  __alloc_pages_slowpath+0xd25/0xe00
[  256.121843]  ? __do_page_cache_readahead+0x129/0x2e0
[  256.127465]  __alloc_pages_nodemask+0x212/0x250
[  256.132603]  filemap_fault+0x3a0/0x650
[  256.136871]  ? alloc_set_pte+0x39c/0x520
[  256.141330]  ? filemap_map_pages+0x182/0x330
[  256.146184]  ext4_filemap_fault+0x2c/0x40 [ext4]
[  256.151412]  __do_fault+0x1f/0xb3
[  256.155197]  __handle_mm_fault+0xbdf/0x1110
[  256.159948]  handle_mm_fault+0xfc/0x1f0
[  256.164306]  __do_page_fault+0x255/0x4f0
[  256.168769]  ? exit_to_usermode_loop+0xa3/0xc0
[  256.173807]  ? page_fault+0x8/0x30
[  256.177687]  page_fault+0x1e/0x30
[  256.181467] RIP: 0033:0x7f537ebdad50
[  256.185538] RSP: 002b:00007ffec5e9bdb8 EFLAGS: 00010202
[  256.191455] RAX: 0000000000000000 RBX: 00005601f8de19e0 RCX: 00007f537d85db00
[  256.199500] RDX: 00005601f8de19e0 RSI: 00007f537ecdd76c RDI: 00005601f8de19e0
[  256.207546] RBP: 00007f537ecdd76c R08: 00007f537d85dbb8 R09: 0000000000000060
[  256.215593] R10: 00007f537efb0940 R11: 0000000000000206 R12: 00007ffec5e9bde0
[  256.223642] R13: 00007ffec5e9bef0 R14: 0000000000000000 R15: 0000000000000009
[  256.231752] Kernel Offset: disabled
(XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.
(XEN) Resetting with ACPI MEMORY or I/O RESET_REG.



* Re: PVH dom0 creation fails - the system freezes
  2018-07-25 10:06   ` bercarug
@ 2018-07-25 10:22     ` Wei Liu
  2018-07-25 10:43     ` Juergen Gross
  2018-07-25 13:35     ` Roger Pau Monné
  2 siblings, 0 replies; 47+ messages in thread
From: Wei Liu @ 2018-07-25 10:22 UTC (permalink / raw)
  To: bercarug; +Cc: xen-devel, David Woodhouse, Wei Liu, Jan Beulich, abelgun

On Wed, Jul 25, 2018 at 01:06:43PM +0300, bercarug@amazon.com wrote:
> On 07/24/2018 12:54 PM, Jan Beulich wrote:
> > > > > On 23.07.18 at 13:50, <bercarug@amazon.com> wrote:
> > > For the last few days, I have been trying to get a PVH dom0 running,
> > > however I encountered the following problem: the system seems to
> > > freeze after the hypervisor boots, the screen goes black. I have tried to
> > > debug it via a serial console (using Minicom) and managed to get some
> > > more Xen output, after the screen turns black.
> > > 
> > > I mention that I have tried to boot the PVH dom0 using different kernel
> > > images (from 4.9.0 to 4.18-rc3), different Xen  versions (4.10, 4.11, 4.12).
> > > 
> > > Below I attached my system / hypervisor configuration, as well as the
> > > output captured through the serial console, corresponding to the latest
> > > versions for Xen and the Linux Kernel (Xen staging and Kernel from the
> > > xen/tip tree).
> > > [...]
> > > (XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
> > > (XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
> > > (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault addr 8deb3000, iommu reg = ffff82c00021b000
> > > (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
> > > (XEN) print_vtd_entries: iommu #0 dev 0000:00:14.0 gmfn 8deb3
> > > (XEN) root_entry[00] = 1021c60001
> > > (XEN) context[a0] = 2_1021d6d001
> > > (XEN) l4[000] = 9c00001021d6c107
> > > (XEN) l3[002] = 9c00001021d3e107
> > > (XEN) l2[06f] = 9c000010218c0107
> > > (XEN) l1[0b3] = 8000000000000000
> > > (XEN) l1[0b3] not present
> > > (XEN) Dom0 callback via changed to Direct Vector 0xf3
> > This might be a hint at a missing RMRR entry in the ACPI tables, as
> > we've seen to be the case for a number of systems (I dare to guess
> > that 0000:00:14.0 is a USB controller, perhaps one with a keyboard
> > and/or mouse connected). You may want to play with the respective
> > command line option ("rmrr="). Note that "iommu_inclusive_mapping"
> > as you're using it does not have any meaning for PVH (see
> > intel_iommu_hwdom_init()).
> > 
> > Jan
> > 
> > 
> > 
> Hello,
> 
> Following Roger's advice, I rebuilt Xen (4.12) using the staging branch and
> I managed to get a PVH dom0 starting. However, some other problems appeared:
> 
> 1) The USB devices are not usable anymore (keyboard and mouse), so the
> system is only accessible through the serial port.
> 
> 2) I can run any usual command in dom0, but the ones involving xl (except
> for xl info) will make the system run out of memory very fast. Eventually,
> when there is no more free memory available, the OOM killer begins removing
> processes until the system auto reboots.

Any chance you can run valgrind with xl?
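
Something along these lines should do, assuming valgrind is available
in your dom0 (a sketch, not tested with xl):

    valgrind --leak-check=full xl list -l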

I'm intrigued by this PVH-only memory leak.  :-)

Wei.



* Re: PVH dom0 creation fails - the system freezes
  2018-07-25 10:06   ` bercarug
  2018-07-25 10:22     ` Wei Liu
@ 2018-07-25 10:43     ` Juergen Gross
  2018-07-25 13:35     ` Roger Pau Monné
  2 siblings, 0 replies; 47+ messages in thread
From: Juergen Gross @ 2018-07-25 10:43 UTC (permalink / raw)
  To: bercarug, Jan Beulich; +Cc: xen-devel, David Woodhouse, abelgun

On 25/07/18 12:06, bercarug@amazon.com wrote:
> On 07/24/2018 12:54 PM, Jan Beulich wrote:
>>>>> On 23.07.18 at 13:50, <bercarug@amazon.com> wrote:
>>> For the last few days, I have been trying to get a PVH dom0 running,
>>> however I encountered the following problem: the system seems to
>>> freeze after the hypervisor boots, the screen goes black. I have
>>> tried to
>>> debug it via a serial console (using Minicom) and managed to get some
>>> more Xen output, after the screen turns black.
>>>
>>> I mention that I have tried to boot the PVH dom0 using different kernel
>>> images (from 4.9.0 to 4.18-rc3), different Xen  versions (4.10, 4.11,
>>> 4.12).
>>>
>>> Below I attached my system / hypervisor configuration, as well as the
>>> output captured through the serial console, corresponding to the latest
>>> versions for Xen and the Linux Kernel (Xen staging and Kernel from the
>>> xen/tip tree).
>>> [...]
>>> (XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
>>> (XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
>>> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault addr
>>> 8deb3000, iommu reg = ffff82c00021b000
>>> (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
>>> (XEN) print_vtd_entries: iommu #0 dev 0000:00:14.0 gmfn 8deb3
>>> (XEN) root_entry[00] = 1021c60001
>>> (XEN) context[a0] = 2_1021d6d001
>>> (XEN) l4[000] = 9c00001021d6c107
>>> (XEN) l3[002] = 9c00001021d3e107
>>> (XEN) l2[06f] = 9c000010218c0107
>>> (XEN) l1[0b3] = 8000000000000000
>>> (XEN) l1[0b3] not present
>>> (XEN) Dom0 callback via changed to Direct Vector 0xf3
>> This might be a hint at a missing RMRR entry in the ACPI tables, as
>> we've seen to be the case for a number of systems (I dare to guess
>> that 0000:00:14.0 is a USB controller, perhaps one with a keyboard
>> and/or mouse connected). You may want to play with the respective
>> command line option ("rmrr="). Note that "iommu_inclusive_mapping"
>> as you're using it does not have any meaning for PVH (see
>> intel_iommu_hwdom_init()).
>>
>> Jan
>>
>>
>>
> Hello,
> 
> Following Roger's advice, I rebuilt Xen (4.12) using the staging branch
> and I managed to get a PVH dom0 starting. However, some other problems
> appeared:
> 
> 1) The USB devices are not usable anymore (keyboard and mouse), so the
> system is only accessible through the serial port.
> 
> 2) I can run any usual command in dom0, but the ones involving xl
> (except for xl info) will make the system run out of memory very fast.
> Eventually, when there is no more free memory available, the OOM killer
> begins removing processes until the system auto reboots.
> 
> I attached a file containing the output of a lsusb, as well as the
> output of xl info and xl list -l.
> After xl list -l, the “free -m” commands show the available memory
> decreasing.
> Each command has a timestamp appended, so it can be seen how fast the
> available memory decreases.
> 
> I removed much of the process killing logs and kept the last one, since
> they were following the same pattern.
> 
> Dom0 still appears to be of type PV (output of xl list -l), however
> during boot, the following messages were displayed: “Building a PVH
> Dom0” and “Booting paravirtualized kernel on Xen PVH”.

This is a problem in xen-init-dom0, which will set "PV" for dom0
unconditionally.

> I mention that I had to add “workaround_bios_bug” in GRUB_CMDLINE_XEN
> for iommu to get dom0 running.
> 
> What could be causing the available memory loss problem?

Hmm, good question.

xl list is using domctl hypercalls, while xl info is using sysctl ones.
Can you test whether xl cpupool-list is causing memory loss, too? That's
another sysctl based command. As is xl dmesg.
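
Something like this would do (a sketch; the redirects just keep the
serial console readable):

    free -m
    xl cpupool-list > /dev/null
    free -m
    xl dmesg > /dev/null
    free -m
    xl list -l > /dev/null
    free -m

If only the domctl-based commands make the free numbers drop, that
narrows it down considerably.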


Juergen


* Re: PVH dom0 creation fails - the system freezes
  2018-07-25 10:06   ` bercarug
  2018-07-25 10:22     ` Wei Liu
  2018-07-25 10:43     ` Juergen Gross
@ 2018-07-25 13:35     ` Roger Pau Monné
  2018-07-25 13:41       ` Juergen Gross
  2018-07-25 13:57       ` bercarug
  2 siblings, 2 replies; 47+ messages in thread
From: Roger Pau Monné @ 2018-07-25 13:35 UTC (permalink / raw)
  To: bercarug; +Cc: xen-devel, David Woodhouse, Jan Beulich, abelgun

On Wed, Jul 25, 2018 at 01:06:43PM +0300, bercarug@amazon.com wrote:
> On 07/24/2018 12:54 PM, Jan Beulich wrote:
> > > > > On 23.07.18 at 13:50, <bercarug@amazon.com> wrote:
> > > For the last few days, I have been trying to get a PVH dom0 running,
> > > however I encountered the following problem: the system seems to
> > > freeze after the hypervisor boots, the screen goes black. I have tried to
> > > debug it via a serial console (using Minicom) and managed to get some
> > > more Xen output, after the screen turns black.
> > > 
> > > I mention that I have tried to boot the PVH dom0 using different kernel
> > > images (from 4.9.0 to 4.18-rc3), different Xen  versions (4.10, 4.11, 4.12).
> > > 
> > > Below I attached my system / hypervisor configuration, as well as the
> > > output captured through the serial console, corresponding to the latest
> > > versions for Xen and the Linux Kernel (Xen staging and Kernel from the
> > > xen/tip tree).
> > > [...]
> > > (XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
> > > (XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
> > > (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault addr 8deb3000, iommu reg = ffff82c00021b000

Can you figure out which PCI device is 00:14.0?

> > > (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
> > > (XEN) print_vtd_entries: iommu #0 dev 0000:00:14.0 gmfn 8deb3
> > > (XEN) root_entry[00] = 1021c60001
> > > (XEN) context[a0] = 2_1021d6d001
> > > (XEN) l4[000] = 9c00001021d6c107
> > > (XEN) l3[002] = 9c00001021d3e107
> > > (XEN) l2[06f] = 9c000010218c0107
> > > (XEN) l1[0b3] = 8000000000000000
> > > (XEN) l1[0b3] not present
> > > (XEN) Dom0 callback via changed to Direct Vector 0xf3
> > This might be a hint at a missing RMRR entry in the ACPI tables, as
> > we've seen to be the case for a number of systems (I dare to guess
> > that 0000:00:14.0 is a USB controller, perhaps one with a keyboard
> > and/or mouse connected). You may want to play with the respective
> > command line option ("rmrr="). Note that "iommu_inclusive_mapping"
> > as you're using it does not have any meaning for PVH (see
> > intel_iommu_hwdom_init()).
> > 
> > Jan
> > 
> > 
> > 
> Hello,
> 
> Following Roger's advice, I rebuilt Xen (4.12) using the staging branch and
> I managed to get a PVH dom0 starting. However, some other problems appeared:
> 
> 1) The USB devices are not usable anymore (keyboard and mouse), so the
> system is only accessible through the serial port.

Can you boot with iommu=debug and see if you get any extra IOMMU
information on the serial console?

> 2) I can run any usual command in dom0, but the ones involving xl (except
> for xl info) will make the system run out of memory very fast. Eventually,
> when there is no more free memory available, the OOM killer begins removing
> processes until the system auto reboots.
> 
> I attached a file containing the output of a lsusb, as well as the output of
> xl info and xl list -l.
> After xl list -l, the “free -m” commands show the available memory
> decreasing.
> Each command has a timestamp appended, so it can be seen how fast the
> available memory decreases.
> 
> I removed much of the process killing logs and kept the last one, since they
> were following the same pattern.
> 
> Dom0 still appears to be of type PV (output of xl list -l), however during
> boot, the following messages were displayed: “Building a PVH Dom0” and
> “Booting paravirtualized kernel on Xen PVH”.
> 
> I mention that I had to add “workaround_bios_bug” in GRUB_CMDLINE_XEN for
> iommu to get dom0 running.

It seems to me like your ACPI DMAR table contains errors, and I
wouldn't be surprised if those also cause the USB devices to
malfunction.

> 
> What could be causing the available memory loss problem?

That seems to be Linux aggressively ballooning out memory: you go from
7129M of total memory to 246M. Are you creating a lot of domains?

Roger.


* Re: PVH dom0 creation fails - the system freezes
  2018-07-25 13:35     ` Roger Pau Monné
@ 2018-07-25 13:41       ` Juergen Gross
  2018-07-25 14:02         ` Wei Liu
  2018-07-25 13:57       ` bercarug
  1 sibling, 1 reply; 47+ messages in thread
From: Juergen Gross @ 2018-07-25 13:41 UTC (permalink / raw)
  To: Roger Pau Monné, bercarug
  Cc: xen-devel, David Woodhouse, Jan Beulich, abelgun

On 25/07/18 15:35, Roger Pau Monné wrote:
> On Wed, Jul 25, 2018 at 01:06:43PM +0300, bercarug@amazon.com wrote:
>> On 07/24/2018 12:54 PM, Jan Beulich wrote:
>>>>>> On 23.07.18 at 13:50, <bercarug@amazon.com> wrote:
>>>> For the last few days, I have been trying to get a PVH dom0 running,
>>>> however I encountered the following problem: the system seems to
>>>> freeze after the hypervisor boots, the screen goes black. I have tried to
>>>> debug it via a serial console (using Minicom) and managed to get some
>>>> more Xen output, after the screen turns black.
>>>>
>>>> I mention that I have tried to boot the PVH dom0 using different kernel
>>>> images (from 4.9.0 to 4.18-rc3), different Xen  versions (4.10, 4.11, 4.12).
>>>>
>>>> Below I attached my system / hypervisor configuration, as well as the
>>>> output captured through the serial console, corresponding to the latest
>>>> versions for Xen and the Linux Kernel (Xen staging and Kernel from the
>>>> xen/tip tree).
>>>> [...]
>>>> (XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
>>>> (XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
>>>> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault addr 8deb3000, iommu reg = ffff82c00021b000
> 
> Can you figure out which PCI device is 00:14.0?
> 
>>>> (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
>>>> (XEN) print_vtd_entries: iommu #0 dev 0000:00:14.0 gmfn 8deb3
>>>> (XEN) root_entry[00] = 1021c60001
>>>> (XEN) context[a0] = 2_1021d6d001
>>>> (XEN) l4[000] = 9c00001021d6c107
>>>> (XEN) l3[002] = 9c00001021d3e107
>>>> (XEN) l2[06f] = 9c000010218c0107
>>>> (XEN) l1[0b3] = 8000000000000000
>>>> (XEN) l1[0b3] not present
>>>> (XEN) Dom0 callback via changed to Direct Vector 0xf3
>>> This might be a hint at a missing RMRR entry in the ACPI tables, as
>>> we've seen to be the case for a number of systems (I dare to guess
>>> that 0000:00:14.0 is a USB controller, perhaps one with a keyboard
>>> and/or mouse connected). You may want to play with the respective
>>> command line option ("rmrr="). Note that "iommu_inclusive_mapping"
>>> as you're using it does not have any meaning for PVH (see
>>> intel_iommu_hwdom_init()).
>>>
>>> Jan
>>>
>>>
>>>
>> Hello,
>>
>> Following Roger's advice, I rebuilt Xen (4.12) using the staging branch and
>> I managed to get a PVH dom0 starting. However, some other problems appeared:
>>
>> 1) The USB devices are not usable anymore (keyboard and mouse), so the
>> system is only accessible through the serial port.
> 
> Can you boot with iommu=debug and see if you get any extra IOMMU
> information on the serial console?
> 
>> 2) I can run any usual command in dom0, but the ones involving xl (except
>> for xl info) will make the system run out of memory very fast. Eventually,
>> when there is no more free memory available, the OOM killer begins removing
>> processes until the system auto reboots.
>>
>> I attached a file containing the output of a lsusb, as well as the output of
>> xl info and xl list -l.
>> After xl list -l, the “free -m” commands show the available memory
>> decreasing.
>> Each command has a timestamp appended, so it can be seen how fast the
>> available memory decreases.
>>
>> I removed much of the process killing logs and kept the last one, since they
>> were following the same pattern.
>>
>> Dom0 still appears to be of type PV (output of xl list -l), however during
>> boot, the following messages were displayed: “Building a PVH Dom0” and
>> “Booting paravirtualized kernel on Xen PVH”.
>>
>> I mention that I had to add “workaround_bios_bug” in GRUB_CMDLINE_XEN for
>> iommu to get dom0 running.
> 
> It seems to me like your ACPI DMAR table contains errors, and I
> wouldn't be surprised if those also cause the USB devices to
> malfunction.
> 
>>
>> What could be causing the available memory loss problem?
> 
> That seems to be Linux aggressively ballooning out memory, you go from
> 7129M total memory to 246M. Are you creating a lot of domains?

This might be related to the tools thinking dom0 is a PV domain.


Juergen



* Re: PVH dom0 creation fails - the system freezes
  2018-07-25 13:35     ` Roger Pau Monné
  2018-07-25 13:41       ` Juergen Gross
@ 2018-07-25 13:57       ` bercarug
  2018-07-25 14:12         ` Roger Pau Monné
  1 sibling, 1 reply; 47+ messages in thread
From: bercarug @ 2018-07-25 13:57 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, David Woodhouse, Jan Beulich, abelgun

On 07/25/2018 04:35 PM, Roger Pau Monné wrote:
> On Wed, Jul 25, 2018 at 01:06:43PM +0300, bercarug@amazon.com wrote:
>> On 07/24/2018 12:54 PM, Jan Beulich wrote:
>>>>>> On 23.07.18 at 13:50, <bercarug@amazon.com> wrote:
>>>> For the last few days, I have been trying to get a PVH dom0 running,
>>>> however I encountered the following problem: the system seems to
>>>> freeze after the hypervisor boots, the screen goes black. I have tried to
>>>> debug it via a serial console (using Minicom) and managed to get some
>>>> more Xen output, after the screen turns black.
>>>>
>>>> I mention that I have tried to boot the PVH dom0 using different kernel
>>>> images (from 4.9.0 to 4.18-rc3), different Xen  versions (4.10, 4.11, 4.12).
>>>>
>>>> Below I attached my system / hypervisor configuration, as well as the
>>>> output captured through the serial console, corresponding to the latest
>>>> versions for Xen and the Linux Kernel (Xen staging and Kernel from the
>>>> xen/tip tree).
>>>> [...]
>>>> (XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
>>>> (XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
>>>> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault addr 8deb3000, iommu reg = ffff82c00021b000
> Can you figure out which PCI device is 00:14.0?
This is the output of lspci -vvv for device 00:14.0:

00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31) (prog-if 30 [XHCI])
        Subsystem: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 178
        Region 0: Memory at a2e00000 (64-bit, non-prefetchable) [size=64K]
        Capabilities: [70] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
                Address: 00000000fee0e000  Data: 4021
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
>>>> (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
>>>> (XEN) print_vtd_entries: iommu #0 dev 0000:00:14.0 gmfn 8deb3
>>>> (XEN) root_entry[00] = 1021c60001
>>>> (XEN) context[a0] = 2_1021d6d001
>>>> (XEN) l4[000] = 9c00001021d6c107
>>>> (XEN) l3[002] = 9c00001021d3e107
>>>> (XEN) l2[06f] = 9c000010218c0107
>>>> (XEN) l1[0b3] = 8000000000000000
>>>> (XEN) l1[0b3] not present
>>>> (XEN) Dom0 callback via changed to Direct Vector 0xf3
>>> This might be a hint at a missing RMRR entry in the ACPI tables, as
>>> we've seen to be the case for a number of systems (I dare to guess
>>> that 0000:00:14.0 is a USB controller, perhaps one with a keyboard
>>> and/or mouse connected). You may want to play with the respective
>>> command line option ("rmrr="). Note that "iommu_inclusive_mapping"
>>> as you're using it does not have any meaning for PVH (see
>>> intel_iommu_hwdom_init()).
>>>
>>> Jan
>>>
>>>
>>>
>> Hello,
>>
>> Following Roger's advice, I rebuilt Xen (4.12) using the staging branch and
>> I managed to get a PVH dom0 starting. However, some other problems appeared:
>>
>> 1) The USB devices are not usable anymore (keyboard and mouse), so the
>> system is only accessible through the serial port.
> Can you boot with iommu=debug and see if you get any extra IOMMU
> information on the serial console?
The debug flag was already set, so the log I attached to the first
message already contains the IOMMU info.
In Xen's command line I used iommu=debug,verbose,workaround_bios_bug.
>
>> 2) I can run any usual command in dom0, but the ones involving xl (except
>> for xl info) will make the system run out of memory very fast. Eventually,
>> when there is no more free memory available, the OOM killer begins removing
>> processes until the system auto reboots.
>>
>> I attached a file containing the output of a lsusb, as well as the output of
>> xl info and xl list -l.
>> After xl list -l, the “free -m” commands show the available memory
>> decreasing.
>> Each command has a timestamp appended, so it can be seen how fast the
>> available memory decreases.
>>
>> I removed much of the process killing logs and kept the last one, since they
>> were following the same pattern.
>>
>> Dom0 still appears to be of type PV (output of xl list -l), however during
>> boot, the following messages were displayed: “Building a PVH Dom0” and
>> “Booting paravirtualized kernel on Xen PVH”.
>>
>> I mention that I had to add “workaround_bios_bug” in GRUB_CMDLINE_XEN for
>> iommu to get dom0 running.
> It seems to me like your ACPI DMAR table contains errors, and I
> wouldn't be surprised if those also cause the USB devices to
> malfunction.
>
>> What could be causing the available memory loss problem?
> That seems to be Linux aggressively ballooning out memory, you go from
> 7129M total memory to 246M. Are you creating a lot of domains?
>
> Roger.
>
I did not create any guest before issuing "xl list -l". However, creating
a PVH domU will work - "xl create <cfg_file>" does not produce this
behavior.


Gabriel





* Re: PVH dom0 creation fails - the system freezes
  2018-07-25 13:41       ` Juergen Gross
@ 2018-07-25 14:02         ` Wei Liu
  2018-07-25 14:05           ` bercarug
  0 siblings, 1 reply; 47+ messages in thread
From: Wei Liu @ 2018-07-25 14:02 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Wei Liu, Jan Beulich, abelgun, xen-devel, Roger Pau Monné,
	David Woodhouse, bercarug

On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
> On 25/07/18 15:35, Roger Pau Monné wrote:
> > 
> >>
> >> What could be causing the available memory loss problem?
> > 
> > That seems to be Linux aggressively ballooning out memory, you go from
> > 7129M total memory to 246M. Are you creating a lot of domains?
> 
> This might be related to the tools thinking dom0 is a PV domain.

Good point.

In that case, xenstore-ls -fp would also be useful. The output should
show the balloon target for Dom0.

You can also try setting autoballoon to off in /etc/xen/xl.conf to see
if it makes any difference.
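
For example (using the standard dom0 memory node):

    xenstore-ls -fp /local/domain/0/memory

and check whether memory/target is anywhere near the intended dom0
size.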

Wei.

> 
> 
> Juergen
> 
> 

* Re: PVH dom0 creation fails - the system freezes
  2018-07-25 14:02         ` Wei Liu
@ 2018-07-25 14:05           ` bercarug
  2018-07-25 14:10             ` Wei Liu
  2018-07-25 16:12             ` Roger Pau Monné
  0 siblings, 2 replies; 47+ messages in thread
From: bercarug @ 2018-07-25 14:05 UTC (permalink / raw)
  To: Wei Liu, Juergen Gross
  Cc: xen-devel, abelgun, David Woodhouse, Jan Beulich, Roger Pau Monné

On 07/25/2018 05:02 PM, Wei Liu wrote:
> On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
>> On 25/07/18 15:35, Roger Pau Monné wrote:
>>>> What could be causing the available memory loss problem?
>>> That seems to be Linux aggressively ballooning out memory, you go from
>>> 7129M total memory to 246M. Are you creating a lot of domains?
>> This might be related to the tools thinking dom0 is a PV domain.
> Good point.
>
> In that case, xenstore-ls -fp would also be useful. The output should
> show the balloon target for Dom0.
>
> You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
> if it makes any difference.
>
> Wei.
Also tried setting autoballooning off, but it had no effect.

Gabriel
>
>>
>> Juergen
>>
>>






* Re: PVH dom0 creation fails - the system freezes
  2018-07-25 14:05           ` bercarug
@ 2018-07-25 14:10             ` Wei Liu
  2018-07-25 16:12             ` Roger Pau Monné
  1 sibling, 0 replies; 47+ messages in thread
From: Wei Liu @ 2018-07-25 14:10 UTC (permalink / raw)
  To: bercarug
  Cc: Juergen Gross, Wei Liu, Jan Beulich, abelgun, xen-devel,
	David Woodhouse, Roger Pau Monné

On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
> On 07/25/2018 05:02 PM, Wei Liu wrote:
> > On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
> > > On 25/07/18 15:35, Roger Pau Monné wrote:
> > > > > What could be causing the available memory loss problem?
> > > > That seems to be Linux aggressively ballooning out memory, you go from
> > > > 7129M total memory to 246M. Are you creating a lot of domains?
> > > This might be related to the tools thinking dom0 is a PV domain.
> > Good point.
> > 
> > In that case, xenstore-ls -fp would also be useful. The output should
> > show the balloon target for Dom0.
> > 
> > You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
> > if it makes any difference.
> > 
> > Wei.
> Also tried setting autoballooning off, but it had no effect.

What does xenstore-ls -fp say? Is the target for Dom0 something
sensible? If so, then the leak is not related to ballooning.

Wei.


* Re: PVH dom0 creation fails - the system freezes
  2018-07-25 13:57       ` bercarug
@ 2018-07-25 14:12         ` Roger Pau Monné
  2018-07-25 16:19           ` Paul Durrant
  0 siblings, 1 reply; 47+ messages in thread
From: Roger Pau Monné @ 2018-07-25 14:12 UTC (permalink / raw)
  To: bercarug; +Cc: xen-devel, David Woodhouse, Jan Beulich, abelgun

On Wed, Jul 25, 2018 at 04:57:23PM +0300, bercarug@amazon.com wrote:
> On 07/25/2018 04:35 PM, Roger Pau Monné wrote:
> > On Wed, Jul 25, 2018 at 01:06:43PM +0300, bercarug@amazon.com wrote:
> > > On 07/24/2018 12:54 PM, Jan Beulich wrote:
> > > > > > > On 23.07.18 at 13:50, <bercarug@amazon.com> wrote:
> > > > > For the last few days, I have been trying to get a PVH dom0 running,
> > > > > however I encountered the following problem: the system seems to
> > > > > freeze after the hypervisor boots, the screen goes black. I have tried to
> > > > > debug it via a serial console (using Minicom) and managed to get some
> > > > > more Xen output, after the screen turns black.
> > > > > 
> > > > > I mention that I have tried to boot the PVH dom0 using different kernel
> > > > > images (from 4.9.0 to 4.18-rc3), different Xen  versions (4.10, 4.11, 4.12).
> > > > > 
> > > > > Below I attached my system / hypervisor configuration, as well as the
> > > > > output captured through the serial console, corresponding to the latest
> > > > > versions for Xen and the Linux Kernel (Xen staging and Kernel from the
> > > > > xen/tip tree).
> > > > > [...]
> > > > > (XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
> > > > > (XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
> > > > > (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault addr 8deb3000, iommu reg = ffff82c00021b000
> > Can you figure out which PCI device is 00:14.0?
> This is the output of lspci -vvv for device 00:14.0:
> 
> 00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI
> Controller (rev 31) (prog-if 30 [XHCI])
>         Subsystem: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller
>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+
> Stepping- SERR+ FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
> <TAbort- <MAbort+ >SERR- <PERR- INTx-
>         Latency: 0
>         Interrupt: pin A routed to IRQ 178
>         Region 0: Memory at a2e00000 (64-bit, non-prefetchable) [size=64K]
>         Capabilities: [70] Power Management version 2
>                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA
> PME(D0-,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>         Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
>                 Address: 00000000fee0e000  Data: 4021
>         Kernel driver in use: xhci_hcd
>         Kernel modules: xhci_pci

I'm afraid your USB controller is missing RMRR entries in the DMAR
ACPI tables, thus causing the IOMMU faults and not working properly.

You could try to manually add some extra rmrr regions by appending:

rmrr=0x8deb3=0:0:14.0

To the Xen command line, and keep adding any address that pops up in
the iommu faults. This is of course quite cumbersome, but there's no
way to get the required memory addresses if the data in RMRR is
wrong/incomplete.
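
Concretely, that means something like the following in
/etc/default/grub (a sketch -- keep whatever options you already have;
dom0=pvh and the iommu options are assumed from earlier in this
thread), followed by update-grub and a reboot:

    GRUB_CMDLINE_XEN="dom0=pvh iommu=debug,verbose,workaround_bios_bug rmrr=0x8deb3=0:0:14.0"

then repeat, appending any further faulting frame numbers.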

Roger.


* Re: PVH dom0 creation fails - the system freezes
  2018-07-25 14:05           ` bercarug
  2018-07-25 14:10             ` Wei Liu
@ 2018-07-25 16:12             ` Roger Pau Monné
  2018-07-25 16:29               ` Juergen Gross
  2018-07-26  8:15               ` bercarug
  1 sibling, 2 replies; 47+ messages in thread
From: Roger Pau Monné @ 2018-07-25 16:12 UTC (permalink / raw)
  To: bercarug
  Cc: Juergen Gross, Wei Liu, Jan Beulich, abelgun, xen-devel, David Woodhouse

On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
> On 07/25/2018 05:02 PM, Wei Liu wrote:
> > On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
> > > On 25/07/18 15:35, Roger Pau Monné wrote:
> > > > > What could be causing the available memory loss problem?
> > > > That seems to be Linux aggressively ballooning out memory, you go from
> > > > 7129M total memory to 246M. Are you creating a lot of domains?
> > > This might be related to the tools thinking dom0 is a PV domain.
> > Good point.
> > 
> > In that case, xenstore-ls -fp would also be useful. The output should
> > show the balloon target for Dom0.
> > 
> > You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
> > if it makes any difference.
> > 
> > Wei.
> Also tried setting autoballooning off, but it had no effect.

This is a Linux/libxl issue, and I'm not sure about the best way to
solve it. Linux has the following 'workaround' in the balloon driver:

err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
		   &static_max);
if (err != 1)
	static_max = new_target;
else
	static_max >>= PAGE_SHIFT - 10;
target_diff = xen_pv_domain() ? 0
		: static_max - balloon_stats.target_pages;

I suppose this is used to cope with the memory reporting mismatch
usually seen on HVM guests. This however interacts quite badly with a
PVH Dom0 that has, for example:

/local/domain/0/memory/target = "8391840"   (n0)
/local/domain/0/memory/static-max = "17179869180"   (n0)
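
Doing the arithmetic from the snippet above on those values (4 KiB
pages, so the shift is PAGE_SHIFT - 10 == 2):

    static_max  = 17179869180 >> 2 = 4294967295 pages
    target      =     8391840 >> 2 =    2097960 pages
    target_diff ~= 4294967295 - 2097960 = 4292869335 pages (~16 TiB)

(balloon_stats.target_pages won't be exactly the xenstore target, but
it is of that order), so every target the driver subsequently applies
is skewed by roughly 16 TiB, which is what sends dom0's memory
accounting haywire.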

One way to solve this is to set target and static-max to the same
value initially, so that target_diff on Linux is 0. Another option
would be to force target_diff = 0 for Dom0.
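
For the second option, the Linux side would be a one-line change to the
snippet above, something like (untested):

    target_diff = (xen_pv_domain() || xen_initial_domain()) ? 0
                    : static_max - balloon_stats.target_pages;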

I'm attaching a patch for libxl that should solve this; could you
please give it a try and report back?

I'm still unsure, however, about the best way to fix this; I need to
think about it.

Roger.
---8<---
diff --git a/tools/libxl/libxl_mem.c b/tools/libxl/libxl_mem.c
index e551e09fed..2c984993d8 100644
--- a/tools/libxl/libxl_mem.c
+++ b/tools/libxl/libxl_mem.c
@@ -151,7 +151,9 @@ retry_transaction:
         *target_memkb = info.current_memkb;
     }
     if (staticmax == NULL) {
-        libxl__xs_printf(gc, t, max_path, "%"PRIu64, info.max_memkb);
+        libxl__xs_printf(gc, t, max_path, "%"PRIu64,
+                         libxl__domain_type(gc, 0) == LIBXL_DOMAIN_TYPE_PV ?
+                         info.max_memkb : info.current_memkb);
         *max_memkb = info.max_memkb;
     }
 
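
To test: apply the diff on top of staging and rebuild/reinstall just
the tools; no hypervisor rebuild is needed for a libxl-only change.
Roughly:

    git apply pvh-dom0-staticmax.diff    # filename illustrative
    make -C tools && make -C tools install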



* Re: PVH dom0 creation fails - the system freezes
  2018-07-25 14:12         ` Roger Pau Monné
@ 2018-07-25 16:19           ` Paul Durrant
  2018-07-26 16:46             ` Roger Pau Monné
  0 siblings, 1 reply; 47+ messages in thread
From: Paul Durrant @ 2018-07-25 16:19 UTC (permalink / raw)
  To: Roger Pau Monne, bercarug
  Cc: xen-devel, David Woodhouse, Jan Beulich, abelgun

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On Behalf
> Of Roger Pau Monné
> Sent: 25 July 2018 15:12
> To: bercarug@amazon.com
> Cc: xen-devel <xen-devel@lists.xenproject.org>; David Woodhouse
> <dwmw2@infradead.org>; Jan Beulich <JBeulich@suse.com>;
> abelgun@amazon.com
> Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes
> 
> On Wed, Jul 25, 2018 at 04:57:23PM +0300, bercarug@amazon.com wrote:
> > On 07/25/2018 04:35 PM, Roger Pau Monné wrote:
> > > On Wed, Jul 25, 2018 at 01:06:43PM +0300, bercarug@amazon.com
> wrote:
> > > > On 07/24/2018 12:54 PM, Jan Beulich wrote:
> > > > > > > > On 23.07.18 at 13:50, <bercarug@amazon.com> wrote:
> > > > > > For the last few days, I have been trying to get a PVH dom0 running,
> > > > > > however I encountered the following problem: the system seems
> to
> > > > > > freeze after the hypervisor boots, the screen goes black. I have
> tried to
> > > > > > debug it via a serial console (using Minicom) and managed to get
> some
> > > > > > more Xen output, after the screen turns black.
> > > > > >
> > > > > > I mention that I have tried to boot the PVH dom0 using different
> kernel
> > > > > > images (from 4.9.0 to 4.18-rc3), different Xen  versions (4.10, 4.11,
> 4.12).
> > > > > >
> > > > > > Below I attached my system / hypervisor configuration, as well as
> the
> > > > > > output captured through the serial console, corresponding to the
> latest
> > > > > > versions for Xen and the Linux Kernel (Xen staging and Kernel from
> the
> > > > > > xen/tip tree).
> > > > > > [...]
> > > > > > (XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
> > > > > > (XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending
> Fault
> > > > > > (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault
> addr 8deb3000, iommu reg = ffff82c00021b000
> > > Can you figure out which PCI device is 00:14.0?
> > This is the output of lspci -vvv for device 00:14.0:
> >
> > 00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI
> > Controller (rev 31) (prog-if 30 [XHCI])
> >         Subsystem: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller
> >         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr+
> > Stepping- SERR+ FastB2B- DisINTx+
> >         Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
> > <TAbort- <MAbort+ >SERR- <PERR- INTx-
> >         Latency: 0
> >         Interrupt: pin A routed to IRQ 178
> >         Region 0: Memory at a2e00000 (64-bit, non-prefetchable) [size=64K]
> >         Capabilities: [70] Power Management version 2
> >                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA
> > PME(D0-,D1-,D2-,D3hot+,D3cold+)
> >                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> >         Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
> >                 Address: 00000000fee0e000  Data: 4021
> >         Kernel driver in use: xhci_hcd
> >         Kernel modules: xhci_pci
> 
> I'm afraid your USB controller is missing RMRR entries in the DMAR
> ACPI tables, thus causing the IOMMU faults and not working properly.
> 
> You could try to manually add some extra rmrr regions by appending:
> 
> rmrr=0x8deb3=0:0:14.0
> 
> To the Xen command line, and keep adding any address that pops up in
> the iommu faults. This is of course quite cumbersome, but there's no
> way to get the required memory addresses if the data in RMRR is
> wrong/incomplete.
> 

You could just add all E820 reserved regions in there. That will almost certainly cover it.
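
Note that rmrr= also accepts ranges, so each reserved E820 region can
be expressed as start-end in frame numbers, e.g. (addresses purely
illustrative, not taken from this machine's E820):

    rmrr=0x8deb3-0x8deb6=0:0:14.0;0x9d000-0x9ffff=0:0:14.0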

  Paul

> Roger.
> 

* Re: PVH dom0 creation fails - the system freezes
  2018-07-25 16:12             ` Roger Pau Monné
@ 2018-07-25 16:29               ` Juergen Gross
  2018-07-25 18:56                 ` [Memory Accounting] was: " Andrew Cooper
  2018-07-26 11:08                 ` Roger Pau Monné
  2018-07-26  8:15               ` bercarug
  1 sibling, 2 replies; 47+ messages in thread
From: Juergen Gross @ 2018-07-25 16:29 UTC (permalink / raw)
  To: Roger Pau Monné, bercarug
  Cc: Wei Liu, Jan Beulich, abelgun, xen-devel, Boris Ostrovsky,
	David Woodhouse

On 25/07/18 18:12, Roger Pau Monné wrote:
> On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
>> On 07/25/2018 05:02 PM, Wei Liu wrote:
>>> On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
>>>> On 25/07/18 15:35, Roger Pau Monné wrote:
>>>>>> What could be causing the available memory loss problem?
>>>>> That seems to be Linux aggressively ballooning out memory, you go from
>>>>> 7129M total memory to 246M. Are you creating a lot of domains?
>>>> This might be related to the tools thinking dom0 is a PV domain.
>>> Good point.
>>>
>>> In that case, xenstore-ls -fp would also be useful. The output should
>>> show the balloon target for Dom0.
>>>
>>> You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
>>> if it makes any difference.
>>>
>>> Wei.
>> Also tried setting autoballooning off, but it had no effect.
> 
> This is a Linux/libxl issue that I'm not sure what's the best way to
> solve. Linux has the following 'workaround' in the balloon driver:
> 
> err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
> 		   &static_max);
> if (err != 1)
> 	static_max = new_target;
> else
> 	static_max >>= PAGE_SHIFT - 10;
> target_diff = xen_pv_domain() ? 0
> 		: static_max - balloon_stats.target_pages;

Hmm, shouldn't PVH behave the same way as PV here? I don't think
there is memory missing for PVH, as opposed to HVM's firmware memory.

Adding Boris for a second opinion.


Juergen


* [Memory Accounting] was: Re: PVH dom0 creation fails - the system freezes
  2018-07-25 16:29               ` Juergen Gross
@ 2018-07-25 18:56                 ` Andrew Cooper
  2018-07-25 23:07                   ` Boris Ostrovsky
  2018-07-26 11:08                 ` Roger Pau Monné
  1 sibling, 1 reply; 47+ messages in thread
From: Andrew Cooper @ 2018-07-25 18:56 UTC (permalink / raw)
  To: Juergen Gross, Roger Pau Monné, bercarug
  Cc: Wei Liu, Jan Beulich, abelgun, xen-devel, Boris Ostrovsky,
	David Woodhouse

On 25/07/18 17:29, Juergen Gross wrote:
> On 25/07/18 18:12, Roger Pau Monné wrote:
>> On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
>>> On 07/25/2018 05:02 PM, Wei Liu wrote:
>>>> On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
>>>>> On 25/07/18 15:35, Roger Pau Monné wrote:
>>>>>>> What could be causing the available memory loss problem?
>>>>>> That seems to be Linux aggressively ballooning out memory, you go from
>>>>>> 7129M total memory to 246M. Are you creating a lot of domains?
>>>>> This might be related to the tools thinking dom0 is a PV domain.
>>>> Good point.
>>>>
>>>> In that case, xenstore-ls -fp would also be useful. The output should
>>>> show the balloon target for Dom0.
>>>>
>>>> You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
>>>> if it makes any difference.
>>>>
>>>> Wei.
>>> Also tried setting autoballooning off, but it had no effect.
>> This is a Linux/libxl issue that I'm not sure what's the best way to
>> solve. Linux has the following 'workaround' in the balloon driver:
>>
>> err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
>> 		   &static_max);
>> if (err != 1)
>> 	static_max = new_target;
>> else
>> 	static_max >>= PAGE_SHIFT - 10;
>> target_diff = xen_pv_domain() ? 0
>> 		: static_max - balloon_stats.target_pages;
> Hmm, shouldn't PVH behave the same way as PV here? I don't think
> there is memory missing for PVH, opposed to HVM's firmware memory.
>
> Adding Boris for a second opinion.

/sigh

<rant>

Ballooning and guest memory accounting are a known, growing clustermess
of swamps.  The ballooning protocol itself is sufficiently broken as to
be useless outside of contrived scenarios, owing to the lack of any
ability to nack the request and the guest not knowing or being able to
work out how much RAM it actually has.

The Xen/toolstack/qemu-{trad,upstream}/hvmloader guessathon contributes
to lots of corner cases where things explode spectacularly on migration,
such as having more than 4 network cards, or having vram != 64M, or
generally anything involving PCI Passthrough.

Can we take this hint that maybe it's time to try fixing the problem
properly rather than applying even more duct tape?  I'd like to remind
people that there is a design which has been discussed at various
conferences in the past, and not overly objected to.

</rant>

~Andrew


* Re: [Memory Accounting] was: Re: PVH dom0 creation fails - the system freezes
  2018-07-25 18:56                 ` [Memory Accounting] was: " Andrew Cooper
@ 2018-07-25 23:07                   ` Boris Ostrovsky
  2018-07-26  9:41                     ` Juergen Gross
  2018-07-26  9:45                     ` George Dunlap
  0 siblings, 2 replies; 47+ messages in thread
From: Boris Ostrovsky @ 2018-07-25 23:07 UTC (permalink / raw)
  To: Andrew Cooper, Juergen Gross, Roger Pau Monné, bercarug
  Cc: xen-devel, Wei Liu, David Woodhouse, Jan Beulich, abelgun

On 07/25/2018 02:56 PM, Andrew Cooper wrote:
> On 25/07/18 17:29, Juergen Gross wrote:
>> On 25/07/18 18:12, Roger Pau Monné wrote:
>>> On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
>>>> On 07/25/2018 05:02 PM, Wei Liu wrote:
>>>>> On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
>>>>>> On 25/07/18 15:35, Roger Pau Monné wrote:
>>>>>>>> What could be causing the available memory loss problem?
>>>>>>> That seems to be Linux aggressively ballooning out memory, you go from
>>>>>>> 7129M total memory to 246M. Are you creating a lot of domains?
>>>>>> This might be related to the tools thinking dom0 is a PV domain.
>>>>> Good point.
>>>>>
>>>>> In that case, xenstore-ls -fp would also be useful. The output should
>>>>> show the balloon target for Dom0.
>>>>>
>>>>> You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
>>>>> if it makes any difference.
>>>>>
>>>>> Wei.
>>>> Also tried setting autoballooning off, but it had no effect.
>>> This is a Linux/libxl issue that I'm not sure what's the best way to
>>> solve. Linux has the following 'workaround' in the balloon driver:
>>>
>>> err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
>>> 		   &static_max);
>>> if (err != 1)
>>> 	static_max = new_target;
>>> else
>>> 	static_max >>= PAGE_SHIFT - 10;
>>> target_diff = xen_pv_domain() ? 0
>>> 		: static_max - balloon_stats.target_pages;
>> Hmm, shouldn't PVH behave the same way as PV here? I don't think
>> there is memory missing for PVH, opposed to HVM's firmware memory.
>>
>> Adding Boris for a second opinion.

(Notwithstanding Andrew's rant below ;-))

I am trying to remember --- what memory were we trying not to online for
HVM here?


-boris


> /sigh
>
> <rant>
>
> Ballooning, and guest memory accounting is a known, growing, clustermess
> of swamps.  The ballooning protocol itself is sufficiently broken as to
> be useless outside of contrived scenarios, owing to the lack of any
> ability to nack the request and the guest not knowing or being able to
> work out how much RAM it actually has.
>
> The Xen/toolstack/qemu-{trad,upstream}/hvmloader guessathon contributes
> to lots of corner cases where things explode spectacularly on migration,
> such as having more than 4 network cards, or having vram != 64M, or
> generally anything involving PCI Passthrough.
>
> Can we take this hint that maybe its time to try fixing the problem
> properly rather than applying even more duct tape?  I'd like to remind
> people that there is a design which has been discussed at various
> conferences in the past, not not overly objected to.
>
> </rant>
>
> ~Andrew



* Re: PVH dom0 creation fails - the system freezes
  2018-07-25 16:12             ` Roger Pau Monné
  2018-07-25 16:29               ` Juergen Gross
@ 2018-07-26  8:15               ` bercarug
  2018-07-26  8:31                 ` Juergen Gross
  1 sibling, 1 reply; 47+ messages in thread
From: bercarug @ 2018-07-26  8:15 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Wei Liu, Jan Beulich, abelgun, xen-devel, David Woodhouse

On 07/25/2018 07:12 PM, Roger Pau Monné wrote:
> On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
>> On 07/25/2018 05:02 PM, Wei Liu wrote:
>>> On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
>>>> On 25/07/18 15:35, Roger Pau Monné wrote:
>>>>>> What could be causing the available memory loss problem?
>>>>> That seems to be Linux aggressively ballooning out memory, you go from
>>>>> 7129M total memory to 246M. Are you creating a lot of domains?
>>>> This might be related to the tools thinking dom0 is a PV domain.
>>> Good point.
>>>
>>> In that case, xenstore-ls -fp would also be useful. The output should
>>> show the balloon target for Dom0.
>>>
>>> You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
>>> if it makes any difference.
>>>
>>> Wei.
>> Also tried setting autoballooning off, but it had no effect.
> This is a Linux/libxl issue that I'm not sure what's the best way to
> solve. Linux has the following 'workaround' in the balloon driver:
>
> err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
> 		   &static_max);
> if (err != 1)
> 	static_max = new_target;
> else
> 	static_max >>= PAGE_SHIFT - 10;
> target_diff = xen_pv_domain() ? 0
> 		: static_max - balloon_stats.target_pages;
>
> I suppose this is used to cope with the memory reporting mismatch
> usually seen on HVM guests. This however interacts quite badly on a
> PVH Dom0 that has for example:
>
> /local/domain/0/memory/target = "8391840"   (n0)
> /local/domain/0/memory/static-max = "17179869180"   (n0)
>
> One way to solve this is to set target and static-max to the same
> value initially, so that target_diff on Linux is 0. Another option
> would be to force target_diff = 0 for Dom0.
>
> I'm attaching a patch for libxl that should solve this, could you
> please give it a try and report back?
>
> I'm still unsure however about the best way to fix this, need to think
> about it.
>
> Roger.
> ---8<---
> diff --git a/tools/libxl/libxl_mem.c b/tools/libxl/libxl_mem.c
> index e551e09fed..2c984993d8 100644
> --- a/tools/libxl/libxl_mem.c
> +++ b/tools/libxl/libxl_mem.c
> @@ -151,7 +151,9 @@ retry_transaction:
>           *target_memkb = info.current_memkb;
>       }
>       if (staticmax == NULL) {
> -        libxl__xs_printf(gc, t, max_path, "%"PRIu64, info.max_memkb);
> +        libxl__xs_printf(gc, t, max_path, "%"PRIu64,
> +                         libxl__domain_type(gc, 0) == LIBXL_DOMAIN_TYPE_PV ?
> +                         info.max_memkb : info.current_memkb);
>           *max_memkb = info.max_memkb;
>       }
>   
>
>
I have tried Roger's patch and it fixed the memory decrease problem.
"xl list -l" no longer causes any issue.

The output of "xenstore-ls -fp" shows that both target and static-max
are now set to the same value.


Gabriel





* Re: PVH dom0 creation fails - the system freezes
  2018-07-26  8:15               ` bercarug
@ 2018-07-26  8:31                 ` Juergen Gross
  2018-07-26 11:05                   ` Roger Pau Monné
  0 siblings, 1 reply; 47+ messages in thread
From: Juergen Gross @ 2018-07-26  8:31 UTC (permalink / raw)
  To: bercarug, Roger Pau Monné
  Cc: xen-devel, Wei Liu, David Woodhouse, Jan Beulich, abelgun

On 26/07/18 10:15, bercarug@amazon.com wrote:
> On 07/25/2018 07:12 PM, Roger Pau Monné wrote:
>> On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
>>> On 07/25/2018 05:02 PM, Wei Liu wrote:
>>>> On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
>>>>> On 25/07/18 15:35, Roger Pau Monné wrote:
>>>>>>> What could be causing the available memory loss problem?
>>>>>> That seems to be Linux aggressively ballooning out memory, you go
>>>>>> from
>>>>>> 7129M total memory to 246M. Are you creating a lot of domains?
>>>>> This might be related to the tools thinking dom0 is a PV domain.
>>>> Good point.
>>>>
>>>> In that case, xenstore-ls -fp would also be useful. The output should
>>>> show the balloon target for Dom0.
>>>>
>>>> You can also try to set the autoballoon to off in /etc/xen/xl.cfg to
>>>> see
>>>> if it makes any difference.
>>>>
>>>> Wei.
>>> Also tried setting autoballooning off, but it had no effect.
>> This is a Linux/libxl issue that I'm not sure what's the best way to
>> solve. Linux has the following 'workaround' in the balloon driver:
>>
>> err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
>>            &static_max);
>> if (err != 1)
>>     static_max = new_target;
>> else
>>     static_max >>= PAGE_SHIFT - 10;
>> target_diff = xen_pv_domain() ? 0
>>         : static_max - balloon_stats.target_pages;
>>
>> I suppose this is used to cope with the memory reporting mismatch
>> usually seen on HVM guests. This however interacts quite badly on a
>> PVH Dom0 that has for example:
>>
>> /local/domain/0/memory/target = "8391840"   (n0)
>> /local/domain/0/memory/static-max = "17179869180"   (n0)
>>
>> One way to solve this is to set target and static-max to the same
>> value initially, so that target_diff on Linux is 0. Another option
>> would be to force target_diff = 0 for Dom0.
>>
>> I'm attaching a patch for libxl that should solve this, could you
>> please give it a try and report back?
>>
>> I'm still unsure however about the best way to fix this, need to think
>> about it.
>>
>> Roger.
>> ---8<---
>> diff --git a/tools/libxl/libxl_mem.c b/tools/libxl/libxl_mem.c
>> index e551e09fed..2c984993d8 100644
>> --- a/tools/libxl/libxl_mem.c
>> +++ b/tools/libxl/libxl_mem.c
>> @@ -151,7 +151,9 @@ retry_transaction:
>>           *target_memkb = info.current_memkb;
>>       }
>>       if (staticmax == NULL) {
>> -        libxl__xs_printf(gc, t, max_path, "%"PRIu64, info.max_memkb);
>> +        libxl__xs_printf(gc, t, max_path, "%"PRIu64,
>> +                         libxl__domain_type(gc, 0) ==
>> LIBXL_DOMAIN_TYPE_PV ?
>> +                         info.max_memkb : info.current_memkb);
>>           *max_memkb = info.max_memkb;
>>       }
>>  
>>
> I have tried Roger's patch and it fixed the memory decrease problem. "xl
> list -l"
> 
> no longer causes any issue.
> 
> The output of "xenstore-ls -fp" shows that both target and static-max
> are now
> 
> set to the same value.

Right.

Meaning that it will be impossible to add memory to PVH dom0 e.g. after
memory hotplug.


Juergen


* Re: [Memory Accounting] was: Re: PVH dom0 creation fails - the system freezes
  2018-07-25 23:07                   ` Boris Ostrovsky
@ 2018-07-26  9:41                     ` Juergen Gross
  2018-07-26  9:45                     ` George Dunlap
  1 sibling, 0 replies; 47+ messages in thread
From: Juergen Gross @ 2018-07-26  9:41 UTC (permalink / raw)
  To: Boris Ostrovsky, Andrew Cooper, Roger Pau Monné, bercarug
  Cc: xen-devel, Wei Liu, David Woodhouse, Jan Beulich, abelgun

On 26/07/18 01:07, Boris Ostrovsky wrote:
> On 07/25/2018 02:56 PM, Andrew Cooper wrote:
>> On 25/07/18 17:29, Juergen Gross wrote:
>>> On 25/07/18 18:12, Roger Pau Monné wrote:
>>>> On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
>>>>> On 07/25/2018 05:02 PM, Wei Liu wrote:
>>>>>> On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
>>>>>>> On 25/07/18 15:35, Roger Pau Monné wrote:
>>>>>>>>> What could be causing the available memory loss problem?
>>>>>>>> That seems to be Linux aggressively ballooning out memory, you go from
>>>>>>>> 7129M total memory to 246M. Are you creating a lot of domains?
>>>>>>> This might be related to the tools thinking dom0 is a PV domain.
>>>>>> Good point.
>>>>>>
>>>>>> In that case, xenstore-ls -fp would also be useful. The output should
>>>>>> show the balloon target for Dom0.
>>>>>>
>>>>>> You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
>>>>>> if it makes any difference.
>>>>>>
>>>>>> Wei.
>>>>> Also tried setting autoballooning off, but it had no effect.
>>>> This is a Linux/libxl issue that I'm not sure what's the best way to
>>>> solve. Linux has the following 'workaround' in the balloon driver:
>>>>
>>>> err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
>>>> 		   &static_max);
>>>> if (err != 1)
>>>> 	static_max = new_target;
>>>> else
>>>> 	static_max >>= PAGE_SHIFT - 10;
>>>> target_diff = xen_pv_domain() ? 0
>>>> 		: static_max - balloon_stats.target_pages;
>>> Hmm, shouldn't PVH behave the same way as PV here? I don't think
>>> there is memory missing for PVH, opposed to HVM's firmware memory.
>>>
>>> Adding Boris for a second opinion.
> 
> (Notwithstanding Andrews' rant below ;-))
> 
> I am trying to remember --- what memory were we trying not to online for
> HVM here?

The problem was to avoid automatically onlining memory that was added
later via hotplug (especially NVDIMMs). This happened because the memory
sizes reported via the E820 map and by the hypervisor differed, due to
HVM's firmware memory: the initial target size was larger than the
initial memory size reported via the E820 map, which led to automatic
ballooning up when an NVDIMM was added to the HVM guest.


Juergen


* Re: [Memory Accounting] was: Re: PVH dom0 creation fails - the system freezes
  2018-07-25 23:07                   ` Boris Ostrovsky
  2018-07-26  9:41                     ` Juergen Gross
@ 2018-07-26  9:45                     ` George Dunlap
  2018-07-26 11:11                       ` Roger Pau Monné
  1 sibling, 1 reply; 47+ messages in thread
From: George Dunlap @ 2018-07-26  9:45 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Juergen Gross, Wei Liu, Andrew Cooper, Jan Beulich, abelgun,
	xen-devel, Roger Pau Monné,
	David Woodhouse, bercarug

On Thu, Jul 26, 2018 at 12:07 AM, Boris Ostrovsky
<boris.ostrovsky@oracle.com> wrote:
> On 07/25/2018 02:56 PM, Andrew Cooper wrote:
>> On 25/07/18 17:29, Juergen Gross wrote:
>>> On 25/07/18 18:12, Roger Pau Monné wrote:
>>>> On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
>>>>> On 07/25/2018 05:02 PM, Wei Liu wrote:
>>>>>> On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
>>>>>>> On 25/07/18 15:35, Roger Pau Monné wrote:
>>>>>>>>> What could be causing the available memory loss problem?
>>>>>>>> That seems to be Linux aggressively ballooning out memory, you go from
>>>>>>>> 7129M total memory to 246M. Are you creating a lot of domains?
>>>>>>> This might be related to the tools thinking dom0 is a PV domain.
>>>>>> Good point.
>>>>>>
>>>>>> In that case, xenstore-ls -fp would also be useful. The output should
>>>>>> show the balloon target for Dom0.
>>>>>>
>>>>>> You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
>>>>>> if it makes any difference.
>>>>>>
>>>>>> Wei.
>>>>> Also tried setting autoballooning off, but it had no effect.
>>>> This is a Linux/libxl issue that I'm not sure what's the best way to
>>>> solve. Linux has the following 'workaround' in the balloon driver:
>>>>
>>>> err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
>>>>                &static_max);
>>>> if (err != 1)
>>>>     static_max = new_target;
>>>> else
>>>>     static_max >>= PAGE_SHIFT - 10;
>>>> target_diff = xen_pv_domain() ? 0
>>>>             : static_max - balloon_stats.target_pages;
>>> Hmm, shouldn't PVH behave the same way as PV here? I don't think
>>> there is memory missing for PVH, opposed to HVM's firmware memory.
>>>
>>> Adding Boris for a second opinion.
>
> (Notwithstanding Andrews' rant below ;-))
>
> I am trying to remember --- what memory were we trying not to online for
> HVM here?

My general memory of the situation is this:

* Balloon drivers are told to reach a "target" value for max_pages.
* max_pages includes all memory assigned to the guest, including video
ram, "special" pages, ipxe ROMs, bios ROMs from passed-through
devices, and so on.
* Unfortunately, the balloon driver doesn't know what its max_pages
value is and can't read it.
* So what the balloon drivers do at the moment (as I understand it) is
look at the memory *reported as RAM*, and do a calculation:
  visible_ram - target_max_pages = pages_in_balloon

You can probably see why this won't work -- the result is that the
guest balloons down to (target_max_pages + non_ram_pages).  This is
kind of messy for normal guests, but when you have a
populate-on-demand guest, that leaves non_ram_pages amount of PoD ram
in the guest.  The hypervisor then spends a huge amount of work
swapping the PoD pages around under the guest's feet, until it can't
find any more zeroed guest pages to use, and it crashes the guest.

The kludge we have right now is to make up a number for HVM guests
which is slightly larger than non_ram_pages, and tell the guest to aim
for *that* instead.
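
A toy version of that arithmetic (all numbers invented for
illustration):

#include <stdio.h>

int main(void)
{
    /* All values in 4K pages; numbers invented for illustration. */
    unsigned long max_pages        = 1048576 + 1296;   /* includes non-RAM pages */
    unsigned long non_ram_pages    = 1296;             /* vram, ROMs, special pages */
    unsigned long visible_ram      = max_pages - non_ram_pages;
    unsigned long target_max_pages = 786432;           /* what the toolstack asked for */

    /* The guest-side calculation described above: */
    unsigned long pages_in_balloon = visible_ram - target_max_pages;

    /* The guest settles at target + non_ram_pages, not at target: */
    printf("settles at %lu pages, wanted %lu\n",
           max_pages - pages_in_balloon, target_max_pages);
    return 0;
}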

I think what we need is for the *toolstack* to calculate the size of
the balloon rather than the guest, and tell the balloon driver how big
to make its balloon, rather than the balloon driver trying to figure
that out on its own.

We also need to get a handle on the allocation and tracking of
all the random "non-RAM" pages allocated to a guest; but that's a
slightly different region of the swamp.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: PVH dom0 creation fails - the system freezes
  2018-07-26  8:31                 ` Juergen Gross
@ 2018-07-26 11:05                   ` Roger Pau Monné
  0 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monné @ 2018-07-26 11:05 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Wei Liu, Jan Beulich, abelgun, xen-devel, David Woodhouse, bercarug

On Thu, Jul 26, 2018 at 10:31:21AM +0200, Juergen Gross wrote:
> On 26/07/18 10:15, bercarug@amazon.com wrote:
> > On 07/25/2018 07:12 PM, Roger Pau Monné wrote:
> >> On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
> >>> On 07/25/2018 05:02 PM, Wei Liu wrote:
> >>>> On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
> >>>>> On 25/07/18 15:35, Roger Pau Monné wrote:
> >>>>>>> What could be causing the available memory loss problem?
> >> That seems to be Linux aggressively ballooning out memory, you go from
> >> 7129M total memory to 246M. Are you creating a lot of domains?
> >>>>> This might be related to the tools thinking dom0 is a PV domain.
> >>>> Good point.
> >>>>
> >>>> In that case, xenstore-ls -fp would also be useful. The output should
> >>>> show the balloon target for Dom0.
> >>>>
> >>>> You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
> >>>> if it makes any difference.
> >>>>
> >>>> Wei.
> >>> Also tried setting autoballooning off, but it had no effect.
> >> This is a Linux/libxl issue that I'm not sure what's the best way to
> >> solve. Linux has the following 'workaround' in the balloon driver:
> >>
> >> err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
> >>            &static_max);
> >> if (err != 1)
> >>     static_max = new_target;
> >> else
> >>     static_max >>= PAGE_SHIFT - 10;
> >> target_diff = xen_pv_domain() ? 0
> >>         : static_max - balloon_stats.target_pages;
> >>
> >> I suppose this is used to cope with the memory reporting mismatch
> >> usually seen on HVM guests. This however interacts quite badly on a
> >> PVH Dom0 that has for example:
> >>
> >> /local/domain/0/memory/target = "8391840"   (n0)
> >> /local/domain/0/memory/static-max = "17179869180"   (n0)
> >>
> >> One way to solve this is to set target and static-max to the same
> >> value initially, so that target_diff on Linux is 0. Another option
> >> would be to force target_diff = 0 for Dom0.
> >>
> >> I'm attaching a patch for libxl that should solve this, could you
> >> please give it a try and report back?
> >>
> >> I'm still unsure however about the best way to fix this, need to think
> >> about it.
> >>
> >> Roger.
> >> ---8<---
> >> diff --git a/tools/libxl/libxl_mem.c b/tools/libxl/libxl_mem.c
> >> index e551e09fed..2c984993d8 100644
> >> --- a/tools/libxl/libxl_mem.c
> >> +++ b/tools/libxl/libxl_mem.c
> >> @@ -151,7 +151,9 @@ retry_transaction:
> >>           *target_memkb = info.current_memkb;
> >>       }
> >>       if (staticmax == NULL) {
> >> -        libxl__xs_printf(gc, t, max_path, "%"PRIu64, info.max_memkb);
> >> +        libxl__xs_printf(gc, t, max_path, "%"PRIu64,
> >> +                         libxl__domain_type(gc, 0) == LIBXL_DOMAIN_TYPE_PV ?
> >> +                         info.max_memkb : info.current_memkb);
> >>           *max_memkb = info.max_memkb;
> >>       }
> >>  
> >>
> > I have tried Roger's patch and it fixed the memory decrease problem.
> > "xl list -l" no longer causes any issue.
> > 
> > The output of "xenstore-ls -fp" shows that both target and static-max
> > are now set to the same value.
> 
> Right.
> 
> Meaning that it will be impossible to add memory to PVH dom0 e.g. after
> memory hotplug.

Likely. HVM guests ATM can only boot ballooned down (when target !=
max) by using PoD IIRC.

Right now if the user doesn't specify a 'max' value in the command
line for a PV(H) Dom0 it's set to LONG_MAX.

Maybe a better option would be to set max == current if no max is
specified on the command line?

This however doesn't fully solve the problem, since setting target !=
static-max for a Linux PVH guest will cause the balloon driver to go
nuts.
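
For the record, plugging the xenstore numbers from earlier in the
thread into the quoted driver logic shows how badly it goes. This is a
toy re-computation which assumes balloon_stats.target_pages starts out
roughly equal to the boot-time RAM:

#include <stdio.h>

int main(void)
{
    /* KiB values from the xenstore dump earlier in the thread. */
    unsigned long long static_max   = 17179869180ULL; /* memory/static-max */
    unsigned long long new_target   = 8391840ULL;     /* memory/target */
    /* Assumption: the driver booted with target_pages ~= boot RAM. */
    unsigned long long target_pages = new_target >> 2;  /* KiB to 4K pages */

    static_max >>= 2;                                    /* KiB to 4K pages */
    new_target >>= 2;

    /* The quoted non-PV branch of the workaround: */
    unsigned long long target_diff = static_max - target_pages;
    printf("target_diff = %llu pages\n", target_diff);

    /* new_target - target_diff underflows, so the driver balloons the
     * guest down as far as it can; hence "going nuts". */
    return 0;
}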

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: PVH dom0 creation fails - the system freezes
  2018-07-25 16:29               ` Juergen Gross
  2018-07-25 18:56                 ` [Memory Accounting] was: " Andrew Cooper
@ 2018-07-26 11:08                 ` Roger Pau Monné
  1 sibling, 0 replies; 47+ messages in thread
From: Roger Pau Monné @ 2018-07-26 11:08 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Wei Liu, Jan Beulich, abelgun, xen-devel, Boris Ostrovsky,
	David Woodhouse, bercarug

On Wed, Jul 25, 2018 at 06:29:25PM +0200, Juergen Gross wrote:
> On 25/07/18 18:12, Roger Pau Monné wrote:
> > On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
> >> On 07/25/2018 05:02 PM, Wei Liu wrote:
> >>> On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
> >>>> On 25/07/18 15:35, Roger Pau Monné wrote:
> >>>>>> What could be causing the available memory loss problem?
> >>>>> That seems to be Linux aggressively ballooning out memory, you go from
> >>>>> 7129M total memory to 246M. Are you creating a lot of domains?
> >>>> This might be related to the tools thinking dom0 is a PV domain.
> >>> Good point.
> >>>
> >>> In that case, xenstore-ls -fp would also be useful. The output should
> >>> show the balloon target for Dom0.
> >>>
> >>> You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
> >>> if it makes any difference.
> >>>
> >>> Wei.
> >> Also tried setting autoballooning off, but it had no effect.
> > 
> > This is a Linux/libxl issue that I'm not sure what's the best way to
> > solve. Linux has the following 'workaround' in the balloon driver:
> > 
> > err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
> > 		   &static_max);
> > if (err != 1)
> > 	static_max = new_target;
> > else
> > 	static_max >>= PAGE_SHIFT - 10;
> > target_diff = xen_pv_domain() ? 0
> > 		: static_max - balloon_stats.target_pages;
> 
> Hmm, shouldn't PVH behave the same way as PV here? I don't think
> there is memory missing for PVH, as opposed to HVM's firmware memory.

There's memory missing for PVH, eg: the console page, the TSS for real
mode (if required), the identity page tables for running in
protected mode (if required) and the memory used to store the ACPI
tables.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Memory Accounting] was: Re: PVH dom0 creation fails - the system freezes
  2018-07-26  9:45                     ` George Dunlap
@ 2018-07-26 11:11                       ` Roger Pau Monné
  2018-07-26 11:22                         ` Juergen Gross
  2018-07-26 11:23                         ` George Dunlap
  0 siblings, 2 replies; 47+ messages in thread
From: Roger Pau Monné @ 2018-07-26 11:11 UTC (permalink / raw)
  To: George Dunlap
  Cc: Juergen Gross, Wei Liu, Andrew Cooper, Jan Beulich, abelgun,
	xen-devel, Boris Ostrovsky, David Woodhouse, bercarug

On Thu, Jul 26, 2018 at 10:45:08AM +0100, George Dunlap wrote:
> On Thu, Jul 26, 2018 at 12:07 AM, Boris Ostrovsky
> <boris.ostrovsky@oracle.com> wrote:
> > On 07/25/2018 02:56 PM, Andrew Cooper wrote:
> >> On 25/07/18 17:29, Juergen Gross wrote:
> >>> On 25/07/18 18:12, Roger Pau Monné wrote:
> >>>> On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
> >>>>> On 07/25/2018 05:02 PM, Wei Liu wrote:
> >>>>>> On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
> >>>>>>> On 25/07/18 15:35, Roger Pau Monné wrote:
> >>>>>>>>> What could be causing the available memory loss problem?
> >>>>>>>> That seems to be Linux aggressively ballooning out memory, you go from
> >>>>>>>> 7129M total memory to 246M. Are you creating a lot of domains?
> >>>>>>> This might be related to the tools thinking dom0 is a PV domain.
> >>>>>> Good point.
> >>>>>>
> >>>>>> In that case, xenstore-ls -fp would also be useful. The output should
> >>>>>> show the balloon target for Dom0.
> >>>>>>
> >>>>>> You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
> >>>>>> if it makes any difference.
> >>>>>>
> >>>>>> Wei.
> >>>>> Also tried setting autoballooning off, but it had no effect.
> >>>> This is a Linux/libxl issue that I'm not sure what's the best way to
> >>>> solve. Linux has the following 'workaround' in the balloon driver:
> >>>>
> >>>> err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
> >>>>                &static_max);
> >>>> if (err != 1)
> >>>>     static_max = new_target;
> >>>> else
> >>>>     static_max >>= PAGE_SHIFT - 10;
> >>>> target_diff = xen_pv_domain() ? 0
> >>>>             : static_max - balloon_stats.target_pages;
> >>> Hmm, shouldn't PVH behave the same way as PV here? I don't think
> >>> there is memory missing for PVH, as opposed to HVM's firmware memory.
> >>>
> >>> Adding Boris for a second opinion.
> >
> > (Notwithstanding Andrew's rant below ;-))
> >
> > I am trying to remember --- what memory were we trying not to online for
> > HVM here?
> 
> My general memory of the situation is this:
> 
> * Balloon drivers are told to reach a "target" value for max_pages.
> * max_pages includes all memory assigned to the guest, including video
> ram, "special" pages, ipxe ROMs, bios ROMs from passed-through
> devices, and so on.
> * Unfortunately, the balloon driver doesn't know what its max_pages
> value is and can't read it.
> * So what the balloon drivers do at the moment (as I understand it) is
> look at the memory *reported as RAM*, and do a calculation:
>   visible_ram - target_max_pages = pages_in_balloon
> 
> You can probably see why this won't work -- the result is that the
> guest balloons down to (target_max_pages + non_ram_pages).  This is
> kind of messy for normal guests, but when you have a
> populate-on-demand guest, that leaves non_ram_pages amount of PoD ram
> in the guest.  The hypervisor then spends a huge amount of work
> swapping the PoD pages around under the guest's feet, until it can't
> find any more zeroed guest pages to use, and it crashes the guest.
> 
> The kludge we have right now is to make up a number for HVM guests
> which is slightly larger than non_ram_pages, and tell the guest to aim
> for *that* instead.
> 
> I think what we need is for the *toolstack* to calculate the size of
> the balloon rather than the guest, and tell the balloon driver how big
> to make its balloon, rather than the balloon driver trying to figure
> that out on its own.

Maybe the best option would be for the toolstack to fetch the e820
memory map and set the target based on the size of the RAM regions in
there for PVH Dom0? That would certainly match the expectations of the
guest.

Note that for DomUs if hvmloader (or any other component) inside of
the guest changes the memory map it would also have to adjust the
value in the xenstore 'target' node.
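
A rough sketch of that toolstack-side computation is below.
xc_get_machine_memory_map() is the libxc wrapper around the machine
E820; whether that is the right map to use for a PVH Dom0 is exactly
the open question here, so treat this as illustration only:

#include <stdint.h>
#include <xenctrl.h>
#include <xc_e820.h>    /* struct e820entry, E820_RAM, E820MAX */

/* Sum the RAM regions of the E820 map to derive a balloon target in
 * KiB, the unit used by the xenstore memory/target node. */
static uint64_t ram_target_kb(xc_interface *xch)
{
    struct e820entry map[E820MAX];
    int i, nr = xc_get_machine_memory_map(xch, map, E820MAX);
    uint64_t ram_bytes = 0;

    if ( nr < 0 )
        return 0;    /* error handling elided */

    for ( i = 0; i < nr; i++ )
        if ( map[i].type == E820_RAM )
            ram_bytes += map[i].size;

    return ram_bytes >> 10;    /* bytes -> KiB */
}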

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Memory Accounting] was: Re: PVH dom0 creation fails - the system freezes
  2018-07-26 11:11                       ` Roger Pau Monné
@ 2018-07-26 11:22                         ` Juergen Gross
  2018-07-26 11:27                           ` George Dunlap
  2018-07-26 13:50                           ` Roger Pau Monné
  2018-07-26 11:23                         ` George Dunlap
  1 sibling, 2 replies; 47+ messages in thread
From: Juergen Gross @ 2018-07-26 11:22 UTC (permalink / raw)
  To: Roger Pau Monné, George Dunlap
  Cc: Wei Liu, Andrew Cooper, Jan Beulich, abelgun, xen-devel,
	Boris Ostrovsky, David Woodhouse, bercarug

On 26/07/18 13:11, Roger Pau Monné wrote:
> On Thu, Jul 26, 2018 at 10:45:08AM +0100, George Dunlap wrote:
>> On Thu, Jul 26, 2018 at 12:07 AM, Boris Ostrovsky
>> <boris.ostrovsky@oracle.com> wrote:
>>> On 07/25/2018 02:56 PM, Andrew Cooper wrote:
>>>> On 25/07/18 17:29, Juergen Gross wrote:
>>>>> On 25/07/18 18:12, Roger Pau Monné wrote:
>>>>>> On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
>>>>>>> On 07/25/2018 05:02 PM, Wei Liu wrote:
>>>>>>>> On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
>>>>>>>>> On 25/07/18 15:35, Roger Pau Monné wrote:
>>>>>>>>>>> What could be causing the available memory loss problem?
>>>>>>>>>> That seems to be Linux aggressively ballooning out memory, you go from
>>>>>>>>>> 7129M total memory to 246M. Are you creating a lot of domains?
>>>>>>>>> This might be related to the tools thinking dom0 is a PV domain.
>>>>>>>> Good point.
>>>>>>>>
>>>>>>>> In that case, xenstore-ls -fp would also be useful. The output should
>>>>>>>> show the balloon target for Dom0.
>>>>>>>>
>>>>>>>> You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
>>>>>>>> if it makes any difference.
>>>>>>>>
>>>>>>>> Wei.
>>>>>>> Also tried setting autoballooning off, but it had no effect.
>>>>>> This is a Linux/libxl issue that I'm not sure what's the best way to
>>>>>> solve. Linux has the following 'workaround' in the balloon driver:
>>>>>>
>>>>>> err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
>>>>>>                &static_max);
>>>>>> if (err != 1)
>>>>>>     static_max = new_target;
>>>>>> else
>>>>>>     static_max >>= PAGE_SHIFT - 10;
>>>>>> target_diff = xen_pv_domain() ? 0
>>>>>>             : static_max - balloon_stats.target_pages;
>>>>> Hmm, shouldn't PVH behave the same way as PV here? I don't think
>>>>> there is memory missing for PVH, as opposed to HVM's firmware memory.
>>>>>
>>>>> Adding Boris for a second opinion.
>>>
>>> (Notwithstanding Andrew's rant below ;-))
>>>
>>> I am trying to remember --- what memory were we trying not to online for
>>> HVM here?
>>
>> My general memory of the situation is this:
>>
>> * Balloon drivers are told to reach a "target" value for max_pages.
>> * max_pages includes all memory assigned to the guest, including video
>> ram, "special" pages, ipxe ROMs, bios ROMs from passed-through
>> devices, and so on.
>> * Unfortunately, the balloon driver doesn't know what its max_pages
>> value is and can't read it.
>> * So what the balloon drivers do at the moment (as I understand it) is
>> look at the memory *reported as RAM*, and do a calculation:
>>   visible_ram - target_max_pages = pages_in_balloon
>>
>> You can probably see why this won't work -- the result is that the
>> guest balloons down to (target_max_pages + non_ram_pages).  This is
>> kind of messy for normal guests, but when you have a
>> populate-on-demand guest, that leaves non_ram_pages amount of PoD ram
>> in the guest.  The hypervisor then spends a huge amount of work
>> swapping the PoD pages around under the guest's feet, until it can't
>> find any more zeroed guest pages to use, and it crashes the guest.
>>
>> The kludge we have right now is to make up a number for HVM guests
>> which is slightly larger than non_ram_pages, and tell the guest to aim
>> for *that* instead.
>>
>> I think what we need is for the *toolstack* to calculate the size of
>> the balloon rather than the guest, and tell the balloon driver how big
>> to make its balloon, rather than the balloon driver trying to figure
>> that out on its own.
> 
> Maybe the best option would be for the toolstack to fetch the e820
> memory map and set the target based on the size of the RAM regions in
> there for PVH Dom0? That would certainly match the expectations of the
> guest.
> 
> Note that for DomUs if hvmloader (or any other component) inside of
> the guest changes the memory map it would also have to adjust the
> value in the xenstore 'target' node.

How would it do that later when the guest is already running?

I believe the right way would be to design a proper ballooning interface
suitable for all kinds of guests from scratch. This should include how
to deal with hotplug of memory or booting with mem < mem_max. Whether
PoD should be included should be discussed, too.

After defining that interface we can look for a proper way to select
the correct interface (old or new) in the guest and how to communicate
that selection to the host.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Memory Accounting] was: Re: PVH dom0 creation fails - the system freezes
  2018-07-26 11:11                       ` Roger Pau Monné
  2018-07-26 11:22                         ` Juergen Gross
@ 2018-07-26 11:23                         ` George Dunlap
  1 sibling, 0 replies; 47+ messages in thread
From: George Dunlap @ 2018-07-26 11:23 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Wei Liu, Andrew Cooper, Jan Beulich, abelgun,
	xen-devel, Boris Ostrovsky, David Woodhouse, bercarug

On Thu, Jul 26, 2018 at 12:11 PM, Roger Pau Monné <roger.pau@citrix.com> wrote:
> On Thu, Jul 26, 2018 at 10:45:08AM +0100, George Dunlap wrote:
>> On Thu, Jul 26, 2018 at 12:07 AM, Boris Ostrovsky
>> <boris.ostrovsky@oracle.com> wrote:
>> > On 07/25/2018 02:56 PM, Andrew Cooper wrote:
>> >> On 25/07/18 17:29, Juergen Gross wrote:
>> >>> On 25/07/18 18:12, Roger Pau Monné wrote:
>> >>>> On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
>> >>>>> On 07/25/2018 05:02 PM, Wei Liu wrote:
>> >>>>>> On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
>> >>>>>>> On 25/07/18 15:35, Roger Pau Monné wrote:
>> >>>>>>>>> What could be causing the available memory loss problem?
>> >>>>>>>> That seems to be Linux aggressively ballooning out memory, you go from
>> >>>>>>>> 7129M total memory to 246M. Are you creating a lot of domains?
>> >>>>>>> This might be related to the tools thinking dom0 is a PV domain.
>> >>>>>> Good point.
>> >>>>>>
>> >>>>>> In that case, xenstore-ls -fp would also be useful. The output should
>> >>>>>> show the balloon target for Dom0.
>> >>>>>>
>> >>>>>> You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
>> >>>>>> if it makes any difference.
>> >>>>>>
>> >>>>>> Wei.
>> >>>>> Also tried setting autoballooning off, but it had no effect.
>> >>>> This is a Linux/libxl issue that I'm not sure what's the best way to
>> >>>> solve. Linux has the following 'workaround' in the balloon driver:
>> >>>>
>> >>>> err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
>> >>>>                &static_max);
>> >>>> if (err != 1)
>> >>>>     static_max = new_target;
>> >>>> else
>> >>>>     static_max >>= PAGE_SHIFT - 10;
>> >>>> target_diff = xen_pv_domain() ? 0
>> >>>>             : static_max - balloon_stats.target_pages;
>> >>> Hmm, shouldn't PVH behave the same way as PV here? I don't think
>> >>> there is memory missing for PVH, as opposed to HVM's firmware memory.
>> >>>
>> >>> Adding Boris for a second opinion.
>> >
>> > (Notwithstanding Andrew's rant below ;-))
>> >
>> > I am trying to remember --- what memory were we trying not to online for
>> > HVM here?
>>
>> My general memory of the situation is this:
>>
>> * Balloon drivers are told to reach a "target" value for max_pages.
>> * max_pages includes all memory assigned to the guest, including video
>> ram, "special" pages, ipxe ROMs, bios ROMs from passed-through
>> devices, and so on.
>> * Unfortunately, the balloon driver doesn't know what its max_pages
>> value is and can't read it.
>> * So what the balloon drivers do at the moment (as I understand it) is
>> look at the memory *reported as RAM*, and do a calculation:
>>   visible_ram - target_max_pages = pages_in_balloon
>>
>> You can probably see why this won't work -- the result is that the
>> guest balloons down to (target_max_pages + non_ram_pages).  This is
>> kind of messy for normal guests, but when you have a
>> populate-on-demand guest, that leaves non_ram_pages amount of PoD ram
>> in the guest.  The hypervisor then spends a huge amount of work
>> swapping the PoD pages around under the guest's feet, until it can't
>> find any more zeroed guest pages to use, and it crashes the guest.
>>
>> The kludge we have right now is to make up a number for HVM guests
>> which is slightly larger than non_ram_pages, and tell the guest to aim
>> for *that* instead.
>>
>> I think what we need is for the *toolstack* to calculate the size of
>> the balloon rather than the guest, and tell the balloon driver how big
>> to make its balloon, rather than the balloon driver trying to figure
>> that out on its own.
>
> Maybe the best option would be for the toolstack to fetch the e820
> memory map and set the target based on the size of the RAM regions in
> there for PVH Dom0? That would certainly match the expectations of the
> guest.

Right; so:
* Expecting the guest to calculate its own balloon size was always
an architectural mistake.
* What we're tripping over now is that the hack we used to paper over
the architectural mistake for HVM doesn't apply for PVH.

We can either:
1. Extend the hack to paper it over for PVH as well
2. Fix it properly by addressing the underlying architectural mistake.

Given how long it's been that nobody's had bandwidth to do #2, I think
at the moment we don't really have any option except to do #1.
But we shouldn't forget that we need to do #2 at some point.

> Note that for DomUs if hvmloader (or any other component) inside of
> the guest changes the memory map it would also have to adjust the
> value in the xenstore 'target' node.

Yes, this sort of thing is part of the reason the swamp hasn't been
drained yet, just fences put up in a few places to keep the alligators
in.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Memory Accounting] was: Re: PVH dom0 creation fails - the system freezes
  2018-07-26 11:22                         ` Juergen Gross
@ 2018-07-26 11:27                           ` George Dunlap
  2018-07-26 12:19                             ` Juergen Gross
  2018-07-26 13:50                           ` Roger Pau Monné
  1 sibling, 1 reply; 47+ messages in thread
From: George Dunlap @ 2018-07-26 11:27 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Wei Liu, Andrew Cooper, Jan Beulich, abelgun, xen-devel,
	Boris Ostrovsky, Roger Pau Monné,
	David Woodhouse, bercarug

On Thu, Jul 26, 2018 at 12:22 PM, Juergen Gross <jgross@suse.com> wrote:
> I believe the right way would be to design a proper ballooning interface
> suitable for all kinds of guests from scratch. This should include how
> to deal with hotplug of memory or booting with mem < mem_max. Whether
> PoD should be included should be discussed, too.

Juergen, what do you think of this interface:  The toolstack looks at
all the available information about the guest and determines what size
the guest's balloon needs to be.  It then writes
memory/target_balloon_size.  The guest balloon driver simply attempts
to allocate / free memory to make the balloon that size.

We'll have to leave memory/target handling as is until all old
versions of the balloon driver disappear.
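
A sketch of what the guest side could then shrink to. The node name is
taken from the proposal above; the handler itself is hypothetical, but
xenbus_scanf(), balloon_set_new_target() and the balloon_stats fields
are the existing Linux driver interfaces:

#include <xen/balloon.h>
#include <xen/xenbus.h>

/* Hypothetical replacement for watch_target(): no guessing about
 * non-RAM pages, just make the balloon the size we were told. */
static void watch_balloon_size(struct xenbus_watch *watch,
                               const char *path, const char *token)
{
    unsigned long long balloon_pages;

    if (xenbus_scanf(XBT_NIL, "memory", "target_balloon_size",
                     "%llu", &balloon_pages) != 1)
        return; /* node absent: keep the legacy memory/target handling */

    /* Pages the driver currently accounts for, minus the requested
     * balloon size, gives the new allocation target. */
    balloon_set_new_target(balloon_stats.current_pages
                           + balloon_stats.balloon_low
                           + balloon_stats.balloon_high
                           - balloon_pages);
}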

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Memory Accounting] was: Re: PVH dom0 creation fails - the system freezes
  2018-07-26 11:27                           ` George Dunlap
@ 2018-07-26 12:19                             ` Juergen Gross
  2018-07-26 14:44                               ` George Dunlap
  0 siblings, 1 reply; 47+ messages in thread
From: Juergen Gross @ 2018-07-26 12:19 UTC (permalink / raw)
  To: George Dunlap
  Cc: Wei Liu, Andrew Cooper, Jan Beulich, abelgun, xen-devel,
	Boris Ostrovsky, Roger Pau Monné,
	David Woodhouse, bercarug

On 26/07/18 13:27, George Dunlap wrote:
> On Thu, Jul 26, 2018 at 12:22 PM, Juergen Gross <jgross@suse.com> wrote:
>> I believe the right way would be to design a proper ballooning interface
>> suitable for all kinds of guests from scratch. This should include how
>> to deal with hotplug of memory or booting with mem < mem_max. Whether
>> PoD should be included should be discussed, too.
> 
> Juergen, what do you think of this interface:  The toolstack looks at
> all the available information about the guest and determines what size
> the guest's balloon needs to be.  It then writes
> memory/target_balloon_size.  The guest balloon driver simply attempts
> to allocate / free memory to make the balloon that size.

This should be per NUMA node. So memory/node<n>/target-balloon-size
(I don't like the underscores in the node name).

When adding node-specific memory hotplug this scheme could be used
for that purpose, too.

So any memory hotplugged will immediately be used by the guest.

That's just stating a fact, no criticism.

> We'll have to leave memory/target handling as is until all old
> versions of the balloon driver disappear.

Sure.

I'd like the guest to write xenstore node
control/feature-per-node-memory to indicate it will use the new
node(s) instead of the memory/target node.
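
Spelled out as a xenstore layout, with invented values and the names
from this proposal, that would be something like:

/local/domain/<domid>/control/feature-per-node-memory = "1"
/local/domain/<domid>/memory/node0/target-balloon-size = "262144"
/local/domain/<domid>/memory/node1/target-balloon-size = "0"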


Juergen


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Memory Accounting] was: Re: PVH dom0 creation fails - the system freezes
  2018-07-26 11:22                         ` Juergen Gross
  2018-07-26 11:27                           ` George Dunlap
@ 2018-07-26 13:50                           ` Roger Pau Monné
  2018-07-26 13:58                             ` Juergen Gross
  1 sibling, 1 reply; 47+ messages in thread
From: Roger Pau Monné @ 2018-07-26 13:50 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Wei Liu, Andrew Cooper, George Dunlap, Jan Beulich, abelgun,
	xen-devel, Boris Ostrovsky, David Woodhouse, bercarug

On Thu, Jul 26, 2018 at 01:22:33PM +0200, Juergen Gross wrote:
> On 26/07/18 13:11, Roger Pau Monné wrote:
> > On Thu, Jul 26, 2018 at 10:45:08AM +0100, George Dunlap wrote:
> >> On Thu, Jul 26, 2018 at 12:07 AM, Boris Ostrovsky
> >> <boris.ostrovsky@oracle.com> wrote:
> >>> On 07/25/2018 02:56 PM, Andrew Cooper wrote:
> >>>> On 25/07/18 17:29, Juergen Gross wrote:
> >>>>> On 25/07/18 18:12, Roger Pau Monné wrote:
> >>>>>> On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
> >>>>>>> On 07/25/2018 05:02 PM, Wei Liu wrote:
> >>>>>>>> On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
> >>>>>>>>> On 25/07/18 15:35, Roger Pau Monné wrote:
> >>>>>>>>>>> What could be causing the available memory loss problem?
> >>>>>>>>>> That seems to be Linux aggressively ballooning out memory, you go from
> >>>>>>>>>> 7129M total memory to 246M. Are you creating a lot of domains?
> >>>>>>>>> This might be related to the tools thinking dom0 is a PV domain.
> >>>>>>>> Good point.
> >>>>>>>>
> >>>>>>>> In that case, xenstore-ls -fp would also be useful. The output should
> >>>>>>>> show the balloon target for Dom0.
> >>>>>>>>
> >>>>>>>> You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
> >>>>>>>> if it makes any difference.
> >>>>>>>>
> >>>>>>>> Wei.
> >>>>>>> Also tried setting autoballooning off, but it had no effect.
> >>>>>> This is a Linux/libxl issue that I'm not sure what's the best way to
> >>>>>> solve. Linux has the following 'workaround' in the balloon driver:
> >>>>>>
> >>>>>> err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
> >>>>>>                &static_max);
> >>>>>> if (err != 1)
> >>>>>>     static_max = new_target;
> >>>>>> else
> >>>>>>     static_max >>= PAGE_SHIFT - 10;
> >>>>>> target_diff = xen_pv_domain() ? 0
> >>>>>>             : static_max - balloon_stats.target_pages;
> >>>>> Hmm, shouldn't PVH behave the same way as PV here? I don't think
> >>>>> there is memory missing for PVH, as opposed to HVM's firmware memory.
> >>>>>
> >>>>> Adding Boris for a second opinion.
> >>>
> >>> (Notwithstanding Andrew's rant below ;-))
> >>>
> >>> I am trying to remember --- what memory were we trying not to online for
> >>> HVM here?
> >>
> >> My general memory of the situation is this:
> >>
> >> * Balloon drivers are told to reach a "target" value for max_pages.
> >> * max_pages includes all memory assigned to the guest, including video
> >> ram, "special" pages, ipxe ROMs, bios ROMs from passed-through
> >> devices, and so on.
> >> * Unfortunately, the balloon driver doesn't know what its max_pages
> >> value is and can't read it.
> >> * So what the balloon drivers do at the moment (as I understand it) is
> >> look at the memory *reported as RAM*, and do a calculation:
> >>   visible_ram - target_max_pages = pages_in_balloon
> >>
> >> You can probably see why this won't work -- the result is that the
> >> guest balloons down to (target_max_pages + non_ram_pages).  This is
> >> kind of messy for normal guests, but when you have a
> >> populate-on-demand guest, that leaves non_ram_pages amount of PoD ram
> >> in the guest.  The hypervisor then spends a huge amount of work
> >> swapping the PoD pages around under the guest's feet, until it can't
> >> find any more zeroed guest pages to use, and it crashes the guest.
> >>
> >> The kludge we have right now is to make up a number for HVM guests
> >> which is slightly larger than non_ram_pages, and tell the guest to aim
> >> for *that* instead.
> >>
> >> I think what we need is for the *toolstack* to calculate the size of
> >> the balloon rather than the guest, and tell the balloon driver how big
> >> to make its balloon, rather than the balloon driver trying to figure
> >> that out on its own.
> > 
> > Maybe the best option would be for the toolstack to fetch the e820
> > memory map and set the target based on the size of the RAM regions in
> > there for PVH Dom0? That would certainly match the expectations of the
> > guest.
> > 
> > Note that for DomUs if hvmloader (or any other component) inside of
> > the guest changes the memory map it would also have to adjust the
> > value in the xenstore 'target' node.
> 
> How would it do that later when the guest is already running?

hvmloader should modify the 'target' xenstore node if it changes the
memory map.

So the value provided by the toolstack would match the amount of RAM
in the memory map up to the point where the guest is started; from
there on, anything inside the guest that changes the memory map should
also update the xenstore value.
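
Roughly, hvmloader would then have to do something like the sketch
below after stealing RAM from the map. The helper names are
assumptions; hvmloader's real xenstore accessors may differ:

/* Illustrative only: shrink memory/target after removing 'stolen_kb'
 * of RAM from the guest memory map. xenstore_read()/xenstore_write()
 * stand in for hvmloader's xenstore helpers. */
static void shrink_target(unsigned long long stolen_kb)
{
    unsigned long long target;
    char buf[24];

    target = strtoull(xenstore_read("memory/target", "0"), NULL, 10);
    snprintf(buf, sizeof(buf), "%llu", target - stolen_kb);
    xenstore_write("memory/target", buf);
}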

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Memory Accounting] was: Re: PVH dom0 creation fails - the system freezes
  2018-07-26 13:50                           ` Roger Pau Monné
@ 2018-07-26 13:58                             ` Juergen Gross
  2018-07-26 14:35                               ` Roger Pau Monné
  0 siblings, 1 reply; 47+ messages in thread
From: Juergen Gross @ 2018-07-26 13:58 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Wei Liu, Andrew Cooper, George Dunlap, Jan Beulich, abelgun,
	xen-devel, Boris Ostrovsky, David Woodhouse, bercarug

On 26/07/18 15:50, Roger Pau Monné wrote:
> On Thu, Jul 26, 2018 at 01:22:33PM +0200, Juergen Gross wrote:
>> On 26/07/18 13:11, Roger Pau Monné wrote:
>>> On Thu, Jul 26, 2018 at 10:45:08AM +0100, George Dunlap wrote:
>>>> On Thu, Jul 26, 2018 at 12:07 AM, Boris Ostrovsky
>>>> <boris.ostrovsky@oracle.com> wrote:
>>>>> On 07/25/2018 02:56 PM, Andrew Cooper wrote:
>>>>>> On 25/07/18 17:29, Juergen Gross wrote:
>>>>>>> On 25/07/18 18:12, Roger Pau Monné wrote:
>>>>>>>> On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
>>>>>>>>> On 07/25/2018 05:02 PM, Wei Liu wrote:
>>>>>>>>>> On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
>>>>>>>>>>> On 25/07/18 15:35, Roger Pau Monné wrote:
>>>>>>>>>>>>> What could be causing the available memory loss problem?
>>>>>>>>>>>> That seems to be Linux aggressively ballooning out memory, you go from
>>>>>>>>>>>> 7129M total memory to 246M. Are you creating a lot of domains?
>>>>>>>>>>> This might be related to the tools thinking dom0 is a PV domain.
>>>>>>>>>> Good point.
>>>>>>>>>>
>>>>>>>>>> In that case, xenstore-ls -fp would also be useful. The output should
>>>>>>>>>> show the balloon target for Dom0.
>>>>>>>>>>
>>>>>>>>>> You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
>>>>>>>>>> if it makes any difference.
>>>>>>>>>>
>>>>>>>>>> Wei.
>>>>>>>>> Also tried setting autoballooning off, but it had no effect.
>>>>>>>> This is a Linux/libxl issue that I'm not sure what's the best way to
>>>>>>>> solve. Linux has the following 'workaround' in the balloon driver:
>>>>>>>>
>>>>>>>> err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
>>>>>>>>                &static_max);
>>>>>>>> if (err != 1)
>>>>>>>>     static_max = new_target;
>>>>>>>> else
>>>>>>>>     static_max >>= PAGE_SHIFT - 10;
>>>>>>>> target_diff = xen_pv_domain() ? 0
>>>>>>>>             : static_max - balloon_stats.target_pages;
>>>>>>> Hmm, shouldn't PVH behave the same way as PV here? I don't think
>>>>>>> there is memory missing for PVH, as opposed to HVM's firmware memory.
>>>>>>>
>>>>>>> Adding Boris for a second opinion.
>>>>>
>>>>> (Notwithstanding Andrew's rant below ;-))
>>>>>
>>>>> I am trying to remember --- what memory were we trying not to online for
>>>>> HVM here?
>>>>
>>>> My general memory of the situation is this:
>>>>
>>>> * Balloon drivers are told to reach a "target" value for max_pages.
>>>> * max_pages includes all memory assigned to the guest, including video
>>>> ram, "special" pages, ipxe ROMs, bios ROMs from passed-through
>>>> devices, and so on.
> >>> * Unfortunately, the balloon driver doesn't know what its max_pages
>>>> value is and can't read it.
>>>> * So what the balloon drivers do at the moment (as I understand it) is
>>>> look at the memory *reported as RAM*, and do a calculation:
>>>>   visible_ram - target_max_pages = pages_in_balloon
>>>>
>>>> You can probably see why this won't work -- the result is that the
>>>> guest balloons down to (target_max_pages + non_ram_pages).  This is
>>>> kind of messy for normal guests, but when you have a
>>>> populate-on-demand guest, that leaves non_ram_pages amount of PoD ram
>>>> in the guest.  The hypervisor then spends a huge amount of work
>>>> swapping the PoD pages around under the guest's feet, until it can't
>>>> find any more zeroed guest pages to use, and it crashes the guest.
>>>>
>>>> The kludge we have right now is to make up a number for HVM guests
>>>> which is slightly larger than non_ram_pages, and tell the guest to aim
>>>> for *that* instead.
>>>>
>>>> I think what we need is for the *toolstack* to calculate the size of
>>>> the balloon rather than the guest, and tell the balloon driver how big
>>>> to make its balloon, rather than the balloon driver trying to figure
>>>> that out on its own.
>>>
>>> Maybe the best option would be for the toolstack to fetch the e820
>>> memory map and set the target based on the size of the RAM regions in
>>> there for PVH Dom0? That would certainly match the expectations of the
>>> guest.
>>>
>>> Note that for DomUs if hvmloader (or any other component) inside of
>>> the guest changes the memory map it would also have to adjust the
>>> value in the xenstore 'target' node.
>>
>> How would it do that later when the guest is already running?
> 
> hvmloader should modify the 'target' xenstore node if it changes the
> memory map.
> 
> So the value provided by the toolstack would match the amount of RAM
> in the memory map up to the point where the guest is started, from
> there on anything inside the guest changing the memory map should also
> change the xenstore value.

So what should libxl write into target when the user specifies a new
value via "xl mem-set" then? It doesn't know whether the guest is still
trying to reach the old target, so it can't trust the currently
allocated memory and the target value in Xenstore to match.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Memory Accounting] was: Re: PVH dom0 creation fails - the system freezes
  2018-07-26 13:58                             ` Juergen Gross
@ 2018-07-26 14:35                               ` Roger Pau Monné
  0 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monné @ 2018-07-26 14:35 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Wei Liu, Andrew Cooper, George Dunlap, Jan Beulich, abelgun,
	xen-devel, Boris Ostrovsky, David Woodhouse, bercarug

On Thu, Jul 26, 2018 at 03:58:43PM +0200, Juergen Gross wrote:
> On 26/07/18 15:50, Roger Pau Monné wrote:
> > On Thu, Jul 26, 2018 at 01:22:33PM +0200, Juergen Gross wrote:
> >> On 26/07/18 13:11, Roger Pau Monné wrote:
> >>> On Thu, Jul 26, 2018 at 10:45:08AM +0100, George Dunlap wrote:
> >>>> On Thu, Jul 26, 2018 at 12:07 AM, Boris Ostrovsky
> >>>> <boris.ostrovsky@oracle.com> wrote:
> >>>>> On 07/25/2018 02:56 PM, Andrew Cooper wrote:
> >>>>>> On 25/07/18 17:29, Juergen Gross wrote:
> >>>>>>> On 25/07/18 18:12, Roger Pau Monné wrote:
> >>>>>>>> On Wed, Jul 25, 2018 at 05:05:35PM +0300, bercarug@amazon.com wrote:
> >>>>>>>>> On 07/25/2018 05:02 PM, Wei Liu wrote:
> >>>>>>>>>> On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:
> >>>>>>>>>>> On 25/07/18 15:35, Roger Pau Monné wrote:
> >>>>>>>>>>>>> What could be causing the available memory loss problem?
> >>>>>>>>>>>> That seems to be Linux aggressively ballooning out memory, you go from
> >>>>>>>>>>>> 7129M total memory to 246M. Are you creating a lot of domains?
> >>>>>>>>>>> This might be related to the tools thinking dom0 is a PV domain.
> >>>>>>>>>> Good point.
> >>>>>>>>>>
> >>>>>>>>>> In that case, xenstore-ls -fp would also be useful. The output should
> >>>>>>>>>> show the balloon target for Dom0.
> >>>>>>>>>>
> >>>>>>>>>> You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
> >>>>>>>>>> if it makes any difference.
> >>>>>>>>>>
> >>>>>>>>>> Wei.
> >>>>>>>>> Also tried setting autoballooning off, but it had no effect.
> >>>>>>>> This is a Linux/libxl issue that I'm not sure what's the best way to
> >>>>>>>> solve. Linux has the following 'workaround' in the balloon driver:
> >>>>>>>>
> >>>>>>>> err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
> >>>>>>>>                &static_max);
> >>>>>>>> if (err != 1)
> >>>>>>>>     static_max = new_target;
> >>>>>>>> else
> >>>>>>>>     static_max >>= PAGE_SHIFT - 10;
> >>>>>>>> target_diff = xen_pv_domain() ? 0
> >>>>>>>>             : static_max - balloon_stats.target_pages;
> >>>>>>> Hmm, shouldn't PVH behave the same way as PV here? I don't think
> >>>>>>> there is memory missing for PVH, as opposed to HVM's firmware memory.
> >>>>>>>
> >>>>>>> Adding Boris for a second opinion.
> >>>>>
> >>>>> (Notwithstanding Andrew's rant below ;-))
> >>>>>
> >>>>> I am trying to remember --- what memory were we trying not to online for
> >>>>> HVM here?
> >>>>
> >>>> My general memory of the situation is this:
> >>>>
> >>>> * Balloon drivers are told to reach a "target" value for max_pages.
> >>>> * max_pages includes all memory assigned to the guest, including video
> >>>> ram, "special" pages, ipxe ROMs, bios ROMs from passed-through
> >>>> devices, and so on.
> >>>> * Unfortunately, the balloon driver doesn't know what its max_pages
> >>>> value is and can't read it.
> >>>> * So what the balloon drivers do at the moment (as I understand it) is
> >>>> look at the memory *reported as RAM*, and do a calculation:
> >>>>   visible_ram - target_max_pages = pages_in_balloon
> >>>>
> >>>> You can probably see why this won't work -- the result is that the
> >>>> guest balloons down to (target_max_pages + non_ram_pages).  This is
> >>>> kind of messy for normal guests, but when you have a
> >>>> populate-on-demand guest, that leaves non_ram_pages amount of PoD ram
> >>>> in the guest.  The hypervisor then spends a huge amount of work
> >>>> swapping the PoD pages around under the guest's feet, until it can't
> >>>> find any more zeroed guest pages to use, and it crashes the guest.
> >>>>
> >>>> The kludge we have right now is to make up a number for HVM guests
> >>>> which is slightly larger than non_ram_pages, and tell the guest to aim
> >>>> for *that* instead.
> >>>>
> >>>> I think what we need is for the *toolstack* to calculate the size of
> >>>> the balloon rather than the guest, and tell the balloon driver how big
> >>>> to make its balloon, rather than the balloon driver trying to figure
> >>>> that out on its own.
> >>>
> >>> Maybe the best option would be for the toolstack to fetch the e820
> >>> memory map and set the target based on the size of the RAM regions in
> >>> there for PVH Dom0? That would certainly match the expectations of the
> >>> guest.
> >>>
> >>> Note that for DomUs if hvmloader (or any other component) inside of
> >>> the guest changes the memory map it would also have to adjust the
> >>> value in the xenstore 'target' node.
> >>
> >> How would it do that later when the guest is already running?
> > 
> > hvmloader should modify the 'target' xenstore node if it changes the
> > memory map.
> > 
> > So the value provided by the toolstack would match the amount of RAM
> > in the memory map up to the point where the guest is started, from
> > there on anything inside the guest changing the memory map should also
> > change the xenstore value.
> 
> So what should libxl write into target when the user specifies a new
> value via "xl mem-set" then?

I thought the problem was at guest boot, where the OS sees a target
value different from the amount of RAM in the memory map. But I see
that the same problem would happen when doing a mem-set: the target
set in xenstore would match d->tot_pages, yet the guest won't be able
to reach it due to memory being used by firmware. Sorry for the noise.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Memory Accounting] was: Re: PVH dom0 creation fails - the system freezes
  2018-07-26 12:19                             ` Juergen Gross
@ 2018-07-26 14:44                               ` George Dunlap
  0 siblings, 0 replies; 47+ messages in thread
From: George Dunlap @ 2018-07-26 14:44 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Wei Liu, Andrew Cooper, Jan Beulich, abelgun, xen-devel,
	Boris Ostrovsky, Roger Pau Monné,
	David Woodhouse, bercarug

On Thu, Jul 26, 2018 at 1:19 PM, Juergen Gross <jgross@suse.com> wrote:
> On 26/07/18 13:27, George Dunlap wrote:
>> On Thu, Jul 26, 2018 at 12:22 PM, Juergen Gross <jgross@suse.com> wrote:
>>> I believe the right way would be to design a proper ballooning interface
>>> suitable for all kinds of guests from scratch. This should include how
>>> to deal with hotplug of memory or booting with mem < mem_max. Whether
>>> PoD should be included should be discussed, too.
>>
>> Juergen, what do you think of this interface:  The toolstack looks at
>> all the available information about the guest and determines what size
>> the guest's balloon needs to be.  It then writes
>> memory/target_balloon_size.  The guest balloon driver simply attempts
>> to allocate / free memory to make the balloon that size.
>
> This should be per numa node. So memory/node<n>/target-balloon-size
> (I don't like the underscores in the node name).

This should be vnode then, to emphasize that it's a node from the
*guest's* perspective, not the host's perspective.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: PVH dom0 creation fails - the system freezes
  2018-07-25 16:19           ` Paul Durrant
@ 2018-07-26 16:46             ` Roger Pau Monné
  2018-07-27  8:48               ` Bercaru, Gabriel
  0 siblings, 1 reply; 47+ messages in thread
From: Roger Pau Monné @ 2018-07-26 16:46 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, abelgun, David Woodhouse, Jan Beulich, bercarug

On Wed, Jul 25, 2018 at 05:19:03PM +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On Behalf
> > Of Roger Pau Monné
> > Sent: 25 July 2018 15:12
> > To: bercarug@amazon.com
> > Cc: xen-devel <xen-devel@lists.xenproject.org>; David Woodhouse
> > <dwmw2@infradead.org>; Jan Beulich <JBeulich@suse.com>;
> > abelgun@amazon.com
> > Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes
> > 
> > On Wed, Jul 25, 2018 at 04:57:23PM +0300, bercarug@amazon.com wrote:
> > > On 07/25/2018 04:35 PM, Roger Pau Monné wrote:
> > > > On Wed, Jul 25, 2018 at 01:06:43PM +0300, bercarug@amazon.com
> > wrote:
> > > > > On 07/24/2018 12:54 PM, Jan Beulich wrote:
> > > > > > > > > On 23.07.18 at 13:50, <bercarug@amazon.com> wrote:
> > > > > > > For the last few days, I have been trying to get a PVH dom0 running,
> > > > > > > however I encountered the following problem: the system seems
> > to
> > > > > > > freeze after the hypervisor boots, the screen goes black. I have
> > tried to
> > > > > > > debug it via a serial console (using Minicom) and managed to get
> > some
> > > > > > > more Xen output, after the screen turns black.
> > > > > > >
> > > > > > > I mention that I have tried to boot the PVH dom0 using different
> > kernel
> > > > > > > images (from 4.9.0 to 4.18-rc3), different Xen  versions (4.10, 4.11,
> > 4.12).
> > > > > > >
> > > > > > > Below I attached my system / hypervisor configuration, as well as
> > the
> > > > > > > output captured through the serial console, corresponding to the
> > latest
> > > > > > > versions for Xen and the Linux Kernel (Xen staging and Kernel from
> > the
> > > > > > > xen/tip tree).
> > > > > > > [...]
> > > > > > > (XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
> > > > > > > (XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending
> > Fault
> > > > > > > (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault
> > addr 8deb3000, iommu reg = ffff82c00021b000
> > > > Can you figure out which PCI device is 00:14.0?
> > > This is the output of lspci -vvv for device 00:14.0:
> > >
> > > 00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI
> > > Controller (rev 31) (prog-if 30 [XHCI])
> > >         Subsystem: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller
> > >         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> > ParErr+
> > > Stepping- SERR+ FastB2B- DisINTx+
> > >         Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
> > > <TAbort- <MAbort+ >SERR- <PERR- INTx-
> > >         Latency: 0
> > >         Interrupt: pin A routed to IRQ 178
> > >         Region 0: Memory at a2e00000 (64-bit, non-prefetchable) [size=64K]
> > >         Capabilities: [70] Power Management version 2
> > >                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA
> > > PME(D0-,D1-,D2-,D3hot+,D3cold+)
> > >                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> > >         Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
> > >                 Address: 00000000fee0e000  Data: 4021
> > >         Kernel driver in use: xhci_hcd
> > >         Kernel modules: xhci_pci
> > 
> > I'm afraid your USB controller is missing RMRR entries in the DMAR
> > ACPI tables, thus causing the IOMMU faults and not working properly.
> > 
> > You could try to manually add some extra rmrr regions by appending:
> > 
> > rmrr=0x8deb3=0:0:14.0
> > 
> > To the Xen command line, and keep adding any address that pops up in
> > the iommu faults. This is of course quite cumbersome, but there's no
> > way to get the required memory addresses if the data in RMRR is
> > wrong/incomplete.
> > 
> 
> You could just add all E820 reserved regions in there. That will almost certainly cover it.

I have a prototype patch for this that attempts to identity map all
reserved regions below 4GB to the p2m. It's still a WIP, but if you
could give it a try that would help me figure out whether this fixes
your issues and is indeed something that would be good to have.

I don't really like the patch as-is because it doesn't check whether
the reserved regions added to the p2m overlap with the LAPIC page or
the PCIe MCFG regions, for example; I will continue to work on a safer
version.

If you can give this a shot, please remove any rmrr options from the
command line and use iommu=debug in order to catch any issues.
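
On a Debian-style setup that would mean something along these lines in
/etc/default/grub (followed by update-grub), keeping whatever other
Xen options were already there; this assumes the dom0=pvh setup used
throughout this thread:

GRUB_CMDLINE_XEN_DEFAULT="dom0=pvh iommu=debug"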

Thanks, Roger.
---8<---
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 2c44fabf99..76a1fd6681 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -21,6 +21,8 @@
 #include <xen/keyhandler.h>
 #include <xsm/xsm.h>
 
+#include <asm/setup.h>
+
 static int parse_iommu_param(const char *s);
 static void iommu_dump_p2m_table(unsigned char key);
 
@@ -47,6 +49,8 @@ integer_param("iommu_dev_iotlb_timeout", iommu_dev_iotlb_timeout);
  *   no-igfx                    Disable VT-d for IGD devices (insecure)
  *   no-amd-iommu-perdev-intremap Don't use per-device interrupt remapping
  *                              tables (insecure)
+ *   inclusive                  Include in the iommu page tables any memory
+ *                              ranges below 4GB not used by Xen or marked unusable.
  */
 custom_param("iommu", parse_iommu_param);
 bool_t __initdata iommu_enable = 1;
@@ -60,6 +64,7 @@ bool_t __read_mostly iommu_passthrough;
 bool_t __read_mostly iommu_snoop = 1;
 bool_t __read_mostly iommu_qinval = 1;
 bool_t __read_mostly iommu_intremap = 1;
+bool __read_mostly iommu_inclusive = true;
 
 /*
  * In the current implementation of VT-d posted interrupts, in some extreme
@@ -126,6 +131,8 @@ static int __init parse_iommu_param(const char *s)
             iommu_dom0_strict = val;
         else if ( !strncmp(s, "sharept", ss - s) )
             iommu_hap_pt_share = val;
+        else if ( !strncmp(s, "inclusive", ss - s) )
+            iommu_inclusive = val;
         else
             rc = -EINVAL;
 
@@ -165,6 +172,85 @@ static void __hwdom_init check_hwdom_reqs(struct domain *d)
     iommu_dom0_strict = 1;
 }
 
+static void __hwdom_init setup_inclusive_mappings(struct domain *d)
+{
+    unsigned long i, j, tmp, top, max_pfn;
+
+    BUG_ON(!is_hardware_domain(d));
+
+    max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
+    top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
+
+    for ( i = 0; i < top; i++ )
+    {
+        unsigned long pfn = pdx_to_pfn(i);
+        bool map;
+        int rc = 0;
+
+        /*
+         * Set up 1:1 mapping for dom0. Default to include only
+         * conventional RAM areas and let RMRRs include needed reserved
+         * regions. When set, the inclusive mapping additionally maps in
+         * every pfn up to 4GB except those that fall in unusable ranges.
+         */
+        if ( pfn > max_pfn && !mfn_valid(_mfn(pfn)) )
+            continue;
+
+        if ( is_pv_domain(d) && iommu_inclusive && pfn <= max_pfn )
+            map = !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE);
+        else if ( is_hvm_domain(d) && iommu_inclusive )
+            map = page_is_ram_type(pfn, RAM_TYPE_RESERVED);
+        else
+            map = page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL);
+
+        if ( !map )
+            continue;
+
+        /* Exclude Xen bits */
+        if ( xen_in_range(pfn) )
+            continue;
+
+        /*
+         * If dom0-strict mode is enabled or guest type is HVM/PVH then exclude
+         * conventional RAM and let the common code map dom0's pages.
+         */
+        if ( (iommu_dom0_strict || is_hvm_domain(d)) &&
+             page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
+            continue;
+
+        /* For HVM avoid memory below 1MB because that's already mapped. */
+        if ( is_hvm_domain(d) && pfn < PFN_DOWN(MB(1)) )
+            continue;
+
+        tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K);
+        for ( j = 0; j < tmp; j++ )
+        {
+            int ret;
+
+            if ( iommu_use_hap_pt(d) )
+            {
+                ASSERT(is_hvm_domain(d));
+                ret = set_identity_p2m_entry(d, pfn * tmp + j, p2m_access_rw,
+                                             0);
+            }
+            else
+                ret = iommu_map_page(d, pfn * tmp + j, pfn * tmp + j,
+                                     IOMMUF_readable|IOMMUF_writable);
+
+            if ( !rc )
+               rc = ret;
+        }
+
+        if ( rc )
+            printk(XENLOG_WARNING " d%d: IOMMU mapping failed: %d\n",
+                   d->domain_id, rc);
+
+        if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K))))
+            process_pending_softirqs();
+    }
+
+}
+
 void __hwdom_init iommu_hwdom_init(struct domain *d)
 {
     const struct domain_iommu *hd = dom_iommu(d);
@@ -207,7 +293,10 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
                    d->domain_id, rc);
     }
 
-    return hd->platform_ops->hwdom_init(d);
+    hd->platform_ops->hwdom_init(d);
+
+    if ( !iommu_passthrough )
+        setup_inclusive_mappings(d);
 }
 
 void iommu_teardown(struct domain *d)
diff --git a/xen/drivers/passthrough/vtd/extern.h b/xen/drivers/passthrough/vtd/extern.h
index fb7edfaef9..91cadc602e 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -99,6 +99,4 @@ void pci_vtd_quirk(const struct pci_dev *);
 bool_t platform_supports_intremap(void);
 bool_t platform_supports_x2apic(void);
 
-void vtd_set_hwdom_mapping(struct domain *d);
-
 #endif // _VTD_EXTERN_H_
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 1710256823..569ec4aec2 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1304,12 +1304,6 @@ static void __hwdom_init intel_iommu_hwdom_init(struct domain *d)
 {
     struct acpi_drhd_unit *drhd;
 
-    if ( !iommu_passthrough && is_pv_domain(d) )
-    {
-        /* Set up 1:1 page table for hardware domain. */
-        vtd_set_hwdom_mapping(d);
-    }
-
     setup_hwdom_pci_devices(d, setup_hwdom_device);
     setup_hwdom_rmrr(d);
 
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index cc2bfea162..9971915349 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -32,11 +32,9 @@
 #include "../extern.h"
 
 /*
- * iommu_inclusive_mapping: when set, all memory below 4GB is included in dom0
- * 1:1 iommu mappings except xen and unusable regions.
+ * iommu_inclusive_mapping: superseded by iommu=inclusive.
  */
-static bool_t __hwdom_initdata iommu_inclusive_mapping = 1;
-boolean_param("iommu_inclusive_mapping", iommu_inclusive_mapping);
+boolean_param("iommu_inclusive_mapping", iommu_inclusive);
 
 void *map_vtd_domain_page(u64 maddr)
 {
@@ -107,67 +105,3 @@ void hvm_dpci_isairq_eoi(struct domain *d, unsigned int isairq)
     }
     spin_unlock(&d->event_lock);
 }
-
-void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
-{
-    unsigned long i, j, tmp, top, max_pfn;
-
-    BUG_ON(!is_hardware_domain(d));
-
-    max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
-    top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
-
-    for ( i = 0; i < top; i++ )
-    {
-        unsigned long pfn = pdx_to_pfn(i);
-        bool map;
-        int rc = 0;
-
-        /*
-         * Set up 1:1 mapping for dom0. Default to include only
-         * conventional RAM areas and let RMRRs include needed reserved
-         * regions. When set, the inclusive mapping additionally maps in
-         * every pfn up to 4GB except those that fall in unusable ranges.
-         */
-        if ( pfn > max_pfn && !mfn_valid(_mfn(pfn)) )
-            continue;
-
-        if ( iommu_inclusive_mapping && pfn <= max_pfn )
-            map = !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE);
-        else
-            map = page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL);
-
-        if ( !map )
-            continue;
-
-        /* Exclude Xen bits */
-        if ( xen_in_range(pfn) )
-            continue;
-
-        /*
-         * If dom0-strict mode is enabled then exclude conventional RAM
-         * and let the common code map dom0's pages.
-         */
-        if ( iommu_dom0_strict &&
-             page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
-            continue;
-
-        tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K);
-        for ( j = 0; j < tmp; j++ )
-        {
-            int ret = iommu_map_page(d, pfn * tmp + j, pfn * tmp + j,
-                                     IOMMUF_readable|IOMMUF_writable);
-
-            if ( !rc )
-               rc = ret;
-        }
-
-        if ( rc )
-            printk(XENLOG_WARNING VTDPREFIX " d%d: IOMMU mapping failed: %d\n",
-                   d->domain_id, rc);
-
-        if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K))))
-            process_pending_softirqs();
-    }
-}
-
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 6b42e3b876..15d6584837 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -35,6 +35,7 @@ extern bool_t iommu_snoop, iommu_qinval, iommu_intremap, iommu_intpost;
 extern bool_t iommu_hap_pt_share;
 extern bool_t iommu_debug;
 extern bool_t amd_iommu_perdev_intremap;
+extern bool iommu_inclusive;
 
 extern unsigned int iommu_dev_iotlb_timeout;
 



* Re: PVH dom0 creation fails - the system freezes
  2018-07-26 16:46             ` Roger Pau Monné
@ 2018-07-27  8:48               ` Bercaru, Gabriel
  2018-07-27  9:11                 ` Roger Pau Monné
  0 siblings, 1 reply; 47+ messages in thread
From: Bercaru, Gabriel @ 2018-07-27  8:48 UTC (permalink / raw)
  To: Roger Pau Monné, Paul Durrant
  Cc: xen-devel, David Woodhouse, Jan Beulich, Belgun, Adrian

[-- Attachment #1: Type: text/plain, Size: 14539 bytes --]

I tried the patch and it fixes the unusable USB devices problem.
However, I captured the boot messages and the "IOMMU mapping failed" printk
seems to have been executed on each iteration of the loop.

I attached a small section of the boot log. As I said, the warning was logged
many more times; I removed a lot of them to keep the attached file short.
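
As an aside, one way to avoid printing a warning for every failing pfn
would be to count the failures and emit a single summary after the loop.
A rough sketch only, reusing the names from the posted patch (i, top,
rc, d):

/* Sketch: collect mapping failures and warn once after the loop. */
unsigned long failed = 0;

for ( i = 0; i < top; i++ )
{
    /* ... mapping logic from setup_inclusive_mappings() ... */
    if ( rc )
        failed++;
}

if ( failed )
    printk(XENLOG_WARNING "d%d: %lu IOMMU mappings failed\n",
           d->domain_id, failed);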

Gabriel
________________________________________
From: Roger Pau Monné <roger.pau@citrix.com>
Sent: Thursday, July 26, 2018 7:46 PM
To: Paul Durrant
Cc: Bercaru, Gabriel; xen-devel; David Woodhouse; Jan Beulich; Belgun, Adrian
Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes

On Wed, Jul 25, 2018 at 05:19:03PM +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On Behalf
> > Of Roger Pau Monné
> > Sent: 25 July 2018 15:12
> > To: bercarug@amazon.com
> > Cc: xen-devel <xen-devel@lists.xenproject.org>; David Woodhouse
> > <dwmw2@infradead.org>; Jan Beulich <JBeulich@suse.com>;
> > abelgun@amazon.com
> > Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes
> >
> > On Wed, Jul 25, 2018 at 04:57:23PM +0300, bercarug@amazon.com wrote:
> > > On 07/25/2018 04:35 PM, Roger Pau Monné wrote:
> > > > On Wed, Jul 25, 2018 at 01:06:43PM +0300, bercarug@amazon.com
> > wrote:
> > > > > On 07/24/2018 12:54 PM, Jan Beulich wrote:
> > > > > > > > > On 23.07.18 at 13:50, <bercarug@amazon.com> wrote:
> > > > > > > For the last few days, I have been trying to get a PVH dom0 running,
> > > > > > > however I encountered the following problem: the system seems
> > to
> > > > > > > freeze after the hypervisor boots, the screen goes black. I have
> > tried to
> > > > > > > debug it via a serial console (using Minicom) and managed to get
> > some
> > > > > > > more Xen output, after the screen turns black.
> > > > > > >
> > > > > > > I mention that I have tried to boot the PVH dom0 using different
> > kernel
> > > > > > > images (from 4.9.0 to 4.18-rc3), different Xen  versions (4.10, 4.11,
> > 4.12).
> > > > > > >
> > > > > > > Below I attached my system / hypervisor configuration, as well as
> > the
> > > > > > > output captured through the serial console, corresponding to the
> > latest
> > > > > > > versions for Xen and the Linux Kernel (Xen staging and Kernel from
> > the
> > > > > > > xen/tip tree).
> > > > > > > [...]
> > > > > > > (XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
> > > > > > > (XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending
> > Fault
> > > > > > > (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault
> > addr 8deb3000, iommu reg = ffff82c00021b000
> > > > Can you figure out which PCI device is 00:14.0?
> > > This is the output of lspci -vvv for device 00:14.0:
> > >
> > > 00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI
> > > Controller (rev 31) (prog-if 30 [XHCI])
> > >         Subsystem: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller
> > >         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> > ParErr+
> > > Stepping- SERR+ FastB2B- DisINTx+
> > >         Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
> > > <TAbort- <MAbort+ >SERR- <PERR- INTx-
> > >         Latency: 0
> > >         Interrupt: pin A routed to IRQ 178
> > >         Region 0: Memory at a2e00000 (64-bit, non-prefetchable) [size=64K]
> > >         Capabilities: [70] Power Management version 2
> > >                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA
> > > PME(D0-,D1-,D2-,D3hot+,D3cold+)
> > >                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> > >         Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
> > >                 Address: 00000000fee0e000  Data: 4021
> > >         Kernel driver in use: xhci_hcd
> > >         Kernel modules: xhci_pci
> >
> > I'm afraid your USB controller is missing RMRR entries in the DMAR
> > ACPI tables, thus causing the IOMMU faults and not working properly.
> >
> > You could try to manually add some extra rmrr regions by appending:
> >
> > rmrr=0x8deb3=0:0:14.0
> >
> > To the Xen command line, and keep adding any address that pops up in
> > the iommu faults. This is of course quite cumbersome, but there's no
> > way to get the required memory addresses if the data in RMRR is
> > wrong/incomplete.
> >
>
> You could just add all E820 reserved regions in there. That will almost certainly cover it.

I have a prototype patch for this that attempts to identity map all
reserved regions below 4GB to the p2m. It's still a WIP, but if you
could give it a try that would help me figure out whether this fixes
your issues and is indeed something that would be good to have.

I don't really like the patch as-is because it doesn't check whether
the reserved regions added to the p2m overlap with, for example, the
LAPIC page or the PCIe MCFG regions. I will continue to work on a
safer version.
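
To illustrate, the kind of check that's missing would look roughly like
the sketch below. This is not part of the patch: it assumes the LAPIC
sits at the default APIC_DEFAULT_PHYS_BASE and leaves the MCFG ranges
as a TODO.

/* Sketch only: reject pfns that must never be identity mapped. */
static bool __hwdom_init identity_map_forbidden(unsigned long pfn)
{
    /* Default local APIC page; the MADT-reported base should be used. */
    if ( pfn == PFN_DOWN(APIC_DEFAULT_PHYS_BASE) )
        return true;

    /* TODO: also reject pfns inside the PCIe MCFG windows. */
    return false;
}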

If you can give this a shot, please remove any rmrr options from the
command line and use iommu=debug in order to catch any issues.

Thanks, Roger.
---8<---
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 2c44fabf99..76a1fd6681 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -21,6 +21,8 @@
 #include <xen/keyhandler.h>
 #include <xsm/xsm.h>

+#include <asm/setup.h>
+
 static int parse_iommu_param(const char *s);
 static void iommu_dump_p2m_table(unsigned char key);

@@ -47,6 +49,8 @@ integer_param("iommu_dev_iotlb_timeout", iommu_dev_iotlb_timeout);
  *   no-igfx                    Disable VT-d for IGD devices (insecure)
  *   no-amd-iommu-perdev-intremap Don't use per-device interrupt remapping
  *                              tables (insecure)
+ *   inclusive                  Include any memory ranges below 4GB not used
+ *                              by Xen or unusable to the iommu page tables.
  */
 custom_param("iommu", parse_iommu_param);
 bool_t __initdata iommu_enable = 1;
@@ -60,6 +64,7 @@ bool_t __read_mostly iommu_passthrough;
 bool_t __read_mostly iommu_snoop = 1;
 bool_t __read_mostly iommu_qinval = 1;
 bool_t __read_mostly iommu_intremap = 1;
+bool __read_mostly iommu_inclusive = true;

 /*
  * In the current implementation of VT-d posted interrupts, in some extreme
@@ -126,6 +131,8 @@ static int __init parse_iommu_param(const char *s)
             iommu_dom0_strict = val;
         else if ( !strncmp(s, "sharept", ss - s) )
             iommu_hap_pt_share = val;
+        else if ( !strncmp(s, "inclusive", ss - s) )
+            iommu_inclusive = val;
         else
             rc = -EINVAL;

@@ -165,6 +172,85 @@ static void __hwdom_init check_hwdom_reqs(struct domain *d)
     iommu_dom0_strict = 1;
 }

+static void __hwdom_init setup_inclusive_mappings(struct domain *d)
+{
+    unsigned long i, j, tmp, top, max_pfn;
+
+    BUG_ON(!is_hardware_domain(d));
+
+    max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
+    top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
+
+    for ( i = 0; i < top; i++ )
+    {
+        unsigned long pfn = pdx_to_pfn(i);
+        bool map;
+        int rc = 0;
+
+        /*
+         * Set up 1:1 mapping for dom0. Default to include only
+         * conventional RAM areas and let RMRRs include needed reserved
+         * regions. When set, the inclusive mapping additionally maps in
+         * every pfn up to 4GB except those that fall in unusable ranges.
+         */
+        if ( pfn > max_pfn && !mfn_valid(_mfn(pfn)) )
+            continue;
+
+        if ( is_pv_domain(d) && iommu_inclusive && pfn <= max_pfn )
+            map = !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE);
+        else if ( is_hvm_domain(d) && iommu_inclusive )
+            map = page_is_ram_type(pfn, RAM_TYPE_RESERVED);
+        else
+            map = page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL);
+
+        if ( !map )
+            continue;
+
+        /* Exclude Xen bits */
+        if ( xen_in_range(pfn) )
+            continue;
+
+        /*
+         * If dom0-strict mode is enabled or guest type is HVM/PVH then exclude
+         * conventional RAM and let the common code map dom0's pages.
+         */
+        if ( (iommu_dom0_strict || is_hvm_domain(d)) &&
+             page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
+            continue;
+
+        /* For HVM avoid memory below 1MB because that's already mapped. */
+        if ( is_hvm_domain(d) && pfn < PFN_DOWN(MB(1)) )
+            continue;
+
+        tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K);
+        for ( j = 0; j < tmp; j++ )
+        {
+            int ret;
+
+            if ( iommu_use_hap_pt(d) )
+            {
+                ASSERT(is_hvm_domain(d));
+                ret = set_identity_p2m_entry(d, pfn * tmp + j, p2m_access_rw,
+                                             0);
+            }
+            else
+                ret = iommu_map_page(d, pfn * tmp + j, pfn * tmp + j,
+                                     IOMMUF_readable|IOMMUF_writable);
+
+            if ( !rc )
+               rc = ret;
+        }
+
+        if ( rc )
+            printk(XENLOG_WARNING " d%d: IOMMU mapping failed: %d\n",
+                   d->domain_id, rc);
+
+        if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K))))
+            process_pending_softirqs();
+    }
+
+}
+
 void __hwdom_init iommu_hwdom_init(struct domain *d)
 {
     const struct domain_iommu *hd = dom_iommu(d);
@@ -207,7 +293,10 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
                    d->domain_id, rc);
     }

-    return hd->platform_ops->hwdom_init(d);
+    hd->platform_ops->hwdom_init(d);
+
+    if ( !iommu_passthrough )
+        setup_inclusive_mappings(d);
 }

 void iommu_teardown(struct domain *d)
diff --git a/xen/drivers/passthrough/vtd/extern.h b/xen/drivers/passthrough/vtd/extern.h
index fb7edfaef9..91cadc602e 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -99,6 +99,4 @@ void pci_vtd_quirk(const struct pci_dev *);
 bool_t platform_supports_intremap(void);
 bool_t platform_supports_x2apic(void);

-void vtd_set_hwdom_mapping(struct domain *d);
-
 #endif // _VTD_EXTERN_H_
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 1710256823..569ec4aec2 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1304,12 +1304,6 @@ static void __hwdom_init intel_iommu_hwdom_init(struct domain *d)
 {
     struct acpi_drhd_unit *drhd;

-    if ( !iommu_passthrough && is_pv_domain(d) )
-    {
-        /* Set up 1:1 page table for hardware domain. */
-        vtd_set_hwdom_mapping(d);
-    }
-
     setup_hwdom_pci_devices(d, setup_hwdom_device);
     setup_hwdom_rmrr(d);

diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index cc2bfea162..9971915349 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -32,11 +32,9 @@
 #include "../extern.h"

 /*
- * iommu_inclusive_mapping: when set, all memory below 4GB is included in dom0
- * 1:1 iommu mappings except xen and unusable regions.
+ * iommu_inclusive_mapping: superseded by iommu=inclusive.
  */
-static bool_t __hwdom_initdata iommu_inclusive_mapping = 1;
-boolean_param("iommu_inclusive_mapping", iommu_inclusive_mapping);
+boolean_param("iommu_inclusive_mapping", iommu_inclusive);

 void *map_vtd_domain_page(u64 maddr)
 {
@@ -107,67 +105,3 @@ void hvm_dpci_isairq_eoi(struct domain *d, unsigned int isairq)
     }
     spin_unlock(&d->event_lock);
 }
-
-void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
-{
-    unsigned long i, j, tmp, top, max_pfn;
-
-    BUG_ON(!is_hardware_domain(d));
-
-    max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
-    top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
-
-    for ( i = 0; i < top; i++ )
-    {
-        unsigned long pfn = pdx_to_pfn(i);
-        bool map;
-        int rc = 0;
-
-        /*
-         * Set up 1:1 mapping for dom0. Default to include only
-         * conventional RAM areas and let RMRRs include needed reserved
-         * regions. When set, the inclusive mapping additionally maps in
-         * every pfn up to 4GB except those that fall in unusable ranges.
-         */
-        if ( pfn > max_pfn && !mfn_valid(_mfn(pfn)) )
-            continue;
-
-        if ( iommu_inclusive_mapping && pfn <= max_pfn )
-            map = !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE);
-        else
-            map = page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL);
-
-        if ( !map )
-            continue;
-
-        /* Exclude Xen bits */
-        if ( xen_in_range(pfn) )
-            continue;
-
-        /*
-         * If dom0-strict mode is enabled then exclude conventional RAM
-         * and let the common code map dom0's pages.
-         */
-        if ( iommu_dom0_strict &&
-             page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
-            continue;
-
-        tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K);
-        for ( j = 0; j < tmp; j++ )
-        {
-            int ret = iommu_map_page(d, pfn * tmp + j, pfn * tmp + j,
-                                     IOMMUF_readable|IOMMUF_writable);
-
-            if ( !rc )
-               rc = ret;
-        }
-
-        if ( rc )
-            printk(XENLOG_WARNING VTDPREFIX " d%d: IOMMU mapping failed: %d\n",
-                   d->domain_id, rc);
-
-        if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K))))
-            process_pending_softirqs();
-    }
-}
-
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 6b42e3b876..15d6584837 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -35,6 +35,7 @@ extern bool_t iommu_snoop, iommu_qinval, iommu_intremap, iommu_intpost;
 extern bool_t iommu_hap_pt_share;
 extern bool_t iommu_debug;
 extern bool_t amd_iommu_perdev_intremap;
+extern bool iommu_inclusive;

 extern unsigned int iommu_dev_iotlb_timeout;







[-- Attachment #2: iommu.txt --]
[-- Type: text/plain, Size: 1732 bytes --]

(XEN) *** Building a PVH Dom0 ***
(XEN) [VT-D]d0:Hostbridge: skip 0000:00:00.0 map
(XEN) [VT-D]d0:PCI: map 0000:00:14.0
(XEN) [VT-D]d0:PCI: map 0000:00:14.2
(XEN) [VT-D]d0:PCI: map 0000:00:16.0
(XEN) [VT-D]d0:PCI: map 0000:00:16.1
(XEN) [VT-D]d0:PCI: map 0000:00:17.0
(XEN) [VT-D]d0:PCI: map 0000:00:1f.0
(XEN) [VT-D]d0:PCI: map 0000:00:1f.2
(XEN) [VT-D]d0:PCI: map 0000:00:1f.4
(XEN) [VT-D]d0:PCIe: map 0000:01:00.0
(XEN) [VT-D]d0:PCIe: map 0000:02:00.0
(XEN) [VT-D]d0:PCIe: map 0000:03:00.0
(XEN) [VT-D]d0:PCIe: map 0000:04:00.0
(XEN) [VT-D]iommu_enable_translation: iommu->reg = ffff82c00021b000
(XEN) hap.c:289: d0 failed to allocate from HAP pool
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN) Cannot setup identity map d0:fee00, gfn already mapped to 1021c63.
(XEN)  d0: IOMMU mapping failed: -16
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN) WARNING: PVH is an experimental mode with limited functionality


* Re: PVH dom0 creation fails - the system freezes
  2018-07-27  8:48               ` Bercaru, Gabriel
@ 2018-07-27  9:11                 ` Roger Pau Monné
  2018-08-02 11:36                   ` Bercaru, Gabriel
  0 siblings, 1 reply; 47+ messages in thread
From: Roger Pau Monné @ 2018-07-27  9:11 UTC (permalink / raw)
  To: Bercaru, Gabriel
  Cc: xen-devel, Paul Durrant, David Woodhouse, Jan Beulich, Belgun, Adrian

On Fri, Jul 27, 2018 at 08:48:32AM +0000, Bercaru, Gabriel wrote:
> I tried the patch and it fixes the unusable USB devices problem.
> However, I captured the boot messages and the "IOMMU mapping failed" printk
> seems to have been executed on each iteration of the loop.
> 
> I attached a small section of the boot log. As I said, the warning was logged
> many more times; I removed a lot of them to keep the attached file short.

The patch is still a WIP, but it's good to know it solves your USB
issues :).

I think you are likely missing patch:

http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=173c7803592065d27bf2e60d50e08e197a0efa83

Can you try to update to latest staging or apply the patch and try
again?

I think if you have this patch applied the number of errors reported
by the IOMMU initialization should go down.

Thanks, Roger.


* Re: PVH dom0 creation fails - the system freezes
  2018-07-27  9:11                 ` Roger Pau Monné
@ 2018-08-02 11:36                   ` Bercaru, Gabriel
  2018-08-02 13:55                     ` Roger Pau Monné
  0 siblings, 1 reply; 47+ messages in thread
From: Bercaru, Gabriel @ 2018-08-02 11:36 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Paul Durrant, David Woodhouse, Jan Beulich, Belgun, Adrian

[-- Attachment #1: Type: text/plain, Size: 4568 bytes --]

I applied the patch mentioned, but the system fails to boot. Instead, it
drops to a BusyBox shell. It seems to be a file system issue.

Following is an excerpt of the boot log regarding the file system issue.

Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... Begin: Waiting for suspend/resume device ... [    4.892146] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x6d5282a38d0, max_idle_ns: 881591194934 ns
[    4.903311] clocksource: Switched to clocksource tsc
Begin: Running /scripts/local-block ... done.
...
Begin: Running /scripts/local-block ... done.
done.
Gave up waiting for suspend/resume device
done.
Begin: Waiting for root file system ... Begin: Running /scripts/local-block ... done.
done.
Gave up waiting for root file system device.  Common problems:
 - Boot args (cat /proc/cmdline)
   - Check rootdelay= (did the system wait long enough?)
 - Missing modules (cat /proc/modules; ls /dev)
ALERT!  UUID=93bcf82b-b9b9-4362-936a-d0c100a920da does not exist.  Dropping to a shell!


BusyBox v1.22.1 (Debian 1:1.22.0-19+b3) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs) ls /dev
char                snapshot            tty29               tty53
console             stderr              tty3                tty54
core                stdin               tty30               tty55
cpu_dma_latency     stdout              tty31               tty56
fd                  tty                 tty32               tty57
full                tty0                tty33               tty58
hpet                tty1                tty34               tty59
hvc0                tty10               tty35               tty6
hvc1                tty11               tty36               tty60
hvc2                tty12               tty37               tty61
hvc3                tty13               tty38               tty62
hvc4                tty14               tty39               tty63
hvc5                tty15               tty4                tty7
hvc6                tty16               tty40               tty8
hvc7                tty17               tty41               tty9
input               tty18               tty42               ttyS0
kmsg                tty19               tty43               ttyS1
mem                 tty2                tty44               ttyS2
memory_bandwidth    tty20               tty45               ttyS3
network_latency     tty21               tty46               urandom
network_throughput  tty22               tty47               vcs
null                tty23               tty48               vcs1
port                tty24               tty49               vcsa
psaux               tty25               tty5                vcsa1
ptmx                tty26               tty50               vga_arbiter
pts                 tty27               tty51               xen
random              tty28               tty52               zero
(initramfs)


I attached the full boot log.

Gabriel

________________________________________
From: Roger Pau Monné <roger.pau@citrix.com>
Sent: Friday, July 27, 2018 12:11 PM
To: Bercaru, Gabriel
Cc: Paul Durrant; xen-devel; David Woodhouse; Jan Beulich; Belgun, Adrian
Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes

On Fri, Jul 27, 2018 at 08:48:32AM +0000, Bercaru, Gabriel wrote:
> I tried the patch and it fixes the unusable USB devices problem.
> However, I captured the boot messages and the "IOMMU mapping failed" printk
> seems to have been executed on each iteration of the loop.
>
> I attached a small section of the boot log. As I said, the warning was logged
> many more times; I removed a lot of them to keep the attached file short.

The patch is still a WIP, but it's good to know it solves your USB
issues :).

I think you are likely missing patch:

http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=173c7803592065d27bf2e60d50e08e197a0efa83

Can you try to update to latest staging or apply the patch and try
again?

I think if you have this patch applied the number of errors reported
by the IOMMU initialization should go down.

Thanks, Roger.




[-- Attachment #2: fserror.cap --]
[-- Type: application/vnd.tcpdump.pcap, Size: 49404 bytes --]


* Re: PVH dom0 creation fails - the system freezes
  2018-08-02 11:36                   ` Bercaru, Gabriel
@ 2018-08-02 13:55                     ` Roger Pau Monné
  2018-08-08  7:46                       ` bercarug
  0 siblings, 1 reply; 47+ messages in thread
From: Roger Pau Monné @ 2018-08-02 13:55 UTC (permalink / raw)
  To: Bercaru, Gabriel
  Cc: xen-devel, Paul Durrant, David Woodhouse, Jan Beulich, Belgun, Adrian

Please try to avoid top posting.

On Thu, Aug 02, 2018 at 11:36:26AM +0000, Bercaru, Gabriel wrote:
> I applied the patch mentioned, but the system fails to boot. Instead, it
> drops to a BusyBox shell. It seems to be a file system issue.

So you have applied 173c7803592065d27bf2e60d50e08e197a0efa83 and it
causes a regression for you?

As I understand it, before applying 173c780359206 you were capable of
booting the PVH Dom0, albeit with non-working USB?

And after applying 173c780359206 you are no longer able to boot?

> Following is an excerpt of the boot log regarding the file system issue.

At least part of the issue seems to be that the IO-APIC information
provided to Dom0 is wrong (from the attached log):

[    0.000000] IOAPIC[0]: apic_id 2, version 152, address 0xfec00000, GSI 0-0
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
[    0.000000] Failed to find ioapic for gsi : 2
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
[    0.000000] Failed to find ioapic for gsi : 9
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 1
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 3
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 4
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 5
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 6
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 7
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 8
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 10
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 11
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 12
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 13
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 14
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 15

Can you try to boot with just the staging branch (current commit is
008a8fb249b9d433) and see how that goes?

Thanks, Roger.


* Re: PVH dom0 creation fails - the system freezes
  2018-08-02 13:55                     ` Roger Pau Monné
@ 2018-08-08  7:46                       ` bercarug
  2018-08-08  8:08                         ` Roger Pau Monné
  0 siblings, 1 reply; 47+ messages in thread
From: bercarug @ 2018-08-08  7:46 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Paul Durrant, David Woodhouse, Jan Beulich, Belgun, Adrian

[-- Attachment #1: Type: text/plain, Size: 3650 bytes --]

On 08/02/2018 04:55 PM, Roger Pau Monné wrote:
> Please try to avoid top posting.
>
> On Thu, Aug 02, 2018 at 11:36:26AM +0000, Bercaru, Gabriel wrote:
>> I applied the patch mentioned, but the system fails to boot. Instead, it
>> drops to a BusyBox shell. It seems to be a file system issue.
> So you have applied 173c7803592065d27bf2e60d50e08e197a0efa83 and it
> causes a regression for you?
>
> As I understand it, before applying 173c780359206 you were capable of
> booting the PVH Dom0, albeit with non-working USB?
>
> And after applying 173c780359206 you are no longer able to boot?
Right, after applying 173c780359206 the system fails to boot (it drops 
to the BusyBox shell).
>> Following is an excerpt of the boot log regarding the file system issue.
> At least part of the issue seems to be that the IO-APIC information
> provided to Dom0 is wrong (from the attached log):
>
> [    0.000000] IOAPIC[0]: apic_id 2, version 152, address 0xfec00000, GSI 0-0
> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
> [    0.000000] Failed to find ioapic for gsi : 2
> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
> [    0.000000] Failed to find ioapic for gsi : 9
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 1
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 3
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 4
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 5
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 6
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 7
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 8
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 10
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 11
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 12
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 13
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 14
> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 15
>
> Can you try to boot with just the staging branch (current commit is
> 008a8fb249b9d433) and see how that goes?
>
> Thanks, Roger.
>
I recompiled Xen using the staging branch, commit 008a8fb249b9d433, and
the system boots; however, the USB problem persists. I was able to log
in using the serial port, and after executing "xl list -l" the memory
decrease problem appeared.


I attached the boot log. Following are some lines extracted from the
log, regarding the USB devices problem:

[    5.864084] usb 1-1: device descriptor read/64, error -71

[    7.564025] usb 1-1: new full-speed USB device number 4 using xhci_hcd
[    7.571347] usb 1-1: Device not responding to setup address.

[    8.008031] usb 1-1: device not accepting address 4, error -71

[    8.609623] usb 1-1: device not accepting address 5, error -71


At the beginning of the log, there is a message regarding
iommu_inclusive_mapping:

(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D]  RMRR address range 3e2e0000..3e2fffff not in reserved memory; need "iommu_inclusive_mapping=1"?
(XEN) [VT-D] endpoint: 0000:00:14.0

I mention that I tried to boot again using this command line option, but
this message and the USB messages persist.


Gabriel





[-- Attachment #2: staging.cap --]
[-- Type: application/vnd.tcpdump.pcap, Size: 68757 bytes --]


* Re: PVH dom0 creation fails - the system freezes
  2018-08-08  7:46                       ` bercarug
@ 2018-08-08  8:08                         ` Roger Pau Monné
  2018-08-08  8:39                           ` bercarug
  2018-08-08  8:43                           ` Paul Durrant
  0 siblings, 2 replies; 47+ messages in thread
From: Roger Pau Monné @ 2018-08-08  8:08 UTC (permalink / raw)
  To: bercarug
  Cc: xen-devel, Paul Durrant, David Woodhouse, Jan Beulich, Belgun, Adrian

On Wed, Aug 08, 2018 at 10:46:40AM +0300, bercarug@amazon.com wrote:
> On 08/02/2018 04:55 PM, Roger Pau Monné wrote:
> > Please try to avoid top posting.
> > 
> > On Thu, Aug 02, 2018 at 11:36:26AM +0000, Bercaru, Gabriel wrote:
> > > I applied the patch mentioned, but the system fails to boot. Instead, it
> > > drops to a BusyBox shell. It seems to be a file system issue.
> > So you have applied 173c7803592065d27bf2e60d50e08e197a0efa83 and it
> > causes a regression for you?
> > 
> > As I understand it, before applying 173c780359206 you were capable of
> > booting the PVH Dom0, albeit with non-working USB?
> > 
> > And after applying 173c780359206 you are no longer able to boot?
> Right, after applying 173c780359206 the system fails to boot (it drops to
> the BusyBox shell).
> > > Following is an excerpt of the boot log regarding the file system issue.
> > At least part of the issue seems to be that the IO-APIC information
> > provided to Dom0 is wrong (from the attached log):
> > 
> > [    0.000000] IOAPIC[0]: apic_id 2, version 152, address 0xfec00000, GSI 0-0
> > [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
> > [    0.000000] Failed to find ioapic for gsi : 2
> > [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
> > [    0.000000] Failed to find ioapic for gsi : 9
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 1
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 3
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 4
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 5
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 6
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 7
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 8
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 10
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 11
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 12
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 13
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 14
> > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 15
> > 
> > Can you try to boot with just the staging branch (current commit is
> > 008a8fb249b9d433) and see how that goes?
> > 
> > Thanks, Roger.
> > 
> I recompiled Xen using the staging branch, commit 008a8fb249b9d433 and the
> system boots,

OK, so your issues were not caused by 173c780359206 then?

008a8fb249b9d433 already contains 173c780359206 because it was
committed earlier. In any case it's good to know you are able to boot
(albeit with issues) using the current staging branch.

> however the USB problem persists. I was able to log in using the serial port
> and after executing

Yes, the fixes for this issue have not been committed yet, see:

https://lists.xenproject.org/archives/html/xen-devel/2018-08/msg00528.html

If you want you can give this branch a try; it should hopefully solve
your USB issues.

> "xl list -l" the memory decrease problem appeared.

Yup, I will look into this now in order to find some kind of
workaround.

> I attached the boot log. Following are some lines extracted from the log,
> regarding the USB
> 
> devices problem:
> 
> [    5.864084] usb 1-1: device descriptor read/64, error -71
> 
> [    7.564025] usb 1-1: new full-speed USB device number 4 using xhci_hcd
> [    7.571347] usb 1-1: Device not responding to setup address.
> 
> [    8.008031] usb 1-1: device not accepting address 4, error -71
> 
> [    8.609623] usb 1-1: device not accepting address 5, error -71
> 
> 
> At the beginning of the log, there is a message regarding
> iommu_inclusive_mapping:
> 
> 
> (XEN) [VT-D]found ACPI_DMAR_RMRR:
> (XEN) [VT-D]  RMRR address range 3e2e0000..3e2fffff not in reserved memory;
> need "iommu_inclusive_mapping=1"?
> (XEN) [VT-D] endpoint: 0000:00:14.0
> 
> 
> I mention that I tried to boot again using this command line option, but
> this message and the
> 
> USB messages persist.

Yes, iommu_inclusive_mapping doesn't work for PVH; that's what my
patch series is trying to address. The error is caused by
missing/wrong RMRR regions in the ACPI tables.

Thanks, Roger.


* Re: PVH dom0 creation fails - the system freezes
  2018-08-08  8:08                         ` Roger Pau Monné
@ 2018-08-08  8:39                           ` bercarug
  2018-08-08  8:43                           ` Paul Durrant
  1 sibling, 0 replies; 47+ messages in thread
From: bercarug @ 2018-08-08  8:39 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Paul Durrant, David Woodhouse, Jan Beulich, Belgun, Adrian

On 08/08/2018 11:08 AM, Roger Pau Monné wrote:
> On Wed, Aug 08, 2018 at 10:46:40AM +0300, bercarug@amazon.com wrote:
>> On 08/02/2018 04:55 PM, Roger Pau Monné wrote:
>>> Please try to avoid top posting.
>>>
>>> On Thu, Aug 02, 2018 at 11:36:26AM +0000, Bercaru, Gabriel wrote:
>>>> I applied the patch mentioned, but the system fails to boot. Instead, it
>>>> drops to a BusyBox shell. It seems to be a file system issue.
>>> So you have applied 173c7803592065d27bf2e60d50e08e197a0efa83 and it
>>> causes a regression for you?
>>>
>>> As I understand it, before applying 173c780359206 you were capable of
>>> booting the PVH Dom0, albeit with non-working USB?
>>>
>>> And after applying 173c780359206 you are no longer able to boot?
>> Right, after applying 173c780359206 the system fails to boot (it drops to
>> the BusyBox shell).
>>>> Following is an excerpt of the boot log regarding the file system issue.
>>> At least part of the issue seems to be that the IO-APIC information
>>> provided to Dom0 is wrong (from the attached log):
>>>
>>> [    0.000000] IOAPIC[0]: apic_id 2, version 152, address 0xfec00000, GSI 0-0
>>> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
>>> [    0.000000] Failed to find ioapic for gsi : 2
>>> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
>>> [    0.000000] Failed to find ioapic for gsi : 9
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 1
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 3
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 4
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 5
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 6
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 7
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 8
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 10
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 11
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 12
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 13
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 14
>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 15
>>>
>>> Can you try to boot with just the staging branch (current commit is
>>> 008a8fb249b9d433) and see how that goes?
>>>
>>> Thanks, Roger.
>>>
>> I recompiled Xen using the staging branch, commit 008a8fb249b9d433 and the
>> system boots,
> OK, so your issues where not caused by 173c780359206 then?
>
> 008a8fb249b9d433 already contains 173c780359206 because it was
> committed earlier. In any case it's good to know you are able to boot
> (albeit with issues) using the current staging branch.
>
>> however the USB problem persists. I was able to log in using the serial port
>> and after executing
> Yes, the fixes for this issue have not been committed yet, see:
>
> https://lists.xenproject.org/archives/html/xen-devel/2018-08/msg00528.html
>
> If you want you can give this branch a try, it should hopefully solve
> your USB issues.
>
>> "xl list -l" the memory decrease problem appeared.
> Yup, I will look into this now in order to find some kind of
> workaround.
>
>> I attached the boot log. Following are some lines extracted from the log,
>> regarding the USB
>>
>> devices problem:
>>
>> [    5.864084] usb 1-1: device descriptor read/64, error -71
>>
>> [    7.564025] usb 1-1: new full-speed USB device number 4 using xhci_hcd
>> [    7.571347] usb 1-1: Device not responding to setup address.
>>
>> [    8.008031] usb 1-1: device not accepting address 4, error -71
>>
>> [    8.609623] usb 1-1: device not accepting address 5, error -71
>>
>>
>> At the beginning of the log, there is a message regarding
>> iommu_inclusive_mapping:
>>
>>
>> (XEN) [VT-D]found ACPI_DMAR_RMRR:
>> (XEN) [VT-D]  RMRR address range 3e2e0000..3e2fffff not in reserved memory;
>> need "iommu_inclusive_mapping=1"?
>> (XEN) [VT-D] endpoint: 0000:00:14.0
>>
>>
>> I mention that I tried to boot again using this command line option, but
>> this message and the
>>
>> USB messages persist.
> Yes, iommu_inclusive_mapping doesn't work for PVH, that's what my
> patch series is trying to address. The error is caused by
> missing/wrong RMRR regions in the ACPI tables.
>
> Thanks, Roger.

I tried compiling from the branch mentioned. I changed the command line
by adding "dom0-iommu=reserved", but I get the same error messages about
USB during boot.


Gabriel





* Re: PVH dom0 creation fails - the system freezes
  2018-08-08  8:08                         ` Roger Pau Monné
  2018-08-08  8:39                           ` bercarug
@ 2018-08-08  8:43                           ` Paul Durrant
  2018-08-08  8:51                             ` Roger Pau Monné
  1 sibling, 1 reply; 47+ messages in thread
From: Paul Durrant @ 2018-08-08  8:43 UTC (permalink / raw)
  To: Roger Pau Monne, bercarug
  Cc: xen-devel, David Woodhouse, Jan Beulich, Belgun, Adrian

> -----Original Message-----
> From: Roger Pau Monne
> Sent: 08 August 2018 09:08
> To: bercarug@amazon.com
> Cc: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel <xen-
> devel@lists.xenproject.org>; David Woodhouse <dwmw2@infradead.org>;
> Jan Beulich <JBeulich@suse.com>; Belgun, Adrian <abelgun@amazon.com>
> Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes
> 
> On Wed, Aug 08, 2018 at 10:46:40AM +0300, bercarug@amazon.com wrote:
> > On 08/02/2018 04:55 PM, Roger Pau Monné wrote:
> > > Please try to avoid top posting.
> > >
> > > On Thu, Aug 02, 2018 at 11:36:26AM +0000, Bercaru, Gabriel wrote:
> > > > I applied the patch mentioned, but the system fails to boot. Instead, it
> > > > drops to a BusyBox shell. It seems to be a file system issue.
> > > So you have applied 173c7803592065d27bf2e60d50e08e197a0efa83 and it
> > > causes a regression for you?
> > >
> > > As I understand it, before applying 173c780359206 you were capable of
> > > booting the PVH Dom0, albeit with non-working USB?
> > >
> > > And after applying 173c780359206 you are no longer able to boot?
> > Right, after applying 173c780359206 the system fails to boot (it drops to
> > the BusyBox shell).
> > > > Following is an excerpt of the boot log regarding the file system issue.
> > > At least part of the issue seems to be that the IO-APIC information
> > > provided to Dom0 is wrong (from the attached log):
> > >
> > > [    0.000000] IOAPIC[0]: apic_id 2, version 152, address 0xfec00000, GSI 0-
> 0
> > > [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
> > > [    0.000000] Failed to find ioapic for gsi : 2
> > > [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
> > > [    0.000000] Failed to find ioapic for gsi : 9
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 1
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 3
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 4
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 5
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 6
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 7
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 8
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 10
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 11
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 12
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 13
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 14
> > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 15
> > >
> > > Can you try to boot with just the staging branch (current commit is
> > > 008a8fb249b9d433) and see how that goes?
> > >
> > > Thanks, Roger.
> > >
> > I recompiled Xen using the staging branch, commit 008a8fb249b9d433 and
> the
> > system boots,
> 
> OK, so your issues were not caused by 173c780359206 then?
> 
> 008a8fb249b9d433 already contains 173c780359206 because it was
> committed earlier. In any case it's good to know you are able to boot
> (albeit with issues) using the current staging branch.
> 
> > however the USB problem persists. I was able to log in using the serial port
> > and after executing
> 
> Yes, the fixes for this issue have not been committed yet, see:
> 
> https://lists.xenproject.org/archives/html/xen-devel/2018-
> 08/msg00528.html
> 
> If you want you can give this branch a try, it should hopefully solve
> your USB issues.
> 
> > "xl list -l" the memory decrease problem appeared.
> 
> Yup, I will look into this now in order to find some kind of
> workaround.
> 
> > I attached the boot log. Following are some lines extracted from the log,
> > regarding the USB
> >
> > devices problem:
> >
> > [    5.864084] usb 1-1: device descriptor read/64, error -71
> >
> > [    7.564025] usb 1-1: new full-speed USB device number 4 using xhci_hcd
> > [    7.571347] usb 1-1: Device not responding to setup address.
> >
> > [    8.008031] usb 1-1: device not accepting address 4, error -71
> >
> > [    8.609623] usb 1-1: device not accepting address 5, error -71
> >
> >
> > At the beginning of the log, there is a message regarding
> > iommu_inclusive_mapping:
> >
> >
> > (XEN) [VT-D]found ACPI_DMAR_RMRR:
> > (XEN) [VT-D]  RMRR address range 3e2e0000..3e2fffff not in reserved
> memory;
> > need "iommu_inclusive_mapping=1"?
> > (XEN) [VT-D] endpoint: 0000:00:14.0
> >
> >
> > I mention that I tried to boot again using this command line option, but
> > this message and the
> >
> > USB messages persist.
> 
> Yes, iommu_inclusive_mapping doesn't work for PVH, that's what my
> patch series is trying to address. The error is caused by
> missing/wrong RMRR regions in the ACPI tables.
> 

Looks like this warning is suggesting that there is an RMRR that falls outside of an E820 reserved region. For PV I guess that 'inclusive' will fix this, but for PVH 'reserved' isn't going to fix it. I hope that the range at least falls in a hole, so maybe we need a dom0_iommu=holes option too? Ugly, but maybe necessary.
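
By "falls in a hole" I mean something along these lines (a rough
sketch, written from memory against the x86 e820 globals and
pfn_to_paddr(), so treat it as illustrative only):

/* Sketch: a pfn is in a hole if no e820 entry covers it at all. */
static bool __hwdom_init pfn_in_e820_hole(unsigned long pfn)
{
    paddr_t addr = pfn_to_paddr(pfn);
    unsigned int i;

    for ( i = 0; i < e820.nr_map; i++ )
        if ( addr >= e820.map[i].addr &&
             addr - e820.map[i].addr < e820.map[i].size )
            return false;

    return true;
}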

  Paul

> Thanks, Roger.


* Re: PVH dom0 creation fails - the system freezes
  2018-08-08  8:43                           ` Paul Durrant
@ 2018-08-08  8:51                             ` Roger Pau Monné
  2018-08-08  8:54                               ` bercarug
       [not found]                               ` <5B6AAD430200009A03E1638C@prv1-mh.provo.novell.com>
  0 siblings, 2 replies; 47+ messages in thread
From: Roger Pau Monné @ 2018-08-08  8:51 UTC (permalink / raw)
  To: Paul Durrant
  Cc: xen-devel, Belgun, Adrian, David Woodhouse, Jan Beulich, bercarug

On Wed, Aug 08, 2018 at 09:43:39AM +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Roger Pau Monne
> > Sent: 08 August 2018 09:08
> > To: bercarug@amazon.com
> > Cc: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel <xen-
> > devel@lists.xenproject.org>; David Woodhouse <dwmw2@infradead.org>;
> > Jan Beulich <JBeulich@suse.com>; Belgun, Adrian <abelgun@amazon.com>
> > Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes
> > 
> > On Wed, Aug 08, 2018 at 10:46:40AM +0300, bercarug@amazon.com wrote:
> > > On 08/02/2018 04:55 PM, Roger Pau Monné wrote:
> > > > Please try to avoid top posting.
> > > >
> > > > On Thu, Aug 02, 2018 at 11:36:26AM +0000, Bercaru, Gabriel wrote:
> > > > > I applied the patch mentioned, but the system fails to boot. Instead, it
> > > > > drops to a BusyBox shell. It seems to be a file system issue.
> > > > So you have applied 173c7803592065d27bf2e60d50e08e197a0efa83 and it
> > > > causes a regression for you?
> > > >
> > > > As I understand it, before applying 173c780359206 you were capable of
> > > > booting the PVH Dom0, albeit with non-working USB?
> > > >
> > > > And after applying 173c780359206 you are no longer able to boot?
> > > Right, after applying 173c780359206 the system fails to boot (it drops to
> > > the BusyBox shell).
> > > > > Following is an excerpt of the boot log regarding the file system issue.
> > > > At least part of the issue seems to be that the IO-APIC information
> > > > provided to Dom0 is wrong (from the attached log):
> > > >
> > > > [    0.000000] IOAPIC[0]: apic_id 2, version 152, address 0xfec00000, GSI 0-
> > 0
> > > > [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
> > > > [    0.000000] Failed to find ioapic for gsi : 2
> > > > [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
> > > > [    0.000000] Failed to find ioapic for gsi : 9
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 1
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 3
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 4
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 5
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 6
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 7
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 8
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 10
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 11
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 12
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 13
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 14
> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 15
> > > >
> > > > Can you try to boot with just the staging branch (current commit is
> > > > 008a8fb249b9d433) and see how that goes?
> > > >
> > > > Thanks, Roger.
> > > >
> > > I recompiled Xen using the staging branch, commit 008a8fb249b9d433 and
> > the
> > > system boots,
> > 
> > OK, so your issues were not caused by 173c780359206 then?
> > 
> > 008a8fb249b9d433 already contains 173c780359206 because it was
> > committed earlier. In any case it's good to know you are able to boot
> > (albeit with issues) using the current staging branch.
> > 
> > > however the USB problem persists. I was able to log in using the serial port
> > > and after executing
> > 
> > Yes, the fixes for this issue have not been committed yet, see:
> > 
> > https://lists.xenproject.org/archives/html/xen-devel/2018-
> > 08/msg00528.html
> > 
> > If you want you can give this branch a try, it should hopefully solve
> > your USB issues.
> > 
> > > "xl list -l" the memory decrease problem appeared.
> > 
> > Yup, I will look into this now in order to find some kind of
> > workaround.
> > 
> > > I attached the boot log. Following are some lines extracted from the log,
> > > regarding the USB
> > >
> > > devices problem:
> > >
> > > [    5.864084] usb 1-1: device descriptor read/64, error -71
> > >
> > > [    7.564025] usb 1-1: new full-speed USB device number 4 using xhci_hcd
> > > [    7.571347] usb 1-1: Device not responding to setup address.
> > >
> > > [    8.008031] usb 1-1: device not accepting address 4, error -71
> > >
> > > [    8.609623] usb 1-1: device not accepting address 5, error -71
> > >
> > >
> > > At the beginning of the log, there is a message regarding
> > > iommu_inclusive_mapping:
> > >
> > >
> > > (XEN) [VT-D]found ACPI_DMAR_RMRR:
> > > (XEN) [VT-D]  RMRR address range 3e2e0000..3e2fffff not in reserved
> > memory;
> > > need "iommu_inclusive_mapping=1"?
> > > (XEN) [VT-D] endpoint: 0000:00:14.0
> > >
> > >
> > > I mention that I tried to boot again using this command line option, but
> > > this message and the
> > >
> > > USB messages persist.

Does USB work despite the warning message?

> > Yes, iommu_inclusive_mapping doesn't work for PVH, that's what my
> > patch series is trying to address. The error is caused by
> > missing/wrong RMRR regions in the ACPI tables.
> > 
> 
> Looks like this warning is suggesting that there is an RMRR that falls outside of an E820 reserved region. For PV I guess that 'inclusive' will fix this, but for PVH 'reserved' isn't going to fix it. I hope that the range at least falls in a hole, so maybe we need a dom0_iommu=holes option too? Ugly, but maybe necessary.

I wanted to avoid adding such an option because I think it's going to
interact quite badly with BAR mappings.

Maybe it would make sense to add RMRR regions as long as they don't
reside in a RAM region and just print the warning message?
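
Something along these lines is what I have in mind. Untested sketch
only, assuming the RMRR list (acpi_rmrr_units) and the
page_is_ram_type() / set_identity_p2m_entry() helpers are reachable
from the Dom0 build code:

static int __hwdom_init pvh_map_non_ram_rmrrs(struct domain *d)
{
    struct acpi_rmrr_unit *rmrr;
    int rc = 0;

    list_for_each_entry ( rmrr, &acpi_rmrr_units, list )
    {
        unsigned long pfn = PFN_DOWN(rmrr->base_address);
        unsigned long end = PFN_UP(rmrr->end_address);

        for ( ; pfn < end && !rc; pfn++ )
        {
            /* Warn about RMRR pages that overlap RAM instead of mapping them. */
            if ( page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
            {
                printk(XENLOG_WARNING
                       "RMRR pfn %#lx overlaps RAM, not mapping it\n", pfn);
                continue;
            }
            /* Identity map the page in the p2m, as done for PVH Dom0. */
            rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0);
        }
    }

    return rc;
}

That way RMRRs from broken firmware would still get identity mappings
as long as they sit in holes or reserved areas, without silently
handing Dom0 mappings over RAM it doesn't own.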

As a side note, it would be good if you reported this to the vendor;
the firmware they are providing is broken and should be fixed.

Thanks, Roger.


* Re: PVH dom0 creation fails - the system freezes
  2018-08-08  8:51                             ` Roger Pau Monné
@ 2018-08-08  8:54                               ` bercarug
  2018-08-08  9:44                                 ` Roger Pau Monné
       [not found]                               ` <5B6AAD430200009A03E1638C@prv1-mh.provo.novell.com>
  1 sibling, 1 reply; 47+ messages in thread
From: bercarug @ 2018-08-08  8:54 UTC (permalink / raw)
  To: Roger Pau Monné, Paul Durrant
  Cc: xen-devel, David Woodhouse, Jan Beulich, Belgun, Adrian

On 08/08/2018 11:51 AM, Roger Pau Monné wrote:
> On Wed, Aug 08, 2018 at 09:43:39AM +0100, Paul Durrant wrote:
>>> -----Original Message-----
>>> From: Roger Pau Monne
>>> Sent: 08 August 2018 09:08
>>> To: bercarug@amazon.com
>>> Cc: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel <xen-
>>> devel@lists.xenproject.org>; David Woodhouse <dwmw2@infradead.org>;
>>> Jan Beulich <JBeulich@suse.com>; Belgun, Adrian <abelgun@amazon.com>
>>> Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes
>>>
>>> On Wed, Aug 08, 2018 at 10:46:40AM +0300, bercarug@amazon.com wrote:
>>>> On 08/02/2018 04:55 PM, Roger Pau Monné wrote:
>>>>> Please try to avoid top posting.
>>>>>
>>>>> On Thu, Aug 02, 2018 at 11:36:26AM +0000, Bercaru, Gabriel wrote:
>>>>>> I applied the patch mentioned, but the system fails to boot. Instead, it
>>>>>> drops to a BusyBox shell. It seems to be a file system issue.
>>>>> So you have applied 173c7803592065d27bf2e60d50e08e197a0efa83 and it
>>>>> causes a regression for you?
>>>>>
>>>>> As I understand it, before applying 173c780359206 you were capable of
>>>>> booting the PVH Dom0, albeit with non-working USB?
>>>>>
>>>>> And after applying 173c780359206 you are no longer able to boot?
>>>> Right, after applying 173c780359206 the system fails to boot (it drops to
>>>> the BusyBox shell).
>>>>>> Following is an excerpt from the boot log regarding the file system issue.
>>>>> At least part of the issue seems to be that the IO-APIC information
>>>>> provided to Dom0 is wrong (from the attached log):
>>>>>
>>>>> [    0.000000] IOAPIC[0]: apic_id 2, version 152, address 0xfec00000, GSI 0-0
>>>>> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
>>>>> [    0.000000] Failed to find ioapic for gsi : 2
>>>>> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
>>>>> [    0.000000] Failed to find ioapic for gsi : 9
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 1
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 3
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 4
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 5
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 6
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 7
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 8
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 10
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 11
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 12
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 13
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 14
>>>>> [    0.000000] ERROR: Unable to locate IOAPIC for GSI 15
>>>>>
>>>>> Can you try to boot with just the staging branch (current commit is
>>>>> 008a8fb249b9d433) and see how that goes?
>>>>>
>>>>> Thanks, Roger.
>>>>>
>>>> I recompiled Xen using the staging branch, commit 008a8fb249b9d433 and the
>>>> system boots,
>>> OK, so your issues were not caused by 173c780359206 then?
>>>
>>> 008a8fb249b9d433 already contains 173c780359206 because it was
>>> committed earlier. In any case it's good to know you are able to boot
>>> (albeit with issues) using the current staging branch.
>>>
>>>> however the USB problem persists. I was able to log in using the serial port
>>>> and after executing
>>> Yes, the fixes for this issue have not been committed yet, see:
>>>
>>> https://lists.xenproject.org/archives/html/xen-devel/2018-08/msg00528.html
>>>
>>>> If you want, you can give this branch a try; it should hopefully solve
>>>> your USB issues.
>>>
>>>> "xl list -l" the memory decrease problem appeared.
>>> Yup, I will look into this now in order to find some kind of
>>> workaround.
>>>
>>>> I attached the boot log. Following are some lines extracted from the log,
>>>> regarding the USB devices problem:
>>>>
>>>> [    5.864084] usb 1-1: device descriptor read/64, error -71
>>>>
>>>> [    7.564025] usb 1-1: new full-speed USB device number 4 using xhci_hcd
>>>> [    7.571347] usb 1-1: Device not responding to setup address.
>>>>
>>>> [    8.008031] usb 1-1: device not accepting address 4, error -71
>>>>
>>>> [    8.609623] usb 1-1: device not accepting address 5, error -71
>>>>
>>>>
>>>> At the beginning of the log, there is a message regarding
>>>> iommu_inclusive_mapping:
>>>>
>>>>
>>>> (XEN) [VT-D]found ACPI_DMAR_RMRR:
>>>> (XEN) [VT-D]  RMRR address range 3e2e0000..3e2fffff not in reserved memory;
>>>> need "iommu_inclusive_mapping=1"?
>>>> (XEN) [VT-D] endpoint: 0000:00:14.0
>>>>
>>>>
>>>> I mention that I tried to boot again using this command line option, but
>>>> this message and the USB messages persist.
> > Does USB work despite the warning message?
No, it does not.
>
>>> Yes, iommu_inclusive_mapping doesn't work for PVH; that's what my
>>> patch series is trying to address. The error is caused by
>>> missing/wrong RMRR regions in the ACPI tables.
>>>
>> Looks like this warning is suggesting that there is an RMRR that falls outside of an E820 reserved region. For PV I guess that 'inclusive' will fix this, but for PVH 'reserved' isn't going to fix it. I hope that the range at least falls in a hole, so maybe we also need a dom0_iommu=holes option? Ugly, but maybe necessary.
> I wanted to avoid adding such an option because I think it's going to
> interact quite badly with BAR mappings.
>
> Maybe it would make sense to add RMRR regions as long as they don't
> reside in a RAM region and just print the warning message?
>
> As a side note, it would be good if you reported this to the vendor;
> the firmware they are providing is broken and should be fixed.
>
> Thanks, Roger.






* Re: PVH dom0 creation fails - the system freezes
  2018-08-08  8:54                               ` bercarug
@ 2018-08-08  9:44                                 ` Roger Pau Monné
  2018-08-08 10:11                                   ` Roger Pau Monné
  0 siblings, 1 reply; 47+ messages in thread
From: Roger Pau Monné @ 2018-08-08  9:44 UTC (permalink / raw)
  To: bercarug
  Cc: xen-devel, Paul Durrant, David Woodhouse, Jan Beulich, Belgun, Adrian

On Wed, Aug 08, 2018 at 11:54:40AM +0300, bercarug@amazon.com wrote:
> On 08/08/2018 11:51 AM, Roger Pau Monné wrote:
> > On Wed, Aug 08, 2018 at 09:43:39AM +0100, Paul Durrant wrote:
> > > > -----Original Message-----
> > > > From: Roger Pau Monne
> > > > Sent: 08 August 2018 09:08
> > > > To: bercarug@amazon.com
> > > > Cc: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel <xen-
> > > > devel@lists.xenproject.org>; David Woodhouse <dwmw2@infradead.org>;
> > > > Jan Beulich <JBeulich@suse.com>; Belgun, Adrian <abelgun@amazon.com>
> > > > Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes
> > > > 
> > > > On Wed, Aug 08, 2018 at 10:46:40AM +0300, bercarug@amazon.com wrote:
> > > > > On 08/02/2018 04:55 PM, Roger Pau Monné wrote:
> > > > > > Please try to avoid top posting.
> > > > > > 
> > > > > > On Thu, Aug 02, 2018 at 11:36:26AM +0000, Bercaru, Gabriel wrote:
> > > > > > > I applied the patch mentioned, but the system fails to boot. Instead, it
> > > > > > > drops to a BusyBox shell. It seems to be a file system issue.
> > > > > > So you have applied 173c7803592065d27bf2e60d50e08e197a0efa83 and it
> > > > > > causes a regression for you?
> > > > > > 
> > > > > > As I understand it, before applying 173c780359206 you were capable of
> > > > > > booting the PVH Dom0, albeit with non-working USB?
> > > > > > 
> > > > > > And after applying 173c780359206 you are no longer able to boot?
> > > > > Right, after applying 173c780359206 the system fails to boot (it drops to
> > > > > the BusyBox shell).
> > > > > > > Following is an excerpt from the boot log regarding the file system issue.
> > > > > > At least part of the issue seems to be that the IO-APIC information
> > > > > > provided to Dom0 is wrong (from the attached log):
> > > > > > 
> > > > > > [    0.000000] IOAPIC[0]: apic_id 2, version 152, address 0xfec00000, GSI 0-0
> > > > > > [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
> > > > > > [    0.000000] Failed to find ioapic for gsi : 2
> > > > > > [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
> > > > > > [    0.000000] Failed to find ioapic for gsi : 9
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 1
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 3
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 4
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 5
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 6
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 7
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 8
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 10
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 11
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 12
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 13
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 14
> > > > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 15
> > > > > > 
> > > > > > Can you try to boot with just the staging branch (current commit is
> > > > > > 008a8fb249b9d433) and see how that goes?
> > > > > > 
> > > > > > Thanks, Roger.
> > > > > > 
> > > > > I recompiled Xen using the staging branch, commit 008a8fb249b9d433 and the
> > > > > system boots,
> > > > OK, so your issues were not caused by 173c780359206 then?
> > > > 
> > > > 008a8fb249b9d433 already contains 173c780359206 because it was
> > > > committed earlier. In any case it's good to know you are able to boot
> > > > (albeit with issues) using the current staging branch.
> > > > 
> > > > > however the USB problem persists. I was able to log in using the serial port
> > > > > and after executing
> > > > Yes, the fixes for this issue have not been committed yet, see:
> > > > 
> > > > https://lists.xenproject.org/archives/html/xen-devel/2018-08/msg00528.html
> > > > 
> > > > If you want, you can give this branch a try; it should hopefully solve
> > > > your USB issues.
> > > > 
> > > > > "xl list -l" the memory decrease problem appeared.
> > > > Yup, I will look into this now in order to find some kind of
> > > > workaround.
> > > > 
> > > > > I attached the boot log. Following are some lines extracted from the log,
> > > > > regarding the USB devices problem:
> > > > > 
> > > > > [    5.864084] usb 1-1: device descriptor read/64, error -71
> > > > > 
> > > > > [    7.564025] usb 1-1: new full-speed USB device number 4 using xhci_hcd
> > > > > [    7.571347] usb 1-1: Device not responding to setup address.
> > > > > 
> > > > > [    8.008031] usb 1-1: device not accepting address 4, error -71
> > > > > 
> > > > > [    8.609623] usb 1-1: device not accepting address 5, error -71
> > > > > 
> > > > > 
> > > > > At the beginning of the log, there is a message regarding
> > > > > iommu_inclusive_mapping:
> > > > > 
> > > > > 
> > > > > (XEN) [VT-D]found ACPI_DMAR_RMRR:
> > > > > (XEN) [VT-D]  RMRR address range 3e2e0000..3e2fffff not in reserved memory;
> > > > > need "iommu_inclusive_mapping=1"?
> > > > > (XEN) [VT-D] endpoint: 0000:00:14.0
> > > > > 
> > > > > 
> > > > > I mention that I tried to boot again using this command line option, but
> > > > > this message and the USB messages persist.
> > Does USB work despite the warning message?
> No, it does not.

I just realized that I've dropped a chunk from my series while
rebasing; could you please try again with the following diff applied
on top of my series?

diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 6aec43ed1a..6271d8b671 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -209,7 +209,13 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
         if ( !hwdom_iommu_map(d, pfn, max_pfn) )
             continue;
 
-        rc = iommu_map_page(d, pfn, pfn, IOMMUF_readable|IOMMUF_writable);
+        if ( iommu_use_hap_pt(d) )
+        {
+            ASSERT(is_hvm_domain(d));
+            rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0);
+        }
+        else
+            rc = iommu_map_page(d, pfn, pfn, IOMMUF_readable|IOMMUF_writable);
         if ( rc )
             printk(XENLOG_WARNING " d%d: IOMMU mapping failed: %d\n",
                    d->domain_id, rc);
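
To clarify the chunk: when the IOMMU page tables are shared with the
HAP p2m (iommu_use_hap_pt()), as is the case for a PVH Dom0, the
identity mapping has to be installed through the p2m, hence
set_identity_p2m_entry(); a plain iommu_map_page() would never reach
the shared tables.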

> > > > Yes, iommu_inclusive_mapping doesn't work for PVH, that's what my
> > > > patch series is trying to address. The error is caused by
> > > > missing/wrong RMRR regions in the ACPI tables.
> > > > 
> > > Looks like this warning is suggesting that there is an RMRR that falls outside of an E820 reserved region. For PV I guess that 'inclusive' will fix this, but for PVH 'reserved' isn't going to fix it. I hope that the range at least falls in a hole, so maybe we also need a dom0_iommu=holes option? Ugly, but maybe necessary.
> > I wanted to avoid adding such an option because I think it's going to
> > interact quite badly with BAR mappings.
> > 
> > Maybe it would make sense to add RMRR regions as long as they don't
> > reside in a RAM region and just print the warning message?

I see that the RMRR region is mapped regardless of the error message,
so it means that the firmware has both wrong RMRR regions and missing
entries...

I can try to make something like 'inclusive' work for PVH if the
current series doesn't solve your issues.

Thanks, Roger.


* Re: PVH dom0 creation fails - the system freezes
       [not found]                                 ` <5B6AAF130200003B04D2E796@prv1-mh.provo.novell.com>
@ 2018-08-08 10:00                                   ` Jan Beulich
  0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2018-08-08 10:00 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: xen-devel, bercarug, David Woodhouse, Paul Durrant, abelgun

>>> On 08.08.18 at 10:51, <roger.pau@citrix.com> wrote:
> On Wed, Aug 08, 2018 at 09:43:39AM +0100, Paul Durrant wrote:
>> > -----Original Message-----
>> > From: Roger Pau Monne
>> > Sent: 08 August 2018 09:08
>> > To: bercarug@amazon.com 
>> > Cc: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel <xen-
>> > devel@lists.xenproject.org>; David Woodhouse <dwmw2@infradead.org>;
>> > Jan Beulich <JBeulich@suse.com>; Belgun, Adrian <abelgun@amazon.com>
>> > Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes
>> > 
>> > On Wed, Aug 08, 2018 at 10:46:40AM +0300, bercarug@amazon.com wrote:
>> > > On 08/02/2018 04:55 PM, Roger Pau Monné wrote:
>> > > > Please try to avoid top posting.
>> > > >
>> > > > On Thu, Aug 02, 2018 at 11:36:26AM +0000, Bercaru, Gabriel wrote:
>> > > > > I applied the patch mentioned, but the system fails to boot. Instead, it
>> > > > > drops to a BusyBox shell. It seems to be a file system issue.
>> > > > So you have applied 173c7803592065d27bf2e60d50e08e197a0efa83 and it
>> > > > causes a regression for you?
>> > > >
>> > > > As I understand it, before applying 173c780359206 you were capable of
>> > > > booting the PVH Dom0, albeit with non-working USB?
>> > > >
>> > > > And after applying 173c780359206 you are no longer able to boot?
>> > > Right, after applying 173c780359206 the system fails to boot (it drops to
>> > > the BusyBox shell).
>> > > > > Following is an excerpt from the boot log regarding the file system issue.
>> > > > At least part of the issue seems to be that the IO-APIC information
>> > > > provided to Dom0 is wrong (from the attached log):
>> > > >
>> > > > [    0.000000] IOAPIC[0]: apic_id 2, version 152, address 0xfec00000, GSI 0-0
>> > > > [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
>> > > > [    0.000000] Failed to find ioapic for gsi : 2
>> > > > [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
>> > > > [    0.000000] Failed to find ioapic for gsi : 9
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 1
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 3
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 4
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 5
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 6
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 7
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 8
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 10
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 11
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 12
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 13
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 14
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 15
>> > > >
>> > > > Can you try to boot with just the staging branch (current commit is
>> > > > 008a8fb249b9d433) and see how that goes?
>> > > >
>> > > > Thanks, Roger.
>> > > >
>> > > I recompiled Xen using the staging branch, commit 008a8fb249b9d433 and the
>> > > system boots,
>> > 
>> > OK, so your issues were not caused by 173c780359206 then?
>> > 
>> > 008a8fb249b9d433 already contains 173c780359206 because it was
>> > committed earlier. In any case it's good to know you are able to boot
>> > (albeit with issues) using the current staging branch.
>> > 
>> > > however the USB problem persists. I was able to log in using the serial port
>> > > and after executing
>> > 
>> > Yes, the fixes for this issue have not been committed yet, see:
>> > 
>> > https://lists.xenproject.org/archives/html/xen-devel/2018-08/msg00528.html
>> > 
>> > If you want, you can give this branch a try; it should hopefully solve
>> > your USB issues.
>> > 
>> > > "xl list -l" the memory decrease problem appeared.
>> > 
>> > Yup, I will look into this now in order to find some kind of
>> > workaround.
>> > 
>> > > I attached the boot log. Following are some lines extracted from the log,
>> > > regarding the USB devices problem:
>> > >
>> > > [    5.864084] usb 1-1: device descriptor read/64, error -71
>> > >
>> > > [    7.564025] usb 1-1: new full-speed USB device number 4 using xhci_hcd
>> > > [    7.571347] usb 1-1: Device not responding to setup address.
>> > >
>> > > [    8.008031] usb 1-1: device not accepting address 4, error -71
>> > >
>> > > [    8.609623] usb 1-1: device not accepting address 5, error -71
>> > >
>> > >
>> > > At the beginning of the log, there is a message regarding
>> > > iommu_inclusive_mapping:
>> > >
>> > >
>> > > (XEN) [VT-D]found ACPI_DMAR_RMRR:
>> > > (XEN) [VT-D]  RMRR address range 3e2e0000..3e2fffff not in reserved memory;
>> > > need "iommu_inclusive_mapping=1"?
>> > > (XEN) [VT-D] endpoint: 0000:00:14.0
>> > >
>> > >
>> > > I mention that I tried to boot again using this command line option, but
>> > > this message and the USB messages persist.
> 
> Does USB work despite the warning message?
> 
>> > Yes, iommu_inclusive_mapping doesn't work for PVH; that's what my
>> > patch series is trying to address. The error is caused by
>> > missing/wrong RMRR regions in the ACPI tables.
>> > 
>> 
>> Looks like this warning is suggesting that there is an RMRR that falls
>> outside of an E820 reserved region. For PV I guess that 'inclusive' will fix
>> this, but for PVH 'reserved' isn't going to fix it. I hope that the range at
>> least falls in a hole, so maybe we also need a dom0_iommu=holes option?
>> Ugly, but maybe necessary.
> 
> I wanted to avoid adding such an option because I think it's going to
> interact quite badly with BAR mappings.
> 
> Maybe it would make sense to add RMRR regions as long as they don't
> reside in a RAM region and just print the warning message?

But the BAR problem would then still exist, unless we hand Dom0 a
fixed-up E820 table.
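
For illustration only, a hypothetical sketch of such a fixup, assuming
Xen's e820_add_range() helper, that the RMRR range really falls in an
E820 hole, and that "e820" here is the map which ends up being handed
to Dom0:

    /* Hypothetical: mark the offending RMRR range reserved for Dom0. */
    if ( !e820_add_range(&e820, rmrr->base_address,
                         rmrr->end_address + 1, E820_RESERVED) )
        printk(XENLOG_WARNING
               "Failed to reserve RMRR [%"PRIx64", %"PRIx64"] for Dom0\n",
               rmrr->base_address, rmrr->end_address);

Marking the range reserved would at least keep Dom0 from placing BARs
on top of it.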

Jan



* Re: PVH dom0 creation fails - the system freezes
  2018-08-08  9:44                                 ` Roger Pau Monné
@ 2018-08-08 10:11                                   ` Roger Pau Monné
  2018-08-08 10:13                                     ` bercarug
  0 siblings, 1 reply; 47+ messages in thread
From: Roger Pau Monné @ 2018-08-08 10:11 UTC (permalink / raw)
  To: bercarug
  Cc: xen-devel, Paul Durrant, David Woodhouse, Jan Beulich, Belgun, Adrian

On Wed, Aug 08, 2018 at 11:44:28AM +0200, Roger Pau Monné wrote:
> I just realized that I've dropped a chunk from my series while
> rebasing; could you please try again with the following diff applied
> on top of my series?
> 
> diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
> index 6aec43ed1a..6271d8b671 100644
> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -209,7 +209,13 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
>          if ( !hwdom_iommu_map(d, pfn, max_pfn) )
>              continue;
>  
> -        rc = iommu_map_page(d, pfn, pfn, IOMMUF_readable|IOMMUF_writable);
> +        if ( iommu_use_hap_pt(d) )
> +        {
> +            ASSERT(is_hvm_domain(d));
> +            rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0);
> +        }
> +        else
> +            rc = iommu_map_page(d, pfn, pfn, IOMMUF_readable|IOMMUF_writable);
>          if ( rc )
>              printk(XENLOG_WARNING " d%d: IOMMU mapping failed: %d\n",
>                     d->domain_id, rc);

I've pushed a new version that has this chunk, so it might be easier
for you to just fetch and test:

git://xenbits.xen.org/people/royger/xen.git iommu_inclusive_v4

Thanks, Roger.


* Re: PVH dom0 creation fails - the system freezes
  2018-08-08 10:11                                   ` Roger Pau Monné
@ 2018-08-08 10:13                                     ` bercarug
  0 siblings, 0 replies; 47+ messages in thread
From: bercarug @ 2018-08-08 10:13 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Paul Durrant, David Woodhouse, Jan Beulich, Belgun, Adrian

On 08/08/2018 01:11 PM, Roger Pau Monné wrote:
> On Wed, Aug 08, 2018 at 11:44:28AM +0200, Roger Pau Monné wrote:
>> I just realized that I've dropped a chunk from my series while
>> rebasing; could you please try again with the following diff applied
>> on top of my series?
>>
>> diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
>> index 6aec43ed1a..6271d8b671 100644
>> --- a/xen/drivers/passthrough/x86/iommu.c
>> +++ b/xen/drivers/passthrough/x86/iommu.c
>> @@ -209,7 +209,13 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
>>           if ( !hwdom_iommu_map(d, pfn, max_pfn) )
>>               continue;
>>   
>> -        rc = iommu_map_page(d, pfn, pfn, IOMMUF_readable|IOMMUF_writable);
>> +        if ( iommu_use_hap_pt(d) )
>> +        {
>> +            ASSERT(is_hvm_domain(d));
>> +            rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0);
>> +        }
>> +        else
>> +            rc = iommu_map_page(d, pfn, pfn, IOMMUF_readable|IOMMUF_writable);
>>           if ( rc )
>>               printk(XENLOG_WARNING " d%d: IOMMU mapping failed: %d\n",
>>                      d->domain_id, rc);
> I've pushed a new version that has this chunk, so it might be easier
> for you to just fetch and test:
>
> git://xenbits.xen.org/people/royger/xen.git iommu_inclusive_v4
>
> Thanks, Roger.
>
I already recompiled Xen using this patch, and the USB devices are 
functional again.


Thanks,

Gabriel





Thread overview: 47+ messages
2018-07-23 11:50 PVH dom0 creation fails - the system freezes bercarug
2018-07-24  9:54 ` Jan Beulich
2018-07-25 10:06   ` bercarug
2018-07-25 10:22     ` Wei Liu
2018-07-25 10:43     ` Juergen Gross
2018-07-25 13:35     ` Roger Pau Monné
2018-07-25 13:41       ` Juergen Gross
2018-07-25 14:02         ` Wei Liu
2018-07-25 14:05           ` bercarug
2018-07-25 14:10             ` Wei Liu
2018-07-25 16:12             ` Roger Pau Monné
2018-07-25 16:29               ` Juergen Gross
2018-07-25 18:56                 ` [Memory Accounting] was: " Andrew Cooper
2018-07-25 23:07                   ` Boris Ostrovsky
2018-07-26  9:41                     ` Juergen Gross
2018-07-26  9:45                     ` George Dunlap
2018-07-26 11:11                       ` Roger Pau Monné
2018-07-26 11:22                         ` Juergen Gross
2018-07-26 11:27                           ` George Dunlap
2018-07-26 12:19                             ` Juergen Gross
2018-07-26 14:44                               ` George Dunlap
2018-07-26 13:50                           ` Roger Pau Monné
2018-07-26 13:58                             ` Juergen Gross
2018-07-26 14:35                               ` Roger Pau Monné
2018-07-26 11:23                         ` George Dunlap
2018-07-26 11:08                 ` Roger Pau Monné
2018-07-26  8:15               ` bercarug
2018-07-26  8:31                 ` Juergen Gross
2018-07-26 11:05                   ` Roger Pau Monné
2018-07-25 13:57       ` bercarug
2018-07-25 14:12         ` Roger Pau Monné
2018-07-25 16:19           ` Paul Durrant
2018-07-26 16:46             ` Roger Pau Monné
2018-07-27  8:48               ` Bercaru, Gabriel
2018-07-27  9:11                 ` Roger Pau Monné
2018-08-02 11:36                   ` Bercaru, Gabriel
2018-08-02 13:55                     ` Roger Pau Monné
2018-08-08  7:46                       ` bercarug
2018-08-08  8:08                         ` Roger Pau Monné
2018-08-08  8:39                           ` bercarug
2018-08-08  8:43                           ` Paul Durrant
2018-08-08  8:51                             ` Roger Pau Monné
2018-08-08  8:54                               ` bercarug
2018-08-08  9:44                                 ` Roger Pau Monné
2018-08-08 10:11                                   ` Roger Pau Monné
2018-08-08 10:13                                     ` bercarug
     [not found]                               ` <5B6AAD430200009A03E1638C@prv1-mh.provo.novell.com>
     [not found]                                 ` <5B6AAF130200003B04D2E796@prv1-mh.provo.novell.com>
2018-08-08 10:00                                   ` Jan Beulich
