* Keystone Issue
@ 2020-06-01 12:38 CodeWiz2280
  2020-06-01 13:29 ` Julien Grall
  0 siblings, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-01 12:38 UTC (permalink / raw)
  To: xen-devel

To Whom it May Concern:

Hello, I am using a Texas Instruments K2E Keystone eval board with Linux
4.19.59.  It has a 32-bit ARM Cortex-A15 processor.  The kernel contains
Keystone-specific code in arch/arm/mm/pv-fixup-asm.S that executes during
early_paging_init() for LPAE support.  This code switches the running kernel
from its 32-bit address space to a 36-bit address space, at which point the
hypervisor traps repeatedly and the kernel never finishes booting.  I suspect
this is because Xen only allows the original 32-bit memory range specified by
the dom0 device tree.  The 36-bit LPAE address is a fixed offset from the
32-bit address and is not physically different memory.  Can you suggest any
way to get past this problem?  I am using the master branch of Xen from
earlier this year.  Any help is greatly appreciated.

Thanks,
Dave


* Re: Keystone Issue
  2020-06-01 12:38 Keystone Issue CodeWiz2280
@ 2020-06-01 13:29 ` Julien Grall
  2020-06-01 15:21   ` CodeWiz2280
  0 siblings, 1 reply; 55+ messages in thread
From: Julien Grall @ 2020-06-01 13:29 UTC (permalink / raw)
  To: CodeWiz2280, xen-devel

Hello,

I have a few questions to better understand your problem.

On 01/06/2020 13:38, CodeWiz2280 wrote:
> Hello, I am using a Texas Instruments K2E Keystone Eval board with Linux 
> 4.19.59.  It has a 32-bit ARM Cortex A15 processor. There is keystone 
> specific code in the kernel in arch/arm/mm/pv-fixup-asm.s that executes 
> during early_paging_init for LPAE support.  This causes the kernel to 
> switch its running 32-bit address space to a 36-bit address space and 
> the hypervisor traps repeatedly and stops it from booting.

Without a log it is going to be difficult to help. Could you post the 
hypervisor log from a build with debug enabled?

>  I suspect 
> this is because Xen only allowed for the original 32-bit memory range 
> specified by the dom0 device tree.

How much RAM did you give to your Dom0?

> The 36-bit LPAE address is a fixed 
> offset from the 32-bit address and is not physically different memory.

I am not sure I understand this. Are you suggesting that the kernel is 
trying to relocate itself to a different part of the physical memory?

Can you provide more details on the fixed offset?

>  
> Can you suggest any way to get through this problem? I am using the 
> master branch of xen from earlier this year.  

Can you provide the exact baseline you are using? Did you make any changes 
on top?

> Any help is greatly 
> appreciated.
Best regards,

-- 
Julien Grall



* Re: Keystone Issue
  2020-06-01 13:29 ` Julien Grall
@ 2020-06-01 15:21   ` CodeWiz2280
  2020-06-01 17:38     ` CodeWiz2280
  0 siblings, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-01 15:21 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel

Hi Julien,

Thank you for your response.  I will try to post a log for you.  I have
been switching back and forth between configurations and need to capture a
new one.

The board has 4GB of memory.  U-Boot places the kernel/initramfs/dtb in the
0x8000_0000 region, but the kernel then switches its code/data over to the
0x8_0000_0000 range via the pv-fixup-asm.S assembly code called from
early_paging_init() in arch/arm/mm/mmu.c.  In the 4.19 kernel that code is
exclusive to Keystone and is built when CONFIG_ARM_PV_FIXUP and
CONFIG_ARM_LPAE are enabled.  The upper 2GB of memory is above 0xFFFF_FFFF,
so LPAE is required.

Without Xen, after the switch and once the kernel has booted, /proc/iomem
looks like this:

80000000-9fffffff : System RAM (boot alias)
c8000000-ffffffff : System RAM (boot alias)
800000000-81fffffff : System RAM
  800008000-800dfffff : Kernel Code
  801000000-80108ab3f : Kernel data
848000000-8ffffffff : System RAM
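
The relationship between the boot alias and the LPAE addresses in the listing above can be sketched in C. This is only an illustration, assuming the alias is a single fixed offset between a low window starting at 0x8000_0000 and a high window starting at 0x8_0000_0000; the macro and function names are made up for the example and are not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed window bounds; names are illustrative, not kernel symbols. */
#define LOW_PHYS_START   0x80000000ULL
#define LOW_PHYS_END     0xffffffffULL
#define HIGH_PHYS_START  0x800000000ULL
#define ALIAS_OFFSET     (HIGH_PHYS_START - LOW_PHYS_START) /* 0x7_8000_0000 */

/* True if pa lies inside the 32-bit boot alias window. */
static bool in_low_alias(uint64_t pa)
{
    return pa >= LOW_PHYS_START && pa <= LOW_PHYS_END;
}

/* Translate a boot-alias address to its 36-bit LPAE address. */
static uint64_t low_to_high(uint64_t pa)
{
    assert(in_low_alias(pa));
    return pa + ALIAS_OFFSET;
}
```

With these assumptions, 0x8000_0000 translates to 0x8_0000_0000 and 0xc800_0000 to 0x8_4800_0000, which matches the start of each "System RAM" entry and its boot alias.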

I was able to duplicate this issue with a build of your latest "master"
repository from this morning.


* Re: Keystone Issue
  2020-06-01 15:21   ` CodeWiz2280
@ 2020-06-01 17:38     ` CodeWiz2280
  2020-06-03 11:32       ` Julien Grall
  0 siblings, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-01 17:38 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel

Hi Julien,

As requested, please see the log below from the eval board booting dom0.
Some notes:

1. The offset that gets applied to the 32-bit address to translate it
to 36 bits is 0x7_8000_0000.
2. U-Boot has been set up not to change the address of the memory in the
device tree prior to launching Xen.  Otherwise it would automatically offset
it and replace it with a 36-bit address, and Xen would immediately panic at
the 36-bit address for a 32-bit processor.
3. The RAM starting address placed in the device tree is 0x8000_0000, which
gets carved up by Xen and replaced with 0xA000_0000 prior to booting
dom0.  I had to put in test code to have the kernel offset the 0xA000_0000
32-bit starting address to the 36-bit address needed before the kernel will
attempt to switch.  If it stays 32-bit then it will not switch over the
address space.  Note that without Xen in play, U-Boot would normally replace
the address in the device tree with the 36-bit one.
4. The dom0 kernel will boot from Xen if the early_paging_init() switch step
is skipped and the low memory stays 32-bit, but there is a problem with
the peripherals, so this is not an acceptable solution.

It seems that either the kernel would need some API to tell Xen that there
is going to be a change in the memory it is using prior to calling
early_paging_init(), or Xen would need to add the additional 36-bit
addresses during the memory bank allocation step, while recognizing that
they are not actually different physical memory but just aliased to a
different address.

Thanks,
Dave

 Xen 4.14-unstable
(XEN) Xen version 4.14-unstable (arm-linux-gnueabihf-gcc (Linaro GCC 4.9-2015.05) 4.9.3 20150413 (prerelease)) debug=y  Mon Jun  1 10:22:11 EDT 2020
(XEN) Latest ChangeSet:
(XEN) build-id: 30ae91a06c71a885cfba2788965144999a864614
(XEN) Console output is synchronous.
(XEN) Processor: 412fc0f4: "ARM Limited", variant: 0x2, part 0xc0f, rev 0x4
(XEN) 32-bit Execution:
(XEN)   Processor Features: 00001131:00011011
(XEN)     Instruction Sets: AArch32 A32 Thumb Thumb-2 ThumbEE Jazelle
(XEN)     Extensions: GenericTimer Security
(XEN)   Debug Features: 02010555
(XEN)   Auxiliary Features: 00000000
(XEN)   Memory Model Features: 10201105 20000000 01240000 02102211
(XEN)  ISA Features: 02101110 13112111 21232041 11112131 10011142 00000000
(XEN) Using SMC Calling Convention v1.0
(XEN) Using PSCI v0.1
(XEN) SMP: Allowing 4 CPUs
(XEN) Generic Timer IRQ: phys=30 hyp=26 virt=27 Freq: 208333 KHz
(XEN) GICv2 initialization:
(XEN)         gic_dist_addr=0000000002561000
(XEN)         gic_cpu_addr=0000000002562000
(XEN)         gic_hyp_addr=0000000002564000
(XEN)         gic_vcpu_addr=0000000002566000
(XEN)         gic_maintenance_irq=25
(XEN) Using the new VGIC implementation.
(XEN) GICv2: 512 lines, 4 cpus, secure (IID 0200143b).
(XEN) Using scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Initializing Credit2 scheduler
(XEN)  load_precision_shift: 18
(XEN)  load_window_shift: 30
(XEN)  underload_balance_tolerance: 0
(XEN)  overload_balance_tolerance: -3
(XEN)  runqueues arrangement: socket
(XEN)  cap enforcement granularity: 10ms
(XEN) load tracking window length 1073741824 ns
(XEN) Allocated console ring of 32 KiB.
(XEN) VFP implementer 0x41 architecture 4 part 0x30 variant 0xf rev 0x0
(XEN) CPU0: Guest atomics will try 2 times before pausing the domain
(XEN) Bringing up CPU1
(XEN) CPU1: Guest atomics will try 1 times before pausing the domain
(XEN) CPU 1 booted.
(XEN) Bringing up CPU2
(XEN) CPU2: Guest atomics will try 1 times before pausing the domain
(XEN) CPU 2 booted.
(XEN) Bringing up CPU3
(XEN) CPU3: Guest atomics will try 1 times before pausing the domain
(XEN) CPU 3 booted.
(XEN) Brought up 4 CPUs
(XEN) I/O virtualisation disabled
(XEN) P2M: 40-bit IPA
(XEN) P2M: 3 levels with order-1 root, VTCR 0x80003558
(XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
(XEN) Adding cpu 0 to runqueue 0
(XEN)  First cpu on runqueue, activating
(XEN) Adding cpu 1 to runqueue 0
(XEN) Adding cpu 2 to runqueue 0
(XEN) Adding cpu 3 to runqueue 0
(XEN) alternatives: Patching with alt table 002c2530 -> 002c2578
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Loading d0 kernel from boot module @ 0000000083000000
(XEN) Loading ramdisk from boot module @ 0000000088000000
(XEN) Allocating 1:1 mappings totalling 1024MB for dom0:
(XEN) BANK[0] 0x000000a0000000-0x000000e0000000 (1024MB)
(XEN) Grant table range: 0x00000082000000-0x00000082040000
(XEN) Allocating PPI 16 for event channel interrupt
(XEN) Loading zImage from 0000000083000000 to 00000000a7a00000-00000000a7f36100
(XEN) Loading d0 initrd from 0000000088000000 to 0x00000000a8200000-0x00000000abe00000
(XEN) Loading d0 DTB to 0x00000000a8000000-0x00000000a8007872
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Scrubbing Free RAM in background
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) ***************************************************
(XEN) WARNING: CONSOLE OUTPUT IS SYNCHRONOUS
(XEN) This option is intended to aid debugging of Xen by ensuring
(XEN) that all output is synchronously delivered on the serial line.
(XEN) However it can introduce SIGNIFICANT latencies and affect
(XEN) timekeeping. It is NOT recommended for production use!
(XEN) ***************************************************
(XEN) WARNING: SILO mode is not enabled.
(XEN) It has implications on the security of the system,
(XEN) unless the communications have been forbidden between
(XEN) untrusted domains.
(XEN) ***************************************************
(XEN) 3... 2... 1...
(XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
(XEN) Freed 328kB init memory.
(XEN) DOM0: [    0.000000] Booting Linux on physical CPU 0x0
(XEN) DOM0: [    0.000000] Linux version 4.19.59-g5f8c1c6121 (gcc version 8.3.0 (GNU Toolchain for the A-profile A
(XEN) DOM0: rchitecture 8.3-2019.03 (arm-rel-8.36))) #52 SMP Mon Jun 1 12:13:51 EDT 2020
(XEN) DOM0: [    0.000000] CPU: ARMv7 Processor [412fc0f4] revision 4 (ARMv7), cr=30c5387d
(XEN) DOM0: [    0.000000] CPU: div instructions available: patching division code
(XEN) DOM0: [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
(XEN) DOM0: [    0.000000] OF: fdt: Machine model: Texas Instruments Keystone 2 Edison EVM
(XEN) DOM0: [    0.000000] bootconsole [earlycon0] enabled
(XEN) DOM0: [    0.000000] debug: ignoring loglevel setting.
(XEN) DOM0: [    0.000000] Memory policy: Data cache writealloc
(XEN) DOM0: [    0.000000] test : mem_start from dtb = 0xa0000000
(XEN) DOM0: [    0.000000] test : force the mem_start = 0x800000000
(XEN) DOM0: [    0.000000] test : note KEYSTONE_LOW_PHYS_START = 80000000, KEYSTONE_LOW_PHYS_END = ffffffff
(XEN) DOM0: [    0.000000] test : note KEYSTONE_HIGH_PHYS_START = 800000000, KEYSTONE_HIGH_PHYS_END = bffffffff
(XEN) DOM0: [    0.000000] test : Switch physical address space to 0x820000000 (0xa0000000 + 0x780000000)
(XEN) DOM0: [    0.000000] test : inside of early_paging_init()
(XEN) traps.c:1980:d0v0 HSR=0x80000086 pc=0xa020010c gva=0xa020010c gpa=0x0000082000310c
(XEN) traps.c:1980:d0v0 HSR=0x80000086 pc=0xffff000c gva=0xffff000c gpa=0x0000082000700c
(XEN) traps.c:1980:d0v0 HSR=0x80000086 pc=0xffff000c gva=0xffff000c gpa=0x0000082000700c
... last line loops indefinitely
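
For readers decoding the trap lines, the HSR value can be split into its fields with a few shifts. The sketch below is an illustration based on the ARMv7 Hyp Syndrome Register layout for prefetch aborts as I understand it; the struct and function names are hypothetical and are not Xen symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of the ARMv7 HSR for a prefetch abort taken to Hyp mode.
 * Names are illustrative only, not Xen's own definitions. */
struct hsr_iabt {
    unsigned int ec;    /* exception class, HSR[31:26] */
    bool il;            /* instruction length bit, HSR[25] */
    bool s1ptw;         /* stage-2 fault on a stage-1 table walk, HSR[7] */
    unsigned int ifsc;  /* instruction fault status code, HSR[5:0] */
};

/* Split a raw HSR value into the fields above. */
static struct hsr_iabt decode_hsr(uint32_t hsr)
{
    struct hsr_iabt d;

    d.ec    = (hsr >> 26) & 0x3f;
    d.il    = (hsr >> 25) & 1;
    d.s1ptw = (hsr >> 7) & 1;
    d.ifsc  = hsr & 0x3f;
    return d;
}
```

For HSR=0x80000086 this gives EC 0x20 (prefetch abort from the guest), S1PTW set, and IFSC 0x06 (translation fault, level 2). Read that way, the guest faults while walking its own page tables at a 36-bit address that has no stage-2 mapping, which would be consistent with the gpa values of the form 0x8200xxxxx in the log. That interpretation is an inference from the encoding, not something confirmed in the log itself.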




* Re: Keystone Issue
  2020-06-01 17:38     ` CodeWiz2280
@ 2020-06-03 11:32       ` Julien Grall
  2020-06-03 17:13         ` CodeWiz2280
  0 siblings, 1 reply; 55+ messages in thread
From: Julien Grall @ 2020-06-03 11:32 UTC (permalink / raw)
  To: CodeWiz2280; +Cc: xen-devel, Stefano Stabellini, Bertrand Marquis

(+Bertrand and Stefano)

On 01/06/2020 18:38, CodeWiz2280 wrote:
> Hi Julien,

Hi Dave,

> 
> As requested please see log below from the eval board booting dom0, some 
> notes are as follows:

Thanks for the logs and the notes. They are useful for understanding your issue.

> 1. The offset that gets applied to the 32-bit address to translate it 
> to 36-bits is 0x7_8000_0000

Is this offset present in the Device-Tree?

> 2. Uboot has been setup to not change the address of the memory in the 
> device tree prior to launching xen, otherwise it would 
> automatically offset it and replace it with a 36-bit address and xen 
> would immediately panic at the 36-bit address for a 32-bit processor.

What is the list of the memory banks Xen will see?

Xen is able to support 36-bit addresses. Can you point to the panic() you 
are hitting?

> 3. The RAM starting address placed in the device tree is 0x8000_0000, 
> which gets carved up by xen and replaced with 0xA000_0000 prior to 
> booting dom0..  I had to put in test code to have the kernel offset the 
> 0xA000_0000 32-bit starting address to the 36-bit address needed before 
> the kernel will attempt to switch.  If it stays 32-bit then it will not 
> switch over the address space.  Note that without xen in play uboot 
> would normally replace the address in the device tree with the 36-bit one.

IIUC, when Linux boots directly, the Device-Tree will not 
describe the low memory range. Is that correct?

> 4. The dom0 kernel will boot from xen if the early_paging_init switch 
> step is skipped, and the low mem stays in 32-bit....but there is a 
> problem with the peripherals so this is not an acceptable solution.

Can you give a bit more detail on the problem with the peripherals?

> 
> It seems that either the kernel would need some API to tell xen that 
> there is going to be a change in the memory its using prior to call 
> early_paging_init(), 

From my understanding, the problem is very specific to the Keystone. So 
I would rather avoid introducing a hypercall specific to your 
platform. But...

> or Xen would need to add the additional 36-bit 
> addresses during the memory bank allocation step....but recognize that 
> they are not actually different physical memory but just aliased to a 
> different address.

... I think it is possible to fix it entirely in Xen without any 
modification to the device-tree.

It seems better for Xen to treat the low memory region as "not usable" 
and only use the high memory region internally. When allocating Dom0 
memory banks, it would need to ensure that there is a corresponding 
alias in low memory.

Xen will also need to create two mappings in the Dom0 stage-2 page-tables. 
The extra one is for the alias.

This approach will prevent the use of hypercall buffers from low memory and 
therefore requires your guest to support LPAE. Is that going to be an issue 
for you?
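
To make the double mapping concrete, here is a toy model in C. It is a sketch under stated assumptions, not Xen code: `s2_table`, `s2_map`, `map_bank_with_alias` and `s2_translate` are hypothetical names standing in for the real stage-2 (p2m) machinery, and the alias offset is the 0x7_8000_0000 value quoted in this thread.

```c
#include <stddef.h>
#include <stdint.h>

#define ALIAS_OFFSET 0x780000000ULL /* low alias -> high address, per the thread */
#define MAX_MAPPINGS 8

/* Toy stand-in for a stage-2 (IPA -> PA) table; not Xen's p2m API. */
struct s2_mapping { uint64_t ipa, pa, size; };
struct s2_table   { struct s2_mapping map[MAX_MAPPINGS]; size_t count; };

/* Record one contiguous IPA -> PA mapping; returns -1 if the table is full. */
static int s2_map(struct s2_table *t, uint64_t ipa, uint64_t pa, uint64_t size)
{
    if (t->count == MAX_MAPPINGS)
        return -1;
    t->map[t->count++] = (struct s2_mapping){ ipa, pa, size };
    return 0;
}

/* Map one Dom0 bank twice: at its high address and at its low alias,
 * both backed by the same physical memory. */
static int map_bank_with_alias(struct s2_table *t, uint64_t high_pa,
                               uint64_t size)
{
    if (s2_map(t, high_pa, high_pa, size))
        return -1;
    return s2_map(t, high_pa - ALIAS_OFFSET, high_pa, size);
}

/* Translate an IPA; returns 0 and fills *pa, or -1 for a stage-2 fault. */
static int s2_translate(const struct s2_table *t, uint64_t ipa, uint64_t *pa)
{
    for (size_t i = 0; i < t->count; i++) {
        const struct s2_mapping *m = &t->map[i];

        if (ipa >= m->ipa && ipa - m->ipa < m->size) {
            *pa = m->pa + (ipa - m->ipa);
            return 0;
        }
    }
    return -1;
}
```

In this model, a 1024MB bank at 0x8_2000_0000 is reachable both at 0xA000_0000 (before the kernel switches) and at 0x8_2000_0000 (after it switches), so neither the early code nor the relocated kernel would fault.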

Cheers,

-- 
Julien Grall



* Re: Keystone Issue
  2020-06-03 11:32       ` Julien Grall
@ 2020-06-03 17:13         ` CodeWiz2280
  2020-06-03 18:09           ` Julien Grall
  0 siblings, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-03 17:13 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, Stefano Stabellini, Bertrand Marquis

Hi Julien,

The offset is already applied to the memory nodes in the device tree,
meaning a direct Linux boot from U-Boot would have only the 36-bit addresses
in the device tree (0x8_0000_0000 and 0x8_8000_0000).  Linux would, however,
start executing from a 32-bit address space and then have
early_paging_init() switch over to the aliased 36-bit addresses in the
device tree, as discussed earlier in this thread.

I had to add the 32-bit memory node 0x8000_0000 in U-Boot in place of the
0x8_0000_0000 node; otherwise Xen would detect the 32-bit processor and
panic on "Unable to detect the first memory bank" in domain_build.c.  If I
leave only the 36-bit addresses in the device tree and skip past the panic
check in domain_build.c, the dom0 kernel does not boot at all.  I believe
I only saw "Serial input to DOM0" and nothing else at that point.

Yes, leaving LPAE support on for the kernel is preferred.

Thank you for your help in this matter.

Respectfully,
Dave


* Re: Keystone Issue
  2020-06-03 17:13         ` CodeWiz2280
@ 2020-06-03 18:09           ` Julien Grall
  2020-06-03 18:37             ` CodeWiz2280
  2020-06-04  8:02             ` Bertrand Marquis
  0 siblings, 2 replies; 55+ messages in thread
From: Julien Grall @ 2020-06-03 18:09 UTC (permalink / raw)
  To: CodeWiz2280; +Cc: xen-devel, Stefano Stabellini, Bertrand Marquis



On 03/06/2020 18:13, CodeWiz2280 wrote:
> Hi Julien,

Hello,

In general, we avoid top-posting on xen-devel; instead we reply inline. I 
believe gmail should allow you to do it :).

> 
> The offset is already applied to the memory nodes in the device tree, 
> meaning a direct Linux boot from uboot would have only the 36-bit 
> addresses in the device tree (0x8_0000_0000 and 0x8_8000_0000).  Linux 
> would start executing from a 32-bit address space however and then 
> switch over to the aliased 36-bit addresses in the device tree as 
> discussed below by early_paging_init().
> 
> I had to add the 32-bit memory node 0x8000_0000 in uboot in place of the 
> 0x8_0000_0000 node otherwise Xen would detect the 32-bit processor and 
> panic on "Unable to detect the first memory bank" in domain_build.c. 

So 32-bit Xen requires the first bank to be below 4GB. This is 
because you can't boot from a physical address above 32-bit.

Obviously, this check wouldn't work on your platform because all your 
memory will be above 4GB.
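
As a simplified illustration of that constraint (not Xen's actual code from domain_build.c), the check amounts to looking for a bank that starts below 4GB. The `membank` struct and the function name here are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One memory bank as described by the device tree (illustrative type). */
struct membank { uint64_t start, size; };

/* Return the index of the first bank starting below 4GB, or -1 if none.
 * A simplified illustration of the constraint, not Xen's implementation. */
static int first_bank_below_4g(const struct membank *banks, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (banks[i].start < (1ULL << 32))
            return (int)i;
    return -1;
}
```

With only the 36-bit nodes 0x8_0000_0000 and 0x8_8000_0000, such a search finds nothing; replacing the first node with 0x8000_0000 makes it succeed.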

> If 
> I leave only the 36-bit addresses in the device tree and skip past the 
> panic check in domain_build.c, then I could not get the dom0 kernel to 
> boot at all.  I believe I would only see "Serial input to DOM0" and 
> nothing else at that point.

Which would make sense per above.

> 
> Yes, leaving LPAE support on for the kernel is preferred.

Ok, so the solution I suggested below should work. Unfortunately, I 
don't have time to work on it. However, I would be more than happy to 
answer questions and review the patches.

Would you be willing to try implementing it?

Cheers,

-- 
Julien Grall



* Re: Keystone Issue
  2020-06-03 18:09           ` Julien Grall
@ 2020-06-03 18:37             ` CodeWiz2280
  2020-06-04  8:02             ` Bertrand Marquis
  1 sibling, 0 replies; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-03 18:37 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, Stefano Stabellini, Bertrand Marquis

On Wed, Jun 3, 2020 at 2:09 PM Julien Grall <julien@xen.org> wrote:
>
>
>
> On 03/06/2020 18:13, CodeWiz2280 wrote:
> > Hi Julien,
>
> Hello,
>
> In general, we avoid top post on xen-devel, instead we reply inline. I
> believe gmail should allow you to do it :).
>
I'm sorry about that.  Hopefully this looks right now.
> >
> > The offset is already applied to the memory nodes in the device tree,
> > meaning a direct Linux boot from uboot would have only the 36-bit
> > addresses in the device tree (0x8_0000_0000 and 0x8_8000_0000).  Linux
> > would start executing from a 32-bit address space however and then
> > switch over to the aliased 36-bit addresses in the device tree as
> > discussed below by early_paging_init().
> >
> > I had to add the 32-bit memory node 0x8000_0000 in uboot in place of the
> > 0x8_0000_0000 node otherwise Xen would detect the 32-bit processor and
> > panic on "Unable to detect the first memory bank" in domain_build.c.
>
> So for 32-bit Xen requires to have the first bank below 4GB. This is
> because you can't boot from a physical address above 32-bit.
>
> Obviously, this check wouldn't work on your platform because all your
> memory will be above 4GB.
>
> > If
> > I leave only the 36-bit addresses in the device tree and skip past the
> > panic check in domain_build.c, then I could not get the dom0 kernel to
> > boot at all.  I believe I would only see "Serial input to DOM0" and
> > nothing else at that point.
>
> Which would make sense per above.
>
> >
> > Yes, leaving LPAE support on for the kernel is preferred.
>
> Ok, so the solution I suggested below should work. Unfortunately, I
> don't have time to work on it. Although, I would be more than happy to
> answers questions and reviewing the patches.
>
> Would you be willing to have a try to implement it?
>
Unfortunately, I am not familiar enough with the Xen codebase to
attempt to make the changes.  Thank you for your support and insight.

> Cheers,
>
> --
> Julien Grall



* Re: Keystone Issue
  2020-06-03 18:09           ` Julien Grall
  2020-06-03 18:37             ` CodeWiz2280
@ 2020-06-04  8:02             ` Bertrand Marquis
  2020-06-04  8:59               ` Julien Grall
  1 sibling, 1 reply; 55+ messages in thread
From: Bertrand Marquis @ 2020-06-04  8:02 UTC (permalink / raw)
  To: CodeWiz2280; +Cc: xen-devel, nd, Stefano Stabellini, Julien Grall

Hi,

> On 3 Jun 2020, at 19:09, Julien Grall <julien@xen.org> wrote:
> 
> 
> 
> On 03/06/2020 18:13, CodeWiz2280 wrote:
>> Hi Julien,
> 
> Hello,
> 
> In general, we avoid top post on xen-devel, instead we reply inline. I believe gmail should allow you to do it :).
> 
>> The offset is already applied to the memory nodes in the device tree, meaning a direct Linux boot from uboot would have only the 36-bit addresses in the device tree (0x8_0000_0000 and 0x8_8000_0000).  Linux would start executing from a 32-bit address space however and then switch over to the aliased 36-bit addresses in the device tree as discussed below by early_paging_init().
>> I had to add the 32-bit memory node 0x8000_0000 in uboot in place of the 0x8_0000_0000 node otherwise Xen would detect the 32-bit processor and panic on "Unable to detect the first memory bank" in domain_build.c. 
> 
> So for 32-bit Xen requires to have the first bank below 4GB. This is because you can't boot from a physical address above 32-bit.
> 
> Obviously, this check wouldn't work on your platform because all your memory will be above 4GB.

I think that the Keystone board has its low memory accessible at two different addresses (one low and one high).

I would suggest having a dtb with two regions (one under 4GB and one over) and removing from the region over 4GB the area already addressed by the region under 4GB.

Does that make sense?
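
The region arithmetic behind this suggestion can be sketched in C. This is a rough illustration only: it assumes the low region aliases the bottom of the high region at the fixed 0x7_8000_0000 offset mentioned earlier in the thread, and the `region` type and `high_remainder` function are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* A memory region as it would appear in a dtb memory node (illustrative). */
struct region { uint64_t start, size; };

/* Return the part of the high region that is not already reachable
 * through the low alias. Assumes the alias starts at the base of the
 * high region; the result has size 0 if nothing is left over. */
static struct region high_remainder(struct region low, struct region high,
                                    uint64_t alias_offset)
{
    uint64_t alias_start = low.start + alias_offset;
    uint64_t alias_end   = alias_start + low.size;  /* exclusive */
    uint64_t high_end    = high.start + high.size;  /* exclusive */
    struct region r = { 0, 0 };

    assert(alias_start == high.start);
    if (alias_end < high_end) {
        r.start = alias_end;
        r.size  = high_end - alias_end;
    }
    return r;
}
```

For a 2GB low region at 0x8000_0000 and a 4GB high region at 0x8_0000_0000, the remainder is 2GB starting at 0x8_8000_0000, so the dtb would describe 0x8000_0000 and 0x8_8000_0000.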

Cheers
Bertrand

> 
>> If I leave only the 36-bit addresses in the device tree and skip past the panic check in domain_build.c, then I could not get the dom0 kernel to boot at all.  I believe I would only see "Serial input to DOM0" and nothing else at that point.
> 
> Which would make sense per above.
> 
>> Yes, leaving LPAE support on for the kernel is preferred.
> 
> Ok, so the solution I suggested below should work. Unfortunately, I don't have time to work on it. Although, I would be more than happy to answers questions and reviewing the patches.
> 
> Would you be willing to have a try to implement it?
> 
> Cheers,
> 
> -- 
> Julien Grall




* Re: Keystone Issue
  2020-06-04  8:02             ` Bertrand Marquis
@ 2020-06-04  8:59               ` Julien Grall
  2020-06-04  9:08                 ` Bertrand Marquis
  0 siblings, 1 reply; 55+ messages in thread
From: Julien Grall @ 2020-06-04  8:59 UTC (permalink / raw)
  To: Bertrand Marquis, CodeWiz2280; +Cc: xen-devel, nd, Stefano Stabellini



On 04/06/2020 09:02, Bertrand Marquis wrote:
> Hi,

Hi Bertrand,

> 
>> On 3 Jun 2020, at 19:09, Julien Grall <julien@xen.org> wrote:
>>
>>
>>
>> On 03/06/2020 18:13, CodeWiz2280 wrote:
>>> Hi Julien,
>>
>> Hello,
>>
>> In general, we avoid top post on xen-devel, instead we reply inline. I believe gmail should allow you to do it :).
>>
>>> The offset is already applied to the memory nodes in the device tree, meaning a direct Linux boot from uboot would have only the 36-bit addresses in the device tree (0x8_0000_0000 and 0x8_8000_0000).  Linux would start executing from a 32-bit address space however and then switch over to the aliased 36-bit addresses in the device tree as discussed below by early_paging_init().
>>> I had to add the 32-bit memory node 0x8000_0000 in uboot in place of the 0x8_0000_0000 node otherwise Xen would detect the 32-bit processor and panic on "Unable to detect the first memory bank" in domain_build.c.
>>
>> So for 32-bit Xen requires to have the first bank below 4GB. This is because you can't boot from a physical address above 32-bit.
>>
>> Obviously, this check wouldn't work on your platform because all your memory will be above 4GB.
> 
> I think that the Keystone board has low memory accessible at 2 different address (one low and one high).
> 
> I would here suggest to have a dtb with 2 regions (one under 4GB and one over) and remove from the region over 4G the area already addressed by the region under 4GB.

I thought about this. However, in an earlier reply, David wrote:

"4. The dom0 kernel will boot from xen if the early_paging_init switch 
step is skipped, and the low mem stays in 32-bit....but there is a
problem with the peripherals so this is not an acceptable solution."

It is not clear to me what sort of issues will arise with the 
peripherals. But I have assumed that it wouldn't be possible for Dom0 to 
keep using the memory below 4GB.

Cheers,

-- 
Julien Grall



* Re: Keystone Issue
  2020-06-04  8:59               ` Julien Grall
@ 2020-06-04  9:08                 ` Bertrand Marquis
  2020-06-04 10:15                   ` Julien Grall
  0 siblings, 1 reply; 55+ messages in thread
From: Bertrand Marquis @ 2020-06-04  9:08 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, nd, Stefano Stabellini, CodeWiz2280



> On 4 Jun 2020, at 09:59, Julien Grall <julien@xen.org> wrote:
> 
> 
> 
> On 04/06/2020 09:02, Bertrand Marquis wrote:
>> Hi,
> 
> Hi Bertrand,
> 
>>> On 3 Jun 2020, at 19:09, Julien Grall <julien@xen.org> wrote:
>>> 
>>> 
>>> 
>>> On 03/06/2020 18:13, CodeWiz2280 wrote:
>>>> Hi Julien,
>>> 
>>> Hello,
>>> 
>>> In general, we avoid top post on xen-devel, instead we reply inline. I believe gmail should allow you to do it :).
>>> 
>>>> The offset is already applied to the memory nodes in the device tree, meaning a direct Linux boot from uboot would have only the 36-bit addresses in the device tree (0x8_0000_0000 and 0x8_8000_0000).  Linux would start executing from a 32-bit address space however and then switch over to the aliased 36-bit addresses in the device tree as discussed below by early_paging_init().
>>>> I had to add the 32-bit memory node 0x8000_0000 in uboot in place of the 0x8_0000_0000 node otherwise Xen would detect the 32-bit processor and panic on "Unable to detect the first memory bank" in domain_build.c.
>>> 
>>> So for 32-bit Xen requires to have the first bank below 4GB. This is because you can't boot from a physical address above 32-bit.
>>> 
>>> Obviously, this check wouldn't work on your platform because all your memory will be above 4GB.
>> I think that the Keystone board has low memory accessible at 2 different address (one low and one high).
>> I would here suggest to have a dtb with 2 regions (one under 4GB and one over) and remove from the region over 4G the area already addressed by the region under 4GB.
> 
> I thought about this. However, in an earlier reply, David wrote:
> 
> "4. The dom0 kernel will boot from xen if the early_paging_init switch step is skipped, and the low mem stays in 32-bit....but there is a
> problem with the peripherals so this is not an acceptable solution."
> 
> It is not clear to me what sort of issues will arise with the peripherals. But I have assumed that it wouldn't be possible for Dom0 to keep using the memory below 4GB.

I would have thought that Linux would need some memory, even a small amount, in the 32-bit space in order to boot.
I could understand that some memory in the low address space needs to be reserved by Linux as a DMA area for peripherals not supporting 36-bit addresses, but excluding the whole low memory sounds like a big restriction.

Would it be possible to have a bit more information on the “problem with peripherals” here ?

Cheers
Bertrand



* Re: Keystone Issue
  2020-06-04  9:08                 ` Bertrand Marquis
@ 2020-06-04 10:15                   ` Julien Grall
  2020-06-04 12:07                     ` CodeWiz2280
  0 siblings, 1 reply; 55+ messages in thread
From: Julien Grall @ 2020-06-04 10:15 UTC (permalink / raw)
  To: Bertrand Marquis; +Cc: xen-devel, nd, Stefano Stabellini, CodeWiz2280

Hi,

On 04/06/2020 10:08, Bertrand Marquis wrote:
> I would have thought that linux would have need some memory, even small in the 32bit space in order to boot.

Yes, it needs some, but then it switches to the high memory alias
after the MMU has been switched on.

From my understanding, the only difference is that the page-tables will
point to the high memory alias address rather than the low memory one.
Linux will still be located at the same place but now accessed from the
high memory alias rather than the low one.

Note that AFAICT the secondary CPUs will still be brought up using the
low memory alias.

> I could understand that some memory in the low address space needs to be reserved by Linux as DMA area for peripherals not supporting 36-bit addresses, but the whole low memory sounds like a big restriction.
Many platforms have devices only supporting 32-bit DMA, but none of them
require such aliasing. So this doesn't look to be the issue here.

TBH, this code is only used by Keystone and switching address space is
expensive (you have to turn off the MMU, update the page-tables, flush the
cache...). I find it hard to believe a developer would have come up with
this complexity if it were possible to always use the low memory address
range. It is even harder to believe the Linux community would have
accepted it.

> 
> Would it be possible to have a bit more information on the “problem with peripherals” here ?

I am curious as well, so I looked in more depth :). Going through the
Linux history, one of the commit messages [1] suggests they are switching
to a coherent address space. The datasheet [2] (page 75) also confirms
that the low region is not IO coherent.

So I think you would not be able to do DMA without flushing the cache,
which can be pretty expensive. For a PoC, it might be possible to force
Linux to flush the area before and after each DMA request. This should be
possible by marking the devices as not coherent.

Although, I am not entirely sure if there is any fallout.

@Dave, do you think it is possible for you to have a try? I can provide 
the patch for Linux to disable DMA coherency if possible.

For a proper solution, I think we need to implement something similar to 
what I wrote earlier.

Cheers,

[1] 5eb3da7246a5b2dfac9f38a7be62b1a0295584c7
[2] https://www.ti.com/lit/ds/symlink/tci6638k2k.pdf?ts=1591183242813


-- 
Julien Grall



* Re: Keystone Issue
  2020-06-04 10:15                   ` Julien Grall
@ 2020-06-04 12:07                     ` CodeWiz2280
  2020-06-04 18:24                       ` Julien Grall
  0 siblings, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-04 12:07 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, nd, Stefano Stabellini, Bertrand Marquis

On Thu, Jun 4, 2020 at 6:16 AM Julien Grall <julien@xen.org> wrote:
>
> Hi,
>
> On 04/06/2020 10:08, Bertrand Marquis wrote:
> > I would have thought that linux would have need some memory, even small in the 32bit space in order to boot.
>
> Yes it needs some, but then they are switching to use the high memory
> alias after the MMU has been switch on.
>
>  From my understanding, the only difference is the page-tables will
> point to the high memory alias address rather than the low memory one.
> Linux will still be located at the same place but now accessed from the
> high memory alias rather than the low one.
>
> Note that AFAICT the secondary CPUs will still be brought-up using the
> low memory alias.
>
> > I could understand that some memory in the low address space needs to be reserved by Linux as DMA area for peripherals not supporting 36-bit addresses, but the whole low memory sounds like a big restriction.
> Many platforms have devices only supporting 32-bit DMA, but none of them
> require such aliasing. So this doesn't look to be the issue here.
>
> TBH, this code is only used by Keystone and switching address space is
> expensive (you have to turn off the MMU, updates page-tables, flush the
> cache...). I find hard to believe a developper would have come up with
> this complexity if it were possible to always use the low memory address
> range. It is even harder to believe Linux community would have accepted it.
>
> >
> > Would it be possible to have a bit more information on the “problem with peripherals” here ?
>
> I am curious as well, so I looked in more depth :). Going through the
> Linux history, one of the commit message [1] suggests they are switching
> to a coherent address space. The datasheet [2] (page 75) also confirm
> that the low region is not IO coherent.
>
> So I think you would not be able to do DMA without flush the cache which
> can be pretty expensive. For a PoC, it might be possible to force Linux
> flushing the area before and after each DMA request. This should be
> possible by marking the devices as not coherent.
>
> Although, I am not entirely sure if there is any fallout.
>
> @Dave, do you think it is possible for you to have a try? I can provide
> the patch for Linux to disable DMA coherency if possible.
I attempted that by removing the "dma-coherent" flags from the device
tree.  There are likely other issues, but the most glaring problem that
I ran into is that the Ethernet does not work.  eth0 shows up in
ifconfig but there is no activity on it after a small handful of
message exchanges, whereas when booting without Xen it seems to work
fine even if left in 32-bit mode (with dma-coherent disabled).  I also
don't know what the behind-the-scenes implications are of staying in
the lower 0x8000_0000 alias range.  I would rather run it as intended
by switching to the upper
0x8_0000_0000 alias region.
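
For reference, the edit amounts to deleting every "dma-coherent;" property line from the device tree source. A minimal Python sketch of that follows, assuming a plain-text .dts/.dtsi where the property sits on its own line. The node in the sample and the `strip_dma_coherent` helper name are hypothetical.

```python
import re

# Matches a whole line that holds only the boolean "dma-coherent;" property.
_DMA_COHERENT_LINE = re.compile(r"^[ \t]*dma-coherent;[ \t]*\n?", re.MULTILINE)


def strip_dma_coherent(dts_text):
    """Return (new_text, count) with every 'dma-coherent;' line removed."""
    return _DMA_COHERENT_LINE.subn("", dts_text)


if __name__ == "__main__":
    # Hypothetical node, only here to show the effect of the substitution.
    sample = (
        "example@0 {\n"
        "\tdma-coherent;\n"
        "\tranges;\n"
        "};\n"
    )
    new_text, removed = strip_dma_coherent(sample)
    print(f"removed {removed} propert{'y' if removed == 1 else 'ies'}")
    print(new_text, end="")
```

The returned count makes it easy to check that every node that was marked coherent has actually been touched before rebuilding the dtb.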

>
> For a proper solution, I think we need to implement something similar to
> what I wrote earlier.
>
> Cheers,
>
> [1] 5eb3da7246a5b2dfac9f38a7be62b1a0295584c7
> [2] https://www.ti.com/lit/ds/symlink/tci6638k2k.pdf?ts=1591183242813
>
>
> --
> Julien Grall



* Re: Keystone Issue
  2020-06-04 12:07                     ` CodeWiz2280
@ 2020-06-04 18:24                       ` Julien Grall
  2020-06-05  2:29                         ` CodeWiz2280
  0 siblings, 1 reply; 55+ messages in thread
From: Julien Grall @ 2020-06-04 18:24 UTC (permalink / raw)
  To: CodeWiz2280; +Cc: xen-devel, nd, Stefano Stabellini, Bertrand Marquis

Hi,

On 04/06/2020 13:07, CodeWiz2280 wrote:
> On Thu, Jun 4, 2020 at 6:16 AM Julien Grall <julien@xen.org> wrote:
>>
>> Hi,
>>
>> On 04/06/2020 10:08, Bertrand Marquis wrote:
>>> I would have thought that linux would have need some memory, even small in the 32bit space in order to boot.
>>
>> Yes it needs some, but then they are switching to use the high memory
>> alias after the MMU has been switch on.
>>
>>   From my understanding, the only difference is the page-tables will
>> point to the high memory alias address rather than the low memory one.
>> Linux will still be located at the same place but now accessed from the
>> high memory alias rather than the low one.
>>
>> Note that AFAICT the secondary CPUs will still be brought-up using the
>> low memory alias.
>>
>>> I could understand that some memory in the low address space needs to be reserved by Linux as DMA area for peripherals not supporting 36-bit addresses, but the whole low memory sounds like a big restriction.
>> Many platforms have devices only supporting 32-bit DMA, but none of them
>> require such aliasing. So this doesn't look to be the issue here.
>>
>> TBH, this code is only used by Keystone and switching address space is
>> expensive (you have to turn off the MMU, updates page-tables, flush the
>> cache...). I find hard to believe a developper would have come up with
>> this complexity if it were possible to always use the low memory address
>> range. It is even harder to believe Linux community would have accepted it.
>>
>>>
>>> Would it be possible to have a bit more information on the “problem with peripherals” here ?
>>
>> I am curious as well, so I looked in more depth :). Going through the
>> Linux history, one of the commit message [1] suggests they are switching
>> to a coherent address space. The datasheet [2] (page 75) also confirm
>> that the low region is not IO coherent.
>>
>> So I think you would not be able to do DMA without flush the cache which
>> can be pretty expensive. For a PoC, it might be possible to force Linux
>> flushing the area before and after each DMA request. This should be
>> possible by marking the devices as not coherent.
>>
>> Although, I am not entirely sure if there is any fallout.
>>
>> @Dave, do you think it is possible for you to have a try? I can provide
>> the patch for Linux to disable DMA coherency if possible.
> I attempted to do that, where I removed the "dma-coherent" flags from
> the device tree.  There are likely other issues, but the most glaring
> problem that I ran into is that the ethernet does not work.  Eth0
> shows up in ifconfig but there is no activity on it after a small
> handful of message exchanges, whereas booting without Xen it seems to
> work fine even if left in 32-bit mode (with the dma-coherent
> disabled).  I don't know what implications behind the scenes there are
> trying to stay in the lower 0x8000_0000 alias range either though. 

Thank you for the answer. As you wrote, Linux works fine without Xen
when left in 32-bit mode with dma-coherent disabled. So this suggests a
different issue on the platform.

Given that you receive a handful of packets and then nothing, this
points to a possible interrupt problem. Can you check whether the number
of interrupts increments the same way on baremetal and on Xen?

Dumping /proc/interrupts should be sufficient.
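
If comparing the dumps by eye is tedious, a small script can do it. The following is a minimal Python sketch, assuming the usual /proc/interrupts layout (a header line naming the CPUs, then one "label: per-CPU counts description" line per interrupt). The function names and the two snapshot file arguments are illustrative only.

```python
import sys


def parse_interrupts(text):
    """Map each IRQ label to (total count across CPUs, description)."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())          # header: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        desc = " ".join(fields[len(counts):])
        table[label.strip()] = (sum(counts), desc)
    return table


def diff_interrupts(before, after):
    """Return {label: (delta, description)} for IRQs whose count changed."""
    changed = {}
    for label, (count, desc) in after.items():
        delta = count - before.get(label, (0, ""))[0]
        if delta:
            changed[label] = (delta, desc)
    return changed


if __name__ == "__main__":
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as f:
            first = parse_interrupts(f.read())
        with open(sys.argv[2]) as f:
            second = parse_interrupts(f.read())
        for label, (delta, desc) in sorted(diff_interrupts(first, second).items()):
            print(f"{label:>6} {delta:+d}  {desc}")
    else:
        print("usage: irqdiff.py BEFORE_SNAPSHOT AFTER_SNAPSHOT")
```

The idea would be to save /proc/interrupts to a file, generate some network traffic, save it again, and run the script on the two files, once on baremetal and once in dom0. Interrupts that move on baremetal but stay flat under Xen are the ones to look at.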

> I
> would rather run it as intended by switching to the upper
> 0x8_0000_0000 alias region.

I agree this would be ideal :).

Cheers,

-- 
Julien Grall



* Re: Keystone Issue
  2020-06-04 18:24                       ` Julien Grall
@ 2020-06-05  2:29                         ` CodeWiz2280
  2020-06-05  7:36                           ` Bertrand Marquis
  0 siblings, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-05  2:29 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, nd, Stefano Stabellini, Bertrand Marquis

On Thu, Jun 4, 2020 at 2:24 PM Julien Grall <julien@xen.org> wrote:
>
> Hi,
>
> On 04/06/2020 13:07, CodeWiz2280 wrote:
> > On Thu, Jun 4, 2020 at 6:16 AM Julien Grall <julien@xen.org> wrote:
> >>
> >> Hi,
> >>
> >> On 04/06/2020 10:08, Bertrand Marquis wrote:
> >>> I would have thought that linux would have need some memory, even small in the 32bit space in order to boot.
> >>
> >> Yes it needs some, but then they are switching to use the high memory
> >> alias after the MMU has been switch on.
> >>
> >>   From my understanding, the only difference is the page-tables will
> >> point to the high memory alias address rather than the low memory one.
> >> Linux will still be located at the same place but now accessed from the
> >> high memory alias rather than the low one.
> >>
> >> Note that AFAICT the secondary CPUs will still be brought-up using the
> >> low memory alias.
> >>
> >>> I could understand that some memory in the low address space needs to be reserved by Linux as DMA area for peripherals not supporting 36-bit addresses, but the whole low memory sounds like a big restriction.
> >> Many platforms have devices only supporting 32-bit DMA, but none of them
> >> require such aliasing. So this doesn't look to be the issue here.
> >>
> >> TBH, this code is only used by Keystone and switching address space is
> >> expensive (you have to turn off the MMU, updates page-tables, flush the
> >> cache...). I find hard to believe a developper would have come up with
> >> this complexity if it were possible to always use the low memory address
> >> range. It is even harder to believe Linux community would have accepted it.
> >>
> >>>
> >>> Would it be possible to have a bit more information on the “problem with peripherals” here ?
> >>
> >> I am curious as well, so I looked in more depth :). Going through the
> >> Linux history, one of the commit message [1] suggests they are switching
> >> to a coherent address space. The datasheet [2] (page 75) also confirm
> >> that the low region is not IO coherent.
> >>
> >> So I think you would not be able to do DMA without flush the cache which
> >> can be pretty expensive. For a PoC, it might be possible to force Linux
> >> flushing the area before and after each DMA request. This should be
> >> possible by marking the devices as not coherent.
> >>
> >> Although, I am not entirely sure if there is any fallout.
> >>
> >> @Dave, do you think it is possible for you to have a try? I can provide
> >> the patch for Linux to disable DMA coherency if possible.
> > I attempted to do that, where I removed the "dma-coherent" flags from
> > the device tree.  There are likely other issues, but the most glaring
> > problem that I ran into is that the ethernet does not work.  Eth0
> > shows up in ifconfig but there is no activity on it after a small
> > handful of message exchanges, whereas booting without Xen it seems to
> > work fine even if left in 32-bit mode (with the dma-coherent
> > disabled).  I don't know what implications behind the scenes there are
> > trying to stay in the lower 0x8000_0000 alias range either though.
>
> Thank you for the answer. As wrote, Linux is working fine in 32-bit mode
> when dma-coherent is left in 32-bit mode. So this suggest a different
> issue on the platform.
>
> Given that you receive an handful of packet and then nothing, this would
> lead to maybe an interrupt problem. Can you check whether the number of
> interrupts increments the same way on baremetal and on Xen?
>
> Dumping /proc/interrupts should be sufficient.
>
I am able to ping the board from itself; do you think it could still
be an interrupt issue?  It just cannot seem to ping out to a different
host (nor can I ping it from my PC).  Unfortunately, the interrupts for
the netcp Ethernet driver on this board don't show up in the
cat /proc/interrupts output under either the non-Xen kernel or the
Xen-loaded kernel, from what I can tell.  I'm not sure how I would
confirm that.

> > I
> > would rather run it as intended by switching to the upper
> > 0x8_0000_0000 alias region.
>
> I agree this would be ideal :).
>
> Cheers,
>
> --
> Julien Grall



* Re: Keystone Issue
  2020-06-05  2:29                         ` CodeWiz2280
@ 2020-06-05  7:36                           ` Bertrand Marquis
  2020-06-05 12:25                             ` CodeWiz2280
  0 siblings, 1 reply; 55+ messages in thread
From: Bertrand Marquis @ 2020-06-05  7:36 UTC (permalink / raw)
  To: CodeWiz2280; +Cc: xen-devel, nd, Stefano Stabellini, Julien Grall

Hi,

> On 5 Jun 2020, at 03:29, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> 
> On Thu, Jun 4, 2020 at 2:24 PM Julien Grall <julien@xen.org> wrote:
>> 
>> Hi,
>> 
>> On 04/06/2020 13:07, CodeWiz2280 wrote:
>>> On Thu, Jun 4, 2020 at 6:16 AM Julien Grall <julien@xen.org> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> On 04/06/2020 10:08, Bertrand Marquis wrote:
>>>>> I would have thought that linux would have need some memory, even small in the 32bit space in order to boot.
>>>> 
>>>> Yes it needs some, but then they are switching to use the high memory
>>>> alias after the MMU has been switch on.
>>>> 
>>>>  From my understanding, the only difference is the page-tables will
>>>> point to the high memory alias address rather than the low memory one.
>>>> Linux will still be located at the same place but now accessed from the
>>>> high memory alias rather than the low one.
>>>> 
>>>> Note that AFAICT the secondary CPUs will still be brought-up using the
>>>> low memory alias.
>>>> 
>>>>> I could understand that some memory in the low address space needs to be reserved by Linux as DMA area for peripherals not supporting 36-bit addresses, but the whole low memory sounds like a big restriction.
>>>> Many platforms have devices only supporting 32-bit DMA, but none of them
>>>> require such aliasing. So this doesn't look to be the issue here.
>>>> 
>>>> TBH, this code is only used by Keystone and switching address space is
>>>> expensive (you have to turn off the MMU, updates page-tables, flush the
>>>> cache...). I find hard to believe a developper would have come up with
>>>> this complexity if it were possible to always use the low memory address
>>>> range. It is even harder to believe Linux community would have accepted it.
>>>> 
>>>>> 
>>>>> Would it be possible to have a bit more information on the “problem with peripherals” here ?
>>>> 
>>>> I am curious as well, so I looked in more depth :). Going through the
>>>> Linux history, one of the commit message [1] suggests they are switching
>>>> to a coherent address space. The datasheet [2] (page 75) also confirm
>>>> that the low region is not IO coherent.
>>>> 
>>>> So I think you would not be able to do DMA without flush the cache which
>>>> can be pretty expensive. For a PoC, it might be possible to force Linux
>>>> flushing the area before and after each DMA request. This should be
>>>> possible by marking the devices as not coherent.
>>>> 
>>>> Although, I am not entirely sure if there is any fallout.
>>>> 
>>>> @Dave, do you think it is possible for you to have a try? I can provide
>>>> the patch for Linux to disable DMA coherency if possible.
>>> I attempted to do that, where I removed the "dma-coherent" flags from
>>> the device tree.  There are likely other issues, but the most glaring
>>> problem that I ran into is that the ethernet does not work.  Eth0
>>> shows up in ifconfig but there is no activity on it after a small
>>> handful of message exchanges, whereas booting without Xen it seems to
>>> work fine even if left in 32-bit mode (with the dma-coherent
>>> disabled).  I don't know what implications behind the scenes there are
>>> trying to stay in the lower 0x8000_0000 alias range either though.
>> 
>> Thank you for the answer. As wrote, Linux is working fine in 32-bit mode
>> when dma-coherent is left in 32-bit mode. So this suggest a different
>> issue on the platform.
>> 
>> Given that you receive an handful of packet and then nothing, this would
>> lead to maybe an interrupt problem. Can you check whether the number of
>> interrupts increments the same way on baremetal and on Xen?
>> 
>> Dumping /proc/interrupts should be sufficient.
>> 
> I am able to ping the board from itself, do you think it could still
> be an interrupt issue?  It just cannot seem to ping out to a different
> host (or ping from
> my pc).  Unfortunately, the interrupts for the netcp Ethernet driver
> on this board don't show up in the cat /proc/interrupts output under
> the non-Xen kernel or
> Xen loaded kernel from what I can tell.  I'm not sure how I would confirm that.

Could you check the content of /proc/interrupts ?

I did raise an issue several years ago on the Keystone 2 related to interrupts and virtualization (not with Xen, but the context should still be relevant):
https://e2e.ti.com/support/processors/f/791/t/462126?Keystone-2-no-interrupts-received-out-of-80-and-92-

There might be something to check with regard to level vs edge interrupts for forwarded interrupts.
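
To check the trigger type, the "interrupts" properties of the device tree can be decoded by hand. Here is a minimal Python sketch of that decoding, assuming the standard 3-cell ARM GIC binding (type, number, flags), where the low 4 bits of the flags cell give the trigger type and level. The specifier values in the example are hypothetical, not taken from the Keystone dtsi.

```python
# Trigger flags from the ARM GIC device tree binding (low 4 bits of cell 3).
TRIGGERS = {
    1: "edge, low-to-high",
    2: "edge, high-to-low",
    4: "level, active high",
    8: "level, active low",
}


def decode_gic_interrupt(cells):
    """Decode a 3-cell GIC specifier <type number flags>.

    Returns (kind, hardware interrupt ID, trigger description).
    """
    irq_type, number, flags = cells
    if irq_type == 0:
        kind, hwirq = "SPI", number + 32   # SPIs start at interrupt ID 32
    elif irq_type == 1:
        kind, hwirq = "PPI", number + 16   # PPIs start at interrupt ID 16
    else:
        raise ValueError(f"unknown interrupt type {irq_type}")
    trigger = TRIGGERS.get(flags & 0xF, "unknown")
    return kind, hwirq, trigger


if __name__ == "__main__":
    # Hypothetical specifiers, only to show the decoding.
    for spec in [(0, 40, 1), (0, 41, 4), (1, 13, 8)]:
        kind, hwirq, trigger = decode_gic_interrupt(spec)
        print(f"<{spec[0]} {spec[1]} {spec[2]:#x}> -> {kind} {hwirq}, {trigger}")
```

Comparing the decoded trigger with what the hypervisor configures for the same interrupt should show whether a level/edge mismatch is in play.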

Regards
Bertrand



> 
>>> I
>>> would rather run it as intended by switching to the upper
>>> 0x8_0000_0000 alias region.
>> 
>> I agree this would be ideal :).
>> 
>> Cheers,
>> 
>> --
>> Julien Grall



* Re: Keystone Issue
  2020-06-05  7:36                           ` Bertrand Marquis
@ 2020-06-05 12:25                             ` CodeWiz2280
  2020-06-05 12:30                               ` Julien Grall
  0 siblings, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-05 12:25 UTC (permalink / raw)
  To: Bertrand Marquis; +Cc: xen-devel, nd, Stefano Stabellini, Julien Grall

On Fri, Jun 5, 2020 at 3:37 AM Bertrand Marquis
<Bertrand.Marquis@arm.com> wrote:
>
> Hi,
>
> > On 5 Jun 2020, at 03:29, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> >
> > On Thu, Jun 4, 2020 at 2:24 PM Julien Grall <julien@xen.org> wrote:
> >>
> >> Hi,
> >>
> >> On 04/06/2020 13:07, CodeWiz2280 wrote:
> >>> On Thu, Jun 4, 2020 at 6:16 AM Julien Grall <julien@xen.org> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> On 04/06/2020 10:08, Bertrand Marquis wrote:
> >>>>> I would have thought that linux would have need some memory, even small in the 32bit space in order to boot.
> >>>>
> >>>> Yes it needs some, but then they are switching to use the high memory
> >>>> alias after the MMU has been switch on.
> >>>>
> >>>>  From my understanding, the only difference is the page-tables will
> >>>> point to the high memory alias address rather than the low memory one.
> >>>> Linux will still be located at the same place but now accessed from the
> >>>> high memory alias rather than the low one.
> >>>>
> >>>> Note that AFAICT the secondary CPUs will still be brought-up using the
> >>>> low memory alias.
> >>>>
> >>>>> I could understand that some memory in the low address space needs to be reserved by Linux as DMA area for peripherals not supporting 36-bit addresses, but the whole low memory sounds like a big restriction.
> >>>> Many platforms have devices only supporting 32-bit DMA, but none of them
> >>>> require such aliasing. So this doesn't look to be the issue here.
> >>>>
> >>>> TBH, this code is only used by Keystone and switching address space is
> >>>> expensive (you have to turn off the MMU, updates page-tables, flush the
> >>>> cache...). I find hard to believe a developper would have come up with
> >>>> this complexity if it were possible to always use the low memory address
> >>>> range. It is even harder to believe Linux community would have accepted it.
> >>>>
> >>>>>
> >>>>> Would it be possible to have a bit more information on the “problem with peripherals” here ?
> >>>>
> >>>> I am curious as well, so I looked in more depth :). Going through the
> >>>> Linux history, one of the commit message [1] suggests they are switching
> >>>> to a coherent address space. The datasheet [2] (page 75) also confirm
> >>>> that the low region is not IO coherent.
> >>>>
> >>>> So I think you would not be able to do DMA without flush the cache which
> >>>> can be pretty expensive. For a PoC, it might be possible to force Linux
> >>>> flushing the area before and after each DMA request. This should be
> >>>> possible by marking the devices as not coherent.
> >>>>
> >>>> Although, I am not entirely sure if there is any fallout.
> >>>>
> >>>> @Dave, do you think it is possible for you to have a try? I can provide
> >>>> the patch for Linux to disable DMA coherency if possible.
> >>> I attempted to do that, where I removed the "dma-coherent" flags from
> >>> the device tree.  There are likely other issues, but the most glaring
> >>> problem that I ran into is that the ethernet does not work.  Eth0
> >>> shows up in ifconfig but there is no activity on it after a small
> >>> handful of message exchanges, whereas booting without Xen it seems to
> >>> work fine even if left in 32-bit mode (with the dma-coherent
> >>> disabled).  I don't know what implications behind the scenes there are
> >>> trying to stay in the lower 0x8000_0000 alias range either though.
> >>
> >> Thank you for the answer. As wrote, Linux is working fine in 32-bit mode
> >> when dma-coherent is left in 32-bit mode. So this suggest a different
> >> issue on the platform.
> >>
> >> Given that you receive an handful of packet and then nothing, this would
> >> lead to maybe an interrupt problem. Can you check whether the number of
> >> interrupts increments the same way on baremetal and on Xen?
> >>
> >> Dumping /proc/interrupts should be sufficient.
> >>
> > I am able to ping the board from itself, do you think it could still
> > be an interrupt issue?  It just cannot seem to ping out to a different
> > host (or ping from
> > my pc).  Unfortunately, the interrupts for the netcp Ethernet driver
> > on this board don't show up in the cat /proc/interrupts output under
> > the non-Xen kernel or
> > Xen loaded kernel from what I can tell.  I'm not sure how I would confirm that.
>
> Could you check the content of /proc/interrupts ?
>
> I did raise an issue several years ago on the keystone 2 related to interrupts and virtualization (no with Xen but the context should still be right):
> https://e2e.ti.com/support/processors/f/791/t/462126?Keystone-2-no-interrupts-received-out-of-80-and-92-
>
> There might be something to check in regards to level vs front interrupts for forwarded interrupts.
>
The Keystone uses the netcp driver, which has interrupts from 40-79
listed in the device tree (arch/arm/boot/keystone-k2e-netcp.dtsi).
I'm using the same device tree between my non-xen standalone kernel
and my dom0 kernel booted by xen.  In the standalone (non-xen) kernel
the ethernet works fine, but I don't see any of its interrupts in the
output of /proc/iomem.  I'm not seeing them in /proc/iomem when
running dom0 under Xen either.  When booting with Xen I get this
behavior where the ifconfig output shows 1 RX message and 1 TX
message, and then nothing else.

> Regards
> Bertrand
>
>
>
> >
> >>> I
> >>> would rather run it as intended by switching to the upper
> >>> 0x8_0000_0000 alias region.
> >>
> >> I agree this would be ideal :).
> >>
> >> Cheers,
> >>
> >> --
> >> Julien Grall
>



* Re: Keystone Issue
  2020-06-05 12:25                             ` CodeWiz2280
@ 2020-06-05 12:30                               ` Julien Grall
  2020-06-05 12:42                                 ` CodeWiz2280
  0 siblings, 1 reply; 55+ messages in thread
From: Julien Grall @ 2020-06-05 12:30 UTC (permalink / raw)
  To: CodeWiz2280, Bertrand Marquis; +Cc: xen-devel, nd, Stefano Stabellini

Hi,

On 05/06/2020 13:25, CodeWiz2280 wrote:
> The Keystone uses the netcp driver, which has interrupts from 40-79
> listed in the device tree (arch/arm/boot/keystone-k2e-netcp.dtsi).
> I'm using the same device tree between my non-xen standalone kernel
> and my dom0 kernel booted by xen.  In the standalone (non-xen) kernel
> the ethernet works fine, but I don't see any of its interrupts in the
> output of /proc/iomem.  I'm not seeing them in /proc/iomem when
> running dom0 under Xen either.  When booting with Xen I get this
> behavior where the ifconfig output shows 1 RX message and 1 TX
> message, and then nothing else.

I am not sure whether this is a typo in the e-mail. /proc/iomem lists
the MMIO regions. You want to use /proc/interrupts.

Can you confirm which path you are dumping?

Cheers,

-- 
Julien Grall



* Re: Keystone Issue
  2020-06-05 12:30                               ` Julien Grall
@ 2020-06-05 12:42                                 ` CodeWiz2280
  2020-06-05 12:47                                   ` Bertrand Marquis
  0 siblings, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-05 12:42 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, nd, Stefano Stabellini, Bertrand Marquis

On Fri, Jun 5, 2020 at 8:30 AM Julien Grall <julien@xen.org> wrote:
>
> Hi,
>
> On 05/06/2020 13:25, CodeWiz2280 wrote:
> > The Keystone uses the netcp driver, which has interrupts from 40-79
> > listed in the device tree (arch/arm/boot/keystone-k2e-netcp.dtsi).
> > I'm using the same device tree between my non-xen standalone kernel
> > and my dom0 kernel booted by xen.  In the standalone (non-xen) kernel
> > the ethernet works fine, but I don't see any of its interrupts in the
> > output of /proc/iomem.  I'm not seeing them in /proc/iomem when
> > running dom0 under Xen either.  When booting with Xen I get this
> > behavior where the ifconfig output shows 1 RX message and 1 TX
> > message, and then nothing else.
>
> I am not sure whether this is a typo in the e-mail. /proc/iomem is
> listing the list of the MMIO regions. You want to use /proc/interrupts.
>
> Can you confirm which path you are dumping?
Yes, that was a typo.  Sorry about that.  I meant that I am dumping
/proc/interrupts and do not see them under the non-xen kernel or the
xen-booted dom0.
>
> Cheers,
>
> --
> Julien Grall



* Re: Keystone Issue
  2020-06-05 12:42                                 ` CodeWiz2280
@ 2020-06-05 12:47                                   ` Bertrand Marquis
  2020-06-05 15:05                                     ` CodeWiz2280
  0 siblings, 1 reply; 55+ messages in thread
From: Bertrand Marquis @ 2020-06-05 12:47 UTC (permalink / raw)
  To: CodeWiz2280; +Cc: xen-devel, nd, Stefano Stabellini, Julien Grall



> On 5 Jun 2020, at 13:42, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> 
> On Fri, Jun 5, 2020 at 8:30 AM Julien Grall <julien@xen.org> wrote:
>> 
>> Hi,
>> 
>> On 05/06/2020 13:25, CodeWiz2280 wrote:
>>> The Keystone uses the netcp driver, which has interrupts from 40-79
>>> listed in the device tree (arch/arm/boot/keystone-k2e-netcp.dtsi).
>>> I'm using the same device tree between my non-xen standalone kernel
>>> and my dom0 kernel booted by xen.  In the standalone (non-xen) kernel
>>> the ethernet works fine, but I don't see any of its interrupts in the
>>> output of /proc/iomem.  I'm not seeing them in /proc/iomem when
>>> running dom0 under Xen either.  When booting with Xen I get this
>>> behavior where the ifconfig output shows 1 RX message and 1 TX
>>> message, and then nothing else.
>> 
>> I am not sure whether this is a typo in the e-mail. /proc/iomem is
>> listing the list of the MMIO regions. You want to use /proc/interrupts.
>> 
>> Can you confirm which path you are dumping?
> Yes, that was a typo.  Sorry about that.  I meant that I am dumping
> /proc/interrupts and do not
> see them under the non-xen kernel or xen booted dom0.

Could you post both /proc/interrupts content ?

Cheers
Bertrand




* Re: Keystone Issue
  2020-06-05 12:47                                   ` Bertrand Marquis
@ 2020-06-05 15:05                                     ` CodeWiz2280
  2020-06-05 19:12                                       ` CodeWiz2280
  0 siblings, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-05 15:05 UTC (permalink / raw)
  To: Bertrand Marquis; +Cc: xen-devel, nd, Stefano Stabellini, Julien Grall

On Fri, Jun 5, 2020 at 8:47 AM Bertrand Marquis
<Bertrand.Marquis@arm.com> wrote:
>
>
>
> > On 5 Jun 2020, at 13:42, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> >
> > On Fri, Jun 5, 2020 at 8:30 AM Julien Grall <julien@xen.org> wrote:
> >>
> >> Hi,
> >>
> >> On 05/06/2020 13:25, CodeWiz2280 wrote:
> >>> The Keystone uses the netcp driver, which has interrupts from 40-79
> >>> listed in the device tree (arch/arm/boot/keystone-k2e-netcp.dtsi).
> >>> I'm using the same device tree between my non-xen standalone kernel
> >>> and my dom0 kernel booted by xen.  In the standalone (non-xen) kernel
> >>> the ethernet works fine, but I don't see any of its interrupts in the
> >>> output of /proc/iomem.  I'm not seeing them in /proc/iomem when
> >>> running dom0 under Xen either.  When booting with Xen I get this
> >>> behavior where the ifconfig output shows 1 RX message and 1 TX
> >>> message, and then nothing else.
> >>
> >> I am not sure whether this is a typo in the e-mail. /proc/iomem is
> >> listing the list of the MMIO regions. You want to use /proc/interrupts.
> >>
> >> Can you confirm which path you are dumping?
> > Yes, that was a typo.  Sorry about that.  I meant that I am dumping
> > /proc/interrupts and do not
> > see them under the non-xen kernel or xen booted dom0.
>
> Could you post both /proc/interrupts content ?

Standalone non-xen kernel (Ethernet works)
# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
 17:          0          0          0          0     GICv2  29 Level     arch_timer
 18:       9856       1202        457        650     GICv2  30 Level     arch_timer
 21:          0          0          0          0     GICv2 142 Edge      timer-keystone
 22:          0          0          0          0     GICv2  52 Edge      arm-pmu
 23:          0          0          0          0     GICv2  53 Edge      arm-pmu
 24:          0          0          0          0     GICv2  54 Edge      arm-pmu
 25:          0          0          0          0     GICv2  55 Edge      arm-pmu
 26:          0          0          0          0     GICv2  36 Edge      26202a0.keystone_irq
 27:       1435          0          0          0     GICv2 309 Edge      ttyS0
 29:          0          0          0          0     GICv2 315 Edge      2530000.i2c
 30:          1          0          0          0     GICv2 318 Edge      2530400.i2c
 31:          0          0          0          0     GICv2 321 Edge      2530800.i2c
 32:         69          0          0          0     GICv2 324 Edge      21000400.spi
 33:          0          0          0          0     GICv2 328 Edge      21000600.spi
 34:          0          0          0          0     GICv2 332 Edge      21000800.spi
 70:          0          0          0          0     GICv2 417 Edge      ks-pcie-error-irq
 79:          0          0          0          0   PCI-MSI   0 Edge      PCIe PME, aerdrv
 88:         57          0          0          0     GICv2  80 Level     hwqueue-528
 89:         57          0          0          0     GICv2  81 Level     hwqueue-529
 90:         47          0          0          0     GICv2  82 Level     hwqueue-530
 91:         41          0          0          0     GICv2  83 Level     hwqueue-531
IPI0:          0          0          0          0  CPU wakeup interrupts
IPI1:          0          0          0          0  Timer broadcast interrupts
IPI2:        730        988       1058        937  Rescheduling interrupts
IPI3:          2          3          4          6  Function call interrupts
IPI4:          0          0          0          0  CPU stop interrupts
IPI5:          0          0          0          0  IRQ work interrupts
IPI6:          0          0          0          0  completion interrupts

Xen dom0 (Ethernet stops)
# cat /proc/interrupts
           CPU0
 18:      10380     GIC-0  27 Level     arch_timer
 19:          0     GIC-0 142 Edge      timer-keystone
 20:         88     GIC-0  16 Level     events
 21:          0   xen-dyn     Edge    -event     xenbus
 22:          0     GIC-0  36 Edge      26202a0.keystone_irq
 23:          1     GIC-0 312 Edge      ttyS0
 25:          1     GIC-0 318 Edge
 27:          1     GIC-0 324 Edge      21000400.spi
 28:          0     GIC-0 328 Edge      21000600.spi
 29:          0     GIC-0 332 Edge      21000800.spi
 65:          0     GIC-0 417 Edge      ks-pcie-error-irq
 74:          0   PCI-MSI   0 Edge      PCIe PME, aerdrv
 83:          1     GIC-0  80 Level     hwqueue-528
 84:          1     GIC-0  81 Level     hwqueue-529
 85:          1     GIC-0  82 Level     hwqueue-530
 86:          1     GIC-0  83 Level     hwqueue-531
115:         87   xen-dyn     Edge    -virq      hvc_console
IPI0:          0  CPU wakeup interrupts
IPI1:          0  Timer broadcast interrupts
IPI2:          0  Rescheduling interrupts
IPI3:          0  Function call interrupts
IPI4:          0  CPU stop interrupts
IPI5:          0  IRQ work interrupts
IPI6:          0  completion interrupts
Err:          0

>
> Cheers
> Bertrand
>



* Re: Keystone Issue
  2020-06-05 15:05                                     ` CodeWiz2280
@ 2020-06-05 19:12                                       ` CodeWiz2280
  2020-06-08  8:40                                         ` Bertrand Marquis
  0 siblings, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-05 19:12 UTC (permalink / raw)
  To: Bertrand Marquis; +Cc: xen-devel, nd, Stefano Stabellini, Julien Grall

On Fri, Jun 5, 2020 at 11:05 AM CodeWiz2280 <codewiz2280@gmail.com> wrote:
>
> On Fri, Jun 5, 2020 at 8:47 AM Bertrand Marquis
> <Bertrand.Marquis@arm.com> wrote:
> >
> >
> >
> > > On 5 Jun 2020, at 13:42, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> > >
> > > On Fri, Jun 5, 2020 at 8:30 AM Julien Grall <julien@xen.org> wrote:
> > >>
> > >> Hi,
> > >>
> > >> On 05/06/2020 13:25, CodeWiz2280 wrote:
> > >>> The Keystone uses the netcp driver, which has interrupts from 40-79
> > >>> listed in the device tree (arch/arm/boot/keystone-k2e-netcp.dtsi).
> > >>> I'm using the same device tree between my non-xen standalone kernel
> > >>> and my dom0 kernel booted by xen.  In the standalone (non-xen) kernel
> > >>> the ethernet works fine, but I don't see any of its interrupts in the
> > >>> output of /proc/iomem.  I'm not seeing them in /proc/iomem when
> > >>> running dom0 under Xen either.  When booting with Xen I get this
> > >>> behavior where the ifconfig output shows 1 RX message and 1 TX
> > >>> message, and then nothing else.
> > >>
> > >> I am not sure whether this is a typo in the e-mail. /proc/iomem is
> > >> listing the list of the MMIO regions. You want to use /proc/interrupts.
> > >>
> > >> Can you confirm which path you are dumping?
> > > Yes, that was a typo.  Sorry about that.  I meant that I am dumping
> > > /proc/interrupts and do not
> > > see them under the non-xen kernel or xen booted dom0.
> >
> > Could you post both /proc/interrupts content ?
>
> Standalone non-xen kernel (Ethernet works)
> # cat /proc/interrupts
>            CPU0       CPU1       CPU2       CPU3
>  17:          0          0          0          0     GICv2  29 Level
>   arch_timer
>  18:       9856       1202        457        650     GICv2  30 Level
>   arch_timer
>  21:          0          0          0          0     GICv2 142 Edge
>   timer-keystone
>  22:          0          0          0          0     GICv2  52 Edge      arm-pmu
>  23:          0          0          0          0     GICv2  53 Edge      arm-pmu
>  24:          0          0          0          0     GICv2  54 Edge      arm-pmu
>  25:          0          0          0          0     GICv2  55 Edge      arm-pmu
>  26:          0          0          0          0     GICv2  36 Edge
>   26202a0.keystone_irq
>  27:       1435          0          0          0     GICv2 309 Edge      ttyS0
>  29:          0          0          0          0     GICv2 315 Edge
>   2530000.i2c
>  30:          1          0          0          0     GICv2 318 Edge
>   2530400.i2c
>  31:          0          0          0          0     GICv2 321 Edge
>   2530800.i2c
>  32:         69          0          0          0     GICv2 324 Edge
>   21000400.spi
>  33:          0          0          0          0     GICv2 328 Edge
>   21000600.spi
>  34:          0          0          0          0     GICv2 332 Edge
>   21000800.spi
>  70:          0          0          0          0     GICv2 417 Edge
>   ks-pcie-error-irq
>  79:          0          0          0          0   PCI-MSI   0 Edge
>   PCIe PME, aerdrv
>  88:         57          0          0          0     GICv2  80 Level
>   hwqueue-528
>  89:         57          0          0          0     GICv2  81 Level
>   hwqueue-529
>  90:         47          0          0          0     GICv2  82 Level
>   hwqueue-530
>  91:         41          0          0          0     GICv2  83 Level
>   hwqueue-531
> IPI0:          0          0          0          0  CPU wakeup interrupts
> IPI1:          0          0          0          0  Timer broadcast interrupts
> IPI2:        730        988       1058        937  Rescheduling interrupts
> IPI3:          2          3          4          6  Function call interrupts
> IPI4:          0          0          0          0  CPU stop interrupts
> IPI5:          0          0          0          0  IRQ work interrupts
> IPI6:          0          0          0          0  completion interrupts
>
> Xen dom0 (Ethernet stops)
> # cat /proc/interrupts
>            CPU0
>  18:      10380     GIC-0  27 Level     arch_timer
>  19:          0     GIC-0 142 Edge      timer-keystone
>  20:         88     GIC-0  16 Level     events
>  21:          0   xen-dyn     Edge    -event     xenbus
>  22:          0     GIC-0  36 Edge      26202a0.keystone_irq
>  23:          1     GIC-0 312 Edge      ttyS0
>  25:          1     GIC-0 318 Edge
>  27:          1     GIC-0 324 Edge      21000400.spi
>  28:          0     GIC-0 328 Edge      21000600.spi
>  29:          0     GIC-0 332 Edge      21000800.spi
>  65:          0     GIC-0 417 Edge      ks-pcie-error-irq
>  74:          0   PCI-MSI   0 Edge      PCIe PME, aerdrv
>  83:          1     GIC-0  80 Level     hwqueue-528
>  84:          1     GIC-0  81 Level     hwqueue-529
>  85:          1     GIC-0  82 Level     hwqueue-530
>  86:          1     GIC-0  83 Level     hwqueue-531
> 115:         87   xen-dyn     Edge    -virq      hvc_console
> IPI0:          0  CPU wakeup interrupts
> IPI1:          0  Timer broadcast interrupts
> IPI2:          0  Rescheduling interrupts
> IPI3:          0  Function call interrupts
> IPI4:          0  CPU stop interrupts
> IPI5:          0  IRQ work interrupts
> IPI6:          0  completion interrupts
> Err:          0
After getting a chance to look at this a little more, I believe the
TX/RX interrupts for the Ethernet ports map like this:

eth0 Rx - hwqueue-528
eth1 Rx - hwqueue-529
eth0 Tx - hwqueue-530
eth1 Tx - hwqueue-531

The interrupt counts in the standalone working kernel seem to roughly
correspond to the counts of Tx/Rx messages in ifconfig.  Going on
that, it's clear that only 1 interrupt has been received for Tx and 1
for Rx in the Xen Dom0 equivalent.  Any thoughts on this?
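
A minimal sketch of that comparison, assuming the two dumps are saved
as text. The helper names irq_totals and stalled are hypothetical, the
sample rows are the hwqueue lines from the dumps above, and the
queue-to-port mapping is the suspected one listed above, not a
confirmed one.

```python
def irq_totals(text):
    """Map interrupt name -> count summed over all CPU columns."""
    lines = text.splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only numbered IRQ rows; this skips IPI/Err rows.
        if not sep or not label.strip().isdigit() or len(fields) <= ncpu:
            continue
        totals[fields[-1]] = sum(int(c) for c in fields[:ncpu])
    return totals


def stalled(native, dom0):
    """Names that fired repeatedly natively but at most once under dom0."""
    return sorted(n for n, c in native.items() if c > 1 and dom0.get(n, 0) <= 1)


# hwqueue rows copied from the two dumps in this thread.
NATIVE = """\
           CPU0       CPU1       CPU2       CPU3
 88:         57          0          0          0     GICv2  80 Level     hwqueue-528
 89:         57          0          0          0     GICv2  81 Level     hwqueue-529
 90:         47          0          0          0     GICv2  82 Level     hwqueue-530
 91:         41          0          0          0     GICv2  83 Level     hwqueue-531
"""
DOM0 = """\
           CPU0
 83:          1     GIC-0  80 Level     hwqueue-528
 84:          1     GIC-0  81 Level     hwqueue-529
 85:          1     GIC-0  82 Level     hwqueue-530
 86:          1     GIC-0  83 Level     hwqueue-531
"""

# Suspected mapping from this message (an assumption, not verified).
ROLE = {
    "hwqueue-528": "eth0 Rx",
    "hwqueue-529": "eth1 Rx",
    "hwqueue-530": "eth0 Tx",
    "hwqueue-531": "eth1 Tx",
}

if __name__ == "__main__":
    native, dom0 = irq_totals(NATIVE), irq_totals(DOM0)
    for name in stalled(native, dom0):
        print(f"{name} ({ROLE.get(name, '?')}): native={native[name]} dom0={dom0[name]}")
```

The parser takes the last whitespace-separated field as the name, so it
is only suitable for rows with a single-word description.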
> >
> > Cheers
> > Bertrand
> >



* Re: Keystone Issue
  2020-06-05 19:12                                       ` CodeWiz2280
@ 2020-06-08  8:40                                         ` Bertrand Marquis
  2020-06-08 12:33                                           ` CodeWiz2280
  0 siblings, 1 reply; 55+ messages in thread
From: Bertrand Marquis @ 2020-06-08  8:40 UTC (permalink / raw)
  To: CodeWiz2280; +Cc: xen-devel, nd, Stefano Stabellini, Julien Grall



> On 5 Jun 2020, at 20:12, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> 
> On Fri, Jun 5, 2020 at 11:05 AM CodeWiz2280 <codewiz2280@gmail.com> wrote:
>> 
>> On Fri, Jun 5, 2020 at 8:47 AM Bertrand Marquis
>> <Bertrand.Marquis@arm.com> wrote:
>>> 
>>> 
>>> 
>>>> On 5 Jun 2020, at 13:42, CodeWiz2280 <codewiz2280@gmail.com> wrote:
>>>> 
>>>> On Fri, Jun 5, 2020 at 8:30 AM Julien Grall <julien@xen.org> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> On 05/06/2020 13:25, CodeWiz2280 wrote:
>>>>>> The Keystone uses the netcp driver, which has interrupts from 40-79
>>>>>> listed in the device tree (arch/arm/boot/keystone-k2e-netcp.dtsi).
>>>>>> I'm using the same device tree between my non-xen standalone kernel
>>>>>> and my dom0 kernel booted by xen.  In the standalone (non-xen) kernel
>>>>>> the ethernet works fine, but I don't see any of its interrupts in the
>>>>>> output of /proc/iomem.  I'm not seeing them in /proc/iomem when
>>>>>> running dom0 under Xen either.  When booting with Xen I get this
>>>>>> behavior where the ifconfig output shows 1 RX message and 1 TX
>>>>>> message, and then nothing else.
>>>>> 
>>>>> I am not sure whether this is a typo in the e-mail. /proc/iomem is
>>>>> listing the list of the MMIO regions. You want to use /proc/interrupts.
>>>>> 
>>>>> Can you confirm which path you are dumping?
>>>> Yes, that was a typo.  Sorry about that.  I meant that I am dumping
>>>> /proc/interrupts and do not
>>>> see them under the non-xen kernel or xen booted dom0.
>>> 
>>> Could you post both /proc/interrupts content ?
>> 
>> Standalone non-xen kernel (Ethernet works)
>> # cat /proc/interrupts
>>           CPU0       CPU1       CPU2       CPU3
>> 17:          0          0          0          0     GICv2  29 Level
>>  arch_timer
>> 18:       9856       1202        457        650     GICv2  30 Level
>>  arch_timer
>> 21:          0          0          0          0     GICv2 142 Edge
>>  timer-keystone
>> 22:          0          0          0          0     GICv2  52 Edge      arm-pmu
>> 23:          0          0          0          0     GICv2  53 Edge      arm-pmu
>> 24:          0          0          0          0     GICv2  54 Edge      arm-pmu
>> 25:          0          0          0          0     GICv2  55 Edge      arm-pmu
>> 26:          0          0          0          0     GICv2  36 Edge
>>  26202a0.keystone_irq
>> 27:       1435          0          0          0     GICv2 309 Edge      ttyS0
>> 29:          0          0          0          0     GICv2 315 Edge
>>  2530000.i2c
>> 30:          1          0          0          0     GICv2 318 Edge
>>  2530400.i2c
>> 31:          0          0          0          0     GICv2 321 Edge
>>  2530800.i2c
>> 32:         69          0          0          0     GICv2 324 Edge
>>  21000400.spi
>> 33:          0          0          0          0     GICv2 328 Edge
>>  21000600.spi
>> 34:          0          0          0          0     GICv2 332 Edge
>>  21000800.spi
>> 70:          0          0          0          0     GICv2 417 Edge
>>  ks-pcie-error-irq
>> 79:          0          0          0          0   PCI-MSI   0 Edge
>>  PCIe PME, aerdrv
>> 88:         57          0          0          0     GICv2  80 Level
>>  hwqueue-528
>> 89:         57          0          0          0     GICv2  81 Level
>>  hwqueue-529
>> 90:         47          0          0          0     GICv2  82 Level
>>  hwqueue-530
>> 91:         41          0          0          0     GICv2  83 Level
>>  hwqueue-531
>> IPI0:          0          0          0          0  CPU wakeup interrupts
>> IPI1:          0          0          0          0  Timer broadcast interrupts
>> IPI2:        730        988       1058        937  Rescheduling interrupts
>> IPI3:          2          3          4          6  Function call interrupts
>> IPI4:          0          0          0          0  CPU stop interrupts
>> IPI5:          0          0          0          0  IRQ work interrupts
>> IPI6:          0          0          0          0  completion interrupts
>> 
>> Xen dom0 (Ethernet stops)
>> # cat /proc/interrupts
>>           CPU0
>> 18:      10380     GIC-0  27 Level     arch_timer
>> 19:          0     GIC-0 142 Edge      timer-keystone
>> 20:         88     GIC-0  16 Level     events
>> 21:          0   xen-dyn     Edge    -event     xenbus
>> 22:          0     GIC-0  36 Edge      26202a0.keystone_irq
>> 23:          1     GIC-0 312 Edge      ttyS0
>> 25:          1     GIC-0 318 Edge
>> 27:          1     GIC-0 324 Edge      21000400.spi
>> 28:          0     GIC-0 328 Edge      21000600.spi
>> 29:          0     GIC-0 332 Edge      21000800.spi
>> 65:          0     GIC-0 417 Edge      ks-pcie-error-irq
>> 74:          0   PCI-MSI   0 Edge      PCIe PME, aerdrv
>> 83:          1     GIC-0  80 Level     hwqueue-528
>> 84:          1     GIC-0  81 Level     hwqueue-529
>> 85:          1     GIC-0  82 Level     hwqueue-530
>> 86:          1     GIC-0  83 Level     hwqueue-531
>> 115:         87   xen-dyn     Edge    -virq      hvc_console
>> IPI0:          0  CPU wakeup interrupts
>> IPI1:          0  Timer broadcast interrupts
>> IPI2:          0  Rescheduling interrupts
>> IPI3:          0  Function call interrupts
>> IPI4:          0  CPU stop interrupts
>> IPI5:          0  IRQ work interrupts
>> IPI6:          0  completion interrupts
>> Err:          0
> After getting a chance to look at this a little more, I believe the
> TX/RX interrupts for the ethernets map like this:
> 
> eth0 Rx  - hwqueue-528
> eth1 Rx - hwqueue-529
> eth0 Tx  - hwqueue-530
> eth1 Tx - hwqueue-531
>> 
> The interrupt counts in the standlone working kernel seem to roughly
> correspond to the counts of Tx/Rx messages in ifconfig.  Going on
> that, its clear that only 1 interrupt has been received for Tx and 1
> for Rx in the Xen Dom0 equivalent.  Any thoughts on this?

This definitely looks like an interrupt acknowledgement issue.
I remember two things that could cause it:
- edge vs level interrupts
- a problem with forwarded interrupt acknowledgement.
I think there was something related to that where the vcpu ack was not properly
handled on a keystone and I had to change the way the interrupt was acked for
forwarded hardware interrupts.
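
As an illustration of why a missing deactivation would leave a counter
stuck at 1, here is a toy model of a single level-triggered line. It is
not Xen or GIC code: the class and method names are hypothetical, and it
ignores the active-and-pending state a real GICv2 has. The only property
it keeps is that a line that is still active is not delivered again.

```python
from enum import Enum


class State(Enum):
    INACTIVE = "inactive"
    PENDING = "pending"
    ACTIVE = "active"


class ToyLevelIrq:
    """Toy model of one interrupt line (hypothetical, not real GIC code)."""

    def __init__(self):
        self.state = State.INACTIVE
        self.delivered = 0

    def device_asserts(self):
        # A new assertion is only signalled if the line is not still active.
        if self.state is State.INACTIVE:
            self.state = State.PENDING

    def cpu_acknowledges(self):
        # Acknowledge moves pending -> active and counts as one delivery.
        if self.state is State.PENDING:
            self.state = State.ACTIVE
            self.delivered += 1

    def cpu_deactivates(self):
        # End-of-interrupt/deactivate returns the line to inactive.
        if self.state is State.ACTIVE:
            self.state = State.INACTIVE


def run(events, deactivate):
    """Deliver `events` assertions, optionally deactivating after each one."""
    irq = ToyLevelIrq()
    for _ in range(events):
        irq.device_asserts()
        irq.cpu_acknowledges()
        if deactivate:
            irq.cpu_deactivates()
    return irq.delivered


if __name__ == "__main__":
    print("with deactivate:   ", run(5, True))
    print("without deactivate:", run(5, False))
```

With deactivation every assertion is delivered; without it only the
first one is, which matches the pattern of counts stuck at 1 in dom0.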

I will try to get more info on that one as I have no access to the code anymore.

Regards
Bertrand








* Re: Keystone Issue
  2020-06-08  8:40                                         ` Bertrand Marquis
@ 2020-06-08 12:33                                           ` CodeWiz2280
  2020-06-08 16:13                                             ` Stefano Stabellini
  0 siblings, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-08 12:33 UTC (permalink / raw)
  To: Bertrand Marquis; +Cc: xen-devel, nd, Stefano Stabellini, Julien Grall

It actually shows only 1 interrupt for any of the devices in that list
(e.g. spi, ttyS0, ethernet) so you're probably right on the money with
it being an interrupt acknowledge issue.  Any help you can provide is
greatly appreciated.

On Mon, Jun 8, 2020 at 4:40 AM Bertrand Marquis
<Bertrand.Marquis@arm.com> wrote:
>
>
>
> > On 5 Jun 2020, at 20:12, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> >
> > On Fri, Jun 5, 2020 at 11:05 AM CodeWiz2280 <codewiz2280@gmail.com> wrote:
> >>
> >> On Fri, Jun 5, 2020 at 8:47 AM Bertrand Marquis
> >> <Bertrand.Marquis@arm.com> wrote:
> >>>
> >>>
> >>>
> >>>> On 5 Jun 2020, at 13:42, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> >>>>
> >>>> On Fri, Jun 5, 2020 at 8:30 AM Julien Grall <julien@xen.org> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> On 05/06/2020 13:25, CodeWiz2280 wrote:
> >>>>>> The Keystone uses the netcp driver, which has interrupts from 40-79
> >>>>>> listed in the device tree (arch/arm/boot/keystone-k2e-netcp.dtsi).
> >>>>>> I'm using the same device tree between my non-xen standalone kernel
> >>>>>> and my dom0 kernel booted by xen.  In the standalone (non-xen) kernel
> >>>>>> the ethernet works fine, but I don't see any of its interrupts in the
> >>>>>> output of /proc/iomem.  I'm not seeing them in /proc/iomem when
> >>>>>> running dom0 under Xen either.  When booting with Xen I get this
> >>>>>> behavior where the ifconfig output shows 1 RX message and 1 TX
> >>>>>> message, and then nothing else.
> >>>>>
> >>>>> I am not sure whether this is a typo in the e-mail. /proc/iomem is
> >>>>> listing the list of the MMIO regions. You want to use /proc/interrupts.
> >>>>>
> >>>>> Can you confirm which path you are dumping?
> >>>> Yes, that was a typo.  Sorry about that.  I meant that I am dumping
> >>>> /proc/interrupts and do not
> >>>> see them under the non-xen kernel or xen booted dom0.
> >>>
> >>> Could you post both /proc/interrupts content ?
> >>
> >> Standalone non-xen kernel (Ethernet works)
> >> # cat /proc/interrupts
> >>           CPU0       CPU1       CPU2       CPU3
> >> 17:          0          0          0          0     GICv2  29 Level
> >>  arch_timer
> >> 18:       9856       1202        457        650     GICv2  30 Level
> >>  arch_timer
> >> 21:          0          0          0          0     GICv2 142 Edge
> >>  timer-keystone
> >> 22:          0          0          0          0     GICv2  52 Edge      arm-pmu
> >> 23:          0          0          0          0     GICv2  53 Edge      arm-pmu
> >> 24:          0          0          0          0     GICv2  54 Edge      arm-pmu
> >> 25:          0          0          0          0     GICv2  55 Edge      arm-pmu
> >> 26:          0          0          0          0     GICv2  36 Edge
> >>  26202a0.keystone_irq
> >> 27:       1435          0          0          0     GICv2 309 Edge      ttyS0
> >> 29:          0          0          0          0     GICv2 315 Edge
> >>  2530000.i2c
> >> 30:          1          0          0          0     GICv2 318 Edge
> >>  2530400.i2c
> >> 31:          0          0          0          0     GICv2 321 Edge
> >>  2530800.i2c
> >> 32:         69          0          0          0     GICv2 324 Edge
> >>  21000400.spi
> >> 33:          0          0          0          0     GICv2 328 Edge
> >>  21000600.spi
> >> 34:          0          0          0          0     GICv2 332 Edge
> >>  21000800.spi
> >> 70:          0          0          0          0     GICv2 417 Edge
> >>  ks-pcie-error-irq
> >> 79:          0          0          0          0   PCI-MSI   0 Edge
> >>  PCIe PME, aerdrv
> >> 88:         57          0          0          0     GICv2  80 Level
> >>  hwqueue-528
> >> 89:         57          0          0          0     GICv2  81 Level
> >>  hwqueue-529
> >> 90:         47          0          0          0     GICv2  82 Level
> >>  hwqueue-530
> >> 91:         41          0          0          0     GICv2  83 Level
> >>  hwqueue-531
> >> IPI0:          0          0          0          0  CPU wakeup interrupts
> >> IPI1:          0          0          0          0  Timer broadcast interrupts
> >> IPI2:        730        988       1058        937  Rescheduling interrupts
> >> IPI3:          2          3          4          6  Function call interrupts
> >> IPI4:          0          0          0          0  CPU stop interrupts
> >> IPI5:          0          0          0          0  IRQ work interrupts
> >> IPI6:          0          0          0          0  completion interrupts
> >>
> >> Xen dom0 (Ethernet stops)
> >> # cat /proc/interrupts
> >>           CPU0
> >> 18:      10380     GIC-0  27 Level     arch_timer
> >> 19:          0     GIC-0 142 Edge      timer-keystone
> >> 20:         88     GIC-0  16 Level     events
> >> 21:          0   xen-dyn     Edge    -event     xenbus
> >> 22:          0     GIC-0  36 Edge      26202a0.keystone_irq
> >> 23:          1     GIC-0 312 Edge      ttyS0
> >> 25:          1     GIC-0 318 Edge
> >> 27:          1     GIC-0 324 Edge      21000400.spi
> >> 28:          0     GIC-0 328 Edge      21000600.spi
> >> 29:          0     GIC-0 332 Edge      21000800.spi
> >> 65:          0     GIC-0 417 Edge      ks-pcie-error-irq
> >> 74:          0   PCI-MSI   0 Edge      PCIe PME, aerdrv
> >> 83:          1     GIC-0  80 Level     hwqueue-528
> >> 84:          1     GIC-0  81 Level     hwqueue-529
> >> 85:          1     GIC-0  82 Level     hwqueue-530
> >> 86:          1     GIC-0  83 Level     hwqueue-531
> >> 115:         87   xen-dyn     Edge    -virq      hvc_console
> >> IPI0:          0  CPU wakeup interrupts
> >> IPI1:          0  Timer broadcast interrupts
> >> IPI2:          0  Rescheduling interrupts
> >> IPI3:          0  Function call interrupts
> >> IPI4:          0  CPU stop interrupts
> >> IPI5:          0  IRQ work interrupts
> >> IPI6:          0  completion interrupts
> >> Err:          0
> > After getting a chance to look at this a little more, I believe the
> > TX/RX interrupts for the ethernets map like this:
> >
> > eth0 Rx  - hwqueue-528
> > eth1 Rx - hwqueue-529
> > eth0 Tx  - hwqueue-530
> > eth1 Tx - hwqueue-531
> >>
> > The interrupt counts in the standlone working kernel seem to roughly
> > correspond to the counts of Tx/Rx messages in ifconfig.  Going on
> > that, its clear that only 1 interrupt has been received for Tx and 1
> > for Rx in the Xen Dom0 equivalent.  Any thoughts on this?
>
> This definitely look like an interrupt acknowledgement issue.
> This could be caused by 2 things I remember of:
> - front vs level interrupts
> - a problem with forwarded interrupt acknowledgement.
> I think there was something related to that where the vcpu ack was not properly
> handled on a keystone and I had to change the way the interrupt was acked for
> forwarded hardware interrupts.
>
> I will try to get more info on that one as I have no access to the code anymore.
>
> Regards
> Bertrand
>
>
>
>
>



* Re: Keystone Issue
  2020-06-08 12:33                                           ` CodeWiz2280
@ 2020-06-08 16:13                                             ` Stefano Stabellini
  2020-06-09 14:33                                               ` CodeWiz2280
  0 siblings, 1 reply; 55+ messages in thread
From: Stefano Stabellini @ 2020-06-08 16:13 UTC (permalink / raw)
  To: CodeWiz2280
  Cc: xen-devel, nd, Stefano Stabellini, Julien Grall, Bertrand Marquis



On Mon, 8 Jun 2020, CodeWiz2280 wrote:
> It actually shows only 1 interrupt for any of the devices in that list
> (e.g. spi, ttyS0, ethernet) so you're probably right on the money with
> it being an interrupt acknowledge issue.  Any help you can provide is
> greatly appreciated.
> 
> On Mon, Jun 8, 2020 at 4:40 AM Bertrand Marquis
> <Bertrand.Marquis@arm.com> wrote:
> >
> >
> >
> > > On 5 Jun 2020, at 20:12, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> > >
> > > On Fri, Jun 5, 2020 at 11:05 AM CodeWiz2280 <codewiz2280@gmail.com> wrote:
> > >>
> > >> On Fri, Jun 5, 2020 at 8:47 AM Bertrand Marquis
> > >> <Bertrand.Marquis@arm.com> wrote:
> > >>>
> > >>>
> > >>>
> > >>>> On 5 Jun 2020, at 13:42, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> > >>>>
> > >>>> On Fri, Jun 5, 2020 at 8:30 AM Julien Grall <julien@xen.org> wrote:
> > >>>>>
> > >>>>> Hi,
> > >>>>>
> > >>>>> On 05/06/2020 13:25, CodeWiz2280 wrote:
> > >>>>>> The Keystone uses the netcp driver, which has interrupts from 40-79
> > >>>>>> listed in the device tree (arch/arm/boot/keystone-k2e-netcp.dtsi).
> > >>>>>> I'm using the same device tree between my non-xen standalone kernel
> > >>>>>> and my dom0 kernel booted by xen.  In the standalone (non-xen) kernel
> > >>>>>> the ethernet works fine, but I don't see any of its interrupts in the
> > >>>>>> output of /proc/iomem.  I'm not seeing them in /proc/iomem when
> > >>>>>> running dom0 under Xen either.  When booting with Xen I get this
> > >>>>>> behavior where the ifconfig output shows 1 RX message and 1 TX
> > >>>>>> message, and then nothing else.
> > >>>>>
> > >>>>> I am not sure whether this is a typo in the e-mail. /proc/iomem is
> > >>>>> listing the list of the MMIO regions. You want to use /proc/interrupts.
> > >>>>>
> > >>>>> Can you confirm which path you are dumping?
> > >>>> Yes, that was a typo.  Sorry about that.  I meant that I am dumping
> > >>>> /proc/interrupts and do not
> > >>>> see them under the non-xen kernel or xen booted dom0.
> > >>>
> > >>> Could you post both /proc/interrupts content ?
> > >>
> > >> Standalone non-xen kernel (Ethernet works)
> > >> # cat /proc/interrupts
> > >>           CPU0       CPU1       CPU2       CPU3
> > >> 17:          0          0          0          0     GICv2  29 Level
> > >>  arch_timer
> > >> 18:       9856       1202        457        650     GICv2  30 Level
> > >>  arch_timer
> > >> 21:          0          0          0          0     GICv2 142 Edge
> > >>  timer-keystone
> > >> 22:          0          0          0          0     GICv2  52 Edge      arm-pmu
> > >> 23:          0          0          0          0     GICv2  53 Edge      arm-pmu
> > >> 24:          0          0          0          0     GICv2  54 Edge      arm-pmu
> > >> 25:          0          0          0          0     GICv2  55 Edge      arm-pmu
> > >> 26:          0          0          0          0     GICv2  36 Edge
> > >>  26202a0.keystone_irq
> > >> 27:       1435          0          0          0     GICv2 309 Edge      ttyS0
> > >> 29:          0          0          0          0     GICv2 315 Edge
> > >>  2530000.i2c
> > >> 30:          1          0          0          0     GICv2 318 Edge
> > >>  2530400.i2c
> > >> 31:          0          0          0          0     GICv2 321 Edge
> > >>  2530800.i2c
> > >> 32:         69          0          0          0     GICv2 324 Edge
> > >>  21000400.spi
> > >> 33:          0          0          0          0     GICv2 328 Edge
> > >>  21000600.spi
> > >> 34:          0          0          0          0     GICv2 332 Edge
> > >>  21000800.spi
> > >> 70:          0          0          0          0     GICv2 417 Edge
> > >>  ks-pcie-error-irq
> > >> 79:          0          0          0          0   PCI-MSI   0 Edge
> > >>  PCIe PME, aerdrv
> > >> 88:         57          0          0          0     GICv2  80 Level
> > >>  hwqueue-528
> > >> 89:         57          0          0          0     GICv2  81 Level
> > >>  hwqueue-529
> > >> 90:         47          0          0          0     GICv2  82 Level
> > >>  hwqueue-530
> > >> 91:         41          0          0          0     GICv2  83 Level
> > >>  hwqueue-531
> > >> IPI0:          0          0          0          0  CPU wakeup interrupts
> > >> IPI1:          0          0          0          0  Timer broadcast interrupts
> > >> IPI2:        730        988       1058        937  Rescheduling interrupts
> > >> IPI3:          2          3          4          6  Function call interrupts
> > >> IPI4:          0          0          0          0  CPU stop interrupts
> > >> IPI5:          0          0          0          0  IRQ work interrupts
> > >> IPI6:          0          0          0          0  completion interrupts
> > >>
> > >> Xen dom0 (Ethernet stops)
> > >> # cat /proc/interrupts
> > >>           CPU0
> > >> 18:      10380     GIC-0  27 Level     arch_timer
> > >> 19:          0     GIC-0 142 Edge      timer-keystone
> > >> 20:         88     GIC-0  16 Level     events
> > >> 21:          0   xen-dyn     Edge    -event     xenbus
> > >> 22:          0     GIC-0  36 Edge      26202a0.keystone_irq
> > >> 23:          1     GIC-0 312 Edge      ttyS0
> > >> 25:          1     GIC-0 318 Edge
> > >> 27:          1     GIC-0 324 Edge      21000400.spi
> > >> 28:          0     GIC-0 328 Edge      21000600.spi
> > >> 29:          0     GIC-0 332 Edge      21000800.spi
> > >> 65:          0     GIC-0 417 Edge      ks-pcie-error-irq
> > >> 74:          0   PCI-MSI   0 Edge      PCIe PME, aerdrv
> > >> 83:          1     GIC-0  80 Level     hwqueue-528
> > >> 84:          1     GIC-0  81 Level     hwqueue-529
> > >> 85:          1     GIC-0  82 Level     hwqueue-530
> > >> 86:          1     GIC-0  83 Level     hwqueue-531
> > >> 115:         87   xen-dyn     Edge    -virq      hvc_console
> > >> IPI0:          0  CPU wakeup interrupts
> > >> IPI1:          0  Timer broadcast interrupts
> > >> IPI2:          0  Rescheduling interrupts
> > >> IPI3:          0  Function call interrupts
> > >> IPI4:          0  CPU stop interrupts
> > >> IPI5:          0  IRQ work interrupts
> > >> IPI6:          0  completion interrupts
> > >> Err:          0
> > > After getting a chance to look at this a little more, I believe the
> > > TX/RX interrupts for the ethernets map like this:
> > >
> > > eth0 Rx  - hwqueue-528
> > > eth1 Rx - hwqueue-529
> > > eth0 Tx  - hwqueue-530
> > > eth1 Tx - hwqueue-531
> > >>
> > > The interrupt counts in the standlone working kernel seem to roughly
> > > correspond to the counts of Tx/Rx messages in ifconfig.  Going on
> > > that, its clear that only 1 interrupt has been received for Tx and 1
> > > for Rx in the Xen Dom0 equivalent.  Any thoughts on this?
> >
> > This definitely look like an interrupt acknowledgement issue.
> > This could be caused by 2 things I remember of:
> > - front vs level interrupts
> > - a problem with forwarded interrupt acknowledgement.
> > I think there was something related to that where the vcpu ack was not properly
> > handled on a keystone and I had to change the way the interrupt was acked for
> > forwarded hardware interrupts.

Is there maybe some sort of secondary interrupt controller (in addition
to the GIC) or interrupt "concentrator" on Keystone?

Or is it just a small deviation from normal GIC behavior?


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-08 16:13                                             ` Stefano Stabellini
@ 2020-06-09 14:33                                               ` CodeWiz2280
  2020-06-09 15:28                                                 ` Bertrand Marquis
  0 siblings, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-09 14:33 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, nd, Julien Grall, Bertrand Marquis

There does appear to be a secondary interrupt controller (CIC) that can
forward events to the GIC-400 and EDMA controllers on the Keystone 2
family.  Admittedly, I'm not sure how it is used with regard to the
peripherals; I only see the GIC-400 mentioned as the interrupt parent
for the devices in the device tree.  Maybe Bertrand has a better idea of
whether any peripherals go through the CIC first?  I see that
gic_interrupt() fires once in Xen, which calls do_IRQ() to push the
virtual interrupt out to the dom0 kernel.  The dom0 kernel then handles
the interrupt and returns, but gic_interrupt() never fires again in Xen.



^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-09 14:33                                               ` CodeWiz2280
@ 2020-06-09 15:28                                                 ` Bertrand Marquis
  2020-06-09 15:47                                                   ` Julien Grall
  0 siblings, 1 reply; 55+ messages in thread
From: Bertrand Marquis @ 2020-06-09 15:28 UTC (permalink / raw)
  To: CodeWiz2280; +Cc: xen-devel, nd, Stefano Stabellini, Julien Grall

Hi,

> On 9 Jun 2020, at 15:33, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> 
> There does appear to be a secondary (CIC) controller that can forward
> events to the GIC-400 and EDMA controllers for the keystone 2 family.
> Admittedly, i'm not sure how it is being used with regards to the
> peripherals.  I only see mention of the GIC-400 parent for the devices
> in the device tree.  Maybe Bertrand has a better idea on whether any
> peripherals go through the CIC first?  I see that gic_interrupt ()
> fires once in Xen, which calls doIRQ to push out the virtual interrupt
> to the dom0 kernel.  The dom0 kernel then handles the interrupt and
> returns, but gic_interrupt() never fires again in Xen.

I do not remember any CIC, but the behaviour definitely looks like an interrupt acknowledgement problem.

Could you try the following:
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -667,6 +667,9 @@ static void gicv2_guest_irq_end(struct irq_desc *desc)
     /* Lower the priority of the IRQ */
     gicv2_eoi_irq(desc);
     /* Deactivation happens in maintenance interrupt / via GICV */
+
+    /* Test for Keystone2 */
+    gicv2_dir_irq(desc);
 }

I think the problem I had was related to the vgic not properly deactivating the interrupt.
Be aware that this test patch might make the interrupt fire indefinitely!
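
For illustration only, here is a toy model of why a missing deactivation would produce exactly one interrupt per device. It assumes a GICv2 running with GICC_CTLR.EOImode = 1, where a write to GICC_EOIR only drops the priority and a separate write to GICC_DIR deactivates the interrupt. This is not Xen code: the toy_* names and the struct are made up for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one physical interrupt on a GICv2 running with
 * GICC_CTLR.EOImode = 1 (split priority drop / deactivation).
 * All names here are hypothetical; this is not Xen code. */
struct toy_irq {
    bool pending;        /* the device has raised the interrupt */
    bool active;         /* acknowledged but not yet deactivated */
    unsigned delivered;  /* how many times the interrupt was taken */
};

/* The device raises the interrupt. */
static void toy_raise(struct toy_irq *irq)
{
    irq->pending = true;
}

/* The GIC only signals an interrupt that is pending and not active.
 * Returns true if the interrupt was taken (models a read of GICC_IAR). */
static bool toy_ack(struct toy_irq *irq)
{
    if (!irq->pending || irq->active)
        return false;
    irq->pending = false;
    irq->active = true;
    irq->delivered++;
    return true;
}

/* Write to GICC_EOIR with EOImode = 1: priority drop only,
 * so the active state is left untouched. */
static void toy_eoi(struct toy_irq *irq)
{
    (void)irq;
}

/* Write to GICC_DIR: deactivate, so the interrupt can be taken again. */
static void toy_dir(struct toy_irq *irq)
{
    irq->active = false;
}

/* Raise the interrupt n times, optionally deactivating after each EOI.
 * Returns the number of times the interrupt was actually taken. */
static unsigned toy_run(unsigned n, bool deactivate)
{
    struct toy_irq irq = { false, false, 0 };

    for (unsigned i = 0; i < n; i++) {
        toy_raise(&irq);
        if (toy_ack(&irq)) {
            toy_eoi(&irq);
            if (deactivate)
                toy_dir(&irq);
        }
    }
    return irq.delivered;
}
```

In this model toy_run(5, false) takes the interrupt once and then never again, which matches the counts of 1 seen in the dom0 /proc/interrupts, while toy_run(5, true) takes all five.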

Regards
Bertrand




^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-09 15:28                                                 ` Bertrand Marquis
@ 2020-06-09 15:47                                                   ` Julien Grall
  2020-06-09 15:58                                                     ` CodeWiz2280
  2020-06-09 17:03                                                     ` Bertrand Marquis
  0 siblings, 2 replies; 55+ messages in thread
From: Julien Grall @ 2020-06-09 15:47 UTC (permalink / raw)
  To: Bertrand Marquis, CodeWiz2280; +Cc: xen-devel, nd, Stefano Stabellini



On 09/06/2020 16:28, Bertrand Marquis wrote:
> Hi,
> 
>> On 9 Jun 2020, at 15:33, CodeWiz2280 <codewiz2280@gmail.com> wrote:
>>
>> There does appear to be a secondary (CIC) controller that can forward
>> events to the GIC-400 and EDMA controllers for the keystone 2 family.
>> Admittedly, i'm not sure how it is being used with regards to the
>> peripherals.  I only see mention of the GIC-400 parent for the devices
>> in the device tree.  Maybe Bertrand has a better idea on whether any
>> peripherals go through the CIC first?  I see that gic_interrupt ()
>> fires once in Xen, which calls doIRQ to push out the virtual interrupt
>> to the dom0 kernel.  The dom0 kernel then handles the interrupt and
>> returns, but gic_interrupt() never fires again in Xen.
> 
> I do not remember of any CIC but the behaviour definitely look like an interrupt acknowledge problem.
> 
> Could you try the following:
> --- a/xen/arch/arm/gic-v2.c
> +++ b/xen/arch/arm/gic-v2.c
> @@ -667,6 +667,9 @@ static void gicv2_guest_irq_end(struct irq_desc *desc)
>       /* Lower the priority of the IRQ */
>       gicv2_eoi_irq(desc);
>       /* Deactivation happens in maintenance interrupt / via GICV */
> +
> +    /* Test for Keystone2 */
> +    gicv2_dir_irq(desc);
>   }
> 
> I think the problem I had was related to the vgic not deactivating properly the interrupt.

Are you suggesting the guest EOI is not properly forwarded to the 
hardware when LR.HW is set? If so, this could possibly be worked around in 
Xen by raising a maintenance interrupt every time a guest EOIs an interrupt.
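
To make the two options concrete, here is a rough sketch of how a GICv2 list register (GICH_LR) value differs between them, based on my reading of the field layout in the GICv2 architecture specification. The TOY_* macros and toy_* helpers are hypothetical names, not Xen's, and the priority field is simply left at zero.

```c
#include <assert.h>
#include <stdint.h>

/* GICH_LR fields as laid out in the GICv2 architecture specification.
 * Macro and function names are hypothetical, not Xen's. */
#define TOY_LR_HW            (UINT32_C(1) << 31) /* backed by a physical IRQ */
#define TOY_LR_STATE_PENDING (UINT32_C(1) << 28) /* State[29:28] = pending */
#define TOY_LR_EOI           (UINT32_C(1) << 19) /* HW == 0 only: maintenance
                                                    interrupt on guest EOI */
#define TOY_LR_PHYS_SHIFT    10                  /* HW == 1 only: physical ID
                                                    in bits [19:10] */
#define TOY_LR_ID_MASK       UINT32_C(0x3ff)     /* IDs are 10 bits wide */

/* HW == 1: the guest's EOI is expected to deactivate phys_irq in hardware,
 * without involving the hypervisor. */
static uint32_t toy_lr_hw(uint32_t virq, uint32_t phys_irq)
{
    return TOY_LR_HW | TOY_LR_STATE_PENDING |
           ((phys_irq & TOY_LR_ID_MASK) << TOY_LR_PHYS_SHIFT) |
           (virq & TOY_LR_ID_MASK);
}

/* HW == 0 with the EOI bit set: the guest's EOI raises a maintenance
 * interrupt, and the hypervisor must deactivate the physical IRQ itself. */
static uint32_t toy_lr_sw_eoi(uint32_t virq)
{
    return TOY_LR_STATE_PENDING | TOY_LR_EOI | (virq & TOY_LR_ID_MASK);
}
```

Note that bit 19 means different things in the two cases: with HW set it is the top bit of the physical ID, so the EOI maintenance request is only available when HW is clear.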

> This might make the interrupt fire indefinitely !!

Most likely with level interrupts ;).

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-09 15:47                                                   ` Julien Grall
@ 2020-06-09 15:58                                                     ` CodeWiz2280
  2020-06-09 17:05                                                       ` Bertrand Marquis
  2020-06-09 17:03                                                     ` Bertrand Marquis
  1 sibling, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-09 15:58 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, nd, Stefano Stabellini, Bertrand Marquis

On Tue, Jun 9, 2020 at 11:47 AM Julien Grall <julien@xen.org> wrote:
>
>
>
> On 09/06/2020 16:28, Bertrand Marquis wrote:
> > Hi,
> >
> >> On 9 Jun 2020, at 15:33, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> >>
> >> There does appear to be a secondary (CIC) controller that can forward
> >> events to the GIC-400 and EDMA controllers for the keystone 2 family.
> >> Admittedly, i'm not sure how it is being used with regards to the
> >> peripherals.  I only see mention of the GIC-400 parent for the devices
> >> in the device tree.  Maybe Bertrand has a better idea on whether any
> >> peripherals go through the CIC first?  I see that gic_interrupt ()
> >> fires once in Xen, which calls doIRQ to push out the virtual interrupt
> >> to the dom0 kernel.  The dom0 kernel then handles the interrupt and
> >> returns, but gic_interrupt() never fires again in Xen.
> >
> > I do not remember of any CIC but the behaviour definitely look like an interrupt acknowledge problem.
> >
> > Could you try the following:
> > --- a/xen/arch/arm/gic-v2.c
> > +++ b/xen/arch/arm/gic-v2.c
> > @@ -667,6 +667,9 @@ static void gicv2_guest_irq_end(struct irq_desc *desc)
> >       /* Lower the priority of the IRQ */
> >       gicv2_eoi_irq(desc);
> >       /* Deactivation happens in maintenance interrupt / via GICV */
> > +
> > +    /* Test for Keystone2 */
> > +    gicv2_dir_irq(desc);
> >   }
> >
> > I think the problem I had was related to the vgic not deactivating properly the interrupt.
>
This seemed to help with the edge-triggered interrupts, e.g. the UART.

> Are you suggesting the guest EOI is not properly forwarded to the
> hardware when LR.HW is set? If so, this could possibly be workaround in
> Xen by raising a maintenance interrupt every time a guest EOI an interrupt.
>
> > This might make the interrupt fire indefinitely !!
>
> Most likely with level interrupt ;).
>
This is what's happening with the Ethernet driver, which is level
triggered.  I had to temporarily disable it to check the patch with the
UART driver; otherwise the system would hang, processing the interrupt
repeatedly.
> Cheers,
>
> --
> Julien Grall


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-09 15:47                                                   ` Julien Grall
  2020-06-09 15:58                                                     ` CodeWiz2280
@ 2020-06-09 17:03                                                     ` Bertrand Marquis
  2020-06-09 17:32                                                       ` Julien Grall
  1 sibling, 1 reply; 55+ messages in thread
From: Bertrand Marquis @ 2020-06-09 17:03 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, nd, Stefano Stabellini, CodeWiz2280

Hi

> On 9 Jun 2020, at 16:47, Julien Grall <julien@xen.org> wrote:
> 
> 
> 
> On 09/06/2020 16:28, Bertrand Marquis wrote:
>> Hi,
>>> On 9 Jun 2020, at 15:33, CodeWiz2280 <codewiz2280@gmail.com> wrote:
>>> 
>>> There does appear to be a secondary (CIC) controller that can forward
>>> events to the GIC-400 and EDMA controllers for the keystone 2 family.
>>> Admittedly, i'm not sure how it is being used with regards to the
>>> peripherals.  I only see mention of the GIC-400 parent for the devices
>>> in the device tree.  Maybe Bertrand has a better idea on whether any
>>> peripherals go through the CIC first?  I see that gic_interrupt ()
>>> fires once in Xen, which calls doIRQ to push out the virtual interrupt
>>> to the dom0 kernel.  The dom0 kernel then handles the interrupt and
>>> returns, but gic_interrupt() never fires again in Xen.
>> I do not remember of any CIC but the behaviour definitely look like an interrupt acknowledge problem.
>> Could you try the following:
>> --- a/xen/arch/arm/gic-v2.c
>> +++ b/xen/arch/arm/gic-v2.c
>> @@ -667,6 +667,9 @@ static void gicv2_guest_irq_end(struct irq_desc *desc)
>>      /* Lower the priority of the IRQ */
>>      gicv2_eoi_irq(desc);
>>      /* Deactivation happens in maintenance interrupt / via GICV */
>> +
>> +    /* Test for Keystone2 */
>> +    gicv2_dir_irq(desc);
>>  }
>> I think the problem I had was related to the vgic not deactivating properly the interrupt.
> 
> Are you suggesting the guest EOI is not properly forwarded to the hardware when LR.HW is set? If so, this could possibly be workaround in Xen by raising a maintenance interrupt every time a guest EOI an interrupt.

Agreed, the maintenance interrupt would definitely be the right solution.
This was an easy hack to check whether that was the source of the problem.

> 
>> This might make the interrupt fire indefinitely !!
> 
> Most likely with level interrupt ;).

Yes, but this is just to confirm ;-)

> 
> Cheers,
> 
> -- 
> Julien Grall



^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-09 15:58                                                     ` CodeWiz2280
@ 2020-06-09 17:05                                                       ` Bertrand Marquis
  0 siblings, 0 replies; 55+ messages in thread
From: Bertrand Marquis @ 2020-06-09 17:05 UTC (permalink / raw)
  To: CodeWiz2280; +Cc: xen-devel, nd, Stefano Stabellini, Julien Grall



> On 9 Jun 2020, at 16:58, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> 
> On Tue, Jun 9, 2020 at 11:47 AM Julien Grall <julien@xen.org> wrote:
>> 
>> 
>> 
>> On 09/06/2020 16:28, Bertrand Marquis wrote:
>>> Hi,
>>> 
>>>> On 9 Jun 2020, at 15:33, CodeWiz2280 <codewiz2280@gmail.com> wrote:
>>>> 
>>>> There does appear to be a secondary (CIC) controller that can forward
>>>> events to the GIC-400 and EDMA controllers for the keystone 2 family.
>>>> Admittedly, i'm not sure how it is being used with regards to the
>>>> peripherals.  I only see mention of the GIC-400 parent for the devices
>>>> in the device tree.  Maybe Bertrand has a better idea on whether any
>>>> peripherals go through the CIC first?  I see that gic_interrupt ()
>>>> fires once in Xen, which calls doIRQ to push out the virtual interrupt
>>>> to the dom0 kernel.  The dom0 kernel then handles the interrupt and
>>>> returns, but gic_interrupt() never fires again in Xen.
>>> 
>>> I do not remember of any CIC but the behaviour definitely look like an interrupt acknowledge problem.
>>> 
>>> Could you try the following:
>>> --- a/xen/arch/arm/gic-v2.c
>>> +++ b/xen/arch/arm/gic-v2.c
>>> @@ -667,6 +667,9 @@ static void gicv2_guest_irq_end(struct irq_desc *desc)
>>>      /* Lower the priority of the IRQ */
>>>      gicv2_eoi_irq(desc);
>>>      /* Deactivation happens in maintenance interrupt / via GICV */
>>> +
>>> +    /* Test for Keystone2 */
>>> +    gicv2_dir_irq(desc);
>>>  }
>>> 
>>> I think the problem I had was related to the vgic not deactivating properly the interrupt.
>> 
> This seemed to help with the edge triggered interrupts, e.g. UART

So the missing ack is definitely the issue.

> 
>> Are you suggesting the guest EOI is not properly forwarded to the
>> hardware when LR.HW is set? If so, this could possibly be workaround in
>> Xen by raising a maintenance interrupt every time a guest EOI an interrupt.
>> 
>>> This might make the interrupt fire indefinitely !!
>> 
>> Most likely with level interrupt ;).
>> 
> This is what's happening with the Ethernet driver which is level
> triggered.  I had to temporarily disable it
> to check the patch with the UART driver, otherwise the system would
> hang processing the interrupt
> repeatedly.

That is quite logical, yes.
The way forward, as mentioned by Julien, is to raise a maintenance interrupt once the guest has handled the interrupt, so that Xen can deactivate the corresponding physical interrupt.
This will add some overhead, but there is probably no other solution.
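
To make the edge versus level difference concrete, here is a toy model of a single interrupt line. It is only a sketch and not Xen code: every name in it (struct toy_irq, toy_ack(), toy_deactivate() and so on) is made up for illustration, and it assumes the usual GICv2 rule that an active interrupt cannot be taken again until it is deactivated. It shows the three cases seen in this thread: no deactivation at all (the interrupt fires once and never again), deactivation in Xen before the guest has run (fine for edge, a storm for level), and deactivation after the guest has quietened the device.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one physical interrupt line; hypothetical names, not Xen code. */
struct toy_irq {
    bool level_triggered; /* true: pending follows the line; false: edge latch */
    bool line_asserted;   /* device is currently asserting the line (level)   */
    bool edge_latch;      /* an edge was seen and not yet acknowledged        */
    bool active;          /* acknowledged but not yet deactivated             */
};

/* The device signals the interrupt. */
void toy_raise(struct toy_irq *irq)
{
    if (irq->level_triggered)
        irq->line_asserted = true;
    else
        irq->edge_latch = true;
}

/* The guest driver services the device, which drops a level-triggered line. */
void toy_guest_handle(struct toy_irq *irq)
{
    irq->line_asserted = false;
}

/* Acknowledge: succeeds only if the interrupt is pending and not already active. */
bool toy_ack(struct toy_irq *irq)
{
    bool pending = irq->level_triggered ? irq->line_asserted : irq->edge_latch;

    if (irq->active || !pending)
        return false;
    irq->edge_latch = false;
    irq->active = true;
    return true;
}

/* Deactivate: the effect of a GICC_DIR write, or of a guest EOI that is
 * correctly forwarded to the hardware. */
void toy_deactivate(struct toy_irq *irq)
{
    irq->active = false;
}

/* Count how often the interrupt is taken when the hypervisor deactivates it
 * right after forwarding, before the guest has run; capped at 'limit'. */
int toy_fires_with_early_deactivate(struct toy_irq *irq, int limit)
{
    int fires = 0;

    assert(limit > 0);
    while (fires < limit && toy_ack(irq)) {
        toy_deactivate(irq);
        fires++;
    }
    return fires;
}
```

In this model an edge interrupt with early deactivation is taken exactly once, a level interrupt is taken until the cap is hit, and an interrupt that is never deactivated cannot be acknowledged a second time. Deactivating only after toy_guest_handle() has run is the ordering the maintenance interrupt would provide.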

Cheers
Bertrand



^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-09 17:03                                                     ` Bertrand Marquis
@ 2020-06-09 17:32                                                       ` Julien Grall
  2020-06-09 17:45                                                         ` Marc Zyngier
  0 siblings, 1 reply; 55+ messages in thread
From: Julien Grall @ 2020-06-09 17:32 UTC (permalink / raw)
  To: Bertrand Marquis
  Cc: xen-devel, nd, Stefano Stabellini, CodeWiz2280, Marc Zyngier

(+ Marc)

On 09/06/2020 18:03, Bertrand Marquis wrote:
> Hi
> 
>> On 9 Jun 2020, at 16:47, Julien Grall <julien@xen.org> wrote:
>>
>>
>>
>> On 09/06/2020 16:28, Bertrand Marquis wrote:
>>> Hi,
>>>> On 9 Jun 2020, at 15:33, CodeWiz2280 <codewiz2280@gmail.com> wrote:
>>>>
>>>> There does appear to be a secondary (CIC) controller that can forward
>>>> events to the GIC-400 and EDMA controllers for the keystone 2 family.
>>>> Admittedly, i'm not sure how it is being used with regards to the
>>>> peripherals.  I only see mention of the GIC-400 parent for the devices
>>>> in the device tree.  Maybe Bertrand has a better idea on whether any
>>>> peripherals go through the CIC first?  I see that gic_interrupt ()
>>>> fires once in Xen, which calls doIRQ to push out the virtual interrupt
>>>> to the dom0 kernel.  The dom0 kernel then handles the interrupt and
>>>> returns, but gic_interrupt() never fires again in Xen.
>>> I do not remember of any CIC but the behaviour definitely look like an interrupt acknowledge problem.
>>> Could you try the following:
>>> --- a/xen/arch/arm/gic-v2.c
>>> +++ b/xen/arch/arm/gic-v2.c
>>> @@ -667,6 +667,9 @@ static void gicv2_guest_irq_end(struct irq_desc *desc)
>>>       /* Lower the priority of the IRQ */
>>>       gicv2_eoi_irq(desc);
>>>       /* Deactivation happens in maintenance interrupt / via GICV */
>>> +
>>> +    /* Test for Keystone2 */
>>> +    gicv2_dir_irq(desc);
>>>   }
>>> I think the problem I had was related to the vgic not deactivating properly the interrupt.
>>
>> Are you suggesting the guest EOI is not properly forwarded to the hardware when LR.HW is set? If so, this could possibly be workaround in Xen by raising a maintenance interrupt every time a guest EOI an interrupt.
> 
> Agree the maintenance interrupt would definitely be the right solution
I would like to make sure we aren't missing anything in Xen first. From
what you said, you have encountered this issue in the past with a
different hypervisor, so it doesn't look to be Xen related.

Was there any official statement from TI? If not, can we try to get some
input from them first?

@Marc, I know you dropped 32-bit support in KVM recently :). Still, I
was wondering if you have heard of any potential issue with guest EOI not
being forwarded to the host. This is on TI Keystone (Cortex-A15).

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-09 17:32                                                       ` Julien Grall
@ 2020-06-09 17:45                                                         ` Marc Zyngier
  2020-06-09 20:07                                                           ` CodeWiz2280
                                                                             ` (2 more replies)
  0 siblings, 3 replies; 55+ messages in thread
From: Marc Zyngier @ 2020-06-09 17:45 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, nd, Stefano Stabellini, CodeWiz2280, Bertrand Marquis

Hi Julien,

On 2020-06-09 18:32, Julien Grall wrote:
> (+ Marc)
> 
> On 09/06/2020 18:03, Bertrand Marquis wrote:
>> Hi
>> 
>>> On 9 Jun 2020, at 16:47, Julien Grall <julien@xen.org> wrote:
>>> 
>>> 
>>> 
>>> On 09/06/2020 16:28, Bertrand Marquis wrote:
>>>> Hi,
>>>>> On 9 Jun 2020, at 15:33, CodeWiz2280 <codewiz2280@gmail.com> wrote:
>>>>> 
>>>>> There does appear to be a secondary (CIC) controller that can 
>>>>> forward
>>>>> events to the GIC-400 and EDMA controllers for the keystone 2 
>>>>> family.
>>>>> Admittedly, i'm not sure how it is being used with regards to the
>>>>> peripherals.  I only see mention of the GIC-400 parent for the 
>>>>> devices
>>>>> in the device tree.  Maybe Bertrand has a better idea on whether 
>>>>> any
>>>>> peripherals go through the CIC first?  I see that gic_interrupt ()
>>>>> fires once in Xen, which calls doIRQ to push out the virtual 
>>>>> interrupt
>>>>> to the dom0 kernel.  The dom0 kernel then handles the interrupt and
>>>>> returns, but gic_interrupt() never fires again in Xen.
>>>> I do not remember of any CIC but the behaviour definitely look like 
>>>> an interrupt acknowledge problem.
>>>> Could you try the following:
>>>> --- a/xen/arch/arm/gic-v2.c
>>>> +++ b/xen/arch/arm/gic-v2.c
>>>> @@ -667,6 +667,9 @@ static void gicv2_guest_irq_end(struct irq_desc 
>>>> *desc)
>>>>       /* Lower the priority of the IRQ */
>>>>       gicv2_eoi_irq(desc);
>>>>       /* Deactivation happens in maintenance interrupt / via GICV */
>>>> +
>>>> +    /* Test for Keystone2 */
>>>> +    gicv2_dir_irq(desc);
>>>>   }
>>>> I think the problem I had was related to the vgic not deactivating 
>>>> properly the interrupt.
>>> 
>>> Are you suggesting the guest EOI is not properly forwarded to the 
>>> hardware when LR.HW is set? If so, this could possibly be workaround 
>>> in Xen by raising a maintenance interrupt every time a guest EOI an 
>>> interrupt.
>> 
>> Agree the maintenance interrupt would definitely be the right solution
> I would like to make sure we aren't missing anything in Xen first.
> From what you said, you have encountered this issue in the past with a
> different hypervisor. So it doesn't look like to be Xen related.
> 
> Was there any official statement from TI? If not, can we try to get
> some input from them first?
> 
> @Marc, I know you dropped 32-bit support in KVM recently :). Although,

Yes! Victory is mine! Freedom from the shackles of 32bit, at last! :D

> I was wondering if you heard about any potential issue with guest EOI
> not forwarded to the host. This is on TI Keystone (Cortex A-15).

Not that I know of. A-15 definitely works (TC2, Tegra-K1, Calxeda Midway 
all run just fine with guest EOI), and GIC-400 is a pretty solid piece 
of kit (it is just sloooooow...).

Thinking of it, you would see something like that if the GIC was seeing 
the writes coming from the guest as secure instead of NS (cue the early 
firmware on XGene that exposed the wrong side of GIC-400).

Is there some kind of funky bridge between the CPU and the GIC?

         M.
-- 
Jazz is not dead. It just smells funny...


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-09 17:45                                                         ` Marc Zyngier
@ 2020-06-09 20:07                                                           ` CodeWiz2280
  2020-06-10  8:13                                                             ` Bertrand Marquis
  2020-06-10  8:06                                                           ` Bertrand Marquis
  2020-06-10 21:46                                                           ` Julien Grall
  2 siblings, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-09 20:07 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: xen-devel, nd, Stefano Stabellini, Julien Grall, Bertrand Marquis

On Tue, Jun 9, 2020 at 1:45 PM Marc Zyngier <maz@kernel.org> wrote:
>
> Hi Julien,
>
> On 2020-06-09 18:32, Julien Grall wrote:
> > (+ Marc)
> >
> > On 09/06/2020 18:03, Bertrand Marquis wrote:
> >> Hi
> >>
> >>> On 9 Jun 2020, at 16:47, Julien Grall <julien@xen.org> wrote:
> >>>
> >>>
> >>>
> >>> On 09/06/2020 16:28, Bertrand Marquis wrote:
> >>>> Hi,
> >>>>> On 9 Jun 2020, at 15:33, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> >>>>>
> >>>>> There does appear to be a secondary (CIC) controller that can
> >>>>> forward
> >>>>> events to the GIC-400 and EDMA controllers for the keystone 2
> >>>>> family.
> >>>>> Admittedly, i'm not sure how it is being used with regards to the
> >>>>> peripherals.  I only see mention of the GIC-400 parent for the
> >>>>> devices
> >>>>> in the device tree.  Maybe Bertrand has a better idea on whether
> >>>>> any
> >>>>> peripherals go through the CIC first?  I see that gic_interrupt ()
> >>>>> fires once in Xen, which calls doIRQ to push out the virtual
> >>>>> interrupt
> >>>>> to the dom0 kernel.  The dom0 kernel then handles the interrupt and
> >>>>> returns, but gic_interrupt() never fires again in Xen.
> >>>> I do not remember of any CIC but the behaviour definitely look like
> >>>> an interrupt acknowledge problem.
> >>>> Could you try the following:
> >>>> --- a/xen/arch/arm/gic-v2.c
> >>>> +++ b/xen/arch/arm/gic-v2.c
> >>>> @@ -667,6 +667,9 @@ static void gicv2_guest_irq_end(struct irq_desc
> >>>> *desc)
> >>>>       /* Lower the priority of the IRQ */
> >>>>       gicv2_eoi_irq(desc);
> >>>>       /* Deactivation happens in maintenance interrupt / via GICV */
> >>>> +
> >>>> +    /* Test for Keystone2 */
> >>>> +    gicv2_dir_irq(desc);
> >>>>   }
> >>>> I think the problem I had was related to the vgic not deactivating
> >>>> properly the interrupt.
> >>>
> >>> Are you suggesting the guest EOI is not properly forwarded to the
> >>> hardware when LR.HW is set? If so, this could possibly be workaround
> >>> in Xen by raising a maintenance interrupt every time a guest EOI an
> >>> interrupt.
> >>
> >> Agree the maintenance interrupt would definitely be the right solution
> > I would like to make sure we aren't missing anything in Xen first.
> > From what you said, you have encountered this issue in the past with a
> > different hypervisor. So it doesn't look like to be Xen related.
> >
> > Was there any official statement from TI? If not, can we try to get
> > some input from them first?
Thank you all for your support so far, it's really appreciated.  Is
there a quick patch that I can try with this maintenance interrupt to
get the level interrupts working as well? I can pose the question to
TI but would like to close the loop and make sure there are no other
issues that pop out first.
> >
> > @Marc, I know you dropped 32-bit support in KVM recently :). Although,
>
> Yes! Victory is mine! Freedom from the shackles of 32bit, at last! :D
>
> > I was wondering if you heard about any potential issue with guest EOI
> > not forwarded to the host. This is on TI Keystone (Cortex A-15).
>
> Not that I know of. A-15 definitely works (TC2, Tegra-K1, Calxeda Midway
> all run just fine with guest EOI), and GIC-400 is a pretty solid piece
> of kit (it is just sloooooow...).
>
> Thinking of it, you would see something like that if the GIC was seeing
> the writes coming from the guest as secure instead of NS (cue the early
> firmware on XGene that exposed the wrong side of GIC-400).
>
> Is there some kind of funky bridge between the CPU and the GIC?
>
>          M.
> --
> Jazz is not dead. It just smells funny...


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-09 17:45                                                         ` Marc Zyngier
  2020-06-09 20:07                                                           ` CodeWiz2280
@ 2020-06-10  8:06                                                           ` Bertrand Marquis
  2020-06-10  8:20                                                             ` Marc Zyngier
  2020-06-10 21:46                                                           ` Julien Grall
  2 siblings, 1 reply; 55+ messages in thread
From: Bertrand Marquis @ 2020-06-10  8:06 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: xen-devel, nd, Stefano Stabellini, Julien Grall, CodeWiz2280

Hi,

> On 9 Jun 2020, at 18:45, Marc Zyngier <maz@kernel.org> wrote:
> 
> Hi Julien,
> 
> On 2020-06-09 18:32, Julien Grall wrote:
>> (+ Marc)
>> On 09/06/2020 18:03, Bertrand Marquis wrote:
>>> Hi
>>>> On 9 Jun 2020, at 16:47, Julien Grall <julien@xen.org> wrote:
>>>> On 09/06/2020 16:28, Bertrand Marquis wrote:
>>>>> Hi,
>>>>>> On 9 Jun 2020, at 15:33, CodeWiz2280 <codewiz2280@gmail.com> wrote:
>>>>>> There does appear to be a secondary (CIC) controller that can forward
>>>>>> events to the GIC-400 and EDMA controllers for the keystone 2 family.
>>>>>> Admittedly, i'm not sure how it is being used with regards to the
>>>>>> peripherals.  I only see mention of the GIC-400 parent for the devices
>>>>>> in the device tree.  Maybe Bertrand has a better idea on whether any
>>>>>> peripherals go through the CIC first?  I see that gic_interrupt ()
>>>>>> fires once in Xen, which calls doIRQ to push out the virtual interrupt
>>>>>> to the dom0 kernel.  The dom0 kernel then handles the interrupt and
>>>>>> returns, but gic_interrupt() never fires again in Xen.
>>>>> I do not remember of any CIC but the behaviour definitely look like an interrupt acknowledge problem.
>>>>> Could you try the following:
>>>>> --- a/xen/arch/arm/gic-v2.c
>>>>> +++ b/xen/arch/arm/gic-v2.c
>>>>> @@ -667,6 +667,9 @@ static void gicv2_guest_irq_end(struct irq_desc *desc)
>>>>>      /* Lower the priority of the IRQ */
>>>>>      gicv2_eoi_irq(desc);
>>>>>      /* Deactivation happens in maintenance interrupt / via GICV */
>>>>> +
>>>>> +    /* Test for Keystone2 */
>>>>> +    gicv2_dir_irq(desc);
>>>>>  }
>>>>> I think the problem I had was related to the vgic not deactivating properly the interrupt.
>>>> Are you suggesting the guest EOI is not properly forwarded to the hardware when LR.HW is set? If so, this could possibly be workaround in Xen by raising a maintenance interrupt every time a guest EOI an interrupt.
>>> Agree the maintenance interrupt would definitely be the right solution
>> I would like to make sure we aren't missing anything in Xen first.
>> From what you said, you have encountered this issue in the past with a
>> different hypervisor. So it doesn't look like to be Xen related.
>> Was there any official statement from TI? If not, can we try to get
>> some input from them first?
>> @Marc, I know you dropped 32-bit support in KVM recently :). Although,
> 
> Yes! Victory is mine! Freedom from the shackles of 32bit, at last! :D
> 
>> I was wondering if you heard about any potential issue with guest EOI
>> not forwarded to the host. This is on TI Keystone (Cortex A-15).
> 
> Not that I know of. A-15 definitely works (TC2, Tegra-K1, Calxeda Midway all run just fine with guest EOI), and GIC-400 is a pretty solid piece of kit (it is just sloooooow...).
> 
> Thinking of it, you would see something like that if the GIC was seeing the writes coming from the guest as secure instead of NS (cue the early firmware on XGene that exposed the wrong side of GIC-400).
> 
> Is there some kind of funky bridge between the CPU and the GIC?

Yes, the behaviour I saw was consistent with the GIC seeing the processor in secure mode rather than non-secure mode, which makes the VGIC ack non-functional.
So the only way to solve this is actually to do the interrupt deactivation inside Xen (using a maintenance interrupt).
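
As a rough illustration of what that means at the list register level, here is a sketch based on the GICv2 GICH_LR layout. With HW=1, bits [19:10] carry the physical interrupt ID and the hardware is expected to deactivate the physical interrupt when the guest EOIs. With HW=0, bit 19 instead requests a maintenance interrupt on guest EOI. The macro and function names below are hypothetical and are not the ones Xen uses.

```c
#include <assert.h>
#include <stdint.h>

/* GICv2 GICH_LR fields; the TOY_* names are made up for this sketch. */
#define TOY_LR_HW            (UINT32_C(1) << 31) /* virtual IRQ is tied to a physical one */
#define TOY_LR_STATE_PENDING (UINT32_C(1) << 28) /* State field [29:28] = pending */
#define TOY_LR_PRIO_SHIFT    23                  /* Priority field [27:23] */
#define TOY_LR_EOI_MAINT     (UINT32_C(1) << 19) /* HW == 0 only: maintenance IRQ on EOI */
#define TOY_LR_PHYS_SHIFT    10                  /* HW == 1 only: PhysicalID [19:10] */
#define TOY_LR_ID_MASK       UINT32_C(0x3ff)

/* Normal case: rely on the hardware to deactivate 'pirq' on guest EOI. */
uint32_t toy_lr_hw(uint32_t virq, uint32_t pirq, uint32_t prio5)
{
    assert(virq <= TOY_LR_ID_MASK && pirq <= TOY_LR_ID_MASK && prio5 < 32);
    return TOY_LR_HW | TOY_LR_STATE_PENDING |
           (prio5 << TOY_LR_PRIO_SHIFT) |
           (pirq << TOY_LR_PHYS_SHIFT) | virq;
}

/* Workaround case: no HW link; ask for a maintenance interrupt on guest EOI
 * so that the hypervisor can deactivate the physical interrupt itself. */
uint32_t toy_lr_sw_eoi(uint32_t virq, uint32_t prio5)
{
    assert(virq <= TOY_LR_ID_MASK && prio5 < 32);
    return TOY_LR_STATE_PENDING |
           (prio5 << TOY_LR_PRIO_SHIFT) |
           TOY_LR_EOI_MAINT | virq;
}
```

With the second encoding the list register no longer records the physical interrupt, so the hypervisor would have to remember the virtual-to-physical mapping itself and write the physical ID to GICC_DIR from the maintenance handler. That extra trap per interrupt is where the overhead comes from.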

I remember that I also had to do something specific for the configuration of edge/level and priorities to get almost correct behaviour.
Sadly I no longer have access to the code, so I would have to guess what that was.

Bertrand

> 
>        M.
> -- 
> Jazz is not dead. It just smells funny...



^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-09 20:07                                                           ` CodeWiz2280
@ 2020-06-10  8:13                                                             ` Bertrand Marquis
  0 siblings, 0 replies; 55+ messages in thread
From: Bertrand Marquis @ 2020-06-10  8:13 UTC (permalink / raw)
  To: CodeWiz2280; +Cc: Marc Zyngier, nd, Stefano Stabellini, Julien Grall, xen-devel

Hi,

> On 9 Jun 2020, at 21:07, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> 
> On Tue, Jun 9, 2020 at 1:45 PM Marc Zyngier <maz@kernel.org> wrote:
>> 
>> Hi Julien,
>> 
>> On 2020-06-09 18:32, Julien Grall wrote:
>>> (+ Marc)
>>> 
>>> On 09/06/2020 18:03, Bertrand Marquis wrote:
>>>> Hi
>>>> 
>>>>> On 9 Jun 2020, at 16:47, Julien Grall <julien@xen.org> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> On 09/06/2020 16:28, Bertrand Marquis wrote:
>>>>>> Hi,
>>>>>>> On 9 Jun 2020, at 15:33, CodeWiz2280 <codewiz2280@gmail.com> wrote:
>>>>>>> 
>>>>>>> There does appear to be a secondary (CIC) controller that can
>>>>>>> forward
>>>>>>> events to the GIC-400 and EDMA controllers for the keystone 2
>>>>>>> family.
>>>>>>> Admittedly, i'm not sure how it is being used with regards to the
>>>>>>> peripherals.  I only see mention of the GIC-400 parent for the
>>>>>>> devices
>>>>>>> in the device tree.  Maybe Bertrand has a better idea on whether
>>>>>>> any
>>>>>>> peripherals go through the CIC first?  I see that gic_interrupt ()
>>>>>>> fires once in Xen, which calls doIRQ to push out the virtual
>>>>>>> interrupt
>>>>>>> to the dom0 kernel.  The dom0 kernel then handles the interrupt and
>>>>>>> returns, but gic_interrupt() never fires again in Xen.
>>>>>> I do not remember of any CIC but the behaviour definitely look like
>>>>>> an interrupt acknowledge problem.
>>>>>> Could you try the following:
>>>>>> --- a/xen/arch/arm/gic-v2.c
>>>>>> +++ b/xen/arch/arm/gic-v2.c
>>>>>> @@ -667,6 +667,9 @@ static void gicv2_guest_irq_end(struct irq_desc
>>>>>> *desc)
>>>>>>      /* Lower the priority of the IRQ */
>>>>>>      gicv2_eoi_irq(desc);
>>>>>>      /* Deactivation happens in maintenance interrupt / via GICV */
>>>>>> +
>>>>>> +    /* Test for Keystone2 */
>>>>>> +    gicv2_dir_irq(desc);
>>>>>>  }
>>>>>> I think the problem I had was related to the vgic not deactivating
>>>>>> properly the interrupt.
>>>>> 
>>>>> Are you suggesting the guest EOI is not properly forwarded to the
>>>>> hardware when LR.HW is set? If so, this could possibly be workaround
>>>>> in Xen by raising a maintenance interrupt every time a guest EOI an
>>>>> interrupt.
>>>> 
>>>> Agree the maintenance interrupt would definitely be the right solution
>>> I would like to make sure we aren't missing anything in Xen first.
>>> From what you said, you have encountered this issue in the past with a
>>> different hypervisor. So it doesn't look like to be Xen related.
>>> 
>>> Was there any official statement from TI? If not, can we try to get
>>> some input from them first?
> Thank you all for your support so far, its really appreciated.  Is
> there a quick patch that I can try with this maintenance interrupt to
> get the level interrupts working as well? I can pose the question to
> TI but would like to close the loop and make sure there are no other
> issues that pop out first.

If you can, it would be good to ask TI: I worked on the Keystone2 a while ago, and they might have a firmware solution for that.

Bertrand

>>> 
>>> @Marc, I know you dropped 32-bit support in KVM recently :). Although,
>> 
>> Yes! Victory is mine! Freedom from the shackles of 32bit, at last! :D
>> 
>>> I was wondering if you heard about any potential issue with guest EOI
>>> not forwarded to the host. This is on TI Keystone (Cortex A-15).
>> 
>> Not that I know of. A-15 definitely works (TC2, Tegra-K1, Calxeda Midway
>> all run just fine with guest EOI), and GIC-400 is a pretty solid piece
>> of kit (it is just sloooooow...).
>> 
>> Thinking of it, you would see something like that if the GIC was seeing
>> the writes coming from the guest as secure instead of NS (cue the early
>> firmware on XGene that exposed the wrong side of GIC-400).
>> 
>> Is there some kind of funky bridge between the CPU and the GIC?
>> 
>>         M.
>> --
>> Jazz is not dead. It just smells funny...



^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-10  8:06                                                           ` Bertrand Marquis
@ 2020-06-10  8:20                                                             ` Marc Zyngier
  2020-06-10  8:39                                                               ` Bertrand Marquis
  0 siblings, 1 reply; 55+ messages in thread
From: Marc Zyngier @ 2020-06-10  8:20 UTC (permalink / raw)
  To: Bertrand Marquis
  Cc: xen-devel, nd, Stefano Stabellini, Julien Grall, CodeWiz2280

On 2020-06-10 09:06, Bertrand Marquis wrote:
> Hi,
> 
>> On 9 Jun 2020, at 18:45, Marc Zyngier <maz@kernel.org> wrote:
>> 
>> Hi Julien,
>> 
>> On 2020-06-09 18:32, Julien Grall wrote:
>>> (+ Marc)
>>> On 09/06/2020 18:03, Bertrand Marquis wrote:
>>>> Hi
>>>>> On 9 Jun 2020, at 16:47, Julien Grall <julien@xen.org> wrote:
>>>>> On 09/06/2020 16:28, Bertrand Marquis wrote:
>>>>>> Hi,
>>>>>>> On 9 Jun 2020, at 15:33, CodeWiz2280 <codewiz2280@gmail.com> 
>>>>>>> wrote:
>>>>>>> There does appear to be a secondary (CIC) controller that can 
>>>>>>> forward
>>>>>>> events to the GIC-400 and EDMA controllers for the keystone 2 
>>>>>>> family.
>>>>>>> Admittedly, i'm not sure how it is being used with regards to the
>>>>>>> peripherals.  I only see mention of the GIC-400 parent for the 
>>>>>>> devices
>>>>>>> in the device tree.  Maybe Bertrand has a better idea on whether 
>>>>>>> any
>>>>>>> peripherals go through the CIC first?  I see that gic_interrupt 
>>>>>>> ()
>>>>>>> fires once in Xen, which calls doIRQ to push out the virtual 
>>>>>>> interrupt
>>>>>>> to the dom0 kernel.  The dom0 kernel then handles the interrupt 
>>>>>>> and
>>>>>>> returns, but gic_interrupt() never fires again in Xen.
>>>>>> I do not remember of any CIC but the behaviour definitely look 
>>>>>> like an interrupt acknowledge problem.
>>>>>> Could you try the following:
>>>>>> --- a/xen/arch/arm/gic-v2.c
>>>>>> +++ b/xen/arch/arm/gic-v2.c
>>>>>> @@ -667,6 +667,9 @@ static void gicv2_guest_irq_end(struct 
>>>>>> irq_desc *desc)
>>>>>>      /* Lower the priority of the IRQ */
>>>>>>      gicv2_eoi_irq(desc);
>>>>>>      /* Deactivation happens in maintenance interrupt / via GICV 
>>>>>> */
>>>>>> +
>>>>>> +    /* Test for Keystone2 */
>>>>>> +    gicv2_dir_irq(desc);
>>>>>>  }
>>>>>> I think the problem I had was related to the vgic not deactivating 
>>>>>> properly the interrupt.
>>>>> Are you suggesting the guest EOI is not properly forwarded to the 
>>>>> hardware when LR.HW is set? If so, this could possibly be 
>>>>> workaround in Xen by raising a maintenance interrupt every time a 
>>>>> guest EOI an interrupt.
>>>> Agree the maintenance interrupt would definitely be the right 
>>>> solution
>>> I would like to make sure we aren't missing anything in Xen first.
>>> From what you said, you have encountered this issue in the past with 
>>> a
>>> different hypervisor. So it doesn't look like to be Xen related.
>>> Was there any official statement from TI? If not, can we try to get
>>> some input from them first?
>>> @Marc, I know you dropped 32-bit support in KVM recently :). 
>>> Although,
>> 
>> Yes! Victory is mine! Freedom from the shackles of 32bit, at last! :D
>> 
>>> I was wondering if you heard about any potential issue with guest EOI
>>> not forwarded to the host. This is on TI Keystone (Cortex A-15).
>> 
>> Not that I know of. A-15 definitely works (TC2, Tegra-K1, Calxeda 
>> Midway all run just fine with guest EOI), and GIC-400 is a pretty 
>> solid piece of kit (it is just sloooooow...).
>> 
>> Thinking of it, you would see something like that if the GIC was 
>> seeing the writes coming from the guest as secure instead of NS (cue 
>> the early firmware on XGene that exposed the wrong side of GIC-400).
>> 
>> Is there some kind of funky bridge between the CPU and the GIC?
> 
> Yes the behaviour I had was coherent with the GIC seeing the processor
> in secure mode and not in non secure hence making the VGIC ack non
> functional.

Can you please check this with the TI folks? It may be fixable if
the bridge is SW configurable.

> So the only way to solve this is actually to do the interrupt
> deactivate inside Xen (using a maintenance interrupt).

That's a terrible hack, and one that would encourage badly integrated HW.
I appreciate the need to "make things work", but I'd be wary of putting
this in released SW. Broken HW must die. I have written more than my share
of these terrible hacks (see TX1 support), and I deeply regret it, as
it has only given Si vendors an excuse not to fix things.

> I remember that I also had to do something specific for the
> configuration of edge/level and priorities to have an almost proper
> behaviour.

Well, the moment the GIC observes secure accesses when they should be
non-secure, all bets are off and you have to resort to the above hacks.
The fun part is that if you have secure SW running on this platform,
you can probably DoS it from non-secure. It's good, isn't it?

> Sadly I have no access to the code anymore, so I would need to guess
> back what that was..

I'd say this *is* a good thing.

         M.
-- 
Jazz is not dead. It just smells funny...


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-10  8:20                                                             ` Marc Zyngier
@ 2020-06-10  8:39                                                               ` Bertrand Marquis
  2020-06-10 12:39                                                                 ` CodeWiz2280
  0 siblings, 1 reply; 55+ messages in thread
From: Bertrand Marquis @ 2020-06-10  8:39 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: xen-devel, nd, Stefano Stabellini, Julien Grall, CodeWiz2280



> On 10 Jun 2020, at 09:20, Marc Zyngier <maz@kernel.org> wrote:
> 
> On 2020-06-10 09:06, Bertrand Marquis wrote:
>> Hi,
>>> On 9 Jun 2020, at 18:45, Marc Zyngier <maz@kernel.org> wrote:
>>> Hi Julien,
>>> On 2020-06-09 18:32, Julien Grall wrote:
>>>> (+ Marc)
>>>> On 09/06/2020 18:03, Bertrand Marquis wrote:
>>>>> Hi
>>>>>> On 9 Jun 2020, at 16:47, Julien Grall <julien@xen.org> wrote:
>>>>>> On 09/06/2020 16:28, Bertrand Marquis wrote:
>>>>>>> Hi,
>>>>>>>> On 9 Jun 2020, at 15:33, CodeWiz2280 <codewiz2280@gmail.com> wrote:
>>>>>>>> There does appear to be a secondary (CIC) controller that can forward
>>>>>>>> events to the GIC-400 and EDMA controllers for the keystone 2 family.
>>>>>>>> Admittedly, i'm not sure how it is being used with regards to the
>>>>>>>> peripherals.  I only see mention of the GIC-400 parent for the devices
>>>>>>>> in the device tree.  Maybe Bertrand has a better idea on whether any
>>>>>>>> peripherals go through the CIC first?  I see that gic_interrupt ()
>>>>>>>> fires once in Xen, which calls doIRQ to push out the virtual interrupt
>>>>>>>> to the dom0 kernel.  The dom0 kernel then handles the interrupt and
>>>>>>>> returns, but gic_interrupt() never fires again in Xen.
>>>>>>> I do not remember of any CIC but the behaviour definitely look like an interrupt acknowledge problem.
>>>>>>> Could you try the following:
>>>>>>> --- a/xen/arch/arm/gic-v2.c
>>>>>>> +++ b/xen/arch/arm/gic-v2.c
>>>>>>> @@ -667,6 +667,9 @@ static void gicv2_guest_irq_end(struct irq_desc *desc)
>>>>>>>     /* Lower the priority of the IRQ */
>>>>>>>     gicv2_eoi_irq(desc);
>>>>>>>     /* Deactivation happens in maintenance interrupt / via GICV */
>>>>>>> +
>>>>>>> +    /* Test for Keystone2 */
>>>>>>> +    gicv2_dir_irq(desc);
>>>>>>> }
>>>>>>> I think the problem I had was related to the vgic not deactivating properly the interrupt.
>>>>>> Are you suggesting the guest EOI is not properly forwarded to the hardware when LR.HW is set? If so, this could possibly be workaround in Xen by raising a maintenance interrupt every time a guest EOI an interrupt.
>>>>> Agree the maintenance interrupt would definitely be the right solution
>>>> I would like to make sure we aren't missing anything in Xen first.
>>>> From what you said, you have encountered this issue in the past with a
>>>> different hypervisor. So it doesn't look like to be Xen related.
>>>> Was there any official statement from TI? If not, can we try to get
>>>> some input from them first?
>>>> @Marc, I know you dropped 32-bit support in KVM recently :). Although,
>>> Yes! Victory is mine! Freedom from the shackles of 32bit, at last! :D
>>>> I was wondering if you heard about any potential issue with guest EOI
>>>> not forwarded to the host. This is on TI Keystone (Cortex A-15).
>>> Not that I know of. A-15 definitely works (TC2, Tegra-K1, Calxeda Midway all run just fine with guest EOI), and GIC-400 is a pretty solid piece of kit (it is just sloooooow...).
>>> Thinking of it, you would see something like that if the GIC was seeing the writes coming from the guest as secure instead of NS (cue the early firmware on XGene that exposed the wrong side of GIC-400).
>>> Is there some kind of funky bridge between the CPU and the GIC?
>> Yes the behaviour I had was coherent with the GIC seeing the processor
>> in secure mode and not in non secure hence making the VGIC ack non
>> functional.
> 
> Can you please check this with the TI folks? It may be fixable if
> the bridge is SW configurable.

At that time they did not “offer” that solution, but that does not mean it is not possible.

> 
>> So the only way to solve this is actually to do the interrupt
>> deactivate inside Xen (using a maintenance interrupt).
> 
> That's a terrible hack, and one that would encourage badly integrated HW.
> I appreciate the need to "make things work", but I'd be wary of putting
> this in released SW. Broken HW must die. I have written more than my share
> of these terrible hacks (see TX1 support), and I deeply regret it, as
> it has only given Si vendors an excuse not to fix things.

Fully agree and I also had to do some hacks for the TX1 ;-)

> 
>> I remember that I also had to do something specific for the
>> configuration of edge/level and priorities to have an almost proper
>> behaviour.
> 
> Well, the moment the GIC observes secure accesses when they should be
> non-secure, all bets are off and you have to resort to the above hacks.
> The fun part is that if you have secure SW running on this platform,
> you can probably DoS it from non-secure. It's good, isn't it?

It definitely is, but if I remember correctly they have two kinds of SoC: one that can only be used in non-secure mode, and another that is meant to be used with both secure and non-secure.

Bertrand

> 
>> Sadly I have no access to the code anymore, so I would need to guess
>> back what that was..
> 
> I'd say this *is* a good thing.
> 
>        M.
> -- 
> Jazz is not dead. It just smells funny...


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-10  8:39                                                               ` Bertrand Marquis
@ 2020-06-10 12:39                                                                 ` CodeWiz2280
  2020-06-10 12:53                                                                   ` Marc Zyngier
  2020-06-10 12:58                                                                   ` Julien Grall
  0 siblings, 2 replies; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-10 12:39 UTC (permalink / raw)
  To: Bertrand Marquis
  Cc: Marc Zyngier, nd, Stefano Stabellini, Julien Grall, xen-devel

On Wed, Jun 10, 2020 at 4:39 AM Bertrand Marquis
<Bertrand.Marquis@arm.com> wrote:
>
>
>
> > On 10 Jun 2020, at 09:20, Marc Zyngier <maz@kernel.org> wrote:
> >
> > On 2020-06-10 09:06, Bertrand Marquis wrote:
> >> Hi,
> >>> On 9 Jun 2020, at 18:45, Marc Zyngier <maz@kernel.org> wrote:
> >>> Hi Julien,
> >>> On 2020-06-09 18:32, Julien Grall wrote:
> >>>> (+ Marc)
> >>>> On 09/06/2020 18:03, Bertrand Marquis wrote:
> >>>>> Hi
> >>>>>> On 9 Jun 2020, at 16:47, Julien Grall <julien@xen.org> wrote:
> >>>>>> On 09/06/2020 16:28, Bertrand Marquis wrote:
> >>>>>>> Hi,
> >>>>>>>> On 9 Jun 2020, at 15:33, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> >>>>>>>> There does appear to be a secondary (CIC) controller that can forward
> >>>>>>>> events to the GIC-400 and EDMA controllers for the keystone 2 family.
> >>>>>>>> Admittedly, i'm not sure how it is being used with regards to the
> >>>>>>>> peripherals.  I only see mention of the GIC-400 parent for the devices
> >>>>>>>> in the device tree.  Maybe Bertrand has a better idea on whether any
> >>>>>>>> peripherals go through the CIC first?  I see that gic_interrupt ()
> >>>>>>>> fires once in Xen, which calls doIRQ to push out the virtual interrupt
> >>>>>>>> to the dom0 kernel.  The dom0 kernel then handles the interrupt and
> >>>>>>>> returns, but gic_interrupt() never fires again in Xen.
> >>>>>>> I do not remember of any CIC but the behaviour definitely look like an interrupt acknowledge problem.
> >>>>>>> Could you try the following:
> >>>>>>> --- a/xen/arch/arm/gic-v2.c
> >>>>>>> +++ b/xen/arch/arm/gic-v2.c
> >>>>>>> @@ -667,6 +667,9 @@ static void gicv2_guest_irq_end(struct irq_desc *desc)
> >>>>>>>     /* Lower the priority of the IRQ */
> >>>>>>>     gicv2_eoi_irq(desc);
> >>>>>>>     /* Deactivation happens in maintenance interrupt / via GICV */
> >>>>>>> +
> >>>>>>> +    /* Test for Keystone2 */
> >>>>>>> +    gicv2_dir_irq(desc);
> >>>>>>> }
> >>>>>>> I think the problem I had was related to the vgic not deactivating properly the interrupt.
> >>>>>> Are you suggesting the guest EOI is not properly forwarded to the hardware when LR.HW is set? If so, this could possibly be workaround in Xen by raising a maintenance interrupt every time a guest EOI an interrupt.
> >>>>> Agree the maintenance interrupt would definitely be the right solution
> >>>> I would like to make sure we aren't missing anything in Xen first.
> >>>> From what you said, you have encountered this issue in the past with a
> >>>> different hypervisor. So it doesn't look like to be Xen related.
> >>>> Was there any official statement from TI? If not, can we try to get
> >>>> some input from them first?
> >>>> @Marc, I know you dropped 32-bit support in KVM recently :). Although,
> >>> Yes! Victory is mine! Freedom from the shackles of 32bit, at last! :D
> >>>> I was wondering if you heard about any potential issue with guest EOI
> >>>> not forwarded to the host. This is on TI Keystone (Cortex A-15).
> >>> Not that I know of. A-15 definitely works (TC2, Tegra-K1, Calxeda Midway all run just fine with guest EOI), and GIC-400 is a pretty solid piece of kit (it is just sloooooow...).
> >>> Thinking of it, you would see something like that if the GIC was seeing the writes coming from the guest as secure instead of NS (cue the early firmware on XGene that exposed the wrong side of GIC-400).
> >>> Is there some kind of funky bridge between the CPU and the GIC?
> >> Yes the behaviour I had was coherent with the GIC seeing the processor
> >> in secure mode and not in non secure hence making the VGIC ack non
> >> functional.
> >
> > Can you please check this with the TI folks? It may be fixable if
> > the bridge is SW configurable.
>
> At that time they did not “offer” that solution, but that does not mean it is not possible.
>
> >
> >> So the only way to solve this is actually to do the interrupt
> >> deactivate inside Xen (using a maintenance interrupt).
> >
> > That's a terrible hack, and one that would encourage badly integrated HW.
> > I appreciate the need to "make things work", but I'd be wary of putting
> > this in released SW. Broken HW must die. I have written more than my share
> > of these terrible hacks (see TX1 support), and I deeply regret it, as
> > it has only given Si vendors an excuse not to fix things.
>
> Fully agree and I also had to do some hacks for the TX1 ;-)
>
> >
> >> I remember that I also had to do something specific for the
> >> configuration of edge/level and priorities to have an almost proper
> >> behaviour.
> >
> > Well, the moment the GIC observes secure accesses when they should be
> > non-secure, all bets are off and you have to resort to the above hacks.
> > The fun part is that if you have secure SW running on this platform,
> > you can probably DoS it from non-secure. It's good, isn't it?
>
> It definitely is, but if I remember correctly they have two kinds of SoC: one that can only be used in non-secure mode, and another that is meant to be used with both secure and non-secure.
>
> Bertrand
>
> >
> >> Sadly I have no access to the code anymore, so I would need to guess
> >> back what that was..
> >
> > I'd say this *is* a good thing.
The problem is that a hack may be my only solution to getting this
working on this platform.  If TI says that they don't support it then
I'm stuck.  Just to summarize the problem, we believe that the GIC is
seeing secure accesses from dom0 when they should be non-secure.  This
is causing the VGIC ack to be non-functional from dom0.   We would
need a firmware that supports both secure and non-secure accesses.

The Xen code gets to "gicv2_guest_irq_end()" where it executes
"gicv2_eoi_irq()", but then we had to add the deactivate
"gicv2_dir_irq" to clear the virtual interrupt manually to get things
going again.

> >
> >        M.
> > --
> > Jazz is not dead. It just smells funny...
>


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-10 12:39                                                                 ` CodeWiz2280
@ 2020-06-10 12:53                                                                   ` Marc Zyngier
  2020-06-10 12:58                                                                   ` Julien Grall
  1 sibling, 0 replies; 55+ messages in thread
From: Marc Zyngier @ 2020-06-10 12:53 UTC (permalink / raw)
  To: CodeWiz2280
  Cc: xen-devel, nd, Stefano Stabellini, Julien Grall, Bertrand Marquis

On 2020-06-10 13:39, CodeWiz2280 wrote:

[...]

> The problem is that a hack may be my only solution to getting this
> working on this platform.  If TI says that they don't support it then
> I'm stuck.  Just to summarize the problem, we believe that the GIC is
> seeing secure accesses from dom0 when they should be non-secure.  This

Not necessarily just dom0. The hypothesis is that accesses to the GICV
and/or GICD regions from a non-secure guest are treated as secure.
My hunch is that it is only GICV that gets messed with, as you seem
to solve it by writing to GICC.

> is causing the VGIC ack to be non-functional from dom0.   We would
> need a firmware that supports both secure and non-secure accesses.

Not exactly. You'd need the bridge between the CPU and the GIC to honor
the NS bit passed on the bus (AXI or otherwise). That is assuming that:
- the NS attribute is actually present on the interconnect
- the HW is configurable
- our "finger in the air" analysis is actually correct

As for the last point, only someone with access to the RTL could
tell you...

> The Xen code gets to "gicv2_guest_irq_end()" where it executes
> "gicv2_eoi_irq()", but then we had to add the deactivate
> "gicv2_dir_irq" to clear the virtual interrupt manually to get things
> going again.

Also known as "priority drop" and "deactivation". You may want to
use architectural terms when explaining this to HW people.
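
The difference between the two steps can be sketched with a toy software model of a GICv2 CPU interface running in split EOI mode (EOImode == 1). This is only an illustration under that assumption: the struct and the toy_* functions are hypothetical names, not Xen code or a real register layout, and nested priorities are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_IRQS   64u
#define TOY_PRIO_IDLE 0xffu   /* no interrupt is being handled */

/* Toy CPU-interface state (hypothetical, not a real GIC register layout). */
struct toy_gic {
    bool    active[TOY_NR_IRQS];  /* interrupt is in the active state */
    uint8_t running_priority;     /* lower value = higher priority */
};

/* Acknowledge: the interrupt becomes active and raises the running priority. */
static void toy_ack(struct toy_gic *g, unsigned int irq, uint8_t prio)
{
    assert(irq < TOY_NR_IRQS);
    g->active[irq] = true;
    g->running_priority = prio;
}

/* Priority drop: what a write to GICC_EOIR does when EOImode == 1.
 * The running priority is lowered, but the interrupt stays active. */
static void toy_priority_drop(struct toy_gic *g, unsigned int irq)
{
    assert(irq < TOY_NR_IRQS);
    g->running_priority = TOY_PRIO_IDLE;
}

/* Deactivation: what a write to GICC_DIR does. Only now can the same
 * interrupt be signalled again. */
static void toy_deactivate(struct toy_gic *g, unsigned int irq)
{
    assert(irq < TOY_NR_IRQS);
    g->active[irq] = false;
}

/* An interrupt that is still active is not signalled to the CPU again. */
static bool toy_can_fire_again(const struct toy_gic *g, unsigned int irq)
{
    assert(irq < TOY_NR_IRQS);
    return !g->active[irq];
}
```

In this model, calling toy_priority_drop() alone leaves the interrupt unable to fire again, which matches the symptom described above (gic_interrupt() firing once and never again); it only recovers after toy_deactivate().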

         M.
-- 
Jazz is not dead. It just smells funny...


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-10 12:39                                                                 ` CodeWiz2280
  2020-06-10 12:53                                                                   ` Marc Zyngier
@ 2020-06-10 12:58                                                                   ` Julien Grall
  1 sibling, 0 replies; 55+ messages in thread
From: Julien Grall @ 2020-06-10 12:58 UTC (permalink / raw)
  To: CodeWiz2280, Bertrand Marquis
  Cc: Marc Zyngier, nd, Stefano Stabellini, xen-devel



On 10/06/2020 13:39, CodeWiz2280 wrote:
>>>> So the only way to solve this is actually to do the interrupt
>>>> deactivate inside Xen (using a maintenance interrupt).
>>>
>>> That's a terrible hack, and one that would encourage badly integrated HW.
>>> I appreciate the need to "make things work", but I'd be wary of putting
>>> this in released SW. Broken HW must die. I have written more than my share
>>> of these terrible hacks (see TX1 support), and I deeply regret it, as
>>> it has only given Si vendors an excuse not to fix things.
>>
>> Fully agree and I also had to do some hacks for the TX1 ;-)
>>
>>>
>>>> I remember that I also had to do something specific for the
>>>> configuration of edge/level and priorities to have an almost proper
>>>> behaviour.
>>>
>>> Well, the moment the GIC observes secure accesses when they should be
>>> non-secure, all bets are off and you have to resort to the above hacks.
>>> The fun part is that if you have secure SW running on this platform,
>>> you can probably DoS it from non-secure. It's good, isn't it?
>>
>> It definitely is, but if I remember correctly they have two kinds of SoC: one that can only be used in non-secure mode, and another that is meant to be used with both secure and non-secure.
>>
>> Bertrand
>>
>>>
>>>> Sadly I have no access to the code anymore, so I would need to guess
>>>> back what that was..
>>>
>>> I'd say this *is* a good thing.
> The problem is that a hack may be my only solution to getting this
> working on this platform.  If TI says that they don't support it then
> I'm stuck.

OOI, what's your end goal for Xen on Keystone?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-09 17:45                                                         ` Marc Zyngier
  2020-06-09 20:07                                                           ` CodeWiz2280
  2020-06-10  8:06                                                           ` Bertrand Marquis
@ 2020-06-10 21:46                                                           ` Julien Grall
  2020-06-15 19:14                                                             ` CodeWiz2280
  2 siblings, 1 reply; 55+ messages in thread
From: Julien Grall @ 2020-06-10 21:46 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: xen-devel, nd, Stefano Stabellini, CodeWiz2280, Bertrand Marquis

Hi Marc,

On Tue, 9 Jun 2020 at 18:45, Marc Zyngier <maz@kernel.org> wrote:
> > I was wondering if you heard about any potential issue with guest EOI
> > not forwarded to the host. This is on TI Keystone (Cortex A-15).
>
> Not that I know of. A-15 definitely works (TC2, Tegra-K1, Calxeda Midway
> all run just fine with guest EOI), and GIC-400 is a pretty solid piece
> of kit (it is just sloooooow...).
>
> Thinking of it, you would see something like that if the GIC was seeing
> the writes coming from the guest as secure instead of NS (cue the early
> firmware on XGene that exposed the wrong side of GIC-400).

Ah, I remember that one.  We used to carry a hack in Xen [1] for
X-Gene. Thankfully they fixed the firmware!

If it is a similar issue, then the firmware path would definitely be
my preference.

Thank you for the input!

Cheers,

[1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=50dcb3de603927db2fd87ba09e29c817415aaa44


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-10 21:46                                                           ` Julien Grall
@ 2020-06-15 19:14                                                             ` CodeWiz2280
  2020-06-15 21:32                                                               ` Stefano Stabellini
  2020-06-16  8:11                                                               ` Marc Zyngier
  0 siblings, 2 replies; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-15 19:14 UTC (permalink / raw)
  To: Julien Grall
  Cc: Marc Zyngier, nd, Stefano Stabellini, xen-devel, Bertrand Marquis

On Wed, Jun 10, 2020 at 5:46 PM Julien Grall <julien.grall.oss@gmail.com> wrote:
>
> Hi Marc,
>
> On Tue, 9 Jun 2020 at 18:45, Marc Zyngier <maz@kernel.org> wrote:
> > > I was wondering if you heard about any potential issue with guest EOI
> > > not forwarded to the host. This is on TI Keystone (Cortex A-15).
> >
> > Not that I know of. A-15 definitely works (TC2, Tegra-K1, Calxeda Midway
> > all run just fine with guest EOI), and GIC-400 is a pretty solid piece
> > of kit (it is just sloooooow...).
> >
> > Thinking of it, you would see something like that if the GIC was seeing
> > the writes coming from the guest as secure instead of NS (cue the early
> > firmware on XGene that exposed the wrong side of GIC-400).
>
> Ah, I remember that one.  We used to carry a hack in Xen [1] for
> X-Gene. Thankfully they fixed the firmware!
>
> If it is a similar issue, then the firmware path would definitely be
> my preference.
>
> Thank you for the input!

Thank you all for the information.  If I pull the changes to use the
maintenance interrupt for the X-Gene back into the latest build of Xen
then my issue with the Edge and Level interrupts is resolved.  My
ethernet and other devices work fine again for the Keystone in dom0.
Are there any concerns over operating this way, meaning with the
maintenance interrupt workaround rather than the EOI?  Is this safe?

Also, the latest linux kernel still has the X-Gene storm distributor
address as "0x78010000" in the device tree, which is what the Xen code
considers a match with the old firmware.  What were the addresses for
the device tree supposed to be changed to?  Is my understanding
correct that there is a different base address required to access the
"non-secure" region instead of the "secure" 0x78010000 region?  I'm
trying to see if there are corresponding different addresses for the
keystone K2E, but haven't found them yet in the manuals.

>
> Cheers,
>
> [1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=50dcb3de603927db2fd87ba09e29c817415aaa44


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-15 19:14                                                             ` CodeWiz2280
@ 2020-06-15 21:32                                                               ` Stefano Stabellini
  2020-06-16  7:56                                                                 ` Bertrand Marquis
  2020-06-16  8:11                                                               ` Marc Zyngier
  1 sibling, 1 reply; 55+ messages in thread
From: Stefano Stabellini @ 2020-06-15 21:32 UTC (permalink / raw)
  To: CodeWiz2280
  Cc: Stefano Stabellini, Marc Zyngier, Bertrand Marquis, xen-devel,
	nd, Julien Grall

On Mon, 15 Jun 2020, CodeWiz2280 wrote:
> On Wed, Jun 10, 2020 at 5:46 PM Julien Grall <julien.grall.oss@gmail.com> wrote:
> >
> > Hi Marc,
> >
> > On Tue, 9 Jun 2020 at 18:45, Marc Zyngier <maz@kernel.org> wrote:
> > > > I was wondering if you heard about any potential issue with guest EOI
> > > > not forwarded to the host. This is on TI Keystone (Cortex A-15).
> > >
> > > Not that I know of. A-15 definitely works (TC2, Tegra-K1, Calxeda Midway
> > > all run just fine with guest EOI), and GIC-400 is a pretty solid piece
> > > of kit (it is just sloooooow...).
> > >
> > > Thinking of it, you would see something like that if the GIC was seeing
> > > the writes coming from the guest as secure instead of NS (cue the early
> > > firmware on XGene that exposed the wrong side of GIC-400).
> >
> > Ah, I remember that one.  We used to carry a hack in Xen [1] for
> > X-Gene. Thankfully they fixed the firmware!
> >
> > If it is a similar issue, then the firmware path would definitely be
> > my preference.
> >
> > Thank you for the input!
> 
> Thank you all for the information.  If I pull the changes to use the
> maintenance interrupt for the X-Gene back into the latest build of Xen
> then my issue with the Edge and Level interrupts is resolved.  My
> ethernet and other devices work fine again for the Keystone in dom0.
> Are there any concerns over operating this way, meaning with the
> maintenance interrupt workaround rather than the EOI?  Is this safe?

It should be fine, a small impact on performance, that's all.


> Also, the latest linux kernel still has the X-Gene storm distributor
> address as "0x78010000" in the device tree, which is what the Xen code
> considers a match with the old firmware.  What were the addresses for
> the device tree supposed to be changed to?  Is my understanding
> correct that there is a different base address required to access the
> "non-secure" region instead of the "secure" 0x78010000 region?  I'm
> trying to see if there are corresponding different addresses for the
> keystone K2E, but haven't found them yet in the manuals.

I went through the old emails archive but couldn't find a mention of the
other address, sorry.


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-15 21:32                                                               ` Stefano Stabellini
@ 2020-06-16  7:56                                                                 ` Bertrand Marquis
  0 siblings, 0 replies; 55+ messages in thread
From: Bertrand Marquis @ 2020-06-16  7:56 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Marc Zyngier, nd, xen-devel, CodeWiz2280, Julien Grall



> On 15 Jun 2020, at 22:32, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> On Mon, 15 Jun 2020, CodeWiz2280 wrote:
>> On Wed, Jun 10, 2020 at 5:46 PM Julien Grall <julien.grall.oss@gmail.com> wrote:
>>> 
>>> Hi Marc,
>>> 
>>> On Tue, 9 Jun 2020 at 18:45, Marc Zyngier <maz@kernel.org> wrote:
>>>>> I was wondering if you heard about any potential issue with guest EOI
>>>>> not forwarded to the host. This is on TI Keystone (Cortex A-15).
>>>> 
>>>> Not that I know of. A-15 definitely works (TC2, Tegra-K1, Calxeda Midway
>>>> all run just fine with guest EOI), and GIC-400 is a pretty solid piece
>>>> of kit (it is just sloooooow...).
>>>> 
>>>> Thinking of it, you would see something like that if the GIC was seeing
>>>> the writes coming from the guest as secure instead of NS (cue the early
>>>> firmware on XGene that exposed the wrong side of GIC-400).
>>> 
>>> Ah, I remember that one.  We used to carry a hack in Xen [1] for
>>> X-Gene. Thankfully they fixed the firmware!
>>> 
>>> If it is a similar issue, then the firmware path would definitely be
>>> my preference.
>>> 
>>> Thank you for the input!
>> 
>> Thank you all for the information.  If I pull the changes to use the
>> maintenance interrupt for the X-Gene back into the latest build of Xen
>> then my issue with the Edge and Level interrupts is resolved.  My
>> ethernet and other devices work fine again for the Keystone in dom0.
>> Are there any concerns over operating this way, meaning with the
>> maintenance interrupt workaround rather than the EOI?  Is this safe?
> 
> It should be fine, a small impact on performance, that's all.

Agreed, this is safe, but you will have some overhead (one context switch back to Xen on each interrupt ack in Dom0, in your case).
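
To show where that extra trap comes from, here is a rough sketch of the two list-register encodings involved. The field positions follow my reading of the GICH_LR layout in the GICv2 architecture; the macro and function names are made up for the example and are not Xen's.

```c
#include <assert.h>
#include <stdint.h>

/* GICH_LR fields (GICv2), named here for illustration only. */
#define LR_HW             (1u << 31)  /* virtual IRQ is tied to a physical IRQ */
#define LR_STATE_PENDING  (1u << 28)  /* State field = pending */
#define LR_PRIORITY_SHIFT 23          /* 5-bit priority field */
#define LR_PHYSICAL_SHIFT 10          /* physical ID, only valid when HW == 1 */
#define LR_EOI_MAINT      (1u << 19)  /* maintenance IRQ on guest EOI, HW == 0 only */
#define LR_ID_MASK        0x3ffu      /* 10-bit interrupt ID */

/* Normal path: the guest EOI is forwarded by hardware and deactivates the
 * physical interrupt without involving the hypervisor. */
static uint32_t lr_hw_forwarded(uint32_t virq, uint32_t pirq, uint32_t prio5)
{
    assert(virq <= LR_ID_MASK && pirq <= LR_ID_MASK && prio5 < 32);
    return LR_HW | LR_STATE_PENDING | (prio5 << LR_PRIORITY_SHIFT) |
           (pirq << LR_PHYSICAL_SHIFT) | virq;
}

/* Workaround path: the interrupt is injected as a software one and the guest
 * EOI raises a maintenance interrupt, so the hypervisor is entered and can
 * deactivate the physical interrupt itself. */
static uint32_t lr_sw_with_maintenance(uint32_t virq, uint32_t prio5)
{
    assert(virq <= LR_ID_MASK && prio5 < 32);
    return LR_STATE_PENDING | LR_EOI_MAINT |
           (prio5 << LR_PRIORITY_SHIFT) | virq;
}
```

The second encoding costs one extra entry into the hypervisor per guest EOI, which is the overhead mentioned above.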

> 
> 
>> Also, the latest linux kernel still has the X-Gene storm distributor
>> address as "0x78010000" in the device tree, which is what the Xen code
>> considers a match with the old firmware.  What were the addresses for
>> the device tree supposed to be changed to?  Is my understanding
>> correct that there is a different base address required to access the
>> "non-secure" region instead of the "secure" 0x78010000 region?  I'm
>> trying to see if there are corresponding different addresses for the
>> keystone K2E, but haven't found them yet in the manuals.
> 
> I went through the old emails archive but couldn't find a mention of the
> other address, sorry.

I think there is no other address: even if there were one, the secure status reported by the core would still say that you are running in secure mode.
I would really suggest contacting TI directly on that point to get an official answer from them, as they might have a workaround.

Cheers
Bertrand





^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-15 19:14                                                             ` CodeWiz2280
  2020-06-15 21:32                                                               ` Stefano Stabellini
@ 2020-06-16  8:11                                                               ` Marc Zyngier
  2020-06-16 18:13                                                                 ` CodeWiz2280
  1 sibling, 1 reply; 55+ messages in thread
From: Marc Zyngier @ 2020-06-16  8:11 UTC (permalink / raw)
  To: CodeWiz2280
  Cc: xen-devel, nd, Bertrand Marquis, Stefano Stabellini, Julien Grall

On 2020-06-15 20:14, CodeWiz2280 wrote:

[...]

> Also, the latest linux kernel still has the X-Gene storm distributor
> address as "0x78010000" in the device tree, which is what the Xen code
> considers a match with the old firmware.  What were the addresses for
> the device tree supposed to be changed to?

We usually don't care, as the GIC address is provided by the bootloader, 
whether via DT or ACPI (this is certainly what happens on Mustang). 
Whatever is still in the kernel tree is just as dead as the platform it 
describes.

> Is my understanding
> correct that there is a different base address required to access the
> "non-secure" region instead of the "secure" 0x78010000 region?  I'm
> trying to see if there are corresponding different addresses for the
> keystone K2E, but haven't found them yet in the manuals.

There is no such address. Think of the NS bit as an *address space* 
identifier.

The only reason XGene presents the NS part of the GIC at a different 
address is because XGene is broken enough not to have EL3, hence no 
secure mode. To wire the GIC (and other standard ARM IPs) to the core, 
the designers simply used the CPU NS signal as an address bit.

On your platform, the NS bit does exist. I strongly suppose that it 
isn't wired to the GIC. Please talk to your SoC vendor about whether it
is possible to work around this.
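
The three situations discussed in this thread can be contrasted with a small toy model. Everything in it is an assumption made for illustration: the position and polarity of TOY_NS_ADDR_BIT are invented, the function names are hypothetical, and the "not wired" case is only the hypothesis for this platform, not a confirmed description of the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum gic_view { GIC_VIEW_SECURE, GIC_VIEW_NON_SECURE };

/* Invented bit position; the real one on XGene is not stated in this thread. */
#define TOY_NS_ADDR_BIT (UINT64_C(1) << 20)

/* Conventional design: NS travels next to the address as a separate signal,
 * so the same address reaches a different register view depending on NS. */
static enum gic_view view_ns_sideband(uint64_t addr, bool ns)
{
    (void)addr;
    return ns ? GIC_VIEW_NON_SECURE : GIC_VIEW_SECURE;
}

/* XGene-like design: no secure mode, NS is reused as an address bit, so the
 * two views simply live at two different addresses. */
static enum gic_view view_ns_as_address_bit(uint64_t addr, bool ns)
{
    (void)ns;
    return (addr & TOY_NS_ADDR_BIT) ? GIC_VIEW_NON_SECURE : GIC_VIEW_SECURE;
}

/* Hypothesis for this platform: the NS signal never reaches the GIC, so every
 * access looks secure and no alternative address can change that. */
static enum gic_view view_ns_not_wired(uint64_t addr, bool ns)
{
    (void)addr;
    (void)ns;
    return GIC_VIEW_SECURE;
}
```

Only in the second model does changing the address in a firmware table help; in the third, the fix (if any) has to come from how the interconnect is configured.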

         M.
-- 
Jazz is not dead. It just smells funny...


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-16  8:11                                                               ` Marc Zyngier
@ 2020-06-16 18:13                                                                 ` CodeWiz2280
  2020-06-16 18:23                                                                   ` Marc Zyngier
  0 siblings, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-16 18:13 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: xen-devel, nd, Bertrand Marquis, Stefano Stabellini, Julien Grall

On Tue, Jun 16, 2020 at 4:11 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On 2020-06-15 20:14, CodeWiz2280 wrote:
>
> [...]
>
> > Also, the latest linux kernel still has the X-Gene storm distributor
> > address as "0x78010000" in the device tree, which is what the Xen code
> > considers a match with the old firmware.  What were the addresses for
> > the device tree supposed to be changed to?
>
> We usually don't care, as the GIC address is provided by the bootloader,
> whether via DT or ACPI (this is certainly what happens on Mustang).
> Whatever is still in the kernel tree is just as dead as the platform it
> describes.
>
> > Is my understanding
> > correct that there is a different base address required to access the
> > "non-secure" region instead of the "secure" 0x78010000 region?  I'm
> > trying to see if there are corresponding different addresses for the
> > keystone K2E, but haven't found them yet in the manuals.
>
> There is no such address. Think of the NS bit as an *address space*
> identifier.
>
> The only reason XGene presents the NS part of the GIC at a different
> address is because XGene is broken enough not to have EL3, hence no
> secure mode. To wire the GIC (and other standard ARM IPs) to the core,
> the designers simply used the CPU NS signal as an address bit.
>
> On your platform, the NS bit does exist. I strongly suppose that it
> isn't wired to the GIC. Please talk to your SoC vendor about whether it
> is possible to work around this.
>
I do have a question about this out to TI, but at least this method
gives me something to work with in the meantime.  I was just looking
to confirm that there wouldn't be any other undesirable side effects
with Dom0 or DomU when using it.  Was there an actual FPGA for the
X-Gene that needed to be updated which controlled the GIC access?  Or
by firmware do you mean the boot loader (e.g. uboot)?  Thanks for the
support so far to all.

>          M.
> --
> Jazz is not dead. It just smells funny...


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-16 18:13                                                                 ` CodeWiz2280
@ 2020-06-16 18:23                                                                   ` Marc Zyngier
  2020-06-17 14:45                                                                     ` CodeWiz2280
  0 siblings, 1 reply; 55+ messages in thread
From: Marc Zyngier @ 2020-06-16 18:23 UTC (permalink / raw)
  To: CodeWiz2280
  Cc: xen-devel, nd, Bertrand Marquis, Stefano Stabellini, Julien Grall

On 2020-06-16 19:13, CodeWiz2280 wrote:
> On Tue, Jun 16, 2020 at 4:11 AM Marc Zyngier <maz@kernel.org> wrote:
>> 
>> On 2020-06-15 20:14, CodeWiz2280 wrote:
>> 
>> [...]
>> 
>> > Also, the latest linux kernel still has the X-Gene storm distributor
>> > address as "0x78010000" in the device tree, which is what the Xen code
>> > considers a match with the old firmware.  What were the addresses for
>> > the device tree supposed to be changed to?
>> 
>> We usually don't care, as the GIC address is provided by the 
>> bootloader,
>> whether via DT or ACPI (this is certainly what happens on Mustang).
>> Whatever is still in the kernel tree is just as dead as the platform 
>> it
>> describes.
>> 
>> > Is my understanding
>> > correct that there is a different base address required to access the
>> > "non-secure" region instead of the "secure" 0x78010000 region?  I'm
>> > trying to see if there are corresponding different addresses for the
>> > keystone K2E, but haven't found them yet in the manuals.
>> 
>> There is no such address. Think of the NS bit as an *address space*
>> identifier.
>> 
>> The only reason XGene presents the NS part of the GIC at a different
>> address is because XGene is broken enough not to have EL3, hence no
>> secure mode. To wire the GIC (and other standard ARM IPs) to the core,
>> the designers simply used the CPU NS signal as an address bit.
>> 
>> On your platform, the NS bit does exist. I strongly suppose that it
>> isn't wired to the GIC. Please talk to your SoC vendor about whether it
>> is possible to work around this.
>> 
> I do have a question about this out to TI, but at least this method
> gives me something to work with in the meantime.  I was just looking
> to confirm that there wouldn't be any other undesirable side effects
> with Dom0 or DomU when using it.  Was there an actual FPGA for the
> X-Gene that needed to be updated which controlled the GIC access?  Or
> by firmware do you mean the boot loader (e.g. uboot)?  Thanks for the
> support so far to all.

As I said, the specific case of XGene was just a matter of picking the 
right address, as the NS bit is used as an address bit on this platform. 
This was possible because this machine doesn't have any form of 
security. So no HW was changed, no FPGA reprogrammed. Only a firmware 
table was fixed to point to the right spot. Not even u-boot or EFI was 
changed.

         M.
-- 
Jazz is not dead. It just smells funny...


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-16 18:23                                                                   ` Marc Zyngier
@ 2020-06-17 14:45                                                                     ` CodeWiz2280
  2020-06-17 15:25                                                                       ` Marc Zyngier
  2020-06-17 18:46                                                                       ` Stefano Stabellini
  0 siblings, 2 replies; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-17 14:45 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: xen-devel, nd, Bertrand Marquis, Stefano Stabellini, Julien Grall

On Tue, Jun 16, 2020 at 2:23 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On 2020-06-16 19:13, CodeWiz2280 wrote:
> > On Tue, Jun 16, 2020 at 4:11 AM Marc Zyngier <maz@kernel.org> wrote:
> >>
> >> On 2020-06-15 20:14, CodeWiz2280 wrote:
> >>
> >> [...]
> >>
> >> > Also, the latest linux kernel still has the X-Gene storm distributor
> >> > address as "0x78010000" in the device tree, which is what the Xen code
> >> > considers a match with the old firmware.  What were the addresses for
> >> > the device tree supposed to be changed to?
> >>
> >> We usually don't care, as the GIC address is provided by the
> >> bootloader,
> >> whether via DT or ACPI (this is certainly what happens on Mustang).
> >> Whatever is still in the kernel tree is just as dead as the platform
> >> it
> >> describes.
> >>
> >> > Is my understanding
> >> > correct that there is a different base address required to access the
> >> > "non-secure" region instead of the "secure" 0x78010000 region?  I'm
> >> > trying to see if there are corresponding different addresses for the
> >> > keystone K2E, but haven't found them yet in the manuals.
> >>
> >> There is no such address. Think of the NS bit as an *address space*
> >> identifier.
> >>
> >> The only reason XGene presents the NS part of the GIC at a different
> >> address is because XGene is broken enough not to have EL3, hence no
> >> secure mode. To wire the GIC (and other standard ARM IPs) to the core,
> >> the designers simply used the CPU NS signal as an address bit.
> >>
> >> On your platform, the NS bit does exist. I strongly suppose that it
> >> isn't wired to the GIC. Please talk to your SoC vendor about whether it
> >> is possible to work around this.
> >>
> > I do have a question about this out to TI, but at least this method
> > gives me something to work with in the meantime.  I was just looking
> > to confirm that there wouldn't be any other undesirable side effects
> > with Dom0 or DomU when using it.  Was there an actual FPGA for the
> > X-Gene that needed to be updated which controlled the GIC access?  Or
> > by firmware do you mean the boot loader (e.g. uboot)?  Thanks for the
> > support so far to all.
>
> As I said, the specific case of XGene was just a matter of picking the
> right address, as the NS bit is used as an address bit on this platform.
> This was possible because this machine doesn't have any form of
> security. So no HW was changed, no FPGA reprogrammed. Only a firmware
> table was fixed to point to the right spot. Not even u-boot or EFI was
> changed.
Ok, thank you for clarifying.  I have one more question if you don't
mind.  I'm aware that dom0 can share physical memory with dom1 via
grant tables.
However, is it possible to reserve a chunk of contiguous physical
memory and directly allocate it only to dom1?
For example, if I wanted dom1 to have access to 8MB of contiguous
memory at 0x8200_0000 (in addition to whatever virtual memory Xen
gives it).
How would one go about doing this on ARM?  Is there something in the
guest config or device tree that can be set?  Thanks for your help.
>
>          M.
> --
> Jazz is not dead. It just smells funny...


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-17 14:45                                                                     ` CodeWiz2280
@ 2020-06-17 15:25                                                                       ` Marc Zyngier
  2020-06-17 18:46                                                                       ` Stefano Stabellini
  1 sibling, 0 replies; 55+ messages in thread
From: Marc Zyngier @ 2020-06-17 15:25 UTC (permalink / raw)
  To: CodeWiz2280
  Cc: xen-devel, nd, Bertrand Marquis, Stefano Stabellini, Julien Grall

On 2020-06-17 15:45, CodeWiz2280 wrote:
> On Tue, Jun 16, 2020 at 2:23 PM Marc Zyngier <maz@kernel.org> wrote:
>> 
>> On 2020-06-16 19:13, CodeWiz2280 wrote:
>> > On Tue, Jun 16, 2020 at 4:11 AM Marc Zyngier <maz@kernel.org> wrote:
>> >>
>> >> On 2020-06-15 20:14, CodeWiz2280 wrote:
>> >>
>> >> [...]
>> >>
>> >> > Also, the latest linux kernel still has the X-Gene storm distributor
>> >> > address as "0x78010000" in the device tree, which is what the Xen code
>> >> > considers a match with the old firmware.  What were the addresses for
>> >> > the device tree supposed to be changed to?
>> >>
>> >> We usually don't care, as the GIC address is provided by the
>> >> bootloader,
>> >> whether via DT or ACPI (this is certainly what happens on Mustang).
>> >> Whatever is still in the kernel tree is just as dead as the platform
>> >> it
>> >> describes.
>> >>
>> >> > Is my understanding
>> >> > correct that there is a different base address required to access the
>> >> > "non-secure" region instead of the "secure" 0x78010000 region?  I'm
>> >> > trying to see if there are corresponding different addresses for the
>> >> > keystone K2E, but haven't found them yet in the manuals.
>> >>
>> >> There is no such address. Think of the NS bit as an *address space*
>> >> identifier.
>> >>
>> >> The only reason XGene presents the NS part of the GIC at a different
>> >> address is because XGene is broken enough not to have EL3, hence no
>> >> secure mode. To wire the GIC (and other standard ARM IPs) to the core,
>> >> the designers simply used the CPU NS signal as an address bit.
>> >>
>> >> On your platform, the NS bit does exist. I strongly suppose that it
>> >> isn't wired to the GIC. Please talk to your SoC vendor for whether it
>> >> is possible to work around this.
>> >>
>> > I do have a question about this out to TI, but at least this method
>> > gives me something to work with in the meantime.  I was just looking
>> > to confirm that there wouldn't be any other undesirable side effects
>> > with Dom0 or DomU when using it.  Was there an actual FPGA for the
>> > X-Gene that needed to be updated which controlled the GIC access?  Or
>> > by firmware do you mean the boot loader (e.g. uboot).  Thanks for the
>> > support so far to all.
>> 
>> As I said, the specific case of XGene was just a matter of picking the
>> right address, as the NS bit is used as an address bit on this 
>> platform.
>> This was possible because this machine doesn't have any form of
>> security. So no HW was changed, no FPGA reprogrammed. Only a firmware
>> table was fixed to point to the right spot. Not even u-boot or EFI was
>> changed.
> Ok, thank you for clarifying.  I have one more question if you don't
> mind.  I'm aware that dom0 can share physical memory with dom1 via
> grant tables.
> However, is it possible to reserve a chunk of contiguous physical
> memory and directly allocate it only to dom1?
> For example, if I wanted dom1 to have access to 8MB of contiguous
> memory at 0x8200_0000 (in addition to whatever virtual memory Xen
> gives it).
> How would one go about doing this on ARM?  Is there something in the
> guest config or device tree that can be set?  Thanks for your help.

That's a question for someone who understands Xen (KVM maintainer here, 
sorry).

My hunch is that you could simply represent this memory as a device, and 
map that "device" into the guest. You'd still need Xen to give you the 
right memory attributes so that you can map it cacheable at Stage-1.

         M.
-- 
Jazz is not dead. It just smells funny...


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-17 14:45                                                                     ` CodeWiz2280
  2020-06-17 15:25                                                                       ` Marc Zyngier
@ 2020-06-17 18:46                                                                       ` Stefano Stabellini
  2020-06-17 23:52                                                                         ` CodeWiz2280
  1 sibling, 1 reply; 55+ messages in thread
From: Stefano Stabellini @ 2020-06-17 18:46 UTC (permalink / raw)
  To: CodeWiz2280
  Cc: Stefano Stabellini, Marc Zyngier, Bertrand Marquis, xen-devel,
	nd, Julien Grall

On Wed, 17 Jun 2020, CodeWiz2280 wrote:
> On Tue, Jun 16, 2020 at 2:23 PM Marc Zyngier <maz@kernel.org> wrote:
> >
> > On 2020-06-16 19:13, CodeWiz2280 wrote:
> > > On Tue, Jun 16, 2020 at 4:11 AM Marc Zyngier <maz@kernel.org> wrote:
> > >>
> > >> On 2020-06-15 20:14, CodeWiz2280 wrote:
> > >>
> > >> [...]
> > >>
> > >> > Also, the latest linux kernel still has the X-Gene storm distributor
> > >> > address as "0x78010000" in the device tree, which is what the Xen code
> > >> > considers a match with the old firmware.  What were the addresses for
> > >> > the device tree supposed to be changed to?
> > >>
> > >> We usually don't care, as the GIC address is provided by the
> > >> bootloader,
> > >> whether via DT or ACPI (this is certainly what happens on Mustang).
> > >> Whatever is still in the kernel tree is just as dead as the platform
> > >> it
> > >> describes.
> > >>
> > >> > Is my understanding
> > >> > correct that there is a different base address required to access the
> > >> > "non-secure" region instead of the "secure" 0x78010000 region?  I'm
> > >> > trying to see if there are corresponding different addresses for the
> > >> > keystone K2E, but haven't found them yet in the manuals.
> > >>
> > >> There is no such address. Think of the NS bit as an *address space*
> > >> identifier.
> > >>
> > >> The only reason XGene presents the NS part of the GIC at a different
> > >> address is because XGene is broken enough not to have EL3, hence no
> > >> secure mode. To wire the GIC (and other standard ARM IPs) to the core,
> > >> the designers simply used the CPU NS signal as an address bit.
> > >>
> > >> On your platform, the NS bit does exist. I strongly suppose that it
> > >> isn't wired to the GIC. Please talk to your SoC vendor for whether it
> > >> is possible to work around this.
> > >>
> > > I do have a question about this out to TI, but at least this method
> > > gives me something to work with in the meantime.  I was just looking
> > > to confirm that there wouldn't be any other undesirable side effects
> > > with Dom0 or DomU when using it.  Was there an actual FPGA for the
> > > X-Gene that needed to be updated which controlled the GIC access?  Or
> > > by firmware do you mean the boot loader (e.g. uboot).  Thanks for the
> > > support so far to all.
> >
> > As I said, the specific case of XGene was just a matter of picking the
> > right address, as the NS bit is used as an address bit on this platform.
> > This was possible because this machine doesn't have any form of
> > security. So no HW was changed, no FPGA reprogrammed. Only a firmware
> > table was fixed to point to the right spot. Not even u-boot or EFI was
> > changed.
> Ok, thank you for clarifying.  I have one more question if you don't
> mind.  I'm aware that dom0 can share physical memory with dom1 via
> grant tables.
> However, is it possible to reserve a chunk of contiguous physical
> memory and directly allocate it only to dom1?
> For example, if I wanted dom1 to have access to 8MB of contiguous
> memory at 0x8200_0000 (in addition to whatever virtual memory Xen
> gives it).
> How would one go about doing this on ARM?  Is there something in the
> guest config or device tree that can be set?  Thanks for your help.
 
There isn't a "proper" way to do it, only a workaround.

It is possible to do it by using the iomem option, which is meant for
device MMIO regions.

We have patches in the Xilinx Xen tree (not upstream) to allow for
specifying the cacheability that you want for the iomem mapping so that
you can map it as normal memory. This is the latest branch:

https://github.com/Xilinx/xen.git xilinx/release-2020.1

The relevant commits are the ones from:
https://github.com/Xilinx/xen/commit/a5c76ac1c5dc14d3e9ccedc5c1e7dd2ddc1397b6
to:
https://github.com/Xilinx/xen/commit/b4b7e91c1524f9cf530b81b7ba95df2bf50c78e4

You might want to make sure that the page is not used by the normal
memory allocator. This document explains how to do something along those
lines:

https://github.com/Xilinx/xen/commit/35f72d9130448272e07466bd73cc30406f33135e
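
A minimal sketch of how the iomem workaround could be written down for the
8MB region at 0x8200_0000 from the question. It assumes 4KB pages and the
xl.cfg form iomem = [ "START_PFN,NUM_PAGES" ] with hexadecimal values; the
helper name iomem_entry is made up for illustration. As noted above, the
cacheability patches would most likely still be needed to use the region as
normal memory rather than device memory.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, the unit the iomem option counts in


def iomem_entry(phys_start: int, size_bytes: int) -> str:
    """Format one xl.cfg iomem entry as "START_PFN,NUM_PAGES" in hex.

    Hypothetical helper for illustration; it only does the address
    arithmetic and does not talk to Xen.
    """
    page_size = 1 << PAGE_SHIFT
    if phys_start % page_size or size_bytes % page_size:
        raise ValueError("region must be page aligned")
    start_pfn = phys_start >> PAGE_SHIFT   # first machine page frame
    num_pages = size_bytes >> PAGE_SHIFT   # number of pages in the region
    return f"{start_pfn:#x},{num_pages:#x}"


if __name__ == "__main__":
    # 8 MiB starting at 0x8200_0000, as in the question
    entry = iomem_entry(0x8200_0000, 8 * 1024 * 1024)
    print(f'iomem = [ "{entry}" ]')
```

The printed line would go into the guest config file. The same range also has
to be kept away from the normal memory allocator, as described in the
document linked above.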


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-17 18:46                                                                       ` Stefano Stabellini
@ 2020-06-17 23:52                                                                         ` CodeWiz2280
  2020-06-23 20:50                                                                           ` CodeWiz2280
  0 siblings, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-17 23:52 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Marc Zyngier, nd, Bertrand Marquis, xen-devel, Julien Grall

On Wed, Jun 17, 2020 at 2:46 PM Stefano Stabellini
<sstabellini@kernel.org> wrote:
>
> On Wed, 17 Jun 2020, CodeWiz2280 wrote:
> > On Tue, Jun 16, 2020 at 2:23 PM Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > On 2020-06-16 19:13, CodeWiz2280 wrote:
> > > > On Tue, Jun 16, 2020 at 4:11 AM Marc Zyngier <maz@kernel.org> wrote:
> > > >>
> > > >> On 2020-06-15 20:14, CodeWiz2280 wrote:
> > > >>
> > > >> [...]
> > > >>
> > > >> > Also, the latest linux kernel still has the X-Gene storm distributor
> > > >> > address as "0x78010000" in the device tree, which is what the Xen code
> > > >> > considers a match with the old firmware.  What were the addresses for
> > > >> > the device tree supposed to be changed to?
> > > >>
> > > >> We usually don't care, as the GIC address is provided by the
> > > >> bootloader,
> > > >> whether via DT or ACPI (this is certainly what happens on Mustang).
> > > >> Whatever is still in the kernel tree is just as dead as the platform
> > > >> it
> > > >> describes.
> > > >>
> > > >> > Is my understanding
> > > >> > correct that there is a different base address required to access the
> > > >> > "non-secure" region instead of the "secure" 0x78010000 region?  I'm
> > > >> > trying to see if there are corresponding different addresses for the
> > > >> > keystone K2E, but haven't found them yet in the manuals.
> > > >>
> > > >> There is no such address. Think of the NS bit as an *address space*
> > > >> identifier.
> > > >>
> > > >> The only reason XGene presents the NS part of the GIC at a different
> > > >> address is because XGene is broken enough not to have EL3, hence no
> > > >> secure mode. To wire the GIC (and other standard ARM IPs) to the core,
> > > >> the designers simply used the CPU NS signal as an address bit.
> > > >>
> > > >> On your platform, the NS bit does exist. I strongly suppose that it
> > > >> isn't wired to the GIC. Please talk to your SoC vendor for whether it
> > > >> is possible to work around this.
> > > >>
> > > > I do have a question about this out to TI, but at least this method
> > > > gives me something to work with in the meantime.  I was just looking
> > > > to confirm that there wouldn't be any other undesirable side effects
> > > > with Dom0 or DomU when using it.  Was there an actual FPGA for the
> > > > X-Gene that needed to be updated which controlled the GIC access?  Or
> > > > by firmware do you mean the boot loader (e.g. uboot).  Thanks for the
> > > > support so far to all.
> > >
> > > As I said, the specific case of XGene was just a matter of picking the
> > > right address, as the NS bit is used as an address bit on this platform.
> > > This was possible because this machine doesn't have any form of
> > > security. So no HW was changed, no FPGA reprogrammed. Only a firmware
> > > table was fixed to point to the right spot. Not even u-boot or EFI was
> > > changed.
> > Ok, thank you for clarifying.  I have one more question if you don't
> > mind.  I'm aware that dom0 can share physical memory with dom1 via
> > grant tables.
> > However, is it possible to reserve a chunk of contiguous physical
> > memory and directly allocate it only to dom1?
> > For example, if I wanted dom1 to have access to 8MB of contiguous
> > memory at 0x8200_0000 (in addition to whatever virtual memory Xen
> > gives it).
> > How would one go about doing this on ARM?  Is there something in the
> > guest config or device tree that can be set?  Thanks for your help.
>
> There isn't a "proper" way to do it, only a workaround.
>
> It is possible to do it by using the iomem option, which is meant for
> device MMIO regions.
>
> We have patches in the Xilinx Xen tree (not upstream) to allow for
> specifying the cacheability that you want for the iomem mapping so that
> you can map it as normal memory. This is the latest branch:
>
> https://github.com/Xilinx/xen.git xilinx/release-2020.1
>
> The relevant commits are the ones from:
> https://github.com/Xilinx/xen/commit/a5c76ac1c5dc14d3e9ccedc5c1e7dd2ddc1397b6
> to:
> https://github.com/Xilinx/xen/commit/b4b7e91c1524f9cf530b81b7ba95df2bf50c78e4
>
> You might want to make sure that the page is not used by the normal
> memory allocator. This document explains how to do something along those
> lines:
>
> https://github.com/Xilinx/xen/commit/35f72d9130448272e07466bd73cc30406f33135e

Thank you.  I appreciate it.


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-17 23:52                                                                         ` CodeWiz2280
@ 2020-06-23 20:50                                                                           ` CodeWiz2280
  2020-06-24  7:50                                                                             ` Bertrand Marquis
  0 siblings, 1 reply; 55+ messages in thread
From: CodeWiz2280 @ 2020-06-23 20:50 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Marc Zyngier, nd, Bertrand Marquis, xen-devel, Julien Grall

Is it possible to passthrough PCI devices to domU on the 32-bit arm
keystone?  Any info is appreciated.

I found some old information online that "gic-v2m" is required.  I'm
not sure if the GIC-400 on the K2E supports "gic-v2m".  Based on the
fact that there is no "gic-v2m-frame" tag in the K2E device tree I'm
guessing that it does not.

If it is possible, is there a good example for arm that I can follow?

On Wed, Jun 17, 2020 at 7:52 PM CodeWiz2280 <codewiz2280@gmail.com> wrote:
>
> On Wed, Jun 17, 2020 at 2:46 PM Stefano Stabellini
> <sstabellini@kernel.org> wrote:
> >
> > On Wed, 17 Jun 2020, CodeWiz2280 wrote:
> > > On Tue, Jun 16, 2020 at 2:23 PM Marc Zyngier <maz@kernel.org> wrote:
> > > >
> > > > On 2020-06-16 19:13, CodeWiz2280 wrote:
> > > > > On Tue, Jun 16, 2020 at 4:11 AM Marc Zyngier <maz@kernel.org> wrote:
> > > > >>
> > > > >> On 2020-06-15 20:14, CodeWiz2280 wrote:
> > > > >>
> > > > >> [...]
> > > > >>
> > > > >> > Also, the latest linux kernel still has the X-Gene storm distributor
> > > > >> > address as "0x78010000" in the device tree, which is what the Xen code
> > > > >> > considers a match with the old firmware.  What were the addresses for
> > > > >> > the device tree supposed to be changed to?
> > > > >>
> > > > >> We usually don't care, as the GIC address is provided by the
> > > > >> bootloader,
> > > > >> whether via DT or ACPI (this is certainly what happens on Mustang).
> > > > >> Whatever is still in the kernel tree is just as dead as the platform
> > > > >> it
> > > > >> describes.
> > > > >>
> > > > >> > Is my understanding
> > > > >> > correct that there is a different base address required to access the
> > > > >> > "non-secure" region instead of the "secure" 0x78010000 region?  I'm
> > > > >> > trying to see if there are corresponding different addresses for the
> > > > >> > keystone K2E, but haven't found them yet in the manuals.
> > > > >>
> > > > >> There is no such address. Think of the NS bit as an *address space*
> > > > >> identifier.
> > > > >>
> > > > >> The only reason XGene presents the NS part of the GIC at a different
> > > > >> address is because XGene is broken enough not to have EL3, hence no
> > > > >> secure mode. To wire the GIC (and other standard ARM IPs) to the core,
> > > > >> the designers simply used the CPU NS signal as an address bit.
> > > > >>
> > > > >> On your platform, the NS bit does exist. I strongly suppose that it
> > > > >> isn't wired to the GIC. Please talk to your SoC vendor for whether it
> > > > >> is possible to work around this.
> > > > >>
> > > > > I do have a question about this out to TI, but at least this method
> > > > > gives me something to work with in the meantime.  I was just looking
> > > > > to confirm that there wouldn't be any other undesirable side effects
> > > > > with Dom0 or DomU when using it.  Was there an actual FPGA for the
> > > > > X-Gene that needed to be updated which controlled the GIC access?  Or
> > > > > by firmware do you mean the boot loader (e.g. uboot).  Thanks for the
> > > > > support so far to all.
> > > >
> > > > As I said, the specific case of XGene was just a matter of picking the
> > > > right address, as the NS bit is used as an address bit on this platform.
> > > > This was possible because this machine doesn't have any form of
> > > > security. So no HW was changed, no FPGA reprogrammed. Only a firmware
> > > > table was fixed to point to the right spot. Not even u-boot or EFI was
> > > > changed.
> > > Ok, thank you for clarifying.  I have one more question if you don't
> > > mind.  I'm aware that dom0 can share physical memory with dom1 via
> > > grant tables.
> > > However, is it possible to reserve a chunk of contiguous physical
> > > memory and directly allocate it only to dom1?
> > > For example, if I wanted dom1 to have access to 8MB of contiguous
> > > memory at 0x8200_0000 (in addition to whatever virtual memory Xen
> > > gives it).
> > > How would one go about doing this on ARM?  Is there something in the
> > > guest config or device tree that can be set?  Thanks for your help.
> >
> > There isn't a "proper" way to do it, only a workaround.
> >
> > It is possible to do it by using the iomem option, which is meant for
> > device MMIO regions.
> >
> > We have patches in the Xilinx Xen tree (not upstream) to allow for
> > specifying the cacheability that you want for the iomem mapping so that
> > you can map it as normal memory. This is the latest branch:
> >
> > https://github.com/Xilinx/xen.git xilinx/release-2020.1
> >
> > The relevant commits are the ones from:
> > https://github.com/Xilinx/xen/commit/a5c76ac1c5dc14d3e9ccedc5c1e7dd2ddc1397b6
> > to:
> > https://github.com/Xilinx/xen/commit/b4b7e91c1524f9cf530b81b7ba95df2bf50c78e4
> >
> > You might want to make sure that the page is not used by the normal
> > memory allocator. This document explains how to do something along those
> > lines:
> >
> > https://github.com/Xilinx/xen/commit/35f72d9130448272e07466bd73cc30406f33135e
>
> Thank you.  I appreciate it.


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-23 20:50                                                                           ` CodeWiz2280
@ 2020-06-24  7:50                                                                             ` Bertrand Marquis
  2020-06-24 17:28                                                                               ` Stefano Stabellini
  0 siblings, 1 reply; 55+ messages in thread
From: Bertrand Marquis @ 2020-06-24  7:50 UTC (permalink / raw)
  To: CodeWiz2280; +Cc: Marc Zyngier, nd, Stefano Stabellini, xen-devel, Julien Grall



> On 23 Jun 2020, at 21:50, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> 
> Is it possible to passthrough PCI devices to domU on the 32-bit arm
> keystone?  Any info is appreciated.
> 
> I found some old information online that "gic-v2m" is required.  I'm
> not sure if the GIC-400 on the K2E supports "gic-v2m".  Based on the
> fact that there is no "gic-v2m-frame" tag in the K2E device tree I'm
> guessing that it does not.
> 
> If it is possible, is there a good example for arm that I can follow?

There is no PCI passthrough support on Arm for now in Xen.

This is something I am working on and I will present something on this subject at the Xen summit.
But we are not targeting arm32 for now.

The only thing possible for now is to have PCI devices handled by Dom0 and to use Xen virtual drivers to expose their functionality (Ethernet, block, or others).

Regards
Bertrand

> 
> On Wed, Jun 17, 2020 at 7:52 PM CodeWiz2280 <codewiz2280@gmail.com> wrote:
>> 
>> On Wed, Jun 17, 2020 at 2:46 PM Stefano Stabellini
>> <sstabellini@kernel.org> wrote:
>>> 
>>> On Wed, 17 Jun 2020, CodeWiz2280 wrote:
>>>> On Tue, Jun 16, 2020 at 2:23 PM Marc Zyngier <maz@kernel.org> wrote:
>>>>> 
>>>>> On 2020-06-16 19:13, CodeWiz2280 wrote:
>>>>>> On Tue, Jun 16, 2020 at 4:11 AM Marc Zyngier <maz@kernel.org> wrote:
>>>>>>> 
>>>>>>> On 2020-06-15 20:14, CodeWiz2280 wrote:
>>>>>>> 
>>>>>>> [...]
>>>>>>> 
>>>>>>>> Also, the latest linux kernel still has the X-Gene storm distributor
>>>>>>>> address as "0x78010000" in the device tree, which is what the Xen code
>>>>>>>> considers a match with the old firmware.  What were the addresses for
>>>>>>>> the device tree supposed to be changed to?
>>>>>>> 
>>>>>>> We usually don't care, as the GIC address is provided by the
>>>>>>> bootloader,
>>>>>>> whether via DT or ACPI (this is certainly what happens on Mustang).
>>>>>>> Whatever is still in the kernel tree is just as dead as the platform
>>>>>>> it
>>>>>>> describes.
>>>>>>> 
>>>>>>>> Is my understanding
>>>>>>>> correct that there is a different base address required to access the
>>>>>>>> "non-secure" region instead of the "secure" 0x78010000 region?  I'm
>>>>>>>> trying to see if there are corresponding different addresses for the
>>>>>>>> keystone K2E, but haven't found them yet in the manuals.
>>>>>>> 
>>>>>>> There is no such address. Think of the NS bit as an *address space*
>>>>>>> identifier.
>>>>>>> 
>>>>>>> The only reason XGene presents the NS part of the GIC at a different
>>>>>>> address is because XGene is broken enough not to have EL3, hence no
>>>>>>> secure mode. To wire the GIC (and other standard ARM IPs) to the core,
>>>>>>> the designers simply used the CPU NS signal as an address bit.
>>>>>>> 
>>>>>>> On your platform, the NS bit does exist. I strongly suppose that it
>>>>>>> isn't wired to the GIC. Please talk to your SoC vendor for whether it
>>>>>>> is possible to work around this.
>>>>>>> 
>>>>>> I do have a question about this out to TI, but at least this method
>>>>>> gives me something to work with in the meantime.  I was just looking
>>>>>> to confirm that there wouldn't be any other undesirable side effects
>>>>>> with Dom0 or DomU when using it.  Was there an actual FPGA for the
>>>>>> X-Gene that needed to be updated which controlled the GIC access?  Or
>>>>>> by firmware do you mean the boot loader (e.g. uboot).  Thanks for the
>>>>>> support so far to all.
>>>>> 
>>>>> As I said, the specific case of XGene was just a matter of picking the
>>>>> right address, as the NS bit is used as an address bit on this platform.
>>>>> This was possible because this machine doesn't have any form of
>>>>> security. So no HW was changed, no FPGA reprogrammed. Only a firmware
>>>>> table was fixed to point to the right spot. Not even u-boot or EFI was
>>>>> changed.
>>>> Ok, thank you for clarifying.  I have one more question if you don't
>>>> mind.  I'm aware that dom0 can share physical memory with dom1 via
>>>> grant tables.
>>>> However, is it possible to reserve a chunk of contiguous physical
>>>> memory and directly allocate it only to dom1?
>>>> For example, if I wanted dom1 to have access to 8MB of contiguous
>>>> memory at 0x8200_0000 (in addition to whatever virtual memory Xen
>>>> gives it).
>>>> How would one go about doing this on ARM?  Is there something in the
>>>> guest config or device tree that can be set?  Thanks for your help.
>>> 
>>> There isn't a "proper" way to do it, only a workaround.
>>> 
>>> It is possible to do it by using the iomem option, which is meant for
>>> device MMIO regions.
>>> 
>>> We have patches in the Xilinx Xen tree (not upstream) to allow for
>>> specifying the cacheability that you want for the iomem mapping so that
>>> you can map it as normal memory. This is the latest branch:
>>> 
>>> https://github.com/Xilinx/xen.git xilinx/release-2020.1
>>> 
>>> The relevant commits are the ones from:
>>> https://github.com/Xilinx/xen/commit/a5c76ac1c5dc14d3e9ccedc5c1e7dd2ddc1397b6
>>> to:
>>> https://github.com/Xilinx/xen/commit/b4b7e91c1524f9cf530b81b7ba95df2bf50c78e4
>>> 
>>> You might want to make sure that the page is not used by the normal
>>> memory allocator. This document explains how to do something along those
>>> lines:
>>> 
>>> https://github.com/Xilinx/xen/commit/35f72d9130448272e07466bd73cc30406f33135e
>> 
>> Thank you.  I appreciate it.



^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Keystone Issue
  2020-06-24  7:50                                                                             ` Bertrand Marquis
@ 2020-06-24 17:28                                                                               ` Stefano Stabellini
  0 siblings, 0 replies; 55+ messages in thread
From: Stefano Stabellini @ 2020-06-24 17:28 UTC (permalink / raw)
  To: Bertrand Marquis
  Cc: Stefano Stabellini, Marc Zyngier, CodeWiz2280, xen-devel, nd,
	Julien Grall

On Wed, 24 Jun 2020, Bertrand Marquis wrote:
> > On 23 Jun 2020, at 21:50, CodeWiz2280 <codewiz2280@gmail.com> wrote:
> > 
> > Is it possible to passthrough PCI devices to domU on the 32-bit arm
> > keystone?  Any info is appreciated.
> > 
> > I found some old information online that "gic-v2m" is required.  I'm
> > not sure if the GIC-400 on the K2E supports "gic-v2m".  Based on the
> > fact that there is no "gic-v2m-frame" tag in the K2E device tree I'm
> > guessing that it does not.
> > 
> > If it is possible, is there a good example for arm that I can follow?
> 
> There is no PCI passthrough support on Arm for now in Xen.
> 
> This is something I am working on and I will present something on this subject at the Xen summit.
> But we are not targeting arm32 for now.
> 
> The only thing possible for now is to have PCI devices handled by Dom0 and to use Xen virtual drivers to expose their functionality (Ethernet, block, or others).

It should also be possible to pass the entire PCI controller, together with
the whole aperture and all interrupts to a domU. The domU would get all
PCI devices this way, not just one.
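
A rough sketch of what passing the whole controller might look like in a
guest config, assuming the xl.cfg iomem and irqs options are used for the
register block, the aperture, and the interrupts. Every address, size, and
interrupt number below is a placeholder rather than a K2E value, the helper
names are made up, and the "+ 32" offset for SPIs is an assumption about how
the irqs option numbers Arm interrupts. The guest would presumably also need
a matching device tree node for the controller, which is not shown here.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def pages(start: int, size: int) -> tuple[int, int]:
    """Return (first page frame, page count) covering [start, start + size)."""
    first = start >> PAGE_SHIFT
    last = (start + size - 1) >> PAGE_SHIFT
    return first, last - first + 1


def passthrough_fragment(regions, spis) -> str:
    """Build hypothetical iomem/irqs lines for a guest config.

    regions: list of (physical start, size in bytes) tuples
    spis:    list of SPI numbers as they appear in the host device tree
    """
    iomem = ", ".join(
        f'"{pfn:#x},{count:#x}"'
        for pfn, count in (pages(start, size) for start, size in regions)
    )
    # Assumption: the irqs option takes the interrupt ID, i.e. SPI number + 32.
    irqs = ", ".join(str(spi + 32) for spi in spis)
    return f"iomem = [ {iomem} ]\nirqs = [ {irqs} ]"


if __name__ == "__main__":
    # Placeholder controller: one 4 KiB register page and a 16 MiB aperture.
    regions = [(0x4000_0000, 0x1000), (0x6000_0000, 0x100_0000)]
    print(passthrough_fragment(regions, spis=[40, 41]))
```

The real ranges and interrupts would have to come from the PCI controller
node in the host device tree.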


 
> > On Wed, Jun 17, 2020 at 7:52 PM CodeWiz2280 <codewiz2280@gmail.com> wrote:
> >> 
> >> On Wed, Jun 17, 2020 at 2:46 PM Stefano Stabellini
> >> <sstabellini@kernel.org> wrote:
> >>> 
> >>> On Wed, 17 Jun 2020, CodeWiz2280 wrote:
> >>>> On Tue, Jun 16, 2020 at 2:23 PM Marc Zyngier <maz@kernel.org> wrote:
> >>>>> 
> >>>>> On 2020-06-16 19:13, CodeWiz2280 wrote:
> >>>>>> On Tue, Jun 16, 2020 at 4:11 AM Marc Zyngier <maz@kernel.org> wrote:
> >>>>>>> 
> >>>>>>> On 2020-06-15 20:14, CodeWiz2280 wrote:
> >>>>>>> 
> >>>>>>> [...]
> >>>>>>> 
> >>>>>>>> Also, the latest linux kernel still has the X-Gene storm distributor
> >>>>>>>> address as "0x78010000" in the device tree, which is what the Xen code
> >>>>>>>> considers a match with the old firmware.  What were the addresses for
> >>>>>>>> the device tree supposed to be changed to?
> >>>>>>> 
> >>>>>>> We usually don't care, as the GIC address is provided by the
> >>>>>>> bootloader,
> >>>>>>> whether via DT or ACPI (this is certainly what happens on Mustang).
> >>>>>>> Whatever is still in the kernel tree is just as dead as the platform
> >>>>>>> it
> >>>>>>> describes.
> >>>>>>> 
> >>>>>>>> Is my understanding
> >>>>>>>> correct that there is a different base address required to access the
> >>>>>>>> "non-secure" region instead of the "secure" 0x78010000 region?  I'm
> >>>>>>>> trying to see if there are corresponding different addresses for the
> >>>>>>>> keystone K2E, but haven't found them yet in the manuals.
> >>>>>>> 
> >>>>>>> There is no such address. Think of the NS bit as an *address space*
> >>>>>>> identifier.
> >>>>>>> 
> >>>>>>> The only reason XGene presents the NS part of the GIC at a different
> >>>>>>> address is because XGene is broken enough not to have EL3, hence no
> >>>>>>> secure mode. To wire the GIC (and other standard ARM IPs) to the core,
> >>>>>>> the designers simply used the CPU NS signal as an address bit.
> >>>>>>> 
> >>>>>>> On your platform, the NS bit does exist. I strongly suppose that it
> >>>>>>> isn't wired to the GIC. Please talk to your SoC vendor for whether it
> >>>>>>> is possible to work around this.
> >>>>>>> 
> >>>>>> I do have a question about this out to TI, but at least this method
> >>>>>> gives me something to work with in the meantime.  I was just looking
> >>>>>> to confirm that there wouldn't be any other undesirable side effects
> >>>>>> with Dom0 or DomU when using it.  Was there an actual FPGA for the
> >>>>>> X-Gene that needed to be updated which controlled the GIC access?  Or
> >>>>>> by firmware do you mean the boot loader (e.g. uboot).  Thanks for the
> >>>>>> support so far to all.
> >>>>> 
> >>>>> As I said, the specific case of XGene was just a matter of picking the
> >>>>> right address, as the NS bit is used as an address bit on this platform.
> >>>>> This was possible because this machine doesn't have any form of
> >>>>> security. So no HW was changed, no FPGA reprogrammed. Only a firmware
> >>>>> table was fixed to point to the right spot. Not even u-boot or EFI was
> >>>>> changed.
> >>>> Ok, thank you for clarifying.  I have one more question if you don't
> >>>> mind.  I'm aware that dom0 can share physical memory with dom1 via
> >>>> grant tables.
> >>>> However, is it possible to reserve a chunk of contiguous physical
> >>>> memory and directly allocate it only to dom1?
> >>>> For example, if I wanted dom1 to have access to 8MB of contiguous
> >>>> memory at 0x8200_0000 (in addition to whatever virtual memory Xen
> >>>> gives it).
> >>>> How would one go about doing this on ARM?  Is there something in the
> >>>> guest config or device tree that can be set?  Thanks for your help.
> >>> 
> >>> There isn't a "proper" way to do it, only a workaround.
> >>> 
> >>> It is possible to do it by using the iomem option, which is meant for
> >>> device MMIO regions.
> >>> 
> >>> We have patches in the Xilinx Xen tree (not upstream) to allow for
> >>> specifying the cacheability that you want for the iomem mapping so that
> >>> you can map it as normal memory. This is the latest branch:
> >>> 
> >>> https://github.com/Xilinx/xen.git xilinx/release-2020.1
> >>> 
> >>> The relevant commits are the ones from:
> >>> https://github.com/Xilinx/xen/commit/a5c76ac1c5dc14d3e9ccedc5c1e7dd2ddc1397b6
> >>> to:
> >>> https://github.com/Xilinx/xen/commit/b4b7e91c1524f9cf530b81b7ba95df2bf50c78e4
> >>> 
> >>> You might want to make sure that the pages are not used by the normal
> >>> memory allocator. This document explains how to do something along
> >>> those lines:
> >>> 
> >>> https://github.com/Xilinx/xen/commit/35f72d9130448272e07466bd73cc30406f33135e
> >> 
> >> Thank you.  I appreciate it.
> 
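
For reference, a rough sketch of the arithmetic behind the iomem workaround
discussed above. The xl guest config iomem option takes hexadecimal page
frame numbers in the form "IOMEM_START,NUM_PAGES[@GFN]", not byte addresses,
so the 8MB region at 0x8200_0000 has to be converted first. The helper name
iomem_entry and the 4 KiB page size are assumptions made for illustration;
they do not come from the thread.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def iomem_entry(base, size, page_size=PAGE_SIZE):
    """Return an 'IOMEM_START,NUM_PAGES' string for an xl iomem entry.

    Hypothetical helper: both values are hex page frame numbers,
    which is the unit the xl guest config expects.
    """
    if base % page_size or size % page_size:
        raise ValueError("base and size must be page aligned")
    start_pfn = base // page_size   # first page frame of the region
    num_pages = size // page_size   # region length in pages
    return f"{start_pfn:#x},{num_pages:#x}"


# 8MB of contiguous memory at 0x8200_0000, as in the question above.
entry = iomem_entry(0x8200_0000, 8 * 1024 * 1024)
print(f'iomem = ["{entry}"]')  # prints: iomem = ["0x82000,0x800"]
```

With no @GFN suffix the region is mapped 1:1 into the guest. As noted in the
reply, iomem is meant for device MMIO, so mapping the region as normal
cacheable memory depends on the out-of-tree Xilinx patches, and the range
should also be kept away from the normal memory allocator.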



end of thread, other threads:[~2020-06-24 17:29 UTC | newest]

Thread overview: 55+ messages
2020-06-01 12:38 Keystone Issue CodeWiz2280
2020-06-01 13:29 ` Julien Grall
2020-06-01 15:21   ` CodeWiz2280
2020-06-01 17:38     ` CodeWiz2280
2020-06-03 11:32       ` Julien Grall
2020-06-03 17:13         ` CodeWiz2280
2020-06-03 18:09           ` Julien Grall
2020-06-03 18:37             ` CodeWiz2280
2020-06-04  8:02             ` Bertrand Marquis
2020-06-04  8:59               ` Julien Grall
2020-06-04  9:08                 ` Bertrand Marquis
2020-06-04 10:15                   ` Julien Grall
2020-06-04 12:07                     ` CodeWiz2280
2020-06-04 18:24                       ` Julien Grall
2020-06-05  2:29                         ` CodeWiz2280
2020-06-05  7:36                           ` Bertrand Marquis
2020-06-05 12:25                             ` CodeWiz2280
2020-06-05 12:30                               ` Julien Grall
2020-06-05 12:42                                 ` CodeWiz2280
2020-06-05 12:47                                   ` Bertrand Marquis
2020-06-05 15:05                                     ` CodeWiz2280
2020-06-05 19:12                                       ` CodeWiz2280
2020-06-08  8:40                                         ` Bertrand Marquis
2020-06-08 12:33                                           ` CodeWiz2280
2020-06-08 16:13                                             ` Stefano Stabellini
2020-06-09 14:33                                               ` CodeWiz2280
2020-06-09 15:28                                                 ` Bertrand Marquis
2020-06-09 15:47                                                   ` Julien Grall
2020-06-09 15:58                                                     ` CodeWiz2280
2020-06-09 17:05                                                       ` Bertrand Marquis
2020-06-09 17:03                                                     ` Bertrand Marquis
2020-06-09 17:32                                                       ` Julien Grall
2020-06-09 17:45                                                         ` Marc Zyngier
2020-06-09 20:07                                                           ` CodeWiz2280
2020-06-10  8:13                                                             ` Bertrand Marquis
2020-06-10  8:06                                                           ` Bertrand Marquis
2020-06-10  8:20                                                             ` Marc Zyngier
2020-06-10  8:39                                                               ` Bertrand Marquis
2020-06-10 12:39                                                                 ` CodeWiz2280
2020-06-10 12:53                                                                   ` Marc Zyngier
2020-06-10 12:58                                                                   ` Julien Grall
2020-06-10 21:46                                                           ` Julien Grall
2020-06-15 19:14                                                             ` CodeWiz2280
2020-06-15 21:32                                                               ` Stefano Stabellini
2020-06-16  7:56                                                                 ` Bertrand Marquis
2020-06-16  8:11                                                               ` Marc Zyngier
2020-06-16 18:13                                                                 ` CodeWiz2280
2020-06-16 18:23                                                                   ` Marc Zyngier
2020-06-17 14:45                                                                     ` CodeWiz2280
2020-06-17 15:25                                                                       ` Marc Zyngier
2020-06-17 18:46                                                                       ` Stefano Stabellini
2020-06-17 23:52                                                                         ` CodeWiz2280
2020-06-23 20:50                                                                           ` CodeWiz2280
2020-06-24  7:50                                                                             ` Bertrand Marquis
2020-06-24 17:28                                                                               ` Stefano Stabellini
