qemu-devel.nongnu.org archive mirror
From: "Alex Bennée" <alex.bennee@linaro.org>
To: Bug 1859384 <1859384@bugs.launchpad.net>
Cc: qemu-devel@nongnu.org
Subject: Re: [Bug 1859384] [NEW] arm gic: interrupt model never 1 on non-mpcore and race condition in gic_acknowledge_irq
Date: Mon, 13 Jan 2020 13:44:16 +0000
Message-ID: <87k15vo4i7.fsf@linaro.org>
In-Reply-To: <157887973843.5281.117317310678495552.malonedeb@gac.canonical.com>


Alex Longwall <1859384@bugs.launchpad.net> writes:

> Public bug reported:
>
> For a 1-N interrupt (any SPI on the GICv2), as mandated by the TRM, only
> one CPU can acknowledge the IRQ until it becomes inactive.
>
> The TRM also mandates that SGIs and PPIs follow the N-N model and that
> SPIs follow the 1-N model.
>
> However this is not currently the case with QEMU. I have locally (no
> minimal test case) seen e.g. uart interrupts being acknowledged twice
> before having been deactivated (expected: irqId on one CPU and 1023 on
> the other instead).

You might find there is enough in kvm-unit-tests GIC tests already to
build a test case for what you are seeing.
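
For reference, the behaviour the reporter expects (one CPU reads the
interrupt ID, every other CPU reads 1023) can be written down as a small
standalone C model. This is only a sketch under simplifying assumptions:
struct toy_spi, toy_acknowledge() and toy_deactivate() are hypothetical
names invented here, not QEMU's GICState or its helpers, and the model
ignores priorities, enables and the active-and-pending state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NCPU     4
#define TOY_SPURIOUS 1023u  /* ID read when there is nothing to acknowledge */

/* Hypothetical toy state for a single SPI; not QEMU's GICState. */
struct toy_spi {
    uint32_t intid;    /* e.g. 33 (0x21), the UART interrupt in the trace */
    bool     pending;  /* waiting to be taken by one of the target CPUs */
    bool     active;   /* acknowledged and not yet deactivated */
    uint8_t  targets;  /* bitmask of CPUs the SPI is routed to */
};

/* Model of `cpu` reading the interrupt acknowledge register. */
static uint32_t toy_acknowledge(struct toy_spi *spi, unsigned cpu)
{
    assert(cpu < TOY_NCPU);
    if (!spi->pending || spi->active || !(spi->targets & (1u << cpu))) {
        return TOY_SPURIOUS;
    }
    /* 1-N: the first acknowledge takes the interrupt away from all targets. */
    spi->pending = false;
    spi->active = true;
    return spi->intid;
}

/* Model of deactivation: only after this may the SPI be taken again. */
static void toy_deactivate(struct toy_spi *spi)
{
    spi->active = false;
}
```

With an SPI routed to CPUs 0-3, toy_acknowledge() on CPU 1 returns 33
and a following call on CPU 0 returns 1023, which is the opposite of the
two val=0x21 reads in the trace below.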

>
> I have narrowed the issue down to the following:
>
> 1) arm_gic_common_reset resets all irq_state[id] fields to 0. This means
> all IRQs will use the N-N model, and if s->revision != REV_11MPCORE, then
> there's no way to set any interrupt to 1-N.
>
> If "fixed" locally with a hackjob, I still have the following trace:
>
> pl011_irq_state 534130.800 pid=2424 level=0x1
> gic_set_irq 2.900 pid=2424 irq=0x21 level=0x1 cpumask=0xff target=0xff
> gic_update_set_irq 3.300 pid=2424 cpu=0x0 name=irq level=0x1
> gic_update_set_irq 4.200 pid=2424 cpu=0x1 name=irq level=0x1
> gic_acknowledge_irq 539.400 pid=2424 s=cpu cpu=0x1 irq=0x21
> gic_update_set_irq 269.800 pid=2424 cpu=0x0 name=irq level=0x1
> gic_cpu_read 4.100 pid=2424 s=cpu cpu=0x1 addr=0xc val=0x21
> gic_acknowledge_irq 15.600 pid=2424 s=cpu cpu=0x0 irq=0x21
> gic_cpu_read 265.000 pid=2424 s=cpu cpu=0x0 addr=0xc val=0x21
> pl011_write 1594.700 pid=2424 addr=0x44 value=0x50
> pl011_irq_state 2.000 pid=2424 level=0x0
> gic_set_irq 1.300 pid=2424 irq=0x21 level=0x0 cpumask=0xff target=0xff
> pl011_write 30.700 pid=2424 addr=0x38 value=0x0
> pl011_irq_state 1.200 pid=2424 level=0x0
> gic_cpu_write 110.600 pid=2424 s=cpu cpu=0x0 addr=0x10 val=0x21
> gic_cpu_write 193.400 pid=2424 s=cpu cpu=0x0 addr=0x1000 val=0x21
> pl011_irq_state 1169.500 pid=2424 level=0x0
>
> This is because:
>
> 2) gic_acknowledge_irq calls gic_clear_pending which uses
> GIC_DIST_CLEAR_PENDING but this usually has no effect on level-sensitive
> interrupts.
>
> With this often being a no-op (i.e. assuming ispendr was not written to),
> any 1-N level-sensitive interrupt is still improperly pending on all the
> other cores.
>
> (Also, I don't really know how the qemu thread model works, there might
> be race conditions in the acknowledgment logic if gic_acknowledge_irq is
> called by multiple threads, too.)

All updates to the GIC internals should be protected by the BQL (the
Big QEMU Lock), which is taken for MMIO accesses to emulated devices.
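
Point 2) in the report above is easier to see in code. The sketch below
is a standalone toy model, assuming that a level-sensitive interrupt
counts as pending whenever its input line is asserted, independently of
any latched pending bit. The names struct toy_irq, toy_is_pending() and
toy_clear_pending() are hypothetical and are not the QEMU helpers named
in the report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-interrupt state; not QEMU's irq_state layout. */
struct toy_irq {
    bool edge_triggered; /* false means level-sensitive */
    bool level;          /* current state of the input line */
    bool pending_latch;  /* set by an edge or by a set-pending write */
};

/* Assumed rule: level-sensitive interrupts follow the line as well. */
static bool toy_is_pending(const struct toy_irq *irq)
{
    return irq->pending_latch || (!irq->edge_triggered && irq->level);
}

/* Clearing the latch alone cannot hide an asserted level-sensitive line. */
static void toy_clear_pending(struct toy_irq *irq)
{
    irq->pending_latch = false;
}
```

In this model, calling toy_clear_pending() on a level-sensitive
interrupt whose line is still high leaves toy_is_pending() true, so an
acknowledge on one CPU does not stop the other CPUs from seeing it.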

>
> Option used:
> -nographic -machine virt,virtualization=on,accel=tcg,gic-version=2 -cpu cortex-a57 -smp 4 -m 1024
> -kernel whatever.elf -d unimp,guest_errors -semihosting-config enable,target=native
> -chardev stdio,id=uart -serial chardev:uart -monitor none
> -trace gic_update_set_irq -trace gic_acknowledge_irq -trace pl011_irq_state -trace pl011_write -trace gic_cpu_read -trace gic_cpu_write
> -trace gic_set_irq
>
> Commit used: dc65a5bdc9fa543690a775b50d4ffbeb22c56d6d "Merge remote-
> tracking branch 'remotes/dgibson/tags/ppc-for-5.0-20200108' into
> staging"
>
> ** Affects: qemu
>      Importance: Undecided
>          Status: New
>
>
> ** Tags: arm gic


-- 
Alex Bennée


-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1859384

Title:
  arm gic: interrupt model never 1 on non-mpcore and race condition in
  gic_acknowledge_irq

Status in QEMU:
  New

Thread overview: 15+ messages
2020-01-13  1:42 [Bug 1859384] [NEW] arm gic: interrupt model never 1 on non-mpcore and race condition in gic_acknowledge_irq Alex Longwall
2020-01-13  1:55 ` [Bug 1859384] " Alex Longwall
2020-01-13 13:44 ` Alex Bennée [this message]
2020-01-13 13:44   ` [Bug 1859384] [NEW] " Alex Bennée
2020-01-13 14:14 ` [Bug 1859384] " Peter Maydell
2020-01-13 14:23 ` Alex Longwall
2020-01-13 14:36 ` Peter Maydell
2020-01-13 14:37 ` Alex Longwall
2020-01-13 15:13 ` Alex Longwall
2020-01-13 15:33 ` Alex Longwall
2020-01-13 17:22 ` Alex Longwall
2020-01-14 18:14 ` Alex Bennée
2020-01-17 15:08 ` [Bug 1859384] Re: arm gic: gic_acknowledge_irq doesn't clear line level for other cores for 1-n level-sensitive interrupts and gic_clear_pending uses GIC_DIST_TEST_MODEL (even on v2 where it always read 0 - "N-N") Alex Longwall
2020-11-05 11:02 ` Peter Maydell
2021-05-11  5:37 ` Thomas Huth
