* [Qemu-devel] GICv3/GIC-500
@ 2015-03-05  8:54 Shlomo Pongratz
  2015-03-05 12:01 ` Peter Maydell
  0 siblings, 1 reply; 3+ messages in thread
From: Shlomo Pongratz @ 2015-03-05  8:54 UTC (permalink / raw)
  To: qemu-devel

Hi,

I'm trying to implement GICv3 (actually GIC-500) in order to support more than 8 cores for ARM64.
Up to 24 cores I didn't notice any significant problems (just a slow boot), but with 32 or 64 cores the Linux kernel usually gets stuck and only seldom completes the boot.
When examining the registers I see that all cores except one are stuck at either "PC=ffffffc000108a18" or "PC=ffffffc000108a18", which according to objdump is in:
kernel/stop_machine.c::multi_cpu_stop

        msdata->state = newstate;
ffffffc000108a10:       b9002261        str     w1, [x19,#32]
ffffffc000108a14:       2a1403e1        mov     w1, w20
                        default:
                                break;
                        }
                        ack_state(msdata);
                }
        } while (curstate != MULTI_STOP_EXIT);
ffffffc000108a18:       7100103f        cmp     w1, #0x4
ffffffc000108a1c:       54000120        b.eq    ffffffc000108a40 <multi_cpu_stop+0xcc>

        /* Simple state machine */
        do {
                /* Chill out and ensure we re-read multi_stop_state. */
                cpu_relax();
                if (msdata->state != curstate) {
ffffffc000108a20:       b9402274        ldr     w20, [x19,#32]
ffffffc000108a24:       6b01029f        cmp     w20, w1
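
The loop in the listing above is a lock-step barrier: every CPU spins until the shared state changes, acknowledges it, and the last CPU to acknowledge advances the state for everyone. A minimal sketch of that behaviour follows, assuming a single-threaded round-robin scheduler and omitting the IRQ disabling and the callback that the real kernel code performs. The names struct model, cpu_step and simulate are hypothetical; only the enum ordering (MULTI_STOP_EXIT == 4, matching "cmp w1, #0x4") is taken from the kernel.

```c
#include <stddef.h>

/* Same ordering as the kernel enum, so MULTI_STOP_EXIT == 4. */
enum multi_stop_state {
    MULTI_STOP_NONE,
    MULTI_STOP_PREPARE,
    MULTI_STOP_DISABLE_IRQ,
    MULTI_STOP_RUN,
    MULTI_STOP_EXIT,
};

#define MAX_CPUS 64

/* Hypothetical model of the shared msdata plus per-CPU curstate. */
struct model {
    int num_cpus;
    enum multi_stop_state state;          /* shared msdata->state */
    int thread_ack;                       /* CPUs yet to ack this state */
    enum multi_stop_state cur[MAX_CPUS];  /* per-CPU curstate */
};

static void set_state(struct model *m, enum multi_stop_state s)
{
    m->thread_ack = m->num_cpus;
    m->state = s;
}

/* One pass of the do/while body for a single CPU. */
static void cpu_step(struct model *m, int cpu)
{
    if (m->cur[cpu] == MULTI_STOP_EXIT)
        return;
    if (m->state != m->cur[cpu]) {
        m->cur[cpu] = m->state;
        /* ack_state(): the last CPU to ack moves everyone forward.
         * The EXIT guard is a simplification of this model. */
        if (--m->thread_ack == 0 && m->state != MULTI_STOP_EXIT)
            set_state(m, (enum multi_stop_state)(m->state + 1));
    }
}

/* Run the CPUs round-robin; the CPU whose index equals `stuck` never
 * executes the loop (pass -1 for none). Returns how many CPUs reached
 * MULTI_STOP_EXIT, or -1 if num_cpus is out of range. */
static int simulate(int num_cpus, int stuck, int rounds)
{
    struct model m = { .num_cpus = num_cpus };
    int done = 0;

    if (num_cpus < 1 || num_cpus > MAX_CPUS)
        return -1;
    set_state(&m, MULTI_STOP_PREPARE);
    for (int r = 0; r < rounds; r++)
        for (int cpu = 0; cpu < num_cpus; cpu++)
            if (cpu != stuck)
                cpu_step(&m, cpu);
    for (int cpu = 0; cpu < num_cpus; cpu++)
        done += (m.cur[cpu] == MULTI_STOP_EXIT);
    return done;
}
```

In this model simulate(32, -1, 8) lets all 32 CPUs exit, whereas simulate(32, 7, 1000) leaves every CPU spinning in the loop: a single CPU that never acknowledges is enough to hold all the others there.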

There is one CPU, however (and there is always one such CPU), that is stuck at "PC=ffffffc0002cd9f4", which is in

drivers/irqchip/irq-gic-v3.c::gic_eoi_irq

static void gic_eoi_irq(struct irq_data *d)
{
        gic_write_eoir(gic_irq(d));
ffffffc0002cd9ec:       b9400800        ldr     w0, [x0,#8]
ffffffc0002cd9f0:       d518cc20        msr     s3_0_c12_c12_1, x0
ffffffc0002cd9f4:       d5033fdf        isb
}
ffffffc0002cd9f8:       d65f03c0        ret

But according to target-arm/translate-a64.c::handle_sync, "isb" is translated as a no-op!
BTW, X00=000000000000001b is the virtual timer, IRQ 27, so it seems that only this core (number 7 at this point) is getting clock interrupts.

Can anyone give me advice on how to debug this issue further?
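
For reference, the value in X0 is the INTID being written to ICC_EOIR1_EL1 (the s3_0_c12_c12_1 register in the listing). A small sketch of how an INTID maps onto the interrupt classes of the base GICv3 architecture is shown below; the enum and function names are hypothetical and are not taken from the kernel or from QEMU.

```c
#include <stdint.h>

/* Hypothetical names for the base GICv3 INTID ranges. */
enum intid_class {
    INTID_SGI,       /* 0..15: software generated, per CPU */
    INTID_PPI,       /* 16..31: private peripheral, per CPU */
    INTID_SPI,       /* 32..1019: shared peripheral */
    INTID_SPECIAL,   /* 1020..1023: special values such as spurious */
    INTID_RESERVED,  /* 1024..8191: reserved in base GICv3 */
    INTID_LPI,       /* 8192 and up: locality-specific */
};

static enum intid_class classify_intid(uint32_t intid)
{
    if (intid <= 15)
        return INTID_SGI;
    if (intid <= 31)
        return INTID_PPI;
    if (intid <= 1019)
        return INTID_SPI;
    if (intid <= 1023)
        return INTID_SPECIAL;
    if (intid < 8192)
        return INTID_RESERVED;
    return INTID_LPI;
}
```

classify_intid(0x1b) returns INTID_PPI: INTID 27 is a per-CPU private interrupt (PPI 11 when counted from 16), which is consistent with it being a timer interrupt local to the core that is handling it.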

Best regards,

S.P.

* Re: [Qemu-devel] GICv3/GIC-500
  2015-03-05  8:54 [Qemu-devel] GICv3/GIC-500 Shlomo Pongratz
@ 2015-03-05 12:01 ` Peter Maydell
  2015-03-05 15:26   ` Shlomo Pongratz
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Maydell @ 2015-03-05 12:01 UTC (permalink / raw)
  To: Shlomo Pongratz; +Cc: qemu-devel

On 5 March 2015 at 17:54, Shlomo Pongratz <shlomo.pongratz@huawei.com> wrote:
> Hi,
>
> I'm trying to implement GICv3 (actually GIC-500) in order to support more than 8 cores for ARM64.

Fully emulated, or just using the kernel's GICv3 support under KVM?
I assume the former, given that you're talking about TCG below.

> Up to 24 cores I didn't notice any significant problems (just a slow boot), but with 32 or 64 cores the Linux kernel usually gets stuck and only seldom completes the boot.
> When examining the registers I see that all cores except one are stuck at either "PC=ffffffc000108a18" or "PC=ffffffc000108a18", which according to objdump is in:
> kernel/stop_machine.c::multi_cpu_stop
>
>         msdata->state = newstate;
> ffffffc000108a10:       b9002261        str     w1, [x19,#32]
> ffffffc000108a14:       2a1403e1        mov     w1, w20
>                         default:
>                                 break;
>                         }
>                         ack_state(msdata);
>                 }
>         } while (curstate != MULTI_STOP_EXIT);
> ffffffc000108a18:       7100103f        cmp     w1, #0x4
> ffffffc000108a1c:       54000120        b.eq    ffffffc000108a40 <multi_cpu_stop+0xcc>
>
>         /* Simple state machine */
>         do {
>                 /* Chill out and ensure we re-read multi_stop_state. */
>                 cpu_relax();
>                 if (msdata->state != curstate) {
> ffffffc000108a20:       b9402274        ldr     w20, [x19,#32]
> ffffffc000108a24:       6b01029f        cmp     w20, w1
>
> There is one CPU, however (and there is always one such CPU), that is stuck at "PC=ffffffc0002cd9f4", which is in
>
> drivers/irqchip/irq-gic-v3.c::gic_eoi_irq
>
> static void gic_eoi_irq(struct irq_data *d)
> {
>         gic_write_eoir(gic_irq(d));
> ffffffc0002cd9ec:       b9400800        ldr     w0, [x0,#8]
> ffffffc0002cd9f0:       d518cc20        msr     s3_0_c12_c12_1, x0
> ffffffc0002cd9f4:       d5033fdf        isb
> }
> ffffffc0002cd9f8:       d65f03c0        ret
>
> But according to target-arm/translate-a64.c::handle_sync, "isb" is translated as a no-op!

Yes. This is correct in that we don't implement running more
than one guest CPU at a time and we don't implement caches,
and all the behaviour of system instructions is always completed
immediately and synchronously, and so ISB need not do anything.
There is one corner case here which is on my list of "wrong but
not apparently causing a problem" low-priority todo items: ISB
should cause the CPU to take any pending interrupts, but in QEMU
we don't do anything special and so interrupts won't be taken
until the end of the next TB (usually that's about to happen or
happened just before the ISB anyway). In the code fragment
you list here, we will end the QEMU TB on the immediately following
"ret" and so will take interrupts then, so it's not the cause
of the behaviour you're seeing here.
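
A toy model of the point above, assuming that a translation block (TB) ends only at a branch and that pending interrupts are checked only between TBs. Real TBs also end for other reasons, which this sketch ignores, and none of these names exist in QEMU. The isb_ends_tb flag contrasts the behaviour described above (false) with the architecturally expected one (true).

```c
#include <stdbool.h>
#include <stddef.h>

enum insn_kind { INSN_PLAIN, INSN_ISB, INSN_BRANCH };

/* The four instructions of gic_eoi_irq from the listing:
 * ldr, msr, isb, ret. */
static const enum insn_kind gic_eoi_irq_insns[] = {
    INSN_PLAIN, INSN_PLAIN, INSN_ISB, INSN_BRANCH,
};

/* Returns the index of the instruction after which an interrupt that
 * became pending while executing code[raised_at] is taken, or -1 if
 * no TB boundary is reached within the n instructions. */
static int irq_taken_after(const enum insn_kind *code, size_t n,
                           size_t raised_at, bool isb_ends_tb)
{
    for (size_t i = raised_at; i < n; i++) {
        if (code[i] == INSN_BRANCH ||
            (isb_ends_tb && code[i] == INSN_ISB))
            return (int)i;
    }
    return -1;
}
```

With an interrupt raised during the msr (index 1), irq_taken_after(gic_eoi_irq_insns, 4, 1, false) returns 3, the "ret", while passing true returns 2, the "isb". The difference is a single instruction, which is why the no-op ISB is unlikely to explain the hang.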

> BTW, X00=000000000000001b is the virtual timer, IRQ 27, so it
> seems that only this core (number 7 at this point) is getting clock interrupts.
>
> Can anyone give me advice on how to debug this issue further?

Without any code about all I can say is "presumably this is
a bug in your GICv3 implementation" :-)

-- PMM

* Re: [Qemu-devel] GICv3/GIC-500
  2015-03-05 12:01 ` Peter Maydell
@ 2015-03-05 15:26   ` Shlomo Pongratz
  0 siblings, 0 replies; 3+ messages in thread
From: Shlomo Pongratz @ 2015-03-05 15:26 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel

Hi Peter,

Thank you for your response.
You are correct, I'm implementing a fully emulated GIC-500.
I assume that you are correct and that I do have a bug in the implementation. I think it is somehow related to timing, since adding debug printouts makes the system more likely to boot.

I'll prepare an RFC patch and send it at the beginning of next week, and I hope you can spare the time to review it.

Best regards,

S.P.
