* Re: irq_fpu_usable() is irreliable
       [not found] <CAHmME9rgXh3zQDfc2Yo_Au0CSg--X+ak=SQdS9DoXNsKK0TPmA@mail.gmail.com>
@ 2015-11-17 14:06 ` Thomas Gleixner
  2015-11-17 14:51   ` Jason A. Donenfeld
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2015-11-17 14:06 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: mingo, hpa, LKML

Jason,

On Tue, 17 Nov 2015, Jason A. Donenfeld wrote:
> The availability of the FPU in kernel space, as you know, is determined by
> this function:
> 
> bool irq_fpu_usable(void)
> {
>         return !in_interrupt() ||
>                 interrupted_user_mode() ||
>                 interrupted_kernel_fpu_idle();
> }
> 
> My understanding is that the first check is !in_interrupt(), because if
> `current` is valid - if we are in process context - then we have a place to
> store the existing FPU regs in kernel_fpu_begin, to be restored later in
> kernel_fpu_end. Recently I've been tracking down a problem in
> which irq_fpu_usable() returns false, yet a stack trace shows the first
> function is the syscall entry point. This leads me to believe that
> in_interrupt() is not an adequate way of testing for a valid `current`.

This function has absolutely nothing to do with current. current is
always valid. The function checks whether we can use the fpu safely in
kernel context.

> In my particular problematic case, the reason in_interrupt() was
> returning true is that a number of rcu_read_lock_bh()s were
> being held; IOW this is occurring in the ndo_start_xmit path of a
> network driver.
>
> I therefore propose changing the function to this:
> 
> bool irq_fpu_usable(void)
> {
>         return (!in_irq() && !in_nmi()) ||
>                 interrupted_user_mode() ||
>                 interrupted_kernel_fpu_idle();
> }
> 
> What would you think of that?

That's broken. Assume we interrupted a kernel thread which fiddles
with the FPU and then on irq exit we run a softirq which tries to use
the FPU....

The real question in your case is WHY interrupted_kernel_fpu_idle()
returns false. We know for sure that in a syscall with BH disabled the
first two checks are false.

Thanks,

	tglx
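
For reference, the difference between the two first terms can be modeled
outside the kernel. This is a minimal userspace sketch, not kernel code:
the bit layout is assumed to follow the preempt_count layout of kernels
from this period, and every model_* name is hypothetical. It shows that a
syscall path with BH disabled already fails !in_interrupt(), and that the
proposed (!in_irq() && !in_nmi()) term would also accept a softirq run on
irq exit, which is the case described above as broken.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed preempt_count layout: softirq bits 8-15, hardirq 16-19, NMI bit 20. */
#define SOFTIRQ_SHIFT           8
#define HARDIRQ_SHIFT           16
#define NMI_SHIFT               20

#define SOFTIRQ_OFFSET          (1u << SOFTIRQ_SHIFT)   /* serving a softirq */
#define SOFTIRQ_DISABLE_OFFSET  (2u * SOFTIRQ_OFFSET)   /* one local_bh_disable() */
#define HARDIRQ_OFFSET          (1u << HARDIRQ_SHIFT)
#define NMI_OFFSET              (1u << NMI_SHIFT)

#define SOFTIRQ_MASK            (0xffu << SOFTIRQ_SHIFT)
#define HARDIRQ_MASK            (0xfu << HARDIRQ_SHIFT)
#define NMI_MASK                (0x1u << NMI_SHIFT)

/* Model of in_interrupt(): any softirq, hardirq or NMI bit is set. */
static inline bool model_in_interrupt(unsigned int pc)
{
        return (pc & (SOFTIRQ_MASK | HARDIRQ_MASK | NMI_MASK)) != 0;
}

/* Model of in_irq(): only the hardirq bits count. */
static inline bool model_in_irq(unsigned int pc)
{
        return (pc & HARDIRQ_MASK) != 0;
}

/* Model of in_nmi(). */
static inline bool model_in_nmi(unsigned int pc)
{
        return (pc & NMI_MASK) != 0;
}

/* First term of the existing check. */
static inline bool model_current_first_term(unsigned int pc)
{
        return !model_in_interrupt(pc);
}

/* First term of the proposed check. */
static inline bool model_proposed_first_term(unsigned int pc)
{
        return !model_in_irq(pc) && !model_in_nmi(pc);
}

/* Hypothetical walk through the cases discussed in the thread. */
static inline void model_demo(void)
{
        /* Plain process context: both variants allow the FPU. */
        assert(model_current_first_term(0));
        assert(model_proposed_first_term(0));

        /* Syscall path with BH disabled: only the proposal says yes. */
        assert(!model_current_first_term(SOFTIRQ_DISABLE_OFFSET));
        assert(model_proposed_first_term(SOFTIRQ_DISABLE_OFFSET));

        /* Softirq run on irq exit: the proposal still says yes. */
        assert(model_proposed_first_term(SOFTIRQ_OFFSET));
}
```

A softirq that runs on irq exit no longer has the hardirq bits set, so the
proposed term returns true and short-circuits the remaining two checks,
even if the interrupted kernel code was in the middle of using the FPU.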


* Re: irq_fpu_usable() is irreliable
  2015-11-17 14:06 ` irq_fpu_usable() is irreliable Thomas Gleixner
@ 2015-11-17 14:51   ` Jason A. Donenfeld
  2015-11-17 19:54     ` Jason A. Donenfeld
  0 siblings, 1 reply; 9+ messages in thread
From: Jason A. Donenfeld @ 2015-11-17 14:51 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: mingo, hpa, LKML

Hi Thomas,

On Tue, Nov 17, 2015 at 3:06 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> The real question in your case is WHY interrupted_kernel_fpu_idle()
> returns false. We know for sure that in a syscall with BH disabled the
> first two checks are false.

Blurg, indeed. I've been trying to track that down in a different
thread with the netdev folks. After not getting anywhere, I sort of
felt, "dammit! can't this not be the issue, and can't I just get rid
of that in_interrupt() condition?" But, as you've explained above, no,
we can't get rid of that. So yes: the question is why
interrupted_kernel_fpu_idle() is false. Mysteriously it happens to be
the case in UDP mode but not TCP mode (the topic of the other thread),
and so I should resume trying to determine why this is so. I don't
entirely understand the function though:

static bool interrupted_kernel_fpu_idle(void)
{
        if (kernel_fpu_disabled())
                return false;

        if (use_eager_fpu())
                return true;

        return !current->thread.fpu.fpregs_active && (read_cr0() & X86_CR0_TS);
}

From my tests, when irq_fpu_usable() is false, both halves of the expression
`!current->thread.fpu.fpregs_active && (read_cr0() & X86_CR0_TS)` are
false. What, then, is leading to the call of
fpregs_activate()? I can't find anything along the syscall path that
would result in this. I admit I do not have a deep understanding of
how the FPU is implemented in Linux. Is it possible that this means
that userspace is using the FPU? Is this what user_fpu_begin() is all
about?

(If so, why is that state not stored on syscall entry? If the reason
is "because it would be expensive to do it every time", then is there a
way to selectively do that only when it's necessary?)

Or, must this imply that the kernel is actually using it elsewhere,
and I need to just keep digging diligently?

Thanks,
Jason
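
The quoted function reduces to a small truth table. The sketch that
follows is a userspace model only: struct fpu_model and its field names
are hypothetical stand-ins for kernel_fpu_disabled(), use_eager_fpu(),
current->thread.fpu.fpregs_active and the CR0.TS bit. It restates what the
function computes and does not identify which input was responsible in the
case discussed here.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the four inputs the quoted function reads. */
struct fpu_model {
        bool kernel_fpu_disabled;   /* kernel_fpu_disabled() returned true */
        bool eager_fpu;             /* use_eager_fpu() returned true */
        bool fpregs_active;         /* current->thread.fpu.fpregs_active */
        bool cr0_ts;                /* read_cr0() & X86_CR0_TS is non-zero */
};

/* Same decision order as the quoted interrupted_kernel_fpu_idle(). */
static inline bool model_interrupted_kernel_fpu_idle(const struct fpu_model *m)
{
        if (m->kernel_fpu_disabled)
                return false;

        if (m->eager_fpu)
                return true;

        /* Lazy case: idle only if no live task state and TS is set. */
        return !m->fpregs_active && m->cr0_ts;
}
```

With both halves of the final expression false, as reported in the tests
above, the model has fpregs_active set and CR0.TS clear, and the result is
false unless eager mode is in use.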


* Re: irq_fpu_usable() is irreliable
  2015-11-17 14:51   ` Jason A. Donenfeld
@ 2015-11-17 19:54     ` Jason A. Donenfeld
  0 siblings, 0 replies; 9+ messages in thread
From: Jason A. Donenfeld @ 2015-11-17 19:54 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: mingo, hpa, LKML

Trying to get to the bottom of this still...

Is interrupted_kernel_fpu_idle() in any way dependent on what
userspace is doing? Or is it entirely related to other happenings
inside the kernel?


* Re: irq_fpu_usable() is irreliable
  2015-11-18 12:16   ` Jason A. Donenfeld
  2015-11-18 19:59     ` Jason A. Donenfeld
@ 2015-11-27  8:47     ` Ingo Molnar
  1 sibling, 0 replies; 9+ messages in thread
From: Ingo Molnar @ 2015-11-27  8:47 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: Thomas Gleixner, mingo, hpa, LKML


* Jason A. Donenfeld <Jason@zx2c4.com> wrote:

> Intel 3820QM, but inside VMWare Workstation 12.
> 
> > Third, could you post such a problematic stack trace?
> 
> Sure: https://paste.kde.org/pfhhdchs9/7mmtvb

So it's:

    [  187.194226] CPU: 0 PID: 1165 Comm: iperf3 Tainted: G           O    4.2.3-1-ARCH #1
    [  187.194229] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
    [  187.194231]  0000000000000000 0000000062ca03ad ffff88003b82f0d0 ffffffff8156c0ca
    [  187.194233]  ffff88003bfa0dc0 0000000000000090 ffff88003b82f260 ffffffffa03fc27e
    [  187.194234]  0000000000000010 ffff88003be05300 0000000000000000 ffff88003b82f3e0
    [  187.194235] Call Trace:
    [  187.194244]  [<ffffffff8156c0ca>] dump_stack+0x4c/0x6e
    [  187.194248]  [<ffffffffa03fc27e>] chacha20_avx+0x23e/0x250 [wireguard]
    [  187.194253]  [<ffffffff8101de03>] ? nommu_map_page+0x43/0x80
    [  187.194257]  [<ffffffffa0344161>] ? e1000_xmit_frame+0xdf1/0x11c0 [e1000]
    [  187.194259]  [<ffffffffa03fbe6e>] ? poly1305_update_asm+0x11e/0x1b0 [wireguard]
    [  187.194260]  [<ffffffffa03fcd0d>] chacha20_finish+0x3d/0x60 [wireguard]
    [  187.194262]  [<ffffffffa03f8eae>] chacha20poly1305_encrypt_finish+0x2e/0xf0 [wireguard]
    [  187.194263]  [<ffffffffa03efa32>] noise_message_encrypt+0x162/0x180 [wireguard]
    [  187.194269]  [<ffffffff811b60e5>] ? __kmalloc_node_track_caller+0x35/0x2e0
    [  187.194274]  [<ffffffff81460af7>] ? __alloc_skb+0x87/0x210
    [  187.194275]  [<ffffffff81460a11>] ? __kmalloc_reserve.isra.5+0x31/0x90
    [  187.194276]  [<ffffffff81460acb>] ? __alloc_skb+0x5b/0x210
    [  187.194278]  [<ffffffff81460b0b>] ? __alloc_skb+0x9b/0x210
    [  187.194279]  [<ffffffffa03f2a65>] noise_message_create_data+0x55/0x80 [wireguard]
    [  187.194280]  [<ffffffffa03e9708>] packet_send_queue+0x1f8/0x4d0 [wireguard]
    [  187.194285]  [<ffffffff810a8219>] ? dequeue_entity+0x149/0x690
    [  187.194287]  [<ffffffff810a9051>] ? put_prev_entity+0x31/0x420
    [  187.194289]  [<ffffffff810146ec>] ? __switch_to+0x25c/0x4a0
    [  187.194291]  [<ffffffff81099ce2>] ? finish_task_switch+0x62/0x1b0
    [  187.194292]  [<ffffffff8156d500>] ? __schedule+0x340/0xa00
    [  187.194296]  [<ffffffff810ddf19>] ? hrtimer_try_to_cancel+0x29/0x120
    [  187.194298]  [<ffffffff810b4464>] ? add_wait_queue+0x44/0x50
    [  187.194299]  [<ffffffff811b60e5>] ? __kmalloc_node_track_caller+0x35/0x2e0
    [  187.194302]  [<ffffffff811e33ce>] ? __pollwait+0x7e/0xe0
    [  187.194303]  [<ffffffff81460af7>] ? __alloc_skb+0x87/0x210
    [  187.194304]  [<ffffffff81460a11>] ? __kmalloc_reserve.isra.5+0x31/0x90
    [  187.194305]  [<ffffffffa03e861f>] xmit+0x8f/0xe0 [wireguard]
    [  187.194308]  [<ffffffff8147588f>] dev_hard_start_xmit+0x24f/0x3f0
    [  187.194309]  [<ffffffff814753be>] ? validate_xmit_skb.isra.34.part.35+0x1e/0x2a0
    [  187.194310]  [<ffffffff81476042>] __dev_queue_xmit+0x4d2/0x540
    [  187.194311]  [<ffffffff814760c3>] dev_queue_xmit_sk+0x13/0x20
    [  187.194313]  [<ffffffff8147d9c2>] neigh_direct_output+0x12/0x20
    [  187.194315]  [<ffffffff814b1756>] ip_finish_output2+0x1b6/0x3c0
    [  187.194317]  [<ffffffff814b309e>] ? __ip_append_data.isra.3+0x6ae/0xac0
    [  187.194317]  [<ffffffff814b376c>] ip_finish_output+0x13c/0x1d0
    [  187.194318]  [<ffffffff814b3b75>] ip_output+0x75/0xe0
    [  187.194319]  [<ffffffff814b468d>] ? ip_make_skb+0x10d/0x130
    [  187.194320]  [<ffffffff814b1381>] ip_local_out_sk+0x31/0x40
    [  187.194321]  [<ffffffff814b44ea>] ip_send_skb+0x1a/0x50
    [  187.194323]  [<ffffffff814dc221>] udp_send_skb+0x151/0x280
    [  187.194325]  [<ffffffff814dd7f5>] udp_sendmsg+0x305/0x9d0
    [  187.194327]  [<ffffffff8157115e>] ? _raw_spin_unlock_bh+0xe/0x10
    [  187.194328]  [<ffffffff814e8daf>] inet_sendmsg+0x7f/0xb0
    [  187.194329]  [<ffffffff81457227>] sock_sendmsg+0x17/0x30
    [  187.194330]  [<ffffffff814572c5>] sock_write_iter+0x85/0xf0
    [  187.194332]  [<ffffffff811d028c>] __vfs_write+0xcc/0x100
    [  187.194333]  [<ffffffff811d0b04>] vfs_write+0xa4/0x1a0
    [  187.194334]  [<ffffffff811d1815>] SyS_write+0x55/0xc0
    [  187.194335]  [<ffffffff8157162e>] entry_SYSCALL_64_fastpath+0x12/0x71

so this does not seem to be a very complex stack trace: we are trying to use the 
FPU from a regular process, from a regular system call path. No interrupts, no 
kernel threads, no complications.

We possibly context switched recently:

    [  187.194285]  [<ffffffff810a8219>] ? dequeue_entity+0x149/0x690
    [  187.194287]  [<ffffffff810a9051>] ? put_prev_entity+0x31/0x420
    [  187.194289]  [<ffffffff810146ec>] ? __switch_to+0x25c/0x4a0
    [  187.194291]  [<ffffffff81099ce2>] ? finish_task_switch+0x62/0x1b0
    [  187.194292]  [<ffffffff8156d500>] ? __schedule+0x340/0xa00

but that's all that I can see in the trace.

So as a first step I'd try Linus's very latest kernel, to make sure it's not a bug 
that got fixed meanwhile. If it still occurs, try to report it to the vmware 
virtualization folks. Maybe it's some host kernel activity that changes the state 
of the FPU. I don't know ...

Thanks,

	Ingo


* Re: irq_fpu_usable() is irreliable
  2015-11-18 19:59     ` Jason A. Donenfeld
@ 2015-11-27  8:41       ` Ingo Molnar
  0 siblings, 0 replies; 9+ messages in thread
From: Ingo Molnar @ 2015-11-27  8:41 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: Thomas Gleixner, mingo, hpa, LKML


* Jason A. Donenfeld <Jason@zx2c4.com> wrote:

> Hi Ingo,
> 
> The plot thickens, once again.
> 
> > > Also, what CPU does the test system have, Intel or AMD? The FPU behavior can be
> > > very different in the two cases.
> > Intel 3820QM, but inside VMWare Workstation 12.
> 
> Trying this on bare metal, the problem goes away! Though the VM is 4.2
> and my bare metal is 4.3. Perhaps your recent changes did something.
> 
> But more likely, there's some funny business happening with VMWare.
> Any speculation about this? Fascinating issue...

I have no idea unfortunately :-( You might want to take it up with the vmware 
guys.

Thanks,

	Ingo


* Re: irq_fpu_usable() is irreliable
  2015-11-18 12:16   ` Jason A. Donenfeld
@ 2015-11-18 19:59     ` Jason A. Donenfeld
  2015-11-27  8:41       ` Ingo Molnar
  2015-11-27  8:47     ` Ingo Molnar
  1 sibling, 1 reply; 9+ messages in thread
From: Jason A. Donenfeld @ 2015-11-18 19:59 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Thomas Gleixner, mingo, hpa, LKML

Hi Ingo,

The plot thickens, once again.

> > Also, what CPU does the test system have, Intel or AMD? The FPU behavior can be
> > very different in the two cases.
> Intel 3820QM, but inside VMWare Workstation 12.

Trying this on bare metal, the problem goes away! Though the VM is 4.2
and my bare metal is 4.3. Perhaps your recent changes did something.

But more likely, there's some funny business happening with VMWare.
Any speculation about this? Fascinating issue...

Jason


* Re: irq_fpu_usable() is irreliable
  2015-11-18  6:55 ` Ingo Molnar
@ 2015-11-18 12:16   ` Jason A. Donenfeld
  2015-11-18 19:59     ` Jason A. Donenfeld
  2015-11-27  8:47     ` Ingo Molnar
  0 siblings, 2 replies; 9+ messages in thread
From: Jason A. Donenfeld @ 2015-11-18 12:16 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Thomas Gleixner, mingo, hpa, LKML

Hi Ingo,

Thanks for looking into this.

On Wed, Nov 18, 2015 at 7:55 AM, Ingo Molnar <mingo@kernel.org> wrote:
> Is this 'problem' a performance problem (code not being able to use the FPU
> occasionally and hence sporadically performing poorly), or some sort of actual
> stability/correctness problem?

More of the performance variety; I want the FPU but sometimes don't
have it, that coy mistress. This happens in the ndo_start_xmit() path
of a network driver, which executes with a non-zero softirq_count().
This means that in_interrupt() will be true and
interrupted_user_mode() will be false (confirmed by my tests and by
Thomas' assertions). What isn't clear is why
interrupted_kernel_fpu_idle() is false. In a strange twist of fate,
interrupted_kernel_fpu_idle() is true and thus irq_fpu_usable() is
true when sending TCP packets, but interrupted_kernel_fpu_idle() is
false and thus irq_fpu_usable() is false when sending UDP packets. I
haven't found anything along the UDP path that might result in the FPU
being used, leaving me a bit flummoxed.

So, my inquiries have led in two directions:
1. Why would interrupted_kernel_fpu_idle() be false here? And does
interrupted_kernel_fpu_idle() depend on what userspace is doing? Or is
it entirely limited to behavior inside the kernel?
2. Most of the time ndo_start_xmit() is reached via a syscall
(sys_write or similar). I know there's a softirq_count() for all sorts
of reasons involving the networking stack, but pretty please - can't
there be some way for irq_fpu_usable() to always be true when the
entry point is a syscall?

> Also, what CPU does the test system have, Intel or AMD? The FPU behavior can be
> very different in the two cases.

Intel 3820QM, but inside VMWare Workstation 12.

> Third, could you post such a problematic stack trace?

Sure: https://paste.kde.org/pfhhdchs9/7mmtvb


Regards,
Jason
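
The performance effect described above comes from the usual caller-side
pattern: use the SIMD routine only when irq_fpu_usable() allows it, and
otherwise fall back to a generic implementation. A minimal userspace
sketch of that pattern follows. The stub_* functions and the xor_*
routines are hypothetical stand-ins (the real kernel_fpu_begin() and
kernel_fpu_end() save and restore register state, which the stubs do not),
and this is not the WireGuard code from the trace.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel FPU API. */
static bool stub_fpu_usable = true;   /* what irq_fpu_usable() would return */
static int stub_fpu_depth;            /* open begin/end sections */

static inline bool stub_irq_fpu_usable(void)  { return stub_fpu_usable; }
static inline void stub_kernel_fpu_begin(void) { stub_fpu_depth++; }
static inline void stub_kernel_fpu_end(void)   { stub_fpu_depth--; }

/* Generic path: safe in any context. */
static inline void xor_generic(uint8_t *dst, const uint8_t *src, size_t len)
{
        for (size_t i = 0; i < len; i++)
                dst[i] ^= src[i];
}

/* Placeholder for a vectorized routine; same result as the generic path. */
static inline void xor_simd_stub(uint8_t *dst, const uint8_t *src, size_t len)
{
        xor_generic(dst, src, len);
}

enum xor_path { XOR_PATH_GENERIC, XOR_PATH_SIMD };

/* Pick the SIMD path only when the FPU may be used in this context. */
static inline enum xor_path xor_dispatch(uint8_t *dst, const uint8_t *src,
                                         size_t len)
{
        if (!stub_irq_fpu_usable()) {
                xor_generic(dst, src, len);
                return XOR_PATH_GENERIC;
        }

        stub_kernel_fpu_begin();
        xor_simd_stub(dst, src, len);
        stub_kernel_fpu_end();
        return XOR_PATH_SIMD;
}
```

Both paths produce the same output, so a false irq_fpu_usable() costs
throughput rather than correctness, which matches the "performance
variety" of problem described in this message.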


* Re: irq_fpu_usable() is irreliable
  2015-11-17 11:39 Jason A. Donenfeld
@ 2015-11-18  6:55 ` Ingo Molnar
  2015-11-18 12:16   ` Jason A. Donenfeld
  0 siblings, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2015-11-18  6:55 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: tglx, mingo, hpa, LKML


* Jason A. Donenfeld <Jason@zx2c4.com> wrote:

> [...] Recently I've been tracking down a problem in which irq_fpu_usable() 
> returns false, yet a stack trace shows the first function is the syscall entry 
> point. [...]

Is this 'problem' a performance problem (code not being able to use the FPU 
occasionally and hence sporadically performing poorly), or some sort of actual 
stability/correctness problem?

Also, what CPU does the test system have, Intel or AMD? The FPU behavior can be 
very different in the two cases.

Third, could you post such a problematic stack trace?

Thanks,

	Ingo


* irq_fpu_usable() is irreliable
@ 2015-11-17 11:39 Jason A. Donenfeld
  2015-11-18  6:55 ` Ingo Molnar
  0 siblings, 1 reply; 9+ messages in thread
From: Jason A. Donenfeld @ 2015-11-17 11:39 UTC (permalink / raw)
  To: tglx, mingo, hpa; +Cc: LKML

Hi folks,

The availability of the FPU in kernel space, as you know, is
determined by this function:

bool irq_fpu_usable(void)
{
        return !in_interrupt() ||
                interrupted_user_mode() ||
                interrupted_kernel_fpu_idle();
}

My understanding is that the first check is !in_interrupt(), because
if `current` is valid - if we are in process context - then we have a
place to store the existing FPU regs in kernel_fpu_begin, to be
restored later in kernel_fpu_end. Recently I've been tracking down a
problem in which irq_fpu_usable() returns false, yet a stack trace
shows the first function is the syscall entry point. This leads me to
believe that in_interrupt() is not an adequate way of testing for a
valid `current`. In my particular problematic case, the reason
in_interrupt() was returning true is that a number of
rcu_read_lock_bh()s were being held; IOW this is occurring in the
ndo_start_xmit path of a network driver.

I therefore propose changing the function to this:

bool irq_fpu_usable(void)
{
        return (!in_irq() && !in_nmi()) ||
                interrupted_user_mode() ||
                interrupted_kernel_fpu_idle();
}

What would you think of that?

Thanks,
Jason

