From: Martin Habets <habetsm.xilinx@gmail.com>
To: "Íñigo Huguet" <ihuguet@redhat.com>
Cc: netdev@vger.kernel.org, richardcochran@gmail.com,
	yangbo.lu@nxp.com, mlichvar@redhat.com,
	gerhard@engleder-embedded.com, ecree.xilinx@gmail.com,
	davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, alex.maftei@amd.com
Subject: Re: PTP vclock: BUG: scheduling while atomic
Date: Fri, 3 Feb 2023 09:09:16 +0000
Message-ID: <Y9zPPON16NEbzw86@gmail.com>
In-Reply-To: <69d0ff33-bd32-6aa5-d36c-fbdc3c01337c@redhat.com>

On Thu, Feb 02, 2023 at 05:02:07PM +0100, Íñigo Huguet wrote:
> Hello,
> 
> Our QA team was testing PTP vclocks, and they've found this error with sfc NIC/driver:
>   BUG: scheduling while atomic: ptp5/25223/0x00000002
> 
> The reason seems to be that vclocks disable interrupts with `spin_lock_irqsave` in
> `ptp_vclock_gettime`, and then read the timecounter, which in turn ends up calling
> the driver's `gettime64` callback.
> 
> Vclock framework was added in commit 5d43f951b1ac ("ptp: add ptp virtual clock driver
> framework").

Looking at that commit, we'll face the same spinlock issue in
ptp_vclock_adjfine and ptp_vclock_adjtime.
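
To make the failure mode concrete, here is a minimal userspace model in C.
It is not kernel code: every model_* name, the depth counter and the
returned value 1000 are hypothetical stand-ins for the real
spin_lock_irqsave, the driver's gettime64 callback and the scheduler check.
It only shows that a callback which may sleep trips the check as soon as it
is reached with the lock held.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
static int atomic_depth;      /* models the preempt count: > 0 means atomic */
static int sleep_violations;  /* counts sleeps attempted while atomic */

static void model_spin_lock_irqsave(void)
{
    atomic_depth++;
}

static void model_spin_unlock_irqrestore(void)
{
    assert(atomic_depth > 0);
    atomic_depth--;
}

/* Models a blocking wait, e.g. waiting for a firmware response. */
static void model_sleep(const char *who)
{
    if (atomic_depth > 0) {
        sleep_violations++;
        fprintf(stderr, "model: scheduling while atomic in %s\n", who);
    }
}

/* Models a driver gettime64 callback that may sleep. */
static long long model_driver_gettime64(void)
{
    model_sleep("model_driver_gettime64");
    return 1000; /* arbitrary timestamp, for illustration only */
}

/* Models the vclock read path: take the lock, then call into the driver. */
static long long model_vclock_gettime(void)
{
    long long ns;

    model_spin_lock_irqsave();
    ns = model_driver_gettime64();
    model_spin_unlock_irqrestore();
    return ns;
}
```

Calling model_driver_gettime64() directly leaves sleep_violations at 0;
going through model_vclock_gettime() increments it, which mirrors the
restriction the lock places on the existing driver callbacks.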

> At first glance, it seems that the vclock framework is reusing the already existing callbacks
> of the drivers' ptp clocks, but it's imposing a new limitation that didn't exist before:
> now they can't sleep (due to the spin_lock_irqsave). The sfc driver might sleep waiting for the
> fw response.
> 
> The sfc driver can be fixed to avoid this issue, but I wonder if something might not be
> correct in the vclock framework. I don't have enough knowledge about how clock
> synchronization should work regarding this, so I leave it to your consideration.

If the timer hardware is local to the CPU core, a spinlock could work.
But if it is global across CPUs or, as in our case, remote behind a PCI bus,
using a spinlock is too much of a restriction.
I also wonder why the spinlock was used, and whether that limitation can be
relaxed.
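
As a rough illustration of what relaxing the limitation could look like,
here is a userspace sketch using a POSIX mutex, i.e. a lock whose holder is
allowed to block. This is only an assumption about one possible direction,
not a proposed patch: the model_* names are hypothetical, the 1 ms nanosleep
stands in for a slow read from a remote clock, and the sketch does not
address whether any caller of the real vclock code runs in atomic context.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical lock protecting the modeled vclock state. */
static pthread_mutex_t model_vclock_lock = PTHREAD_MUTEX_INITIALIZER;

/* Models a clock read that blocks, e.g. a request over a bus. */
static long long model_slow_gettime64(void)
{
    struct timespec delay = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1 ms */
    struct timespec now;

    nanosleep(&delay, NULL);
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (long long)now.tv_sec * 1000000000LL + now.tv_nsec;
}

/* The holder of a mutex may block, so the slow read is fine here. */
static long long model_vclock_gettime_mutex(void)
{
    long long ns;
    int err;

    err = pthread_mutex_lock(&model_vclock_lock);
    assert(err == 0);
    ns = model_slow_gettime64();
    err = pthread_mutex_unlock(&model_vclock_lock);
    assert(err == 0);
    return ns;
}
```

The trade-off is that a sleepable lock can only be taken from a context
that may itself sleep, so it moves the restriction from the driver callback
to the callers.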

Martin

> These are the logs with stack traces:
>  BUG: scheduling while atomic: ptp5/25223/0x00000002
>  [...skip...]
>  Call Trace:
>   dump_stack_lvl+0x34/0x48
>   __schedule_bug.cold+0x47/0x53
>   __schedule+0x40e/0x580
>   schedule+0x43/0xa0
>   schedule_timeout+0x88/0x160
>   ? __bpf_trace_tick_stop+0x10/0x10
>   _efx_mcdi_rpc_finish+0x2a9/0x480 [sfc]
>   ? efx_mcdi_send_request+0x1d5/0x260 [sfc]
>   ? dequeue_task_stop+0x70/0x70
>   _efx_mcdi_rpc.constprop.0+0xcd/0x3d0 [sfc]
>   ? update_load_avg+0x7e/0x730
>   _efx_mcdi_rpc_evb_retry+0x5d/0x1d0 [sfc]
>   efx_mcdi_rpc+0x10/0x20 [sfc]
>   efx_phc_gettime+0x5f/0xc0 [sfc]
>   ptp_vclock_read+0xa3/0xc0
>   timecounter_read+0x11/0x60
>   ptp_vclock_refresh+0x31/0x60
>   ? ptp_clock_release+0x50/0x50
>   ptp_aux_kworker+0x19/0x40
>   kthread_worker_fn+0xa9/0x250
>   ? kthread_should_park+0x30/0x30
>   kthread+0x146/0x170
>   ? set_kthread_struct+0x50/0x50
>   ret_from_fork+0x1f/0x30
>  BUG: scheduling while atomic: ptp5/25223/0x00000000
>  [...skip...]
>  Call Trace:
>   dump_stack_lvl+0x34/0x48
>   __schedule_bug.cold+0x47/0x53
>   __schedule+0x40e/0x580
>   ? ptp_clock_release+0x50/0x50
>   schedule+0x43/0xa0
>   kthread_worker_fn+0x128/0x250
>   ? kthread_should_park+0x30/0x30
>   kthread+0x146/0x170
>   ? set_kthread_struct+0x50/0x50
>   ret_from_fork+0x1f/0x30
