linux-renesas-soc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
To: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>,
	"REE dirk.behme@de.bosch.com" <dirk.behme@de.bosch.com>,
	"linux-renesas-soc@vger.kernel.org" 
	<linux-renesas-soc@vger.kernel.org>
Cc: "Shashikant.Suguni@in.bosch.com" <Shashikant.Suguni@in.bosch.com>
Subject: RE: [PATCH v2] ravb: On timeout disable IRQs to stop processing
Date: Fri, 22 May 2020 05:57:44 +0000	[thread overview]
Message-ID: <OSAPR01MB36831809AD6F3D66F1F39670D8B40@OSAPR01MB3683.jpnprd01.prod.outlook.com> (raw)
In-Reply-To: <41c79682-c707-d393-57d8-954586f81ab3@cogentembedded.com>

Hello Sergei-san,

> From: Sergei Shtylyov, Sent: Monday, May 18, 2020 5:45 PM
> 
> On 18.05.2020 7:54, Dirk Behme wrote:
> 
> > Analyzing [1] it seems there is a race condition where ravb_start_xmit()
>  > can be called from interrupt while tx skbuffs are being torn down in
>  > the scheduled timeout handling. The actual timeout work is done in
>  > ravb_tx_timeout_work() during which the tx skbuffs are torn down via
>  > invocations of ravb_ring_free(). But there seems to be no flag to tell
>  > the driver it is shutting down so it continues to use the ring buffer
>  > when it should not.
>  >
>  > Fix this by disabling the interrupts in the timeout handler.
> 
>     Hm, given that we stop all TX queues prior to tearing down the buffers,
> it is somewhat strange that you see the driver's send path called...
>     But disabling the interrupts seems the Right Thing anyways...

I didn't reproduce this issue and I didn't know the root cause yet.
But, I'm feeling to strange. Especially, "ravb_start_xmit() can be called from interrupt".
I didn't find where ravb_start_xmit() can be called from interrupt for now.

By the way, I'm thinking the following message is related to the issue.
> > ravb e6800000.ethernet ethernet: failed to switch device to config mode

The ravb_config() output the message if failed. And, ravb_tx_timeout_work()
calls ravb_config() via ravb_stop_dma() and ravb_dmac_init().
---
ravb_tx_timeout_work() {
	ravb_stop_dma()		// call ravb_config()

	ravb_ring_free()	// *1
<snip>
	ravb_dmac_init()	// call ravb_config()
<snip>
}

ravb_dmac_init()
{
<snip>
	error = ravb_config()	// "failed to switch device to config modes" here and -ETIMEDOUT
	if (error)
		return error	// *2
	ravb_ring_init()	// *3
<snip>
}
--

If ravb_config() failed at the *2, since ravb_ring_init() was not called,
any descriptors were not allocated and then the issue should happen.
Note that according to the datasheet, the controller cannot change the
mode from "Operation" to Configuration" when the controller is doing
TX or RX.

I don't know why the first ravb_config() doesn't fail for now.
But, my scenario is one of functions below enables the TX and RX
by calling ravb_rcv_snd_enable():
 - ravb_emac_interrupt_unlocked() ... if ESCR_LCHNG && !no_avb_link && PSR_LMON
 - ravb_adjust_link() ... if no_avb_link && phydev->link
 - ravb_set_rx_csum() ... if enabling NETIF_F_RXCSUM

ravb_tx_timeout_work() calls ravb_stop_dma(). And, ravb_stop_dma() calls
ravb_rcv_snd_disable(). So, we assumed ravb_tx_timeout_work() didn't transfer
anymore. However, if one of the above functions was called, this is possible
to enable TX and RX.

If this scenario is true, I'm thinking we can fix the issue by using
spin_{un}lock_irq{restore,save}() between ravb_stop_dma() to ravb_dmac_init().
# ravb_ptp_init() calls spin_lock API so that we should not spin lock ravb_ptp_{init,stop}().

What do you think?

I'll try to reproduce this issue on my environment again...
# I'm thinking ravb_adjust_link() is possible to cause this issue.
# But, Salvator-X doesn't have renesas,no-ether-link so that I'll try to add it.

Best regards,
Yoshihiro Shimoda


  reply	other threads:[~2020-05-22  5:57 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-18  4:54 [PATCH v2] ravb: On timeout disable IRQs to stop processing Dirk Behme
2020-05-18  5:12 ` Shashikant Suguni (RBEI/ECF1)
2020-05-18  8:36 ` Sergei Shtylyov
2020-05-28  8:12   ` Behme Dirk (CM/ESO2)
2020-05-18  8:44 ` Sergei Shtylyov
2020-05-22  5:57   ` Yoshihiro Shimoda [this message]
2020-05-22 11:16     ` Yoshihiro Shimoda
2020-05-22 11:58       ` Yoshihiro Shimoda
2020-05-26  9:51         ` Yoshihiro Shimoda

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=OSAPR01MB36831809AD6F3D66F1F39670D8B40@OSAPR01MB3683.jpnprd01.prod.outlook.com \
    --to=yoshihiro.shimoda.uh@renesas.com \
    --cc=Shashikant.Suguni@in.bosch.com \
    --cc=dirk.behme@de.bosch.com \
    --cc=linux-renesas-soc@vger.kernel.org \
    --cc=sergei.shtylyov@cogentembedded.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).