Linux-Renesas-SoC Archive on lore.kernel.org
 help / color / Atom feed
From: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
To: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>,
	"REE dirk.behme@de.bosch.com" <dirk.behme@de.bosch.com>,
	"linux-renesas-soc@vger.kernel.org" 
	<linux-renesas-soc@vger.kernel.org>
Cc: "Shashikant.Suguni@in.bosch.com" <Shashikant.Suguni@in.bosch.com>
Subject: RE: [PATCH v2] ravb: On timeout disable IRQs to stop processing
Date: Fri, 22 May 2020 05:57:44 +0000
Message-ID: <OSAPR01MB36831809AD6F3D66F1F39670D8B40@OSAPR01MB3683.jpnprd01.prod.outlook.com> (raw)
In-Reply-To: <41c79682-c707-d393-57d8-954586f81ab3@cogentembedded.com>

Hello Sergei-san,

> From: Sergei Shtylyov, Sent: Monday, May 18, 2020 5:45 PM
> 
> On 18.05.2020 7:54, Dirk Behme wrote:
> 
> > Analyzing [1] it seems there is a race condition where ravb_start_xmit()
>  > can be called from interrupt while tx skbuffs are being torn down in
>  > the scheduled timeout handling. The actual timeout work is done in
>  > ravb_tx_timeout_work() during which the tx skbuffs are torn down via
>  > invocations of ravb_ring_free(). But there seems to be no flag to tell
>  > the driver it is shutting down so it continues to use the ring buffer
>  > when it should not.
>  >
>  > Fix this by disabling the interrupts in the timeout handler.
> 
>     Hm, given that we stop all TX queues prior to tearing down the buffers,
> it is somewhat strange that you see the driver's send path called...
>     But disabling the interrupts seems the Right Thing anyways...

I didn't reproduce this issue and I didn't know the root cause yet.
But, I'm feeling to strange. Especially, "ravb_start_xmit() can be called from interrupt".
I didn't find where ravb_start_xmit() can be called from interrupt for now.

By the way, I'm thinking the following message is related to the issue.
> > ravb e6800000.ethernet ethernet: failed to switch device to config mode

The ravb_config() output the message if failed. And, ravb_tx_timeout_work()
calls ravb_config() via ravb_stop_dma() and ravb_dmac_init().
---
ravb_tx_timeout_work() {
	ravb_stop_dma()		// call ravb_config()

	ravb_ring_free()	// *1
<snip>
	ravb_dmac_init()	// call ravb_config()
<snip>
}

ravb_dmac_init()
{
<snip>
	error = ravb_config()	// "failed to switch device to config modes" here and -ETIMEDOUT
	if (error)
		return error	// *2
	ravb_ring_init()	// *3
<snip>
}
--

If ravb_config() failed at the *2, since ravb_ring_init() was not called,
any descriptors were not allocated and then the issue should happen.
Note that according to the datasheet, the controller cannot change the
mode from "Operation" to Configuration" when the controller is doing
TX or RX.

I don't know why the first ravb_config() doesn't fail for now.
But, my scenario is one of functions below enables the TX and RX
by calling ravb_rcv_snd_enable():
 - ravb_emac_interrupt_unlocked() ... if ESCR_LCHNG && !no_avb_link && PSR_LMON
 - ravb_adjust_link() ... if no_avb_link && phydev->link
 - ravb_set_rx_csum() ... if enabling NETIF_F_RXCSUM

ravb_tx_timeout_work() calls ravb_stop_dma(). And, ravb_stop_dma() calls
ravb_rcv_snd_disable(). So, we assumed ravb_tx_timeout_work() didn't transfer
anymore. However, if one of the above functions was called, this is possible
to enable TX and RX.

If this scenario is true, I'm thinking we can fix the issue by using
spin_{un}lock_irq{restore,save}() between ravb_stop_dma() to ravb_dmac_init().
# ravb_ptp_init() calls spin_lock API so that we should not spin lock ravb_ptp_{init,stop}().

What do you think?

I'll try to reproduce this issue on my environment again...
# I'm thinking ravb_adjust_link() is possible to cause this issue.
# But, Salvator-X doesn't have renesas,no-ether-link so that I'll try to add it.

Best regards,
Yoshihiro Shimoda


  reply index

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-18  4:54 Dirk Behme
2020-05-18  5:12 ` Shashikant Suguni (RBEI/ECF1)
2020-05-18  8:36 ` Sergei Shtylyov
2020-05-18  8:44 ` Sergei Shtylyov
2020-05-22  5:57   ` Yoshihiro Shimoda [this message]
2020-05-22 11:16     ` Yoshihiro Shimoda
2020-05-22 11:58       ` Yoshihiro Shimoda
2020-05-26  9:51         ` Yoshihiro Shimoda

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=OSAPR01MB36831809AD6F3D66F1F39670D8B40@OSAPR01MB3683.jpnprd01.prod.outlook.com \
    --to=yoshihiro.shimoda.uh@renesas.com \
    --cc=Shashikant.Suguni@in.bosch.com \
    --cc=dirk.behme@de.bosch.com \
    --cc=linux-renesas-soc@vger.kernel.org \
    --cc=sergei.shtylyov@cogentembedded.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Renesas-SoC Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-renesas-soc/0 linux-renesas-soc/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-renesas-soc linux-renesas-soc/ https://lore.kernel.org/linux-renesas-soc \
		linux-renesas-soc@vger.kernel.org
	public-inbox-index linux-renesas-soc

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-renesas-soc


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git