From: Harini Katakam <harinik@xilinx.com>
To: anssi.hannula@bitwise.fi
Cc: Nicolas Ferre <nicolas.ferre@microchip.com>,
	David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org
Subject: Re: [PATCH 2/3] net: macb: fix dropped RX frames due to a race
Date: Mon, 3 Dec 2018 16:06:38 +0530
Message-ID: <CAFcVECLhZAcQFxB7FxJyXYfyNdGZ3oJf0Sei8DFige5YSU1DWw@mail.gmail.com>
In-Reply-To: <f15d8b59-7e89-225b-9c52-52a61713cc05@bitwise.fi>

Hi Anssi,
On Mon, Dec 3, 2018 at 4:02 PM Anssi Hannula <anssi.hannula@bitwise.fi> wrote:
>
> Hi,
>
> On 3.12.2018 6:52, Harini Katakam wrote:
> > Hi Anssi,
> > On Fri, Nov 30, 2018 at 11:53 PM Anssi Hannula <anssi.hannula@bitwise.fi> wrote:
> >> Bit RX_USED set to 0 in the address field allows the controller to write
> >> data to the receive buffer descriptor.
> >>
> >> The driver does not ensure the ctrl field is ready (cleared) when the
> >> controller sees the RX_USED=0 written by the driver. The ctrl field might
> >> only be cleared after the controller has already updated it according to
> >> a newly received frame, causing the frame to be discarded in gem_rx() due
> >> to unexpected ctrl field contents.
> >>
> >> A message is logged when the above scenario occurs:
> >>
> >>   macb ff0b0000.ethernet eth0: not whole frame pointed by descriptor
> >>
> >> Fix the issue by ensuring that when the controller sees RX_USED=0 the
> >> ctrl field is already cleared.
> >>
> >> This issue was observed on a ZynqMP based system.
> >>
> > Thanks for the patch.
> > Could you please describe the test in which this behavior was observed?
>
> Sure. The testcase I used for the patches is:
>
> - RT_FULL kernel,
> - CPU-bound SCHED_FIFO RT priority 15 process (with
> rcutree.kthread_prio=20 to avoid RCU starvation),
> - Pyropus memtester running for 3GB (system has 4GB memory),
> - "ping -f -l 5000 -s 100" running from a PC.
>
> The "not whole frame pointed by descriptor" issue occurs within minutes
> and the RX memory corruption within an hour. I did not try to reduce the
> testcase to a minimum.
>
> Both were also observed using real production loads (that of course do
> not have CPU-bound RT tasks).
>
> > Were you able to confirm that this was because of the ctrl field being
> > cleared late? This error can also be observed under stress when an
> > RX UBR (used bit read) condition occurs.
>
> I observed that the issue occurred without this patch, and didn't occur
> after applying this patch (individually), but I didn't check it further
> than that. If you have anything you'd like me to test, let me know.

Thanks for the details.

Regards,
Harini
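
The ordering requirement in the quoted patch description (the ctrl field
must already be cleared when the controller sees RX_USED=0) can be
sketched in plain C. This is a userspace illustration only, not the
actual patch: struct rx_desc, RX_USED_BIT, rx_desc_release() and
rx_desc_owned_by_hw() are hypothetical names, the bit position is an
assumption, and a C11 release store stands in for the DMA write barrier
(such as dma_wmb()) that a kernel driver would use.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed for illustration: the "used" flag lives in bit 0 of addr. */
#define RX_USED_BIT 0x1u

/* Hypothetical two-word RX buffer descriptor shared with the controller. */
struct rx_desc {
    _Atomic uint32_t addr;  /* buffer address plus the RX_USED flag */
    _Atomic uint32_t ctrl;  /* frame status, written by the controller */
};

/* Nonzero when the controller is allowed to write to this descriptor. */
static int rx_desc_owned_by_hw(struct rx_desc *desc)
{
    uint32_t addr = atomic_load_explicit(&desc->addr, memory_order_acquire);
    return (addr & RX_USED_BIT) == 0;
}

/*
 * Hand a descriptor back to the controller.
 * buf_addr is assumed to be aligned so that bit 0 is free for the flag.
 */
static void rx_desc_release(struct rx_desc *desc, uint32_t buf_addr)
{
    /* Step 1: clear ctrl while software still owns the descriptor. */
    atomic_store_explicit(&desc->ctrl, 0, memory_order_relaxed);

    /*
     * Step 2: publish RX_USED=0 with release ordering, so the cleared
     * ctrl is visible before the ownership change. Doing the two stores
     * in the opposite order opens the window described above, where a
     * late ctrl clear overwrites the status of a newly received frame.
     */
    atomic_store_explicit(&desc->addr, buf_addr & ~RX_USED_BIT,
                          memory_order_release);
}
```

A receive path would call rx_desc_release() once it has finished with a
buffer, and treat a descriptor for which rx_desc_owned_by_hw() returns
nonzero as not yet filled.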
