All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stephen Boyd <swboyd@chromium.org>
To: Doug Anderson <dianders@chromium.org>
Cc: Roja Rani Yarubandi <rojay@codeaurora.org>,
	Mark Brown <broonie@kernel.org>, Andy Gross <agross@kernel.org>,
	Bjorn Andersson <bjorn.andersson@linaro.org>,
	linux-arm-msm <linux-arm-msm@vger.kernel.org>,
	linux-spi <linux-spi@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Akash Asthana <akashast@codeaurora.org>,
	msavaliy@qti.qualcomm.com
Subject: Re: [PATCH] spi: spi-geni-qcom: Fix NULL pointer access in geni_spi_isr
Date: Thu, 10 Dec 2020 17:21:16 -0800	[thread overview]
Message-ID: <160764967649.1580929.3992720095789306793@swboyd.mtv.corp.google.com> (raw)
In-Reply-To: <CAD=FV=VD78fmSRciFf38AbZG=EFPzDiT_e7QkEC08zA9iL1vTw@mail.gmail.com>

Quoting Doug Anderson (2020-12-10 17:04:06)
> On Thu, Dec 10, 2020 at 4:51 PM Stephen Boyd <swboyd@chromium.org> wrote:
> >
> > I'm worried about the buffer disappearing if spi core calls handle_err()
> > but the geni_spi_isr() handler runs both an rx and a cancel/abort
> > routine. That doesn't seem to be the case though so it looks all fine.
> 
> It would be pretty racy if that was the case.  Until it calls
> handle_timeout() we're still free to write to that buffer, right?

Right I don't see any badness here.

> 
> 
> > > If we want to try to do better, we can do timeout handling ourselves.
> > > The SPI core allows for that.
> > >
> > >
> > > > So why don't we check for cur_xfer being NULL in the rx/tx handling
> > > > paths too and bail out there? Does the FIFO need to be cleared out in
> > > > such a situation that spi core thinks a timeout happened but there's RX
> > > > data according to m_irq? Do we need to read it all and throw it away? Or
> > > > does the abort/cancel clear out the RX fifo?
> > >
> > > I don't know for sure, but IMO it's safest to read anything that's in
> > > the FIFO.  It's also important to adjust the watermark in the TX case.
> > > The suggestions I provided in my original reply (#2 and #3) handle
> > > this and are plenty simple.
> > >
> > > As per my original reply, though, anything we do in the ISR doesn't
> > > replace the changes we need to make to handle_fifo_timeout().  It is
> > > very important that when handle_fifo_timeout() finishes that no future
> > > interrupts for old transfers will fire.
> > >
> >
> > Alright. With a proper diagram in the commit text I think doing all of
> > the points, 1 through 3, would be good and required to leave the
> > hardware in a sane state for all the scenarios. Why do we need to call
> > synchronize_irq() at the start and end of handle_fifo_timeout() though?
> > Presumably having it at the start would make sure the long delayed irq
> > runs and handles any rx/tx by throwing it away. Sounds great, but having
> > it at the end leaves me confused. We want to make sure the cancel really
> > went through?  Don't we know that because the completion variable for
> > cancel succeeded?
> 
> I want it to handle the case where the "abort" command timed out!  :-)
>  If the "abort" command timed out then it's still pending and we could
> get an interrupt for it at some future point in time.

Sure but who cares? We set a completion variable if abort comes in
later. We'll put a message in the log indicating that it "Failed" but
otherwise handle_fifo_timeout() can't return an error so we have to give
up eventually.

> 
> 
> > I guess I'm not convinced that the hardware is so bad that it cancels
> > and aborts the sequencer, raises an irq for that, and then raises an irq
> > for the earlier rx/tx that the sequencer canceled out. Is that
> > happening? It's called a sequencer because presumably it runs a sequence
> > of operations like tx, rx, cs change, cancel, and abort. Hopefully it
> > doesn't run them out of order. If they run at the same time it's fine,
> > the irq handler will see all of them and throw away reads, etc.
> 
> Maybe answered by me explaining that I'm worried about the case where
> "abort" times out (and thus the "done" from the abort is still
> pending).
> 
> NOTE: I will also assert that if we send the "abort" then it seems
> like it has a high likelihood of timing out.  Why do I say that?  In
> order to even get to sending the "abort", it means:
> 
> a) The original transfer timed out
> 
> b) The "cancel" timed out.  As you can see, if the "cancel" doesn't
> time out we don't even send the "abort"
> 
> ...so two things timed out, one of which we _just_ sent.  The "abort"
> feels like a last ditch effort.  Might as well try it, but things were
> in pretty sorry shape to start with by the time we tried it.
> 

Yeah and so if it comes way later because it timed out then what's the
point of calling synchronize_irq() again? To make the completion
variable set when it won't be tested again until it is reinitialized?

  reply	other threads:[~2020-12-11  1:23 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-03  7:44 [PATCH] spi: spi-geni-qcom: Fix NULL pointer access in geni_spi_isr Roja Rani Yarubandi
2020-12-03 16:40 ` Doug Anderson
2020-12-10  3:17   ` Stephen Boyd
2020-12-10 17:14     ` Doug Anderson
2020-12-10 22:57       ` Stephen Boyd
2020-12-10 23:07         ` Doug Anderson
2020-12-10 23:32           ` Stephen Boyd
2020-12-10 23:50             ` Doug Anderson
2020-12-11  0:50               ` Stephen Boyd
2020-12-11  1:04                 ` Doug Anderson
2020-12-11  1:21                   ` Stephen Boyd [this message]
2020-12-11  1:30                     ` Doug Anderson
2020-12-11  1:39                       ` Stephen Boyd
2020-12-11  1:51                         ` Doug Anderson
2020-12-12  1:32                           ` Stephen Boyd
2020-12-15  0:31                             ` Doug Anderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=160764967649.1580929.3992720095789306793@swboyd.mtv.corp.google.com \
    --to=swboyd@chromium.org \
    --cc=agross@kernel.org \
    --cc=akashast@codeaurora.org \
    --cc=bjorn.andersson@linaro.org \
    --cc=broonie@kernel.org \
    --cc=dianders@chromium.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-spi@vger.kernel.org \
    --cc=msavaliy@qti.qualcomm.com \
    --cc=rojay@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.