linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stephen Boyd <swboyd@chromium.org>
To: Doug Anderson <dianders@chromium.org>
Cc: Roja Rani Yarubandi <rojay@codeaurora.org>,
	Mark Brown <broonie@kernel.org>, Andy Gross <agross@kernel.org>,
	Bjorn Andersson <bjorn.andersson@linaro.org>,
	linux-arm-msm <linux-arm-msm@vger.kernel.org>,
	linux-spi <linux-spi@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Akash Asthana <akashast@codeaurora.org>,
	msavaliy@qti.qualcomm.com
Subject: Re: [PATCH] spi: spi-geni-qcom: Fix NULL pointer access in geni_spi_isr
Date: Thu, 10 Dec 2020 17:21:16 -0800	[thread overview]
Message-ID: <160764967649.1580929.3992720095789306793@swboyd.mtv.corp.google.com> (raw)
In-Reply-To: <CAD=FV=VD78fmSRciFf38AbZG=EFPzDiT_e7QkEC08zA9iL1vTw@mail.gmail.com>

Quoting Doug Anderson (2020-12-10 17:04:06)
> On Thu, Dec 10, 2020 at 4:51 PM Stephen Boyd <swboyd@chromium.org> wrote:
> >
> > I'm worried about the buffer disappearing if spi core calls handle_err()
> > but the geni_spi_isr() handler runs both an rx and a cancel/abort
> > routine. That doesn't seem to be the case though so it looks all fine.
> 
> It would be pretty racy if that was the case.  Until it calls
> handle_timeout() we're still free to write to that buffer, right?

Right I don't see any badness here.

> 
> 
> > > If we want to try to do better, we can do timeout handling ourselves.
> > > The SPI core allows for that.
> > >
> > >
> > > > So why don't we check for cur_xfer being NULL in the rx/tx handling
> > > > paths too and bail out there? Does the FIFO need to be cleared out in
> > > > such a situation that spi core thinks a timeout happened but there's RX
> > > > data according to m_irq? Do we need to read it all and throw it away? Or
> > > > does the abort/cancel clear out the RX fifo?
> > >
> > > I don't know for sure, but IMO it's safest to read anything that's in
> > > the FIFO.  It's also important to adjust the watermark in the TX case.
> > > The suggestions I provided in my original reply (#2 and #3) handle
> > > this and are plenty simple.
> > >
> > > As per my original reply, though, anything we do in the ISR doesn't
> > > replace the changes we need to make to handle_fifo_timeout().  It is
> > > very important that when handle_fifo_timeout() finishes that no future
> > > interrupts for old transfers will fire.
> > >
> >
> > Alright. With a proper diagram in the commit text I think doing all of
> > the points, 1 through 3, would be good and required to leave the
> > hardware in a sane state for all the scenarios. Why do we need to call
> > synchronize_irq() at the start and end of handle_fifo_timeout() though?
> > Presumably having it at the start would make sure the long delayed irq
> > runs and handles any rx/tx by throwing it away. Sounds great, but having
> > it at the end leaves me confused. We want to make sure the cancel really
> > went through?  Don't we know that because the completion variable for
> > cancel succeeded?
> 
> I want it to handle the case where the "abort" command timed out!  :-)
>  If the "abort" command timed out then it's still pending and we could
> get an interrupt for it at some future point in time.

Sure but who cares? We set a completion variable if abort comes in
later. We'll put a message in the log indicating that it "Failed" but
otherwise handle_fifo_timeout() can't return an error so we have to give
up eventually.

> 
> 
> > I guess I'm not convinced that the hardware is so bad that it cancels
> > and aborts the sequencer, raises an irq for that, and then raises an irq
> > for the earlier rx/tx that the sequencer canceled out. Is that
> > happening? It's called a sequencer because presumably it runs a sequence
> > of operations like tx, rx, cs change, cancel, and abort. Hopefully it
> > doesn't run them out of order. If they run at the same time it's fine,
> > the irq handler will see all of them and throw away reads, etc.
> 
> Maybe answered by me explaining that I'm worried about the case where
> "abort" times out (and thus the "done" from the abort is still
> pending).
> 
> NOTE: I will also assert that if we send the "abort" then it seems
> like it has a high likelihood of timing out.  Why do I say that?  In
> order to even get to sending the "abort", it means:
> 
> a) The original transfer timed out
> 
> b) The "cancel" timed out.  As you can see, if the "cancel" doesn't
> time out we don't even send the "abort"
> 
> ...so two things timed out, one of which we _just_ sent.  The "abort"
> feels like a last ditch effort.  Might as well try it, but things were
> in pretty sorry shape to start with by the time we tried it.
> 

Yeah and so if it comes way later because it timed out then what's the
point of calling synchronize_irq() again? To make the completion
variable set when it won't be tested again until it is reinitialized?

  reply	other threads:[~2020-12-11  1:25 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-03  7:44 [PATCH] spi: spi-geni-qcom: Fix NULL pointer access in geni_spi_isr Roja Rani Yarubandi
2020-12-03 16:40 ` Doug Anderson
2020-12-10  3:17   ` Stephen Boyd
2020-12-10 17:14     ` Doug Anderson
2020-12-10 22:57       ` Stephen Boyd
2020-12-10 23:07         ` Doug Anderson
2020-12-10 23:32           ` Stephen Boyd
2020-12-10 23:50             ` Doug Anderson
2020-12-11  0:50               ` Stephen Boyd
2020-12-11  1:04                 ` Doug Anderson
2020-12-11  1:21                   ` Stephen Boyd [this message]
2020-12-11  1:30                     ` Doug Anderson
2020-12-11  1:39                       ` Stephen Boyd
2020-12-11  1:51                         ` Doug Anderson
2020-12-12  1:32                           ` Stephen Boyd
2020-12-15  0:31                             ` Doug Anderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=160764967649.1580929.3992720095789306793@swboyd.mtv.corp.google.com \
    --to=swboyd@chromium.org \
    --cc=agross@kernel.org \
    --cc=akashast@codeaurora.org \
    --cc=bjorn.andersson@linaro.org \
    --cc=broonie@kernel.org \
    --cc=dianders@chromium.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-spi@vger.kernel.org \
    --cc=msavaliy@qti.qualcomm.com \
    --cc=rojay@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).