All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marek Szyprowski <m.szyprowski@samsung.com>
To: Vincent Whitchurch <vincent.whitchurch@axis.com>,
	broonie@kernel.org, krzysztof.kozlowski@linaro.org,
	andi@etezian.org, Christoph Hellwig <hch@lst.de>,
	Robin Murphy <robin.murphy@arm.com>
Cc: kernel@axis.com, alim.akhtar@samsung.com,
	linux-spi@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-samsung-soc@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>
Subject: Re: [PATCH v2 2/4] spi: Fix cache corruption due to DMA/PIO overlap
Date: Fri, 30 Sep 2022 13:20:06 +0200	[thread overview]
Message-ID: <a4be6670-832a-ffac-4d68-e4a079eb2eed@samsung.com> (raw)
In-Reply-To: <20220927112117.77599-3-vincent.whitchurch@axis.com>

Hi,

CCed: Christoph and Robin, as the issue is partially dma-mapping related.

On 27.09.2022 13:21, Vincent Whitchurch wrote:
> The SPI core DMA mapping support performs cache management once for the
> entire message and not between transfers, and this leads to cache
> corruption if a message has two or more RX transfers with both
> transfers targeting the same cache line, and the controller driver
> decides to handle one using DMA and the other using PIO (for example,
> because one is much larger than the other).
>
> Fix it by syncing before/after the actual transfers.  This also means
> that we can skip the sync during the map/unmap of the message.
>
> Fixes: 99adef310f68 ("spi: Provide core support for DMA mapping transfers")
> Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
> ---

This patch landed in linux next-20220929 as commit 0c17ba73c08f ("spi: 
Fix cache corruption due to DMA/PIO overlap"). Unfortunately it causes 
kernel oops on one of my test systems:

8<--- cut here ---
Unable to handle kernel NULL pointer dereference at virtual address 0000000c
[0000000c] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in: cmac bnep btsdio hci_uart btbcm s5p_mfc btintel 
brcmfmac bluetooth videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 
videobuf2_common videodev cfg80211 mc ecdh_generic ecc brcmutil
CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 
6.0.0-rc7-next-20220929-dirty #12903
Hardware name: Samsung Exynos (Flattened Device Tree)
Workqueue: events ax88796c_work
PC is at dma_direct_sync_sg_for_device+0x24/0xb8
LR is at spi_transfer_one_message+0x4c4/0xabc
pc : [<c01cbcf0>]    lr : [<c0739fcc>]    psr: 20000013
...
Process kworker/0:1 (pid: 12, stack limit = 0xca429928)
Stack: (0xe0071d38 to 0xe0072000)
...
  dma_direct_sync_sg_for_device from spi_transfer_one_message+0x4c4/0xabc
  spi_transfer_one_message from __spi_pump_transfer_message+0x300/0x770
  __spi_pump_transfer_message from __spi_sync+0x304/0x3f4
  __spi_sync from spi_sync+0x28/0x40
  spi_sync from axspi_read_rxq+0x98/0xc8
  axspi_read_rxq from ax88796c_work+0x7a8/0xf6c
  ax88796c_work from process_one_work+0x288/0x774
  process_one_work from worker_thread+0x44/0x504
  worker_thread from kthread+0xf0/0x124
  kthread from ret_from_fork+0x14/0x2c
Exception stack(0xe0071fb0 to 0xe0071ff8)
...
---[ end trace 0000000000000000 ]---

This happens because sg_free_table() doesn't clear table->orig_nents nor 
table->nents. If the given spi xfer object is reused without dma-mapped 
buffer, then a NULL pointer de-reference happens at table->sgl 
spi_dma_sync_for_device()/spi_dma_sync_for_cpu(). A possible fix would 
be to zero table->orig_nents in spi_unmap_buf_attrs(). I will send a 
patch for this soon.

However, I think that clearing table->orig_nents and table->nents should 
be added to __sg_free_table() in lib/scatterlist.c to avoid this kind of 
issue in the future. This however will be a significant change that 
might break code somewhere, if it relies on the nents/orig_nents value 
after calling sg_free_table(). Christoph, Robin - what is your opinion?


>   drivers/spi/spi.c | 109 +++++++++++++++++++++++++++++++++++++---------
>   1 file changed, 88 insertions(+), 21 deletions(-)
>
> ...

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: Marek Szyprowski <m.szyprowski@samsung.com>
To: Vincent Whitchurch <vincent.whitchurch@axis.com>,
	broonie@kernel.org, krzysztof.kozlowski@linaro.org,
	andi@etezian.org, Christoph Hellwig <hch@lst.de>,
	Robin Murphy <robin.murphy@arm.com>
Cc: kernel@axis.com, alim.akhtar@samsung.com,
	linux-spi@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-samsung-soc@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>
Subject: Re: [PATCH v2 2/4] spi: Fix cache corruption due to DMA/PIO overlap
Date: Fri, 30 Sep 2022 13:20:06 +0200	[thread overview]
Message-ID: <a4be6670-832a-ffac-4d68-e4a079eb2eed@samsung.com> (raw)
In-Reply-To: <20220927112117.77599-3-vincent.whitchurch@axis.com>

Hi,

CCed: Christoph and Robin, as the issue is partially dma-mapping related.

On 27.09.2022 13:21, Vincent Whitchurch wrote:
> The SPI core DMA mapping support performs cache management once for the
> entire message and not between transfers, and this leads to cache
> corruption if a message has two or more RX transfers with both
> transfers targeting the same cache line, and the controller driver
> decides to handle one using DMA and the other using PIO (for example,
> because one is much larger than the other).
>
> Fix it by syncing before/after the actual transfers.  This also means
> that we can skip the sync during the map/unmap of the message.
>
> Fixes: 99adef310f68 ("spi: Provide core support for DMA mapping transfers")
> Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
> ---

This patch landed in linux next-20220929 as commit 0c17ba73c08f ("spi: 
Fix cache corruption due to DMA/PIO overlap"). Unfortunately it causes 
kernel oops on one of my test systems:

8<--- cut here ---
Unable to handle kernel NULL pointer dereference at virtual address 0000000c
[0000000c] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in: cmac bnep btsdio hci_uart btbcm s5p_mfc btintel 
brcmfmac bluetooth videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 
videobuf2_common videodev cfg80211 mc ecdh_generic ecc brcmutil
CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 
6.0.0-rc7-next-20220929-dirty #12903
Hardware name: Samsung Exynos (Flattened Device Tree)
Workqueue: events ax88796c_work
PC is at dma_direct_sync_sg_for_device+0x24/0xb8
LR is at spi_transfer_one_message+0x4c4/0xabc
pc : [<c01cbcf0>]    lr : [<c0739fcc>]    psr: 20000013
...
Process kworker/0:1 (pid: 12, stack limit = 0xca429928)
Stack: (0xe0071d38 to 0xe0072000)
...
  dma_direct_sync_sg_for_device from spi_transfer_one_message+0x4c4/0xabc
  spi_transfer_one_message from __spi_pump_transfer_message+0x300/0x770
  __spi_pump_transfer_message from __spi_sync+0x304/0x3f4
  __spi_sync from spi_sync+0x28/0x40
  spi_sync from axspi_read_rxq+0x98/0xc8
  axspi_read_rxq from ax88796c_work+0x7a8/0xf6c
  ax88796c_work from process_one_work+0x288/0x774
  process_one_work from worker_thread+0x44/0x504
  worker_thread from kthread+0xf0/0x124
  kthread from ret_from_fork+0x14/0x2c
Exception stack(0xe0071fb0 to 0xe0071ff8)
...
---[ end trace 0000000000000000 ]---

This happens because sg_free_table() doesn't clear table->orig_nents nor 
table->nents. If the given spi xfer object is reused without dma-mapped 
buffer, then a NULL pointer de-reference happens at table->sgl 
spi_dma_sync_for_device()/spi_dma_sync_for_cpu(). A possible fix would 
be to zero table->orig_nents in spi_unmap_buf_attrs(). I will send a 
patch for this soon.

However, I think that clearing table->orig_nents and table->nents should 
be added to __sg_free_table() in lib/scatterlist.c to avoid this kind of 
issue in the future. This however will be a significant change that 
might break code somewhere, if it relies on the nents/orig_nents value 
after calling sg_free_table(). Christoph, Robin - what is your opinion?


>   drivers/spi/spi.c | 109 +++++++++++++++++++++++++++++++++++++---------
>   1 file changed, 88 insertions(+), 21 deletions(-)
>
> ...

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


  reply	other threads:[~2022-09-30 11:21 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-27 11:21 [PATCH v2 0/4] spi: Fix DMA bugs in (not only) spi-s3c64xx Vincent Whitchurch
2022-09-27 11:21 ` Vincent Whitchurch
2022-09-27 11:21 ` [PATCH v2 1/4] spi: Save current RX and TX DMA devices Vincent Whitchurch
2022-09-27 11:21   ` Vincent Whitchurch
2022-09-27 11:21 ` [PATCH v2 2/4] spi: Fix cache corruption due to DMA/PIO overlap Vincent Whitchurch
2022-09-27 11:21   ` Vincent Whitchurch
2022-09-30 11:20   ` Marek Szyprowski [this message]
2022-09-30 11:20     ` Marek Szyprowski
2022-09-30 12:10     ` Robin Murphy
2022-09-30 12:10       ` Robin Murphy
2022-10-03 11:29       ` Vincent Whitchurch
2022-10-03 11:29         ` Vincent Whitchurch
2022-09-27 11:21 ` [PATCH v2 3/4] spi: Split transfers larger than max size Vincent Whitchurch
2022-09-27 11:21   ` Vincent Whitchurch
2023-06-22 19:48   ` Eddie James
2023-06-22 19:48     ` Eddie James
2023-06-22 21:16     ` Mark Brown
2023-06-22 21:16       ` Mark Brown
2023-06-23 16:45       ` Eddie James
2023-06-23 16:45         ` Eddie James
2023-06-23 17:16         ` Mark Brown
2023-06-23 17:16           ` Mark Brown
2022-09-27 11:21 ` [PATCH v2 4/4] spi: s3c64xx: Fix large transfers with DMA Vincent Whitchurch
2022-09-27 11:21   ` Vincent Whitchurch
2022-09-28 17:27 ` [PATCH v2 0/4] spi: Fix DMA bugs in (not only) spi-s3c64xx Mark Brown
2022-09-28 17:27   ` Mark Brown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a4be6670-832a-ffac-4d68-e4a079eb2eed@samsung.com \
    --to=m.szyprowski@samsung.com \
    --cc=alim.akhtar@samsung.com \
    --cc=andi@etezian.org \
    --cc=broonie@kernel.org \
    --cc=hch@lst.de \
    --cc=iommu@lists.linux-foundation.org \
    --cc=kernel@axis.com \
    --cc=krzysztof.kozlowski@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-samsung-soc@vger.kernel.org \
    --cc=linux-spi@vger.kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=vincent.whitchurch@axis.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.