From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: Vivek Gautam <vivek.gautam@codeaurora.org>
Cc: Jordan Crouse <jcrouse@codeaurora.org>,
	pdaly@codeaurora.org,
	linux-arm-msm <linux-arm-msm@vger.kernel.org>,
	Joerg Roedel <joro@8bytes.org>, Will Deacon <will.deacon@arm.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Tomasz Figa <tfiga@chromium.org>,
	iommu@lists.linux-foundation.org,
	Robin Murphy <robin.murphy@arm.com>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
	pratikp@codeaurora.org
Subject: Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache
Date: Mon, 21 Jan 2019 11:50:53 +0100
Message-ID: <CAKv+Gu9z_mGwdZYMKPfM_g2MZwrCF5=f4WAdn_R6wJ1A9xSZ_Q@mail.gmail.com>
In-Reply-To: <CAFp+6iEp-bzMrZz8cFUciTFKm7TwAoLdYpsSTD73kDfCRh60bA@mail.gmail.com>

On Mon, 21 Jan 2019 at 11:17, Vivek Gautam <vivek.gautam@codeaurora.org> wrote:
>
> Hi,
>
>
> On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel
> <ard.biesheuvel@linaro.org> wrote:
> >
> > On Mon, 21 Jan 2019 at 06:54, Vivek Gautam <vivek.gautam@codeaurora.org> wrote:
> > >
> > > Qualcomm SoCs have an additional level of cache called the
> > > System cache, aka Last level cache (LLC). This cache sits right
> > > before the DDR, and is tightly coupled with the memory controller.
> > > The clients using this cache request their slices from this
> > > system cache, make them active, and can then start using them.
> > > For these clients with smmu, to start using the system cache for
> > > buffers and related page tables [1], memory attributes need to be
> > > set accordingly. This series adds the required support.
> > >
> >
> > Does this actually improve performance on reads from a device? The
> > non-cache coherent DMA routines perform an unconditional D-cache
> > invalidate by VA to the PoC before reading from the buffers filled by
> > the device, and I would expect the PoC to be defined as lying beyond
> > the LLC to still guarantee the architected behavior.
>
> We have seen performance improvements when running Manhattan
> GFXBench benchmarks.
>

Ah ok, that makes sense, since in that case, the data flow is mostly
to the device, not from the device.

> As for the PoC, to my knowledge, on sdm845 the system cache, aka
> Last level cache (LLC), lies beyond the point of coherency.
> Non-cache-coherent buffers will not be cached in the system cache either, and
> no additional software cache maintenance ops are required for the system cache.
> Pratik can add more if I am missing something.
>
> To take care of the memory attributes from DMA APIs side, we can add a
> DMA_ATTR definition to take care of any dma non-coherent APIs calls.
>

So does the device use the correct inner non-cacheable, outer
writeback cacheable attributes if the SMMU is in pass-through?
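
For reference, a minimal sketch of how that attribute pair would look if
it were expressed as an Armv8-A MAIR attribute byte for Normal memory
(outer policy in bits [7:4], inner policy in bits [3:0]). The macro and
helper names below are illustrative assumptions, not identifiers taken
from this series, and the sketch says nothing about what the hardware
does in pass-through.

```c
#include <stdint.h>

/* Armv8-A MAIR attribute nibbles for Normal memory (names are hypothetical). */
#define MAIR_NIBBLE_NC      0x4u /* 0b0100: Non-cacheable */
#define MAIR_NIBBLE_WB_RWA  0xFu /* 0b1111: Write-Back, non-transient, read+write allocate */

/* Build one 8-bit MAIR attribute: outer in bits [7:4], inner in bits [3:0]. */
static inline uint8_t mair_attr(uint8_t outer, uint8_t inner)
{
	return (uint8_t)(((outer & 0xFu) << 4) | (inner & 0xFu));
}

/* Place an attribute byte into slot idx (0..7) of a 64-bit MAIR value. */
static inline uint64_t mair_set(uint64_t mair, unsigned int idx, uint8_t attr)
{
	unsigned int shift = (idx & 7u) * 8u;

	mair &= ~((uint64_t)0xFFu << shift);   /* clear the old slot contents */
	return mair | ((uint64_t)attr << shift);
}
```

With these helpers, mair_attr(MAIR_NIBBLE_WB_RWA, MAIR_NIBBLE_NC) gives the
inner non-cacheable, outer writeback cacheable encoding; which MAIR slot a
page table format would use for it is left open here.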

We have been looking into another use case where the fact that the
SMMU overrides memory attributes is causing issues (WC mappings used
by the radeon and amdgpu driver). So if the SMMU would honour the
existing attributes, would you still need the SMMU changes?
