From: "Heiko Stübner" <heiko@sntech.de>
To: Christoph Hellwig <hch@lst.de>
Cc: palmer@dabbelt.com, paul.walmsley@sifive.com,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
	wefu@redhat.com, guoren@kernel.org, cmuellner@linux.com,
	philipp.tomsich@vrull.eu, hch@lst.de, samuel@sholland.org,
	atishp@atishpatra.org, anup@brainfault.org, mick@ics.forth.gr,
	robh+dt@kernel.org, krzk+dt@kernel.org,
	devicetree@vger.kernel.org, drew@beagleboard.org,
	Atish Patra <atish.patra@wdc.com>
Subject: Re: [PATCH 2/3] riscv: Implement Zicbom-based cache management operations
Date: Wed, 15 Jun 2022 18:56:40 +0200
Message-ID: <110361853.nniJfEyVGO@diego>
In-Reply-To: <20220610055608.GA24221@lst.de>

Hi Christoph,

On Friday, 10 June 2022, 07:56:08 CEST, Christoph Hellwig wrote:
> On Fri, Jun 10, 2022 at 02:43:07AM +0200, Heiko Stuebner wrote:
> > +config RISCV_ISA_ZICBOM
> > +	bool "Zicbom extension support for non-coherent dma operation"
> > +	select ARCH_HAS_DMA_PREP_COHERENT
> > +	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
> > +	select ARCH_HAS_SYNC_DMA_FOR_CPU
> > +	select ARCH_HAS_SETUP_DMA_OPS
> > +	select DMA_DIRECT_REMAP
> > +	select RISCV_ALTERNATIVE
> > +	default y
> > +	help
> > +	   Adds support to dynamically detect the presence of the ZICBOM extension
> 
> Overly long line here.

fixed

> 
> > +	   (Cache Block Management Operations) and enable its usage.
> > +
> > +	   If you don't know what to do here, say Y.
> 
> But more importantly I think the whole text here is not very helpful.
> What users care about is non-coherent DMA support.  What extension is
> used for that is rather secondary.

I guess it might make sense to split that in some way.
I.e. Zicbom provides one implementation for handling non-coherence,
the D1 uses different (but very similar) instructions, while the SoC on the
Beagle-V does something completely different.

So I guess it could make sense to have a general DMA_NONCOHERENT option
which gets selected by the relevant users.

This would also address the issue that Zicbom needs a very new binutils,
which Beagle-V support, if it happens, wouldn't need.


> Also please capitalize DMA.

fixed

> > +void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, enum dma_data_direction dir)
> > +{
> > +	switch (dir) {
> > +	case DMA_TO_DEVICE:
> > +		ALT_CMO_OP(CLEAN, (unsigned long)phys_to_virt(paddr), size, riscv_cbom_block_size);
> > +		break;
> > +	case DMA_FROM_DEVICE:
> > +		ALT_CMO_OP(INVAL, (unsigned long)phys_to_virt(paddr), size, riscv_cbom_block_size);
> > +		break;
> > +	case DMA_BIDIRECTIONAL:
> > +		ALT_CMO_OP(FLUSH, (unsigned long)phys_to_virt(paddr), size, riscv_cbom_block_size);
> > +		break;
> > +	default:
> > +		break;
> > +	}
> 
> Please avoid all these crazy long lines, and use a logical variable
> for the virtual address.  And why do you pass that virtual address
> as an unsigned long to ALT_CMO_OP?  You're going to make your life
> much easier if you simply always pass a pointer.

fixed all of those.
And of course you're right, not having the cast when calling ALT_CMO_OP
definitely makes things look a lot nicer.
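
A rough userspace sketch of what the reworked function could look like
with a local variable for the virtual address and a pointer passed to
ALT_CMO_OP. This is not the code from the next revision: the ALT_CMO_OP
mock, the last_* recorder variables, the identity phys_to_virt() and the
block size of 64 are all assumptions made so the shape can be compiled
and checked outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for kernel types; values mirror the kernel enum. */
typedef uintptr_t phys_addr_t;

enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
	DMA_NONE = 3,
};

/* Hypothetical recorder, only here so the sketch can be checked. */
enum cmo_op { CMO_NONE, CMO_CLEAN, CMO_INVAL, CMO_FLUSH };
enum cmo_op last_op = CMO_NONE;
void *last_addr;
size_t last_size;

unsigned int riscv_cbom_block_size = 64; /* assumed block size */

/* Mock ALT_CMO_OP: takes a pointer and just records the request. */
#define ALT_CMO_OP(_op, _start, _size, _cachesize)	\
	do {						\
		assert((_cachesize) != 0);		\
		last_op = CMO_##_op;			\
		last_addr = (_start);			\
		last_size = (_size);			\
	} while (0)

/* Mock phys_to_virt: identity mapping, unlike the kernel's linear map. */
static void *phys_to_virt(phys_addr_t paddr)
{
	return (void *)paddr;
}

void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
			      enum dma_data_direction dir)
{
	/* One local variable instead of repeating the conversion per case. */
	void *vaddr = phys_to_virt(paddr);

	switch (dir) {
	case DMA_TO_DEVICE:
		ALT_CMO_OP(CLEAN, vaddr, size, riscv_cbom_block_size);
		break;
	case DMA_FROM_DEVICE:
		ALT_CMO_OP(INVAL, vaddr, size, riscv_cbom_block_size);
		break;
	case DMA_BIDIRECTIONAL:
		ALT_CMO_OP(FLUSH, vaddr, size, riscv_cbom_block_size);
		break;
	default:
		break;
	}
}
```

With the conversion hoisted into vaddr, each case fits comfortably within
80 columns and no cast is needed at the call sites.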

> Last but not least, does in RISC-V clean mean writeback and flush mean
> writeback plus invalidate?  If so the code is correct, but the choice
> of names in the RISC-V spec is extremely unfortunate.

clean: 
    makes data [...] visible to a set of non-coherent agents [...] by
    performing a write transfer of a copy of a cache block [...]

flush:
    performs a clean followed by an invalidate

So that's a yes to your question.
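
To make the three terms concrete, here is a toy model of a single cache
line in front of a single memory word. It only illustrates the semantics
quoted above (clean = writeback, inval = drop, flush = clean + inval);
struct toy_line and all function names are made up for this sketch and
have nothing to do with the actual instructions or kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: one cache line in front of one memory word. */
struct toy_line {
	int mem;	/* what a non-coherent device sees */
	int cached;	/* what the CPU sees while the line is valid */
	bool valid;
	bool dirty;
};

/* CPU store: lands in the cache only and marks the line dirty. */
void cpu_write(struct toy_line *l, int v)
{
	assert(l != NULL);
	l->cached = v;
	l->valid = true;
	l->dirty = true;
}

/* CPU load: refills from memory only if the line is not valid. */
int cpu_read(struct toy_line *l)
{
	assert(l != NULL);
	if (!l->valid) {
		l->cached = l->mem;
		l->valid = true;
		l->dirty = false;
	}
	return l->cached;
}

/* Device (DMA) store: goes straight to memory, bypassing the cache. */
void dev_write(struct toy_line *l, int v)
{
	l->mem = v;
}

/* clean: write a dirty copy back to memory, keep the line valid. */
void cmo_clean(struct toy_line *l)
{
	if (l->valid && l->dirty) {
		l->mem = l->cached;
		l->dirty = false;
	}
}

/* inval: drop the line without writing it back. */
void cmo_inval(struct toy_line *l)
{
	l->valid = false;
	l->dirty = false;
}

/* flush: a clean followed by an invalidate. */
void cmo_flush(struct toy_line *l)
{
	cmo_clean(l);
	cmo_inval(l);
}
```

In this model a device only sees a CPU store after cmo_clean() or
cmo_flush(), and the CPU only sees a device store after cmo_inval() or
cmo_flush(), which matches the DMA_TO_DEVICE / DMA_FROM_DEVICE /
DMA_BIDIRECTIONAL mapping in the patch.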

> > +void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, enum dma_data_direction dir)
> > +{
> > +	switch (dir) {
> > +	case DMA_TO_DEVICE:
> > +		break;
> > +	case DMA_FROM_DEVICE:
> > +	case DMA_BIDIRECTIONAL:
> > +		ALT_CMO_OP(INVAL, (unsigned long)phys_to_virt(paddr), size, riscv_cbom_block_size);
> > +		break;
> > +	default:
> > +		break;
> > +	}
> > +}
> 
> Same comment here and in few other places.

fixed

> > +
> > +void arch_dma_prep_coherent(struct page *page, size_t size)
> > +{
> > +	void *flush_addr = page_address(page);
> > +
> > +	memset(flush_addr, 0, size);
> > +	ALT_CMO_OP(FLUSH, (unsigned long)flush_addr, size, riscv_cbom_block_size);
> > +}
> 
> arch_dma_prep_coherent should never zero the memory, that is left
> for the upper layers.

fixed

> > +void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> > +		const struct iommu_ops *iommu, bool coherent)
> > +{
> > +	/* If a specific device is dma-coherent, set it here */
> 
> This comment isn't all that useful.

ok, I've dropped it

> > +	dev->dma_coherent = coherent;
> > +}
> 
> But more importantly, this assumes that once this code is built all
> devices are non-coherent by default.  I.e. with this patch applied
> and the config option enabled we'll now suddenly start doing cache
> management operations on setups that didn't do it before.

If I'm reading things correctly [0], the default in the coherent case
is for those functions to be defined but empty.

When you look at the definition of ALT_CMO_OP

#define ALT_CMO_OP(_op, _start, _size, _cachesize)                      \
asm volatile(ALTERNATIVE_2(                                             \
        __nops(6),                                                      \

you'll see that its default variant is to do nothing; any
non-coherency voodoo is only patched in if the Zicbom extension
(or T-Head errata) is detected at runtime.

So in the coherent case (with the memset removed as you suggested),
the arch_sync_dma_* and arch_dma_prep_coherent functions end up as
something like

void arch_dma_prep_coherent(struct page *page, size_t size)
{
        void *flush_addr = page_address(page);

        nops(6);
}

which is very similar to the defaults [0] I guess, or am I
overlooking something?
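
As a loose analogy for the "nops by default, patched in at runtime"
behaviour, the following userspace sketch uses a function pointer in
place of real code patching. Everything in it (alt_cmo_op,
apply_cmo_alternative, the counter, the fixed block size of 64) is
hypothetical; the kernel's ALTERNATIVE mechanism rewrites instructions
in place rather than going through a pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BLOCK_SIZE 64	/* assumed cache block size */

/* Counts cache block operations that were actually issued. */
unsigned long cmo_ops_issued;

/* Default variant: the moral equivalent of the nops. */
static void cmo_nop(void *start, size_t size)
{
	(void)start;
	(void)size;
}

/* Patched-in variant: pretend to issue one op per cache block. */
static void cmo_real(void *start, size_t size)
{
	assert(start != NULL);
	/* Toy model: ignores start alignment, rounds the size up. */
	cmo_ops_issued += (size + TOY_BLOCK_SIZE - 1) / TOY_BLOCK_SIZE;
}

/* Stands in for the patch site; starts out doing nothing. */
static void (*alt_cmo_op)(void *start, size_t size) = cmo_nop;

/* Stands in for boot-time patching once the extension is detected. */
void apply_cmo_alternative(bool has_zicbom)
{
	if (has_zicbom)
		alt_cmo_op = cmo_real;
}

/* Shape of arch_dma_prep_coherent with the memset removed. */
void arch_dma_prep_coherent_sketch(void *addr, size_t size)
{
	alt_cmo_op(addr, size);
}
```

Until apply_cmo_alternative(true) runs, calling
arch_dma_prep_coherent_sketch() has no effect, which is the property the
argument above relies on.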

Thanks for taking the time for that review
Heiko


[0] https://elixir.bootlin.com/linux/latest/source/include/linux/dma-map-ops.h#L293





