linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alex Elder <elder@linaro.org>
To: Arnd Bergmann <arnd@arndb.de>
Cc: David Miller <davem@davemloft.net>,
	Bjorn Andersson <bjorn.andersson@linaro.org>,
	Ilias Apalodimas <ilias.apalodimas@linaro.org>,
	syadagir@codeaurora.org, mjavid@codeaurora.org,
	evgreen@chromium.org, Ben Chan <benchan@google.com>,
	Eric Caruso <ejcaruso@google.com>,
	abhishek.esse@gmail.com,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 09/18] soc: qcom: ipa: GSI transactions
Date: Sun, 19 May 2019 12:11:29 -0500	[thread overview]
Message-ID: <dcd648f2-5305-04dd-8997-be87a9961fd9@linaro.org> (raw)
In-Reply-To: <4a34d381-d31d-ea49-d6d3-3c4f632958e3@linaro.org>

On 5/17/19 1:44 PM, Alex Elder wrote:
> On 5/17/19 1:33 PM, Arnd Bergmann wrote:
>> On Fri, May 17, 2019 at 8:08 PM Alex Elder <elder@linaro.org>
>> wrote:
>>> 
>>> On 5/15/19 2:34 AM, Arnd Bergmann wrote:
>>>>> +static void gsi_trans_tre_fill(struct gsi_tre *dest_tre,
>>>>> dma_addr_t addr, +                              u32 len, bool
>>>>> last_tre, bool bei, +                              enum
>>>>> ipa_cmd_opcode opcode) +{ +       struct gsi_tre tre; + +
>>>>> tre.addr = cpu_to_le64(addr); +       tre.len_opcode =
>>>>> gsi_tre_len_opcode(opcode, len); +       tre.reserved = 0; +
>>>>> tre.flags = gsi_tre_flags(last_tre, bei, opcode); + +
>>>>> *dest_tre = tre;        /* Write TRE as a single (16-byte)
>>>>> unit */ +}
>>>> Have you checked that the atomic write is actually what happens
>>>> here, but looking at the compiler output? You might need to add
>>>> a 'volatile' qualifier to the dest_tre argument so the
>>>> temporary structure doesn't get optimized away here.
>>> 
>>> Currently, the assignment *does* become a "stp" instruction. But
>>> I don't know that we can *force* the compiler to write it as a
>>> pair of registers, so I'll soften the comment with "Attempt to
>>> write" or something similar.
>>> 
>>> To my knowledge, adding a volatile qualifier only prevents the 
>>> compiler from performing funny optimizations, but that has no 
>>> effect on whether the 128-bit assignment is made as a single 
>>> unit.  Do you know otherwise?
>> 
>> I don't think it you can force the 128-bit assignment to be atomic,
>> but marking 'dest_tre' should serve to prevent a specific
>> optimization that replaces the function with
>> 
>> dest_tre->addr = ... dest_tre->len_opcode = ... dest_tre->reserved
>> = ... dest_tre->flags = ...
>> 
>> which it might find more efficient than the stp and is equivalent 
>> when the pointer is not marked volatile. We also have the
>> WRITE_ONCE() macro that can help prevent this, but it does not work
>> reliably beyond 64 bit assignments.
> 
> OK, I'll mark it volatile to avoid that potential result.

OK I got interesting results so I wanted to report back.

The way it is currently written (no volatile qualifier) is
the *only* way I get a 16-byte store instruction.

Specifically, with dest_tre having type (struct gsi_tre *):
        *dest_tre = tre;

Produces this:
 4ec: a9002013        stp     x19, x8, [x0]

I attempted to make the assigned-to pointer volatile:
        *(volatile struct gsi_tre *)dest_tre = tre;

But that apparently is interpreted as "assign things
in the destination in exactly the way they were assigned
above into the "tre" structure."  Because it produced this:
 748: f9000348        str     x8, [x26]
 74c: 7940c3e8        ldrh    w8, [sp, #96]
 750: 79001348        strh    w8, [x26, #8]
 754: 7940bbe8        ldrh    w8, [sp, #92]
 758: 79001748        strh    w8, [x26, #10]
 75c: b9405be8        ldr     w8, [sp, #88]
 760: b9000f48        str     w8, [x26, #12]

From there I went back and changed the type of the dest_tre pointer
parameter to (volatile struct gsi_tre *), and changed the the type
of the values passed through that argument in the two callers to
also have the volatile qualifier.  This way there was no need to
use a cast in the left-hand side.  That too produced a string of
separate assignments, not a single 128-bit one.

I also attempted this:
	*dest_tre = *(volatile struct gsi_tre *)&tre;
And even this:
        *(volatile struct gsi_tre *)dest_tre = *(volatile struct gsi_tre *)&tre;
But neither produced a single "stp" instruction; all produced
the same sequence of instructions above.

So it seems that I must *not* apply a volatile qualifier,
because doing so restricts the compiler from making the
single instruction optimization.

If I've missed something and you have another suggestion for
me to try let me know and I'll try it.

					-Alex


> Thanks.
> 
> -Alex
> 
>> 
>> Arnd
>> 
> 


  reply	other threads:[~2019-05-19 17:11 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
2019-05-12  1:24 ` [PATCH 01/18] bitfield.h: add FIELD_MAX() and field_max() Alex Elder
2019-05-12  6:33   ` Kalle Valo
2019-05-12 12:18     ` Alex Elder
2019-05-12 19:30       ` Johannes Berg
2019-05-12  1:24 ` [PATCH 02/18] soc: qcom: create "include/soc/qcom/rmnet.h" Alex Elder
2019-05-12  2:34   ` Joe Perches
2019-05-12 12:15     ` Alex Elder
2019-05-15  6:59   ` Arnd Bergmann
2019-05-15 12:03     ` Alex Elder
2019-05-16  1:09       ` Subash Abhinov Kasiviswanathan
2019-05-17 17:27         ` Alex Elder
2019-05-17 18:08           ` Subash Abhinov Kasiviswanathan
2019-05-19 17:37             ` Alex Elder
2019-05-12  1:24 ` [PATCH 03/18] dt-bindings: soc: qcom: add IPA bindings Alex Elder
2019-05-15  7:03   ` Arnd Bergmann
2019-05-15 12:04     ` Alex Elder
2019-05-15 16:50       ` Rob Herring
2019-05-15 17:05         ` Alex Elder
2019-05-12  1:24 ` [PATCH 04/18] soc: qcom: ipa: main code Alex Elder
2019-05-12  1:24 ` [PATCH 05/18] soc: qcom: ipa: configuration data Alex Elder
2019-05-12  1:24 ` [PATCH 06/18] soc: qcom: ipa: clocking, interrupts, and memory Alex Elder
2019-05-12  1:24 ` [PATCH 07/18] soc: qcom: ipa: GSI headers Alex Elder
2019-05-12  1:24 ` [PATCH 08/18] soc: qcom: ipa: the generic software interface Alex Elder
2019-05-15  7:21   ` Arnd Bergmann
2019-05-15 12:13     ` Alex Elder
2019-05-15 12:40       ` Arnd Bergmann
2019-05-15 10:47   ` Arnd Bergmann
2019-05-15 13:32     ` Alex Elder
2019-05-15 19:37   ` Arnd Bergmann
2019-05-12  1:24 ` [PATCH 09/18] soc: qcom: ipa: GSI transactions Alex Elder
2019-05-15  7:34   ` Arnd Bergmann
2019-05-15 12:25     ` Alex Elder
2019-05-15 20:50       ` Arnd Bergmann
2019-05-17 18:08     ` Alex Elder
2019-05-17 18:33       ` Arnd Bergmann
2019-05-17 18:44         ` Alex Elder
2019-05-19 17:11           ` Alex Elder [this message]
2019-05-20  9:25             ` Arnd Bergmann
2019-05-20 12:50               ` Alex Elder
2019-05-20 14:43                 ` Arnd Bergmann
2019-05-20 14:44                   ` Alex Elder
2019-05-20 16:34                     ` Evan Green
2019-05-20 16:50                       ` Alex Elder
2019-05-20 17:36                         ` Evan Green
2019-05-12  1:25 ` [PATCH 10/18] soc: qcom: ipa: IPA interface to GSI Alex Elder
2019-05-12  1:25 ` [PATCH 11/18] soc: qcom: ipa: IPA endpoints Alex Elder
2019-05-12  1:25 ` [PATCH 12/18] soc: qcom: ipa: immediate commands Alex Elder
2019-05-15  8:16   ` Arnd Bergmann
2019-05-15 12:35     ` Alex Elder
2019-05-18  0:34       ` Alex Elder
2019-05-20 14:50         ` Arnd Bergmann
2019-05-20 14:55           ` Alex Elder
2019-05-20 17:35             ` Christoph Hellwig
2019-05-12  1:25 ` [PATCH 13/18] soc: qcom: ipa: IPA network device and microcontroller Alex Elder
2019-05-15  8:21   ` Arnd Bergmann
2019-05-15 12:46     ` Alex Elder
2019-05-12  1:25 ` [PATCH 14/18] soc: qcom: ipa: AP/modem communications Alex Elder
2019-05-12  1:25 ` [PATCH 15/18] soc: qcom: ipa: support build of IPA code Alex Elder
2019-05-12  1:25 ` [PATCH 16/18] MAINTAINERS: add entry for the Qualcomm IPA driver Alex Elder
2019-05-12  1:25 ` [PATCH 17/18] arm64: dts: sdm845: add IPA information Alex Elder
2019-05-12  1:25 ` [PATCH 18/18] arm64: defconfig: enable build of IPA code Alex Elder
2019-05-15  8:23   ` Arnd Bergmann
2019-05-15 12:49     ` Alex Elder
2019-05-15 12:37 ` [PATCH 00/18] net: introduce Qualcomm IPA driver Arnd Bergmann
2019-05-15 12:52   ` Alex Elder

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dcd648f2-5305-04dd-8997-be87a9961fd9@linaro.org \
    --to=elder@linaro.org \
    --cc=abhishek.esse@gmail.com \
    --cc=arnd@arndb.de \
    --cc=benchan@google.com \
    --cc=bjorn.andersson@linaro.org \
    --cc=davem@davemloft.net \
    --cc=ejcaruso@google.com \
    --cc=evgreen@chromium.org \
    --cc=ilias.apalodimas@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mjavid@codeaurora.org \
    --cc=syadagir@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).