From: Zhen Lei <thunder.leizhen@huawei.com>
To: Will Deacon <will@kernel.org>,
Robin Murphy <robin.murphy@arm.com>,
"Joerg Roedel" <joro@8bytes.org>,
linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
iommu <iommu@lists.linux-foundation.org>,
<linux-kernel@vger.kernel.org>
Cc: Zhen Lei <thunder.leizhen@huawei.com>,
John Garry <john.garry@huawei.com>
Subject: [PATCH v2 2/2] iommu/arm-smmu-v3: Simplify useless instructions in arm_smmu_cmdq_build_cmd()
Date: Wed, 18 Aug 2021 16:04:52 +0800 [thread overview]
Message-ID: <20210818080452.2079-3-thunder.leizhen@huawei.com> (raw)
In-Reply-To: <20210818080452.2079-1-thunder.leizhen@huawei.com>
Although the parameter 'cmd' is always passed by a local array variable,
and only this function modifies it, the compiler does not know this. The
compiler almost always reads the value of cmd[i] from memory rather than
directly using the value cached in the register. This generates many
useless instruction operations and affects the performance to some extent.
To guide the compiler for proper optimization, 'cmd' is defined as a local
array variable, marked as register, and copied to the output parameter at
a time when the function is returned.
The optimization effect can be viewed by running the "size arm-smmu-v3.o"
command.
Before:
text data bss dec hex
26954 1348 56 28358 6ec6
After:
text data bss dec hex
26762 1348 56 28166 6e06
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 01e95b56ffa07d1..7cec0c967f71d86 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -234,10 +234,12 @@ static int queue_remove_raw(struct arm_smmu_queue *q, u64 *ent)
}
/* High-level queue accessors */
-static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
+static int arm_smmu_cmdq_build_cmd(u64 *out_cmd, struct arm_smmu_cmdq_ent *ent)
{
- memset(cmd, 0, 1 << CMDQ_ENT_SZ_SHIFT);
- cmd[0] |= FIELD_PREP(CMDQ_0_OP, ent->opcode);
+ register u64 cmd[CMDQ_ENT_DWORDS];
+
+ cmd[0] = FIELD_PREP(CMDQ_0_OP, ent->opcode);
+ cmd[1] = 0;
switch (ent->opcode) {
case CMDQ_OP_TLBI_EL2_ALL:
@@ -332,6 +334,9 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
return -ENOENT;
}
+ out_cmd[0] = cmd[0];
+ out_cmd[1] = cmd[1];
+
return 0;
}
--
2.26.0.106.g9fadedd
WARNING: multiple messages have this Message-ID (diff)
From: Zhen Lei <thunder.leizhen@huawei.com>
To: Will Deacon <will@kernel.org>,
Robin Murphy <robin.murphy@arm.com>,
"Joerg Roedel" <joro@8bytes.org>,
linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
iommu <iommu@lists.linux-foundation.org>,
<linux-kernel@vger.kernel.org>
Subject: [PATCH v2 2/2] iommu/arm-smmu-v3: Simplify useless instructions in arm_smmu_cmdq_build_cmd()
Date: Wed, 18 Aug 2021 16:04:52 +0800 [thread overview]
Message-ID: <20210818080452.2079-3-thunder.leizhen@huawei.com> (raw)
In-Reply-To: <20210818080452.2079-1-thunder.leizhen@huawei.com>
Although the parameter 'cmd' is always passed by a local array variable,
and only this function modifies it, the compiler does not know this. The
compiler almost always reads the value of cmd[i] from memory rather than
directly using the value cached in the register. This generates many
useless instruction operations and affects the performance to some extent.
To guide the compiler for proper optimization, 'cmd' is defined as a local
array variable, marked as register, and copied to the output parameter at
a time when the function is returned.
The optimization effect can be viewed by running the "size arm-smmu-v3.o"
command.
Before:
text data bss dec hex
26954 1348 56 28358 6ec6
After:
text data bss dec hex
26762 1348 56 28166 6e06
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 01e95b56ffa07d1..7cec0c967f71d86 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -234,10 +234,12 @@ static int queue_remove_raw(struct arm_smmu_queue *q, u64 *ent)
}
/* High-level queue accessors */
-static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
+static int arm_smmu_cmdq_build_cmd(u64 *out_cmd, struct arm_smmu_cmdq_ent *ent)
{
- memset(cmd, 0, 1 << CMDQ_ENT_SZ_SHIFT);
- cmd[0] |= FIELD_PREP(CMDQ_0_OP, ent->opcode);
+ register u64 cmd[CMDQ_ENT_DWORDS];
+
+ cmd[0] = FIELD_PREP(CMDQ_0_OP, ent->opcode);
+ cmd[1] = 0;
switch (ent->opcode) {
case CMDQ_OP_TLBI_EL2_ALL:
@@ -332,6 +334,9 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
return -ENOENT;
}
+ out_cmd[0] = cmd[0];
+ out_cmd[1] = cmd[1];
+
return 0;
}
--
2.26.0.106.g9fadedd
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
WARNING: multiple messages have this Message-ID (diff)
From: Zhen Lei <thunder.leizhen@huawei.com>
To: Will Deacon <will@kernel.org>,
Robin Murphy <robin.murphy@arm.com>,
"Joerg Roedel" <joro@8bytes.org>,
linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
iommu <iommu@lists.linux-foundation.org>,
<linux-kernel@vger.kernel.org>
Cc: Zhen Lei <thunder.leizhen@huawei.com>,
John Garry <john.garry@huawei.com>
Subject: [PATCH v2 2/2] iommu/arm-smmu-v3: Simplify useless instructions in arm_smmu_cmdq_build_cmd()
Date: Wed, 18 Aug 2021 16:04:52 +0800 [thread overview]
Message-ID: <20210818080452.2079-3-thunder.leizhen@huawei.com> (raw)
In-Reply-To: <20210818080452.2079-1-thunder.leizhen@huawei.com>
Although the parameter 'cmd' is always passed by a local array variable,
and only this function modifies it, the compiler does not know this. The
compiler almost always reads the value of cmd[i] from memory rather than
directly using the value cached in the register. This generates many
useless instruction operations and affects the performance to some extent.
To guide the compiler for proper optimization, 'cmd' is defined as a local
array variable, marked as register, and copied to the output parameter at
a time when the function is returned.
The optimization effect can be viewed by running the "size arm-smmu-v3.o"
command.
Before:
text data bss dec hex
26954 1348 56 28358 6ec6
After:
text data bss dec hex
26762 1348 56 28166 6e06
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 01e95b56ffa07d1..7cec0c967f71d86 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -234,10 +234,12 @@ static int queue_remove_raw(struct arm_smmu_queue *q, u64 *ent)
}
/* High-level queue accessors */
-static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
+static int arm_smmu_cmdq_build_cmd(u64 *out_cmd, struct arm_smmu_cmdq_ent *ent)
{
- memset(cmd, 0, 1 << CMDQ_ENT_SZ_SHIFT);
- cmd[0] |= FIELD_PREP(CMDQ_0_OP, ent->opcode);
+ register u64 cmd[CMDQ_ENT_DWORDS];
+
+ cmd[0] = FIELD_PREP(CMDQ_0_OP, ent->opcode);
+ cmd[1] = 0;
switch (ent->opcode) {
case CMDQ_OP_TLBI_EL2_ALL:
@@ -332,6 +334,9 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
return -ENOENT;
}
+ out_cmd[0] = cmd[0];
+ out_cmd[1] = cmd[1];
+
return 0;
}
--
2.26.0.106.g9fadedd
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2021-08-18 8:05 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-18 8:04 [PATCH v2 0/2] iommu/arm-smmu-v3: Perform some simple optimizations for arm_smmu_cmdq_build_cmd() Zhen Lei
2021-08-18 8:04 ` Zhen Lei
2021-08-18 8:04 ` Zhen Lei
2021-08-18 8:04 ` [PATCH v2 1/2] iommu/arm-smmu-v3: Properly handle the return value of arm_smmu_cmdq_build_cmd() Zhen Lei
2021-08-18 8:04 ` Zhen Lei
2021-08-18 8:04 ` Zhen Lei
2021-08-18 8:04 ` Zhen Lei [this message]
2021-08-18 8:04 ` [PATCH v2 2/2] iommu/arm-smmu-v3: Simplify useless instructions in arm_smmu_cmdq_build_cmd() Zhen Lei
2021-08-18 8:04 ` Zhen Lei
2021-10-04 12:07 ` Will Deacon
2021-10-04 12:07 ` Will Deacon
2021-10-04 12:07 ` Will Deacon
2021-10-20 7:29 ` Leizhen (ThunderTown)
2021-10-20 7:29 ` Leizhen (ThunderTown)
2021-10-20 7:29 ` Leizhen (ThunderTown)
2021-10-04 12:05 ` [PATCH v2 0/2] iommu/arm-smmu-v3: Perform some simple optimizations for arm_smmu_cmdq_build_cmd() Will Deacon
2021-10-04 12:05 ` Will Deacon
2021-10-04 12:05 ` Will Deacon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210818080452.2079-3-thunder.leizhen@huawei.com \
--to=thunder.leizhen@huawei.com \
--cc=iommu@lists.linux-foundation.org \
--cc=john.garry@huawei.com \
--cc=joro@8bytes.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=robin.murphy@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.