From: guoren@kernel.org
To: julien.grall@arm.com, arnd@arndb.de, linux-kernel@vger.kernel.org
Cc: linux-csky@vger.kernel.org, Guo Ren <ren_guo@c-sky.com>,
	Catalin Marinas <catalin.marinas@arm.com>
Subject: [PATCH] arm64: asid: Optimize cache_flush for SMT
Date: Mon, 24 Jun 2019 00:04:29 +0800	[thread overview]
Message-ID: <1561305869-18872-1-git-send-email-guoren@kernel.org> (raw)

From: Guo Ren <ren_guo@c-sky.com>

On an SMT+SMP system, the hardware threads of one core may share the
same TLB. Assume the hardware threads are numbered like this:

| 0 1 2 3 | 4 5 6 7 | 8 9 a b | c d e f |
   core1     core2     core3     core4

The current algorithm is correct for SMT+SMP, but it issues duplicate
local_tlb_flush calls: one hardware thread's local_tlb_flush also
flushes the TLB entries of the other hardware threads in the same core,
since they share one TLB.

So we can use a bitmap to avoid the redundant local_tlb_flush calls on
SMT.

C-SKY cores don't support SMT, so this patch brings no benefit to C-SKY
itself.

Signed-off-by: Guo Ren <ren_guo@c-sky.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Julien Grall <julien.grall@arm.com>
---
 arch/csky/include/asm/asid.h |  4 ++++
 arch/csky/mm/asid.c          | 11 ++++++++++-
 arch/csky/mm/context.c       |  2 +-
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/csky/include/asm/asid.h b/arch/csky/include/asm/asid.h
index ac08b0f..f654492 100644
--- a/arch/csky/include/asm/asid.h
+++ b/arch/csky/include/asm/asid.h
@@ -23,6 +23,9 @@ struct asid_info
 	unsigned int		ctxt_shift;
 	/* Callback to locally flush the context. */
 	void			(*flush_cpu_ctxt_cb)(void);
+	/* To reduce duplicate tlb_flush for SMT */
+	unsigned int		harts_per_core;
+	unsigned int		harts_per_core_mask;
 };
 
 #define NUM_ASIDS(info)			(1UL << ((info)->bits))
@@ -73,6 +76,7 @@ static inline void asid_check_context(struct asid_info *info,
 
 int asid_allocator_init(struct asid_info *info,
 			u32 bits, unsigned int asid_per_ctxt,
+			unsigned int harts_per_core,
 			void (*flush_cpu_ctxt_cb)(void));
 
 #endif
diff --git a/arch/csky/mm/asid.c b/arch/csky/mm/asid.c
index b2e9147..50a983e 100644
--- a/arch/csky/mm/asid.c
+++ b/arch/csky/mm/asid.c
@@ -148,8 +148,13 @@ void asid_new_context(struct asid_info *info, atomic64_t *pasid,
 		atomic64_set(pasid, asid);
 	}
 
-	if (cpumask_test_and_clear_cpu(cpu, &info->flush_pending))
+	if (cpumask_test_cpu(cpu, &info->flush_pending)) {
+		unsigned int i;
+		unsigned int harts_base = cpu & info->harts_per_core_mask;
 		info->flush_cpu_ctxt_cb();
+		for (i = 0; i < info->harts_per_core; i++)
+			cpumask_clear_cpu(harts_base + i, &info->flush_pending);
+	}
 
 	atomic64_set(&active_asid(info, cpu), asid);
 	cpumask_set_cpu(cpu, mm_cpumask(mm));
@@ -162,15 +167,19 @@ void asid_new_context(struct asid_info *info, atomic64_t *pasid,
  * @info: Pointer to the asid allocator structure
  * @bits: Number of ASIDs available
  * @asid_per_ctxt: Number of ASIDs to allocate per-context. ASIDs are
  * allocated contiguously for a given context. This value should be a power of
  * 2.
+ * @harts_per_core: Number of hardware threads per core; must be a power of 2.
  */
 int asid_allocator_init(struct asid_info *info,
 			u32 bits, unsigned int asid_per_ctxt,
+			unsigned int harts_per_core,
 			void (*flush_cpu_ctxt_cb)(void))
 {
 	info->bits = bits;
 	info->ctxt_shift = ilog2(asid_per_ctxt);
+	info->harts_per_core = harts_per_core;
+	info->harts_per_core_mask = ~((1 << ilog2(harts_per_core)) - 1);
 	info->flush_cpu_ctxt_cb = flush_cpu_ctxt_cb;
 	/*
 	 * Expect allocation after rollover to fail if we don't have at least
diff --git a/arch/csky/mm/context.c b/arch/csky/mm/context.c
index 0d95bdd..b58523b 100644
--- a/arch/csky/mm/context.c
+++ b/arch/csky/mm/context.c
@@ -30,7 +30,7 @@ static int asids_init(void)
 {
 	BUG_ON(((1 << CONFIG_CPU_ASID_BITS) - 1) <= num_possible_cpus());
 
-	if (asid_allocator_init(&asid_info, CONFIG_CPU_ASID_BITS, 1,
+	if (asid_allocator_init(&asid_info, CONFIG_CPU_ASID_BITS, 1, 1,
 				asid_flush_cpu_ctxt))
 		panic("Unable to initialize ASID allocator for %lu ASIDs\n",
 		      NUM_ASIDS(&asid_info));
-- 
2.7.4



Thread overview: 4+ messages
2019-06-23 16:04 guoren [this message]
2019-06-24 11:40 ` [PATCH] arm64: asid: Optimize cache_flush for SMT Mark Rutland
2019-06-24 12:25   ` Guo Ren
2019-06-25  7:25   ` Palmer Dabbelt
