* [PATCH v2 1/2] dcache: allow word-at-a-time name hashing with big-endian CPUs
From: Will Deacon @ 2013-12-12 17:40 UTC
To: linux-kernel; +Cc: torvalds, viro, Will Deacon
When explicitly hashing the end of a string with the word-at-a-time
interface, we have to be careful which end of the word we pick up.
On big-endian CPUs, the upper bits will contain the data we're after,
so ensure we generate our masks accordingly (and avoid hashing whatever
random junk may have been sitting after the string).
This patch adds a new dcache helper, bytemask_from_count, which creates
a mask appropriate for the CPU endianness.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
v1 -> v2: moved the shifting out into a macro to avoid inlining the #ifdefs.
I didn't bother checking for CONFIG_DCACHE_WORD_ACCESS, since
the macros are harmless enough as they are.
fs/dcache.c | 2 +-
fs/namei.c | 7 +------
include/linux/dcache.h | 2 ++
3 files changed, 4 insertions(+), 7 deletions(-)
diff --git a/fs/dcache.c b/fs/dcache.c
index 4bdb300b16e2..6055d61811d3 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -192,7 +192,7 @@ static inline int dentry_string_cmp(const unsigned char *cs, const unsigned char
if (!tcount)
return 0;
}
- mask = ~(~0ul << tcount*8);
+ mask = bytemask_from_count(tcount);
return unlikely(!!((a ^ b) & mask));
}
diff --git a/fs/namei.c b/fs/namei.c
index c53d3a9547f9..3531deebad30 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1598,11 +1598,6 @@ static inline int nested_symlink(struct path *path, struct nameidata *nd)
* do a "get_unaligned()" if this helps and is sufficiently
* fast.
*
- * - Little-endian machines (so that we can generate the mask
- * of low bytes efficiently). Again, we *could* do a byte
- * swapping load on big-endian architectures if that is not
- * expensive enough to make the optimization worthless.
- *
* - non-CONFIG_DEBUG_PAGEALLOC configurations (so that we
* do not trap on the (extremely unlikely) case of a page
* crossing operation.
@@ -1646,7 +1641,7 @@ unsigned int full_name_hash(const unsigned char *name, unsigned int len)
if (!len)
goto done;
}
- mask = ~(~0ul << len*8);
+ mask = bytemask_from_count(len);
hash += mask & a;
done:
return fold_hash(hash);
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 57e87e749a48..bf72e9ac6de0 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -29,8 +29,10 @@ struct vfsmount;
/* The hash is always the low bits of hash_len */
#ifdef __LITTLE_ENDIAN
#define HASH_LEN_DECLARE u32 hash; u32 len;
+ #define bytemask_from_count(cnt) (~(~0ul << (cnt)*8))
#else
#define HASH_LEN_DECLARE u32 len; u32 hash;
+ #define bytemask_from_count(cnt) (~(~0ul >> (cnt)*8))
#endif
/*
--
1.8.2.2
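
To illustrate why the mask direction matters, here is a minimal user-space sketch of the two bytemask_from_count variants from the patch. It is not kernel code: the names bytemask_le, bytemask_be, load_le and load_be are hypothetical, the word is fixed at 64 bits, and the loads are emulated so that both byte orders can be tried on any host. As at the patch's call sites, the count is assumed to be between 1 and 7, since a shift by the full word width is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Little-endian view: the first cnt string bytes sit in the LOW bytes. */
static inline uint64_t bytemask_le(unsigned int cnt)
{
	assert(cnt > 0 && cnt < sizeof(uint64_t));
	return ~(~UINT64_C(0) << (cnt * 8));
}

/* Big-endian view: the first cnt string bytes sit in the HIGH bytes. */
static inline uint64_t bytemask_be(unsigned int cnt)
{
	assert(cnt > 0 && cnt < sizeof(uint64_t));
	return ~(~UINT64_C(0) >> (cnt * 8));
}

/* Emulate an 8-byte load as a little-endian CPU would see it. */
static inline uint64_t load_le(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/* Emulate an 8-byte load as a big-endian CPU would see it. */
static inline uint64_t load_be(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}
```

With a three-byte name "abc" followed by junk, load_be(buf) & bytemask_be(3) keeps only 0x61, 0x62 and 0x63 in the top three bytes, whereas applying the little-endian mask to the big-endian word would keep the trailing junk instead.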
* [PATCH v2 2/2] word-at-a-time: provide generic big-endian zero_bytemask implementation
From: Will Deacon @ 2013-12-12 17:40 UTC
To: linux-kernel; +Cc: torvalds, viro, Will Deacon
Architectures may be able to do better than this, and can override it
simply by defining their own macro; this is a generic stab at a
zero_bytemask implementation for the asm-generic, big-endian
word-at-a-time implementation.
On arm64, a clz instruction is used to implement the fls efficiently.
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
include/asm-generic/word-at-a-time.h | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/include/asm-generic/word-at-a-time.h b/include/asm-generic/word-at-a-time.h
index 3f21f1b72e45..d3909effd725 100644
--- a/include/asm-generic/word-at-a-time.h
+++ b/include/asm-generic/word-at-a-time.h
@@ -49,4 +49,12 @@ static inline bool has_zero(unsigned long val, unsigned long *data, const struct
return (val + c->high_bits) & ~rhs;
}
+#ifndef zero_bytemask
+#ifdef CONFIG_64BIT
+#define zero_bytemask(mask) (~0ul << fls64(mask))
+#else
+#define zero_bytemask(mask) (~0ul << fls(mask))
+#endif /* CONFIG_64BIT */
+#endif /* zero_bytemask */
+
#endif /* _ASM_WORD_AT_A_TIME_H */
--
1.8.2.2
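
The following user-space sketch shows how the 64-bit zero_bytemask macro above behaves. It assumes the big-endian zero-detection mask has bit 0x80 set in each terminating byte, so that the most significant set bit marks the first terminator. The names fls64_sketch and zero_bytemask_be are hypothetical; fls64_sketch is a portable stand-in that follows the same contract as the kernel's fls64 (1-based index of the highest set bit, 0 for no bits set). The explicit guard for a shift count of 64 is an addition in this sketch and is not part of the patch: C leaves a shift by the full word width undefined, which is the case when the very first byte is the terminator.

```c
#include <assert.h>
#include <stdint.h>

/* 1-based index of the most significant set bit; 0 when x == 0. */
static inline unsigned int fls64_sketch(uint64_t x)
{
	unsigned int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/*
 * Big-endian: keep only the bytes that precede the first terminator.
 * 'mask' has 0x80 set in every byte that matched the terminator.
 */
static inline uint64_t zero_bytemask_be(uint64_t mask)
{
	unsigned int shift = fls64_sketch(mask);

	/* Guard added for this sketch: shifting by 64 is undefined in C. */
	if (shift >= 64)
		return 0;
	return ~UINT64_C(0) << shift;
}
```

For the word holding "abc\0XYZW" in big-endian order, the terminator is the fourth byte, so the detection mask is 0x0000008000000000, fls64 returns 40, and the resulting byte mask 0xFFFFFF0000000000 selects exactly "abc".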
* Re: [PATCH v2 1/2] dcache: allow word-at-a-time name hashing with big-endian CPUs
From: Linus Torvalds @ 2013-12-12 18:38 UTC
To: Will Deacon, Benjamin Herrenschmidt; +Cc: Linux Kernel Mailing List, Al Viro
Ok, looks good. So I'm taking these even though it's outside the merge
window - they're small, and until you enable DCACHE_WORD_ACCESS on
big-endian they don't actually do anything.
Ben - you might want to check if this makes DCACHE_WORD_ACCESS useful
on PowerPC too. Right now that requires an efficient fls()
implementation, but I think you have "cntlz" like ARM does.
Linus
* Re: [PATCH v2 1/2] dcache: allow word-at-a-time name hashing with big-endian CPUs
From: Benjamin Herrenschmidt @ 2013-12-12 23:47 UTC
To: Linus Torvalds; +Cc: Will Deacon, Linux Kernel Mailing List, Al Viro
On Thu, 2013-12-12 at 10:38 -0800, Linus Torvalds wrote:
> Ok, looks good. So I'm taking these even though it's outside the merge
> window - they're small, and until you enable DCACHE_WORD_ACCESS on
> big-endian they don't actually do anything.
>
> Ben - you might want to check if this makes DCACHE_WORD_ACCESS useful
> on PowerPC too. Right now that requires an efficient fls()
> implementation, but I think you have "cntlz" like ARM does.
Thanks. I'll see if I can get somebody to measure it. It should be good
though.
Cheers,
Ben.