* [RFC][PATCH] apparent big-endian bugs in dwc-xlgmac @ 2017-12-10 4:53 Al Viro 2017-12-11 4:33 ` Jie Deng 0 siblings, 1 reply; 26+ messages in thread From: Al Viro @ 2017-12-10 4:53 UTC (permalink / raw) To: Jie Deng; +Cc: netdev In xlgmac_dev_xmit(): /* Mark it as a CONTEXT descriptor */ dma_desc->desc3 = XLGMAC_SET_REG_BITS_LE( dma_desc->desc3, TX_CONTEXT_DESC3_CTXT_POS, TX_CONTEXT_DESC3_CTXT_LEN, 1); Looking at XLGMAC_SET_REG_BITS_LE() we see this: #define XLGMAC_SET_REG_BITS_LE(var, pos, len, val) ({ \ typeof(var) _var = (var); \ typeof(pos) _pos = (pos); \ typeof(len) _len = (len); \ typeof(val) _val = (val); \ _val = (_val << _pos) & GENMASK(_pos + _len - 1, _pos); \ _var = (_var & ~GENMASK(_pos + _len - 1, _pos)) | _val; \ cpu_to_le32(_var); \ }) That thing assumes var to be host-endian and has a little-endian result. Unfortunately, we feed it a little-endian and store the result back into the same place. That might work if the original values *was* host-endian and we wanted to end up with little-endian. However, that is immediately followed by /* Indicate this descriptor contains the MSS */ dma_desc->desc3 = XLGMAC_SET_REG_BITS_LE( dma_desc->desc3, TX_CONTEXT_DESC3_TCMSSV_POS, TX_CONTEXT_DESC3_TCMSSV_LEN, 1); where we operate on the now definitely little-endian value. That really can't be right. I don't have the hardware in question, so I can't test that, but it smells like this needs something like diff below, making XLGMAC_SET_REG_BITS_LE take le32 and return le32. GET side of things is le32 -> u32; definition looks correct, but slightly misannotated. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> --- diff --git a/drivers/net/ethernet/synopsys/dwc-xlgmac.h b/drivers/net/ethernet/synopsys/dwc-xlgmac.h index cab3e40a86b9..e95c4c250e16 100644 --- a/drivers/net/ethernet/synopsys/dwc-xlgmac.h +++ b/drivers/net/ethernet/synopsys/dwc-xlgmac.h @@ -106,7 +106,7 @@ #define XLGMAC_GET_REG_BITS_LE(var, pos, len) ({ \ typeof(pos) _pos = (pos); \ typeof(len) _len = (len); \ - typeof(var) _var = le32_to_cpu((var)); \ + u32 _var = le32_to_cpu((var)); \ ((_var) & GENMASK(_pos + _len - 1, _pos)) >> (_pos); \ }) @@ -125,8 +125,8 @@ typeof(len) _len = (len); \ typeof(val) _val = (val); \ _val = (_val << _pos) & GENMASK(_pos + _len - 1, _pos); \ - _var = (_var & ~GENMASK(_pos + _len - 1, _pos)) | _val; \ - cpu_to_le32(_var); \ + (_var & ~cpu_to_le32(GENMASK(_pos + _len - 1, _pos))) | \ + cpu_to_le32(_val); \ }) struct xlgmac_pdata; ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] apparent big-endian bugs in dwc-xlgmac 2017-12-10 4:53 [RFC][PATCH] apparent big-endian bugs in dwc-xlgmac Al Viro @ 2017-12-11 4:33 ` Jie Deng 2017-12-11 5:05 ` Al Viro 0 siblings, 1 reply; 26+ messages in thread From: Jie Deng @ 2017-12-11 4:33 UTC (permalink / raw) To: Al Viro, Jie Deng; +Cc: netdev Hi AI Viro, On 2017/12/10 12:53, Al Viro wrote: > In xlgmac_dev_xmit(): > > /* Mark it as a CONTEXT descriptor */ > dma_desc->desc3 = XLGMAC_SET_REG_BITS_LE( > dma_desc->desc3, > TX_CONTEXT_DESC3_CTXT_POS, > TX_CONTEXT_DESC3_CTXT_LEN, > 1); > > > > Looking at XLGMAC_SET_REG_BITS_LE() we see this: > #define XLGMAC_SET_REG_BITS_LE(var, pos, len, val) ({ \ > typeof(var) _var = (var); \ > typeof(pos) _pos = (pos); \ > typeof(len) _len = (len); \ > typeof(val) _val = (val); \ > _val = (_val << _pos) & GENMASK(_pos + _len - 1, _pos); \ > _var = (_var & ~GENMASK(_pos + _len - 1, _pos)) | _val; \ > cpu_to_le32(_var); \ > }) > > That thing assumes var to be host-endian and has a little-endian result. > Unfortunately, we feed it a little-endian and store the result back into > the same place. That might work if the original values *was* host-endian > and we wanted to end up with little-endian. However, that is immediately > followed by > /* Indicate this descriptor contains the MSS */ > dma_desc->desc3 = XLGMAC_SET_REG_BITS_LE( > dma_desc->desc3, > TX_CONTEXT_DESC3_TCMSSV_POS, > TX_CONTEXT_DESC3_TCMSSV_LEN, > 1); > > where we operate on the now definitely little-endian value. That really > can't be right. I don't have the hardware in question, so I can't test > that, but it smells like this needs something like diff below, making > XLGMAC_SET_REG_BITS_LE take le32 and return le32. GET side of things > is le32 -> u32; definition looks correct, but slightly misannotated. > > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> > --- > diff --git a/drivers/net/ethernet/synopsys/dwc-xlgmac.h b/drivers/net/ethernet/synopsys/dwc-xlgmac.h > index cab3e40a86b9..e95c4c250e16 100644 > --- a/drivers/net/ethernet/synopsys/dwc-xlgmac.h > +++ b/drivers/net/ethernet/synopsys/dwc-xlgmac.h > @@ -106,7 +106,7 @@ > #define XLGMAC_GET_REG_BITS_LE(var, pos, len) ({ \ > typeof(pos) _pos = (pos); \ > typeof(len) _len = (len); \ > - typeof(var) _var = le32_to_cpu((var)); \ > + u32 _var = le32_to_cpu((var)); \ > ((_var) & GENMASK(_pos + _len - 1, _pos)) >> (_pos); \ > }) > > @@ -125,8 +125,8 @@ > typeof(len) _len = (len); \ > typeof(val) _val = (val); \ > _val = (_val << _pos) & GENMASK(_pos + _len - 1, _pos); \ > - _var = (_var & ~GENMASK(_pos + _len - 1, _pos)) | _val; \ > - cpu_to_le32(_var); \ > + (_var & ~cpu_to_le32(GENMASK(_pos + _len - 1, _pos))) | \ > + cpu_to_le32(_val); \ > }) > > struct xlgmac_pdata; Make sense. But I think what you want is fix as follows. Right ? @@ -125,8 +125,8 @@ typeof(len) _len = (len); \ - typeof(val) _val = (val); \ + u32 _var = le32_to_cpu((var)); \ _val = (_val << _pos) & GENMASK(_pos + _len - 1, _pos); \ - _var = (_var & ~GENMASK(_pos + _len - 1, _pos)) | _val; \ - cpu_to_le32(_var); \ + (cpu_to_le32(_var) & ~cpu_to_le32(GENMASK(_pos + _len - 1, _pos))) | \ + cpu_to_le32(_val); \ }) ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] apparent big-endian bugs in dwc-xlgmac 2017-12-11 4:33 ` Jie Deng @ 2017-12-11 5:05 ` Al Viro 2017-12-11 5:38 ` Al Viro 2017-12-11 6:18 ` [RFC][PATCH] apparent big-endian bugs in dwc-xlgmac Jie Deng 0 siblings, 2 replies; 26+ messages in thread From: Al Viro @ 2017-12-11 5:05 UTC (permalink / raw) To: Jie Deng; +Cc: netdev On Mon, Dec 11, 2017 at 12:33:42PM +0800, Jie Deng wrote: > Hi AI Viro, > > @@ -125,8 +125,8 @@ > > typeof(len) _len = (len); \ > > typeof(val) _val = (val); \ > > _val = (_val << _pos) & GENMASK(_pos + _len - 1, _pos); \ > > - _var = (_var & ~GENMASK(_pos + _len - 1, _pos)) | _val; \ > > - cpu_to_le32(_var); \ > > + (_var & ~cpu_to_le32(GENMASK(_pos + _len - 1, _pos))) | \ > > + cpu_to_le32(_val); \ > > }) > > > > struct xlgmac_pdata; > > Make sense. But I think what you want is fix as follows. Right ? > > @@ -125,8 +125,8 @@ > typeof(len) _len = (len); \ > - typeof(val) _val = (val); \ > + u32 _var = le32_to_cpu((var)); \ > _val = (_val << _pos) & GENMASK(_pos + _len - 1, _pos); \ > - _var = (_var & ~GENMASK(_pos + _len - 1, _pos)) | _val; \ > - cpu_to_le32(_var); \ > + (cpu_to_le32(_var) & ~cpu_to_le32(GENMASK(_pos + _len - 1, _pos))) | \ > + cpu_to_le32(_val); \ > }) What for? Sure, this variant will work, but why bother with a = le32_to_cpu(b); (cpu_to_le32(a) & ....) | .... and how is that better than (b & ...) | ... IDGI... Mind you, I'm not sure if there is any point keeping _var in that thing, seeing that we use var only once - might be better off with ((var) & ~cpu_to_le32(GENMASK(_pos + _len - 1, _pos))) | \ cpu_to_le32(_val); \ ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] apparent big-endian bugs in dwc-xlgmac 2017-12-11 5:05 ` Al Viro @ 2017-12-11 5:38 ` Al Viro 2017-12-11 6:46 ` Jie Deng 2017-12-11 15:54 ` [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() Al Viro 2017-12-11 6:18 ` [RFC][PATCH] apparent big-endian bugs in dwc-xlgmac Jie Deng 1 sibling, 2 replies; 26+ messages in thread From: Al Viro @ 2017-12-11 5:38 UTC (permalink / raw) To: Jie Deng; +Cc: netdev, linux-kernel, Linus Torvalds On Mon, Dec 11, 2017 at 05:05:20AM +0000, Al Viro wrote: > What for? Sure, this variant will work, but why bother with > a = le32_to_cpu(b); > (cpu_to_le32(a) & ....) | .... > and how is that better than > (b & ...) | ... > > IDGI... Mind you, I'm not sure if there is any point keeping _var in that thing, > seeing that we use var only once - might be better off with > ((var) & ~cpu_to_le32(GENMASK(_pos + _len - 1, _pos))) | \ > cpu_to_le32(_val); \ FWIW, seeing how many drivers end up open-coding that, I'm rather tempted to add to linux/bitops.h or linux/byteorder/generic.h the following: static inline __le16 le16_replace_bits(__le16 old, u16 v, int bit, int size) { __le16 mask = cpu_to_le16(GENMASK(bit + size - 1, bit)); return (old & ~mask) | (cpu_to_le16(v << bit) & mask); } static inline __le32 le32_replace_bits(__le32 old, u32 v, int bit, int size) { __le32 mask = cpu_to_le32(GENMASK(bit + size - 1, bit)); return (old & ~mask) | (cpu_to_le32(v << bit) & mask); } static inline __le64 le64_replace_bits(__le64 old, u64 v, int bit, int size) { __le64 mask = cpu_to_le64(GENMASK_ULL(bit + size - 1, bit)); return (old & ~mask) | (cpu_to_le64(v << bit) & mask); } static inline __be16 be16_replace_bits(__be16 old, u16 v, int bit, int size) { __be16 mask = cpu_to_be16(GENMASK(bit + size - 1, bit)); return (old & ~mask) | (cpu_to_be16(v << bit) & mask); } static inline __be32 be32_replace_bits(__be32 old, u32 v, int bit, int size) { __be32 mask = cpu_to_be32(GENMASK(bit + size - 1, bit)); return (old & ~mask) | (cpu_to_be32(v << bit) & mask); } static inline __be64 be64_replace_bits(__be64 old, u64 v, int bit, int size) { __be64 mask = cpu_to_be64(GENMASK_ULL(bit + size - 1, bit)); return (old & ~mask) | (cpu_to_be64(v << bit) & mask); } static inline u16 le16_get_bits(__le16 v, int bit, int size) { return (le16_to_cpu(v) >> bit) & (BIT(size) - 1); } static inline u32 le32_get_bits(__le32 v, int bit, int size) { return (le32_to_cpu(v) >> bit) & (BIT(size) - 1); } static inline u64 le64_get_bits(__le64 v, int bit, int size) { return (le64_to_cpu(v) >> bit) & (BIT_ULL(size) - 1); } static inline u16 be16_get_bits(__be16 v, int bit, int size) { return (be16_to_cpu(v) >> bit) & (BIT(size) - 1); } static inline u32 be32_get_bits(__be32 v, int bit, int size) { return (be32_to_cpu(v) >> bit) & (BIT(size) - 1); } static inline u64 be64_get_bits(__be64 v, int bit, int size) { return (be64_to_cpu(v) >> bit) & (BIT_ULL(size) - 1); } and let drivers use those... ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] apparent big-endian bugs in dwc-xlgmac 2017-12-11 5:38 ` Al Viro @ 2017-12-11 6:46 ` Jie Deng 2017-12-11 15:54 ` [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() Al Viro 1 sibling, 0 replies; 26+ messages in thread From: Jie Deng @ 2017-12-11 6:46 UTC (permalink / raw) To: Al Viro, Jie Deng; +Cc: netdev, linux-kernel, Linus Torvalds On 2017/12/11 13:38, Al Viro wrote: > On Mon, Dec 11, 2017 at 05:05:20AM +0000, Al Viro wrote: > >> What for? Sure, this variant will work, but why bother with >> a = le32_to_cpu(b); >> (cpu_to_le32(a) & ....) | .... >> and how is that better than >> (b & ...) | ... >> >> IDGI... Mind you, I'm not sure if there is any point keeping _var in that thing, >> seeing that we use var only once - might be better off with >> ((var) & ~cpu_to_le32(GENMASK(_pos + _len - 1, _pos))) | \ >> cpu_to_le32(_val); \ > FWIW, seeing how many drivers end up open-coding that, I'm rather tempted to > add to linux/bitops.h or linux/byteorder/generic.h the following: > > static inline __le16 le16_replace_bits(__le16 old, u16 v, int bit, int size) > { > __le16 mask = cpu_to_le16(GENMASK(bit + size - 1, bit)); > return (old & ~mask) | (cpu_to_le16(v << bit) & mask); > } > > static inline __le32 le32_replace_bits(__le32 old, u32 v, int bit, int size) > { > __le32 mask = cpu_to_le32(GENMASK(bit + size - 1, bit)); > return (old & ~mask) | (cpu_to_le32(v << bit) & mask); > } > > static inline __le64 le64_replace_bits(__le64 old, u64 v, int bit, int size) > { > __le64 mask = cpu_to_le64(GENMASK_ULL(bit + size - 1, bit)); > return (old & ~mask) | (cpu_to_le64(v << bit) & mask); > } > > static inline __be16 be16_replace_bits(__be16 old, u16 v, int bit, int size) > { > __be16 mask = cpu_to_be16(GENMASK(bit + size - 1, bit)); > return (old & ~mask) | (cpu_to_be16(v << bit) & mask); > } > > static inline __be32 be32_replace_bits(__be32 old, u32 v, int bit, int size) > { > __be32 mask = cpu_to_be32(GENMASK(bit + size - 1, bit)); > return (old & ~mask) | (cpu_to_be32(v << bit) & mask); > } > > static inline __be64 be64_replace_bits(__be64 old, u64 v, int bit, int size) > { > __be64 mask = cpu_to_be64(GENMASK_ULL(bit + size - 1, bit)); > return (old & ~mask) | (cpu_to_be64(v << bit) & mask); > } > > static inline u16 le16_get_bits(__le16 v, int bit, int size) > { > return (le16_to_cpu(v) >> bit) & (BIT(size) - 1); > } > > static inline u32 le32_get_bits(__le32 v, int bit, int size) > { > return (le32_to_cpu(v) >> bit) & (BIT(size) - 1); > } > > static inline u64 le64_get_bits(__le64 v, int bit, int size) > { > return (le64_to_cpu(v) >> bit) & (BIT_ULL(size) - 1); > } > > static inline u16 be16_get_bits(__be16 v, int bit, int size) > { > return (be16_to_cpu(v) >> bit) & (BIT(size) - 1); > } > > static inline u32 be32_get_bits(__be32 v, int bit, int size) > { > return (be32_to_cpu(v) >> bit) & (BIT(size) - 1); > } > > static inline u64 be64_get_bits(__be64 v, int bit, int size) > { > return (be64_to_cpu(v) >> bit) & (BIT_ULL(size) - 1); > } > > and let drivers use those... Sounds good. As a driver developer, I'm happy to see this. ^ permalink raw reply [flat|nested] 26+ messages in thread
* [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() 2017-12-11 5:38 ` Al Viro 2017-12-11 6:46 ` Jie Deng @ 2017-12-11 15:54 ` Al Viro 2017-12-12 4:02 ` Jakub Kicinski 1 sibling, 1 reply; 26+ messages in thread From: Al Viro @ 2017-12-11 15:54 UTC (permalink / raw) To: Linus Torvalds; +Cc: netdev, linux-kernel A lot of drivers are open-coding the "replace these bits in __be32 with the following value" kind of primitives. Let's add them to byteorder.h. Primitives: {be,le}{16,32,64}_replace_bits(old, v, bit, nbits) {be,le}{16,32,64}_get_bits(val, bit, nbits) Essentially, it gives helpers for work with bitfields in fixed-endian. Suppose we have e.g. a little-endian 32bit value with fixed layout; expressing that as a bitfield would go like struct foo { unsigned foo:4; /* bits 0..3 */ unsigned :2; unsigned bar:12; /* bits 6..17 */ unsigned baz:14; /* bits 18..31 */ } Even for host-endian it doesn't work all that well - you end up with ifdefs in structure definition and generated code stinks. For fixed-endian it gets really painful, and people tend to use explicit shift-and-mask kind of macros for accessing the fields (and often enough get the endianness conversions wrong, at that). With these primitives struct foo v <=> __le32 v v.foo = i ? 1 : 2 <=> v = le32_replace_bits(v, i ? 1 : 2, 0, 4) f(4 + v.baz) <=> f(4 + le32_get_bits(v, 18, 14)) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> --- diff --git a/include/linux/byteorder/generic.h b/include/linux/byteorder/generic.h index 451aaa0786ae..d8f169a7104a 100644 --- a/include/linux/byteorder/generic.h +++ b/include/linux/byteorder/generic.h @@ -187,4 +187,26 @@ static inline void be32_to_cpu_array(u32 *dst, const __be32 *src, size_t len) dst[i] = be32_to_cpu(src[i]); } +#define ____MASK(bit, nbits) ((((1ULL << ((nbits) - 1)) << 1) - 1) << (bit)) +#define ____MAKE_OP(type,base) \ +static inline __##type type##_replace_bits(__##type old, \ + base val, int bit, int nbits) \ +{ \ + __##type mask = cpu_to_##type(____MASK(bit, nbits)); \ + return (old & ~mask) | (cpu_to_##type(val << bit) & mask); \ +} \ +static inline base type##_get_bits(__##type val, int bit, int nbits) \ +{ \ + return (type##_to_cpu(val) >> bit) & ____MASK(0, nbits); \ +} + +____MAKE_OP(le16,u16) +____MAKE_OP(le32,u32) +____MAKE_OP(le64,u64) +____MAKE_OP(be16,u16) +____MAKE_OP(be32,u32) +____MAKE_OP(be64,u64) +#undef ____MAKE_OP +#undef ____MASK + #endif /* _LINUX_BYTEORDER_GENERIC_H */ ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() 2017-12-11 15:54 ` [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() Al Viro @ 2017-12-12 4:02 ` Jakub Kicinski 2017-12-12 6:20 ` Al Viro 0 siblings, 1 reply; 26+ messages in thread From: Jakub Kicinski @ 2017-12-12 4:02 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, netdev, linux-kernel On Mon, 11 Dec 2017 15:54:22 +0000, Al Viro wrote: > Essentially, it gives helpers for work with bitfields in fixed-endian. > Suppose we have e.g. a little-endian 32bit value with fixed layout; > expressing that as a bitfield would go like > struct foo { > unsigned foo:4; /* bits 0..3 */ > unsigned :2; > unsigned bar:12; /* bits 6..17 */ > unsigned baz:14; /* bits 18..31 */ > } > Even for host-endian it doesn't work all that well - you end up with > ifdefs in structure definition and generated code stinks. For fixed-endian > it gets really painful, and people tend to use explicit shift-and-mask > kind of macros for accessing the fields (and often enough get the > endianness conversions wrong, at that). With these primitives > > struct foo v <=> __le32 v > v.foo = i ? 1 : 2 <=> v = le32_replace_bits(v, i ? 1 : 2, 0, 4) > f(4 + v.baz) <=> f(4 + le32_get_bits(v, 18, 14)) Looks very useful. The [start bit, size] pair may not land itself too nicely to creating defines, though. Which is why in include/linux/bitfield.h we tried to use a shifted mask and work backwards from that single value what the start and size are. commit 3e9b3112ec74 ("add basic register-field manipulation macros") has the description. Could a similar trick perhaps be applicable here? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() 2017-12-12 4:02 ` Jakub Kicinski @ 2017-12-12 6:20 ` Al Viro 2017-12-12 19:45 ` Al Viro 0 siblings, 1 reply; 26+ messages in thread From: Al Viro @ 2017-12-12 6:20 UTC (permalink / raw) To: Jakub Kicinski; +Cc: Linus Torvalds, netdev, linux-kernel On Mon, Dec 11, 2017 at 08:02:24PM -0800, Jakub Kicinski wrote: > On Mon, 11 Dec 2017 15:54:22 +0000, Al Viro wrote: > > Essentially, it gives helpers for work with bitfields in fixed-endian. > > Suppose we have e.g. a little-endian 32bit value with fixed layout; > > expressing that as a bitfield would go like > > struct foo { > > unsigned foo:4; /* bits 0..3 */ > > unsigned :2; > > unsigned bar:12; /* bits 6..17 */ > > unsigned baz:14; /* bits 18..31 */ > > } > > Even for host-endian it doesn't work all that well - you end up with > > ifdefs in structure definition and generated code stinks. For fixed-endian > > it gets really painful, and people tend to use explicit shift-and-mask > > kind of macros for accessing the fields (and often enough get the > > endianness conversions wrong, at that). With these primitives > > > > struct foo v <=> __le32 v > > v.foo = i ? 1 : 2 <=> v = le32_replace_bits(v, i ? 1 : 2, 0, 4) > > f(4 + v.baz) <=> f(4 + le32_get_bits(v, 18, 14)) > > Looks very useful. The [start bit, size] pair may not land itself > too nicely to creating defines, though. Which is why in > include/linux/bitfield.h we tried to use a shifted mask and work > backwards from that single value what the start and size are. commit > 3e9b3112ec74 ("add basic register-field manipulation macros") has the > description. Could a similar trick perhaps be applicable here? Umm... What's wrong with #define FIELD_FOO 0,4 #define FIELD_BAR 6,12 #define FIELD_BAZ 18,14 A macro can bloody well expand to any sequence of tokens - le32_get_bits(v, FIELD_BAZ) will become le32_get_bits(v, 18, 14) just fine. What's the problem with that? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() 2017-12-12 6:20 ` Al Viro @ 2017-12-12 19:45 ` Al Viro 2017-12-12 20:04 ` Jakub Kicinski 0 siblings, 1 reply; 26+ messages in thread From: Al Viro @ 2017-12-12 19:45 UTC (permalink / raw) To: Jakub Kicinski; +Cc: Linus Torvalds, netdev, linux-kernel On Tue, Dec 12, 2017 at 06:20:02AM +0000, Al Viro wrote: > Umm... What's wrong with > > #define FIELD_FOO 0,4 > #define FIELD_BAR 6,12 > #define FIELD_BAZ 18,14 > > A macro can bloody well expand to any sequence of tokens - le32_get_bits(v, FIELD_BAZ) > will become le32_get_bits(v, 18, 14) just fine. What's the problem with that? FWIW, if you want to use the mask, __builtin_ffsll() is not the only way to do it - you don't need the shift. Multiplier would do just as well, and that can be had easier. If mask = (2*a + 1)<<n = ((2*a)<<n) ^ (1<<n), then mask - 1 = ((2*a) << n) + ((1<<n) - 1) = ((2*n) << n) ^ ((1<<n) - 1) mask ^ (mask - 1) = (1<<n) + ((1<<n) - 1) and mask & (mask ^ (mask - 1)) = 1<<n. IOW, with static __always_inline u64 mask_to_multiplier(u64 mask) { return mask & (mask ^ (mask - 1)); } we could do static __always_inline __le64 le64_replace_bits(__le64 old, u64 v, u64 mask) { __le64 m = cpu_to_le64(mask); return (old & ~m) | (cpu_to_le64(v * mask_to_multiplier(mask)) & m); } static __always_inline u64 le64_get_bits(__le64 v, u64 mask) { return (le64_to_cpu(v) & mask) / mask_to_multiplier(mask); } etc. Compiler will turn those into shifts... I can live with either calling conventions. Comments? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() 2017-12-12 19:45 ` Al Viro @ 2017-12-12 20:04 ` Jakub Kicinski 2017-12-12 23:48 ` Al Viro 0 siblings, 1 reply; 26+ messages in thread From: Jakub Kicinski @ 2017-12-12 20:04 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, netdev, linux-kernel On Tue, 12 Dec 2017 19:45:32 +0000, Al Viro wrote: > On Tue, Dec 12, 2017 at 06:20:02AM +0000, Al Viro wrote: > > > Umm... What's wrong with > > > > #define FIELD_FOO 0,4 > > #define FIELD_BAR 6,12 > > #define FIELD_BAZ 18,14 > > > > A macro can bloody well expand to any sequence of tokens - le32_get_bits(v, FIELD_BAZ) > > will become le32_get_bits(v, 18, 14) just fine. What's the problem with that? > > FWIW, if you want to use the mask, __builtin_ffsll() is not the only way to do > it - you don't need the shift. Multiplier would do just as well, and that can > be had easier. If mask = (2*a + 1)<<n = ((2*a)<<n) ^ (1<<n), then > mask - 1 = ((2*a) << n) + ((1<<n) - 1) = ((2*n) << n) ^ ((1<<n) - 1) > mask ^ (mask - 1) = (1<<n) + ((1<<n) - 1) > and > mask & (mask ^ (mask - 1)) = 1<<n. > > IOW, with > > static __always_inline u64 mask_to_multiplier(u64 mask) > { > return mask & (mask ^ (mask - 1)); > } > > we could do > > static __always_inline __le64 le64_replace_bits(__le64 old, u64 v, u64 mask) > { > __le64 m = cpu_to_le64(mask); > return (old & ~m) | (cpu_to_le64(v * mask_to_multiplier(mask)) & m); > } > > static __always_inline u64 le64_get_bits(__le64 v, u64 mask) > { > return (le64_to_cpu(v) & mask) / mask_to_multiplier(mask); > } > > etc. Compiler will turn those into shifts... I can live with either calling > conventions. > > Comments? Very nice! The compilation-time check if the value can fit in a field covered by the mask (if they're both known) did help me catch bugs early a few times over the years, so if it could be preserved we can maybe even drop the FIELD_* macros and just use this approach? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() 2017-12-12 20:04 ` Jakub Kicinski @ 2017-12-12 23:48 ` Al Viro 2017-12-12 23:59 ` Jakub Kicinski 0 siblings, 1 reply; 26+ messages in thread From: Al Viro @ 2017-12-12 23:48 UTC (permalink / raw) To: Jakub Kicinski; +Cc: Linus Torvalds, netdev, linux-kernel On Tue, Dec 12, 2017 at 12:04:09PM -0800, Jakub Kicinski wrote: > > static __always_inline u64 mask_to_multiplier(u64 mask) > > { > > return mask & (mask ^ (mask - 1)); > > } D'oh. Even simpler than that, of course - static __always_inline u64 mask_to_multiplier(u64 mask) { return mask & -mask; } > Very nice! The compilation-time check if the value can fit in a field > covered by the mask (if they're both known) did help me catch bugs > early a few times over the years, so if it could be preserved we can > maybe even drop the FIELD_* macros and just use this approach? Umm... Something like this, perhaps? Same bunch, plus u{16,32,64}_... variants for host-endian. Adding sanity check on mask is also not hard, but I don't know how useful it actually is... diff --git a/include/linux/byteorder/generic.h b/include/linux/byteorder/generic.h index 451aaa0786ae..a032de9aa03d 100644 --- a/include/linux/byteorder/generic.h +++ b/include/linux/byteorder/generic.h @@ -187,4 +187,36 @@ static inline void be32_to_cpu_array(u32 *dst, const __be32 *src, size_t len) dst[i] = be32_to_cpu(src[i]); } +extern void __compiletime_error("value doesn't fit into mask") +__field_overflow(void); +static __always_inline u64 mask_to_multiplier(u64 mask) +{ + return mask & -mask; +} + +#define ____MAKE_OP(type,base,to,from) \ +static __always_inline __##type type##_replace_bits(__##type old, \ + base val, base mask) \ +{ \ + __##type m = to(mask); \ + if (__builtin_constant_p(val) && \ + (val & ~(mask/mask_to_multiplier(mask)))) \ + __field_overflow(); \ + return (old & ~m) | \ + (to(val * mask_to_multiplier(mask)) & m); \ +} \ +static __always_inline base type##_get_bits(__##type v, base mask) \ +{ \ + return (from(v) & mask)/mask_to_multiplier(mask); \ +} +#define __MAKE_OP(size) \ + ____MAKE_OP(le##size,u##size,cpu_to_le##size,le##size##_to_cpu) \ + ____MAKE_OP(be##size,u##size,cpu_to_be##size,be##size##_to_cpu) \ + ____MAKE_OP(u##size,u##size,,) +__MAKE_OP(16) +__MAKE_OP(32) +__MAKE_OP(64) +#undef __MAKE_OP +#undef ____MAKE_OP + #endif /* _LINUX_BYTEORDER_GENERIC_H */ ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() 2017-12-12 23:48 ` Al Viro @ 2017-12-12 23:59 ` Jakub Kicinski 2017-12-13 0:36 ` Al Viro 0 siblings, 1 reply; 26+ messages in thread From: Jakub Kicinski @ 2017-12-12 23:59 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, netdev, linux-kernel On Tue, 12 Dec 2017 23:48:56 +0000, Al Viro wrote: > On Tue, Dec 12, 2017 at 12:04:09PM -0800, Jakub Kicinski wrote: > > > > static __always_inline u64 mask_to_multiplier(u64 mask) > > > { > > > return mask & (mask ^ (mask - 1)); > > > } > > D'oh. Even simpler than that, of course - > > static __always_inline u64 mask_to_multiplier(u64 mask) > { > return mask & -mask; > } > > > Very nice! The compilation-time check if the value can fit in a field > > covered by the mask (if they're both known) did help me catch bugs > > early a few times over the years, so if it could be preserved we can > > maybe even drop the FIELD_* macros and just use this approach? > > Umm... Something like this, perhaps? Same bunch, plus u{16,32,64}_... > variants for host-endian. Adding sanity check on mask is also not > hard, but I don't know how useful it actually is... Sanity checking the mask is not that useful in my experience. > diff --git a/include/linux/byteorder/generic.h b/include/linux/byteorder/generic.h > index 451aaa0786ae..a032de9aa03d 100644 > --- a/include/linux/byteorder/generic.h > +++ b/include/linux/byteorder/generic.h > @@ -187,4 +187,36 @@ static inline void be32_to_cpu_array(u32 *dst, const __be32 *src, size_t len) > dst[i] = be32_to_cpu(src[i]); > } > > +extern void __compiletime_error("value doesn't fit into mask") > +__field_overflow(void); > +static __always_inline u64 mask_to_multiplier(u64 mask) > +{ > + return mask & -mask; > +} > + > +#define ____MAKE_OP(type,base,to,from) \ > +static __always_inline __##type type##_replace_bits(__##type old, \ > + base val, base mask) \ > +{ \ > + __##type m = to(mask); \ > + if (__builtin_constant_p(val) && \ Is the lack of a __builtin_constant_p(mask) test intentional? Sometimes the bitfield is a packed array and people may have a helper to which only the mask is passed as non-constant and the value is implied by the helper, thus constant. > + (val & ~(mask/mask_to_multiplier(mask)))) \ > + __field_overflow(); \ > + return (old & ~m) | \ > + (to(val * mask_to_multiplier(mask)) & m); \ > +} \ > +static __always_inline base type##_get_bits(__##type v, base mask) \ > +{ \ > + return (from(v) & mask)/mask_to_multiplier(mask); \ > +} > +#define __MAKE_OP(size) \ > + ____MAKE_OP(le##size,u##size,cpu_to_le##size,le##size##_to_cpu) \ > + ____MAKE_OP(be##size,u##size,cpu_to_be##size,be##size##_to_cpu) \ > + ____MAKE_OP(u##size,u##size,,) > +__MAKE_OP(16) > +__MAKE_OP(32) > +__MAKE_OP(64) > +#undef __MAKE_OP > +#undef ____MAKE_OP > + > #endif /* _LINUX_BYTEORDER_GENERIC_H */ ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() 2017-12-12 23:59 ` Jakub Kicinski @ 2017-12-13 0:36 ` Al Viro 2017-12-13 1:04 ` Jakub Kicinski 0 siblings, 1 reply; 26+ messages in thread From: Al Viro @ 2017-12-13 0:36 UTC (permalink / raw) To: Jakub Kicinski; +Cc: Linus Torvalds, netdev, linux-kernel On Tue, Dec 12, 2017 at 03:59:33PM -0800, Jakub Kicinski wrote: > > +static __always_inline __##type type##_replace_bits(__##type old, \ > > + base val, base mask) \ > > +{ \ > > + __##type m = to(mask); \ > > + if (__builtin_constant_p(val) && \ > > Is the lack of a __builtin_constant_p(mask) test intentional? Sometimes > the bitfield is a packed array and people may have a helper to which > only the mask is passed as non-constant and the value is implied by the > helper, thus constant. If the mask in non-constant, we probably shouldn't be using that at all; could you show a real-world example where that would be the case? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() 2017-12-13 0:36 ` Al Viro @ 2017-12-13 1:04 ` Jakub Kicinski 2017-12-13 1:30 ` Al Viro 0 siblings, 1 reply; 26+ messages in thread From: Jakub Kicinski @ 2017-12-13 1:04 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, netdev, linux-kernel On Wed, 13 Dec 2017 00:36:59 +0000, Al Viro wrote: > On Tue, Dec 12, 2017 at 03:59:33PM -0800, Jakub Kicinski wrote: > > > +static __always_inline __##type type##_replace_bits(__##type old, \ > > > + base val, base mask) \ > > > +{ \ > > > + __##type m = to(mask); \ > > > + if (__builtin_constant_p(val) && \ > > > > Is the lack of a __builtin_constant_p(mask) test intentional? Sometimes > > the bitfield is a packed array and people may have a helper to which > > only the mask is passed as non-constant and the value is implied by the > > helper, thus constant. > > If the mask in non-constant, we probably shouldn't be using that at all; > could you show a real-world example where that would be the case? FIELD_* macros explicitly forbid this, since the code would be... suboptimal with the runtime ffsl. Real life examples are the hackish macro NFP_ETH_SET_BIT_CONFIG() in drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c I remember there was also some Renesas code.. maybe this: https://patchwork.kernel.org/patch/9881279/ it looks like cpg_z_clk_recalc_rate() and cpg_z2_clk_recalc_rate() only differ in mask. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() 2017-12-13 1:04 ` Jakub Kicinski @ 2017-12-13 1:30 ` Al Viro 2017-12-13 1:35 ` Jakub Kicinski 0 siblings, 1 reply; 26+ messages in thread From: Al Viro @ 2017-12-13 1:30 UTC (permalink / raw) To: Jakub Kicinski; +Cc: Linus Torvalds, netdev, linux-kernel On Tue, Dec 12, 2017 at 05:04:37PM -0800, Jakub Kicinski wrote: > On Wed, 13 Dec 2017 00:36:59 +0000, Al Viro wrote: > > On Tue, Dec 12, 2017 at 03:59:33PM -0800, Jakub Kicinski wrote: > > > > +static __always_inline __##type type##_replace_bits(__##type old, \ > > > > + base val, base mask) \ > > > > +{ \ > > > > + __##type m = to(mask); \ > > > > + if (__builtin_constant_p(val) && \ > > > > > > Is the lack of a __builtin_constant_p(mask) test intentional? Sometimes > > > the bitfield is a packed array and people may have a helper to which > > > only the mask is passed as non-constant and the value is implied by the > > > helper, thus constant. > > > > If the mask in non-constant, we probably shouldn't be using that at all; > > could you show a real-world example where that would be the case? > > FIELD_* macros explicitly forbid this, since the code would be... > suboptimal with the runtime ffsl. Real life examples are the hackish > macro NFP_ETH_SET_BIT_CONFIG() in > > drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c Why not simply make nfp_eth_set_bit_config() static inline and be done with that? It's not that large and there are only few callers, so... > I remember there was also some Renesas code.. maybe this: > > https://patchwork.kernel.org/patch/9881279/ > > it looks like cpg_z_clk_recalc_rate() and cpg_z2_clk_recalc_rate() only > differ in mask. *shrug* That thing would expand into "reg &= 15" in one case and "reg >>= 8; reg &= 15" in another. Either of which is cheaper than a function call, and definitely cheaper than any kind of dynamic calculation of shift, no matter how implemented. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() 2017-12-13 1:30 ` Al Viro @ 2017-12-13 1:35 ` Jakub Kicinski 2017-12-13 1:51 ` Al Viro 0 siblings, 1 reply; 26+ messages in thread From: Jakub Kicinski @ 2017-12-13 1:35 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, netdev, linux-kernel On Wed, 13 Dec 2017 01:30:56 +0000, Al Viro wrote: > On Tue, Dec 12, 2017 at 05:04:37PM -0800, Jakub Kicinski wrote: > > On Wed, 13 Dec 2017 00:36:59 +0000, Al Viro wrote: > > > On Tue, Dec 12, 2017 at 03:59:33PM -0800, Jakub Kicinski wrote: > > > > > +static __always_inline __##type type##_replace_bits(__##type old, \ > > > > > + base val, base mask) \ > > > > > +{ \ > > > > > + __##type m = to(mask); \ > > > > > + if (__builtin_constant_p(val) && \ > > > > > > > > Is the lack of a __builtin_constant_p(mask) test intentional? Sometimes > > > > the bitfield is a packed array and people may have a helper to which > > > > only the mask is passed as non-constant and the value is implied by the > > > > helper, thus constant. > > > > > > If the mask in non-constant, we probably shouldn't be using that at all; > > > could you show a real-world example where that would be the case? > > > > FIELD_* macros explicitly forbid this, since the code would be... > > suboptimal with the runtime ffsl. Real life examples are the hackish > > macro NFP_ETH_SET_BIT_CONFIG() in > > > > drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c > > Why not simply make nfp_eth_set_bit_config() static inline and be > done with that? It's not that large and there are only few callers, > so... It used to be __always_inline, but apparently LLVM/clang doesn't propagate constants :( 4e59532541c8 ("nfp: don't depend on compiler constant propagation") > > I remember there was also some Renesas code.. maybe this: > > > > https://patchwork.kernel.org/patch/9881279/ > > > > it looks like cpg_z_clk_recalc_rate() and cpg_z2_clk_recalc_rate() only > > differ in mask. > > *shrug* > > That thing would expand into "reg &= 15" in one case and "reg >>= 8; reg &= 15" > in another. Either of which is cheaper than a function call, and definitely > cheaper than any kind of dynamic calculation of shift, no matter how implemented. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() 2017-12-13 1:35 ` Jakub Kicinski @ 2017-12-13 1:51 ` Al Viro 2017-12-13 2:44 ` Jakub Kicinski 0 siblings, 1 reply; 26+ messages in thread From: Al Viro @ 2017-12-13 1:51 UTC (permalink / raw) To: Jakub Kicinski; +Cc: Linus Torvalds, netdev, linux-kernel On Tue, Dec 12, 2017 at 05:35:28PM -0800, Jakub Kicinski wrote: > It used to be __always_inline, but apparently LLVM/clang doesn't > propagate constants :( > > 4e59532541c8 ("nfp: don't depend on compiler constant propagation") Doesn't propagate constants or doesn't have exact same set of rules for __builtin_constant_p()? IOW, if you dropped that BUILD_BUG_ON(), what would be left after optimizations? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() 2017-12-13 1:51 ` Al Viro @ 2017-12-13 2:44 ` Jakub Kicinski 2017-12-13 14:22 ` Al Viro 0 siblings, 1 reply; 26+ messages in thread From: Jakub Kicinski @ 2017-12-13 2:44 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, netdev, linux-kernel On Wed, 13 Dec 2017 01:51:25 +0000, Al Viro wrote: > On Tue, Dec 12, 2017 at 05:35:28PM -0800, Jakub Kicinski wrote: > > > It used to be __always_inline, but apparently LLVM/clang doesn't > > propagate constants :( > > > > 4e59532541c8 ("nfp: don't depend on compiler constant propagation") > > Doesn't propagate constants or doesn't have exact same set of > rules for __builtin_constant_p()? IOW, if you dropped that > BUILD_BUG_ON(), what would be left after optimizations? Hm. You're right. It just doesn't recognize the parameter as constant in __builtin_constant_p(). I haven't compiled the actual code, but here is a trivial test: 18:36 ~$ clang --version clang version 4.0.1 (tags/RELEASE_401/final) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /usr/bin 18:36 ~$ cat /tmp/test.c #include <stdio.h> static inline int test_thing(unsigned int a) { printf("const: %d\n", __builtin_constant_p(a)); if (a > 10) return a - 1; return a; } int main() { printf("const: %d\n", __builtin_constant_p(0)); return test_thing(0); } 18:36 ~$ clang -g /tmp/test.c -O2 -Wall -W -o /tmp/test 18:36 ~$ /tmp/test const: 1 const: 0 18:36 ~$ objdump -S /tmp/test ... 00000000004004e0 <main>: return a; } int main() { printf("const: %d\n", __builtin_constant_p(0)); 4004e0: 50 push %rax 4004e1: bf a0 05 40 00 mov $0x4005a0,%edi 4004e6: be 01 00 00 00 mov $0x1,%esi 4004eb: 31 c0 xor %eax,%eax 4004ed: e8 fe fe ff ff callq 4003f0 <printf@plt> printf("const: %d\n", __builtin_constant_p(a)); 4004f2: bf a0 05 40 00 mov $0x4005a0,%edi 4004f7: 31 f6 xor %esi,%esi 4004f9: 31 c0 xor %eax,%eax 4004fb: e8 f0 fe ff ff callq 4003f0 <printf@plt> return test_thing(0); 400500: 31 c0 xor %eax,%eax 400502: 59 pop %rcx 400503: c3 retq 400504: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 40050b: 00 00 00 40050e: 66 90 xchg %ax,%ax ... ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() 2017-12-13 2:44 ` Jakub Kicinski @ 2017-12-13 14:22 ` Al Viro 2017-12-13 17:45 ` Al Viro 2017-12-13 19:04 ` [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() Jakub Kicinski 0 siblings, 2 replies; 26+ messages in thread From: Al Viro @ 2017-12-13 14:22 UTC (permalink / raw) To: Jakub Kicinski; +Cc: Linus Torvalds, netdev, linux-kernel On Tue, Dec 12, 2017 at 06:44:00PM -0800, Jakub Kicinski wrote: > On Wed, 13 Dec 2017 01:51:25 +0000, Al Viro wrote: > > On Tue, Dec 12, 2017 at 05:35:28PM -0800, Jakub Kicinski wrote: > > > > > It used to be __always_inline, but apparently LLVM/clang doesn't > > > propagate constants :( > > > > > > 4e59532541c8 ("nfp: don't depend on compiler constant propagation") > > > > Doesn't propagate constants or doesn't have exact same set of > > rules for __builtin_constant_p()? IOW, if you dropped that > > BUILD_BUG_ON(), what would be left after optimizations? > > Hm. You're right. It just doesn't recognize the parameter as constant > in __builtin_constant_p(). FWIW, clang does propagate them well enough to detect and optimize multiplication by constant power of two. With the variant I've posted. The check on field overflow (which works on gcc builds) does nothing on clang - __builtin_constant_p() gives a flat-out false for arguments of inline function there. I can understand their reasoning, even though it's inconvenient in cases like this one - semantics of __builtin_constant_p() is a fucking mess and the less you rely upon its details, the safer you are... Hell knows - a part of that can be recovered by taking the check into a wrapper; as in static __always_inline __le32_replace_bits(__le32 old, u32 v, u32 mask) { __le32 m = cpu_to_le32(mask); return (old & ~m) | cpu_to_le32(v * mask/mask_to_multiplier(mask))); } #define le32_replace_bits(old, v, mask) ({ \ typeof(v) ____v = (v); \ typeof(mask) ____m = (mask); \ if (__builtin_constant_p(____v)) \ if (v & ~(____m / mask_to_multiplier(____m))) \ __field_overflow(); \ __l32_replace_bits(old, ____v, ____m); \ }) That would give that sanity check a better coverage on clang builds, but... does it really buy us enough to bother, especially since all those macros would have to be spelled out; you can have a macro expanding to definition of static inline, but you can't have a macro expanding to anything that would contain a preprocessor directive, including #define. And with that kind of sanity checks, the first build with gcc will catch everything missed by clang builds anyway. IMO it's not worth the trouble; let's go with the check inside of static inline and accept that on clang builds it'll do nothing. Next question: where do we put that bunch? I've put it into linux/byteorder/generic.h, so that anything picking fixed-endian primitives would pick those as well; I hadn't thought of linux/bitfield.h at the time. We certainly could put it there instead - it's never pulled by other headers, so adding #include <asm/byteorder.h> into linux/bitfield.h is not going to cause header order problems. Not sure... Linus, do you have any preferences in that area? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() 2017-12-13 14:22 ` Al Viro @ 2017-12-13 17:45 ` Al Viro 2017-12-15 2:33 ` [RFC][PATCH] Add primitives for manipulating bitfields both in host- and fixed-endian Al Viro 2017-12-13 19:04 ` [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() Jakub Kicinski 1 sibling, 1 reply; 26+ messages in thread From: Al Viro @ 2017-12-13 17:45 UTC (permalink / raw) To: Jakub Kicinski; +Cc: Linus Torvalds, netdev, linux-kernel On Wed, Dec 13, 2017 at 02:22:12PM +0000, Al Viro wrote: > Next question: where do we put that bunch? I've put it into > linux/byteorder/generic.h, so that anything picking fixed-endian primitives > would pick those as well; I hadn't thought of linux/bitfield.h at the time. > We certainly could put it there instead - it's never pulled by other headers, > so adding #include <asm/byteorder.h> into linux/bitfield.h is not going to > cause header order problems. Not sure... > > Linus, do you have any preferences in that area? After looking at some of the callers of bitfield.h stuff: it might be useful to add static inline void le64p_replace_bits(__le64 *p, u64 v, u64 mask) { __le64 m = cpu_to_le64(mask); *p = (*p & ~m) | (cpu_to_le64(v * mask_to_multiplier(mask)) & m); } and similar for other types. Not sure what would be a good name for host-endian variants - u64p_replace_bits() sounds a bit clumsy. Suggestions? ^ permalink raw reply [flat|nested] 26+ messages in thread
* [RFC][PATCH] Add primitives for manipulating bitfields both in host- and fixed-endian. 2017-12-13 17:45 ` Al Viro @ 2017-12-15 2:33 ` Al Viro 2017-12-15 5:07 ` Jakub Kicinski 0 siblings, 1 reply; 26+ messages in thread From: Al Viro @ 2017-12-15 2:33 UTC (permalink / raw) To: Jakub Kicinski; +Cc: Linus Torvalds, netdev, linux-kernel The following primitives are defined in linux/bitfield.h: * u32 le32_get_bits(__le32 val, u32 field) extracts the contents of the bitfield specified by @field in little-endian 32bit value @val and converts it to host-endian. * void le32p_replace_bits(__le32 *p, u32 v, u32 field) replaces the contents of the bitfield specified by @field in little-endian 32bit object pointet to by *p with the value of @v. New value is given in host-endian and stored as little-endian. * __le32 le32_replace_bits(__le32 old, u32 v, u32 field) is equivalent to ({__le32 tmp = old; le32p_replace_bits(&old, v, field); tmp;}) In other words, instead of modifying an object in memory, it takes the initial value and returns the modified one. Such set of helpers is defined for each of little-, big- and host-endian types; e.g. u64_get_bits(val, field) will return the contents of the bitfield specified by @field in host-endian 64bit value @val, etc. Of course, for host-endian no conversion is involved. Fields to access are specified as GENMASK() values - an N-bit field starting at bit #M is encoded as GENMASK(M + N - 1, N). Note that bit numbers refer to endianness of the object we are working with - e.g. GENMASK(11, 0) in __be16 refers to the second byte and the lower 4 bits of the first byte. In __le16 it would refer to the first byte and the lower 4 bits of the second byte, etc. Field specification must be a constant; __builtin_constant_p() doesn't have to be true for it, but compiler must be able to evaluate it at build time. If it cannot or if the value does not encode any bitfield, the build will fail. If the value being stored in ..._replace_bits() is a constant that does not fit into bitfield, a warning will be generated at compile time. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff --git a/include/linux/bitfield.h b/include/linux/bitfield.h index 1030651f8309..4b4f0531a79c 100644 --- a/include/linux/bitfield.h +++ b/include/linux/bitfield.h @@ -16,6 +16,7 @@ #define _LINUX_BITFIELD_H #include <linux/build_bug.h> +#include <asm/byteorder.h> /* * Bitfield access macros @@ -103,4 +104,50 @@ (typeof(_mask))(((_reg) & (_mask)) >> __bf_shf(_mask)); \ }) +extern void __compiletime_warning("value doesn't fit into mask") +__field_overflow(void); +extern void __compiletime_error("bad bitfield mask") +__bad_mask(void); +static __always_inline u64 mask_to_multiplier(u64 mask) +{ + if ((mask | (mask - 1)) & ((mask | (mask - 1)) + 1)) + __bad_mask(); + return mask & -mask; +} + +#define ____MAKE_OP(type,base,to,from) \ +static __always_inline __##type type##_replace_bits(__##type old, \ + base val, base field) \ +{ \ + __##type m = to(field); \ + if (__builtin_constant_p(val) && \ + (val & ~(field/mask_to_multiplier(field)))) \ + __field_overflow(); \ + return (old & ~m) | \ + (to(val * mask_to_multiplier(field)) & m); \ +} \ +static __always_inline void type##p_replace_bits(__##type *p, \ + base val, base field) \ +{ \ + __##type m = to(field); \ + if (__builtin_constant_p(val) && \ + (val & ~(field/mask_to_multiplier(field)))) \ + __field_overflow(); \ + *p = (*p & ~m) | \ + (to(val * mask_to_multiplier(field)) & m); \ +} \ +static __always_inline base type##_get_bits(__##type v, base field) \ +{ \ + return (from(v) & field)/mask_to_multiplier(field); \ +} +#define __MAKE_OP(size) \ + ____MAKE_OP(le##size,u##size,cpu_to_le##size,le##size##_to_cpu) \ + ____MAKE_OP(be##size,u##size,cpu_to_be##size,be##size##_to_cpu) \ + ____MAKE_OP(u##size,u##size,,) +__MAKE_OP(16) +__MAKE_OP(32) +__MAKE_OP(64) +#undef __MAKE_OP +#undef ____MAKE_OP + #endif ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] Add primitives for manipulating bitfields both in host- and fixed-endian. 2017-12-15 2:33 ` [RFC][PATCH] Add primitives for manipulating bitfields both in host- and fixed-endian Al Viro @ 2017-12-15 5:07 ` Jakub Kicinski 2017-12-15 5:34 ` Al Viro 0 siblings, 1 reply; 26+ messages in thread From: Jakub Kicinski @ 2017-12-15 5:07 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, netdev, linux-kernel Looks great to me! On Fri, 15 Dec 2017 02:33:43 +0000, Al Viro wrote: > The following primitives are defined in linux/bitfield.h: > > * u32 le32_get_bits(__le32 val, u32 field) extracts the contents of the > bitfield specified by @field in little-endian 32bit value @val and > converts it to host-endian. > > * void le32p_replace_bits(__le32 *p, u32 v, u32 field) replaces > the contents of the bitfield specified by @field in little-endian > 32bit object pointet to by *p with the value of @v. New value is > given in host-endian and stored as little-endian. > > * __le32 le32_replace_bits(__le32 old, u32 v, u32 field) is equivalent to > ({__le32 tmp = old; le32p_replace_bits(&old, v, field); tmp;}) > In other words, instead of modifying an object in memory, it takes > the initial value and returns the modified one. the current macros take filed/mask as first param, not sure if it's worth maintaining the order ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] Add primitives for manipulating bitfields both in host- and fixed-endian. 2017-12-15 5:07 ` Jakub Kicinski @ 2017-12-15 5:34 ` Al Viro 2017-12-15 16:48 ` [RFC][PATCH v2] " Al Viro 0 siblings, 1 reply; 26+ messages in thread From: Al Viro @ 2017-12-15 5:34 UTC (permalink / raw) To: Jakub Kicinski; +Cc: Linus Torvalds, netdev, linux-kernel On Thu, Dec 14, 2017 at 09:07:13PM -0800, Jakub Kicinski wrote: > Looks great to me! > > On Fri, 15 Dec 2017 02:33:43 +0000, Al Viro wrote: > > The following primitives are defined in linux/bitfield.h: > > > > * u32 le32_get_bits(__le32 val, u32 field) extracts the contents of the > > bitfield specified by @field in little-endian 32bit value @val and > > converts it to host-endian. > > > > * void le32p_replace_bits(__le32 *p, u32 v, u32 field) replaces > > the contents of the bitfield specified by @field in little-endian > > 32bit object pointet to by *p with the value of @v. New value is > > given in host-endian and stored as little-endian. > > > > * __le32 le32_replace_bits(__le32 old, u32 v, u32 field) is equivalent to > > ({__le32 tmp = old; le32p_replace_bits(&old, v, field); tmp;}) > > In other words, instead of modifying an object in memory, it takes > > the initial value and returns the modified one. > > the current macros take filed/mask as first param, not sure if it's > worth maintaining the order Umm... For something like Haskell that would be more natural (as in replace_foo = replace_field foo), but it's C - no partially applied functions here... While we are at it, to cover the FIELD_PREP users it might make sense to add __le32 le32_encode_bits(u32 v, u32 field) { if (__builtin_constant_p(v) && (v & ~(field/mask_to_multiplier(field)))) __field_overflow(); return cpu_to_le32((v * mask_to_multiplier(field)) & field); } turning the body of le32_replace_bits into return (old & ~cpu_to_le32(field)) | le32_encode_bits(v, field); ^ permalink raw reply [flat|nested] 26+ messages in thread
* [RFC][PATCH v2] Add primitives for manipulating bitfields both in host- and fixed-endian. 2017-12-15 5:34 ` Al Viro @ 2017-12-15 16:48 ` Al Viro 0 siblings, 0 replies; 26+ messages in thread From: Al Viro @ 2017-12-15 16:48 UTC (permalink / raw) To: Linus Torvalds; +Cc: netdev, linux-kernel, Jakub Kicinski [Folks, please review and comment; if no objections show up, into -next it goes] The following primitives are defined in linux/bitfield.h: * u32 le32_get_bits(__le32 val, u32 field) extracts the contents of the bitfield specified by @field in little-endian 32bit object @val and converts it to host-endian. * void le32p_replace_bits(__le32 *p, u32 v, u32 field) replaces the contents of the bitfield specified by @field in little-endian 32bit object pointed to by @p with the value of @v. New value is given in host-endian and stored as little-endian. * __le32 le32_replace_bits(__le32 old, u32 v, u32 field) is equivalent to ({__le32 tmp = old; le32p_replace_bits(&tmp, v, field); tmp;}) In other words, instead of modifying an object in memory, it takes the initial value and returns the modified one. * __le32 le32_encode_bits(u32 v, u32 field) is equivalent to le32_replace_bits(0, v, field). In other words, it returns a little-endian 32bit object with the bitfield specified by @field containing the value of @v and all bits outside that bitfield being zero. Such set of helpers is defined for each of little-, big- and host-endian types; e.g. u64_get_bits(val, field) will return the contents of the bitfield specified by @field in host-endian 64bit object @val, etc. Of course, for host-endian no conversion is involved. Fields to access are specified as GENMASK() values - an N-bit field starting at bit #M is encoded as GENMASK(M + N - 1, M). Note that bit numbers refer to endianness of the object we are working with - e.g. GENMASK(11, 0) in __be16 refers to the second byte and the lower 4 bits of the first byte. In __le16 it would refer to the first byte and the lower 4 bits of the second byte, etc. Field specification must be a constant; __builtin_constant_p() doesn't have to be true for it, but compiler must be able to evaluate it at build time. If it cannot or if the value does not encode any bitfield, the build will fail. If the value being stored in a bitfield is a constant that does not fit into that bitfield, a warning will be generated at compile time. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff --git a/include/linux/bitfield.h b/include/linux/bitfield.h index 1030651f8309..cf2588d81148 100644 --- a/include/linux/bitfield.h +++ b/include/linux/bitfield.h @@ -16,6 +16,7 @@ #define _LINUX_BITFIELD_H #include <linux/build_bug.h> +#include <asm/byteorder.h> /* * Bitfield access macros @@ -103,4 +104,49 @@ (typeof(_mask))(((_reg) & (_mask)) >> __bf_shf(_mask)); \ }) +extern void __compiletime_warning("value doesn't fit into mask") +__field_overflow(void); +extern void __compiletime_error("bad bitfield mask") +__bad_mask(void); +static __always_inline u64 field_multiplier(u64 field) +{ + if ((field | (field - 1)) & ((field | (field - 1)) + 1)) + __bad_mask(); + return field & -field; +} +static __always_inline u64 field_mask(u64 field) +{ + return field / field_multiplier(field); +} +#define ____MAKE_OP(type,base,to,from) \ +static __always_inline __##type type##_encode_bits(base v, base field) \ +{ \ + if (__builtin_constant_p(v) && (v & ~field_multiplier(field))) \ + __field_overflow(); \ + return to((v & field_mask(field)) * field_multiplier(field)); \ +} \ +static __always_inline __##type type##_replace_bits(__##type old, \ + base val, base field) \ +{ \ + return (old & ~to(field)) | type##_encode_bits(val, field); \ +} \ +static __always_inline void type##p_replace_bits(__##type *p, \ + base val, base field) \ +{ \ + *p = (*p & ~to(field)) | type##_encode_bits(val, field); \ +} \ +static __always_inline base type##_get_bits(__##type v, base field) \ +{ \ + return (from(v) & field)/field_multiplier(field); \ +} +#define __MAKE_OP(size) \ + ____MAKE_OP(le##size,u##size,cpu_to_le##size,le##size##_to_cpu) \ + ____MAKE_OP(be##size,u##size,cpu_to_be##size,be##size##_to_cpu) \ + ____MAKE_OP(u##size,u##size,,) +__MAKE_OP(16) +__MAKE_OP(32) +__MAKE_OP(64) +#undef __MAKE_OP +#undef ____MAKE_OP + #endif ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() 2017-12-13 14:22 ` Al Viro 2017-12-13 17:45 ` Al Viro @ 2017-12-13 19:04 ` Jakub Kicinski 1 sibling, 0 replies; 26+ messages in thread From: Jakub Kicinski @ 2017-12-13 19:04 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, netdev, linux-kernel On Wed, 13 Dec 2017 14:22:12 +0000, Al Viro wrote: > IMO it's not worth the trouble; let's go with the check inside of > static inline and accept that on clang builds it'll do nothing. Ack, thanks for investigating! ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH] apparent big-endian bugs in dwc-xlgmac 2017-12-11 5:05 ` Al Viro 2017-12-11 5:38 ` Al Viro @ 2017-12-11 6:18 ` Jie Deng 1 sibling, 0 replies; 26+ messages in thread From: Jie Deng @ 2017-12-11 6:18 UTC (permalink / raw) To: Al Viro, Jie Deng; +Cc: netdev On 2017/12/11 13:05, Al Viro wrote: > On Mon, Dec 11, 2017 at 12:33:42PM +0800, Jie Deng wrote: >> Hi AI Viro, >>> @@ -125,8 +125,8 @@ >>> typeof(len) _len = (len); \ >>> typeof(val) _val = (val); \ >>> _val = (_val << _pos) & GENMASK(_pos + _len - 1, _pos); \ >>> - _var = (_var & ~GENMASK(_pos + _len - 1, _pos)) | _val; \ >>> - cpu_to_le32(_var); \ >>> + (_var & ~cpu_to_le32(GENMASK(_pos + _len - 1, _pos))) | \ >>> + cpu_to_le32(_val); \ >>> }) >>> >>> struct xlgmac_pdata; >> Make sense. But I think what you want is fix as follows. Right ? >> >> @@ -125,8 +125,8 @@ >> typeof(len) _len = (len); \ >> - typeof(val) _val = (val); \ >> + u32 _var = le32_to_cpu((var)); \ >> _val = (_val << _pos) & GENMASK(_pos + _len - 1, _pos); \ >> - _var = (_var & ~GENMASK(_pos + _len - 1, _pos)) | _val; \ >> - cpu_to_le32(_var); \ >> + (cpu_to_le32(_var) & ~cpu_to_le32(GENMASK(_pos + _len - 1, _pos))) | \ >> + cpu_to_le32(_val); \ >> }) > What for? Sure, this variant will work, but why bother with > a = le32_to_cpu(b); > (cpu_to_le32(a) & ....) | .... > and how is that better than > (b & ...) | ... > > IDGI... Mind you, I'm not sure if there is any point keeping _var in that thing, > seeing that we use var only once - might be better off with > ((var) & ~cpu_to_le32(GENMASK(_pos + _len - 1, _pos))) | \ > cpu_to_le32(_val); \ I agree. Could you please send the patch with this better fix ? ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2017-12-15 16:48 UTC | newest] Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-12-10 4:53 [RFC][PATCH] apparent big-endian bugs in dwc-xlgmac Al Viro 2017-12-11 4:33 ` Jie Deng 2017-12-11 5:05 ` Al Viro 2017-12-11 5:38 ` Al Viro 2017-12-11 6:46 ` Jie Deng 2017-12-11 15:54 ` [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() Al Viro 2017-12-12 4:02 ` Jakub Kicinski 2017-12-12 6:20 ` Al Viro 2017-12-12 19:45 ` Al Viro 2017-12-12 20:04 ` Jakub Kicinski 2017-12-12 23:48 ` Al Viro 2017-12-12 23:59 ` Jakub Kicinski 2017-12-13 0:36 ` Al Viro 2017-12-13 1:04 ` Jakub Kicinski 2017-12-13 1:30 ` Al Viro 2017-12-13 1:35 ` Jakub Kicinski 2017-12-13 1:51 ` Al Viro 2017-12-13 2:44 ` Jakub Kicinski 2017-12-13 14:22 ` Al Viro 2017-12-13 17:45 ` Al Viro 2017-12-15 2:33 ` [RFC][PATCH] Add primitives for manipulating bitfields both in host- and fixed-endian Al Viro 2017-12-15 5:07 ` Jakub Kicinski 2017-12-15 5:34 ` Al Viro 2017-12-15 16:48 ` [RFC][PATCH v2] " Al Viro 2017-12-13 19:04 ` [RFC][PATCH] new byteorder primitives - ..._{replace,get}_bits() Jakub Kicinski 2017-12-11 6:18 ` [RFC][PATCH] apparent big-endian bugs in dwc-xlgmac Jie Deng
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.