* [B.A.T.M.A.N.] [PATCH maint] batman-adv: fix packet checksum in receive path
From: Matthias Schiffer @ 2018-01-22 19:24 UTC
To: b.a.t.m.a.n
skb_postpull_rcsum() is necessary after eth_type_trans() to adjust the
skb checksum, otherwise log spam of the form "bat0: hw csum failure" will
result when packets with CHECKSUM_COMPLETE are received (at least in some
setups, e.g. when stacking batman-adv on top of VXLAN).
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
---
I don't know what the exact circumstances are that trigger the log spam,
but it seems this has been broken forever (I could also reproduce the issue
with our compat-14 legacy branch)... so please ask David to queue this up for
stable :)
net/batman-adv/soft-interface.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c
index c95e2b26..edeffcb9 100644
--- a/net/batman-adv/soft-interface.c
+++ b/net/batman-adv/soft-interface.c
@@ -459,13 +459,7 @@ void batadv_interface_rx(struct net_device *soft_iface,
 
 	/* skb->dev & skb->pkt_type are set here */
 	skb->protocol = eth_type_trans(skb, soft_iface);
-
-	/* should not be necessary anymore as we use skb_pull_rcsum()
-	 * TODO: please verify this and remove this TODO
-	 * -- Dec 21st 2009, Simon Wunderlich
-	 */
-
-	/* skb->ip_summed = CHECKSUM_UNNECESSARY; */
+	skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
 
 	batadv_inc_counter(bat_priv, BATADV_CNT_RX);
 	batadv_add_counter(bat_priv, BATADV_CNT_RX_BYTES,
--
2.16.0
* Re: [B.A.T.M.A.N.] [PATCH maint] batman-adv: fix packet checksum in receive path
From: Sven Eckelmann @ 2018-01-22 20:52 UTC
To: b.a.t.m.a.n
Cc: Matthias Schiffer, Maximilian Wilhelm, Felix Kaechele, Mephisto
On Monday, 22 January 2018 20:24:50 CET Matthias Schiffer wrote:
> skb_postpull_rcsum() is necessary after eth_type_trans() to adjust the
> skb checksum, otherwise log spam of the form "bat0: hw csum failure" will
> result when packets with CHECKSUM_COMPLETE are received (at least in some
> setups, e.g. when stacking batman-adv on top of VXLAN).
Would be nice to have a better explanation here.
The comment previously assumed that skb_pull_rcsum() would be enough. But the
problem here is that skb_pull_rcsum() only pulls the batman-adv headers. The
actual pull of the Ethernet header (with skb_pull_inline()) happens inside
eth_type_trans(). Or did I miss anything?
[...]
> I don't know what the exact circumstances are that trigger the log spam,
> but it seems this was broken forever (I could also reproduce the issue with
> our compat-14 legacy branch)... so please ask David to queue this up for
> stable :)
Yes, this has been broken since the earliest commits. The most relevant
commit in batman-adv is:
Fixes: fe28a94c01e1 ("batman-adv: receive packets directly using skbs")
But I would propose using the following in the kernel tree:
Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol")
The 4.15 release will be soon(tm) and Simon is currently on vacation. So we
will most likely postpone the submission to David until Simon has found a way
out of the snow and 4.15 has been released...
But it would be nice if some people could test the patch [1] (together with
vxlan?) on batman-adv or batman-adv-legacy. And please provide a
"Tested-by: Full Name <email@example.org>" [2] reply if it works.
Thanks,
Sven
[1] https://patchwork.open-mesh.org/patch/17250/
[2] https://www.kernel.org/doc/html/v4.12/process/submitting-patches.html#using-reported-by-tested-by-reviewed-by-suggested-by-and-fixes
* Re: [B.A.T.M.A.N.] [PATCH maint] batman-adv: fix packet checksum in receive path
From: Matthias Schiffer @ 2018-01-22 21:18 UTC
To: Sven Eckelmann, b.a.t.m.a.n; +Cc: Maximilian Wilhelm, Felix Kaechele, Mephisto
On 01/22/2018 09:52 PM, Sven Eckelmann wrote:
> On Monday, 22 January 2018 20:24:50 CET Matthias Schiffer wrote:
>> skb_postpull_rcsum() is necessary after eth_type_trans() to adjust the
>> skb checksum, otherwise log spam of the form "bat0: hw csum failure" will
>> result when packets with CHECKSUM_COMPLETE are received (at least in some
>> setups, e.g. when stacking batman-adv on top of VXLAN).
>
> Would be nice to have a better explanation here.
>
> The comment previously assumed that skb_pull_rcsum would be enough. But the
> problem here is that the skb_pull_rcsum only pulls the batman-adv headers. The
> actual pull of the ethernet header (with skb_pull_inline) happens inside
> eth_type_trans. Or did I miss anything?
This is correct, eth_type_trans() contains a simple skb_pull(), so the csum
must be adjusted afterwards (grepping the kernel for eth_type_trans will
find a lot of this). I can send a v2 with a better commit message later.
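
As an illustration of the adjustment discussed above, here is a minimal
userspace sketch in C. It is not kernel code: fold16(), csum16(),
csum16_sub() and demo_pull_adjust() are hypothetical helpers, and the sample
frame is made up. It models a CHECKSUM_COMPLETE-style sum over a whole frame
and shows that, once the 14-byte Ethernet header has been pulled, the stored
sum no longer matches the remaining data unless the header's contribution is
subtracted, which is roughly what skb_postpull_rcsum() does for skb->csum.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones' complement sum. */
static uint16_t fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones' complement sum over buf, read as big-endian 16-bit words.
 * An odd trailing byte is padded with zero. Meant for short buffers only.
 */
static uint16_t csum16(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	return fold16(sum);
}

/* Ones' complement subtraction: total - part == total + ~part. */
static uint16_t csum16_sub(uint16_t total, uint16_t part)
{
	return fold16((uint32_t)total + (uint16_t)~part);
}

/* Hypothetical demo: returns 1 if the unadjusted sum is stale after the
 * pull and the adjusted sum matches the sum over the remaining payload.
 */
int demo_pull_adjust(void)
{
	/* Made-up frame: 14-byte Ethernet header plus 6 payload bytes. */
	static const uint8_t frame[] = {
		0x02, 0x00, 0x00, 0x00, 0x00, 0x01,	/* destination MAC */
		0x02, 0x00, 0x00, 0x00, 0x00, 0x02,	/* source MAC */
		0x08, 0x00,				/* EtherType */
		0x45, 0x00, 0x00, 0x1c, 0xde, 0xad,	/* payload */
	};
	const size_t hdr_len = 14;	/* same value as ETH_HLEN */
	uint16_t full = csum16(frame, sizeof(frame));
	uint16_t direct = csum16(frame + hdr_len, sizeof(frame) - hdr_len);
	uint16_t adjusted = csum16_sub(full, csum16(frame, hdr_len));

	return full != direct && adjusted == direct;
}
```

The word-wise subtraction in this sketch relies on the pulled header having
an even length, which holds for the 14-byte Ethernet header. Note also that
0x0000 and 0xffff both represent zero in ones' complement arithmetic, so a
more general comparison would have to treat them as equal.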
>
> [...]
>> I don't know what the exact circumstances are that trigger the log spam,
>> but it seems this was broken forever (I could also reproduce the issue with
>> our compat-14 legacy branch)... so please ask David to queue this up for
>> stable :)
>
> Yes, this is broken since earliest commits. The most relevant commit in
> batman-adv is:
>
> Fixes: fe28a94c01e1 ("batman-adv: receive packets directly using skbs")
>
> But I would propose to use following in the kernel tree:
>
> Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol")
>
> The 4.15 release will be soon(tm) and Simon is currently on vacation. So we
> will most likely postpone the submission to David until Simon found a way out
> of the snow and after 4.15 is released...
>
> But it would be nice when some people could test the patch [1] (together with
> vxlan?) on batman-adv or batman-adv-legacy. And please provide a
> "Tested-by: Full Name <email@example.org>" [2] reply when it works.
>
> Thanks,
> Sven
I've tested this on Kernel 4.14.14 (everything working correctly now) and
4.4.110 (here, there are still checksum errors; it seems on older kernels,
the checksum handling in VXLAN is broken too? Still debugging this...)
Matthias
>
> [1] https://patchwork.open-mesh.org/patch/17250/
> [2] https://www.kernel.org/doc/html/v4.12/process/submitting-patches.html#using-reported-by-tested-by-reviewed-by-suggested-by-and-fixes
>
* Re: [B.A.T.M.A.N.] [PATCH maint] batman-adv: fix packet checksum in receive path
From: Matthias Schiffer @ 2018-01-23 9:12 UTC
To: Sven Eckelmann, b.a.t.m.a.n; +Cc: Maximilian Wilhelm, Mephisto, Felix Kaechele
On 01/22/2018 10:18 PM, Matthias Schiffer wrote:
> On 01/22/2018 09:52 PM, Sven Eckelmann wrote:
>> On Monday, 22 January 2018 20:24:50 CET Matthias Schiffer wrote:
>>> skb_postpull_rcsum() is necessary after eth_type_trans() to adjust the
>>> skb checksum, otherwise log spam of the form "bat0: hw csum failure" will
>>> result when packets with CHECKSUM_COMPLETE are received (at least in some
>>> setups, e.g. when stacking batman-adv on top of VXLAN).
>>
>> Would be nice to have a better explanation here.
>>
>> The comment previously assumed that skb_pull_rcsum would be enough. But the
>> problem here is that the skb_pull_rcsum only pulls the batman-adv headers. The
>> actual pull of the ethernet header (with skb_pull_inline) happens inside
>> eth_type_trans. Or did I miss anything?
>
> This is correct, eth_type_trans() contains a simple skb_pull(), so the csum
> must be adjusted afterwards (grepping the kernel for eth_type_trans will
> find a lot of this). I can send a v2 with a better commit message later.
>
>>
>> [...]
>>> I don't know what the exact circumstances are that trigger the log spam,
>>> but it seems this was broken forever (I could also reproduce the issue with
>>> our compat-14 legacy branch)... so please ask David to queue this up for
>>> stable :)
>>
>> Yes, this is broken since earliest commits. The most relevant commit in
>> batman-adv is:
>>
>> Fixes: fe28a94c01e1 ("batman-adv: receive packets directly using skbs")
>>
>> But I would propose to use following in the kernel tree:
>>
>> Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol")
>>
>> The 4.15 release will be soon(tm) and Simon is currently on vacation. So we
>> will most likely postpone the submission to David until Simon found a way out
>> of the snow and after 4.15 is released...
>>
>> But it would be nice when some people could test the patch [1] (together with
>> vxlan?) on batman-adv or batman-adv-legacy. And please provide a
>> "Tested-by: Full Name <email@example.org>" [2] reply when it works.
>>
>> Thanks,
>> Sven
>
> I've tested this on Kernel 4.14.14 (everything working correctly now) and
> 4.4.110 (here, there are still checksum errors; it seems on older kernels,
> the checksum handling in VXLAN is broken too? Still debugging this...)
I've found the cause of this other checksum problem: the batman-adv
fragmentation code doesn't handle the checksum on reassembly at all. I
think the best option here is to simply set ip_summed to CHECKSUM_NONE on
reassembly; I will send another patch for that.
The IP fragmentation code does fancier things when all fragments have
CHECKSUM_COMPLETE, adding up the checksums of the fragments under certain
circumstances. This only works because IP fragments are guaranteed to be
split at even byte offsets (multiples of 8, actually); as far as I can
tell, batman-adv allows odd fragment sizes, making it impossible to add up
the 16-bit checksums in the general case.
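
To illustrate the alignment point, here is a small userspace sketch in C
(hypothetical helpers and made-up data, not the kernel's fragment code). It
splits one buffer into two fragments and checks whether simply adding the two
16-bit ones' complement sums reproduces the sum over the whole buffer. With an
even split the 16-bit words line up and the plain addition works; with an odd
split the bytes of the second fragment land in the opposite halves of the
words, so the plain addition gives a different result.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones' complement sum. */
static uint16_t fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones' complement sum over buf, read as big-endian 16-bit words.
 * An odd trailing byte is padded with zero. Meant for short buffers only.
 */
static uint16_t csum16(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	return fold16(sum);
}

/* Returns 1 if adding the sums of buf[0..split) and buf[split..len)
 * yields the same value as summing the whole buffer in one go.
 */
int naive_add_matches(const uint8_t *buf, size_t len, size_t split)
{
	uint16_t whole = csum16(buf, len);
	uint16_t first = csum16(buf, split);
	uint16_t second = csum16(buf + split, len - split);

	return fold16((uint32_t)first + second) == whole;
}

/* Hypothetical demo: returns 1 if the even split matches and the odd
 * split does not, for the made-up sample data below.
 */
int demo_fragment_split(void)
{
	static const uint8_t sample[8] = {
		0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
	};

	return naive_add_matches(sample, sizeof(sample), 4) &&
	       !naive_add_matches(sample, sizeof(sample), 3);
}
```

The sketch only covers plain addition of per-fragment sums; it says nothing
about what the reassembly code should do instead, which is why setting
ip_summed to CHECKSUM_NONE is the simple option mentioned above.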
Matthias
>
>
>
>>
>> [1] https://patchwork.open-mesh.org/patch/17250/
>> [2] https://www.kernel.org/doc/html/v4.12/process/submitting-patches.html#using-reported-by-tested-by-reviewed-by-suggested-by-and-fixes
>>
>
>
* Re: [B.A.T.M.A.N.] [PATCH maint] batman-adv: fix packet checksum in receive path
From: Maximilian Wilhelm @ 2018-01-23 14:25 UTC
To: Sven Eckelmann
Cc: b.a.t.m.a.n, Matthias Schiffer, Maximilian Wilhelm,
Felix Kaechele, Mephisto
Anno Domini 2018, Sven Eckelmann wrote:
> On Monday, 22 January 2018 20:24:50 CET Matthias Schiffer wrote:
[...]
> > I don't know what the exact circumstances are that trigger the log spam,
> > but it seems this was broken forever (I could also reproduce the issue with
> > our compat-14 legacy branch)... so please ask David to queue this up for
> > stable :)
>
> Yes, this is broken since earliest commits. The most relevant commit in
> batman-adv is:
>
> Fixes: fe28a94c01e1 ("batman-adv: receive packets directly using skbs")
>
> But I would propose to use following in the kernel tree:
>
> Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol")
>
> The 4.15 release will be soon(tm) and Simon is currently on vacation. So we
> will most likely postpone the submission to David until Simon found a way out
> of the snow and after 4.15 is released...
>
> But it would be nice when some people could test the patch [1] (together with
> vxlan?) on batman-adv or batman-adv-legacy. And please provide a
> "Tested-by: Full Name <email@example.org>" [2] reply when it works.
I took a Debian Kernel package (4.14.13-1~bpo9+1), applied the patch
and deployed the package on a gateway running BATMAN over VTEPs. The
log messages from previous kernels don't show up anymore <3.
Tested-by: Maximilian Wilhelm <max@sdn.clinic>
Thanks!
Best
Max
--
"I have to admit I've always suspected that MTBWTF would be a more useful
metric of real-world performance."
-- Valdis Kletnieks on NANOG
* Re: [B.A.T.M.A.N.] [PATCH maint] batman-adv: fix packet checksum in receive path
From: Maximilian Wilhelm @ 2018-01-23 21:56 UTC
To: The list for a Better Approach To Mobile Ad-hoc Networking
Cc: Sven Eckelmann, Maximilian Wilhelm, Mephisto, Felix Kaechele
Anno Domini 2018, Matthias Schiffer wrote:
Hi,
> On 01/22/2018 10:18 PM, Matthias Schiffer wrote:
> > On 01/22/2018 09:52 PM, Sven Eckelmann wrote:
> >> On Monday, 22 January 2018 20:24:50 CET Matthias Schiffer wrote:
> >>> skb_postpull_rcsum() is necessary after eth_type_trans() to adjust the
> >>> skb checksum, otherwise log spam of the form "bat0: hw csum failure" will
> >>> result when packets with CHECKSUM_COMPLETE are received (at least in some
> >>> setups, e.g. when stacking batman-adv on top of VXLAN).
> >>
> >> Would be nice to have a better explanation here.
> >>
> >> The comment previously assumed that skb_pull_rcsum would be enough. But the
> >> problem here is that the skb_pull_rcsum only pulls the batman-adv headers. The
> >> actual pull of the ethernet header (with skb_pull_inline) happens inside
> >> eth_type_trans. Or did I miss anything?
> >
> > This is correct, eth_type_trans() contains a simple skb_pull(), so the csum
> > must be adjusted afterwards (grepping the kernel for eth_type_trans will
> > find a lot of this). I can send a v2 with a better commit message later.
> >
> >>
> >> [...]
> >>> I don't know what the exact circumstances are that trigger the log spam,
> >>> but it seems this was broken forever (I could also reproduce the issue with
> >>> our compat-14 legacy branch)... so please ask David to queue this up for
> >>> stable :)
> >>
> >> Yes, this is broken since earliest commits. The most relevant commit in
> >> batman-adv is:
> >>
> >> Fixes: fe28a94c01e1 ("batman-adv: receive packets directly using skbs")
> >>
> >> But I would propose to use following in the kernel tree:
> >>
> >> Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol")
> >>
> >> The 4.15 release will be soon(tm) and Simon is currently on vacation. So we
> >> will most likely postpone the submission to David until Simon found a way out
> >> of the snow and after 4.15 is released...
> >>
> >> But it would be nice when some people could test the patch [1] (together with
> >> vxlan?) on batman-adv or batman-adv-legacy. And please provide a
> >> "Tested-by: Full Name <email@example.org>" [2] reply when it works.
> >>
> >> Thanks,
> >> Sven
> >
> > I've tested this on Kernel 4.14.14 (everything working correctly now) and
> > 4.4.110 (here, there are still checksum errors; it seems on older kernels,
> > the checksum handling in VXLAN is broken too? Still debugging this...)
>
> I've found the issue of this other checksum problem: batman-adv
> fragmentation code doesn't handle the checksum on reassembly at all. I
> think the best option here is to simply set ip_summed to CHECKSUM_NONE on
> reassembly, I will send another patch for that.
>
> The IP fragmentation code does more fancy things when all fragments have
> CHECKSUM_COMPLETE, adding up the checksums of the fragments under certain
> circumstances. This only works because IP fragments are guaranteed to be
> split at even byte offsets (multiples of 8, actually); as far as I can
> tell, batman-adv allows odd fragment sizes, making it impossible to add up
> the 16bit checksums in the general case.
And
Tested-by: Maximilian Wilhelm <max@sdn.clinic>
to the fix for fragmentation.c, too.
Disclaimer: As MTUs are calculated accordingly in our backbone,
fragmentation of VXLAN packets isn't an issue and we did not see these
messages before. I can confirm that I still don't see any now,
meaning the log spam addressed by the previous fix stays fixed and no
new issues have arisen so far.
Thanks a lot! <3
Best
Max
--
"Does it bother me, that people hurt others, because they are too weak to face the truth? Yeah. Sorry 'bout that."
-- Thirteen, House M.D.