LKML Archive on lore.kernel.org
 help / Atom feed
* [PATCH v0] strparser: remove any offset before parsing messages
@ 2018-08-09 22:40 Dominique Martinet
  2018-08-21 12:51 ` [PATCH] " Dominique Martinet
  0 siblings, 1 reply; 12+ messages in thread
From: Dominique Martinet @ 2018-08-09 22:40 UTC (permalink / raw)
  To: Doron Roberts-Kedes, Tom Herbert, Dave Watson
  Cc: Dominique Martinet, David S. Miller, netdev, linux-kernel

Offset is not well handled by strparser users right now.

Out of the current strparser users, we have:
 - tls, that handles offset properly in parse and rcv callbacks
 - kcm, that handles offset in rcv but not in parse
 - bpf sockmap, that does not seem to handle offset anywhere

Calling pskb_pull() on the skb before parsing ensures that the offset
will be 0 everywhere in practice unless the user modifies it themselves
like tls, as a workaround for the other two protocols.

This fixes a bug whilch can be exhibited by implementing a simpe kcm
parser that looks for the packet size in the first word of the packet,
and sending two such packets in a single write() call on the other side: 
the second message will be cut at the length of the first message.
Since this is a stream protocol, all the following messages will also
be corrupt since it will start looking for the next offset at a wrong
position.

Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
---

Discussions on the bug along with a (bad) reproducer can be found here:
http://lkml.kernel.org/m/20180803182830.GB29193@nautica
(now the problem is better understood though it's much simpler to send
two messages at once than to spam and wait for tcp aggregation to do it)


Two notes:
 - I've marked this patch v0 as we could move the pskb_pull() up to
where strp.offset is set, and just always leave it at 0 in the strparser
code.
This will let applications that are fine dealing with a non-zero offset
deal with it as they seem fit (tls writes into the offset and full_len
fields behind the back of the stream parser), while still being safe for
kcm/sockmap

 - Even with that modification I'm not totally happy with
single-handedly eating the offset for strparser users which could handle
it, but I'm not really familiar with the cost this really has in
practice...
A better fix would be to handle the offset properly in the callbacks,
but frankly at least for kcm I don't see how (maybe because I'm not
familiar with how bpf programs work)

Another idea I had would be to write flags when registering the protocol
e.g. strp->cb.flags & STRP_CAN_PARSE_WITH_OFFSET or something like that,
but without an idea of the cost of that pull I don't know if it's worth
doing.


Anyway, comments welcome.


 net/strparser/strparser.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/net/strparser/strparser.c b/net/strparser/strparser.c
index 625acb27efcc..d7a3b81c3481 100644
--- a/net/strparser/strparser.c
+++ b/net/strparser/strparser.c
@@ -222,6 +222,16 @@ static int __strp_recv(read_descriptor_t *desc, struct sk_buff *orig_skb,
 		if (!stm->strp.full_len) {
 			ssize_t len;
 
+			/* Can only parse if there is no offset */
+			if (unlikely(stm->strp.offset)) {
+				if (!pskb_pull(skb, stm->strp.offset)) {
+					STRP_STATS_INCR(strp->stats.mem_fail);
+					strp_parser_err(strp, -ENOMEM, desc);
+					break;
+				}
+				stm->strp.offset = 0;
+			}
+
 			len = (*strp->cb.parse_msg)(strp, head);
 
 			if (!len) {
@@ -249,8 +259,7 @@ static int __strp_recv(read_descriptor_t *desc, struct sk_buff *orig_skb,
 				STRP_STATS_INCR(strp->stats.msg_too_big);
 				strp_parser_err(strp, -EMSGSIZE, desc);
 				break;
-			} else if (len <= (ssize_t)head->len -
-					  skb->len - stm->strp.offset) {
+			} else if (len <= (ssize_t)head->len - skb->len) {
 				/* Length must be into new skb (and also
 				 * greater than zero)
 				 */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] strparser: remove any offset before parsing messages
  2018-08-09 22:40 [PATCH v0] strparser: remove any offset before parsing messages Dominique Martinet
@ 2018-08-21 12:51 ` " Dominique Martinet
  2018-08-21 14:53   ` Doron Roberts-Kedes
  0 siblings, 1 reply; 12+ messages in thread
From: Dominique Martinet @ 2018-08-21 12:51 UTC (permalink / raw)
  To: Doron Roberts-Kedes, Tom Herbert, Dave Watson
  Cc: Dominique Martinet, David S. Miller, netdev, linux-kernel

Offset is not well handled by strparser users right now.

Out of the current strparser users, we have:
 - tls, that handles offset properly in parse and rcv callbacks
 - kcm, that handles offset in rcv but not in parse
 - bpf sockmap, that does not seem to handle offset anywhere

Calling pskb_pull() on new skb ensures that the offset will be 0
everywhere in practice, and in particular for the parse function,
unless the user modifies it themselves like tls does.

This fixes a bug which can be exhibited by implementing a simple kcm
parser that looks for the packet size in the first word of the packet,
and sending two such packets in a single write() call on the other side:
the second message will be cut at the length of the first message.
Since this is a stream protocol, all the following messages will also
be corrupt since it will start looking for the next offset at a wrong
position.

Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
---

I haven't had any comment on v0, so here is what I had planned as
refactoring anyawy, but I'd *really* like some opinion on this as a
whole...

Also note that compiling bpf programs with libbcc is currently broken in
linus' master, see the fix thread here:
https://lkml.kernel.org/r/1534834088-15835-1-git-send-email-yamada.masahiro@socionext.com
You can likely just revert cafa0010cd51 ("Raise the minimum required gcc
version to 4.6") localy meanwhile, or base the patch off 4.18.

 net/strparser/strparser.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/net/strparser/strparser.c b/net/strparser/strparser.c
index da1a676860ca..d7fb30b1bcfc 100644
--- a/net/strparser/strparser.c
+++ b/net/strparser/strparser.c
@@ -201,7 +201,17 @@ static int __strp_recv(read_descriptor_t *desc, struct sk_buff *orig_skb,
 			strp->skb_nextp = NULL;
 			stm = _strp_msg(head);
 			memset(stm, 0, sizeof(*stm));
-			stm->strp.offset = orig_offset + eaten;
+
+			/* Can only parse if there is no offset */
+			if (unlikely(orig_offset + eaten)) {
+				if (!pskb_pull(skb, orig_offset + eaten)) {
+					STRP_STATS_INCR(strp->stats.mem_fail);
+					strp_parser_err(strp, -ENOMEM, desc);
+					break;
+				}
+				orig_len -= eaten;
+				orig_offset = eaten = 0;
+			}
 		} else {
 			/* Unclone if we are appending to an skb that we
 			 * already share a frag_list with.
@@ -253,8 +263,7 @@ static int __strp_recv(read_descriptor_t *desc, struct sk_buff *orig_skb,
 				STRP_STATS_INCR(strp->stats.msg_too_big);
 				strp_parser_err(strp, -EMSGSIZE, desc);
 				break;
-			} else if (len <= (ssize_t)head->len -
-					  skb->len - stm->strp.offset) {
+			} else if (len <= (ssize_t)head->len - skb->len) {
 				/* Length must be into new skb (and also
 				 * greater than zero)
 				 */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] strparser: remove any offset before parsing messages
  2018-08-21 12:51 ` [PATCH] " Dominique Martinet
@ 2018-08-21 14:53   ` Doron Roberts-Kedes
  2018-08-21 19:36     ` Dominique Martinet
  0 siblings, 1 reply; 12+ messages in thread
From: Doron Roberts-Kedes @ 2018-08-21 14:53 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: Tom Herbert, Dave Watson, David S. Miller, netdev, linux-kernel

On Tue, Aug 21, 2018 at 02:51:46PM +0200, Dominique Martinet wrote:
> Offset is not well handled by strparser users right now.
> 
> Out of the current strparser users, we have:
>  - tls, that handles offset properly in parse and rcv callbacks
>  - kcm, that handles offset in rcv but not in parse
>  - bpf sockmap, that does not seem to handle offset anywhere
> 
> Calling pskb_pull() on new skb ensures that the offset will be 0
> everywhere in practice, and in particular for the parse function,
> unless the user modifies it themselves like tls does.
> 
> This fixes a bug which can be exhibited by implementing a simple kcm
> parser that looks for the packet size in the first word of the packet,
> and sending two such packets in a single write() call on the other side:
> the second message will be cut at the length of the first message.
> Since this is a stream protocol, all the following messages will also
> be corrupt since it will start looking for the next offset at a wrong
> position.
> 
> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
> ---

There are a few issues with this patch. First, it seems like you're
trying to fix bugs in users of strparser by changing an implementation
detail of strparser. Second, this implementation change can add malloc's
and copies where there were none before. 

If strparser users do not handle non-zero offset properly, then that
doesn't motivate changing the implementation of strparser to copy
around data to accomodate those buggy users. 

Why not submit a patch that handles offset properly in the code you
pointed out? 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] strparser: remove any offset before parsing messages
  2018-08-21 14:53   ` Doron Roberts-Kedes
@ 2018-08-21 19:36     ` Dominique Martinet
  2018-08-21 21:15       ` Doron Roberts-Kedes
  0 siblings, 1 reply; 12+ messages in thread
From: Dominique Martinet @ 2018-08-21 19:36 UTC (permalink / raw)
  To: Doron Roberts-Kedes
  Cc: Tom Herbert, Dave Watson, David S. Miller, netdev, linux-kernel

Doron Roberts-Kedes wrote on Tue, Aug 21, 2018:
> There are a few issues with this patch. First, it seems like you're
> trying to fix bugs in users of strparser by changing an implementation
> detail of strparser.

Yes, that's why I have been writing since the original discussion that I
do not like this fix, but as I said in the other thread and v0 of this
patch I do not know how to tell the bpf function to start with an offset
in the skb in e.g. kcm_parse_func_strparser

I could add the pull in that function, but that feels a bit wrong on a
separation level to me.

> Second, this implementation change can add malloc's and copies where
> there were none before.

Yes I agree this is more than suboptimal for tls, I've also said that.


> If strparser users do not handle non-zero offset properly, then that
> doesn't motivate changing the implementation of strparser to copy
> around data to accomodate those buggy users. 
>
> Why not submit a patch that handles offset properly in the code you
> pointed out? 

One of the solutions I had suggested was adding a flag at strparser
setup time to only do that pull for users which cannot handle offset,
but nobody seemed interested two weeks ago. I can still do that.

That's still suboptimal, but I don't have any better idea.
To properly fix the users, I'd really need help with how bpf works to
even know if passing an offset would be possible in the first place, as
I do not see how at this time.


Thanks,
-- 
Dominique Martinet

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] strparser: remove any offset before parsing messages
  2018-08-21 19:36     ` Dominique Martinet
@ 2018-08-21 21:15       ` Doron Roberts-Kedes
  2018-08-21 22:51         ` Dominique Martinet
  0 siblings, 1 reply; 12+ messages in thread
From: Doron Roberts-Kedes @ 2018-08-21 21:15 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: Tom Herbert, Dave Watson, David S. Miller, netdev, linux-kernel

On Tue, Aug 21, 2018 at 09:36:55PM +0200, Dominique Martinet wrote:

> One of the solutions I had suggested was adding a flag at strparser
> setup time to only do that pull for users which cannot handle offset,
> but nobody seemed interested two weeks ago. I can still do that.

This seems overly complicated. 

> That's still suboptimal, but I don't have any better idea.
> To properly fix the users, I'd really need help with how bpf works to
> even know if passing an offset would be possible in the first place, as
> I do not see how at this time.

Thanks for clarifying Dominique.
It seems like we mainly agree that the proposed patch is suboptimal for
existing clients of the library that use offset correctly (tls). 

It also seems like you've identified that the proper fix is in bpf. 
Regrettably, I cannot help you understand how bpf works because I'm not
familiar with that code. 

As an aside, I would recommend reaching the netdev FAQ page:
https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt

It contains helpful hints about how to format email subjects (specifying
net vs. net-next) and determining when trees are closed (currently
closed). 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] strparser: remove any offset before parsing messages
  2018-08-21 21:15       ` Doron Roberts-Kedes
@ 2018-08-21 22:51         ` Dominique Martinet
  2018-08-21 23:35           ` Doron Roberts-Kedes
  0 siblings, 1 reply; 12+ messages in thread
From: Dominique Martinet @ 2018-08-21 22:51 UTC (permalink / raw)
  To: Doron Roberts-Kedes
  Cc: Tom Herbert, Dave Watson, David S. Miller, netdev, linux-kernel

Doron Roberts-Kedes wrote on Tue, Aug 21, 2018:
> On Tue, Aug 21, 2018 at 09:36:55PM +0200, Dominique Martinet wrote:
> > One of the solutions I had suggested was adding a flag at strparser
> > setup time to only do that pull for users which cannot handle offset,
> > but nobody seemed interested two weeks ago. I can still do that.
> 
> This seems overly complicated. 

That sounds much simpler than adding a similar feature to bpf if it
doesn't already support it -- we already have an user-provided struct at
strparser init timer (strp->cb) in which it would be easy to add a flag
saying whether the callbacks support offset or not.

That's maybe three more lines than the current patch, which is also
pretty simple, I'm not sure what you're expecting from alternative
solutions to call that overly complicated...


> It seems like we mainly agree that the proposed patch is suboptimal for
> existing clients of the library that use offset correctly (tls). 
> 
> It also seems like you've identified that the proper fix is in bpf. 

I don't think bpf itself needs to be changed here -- the offset is
stored in a strparser specific struct so short of such a skb_pull I
think we'd need to change the type of the bpf function, pass it it the
extra parameter, and make it a user visible change breaking the kcm
API... And I have no idea for sockmap but probably something similar.

I can't think of that as better than adding a flag to strparser.

(Also, note that pskb_pull will not copy any data or allocate memory
unless we're pulling past the end of the skb, which seems pretty
unlikely in that situation as we should have consumed any fully "eaten"
skb before getting to a new one here -- so in practice this patch just
adds a skb->data += offset with safety guards "just in case")


> As an aside, I would recommend reaching the netdev FAQ page:
> https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
> 
> It contains helpful hints about how to format email subjects (specifying
> net vs. net-next) and determining when trees are closed (currently
> closed). 

Thank you for pointing out to that document.
While this bug has been around from day one of kcm it is still a bug fix
so I'd rather let maintainers decide which tree it goes to.

I wasn't aware that net-next closes during the Linus merge window, but
if you want to split hairs, the FAQ says "do not send new net-next
content to netdev [while closed]" and this isn't new content as the v0
of the patch was mostly the same and sent before the 4.18 release, (and,
ironically, did not get any comment).
Anyway, sure, I'll wait until next week for a v2 if that matters.


Thanks,
-- 
Dominique

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] strparser: remove any offset before parsing messages
  2018-08-21 22:51         ` Dominique Martinet
@ 2018-08-21 23:35           ` Doron Roberts-Kedes
  2018-08-22  0:46             ` Dominique Martinet
  0 siblings, 1 reply; 12+ messages in thread
From: Doron Roberts-Kedes @ 2018-08-21 23:35 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: Tom Herbert, Dave Watson, David S. Miller, netdev, linux-kernel

On Wed, Aug 22, 2018 at 12:51:13AM +0200, Dominique Martinet wrote:
> That's maybe three more lines than the current patch, which is also
> pretty simple, I'm not sure what you're expecting from alternative
> solutions to call that overly complicated...

The line count is not the source of the complexity. The undue complexity
is having strparser operate in two modes: one for clients that properly
use the API by respecting the value of offset, and another for clients 
that do not. 

> I don't think bpf itself needs to be changed here -- the offset is
> stored in a strparser specific struct so short of such a skb_pull I
> think we'd need to change the type of the bpf function, pass it it the
> extra parameter, and make it a user visible change breaking the kcm
> API... And I have no idea for sockmap but probably something similar.

I'm not sure I follow you here. Any rcv_msg callback implementation
receives an skb. Calling strp_msg() on the skb gives you the strp_msg
which has the offset value. Can you explain why passing an extra
parameter is necessary to get the offset?

> I can't think of that as better than adding a flag to strparser.
> 
> (Also, note that pskb_pull will not copy any data or allocate memory
> unless we're pulling past the end of the skb, which seems pretty
> unlikely in that situation as we should have consumed any fully "eaten"
> skb before getting to a new one here -- so in practice this patch just
> adds a skb->data += offset with safety guards "just in case")

Yes, no data will be copied if the you don't pull beyond the linear
buffer. Adding overhead even in a small percentage of cases still
requires a good justification. In this particular case, I think a good
justification would be demonstrating that it is impractical for the 
buggy strparser users you've pointed out to use the existing API and
respect the value of offset. You have indicated that you are not super
familiar with the bpf code, which is fine (I'm not either), but this
isn't a good reason to make a change to strparser instead of bpf.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] strparser: remove any offset before parsing messages
  2018-08-21 23:35           ` Doron Roberts-Kedes
@ 2018-08-22  0:46             ` Dominique Martinet
  2018-08-22  2:33               ` Doron Roberts-Kedes
  0 siblings, 1 reply; 12+ messages in thread
From: Dominique Martinet @ 2018-08-22  0:46 UTC (permalink / raw)
  To: Doron Roberts-Kedes
  Cc: Tom Herbert, Dave Watson, David S. Miller, netdev, linux-kernel

Doron Roberts-Kedes wrote on Tue, Aug 21, 2018:
> On Wed, Aug 22, 2018 at 12:51:13AM +0200, Dominique Martinet wrote:
> > That's maybe three more lines than the current patch, which is also
> > pretty simple, I'm not sure what you're expecting from alternative
> > solutions to call that overly complicated...
> 
> The line count is not the source of the complexity. The undue complexity
> is having strparser operate in two modes: one for clients that properly
> use the API by respecting the value of offset, and another for clients 
> that do not. 

I might sound stubborn at this point but that still does not sound
complex compared to the alternatives I can think of.


> > I don't think bpf itself needs to be changed here -- the offset is
> > stored in a strparser specific struct so short of such a skb_pull I
> > think we'd need to change the type of the bpf function, pass it it the
> > extra parameter, and make it a user visible change breaking the kcm
> > API... And I have no idea for sockmap but probably something similar.
> 
> I'm not sure I follow you here. Any rcv_msg callback implementation
> receives an skb. Calling strp_msg() on the skb gives you the strp_msg
> which has the offset value. Can you explain why passing an extra
> parameter is necessary to get the offset?


Yes, the rcv_msg callback itself can get the offset easily, and it's not
that which needs an extra parameter but the bpf function kcm/sockmap are
calling which would need either an extra parameter or changing to get
that value themselves -- the later is probably not very difficult and
won't break the API, but at the same time pushes the responsability to
kcm/sockmap users who will get that wrong and waste time trying to
understand how kcm works like I did.

Breaking the API has the advantage that users will get an error telling
them to update their bpf program instead of randomly failing when tcp
aggregation happens.

For what it's worth, I don't think either are acceptable solutions, I'm
just stating what would a "fix in bpf" would be.


A "fix in kcm/sockmap" would be to pull just before calling the bpf
program, which could be a middle ground, but I think that's more
perverse than adding a flag to strparser for new users because we'd
basically be saying users should modify the skb that strparser users
under its feet, and there is no guarantee what would happen to the
strparser logic in that case -- it might work to pull in the parser
function but it might not work in rcv for all I know, or the next user
might think that since pull is ok some other operation on the skb is as
well...


> > (Also, note that pskb_pull will not copy any data or allocate memory
> > unless we're pulling past the end of the skb, which seems pretty
> > unlikely in that situation as we should have consumed any fully "eaten"
> > skb before getting to a new one here -- so in practice this patch just
> > adds a skb->data += offset with safety guards "just in case")
> 
> Yes, no data will be copied if the you don't pull beyond the linear
> buffer. Adding overhead even in a small percentage of cases still
> requires a good justification.

As I wrote above, I think it should not be possible, so we're not
even talking about a small percentage here.
The reason I didn't use skb_pull (the head-only variant) is that I'd
rather have the overhead than a BUG() if I'm wrong on this...

> In this particular case, I think a good justification would be
> demonstrating that it is impractical for the buggy strparser users
> you've pointed out to use the existing API and respect the value of
> offset.

That's what the previous paragraph about changing the API intended to
do.

> You have indicated that you are not super familiar with the bpf code,
> which is fine (I'm not either), but this isn't a good reason to make a
> change to strparser instead of bpf. 

I didn't even know kcm/strparser existed a few weeks ago, not being
familiar doesn't mean I didn't look at how it works.

I'd be glad to be proven wrong here but I really do not think there is a
possibility within the bpf framework to give a fake skb with an external
offset to the bpf program transparently.

Assuming there is, it would have to be more expensive than a pull (it'd
basically need to do a pull then restore the original data somehow)...

And if there isn't I certainly do not want to add one, not because I
don't want to mess with something I'm not too familiar with (that'd
actually be an argument to do it that way to me), but because I do not
think it belongs there; bpf should not need to know what a skb is or how
it works.


All being said I'm still convinced having two modes in strparser is the
simplest solution.

-- 
Dominique

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] strparser: remove any offset before parsing messages
  2018-08-22  0:46             ` Dominique Martinet
@ 2018-08-22  2:33               ` Doron Roberts-Kedes
  2018-08-22  5:47                 ` Dominique Martinet
  0 siblings, 1 reply; 12+ messages in thread
From: Doron Roberts-Kedes @ 2018-08-22  2:33 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: Tom Herbert, Dave Watson, David S. Miller, netdev, linux-kernel

On Wed, Aug 22, 2018 at 02:46:47AM +0200, Dominique Martinet wrote:
> Yes, the rcv_msg callback itself can get the offset easily, and it's not
> that which needs an extra parameter but the bpf function kcm/sockmap are
> calling which would need either an extra parameter or changing to get
> that value themselves.

Ah cool. Thanks for explaining. 

> For what it's worth, I don't think either are acceptable solutions, I'm
> just stating what would a "fix in bpf" would be.

Agreed that the discussion should be about whether to fix it up in
strparser or sockmap. bpf seems inappropriate.

> strparser logic in that case -- it might work to pull in the parser
> function but it might not work in rcv for all I know, or the next user
> might think that since pull is ok some other operation on the skb is as
> well...

Just to make sure I understand, is it possible you meant to say that the
other way around? Surely the rcv callback can do whatever it wants with
the skb. Its the parse callback that may need to be a little more
careful with the skb.

For the parse case, why not just clone and pull? 

> As I wrote above, I think it should not be possible, so we're not
> even talking about a small percentage here.
> The reason I didn't use skb_pull (the head-only variant) is that I'd
> rather have the overhead than a BUG() if I'm wrong on this...

A printk in that section when (orig_offset + eaten > skb_headlen(head)) 
confirms that this case is not uncommon or impossible. Would have to do
more work to see how many hundreds of times per second, but it is not a
philosophical concern.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] strparser: remove any offset before parsing messages
  2018-08-22  2:33               ` Doron Roberts-Kedes
@ 2018-08-22  5:47                 ` Dominique Martinet
  2018-08-22 18:38                   ` Dave Watson
  0 siblings, 1 reply; 12+ messages in thread
From: Dominique Martinet @ 2018-08-22  5:47 UTC (permalink / raw)
  To: Doron Roberts-Kedes
  Cc: Tom Herbert, Dave Watson, David S. Miller, netdev, linux-kernel

Doron Roberts-Kedes wrote on Tue, Aug 21, 2018:
> > strparser logic in that case -- it might work to pull in the parser
> > function but it might not work in rcv for all I know, or the next user
> > might think that since pull is ok some other operation on the skb is as
> > well...
> 
> Just to make sure I understand, is it possible you meant to say that the
> other way around? Surely the rcv callback can do whatever it wants with
> the skb. Its the parse callback that may need to be a little more
> careful with the skb.

Hmm, right, when we call the rcv callback we remove the skb from
strp->skb_head, and by the next iteration we have a new clone of the
orig_skb so it looks safe, but that's not something that's obvious at
first glance; it works because the skb was cloned at the start of the
loop.


> For the parse case, why not just clone and pull? 

Hmm, if we clone I guess there is no side effect to fear, that could be
acceptable... It feels that we're just pushing more overhead on to
kcm/sockmap though; in strparser's __strp_recv we know we can pull
safely so doing it there feels simpler to me.

After testing the overhead doesn't seem to be bad though, it looks like
it's less than the noise level on my laptop on performance governor; the
only thing to pay attention to is that if we clone here I'll need to
also add another pull in sockmap's rcv function
(smap_read_sock_strparser) that doesn't handle offset either.

If we only pull just before parsing I think rcv is also guaranteed to
have no offset, it has to start with the same offset as the parsing
unless the user changed it, right?



Anyway there's progress, we're down to these two-ish choices if I
followed correctly:
 - add a flag in strp_callbacks that says offsets are ok or not for
parsing, and just pull if it's set.
Now we're back to that I'd be moving the pull just before parsing like I
did in v0, as that's easier to follow.
 or
 - add a (clone?+)pull in kcm_rcv_strparser and smap_parse_func_strparser
(+ in smap_read_sock_strparser if clone)


(As a side note, I noticed this patch is bugged, orig_offset/eaten
shouldn't be reset to zero after the pull (and thus orig_len doesn't
need changing either); that's what I get for trying to "simplify"
something that was simpler in the v0 to me...)


> > As I wrote above, I think it should not be possible, so we're not
> > even talking about a small percentage here.
> > The reason I didn't use skb_pull (the head-only variant) is that I'd
> > rather have the overhead than a BUG() if I'm wrong on this...
> 
> A printk in that section when (orig_offset + eaten > skb_headlen(head)) 
> confirms that this case is not uncommon or impossible. Would have to do
> more work to see how many hundreds of times per second, but it is not a
> philosophical concern.

Hmm, right, it does happen if I force two bigger packets in a single
write() on my reproducer; I guess my workload didn't exhibit that
behaviour with a 9p client...

I've tried measuring that overhead as well by writing a more complex bpf
program that would fetch the offset in the skb but for some reason I'm
reading a 0 offset when it's not zero... well, not like there's much
choice for this at this point anyway; I don't think we'll do this
without pull, I'll put that on background.

-- 
Dominique


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] strparser: remove any offset before parsing messages
  2018-08-22  5:47                 ` Dominique Martinet
@ 2018-08-22 18:38                   ` Dave Watson
  2018-08-23  1:04                     ` Dominique Martinet
  0 siblings, 1 reply; 12+ messages in thread
From: Dave Watson @ 2018-08-22 18:38 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: Doron Roberts-Kedes, Tom Herbert, David S. Miller, netdev, linux-kernel

On 08/22/18 07:47 AM, Dominique Martinet wrote:
> > > As I wrote above, I think it should not be possible, so we're not
> > > even talking about a small percentage here.
> > > The reason I didn't use skb_pull (the head-only variant) is that I'd
> > > rather have the overhead than a BUG() if I'm wrong on this...
> > 
> > A printk in that section when (orig_offset + eaten > skb_headlen(head)) 
> > confirms that this case is not uncommon or impossible. Would have to do
> > more work to see how many hundreds of times per second, but it is not a
> > philosophical concern.
> 
> Hmm, right, it does happen if I force two bigger packets in a single
> write() on my reproducer; I guess my workload didn't exhibit that
> behaviour with a 9p client...
> 
> I've tried measuring that overhead as well by writing a more complex bpf
> program that would fetch the offset in the skb but for some reason I'm
> reading a 0 offset when it's not zero... well, not like there's much
> choice for this at this point anyway; I don't think we'll do this
> without pull, I'll put that on background.

For what it is worth we checked the offset in bpf, something
along the lines of
	  
struct kcm_rx_msg {   int full_len;  int offset;};
static inline struct kcm_rx_msg *kcm_rx_msg(struct __sk_buff *skb)
      { return (struct kcm_rx_msg *)skb->cb;}

int decode_framing(struct __sk_buff *skb)
{ return load_word(skb, kcm_rx_msg(skb)->offset);}

Although it did puzzle me for a while figuring that out when I ran in
to it.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] strparser: remove any offset before parsing messages
  2018-08-22 18:38                   ` Dave Watson
@ 2018-08-23  1:04                     ` Dominique Martinet
  0 siblings, 0 replies; 12+ messages in thread
From: Dominique Martinet @ 2018-08-23  1:04 UTC (permalink / raw)
  To: Dave Watson
  Cc: Doron Roberts-Kedes, Tom Herbert, David S. Miller, netdev, linux-kernel

Dave Watson wrote on Wed, Aug 22, 2018:
> > I've tried measuring that overhead as well by writing a more complex bpf
> > program that would fetch the offset in the skb but for some reason I'm
> > reading a 0 offset when it's not zero... well, not like there's much
> > choice for this at this point anyway; I don't think we'll do this
> > without pull, I'll put that on background.
> 
> For what it is worth we checked the offset in bpf, something
> along the lines of

Oh, thanks!

> 	  > struct kcm_rx_msg {   int full_len;  int offset;};
> static inline struct kcm_rx_msg *kcm_rx_msg(struct __sk_buff *skb)
>       { return (struct kcm_rx_msg *)skb->cb;}
> 
> int decode_framing(struct __sk_buff *skb)
> { return load_word(skb, kcm_rx_msg(skb)->offset);}

So you're taking directly the address at skb->cb but the linux code has
this function:
static inline struct strp_msg *strp_msg(struct sk_buff *skb)
{
        return (struct strp_msg *)((void *)skb->cb +
                offsetof(struct qdisc_skb_cb, data));
}
and qdisc_skb_cb.data is another 8 bytes in, that would explain I had
different results (and now I'm trying your snippet it does work), but
I'll have to admit I fail to understand this....

Ok, so 'cb' in __sk_buff is 48 bytes in but 'cb' in sk_buff is 40 bytes
in -- I might just start getting annoyed over this, is there a reason
for the different offset?!


> Although it did puzzle me for a while figuring that out when I ran in
> to it.

Well, at least it means some people were aware of the problem and worked
around it in their own way -- what do you think of pulling instead?
I mean, we could just document that "really well" and provide the
get-offset function in some header that would be made include-able from
bpf.. But right now this isn't really the case.


FWIW now I have this version I also don't notice any performance change
with the pull on my example, it actually looks like the bpf load_word is
slightly slower than pull to access data that is not in the head, but
the noise level is pretty bad.


Thanks,
-- 
Dominique

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, back to index

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-09 22:40 [PATCH v0] strparser: remove any offset before parsing messages Dominique Martinet
2018-08-21 12:51 ` [PATCH] " Dominique Martinet
2018-08-21 14:53   ` Doron Roberts-Kedes
2018-08-21 19:36     ` Dominique Martinet
2018-08-21 21:15       ` Doron Roberts-Kedes
2018-08-21 22:51         ` Dominique Martinet
2018-08-21 23:35           ` Doron Roberts-Kedes
2018-08-22  0:46             ` Dominique Martinet
2018-08-22  2:33               ` Doron Roberts-Kedes
2018-08-22  5:47                 ` Dominique Martinet
2018-08-22 18:38                   ` Dave Watson
2018-08-23  1:04                     ` Dominique Martinet

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org linux-kernel@archiver.kernel.org
	public-inbox-index lkml


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox