From: Dominique Martinet <asmadeus@codewreck.org>
To: David Miller <davem@davemloft.net>
Cc: doronrk@fb.com, tom@quantonium.net, davejwatson@fb.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] kcm: remove any offset before parsing messages
Date: Wed, 31 Oct 2018 03:56:57 +0100
Message-ID: <20181031025657.GA17861@nautica>
In-Reply-To: <20180918015723.GA26300@nautica>
Dominique Martinet wrote on Tue, Sep 18, 2018:
> David Miller wrote on Mon, Sep 17, 2018:
> > From: Dominique Martinet <asmadeus@codewreck.org>
> > Date: Wed, 12 Sep 2018 07:36:42 +0200
> > > Dominique Martinet wrote on Tue, Sep 11, 2018:
> > >> Hmm, while trying to benchmark this, I sometimes got hangs in
> > >> kcm_wait_data() for the last packet somehow?
> > >> The sender program was done (exited (zombie) so I assumed the sender
> > >> socket flushed), but the receiver was in kcm_wait_data in kcm_recvmsg
> > >> indicating it parsed a header but there was no skb to peek at?
> > >> But the sock is locked so this shouldn't be racy...
> > >>
> > >> I can get it fairly often with this patch and small messages with an
> > >> offset, but I think it's just because the pull changes some timing - I
> > >> can't hit it with just the clone, and I can hit it with a pull without
> > >> clone as well.... And I don't see how pulling a cloned skb can impact
> > >> the original socket, but I'm a bit fuzzy on this.
> > >
> > > This is weird, I cannot reproduce at all without that pull, even if I
> > > add another delay there instead of the pull, so it's not just timing...
> >
> > I really can't apply this patch until you resolve this.
> >
> > It is weird, given your description, though...
>
> Thanks for the reminder! I totally agree with you here and did not
> expect this to be merged as it is (in retrospect, I probably should have
> written something to that extent in the subject, "RFC"?)
Found the issue after some trouble reproducing on another VM. Long story
short:
- I was blaming kcm_wait_data's sk_wait_data for waiting while there was
something in sk->sk_receive_queue, but after adding a fake timeout and
some debug messages I can see the receive queue is empty.
However, going back up from the kcm_sock to the kcm_mux to the kcm_psock,
there are things in the psock's socket's receive_queue... (If I'm
following the code correctly, that would be the underlying tcp socket)
- that psock's strparser contains some hints: the interrupted and
stopped bits are set. strp->interrupted looks like it's only set if
kcm_parse_msg returns something < 0...
And sure enough, the skb_pull returns NULL iff there's such a hang...!
I might be tempted to send a patch to strparser to add a pr_debug
message in strp_abort_strp...
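To make the failure mode above concrete, here is a toy userspace model of
it. This is only a sketch, not kernel code: every toy_* name is made up for
illustration, and -12 stands in for whatever negative value the parse
callback would return when the pull fails (I am assuming -ENOMEM). It shows
a parser error setting both bits and leaving the data on the lower socket:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code: all names are hypothetical. */
struct toy_strp {
	bool interrupted;    /* the parse callback reported an error */
	bool stopped;        /* the parser consumes no further data */
	size_t lower_queued; /* bytes still queued on the lower (tcp) socket */
	size_t delivered;    /* bytes handed to the upper (kcm) socket */
};

/* Parse callback: returns a message length, 0 for "need more data",
 * or a negative value on error. */
typedef int (*toy_parse_fn)(size_t queued);

/* One receive pass: deliver whole messages until the queue is empty,
 * more data is needed, or the parse callback fails. */
static void toy_strp_recv(struct toy_strp *strp, toy_parse_fn parse)
{
	assert(strp && parse);
	while (!strp->stopped && strp->lower_queued > 0) {
		int len = parse(strp->lower_queued);

		if (len < 0) {
			/* Error path: both bits end up set, nothing is delivered. */
			strp->interrupted = true;
			strp->stopped = true;
			break;
		}
		if (len == 0 || (size_t)len > strp->lower_queued)
			break; /* incomplete message, wait for more data */
		strp->lower_queued -= (size_t)len;
		strp->delivered += (size_t)len;
	}
}

/* Every message is 100 bytes long. */
static int toy_parse_ok(size_t queued) { (void)queued; return 100; }

/* Simulated allocation failure (assumed to surface as -ENOMEM, i.e. -12). */
static int toy_parse_enomem(size_t queued) { (void)queued; return -12; }
```

With 300 bytes queued, toy_parse_ok drains everything, while
toy_parse_enomem leaves all 300 bytes on the lower queue with interrupted
and stopped set, so a reader on the upper socket would wait forever.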
Anyway, that probably explains why I have no problem with a bigger VM
(uselessly more memory available) or without KASAN (I guess there's
overhead?), but I'm sending at most 300k of data and the VM has 1.5GB
of RAM, so if there's an allocation failure there I think there's a
problem!...
So, well, I'm not sure about the way forward. Add a bpf helper and
document that kcm users should mind the offset?
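As a rough illustration of what "minding the offset" would mean for a
parser, here is a userspace sketch. It is not a bpf program and not a
proposed helper: toy_msg_len is a hypothetical name, and the framing (a
2-byte big-endian length that includes the header itself) is only an
assumption for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical framing, for illustration only: each message starts with
 * a 2-byte big-endian length that covers the header itself. */
static int toy_msg_len(const uint8_t *buf, size_t buflen, size_t offset)
{
	assert(buf != NULL);
	/* Header not fully available at this offset: ask for more data. */
	if (offset > buflen || buflen - offset < 2)
		return 0;
	/* Read the length at the message's offset, not at the buffer start. */
	return (buf[offset] << 8) | buf[offset + 1];
}
```

If a buffer holds a 4-byte message followed by a 6-byte one, the second
message starts at offset 4. A parser that always reads from offset 0 would
see the first header again and report 4 for it; passing the offset gives 6.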
Thanks,
--
Dominique