All of lore.kernel.org
 help / color / mirror / Atom feed
From: Joey Salazar <jgsal@protonmail.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: Developer support list for Wireshark 
	<wireshark-dev@wireshark.org>,
	"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: [Wireshark-dev] Multiple-line parsing of packets dissected over HTTP
Date: Wed, 20 Jan 2021 23:17:37 +0000	[thread overview]
Message-ID: <aCtJaGp2bmtHdiu65J9Z3vSEeFMfN9OvOzv-eXNbnXUVKpH-f7q48QJpTklEWoOEaAt4ctixkpOIC8vMTmBP7er36t-y8HHFRA17vXzB-UE=@protonmail.com> (raw)
In-Reply-To: <YAd0044tbht/DOKq@google.com>

Hi all, Jonathan,

On Tuesday, January 19, 2021 6:09 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:

> Hi,
>
> Pascal Quantin wrote:
>
> > Le mar. 19 janv. 2021 à 17:45, Joey Salazar via Wireshark-dev <
> > wireshark-dev@wireshark.org> a écrit :
>
> > > In commit 33af2649 [1] we can keep dissecting the contents of the req,
> > > adv, and res packets by setting
> > > while (plen > 0) { }
> > > either in `dissect_git_pdu()` or in `dissect_one_pkt_line()`, but for now
> > > in `dissect_git_pdu()` it'd be a bit messy, so wanted to ask for your
> > > feedback for getting `dissect_one_pkt_line()` to work properly first.
> > > As you can see in pcap 169 [2], it correctly parses the length of the
> > > first line as 0x0014 (20 bytes) until `0x0a`, then it's supposed to get the
> > > length of the next line by the first 4 hex bytes in that line, but instead
> > > of reading the length as 0x0018 (24 bytes) it's reading it as 0x0010 (16
> > > bytes), and anyways, this particular line's length actually is 59 bytes.
>
> Interesting. Let me summarize your question: getting the information
> in one place here, the relevant code at line 114 looks like
>
> | + while (plen > 0) {
> | + proto_tree_add_uint(git_tree, hf_git_packet_len, tvb, offset, 4, plen);
> | + offset += 4;
> | + plen -= 4;
> | +
> | + proto_tree_add_item(git_tree, hf_git_packet_data, tvb, offset, plen, ENC_NA);
> | + offset += plen;
> | + // To-do: add lines for parsing of terminator packet 0000
> | + }
>
> The relevant part of the pcap is shown in an image; transcribing
> imperfectly, I see
>
> | 0014command=ls-refs\n
> | 0018agent=git/2.29.0.rc2
> | 0016object-format=sha1
> | 0001
> [etc]
>
> where \n denotes a newline byte and there are no newlines between
> these pkt-lines.
>
> That first pkt-line has 4 bytes for the length, followed by 12 bytes
> for "command=ls-refs\n", including newline. So why does this parse as
>
> packet-length: 0x0014
> packet data: "command=ls-refs\n"
> packet-length: 0x0010
> packet data: "agent=[etc]"
>
> ?
>
> [...]
>
> > So what is the code leading to this dissection? It does not seem to be
> > https://gitlab.com/joeysal/wireshark/-/commit/33af2649927cb5660d4aeb64b9a9e9a58a1823aa
> > as dissect_one_pkt_line() seem to read only one line (BTW using a while
> > loop in this commit is useless as you are incrementing offset by plen, and
> > the code you shared considers that plen includes the 4 bytes of the packet
> > length field while your screenshot does not assume that).
>
> This reply is a bit dense, but it contains the hints to move forward:
>
> First, there's the matter of the contract of the dissect_one_pkt_line()
> function. The name suggests it would dissect a single pkt-line, but
> it has this loop in it. What does it actually do? And what do we
> want it to do?
>
> That second question is easy to answer: this code will be much easier
> to read if dissect_one_pktline dissects a single pkt-line! For
> example, if we imagine a contract like
>
> /** Dissects a single pkt-line.
> *
> * @param[in] tvb Buffer containing a pkt-line.
> * @param offset Offset at which to start reading.
> * @param[out] tree Tree to attach the dissected pkt-line to.
> * @return Number of bytes dissected, or -1 on error.
> */
> static int dissect_one_pkt_line(tvbuff_t *tvb, int offset, proto_tree *tree)
>
> then we could call this in a loop, like:
>
> int offset = 0;
>
> while (offset < total length)
> offset += dissect_one_pkt_line(tvb, offset, tree);
>
> Obtaining the total length and including some error handling left as
> an exercise to the reader.
>
> As for the first question: what does the current code do? The loop at
> l114 doesn't modify plen except by subtracting 4 from it. So instead
> of reading the pkt-length from the next pkt-line, it assumes it is 4
> bytes less. 0x14 - 4 is 0x10, hence the 0x10 as pkt length
> assumption.

This was most helpful, thank you, in some cases I couldn't understand why `pkt-length` was being parsed the one it was.
Apologies to all for my poor explanation of what I wanted the code to do!

I've updated the commit and it now is 9342d7fb [1], it is now reading the size of each pkt-line before checking for `0000` Flush Packet (checking for `0001` delimiter could be added in line 131) and jumping to the next pkt-line.

However, I understand now that `while (plen > 0)` in line 119 should not be doing that since that `plen` is for the first pkt-line only, making this an unsuitable approach.

I'll start working on the approach suggested above by Jonathan. All your feedback and observations most welcome.

Thank you all for the input and time invested so far,
Joey

[1] gitlab.com/joeysal/wireshark/-/commit/9342d7fbc15fad4ec7765bbbb0f1672319dc705a

      reply	other threads:[~2021-01-21  0:26 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-19  4:07 Multiple-line parsing of packets dissected over HTTP Joey Salazar
     [not found] ` <CAGka-81Ze71igsW2zsfzYqqkr+sz5FE6YYo3mEpdEQUSUxZvaw@mail.gmail.com>
2021-01-20  0:09   ` [Wireshark-dev] " Jonathan Nieder
2021-01-20 23:17     ` Joey Salazar [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='aCtJaGp2bmtHdiu65J9Z3vSEeFMfN9OvOzv-eXNbnXUVKpH-f7q48QJpTklEWoOEaAt4ctixkpOIC8vMTmBP7er36t-y8HHFRA17vXzB-UE=@protonmail.com' \
    --to=jgsal@protonmail.com \
    --cc=git@vger.kernel.org \
    --cc=jrnieder@gmail.com \
    --cc=wireshark-dev@wireshark.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.