From: Bill Fenner <fenner@gmail.com>
To: Guy Harris <guy@alum.mit.edu>
Cc: Ani Sinha <ani@aristanetworks.com>,
	netdev@vger.kernel.org, tcpdump-workers@lists.tcpdump.org,
	Michael Richardson <mcr@sandelman.ca>,
	Francesco Ruggeri <fruggeri@aristanetworks.com>
Subject: Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Date: Fri, 2 Nov 2012 12:13:18 -0400
Message-ID: <CAF4SogYMyANyZ4-RYoVHf3TewPVQgR49m5HLDdwVMJ7sn8_XxQ@mail.gmail.com>
In-Reply-To: <5422DBB2-EABF-4C9F-B0CD-8C77E91F9FF8@alum.mit.edu>

On Wed, Oct 31, 2012 at 6:20 PM, Guy Harris <guy@alum.mit.edu> wrote:
>
> On Oct 31, 2012, at 2:50 PM, Ani Sinha <ani@aristanetworks.com> wrote:
>
>> pcap files that already have the tags reinserted should work with
>> the current filter code. However, for live traffic, one has to get
>> the tags from CMSG() and then reinsert them into the packet for the
>> current filter to work.
>
> *Somebody* has to do that, at least to packets that pass the filter, before they're handed to a libpcap-based application, for programs that expect to see packets as they arrived from/were transmitted to the wire to work.
>
> I.e., the tags *should* be reinserted by libpcap, and, as I understand it, that's what the
>
>         #if defined(HAVE_PACKET_AUXDATA) && defined(HAVE_LINUX_TPACKET_AUXDATA_TP_VLAN_TCI)
>                 ...
>         #endif
>
> blocks of code in pcap-linux.c in libpcap are doing.
>
> Now, if filtering is being done in the *kernel*, and the tags aren't being reinserted by the kernel, then filter code stuffed into the kernel would need to differ from filter code run in userland.  There's already precedent for that on Linux, with the "cooked mode" headers; those are synthesized by libpcap from the metadata returned for PF_PACKET sockets, and the code that attempts to hand the kernel a filter goes through the filter code, which was generated under the assumption that the packet begins with a "cooked mode" header, and modifies (a copy of) the code to, instead, use the special Linux-BPF-interpreter offsets to access the metadata.
>
> The right thing to do here would be to, if possible, do the same, so that the kernel doesn't have to reinsert VLAN tags for packets that aren't going to be handed to userland.

In this case, it would be incredibly complicated to do this by just
postprocessing a set of BPF instructions.  The problem is that when
running the filter in the kernel, the IP header, etc. are not offset
by the tag, so the adjustment to "off_macpl" and "off_linktype" would
be zero, not 4, while generating the rest of the expression.  We would
also have to insert code where the ethertype is compared to 0x8100, to
load the vlan-tagged metadata instead, so all jumps crossing that
point would have to be adjusted, and if the "if-false" instruction was
also testing the ethertype, then the ethertype would have to be
reloaded (again inserting another instruction).
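
To make the jump-adjustment cost concrete, here is a minimal sketch of
inserting one instruction into a classic BPF program and fixing up the
relative jumps that cross the insertion point.  It is an illustration,
not libpcap code: "struct insn" and insert_insn() are hypothetical
names, declared locally with the classic BPF instruction layout.  It
assumes jumps are forward-only (as in classic BPF) and that a jump
which landed exactly on the old instruction at "pos" should now execute
the inserted instruction first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in with the classic BPF instruction layout, declared here
 * so the sketch builds without libpcap or kernel headers. */
struct insn {
    uint16_t code;
    uint8_t  jt;    /* relative offset taken when the test is true  */
    uint8_t  jf;    /* relative offset taken when the test is false */
    uint32_t k;
};

#define CLASS_MASK 0x07
#define CLASS_JMP  0x05   /* BPF_JMP */
#define OP_MASK    0xf0
#define OP_JA      0x00   /* BPF_JA: unconditional, offset is in k */

/* Nonzero if a jump at index i with relative offset rel lands past pos.
 * Offsets are relative to the instruction after the jump. */
static int crosses(size_t i, uint32_t rel, size_t pos)
{
    return i + 1 + (size_t)rel > pos;
}

/*
 * Insert ins at index pos of prog[0..*len) and bump every jump that
 * crosses the insertion point.  Returns 0 on success, -1 if the buffer
 * is full or an 8-bit jt/jf offset would overflow (program untouched).
 */
int insert_insn(struct insn *prog, size_t *len, size_t cap,
                size_t pos, struct insn ins)
{
    size_t i;

    assert(pos <= *len);
    if (*len >= cap)
        return -1;

    /* Pass 1: check for 8-bit overflow before modifying anything.
     * (BPF_JA uses the 32-bit k field, so it is not checked here.) */
    for (i = 0; i < pos; i++) {
        if ((prog[i].code & CLASS_MASK) != CLASS_JMP ||
            (prog[i].code & OP_MASK) == OP_JA)
            continue;
        if ((crosses(i, prog[i].jt, pos) && prog[i].jt == UINT8_MAX) ||
            (crosses(i, prog[i].jf, pos) && prog[i].jf == UINT8_MAX))
            return -1;
    }

    /* Pass 2: only jumps located before pos can cross it, because
     * later instructions move together with their targets. */
    for (i = 0; i < pos; i++) {
        if ((prog[i].code & CLASS_MASK) != CLASS_JMP)
            continue;
        if ((prog[i].code & OP_MASK) == OP_JA) {
            if (crosses(i, prog[i].k, pos))
                prog[i].k++;
        } else {
            if (crosses(i, prog[i].jt, pos))
                prog[i].jt++;
            if (crosses(i, prog[i].jf, pos))
                prog[i].jf++;
        }
    }

    memmove(&prog[pos + 1], &prog[pos], (*len - pos) * sizeof(*prog));
    prog[pos] = ins;
    (*len)++;
    return 0;
}
```

Even this simple case needs an overflow check on the 8-bit jt/jf
fields, and it says nothing about deciding where to insert or whether
the accumulator has to be reloaded afterwards, which is the hard part.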

Basically, take a look at the output of "tcpdump -d tcp port 22 or
(vlan and tcp port 22)".  Are the IPv4 tcp ports at x+14/x+16, or at
x+18/x+20?  If we're filtering in the kernel, they're at x+14/x+16
whether the packet is vlan tagged or not.  If we're filtering on the
actual packet contents (from a savefile, for example), they're at
x+18/x+20 if the packet is vlan tagged.
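
The offset arithmetic can be sketched as follows, for a frame whose
802.1Q tag is present in the packet bytes (the savefile case).  The
function names are hypothetical and this is not how libpcap computes
offsets; it assumes Ethernet, at most one tag, and IPv4, with x being
4 times the IP header length field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETHER_HDR_LEN_   14       /* dst MAC + src MAC + ethertype */
#define VLAN_TAG_LEN_     4       /* 802.1Q TPID + TCI             */
#define ETHERTYPE_8021Q_  0x8100

/* Read a big-endian 16-bit value. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Offset of the IPv4 header: 14 for an untagged frame, 18 when one
 * 802.1Q tag is present in the packet bytes.  Returns 0 if the frame
 * is too short to tell. */
size_t ip_offset(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    if (len < ETHER_HDR_LEN_)
        return 0;
    if (rd16(frame + 12) == ETHERTYPE_8021Q_)
        return ETHER_HDR_LEN_ + VLAN_TAG_LEN_;
    return ETHER_HDR_LEN_;
}

/* Offset of the TCP destination port: x + 16 untagged, x + 20 tagged,
 * where x = 4 * (IPv4 header length field).  The source port sits two
 * bytes earlier.  Returns 0 if the frame is too short. */
size_t tcp_dport_offset(const uint8_t *frame, size_t len)
{
    size_t ip_off = ip_offset(frame, len);
    size_t x, off;

    if (ip_off == 0 || len <= ip_off)
        return 0;
    x = 4u * (size_t)(frame[ip_off] & 0x0f);
    off = x + ip_off + 2;
    return (off + 2 <= len) ? off : 0;
}
```

With a 20-byte IP header this gives 36 for an untagged frame and 40 for
a tagged one.  A filter run in the kernel, where the tag is not in the
packet data, would see the first value for both.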

Also, an expression such as 'tcp port 22' would have to have some
instructions added at the beginning, for "vlan-tagged == false", or it
would match both tagged and untagged packets.
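
As a rough sketch of what "adding instructions at the beginning" might
look like, the following wraps an existing program with a guard that
rejects the packet unless a "vlan tag present" metadata word is zero.
Everything here is an assumption for illustration: the names are
hypothetical, and vlan_present_off stands for whatever offset the
kernel would use to expose that metadata, which this sketch does not
know.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in with the classic BPF instruction layout. */
struct insn {
    uint16_t code;
    uint8_t  jt;
    uint8_t  jf;
    uint32_t k;
};

/* Classic BPF opcodes used below. */
#define OP_LD_W_ABS 0x20   /* ld  [k]         (BPF_LD|BPF_W|BPF_ABS)  */
#define OP_JEQ_K    0x15   /* jeq #k, jt, jf  (BPF_JMP|BPF_JEQ|BPF_K) */
#define OP_RET_K    0x06   /* ret #k          (BPF_RET|BPF_K)         */

/*
 * Rewrite prog[0..*len) in place as:
 *
 *     ld   [vlan_present_off]     ; hypothetical metadata load
 *     jeq  #0, jt 0, jf <reject>  ; continue only if untagged
 *     ...original program...
 *     ret  #0                     ; <reject>
 *
 * The original instructions move as one block, so their relative
 * jumps need no adjustment.  Returns 0 on success, -1 if the buffer
 * is too small or the reject target is out of reach of an 8-bit jf.
 */
int prepend_untagged_guard(struct insn *prog, size_t *len, size_t cap,
                           uint32_t vlan_present_off)
{
    size_t n = *len;

    assert(n > 0);
    if (n > UINT8_MAX || n + 3 > cap)
        return -1;

    memmove(&prog[2], &prog[0], n * sizeof(*prog));
    prog[0] = (struct insn){ OP_LD_W_ABS, 0, 0, vlan_present_off };
    /* From index 1, offset n lands on index n + 2, the final ret #0. */
    prog[1] = (struct insn){ OP_JEQ_K, 0, (uint8_t)n, 0 };
    prog[n + 2] = (struct insn){ OP_RET_K, 0, 0, 0 };
    *len = n + 3;
    return 0;
}
```

Prepending is the easy case because nothing jumps across the new
instructions; a guard needed in the middle of an expression runs into
the jump adjustments described earlier.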

This would be much more straightforward to deal with in the code
generation phase, except that until now the code generation phase
hasn't known whether the filter is headed for the kernel or not.

  Bill
