* [PATCH net-next 0/2] tools: psock_tpacket bug fixes @ 2017-01-03 23:27 Sowmini Varadhan 2017-01-03 23:27 ` [PATCH net-next 1/2] tools: psock_lib: tighten conditions checked in sock_setfilter Sowmini Varadhan 2017-01-03 23:27 ` [PATCH net-next 2/2] tools: psock_tpacket: verify that packet was received on lo before counting it Sowmini Varadhan 0 siblings, 2 replies; 12+ messages in thread From: Sowmini Varadhan @ 2017-01-03 23:27 UTC (permalink / raw) To: linux-kselftest, netdev, sowmini.varadhan; +Cc: daniel, willemb, davem, shuah This patchset includes fixes to psock_tpacket for false-negatives sporadically reported by the test when it was run concurrently with other heavy network traffic (e.g., over an ssh session, as opposed to running the test from the console of the test machine). The test sometimes failed with errors reporting more recvd packets than expected (e.g., "walk_v0_rx: received 201 out of 100 pkts") or the reception of non-IP packets (e.g., ARP packets). There are 2 problems with the test: 1. set_sockfilter() has many bugs (e.g. it will allow TCP packets that may happen to match on the length field and happen to have the 'a' or 'b' at the location tested by the filter). In addition, it is very hard to read, making these bugs harder to find. 2. There is a race-window between packet_create() and packet_do_bind() in which packets from any interface (e.g., eth0) will get queued for Rx on the test socket. These packets should not be counted toward the test. Patch 1 fixes the first issue by cleaing up set_sockfilter() and hardening it to make sure that it only permits UDP/IPv4 packets. Patch 2 fixes the second issue by checking the Rx ifindex passed up with the tpacket rx frame and making sure that packets counted toward the test are those received on lo. Sowmini Varadhan (2): tools: tighten conditions checked in sock_setfilter tools: psock_tpacket: verify that packet was received on lo before counting it tools/testing/selftests/net/psock_lib.h | 28 ++++++++++++++++++++------ tools/testing/selftests/net/psock_tpacket.c | 25 +++++++++++++++++++---- 2 files changed, 41 insertions(+), 12 deletions(-) ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net-next 1/2] tools: psock_lib: tighten conditions checked in sock_setfilter 2017-01-03 23:27 [PATCH net-next 0/2] tools: psock_tpacket bug fixes Sowmini Varadhan @ 2017-01-03 23:27 ` Sowmini Varadhan 2017-01-04 14:27 ` Willem de Bruijn 2017-01-03 23:27 ` [PATCH net-next 2/2] tools: psock_tpacket: verify that packet was received on lo before counting it Sowmini Varadhan 1 sibling, 1 reply; 12+ messages in thread From: Sowmini Varadhan @ 2017-01-03 23:27 UTC (permalink / raw) To: linux-kselftest, netdev, sowmini.varadhan; +Cc: daniel, willemb, davem, shuah The bpf_prog used in sock_setfilter() only attempts to check for ip pktlen, and verifies that the contents of the 80'th packet in the ethernet frame is 'a' or 'b'. Offsets used for checking these conditions are incorrectly computed. Thus many non-udp packets could incorrectly pass through this filter and cause the test to fail. This commit tightens the conditions checked by the filter so that only UDP/IPv4 packets with the matching length and test-character will be permitted by the filter. The filter has been cleaned up to explicitly use the BPF macros to make it more readable. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> --- tools/testing/selftests/net/psock_lib.h | 28 +++++++++++++++++++++------- 1 files changed, 21 insertions(+), 7 deletions(-) diff --git a/tools/testing/selftests/net/psock_lib.h b/tools/testing/selftests/net/psock_lib.h index 24bc7ec..e62540e 100644 --- a/tools/testing/selftests/net/psock_lib.h +++ b/tools/testing/selftests/net/psock_lib.h @@ -27,6 +27,7 @@ #include <string.h> #include <arpa/inet.h> #include <unistd.h> +#include <netinet/udp.h> #define DATA_LEN 100 #define DATA_CHAR 'a' @@ -40,14 +41,27 @@ static __maybe_unused void sock_setfilter(int fd, int lvl, int optnum) { + uint16_t ip_len = DATA_LEN + + sizeof(struct iphdr) + sizeof(struct udphdr); + /* the filter below checks for all of the following conditions that + * are based on the contents of create_payload() + * ether type 0x800 and + * ip proto udp and + * ip len == ip_len and + * udp[38] == 'a' or udp[38] == 'b' + */ struct sock_filter bpf_filter[] = { - { 0x80, 0, 0, 0x00000000 }, /* LD pktlen */ - { 0x35, 0, 4, DATA_LEN }, /* JGE DATA_LEN [f goto nomatch]*/ - { 0x30, 0, 0, 0x00000050 }, /* LD ip[80] */ - { 0x15, 1, 0, DATA_CHAR }, /* JEQ DATA_CHAR [t goto match]*/ - { 0x15, 0, 1, DATA_CHAR_1}, /* JEQ DATA_CHAR_1 [t goto match]*/ - { 0x06, 0, 0, 0x00000060 }, /* RET match */ - { 0x06, 0, 0, 0x00000000 }, /* RET no match */ + BPF_STMT(BPF_LD | BPF_H | BPF_ABS, 12), /* LD ethertype */ + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ETH_P_IP, 0, 8), + BPF_STMT(BPF_LD|BPF_B|BPF_ABS, 23), /* LD ip_proto */ + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, IPPROTO_UDP, 0, 6), + BPF_STMT(BPF_LD|BPF_H|BPF_ABS, 16), /* LD ip_len */ + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ip_len, 0, 4), + BPF_STMT(BPF_LD|BPF_B|BPF_ABS, 80), /* LD udp[38] */ + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, DATA_CHAR, 1, 0), + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, DATA_CHAR_1, 0, 1), + BPF_STMT(BPF_RET | BPF_K, ~0), /* match */ + BPF_STMT(BPF_RET | BPF_K, 0) /* no match */ }; struct sock_fprog bpf_prog; -- 1.7.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 1/2] tools: psock_lib: tighten conditions checked in sock_setfilter 2017-01-03 23:27 ` [PATCH net-next 1/2] tools: psock_lib: tighten conditions checked in sock_setfilter Sowmini Varadhan @ 2017-01-04 14:27 ` Willem de Bruijn 0 siblings, 0 replies; 12+ messages in thread From: Willem de Bruijn @ 2017-01-04 14:27 UTC (permalink / raw) To: Sowmini Varadhan Cc: linux-kselftest, Network Development, Daniel Borkmann, Willem de Bruijn, David Miller, shuah On Tue, Jan 3, 2017 at 6:27 PM, Sowmini Varadhan <sowmini.varadhan@oracle.com> wrote: > The bpf_prog used in sock_setfilter() only attempts to check for > ip pktlen, and verifies that the contents of the 80'th packet in 80th byte. > the ethernet frame is 'a' or 'b'. Offsets used for checking these > conditions are incorrectly computed. That's a bit strong. It's not incorrect, in that it just intended to match the packets as generated by the test -- which it does. > Thus many non-udp packets > could incorrectly pass through this filter and cause the test to > fail. Absolutely. The test has many potential false positives. Thanks for hardening it. I run these kinds of tests in network namespaces to rule out such flakiness from background traffic. > This commit tightens the conditions checked by the filter so > that only UDP/IPv4 packets with the matching length and test-character > will be permitted by the filter. The filter has been cleaned up > to explicitly use the BPF macros to make it more readable. > > Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> The few comments about the commit wording are no reason for a v2. Thanks for improving the test. Acked-by: Willem de Bruijn <willemb@google.com> ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net-next 2/2] tools: psock_tpacket: verify that packet was received on lo before counting it 2017-01-03 23:27 [PATCH net-next 0/2] tools: psock_tpacket bug fixes Sowmini Varadhan 2017-01-03 23:27 ` [PATCH net-next 1/2] tools: psock_lib: tighten conditions checked in sock_setfilter Sowmini Varadhan @ 2017-01-03 23:27 ` Sowmini Varadhan 2017-01-04 14:30 ` Willem de Bruijn 1 sibling, 1 reply; 12+ messages in thread From: Sowmini Varadhan @ 2017-01-03 23:27 UTC (permalink / raw) To: linux-kselftest, netdev, sowmini.varadhan; +Cc: daniel, willemb, davem, shuah Packets from any/all interfaces may be queued up on the PF_PACKET socket before it is bound to the loopback interface by psock_tpacket, and when these are passed up by the kernel, they should not be counted toward the conditions needed to pass/fail the Rx tests. psock_tpacket discards these packets by examining the sll_ifindex sent up in each frame and ensuring that this is the same as the ifindex that was used in bind_ring() Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> --- tools/testing/selftests/net/psock_tpacket.c | 25 ++++++++++++++++++++----- 1 files changed, 20 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/net/psock_tpacket.c b/tools/testing/selftests/net/psock_tpacket.c index 4a1bc64..3c3f1aa 100644 --- a/tools/testing/selftests/net/psock_tpacket.c +++ b/tools/testing/selftests/net/psock_tpacket.c @@ -235,6 +235,7 @@ static void walk_v1_v2_rx(int sock, struct ring *ring) int udp_sock[2]; union frame_map ppd; unsigned int frame_num = 0; + int ifindex = ring->ll.sll_ifindex; bug_on(ring->type != PACKET_RX_RING); @@ -255,12 +256,18 @@ static void walk_v1_v2_rx(int sock, struct ring *ring) switch (ring->version) { case TPACKET_V1: + if (ppd.v1->s_ll.sll_ifindex != ifindex) + goto skip; + test_payload((uint8_t *) ppd.raw + ppd.v1->tp_h.tp_mac, ppd.v1->tp_h.tp_snaplen); total_bytes += ppd.v1->tp_h.tp_snaplen; break; case TPACKET_V2: + if (ppd.v2->s_ll.sll_ifindex != ifindex) + goto skip; + test_payload((uint8_t *) ppd.raw + ppd.v2->tp_h.tp_mac, ppd.v2->tp_h.tp_snaplen); total_bytes += ppd.v2->tp_h.tp_snaplen; @@ -270,6 +277,7 @@ static void walk_v1_v2_rx(int sock, struct ring *ring) status_bar_update(); total_packets++; +skip: __v1_v2_rx_user_ready(ppd.raw, ring->version); frame_num = (frame_num + 1) % ring->rd_num; @@ -553,11 +561,13 @@ static void __v3_test_block_header(struct block_desc *pbd, const int block_num) __v3_test_block_seq_num(pbd); } -static void __v3_walk_block(struct block_desc *pbd, const int block_num) +static void __v3_walk_block(struct block_desc *pbd, const int block_num, + int ifindex) { int num_pkts = pbd->h1.num_pkts, i; unsigned long bytes = 0, bytes_with_padding = ALIGN_8(sizeof(*pbd)); struct tpacket3_hdr *ppd; + struct sockaddr_ll *s_ll; __v3_test_block_header(pbd, block_num); @@ -572,10 +582,15 @@ static void __v3_walk_block(struct block_desc *pbd, const int block_num) else bytes_with_padding += ALIGN_8(ppd->tp_snaplen + ppd->tp_mac); - test_payload((uint8_t *) ppd + ppd->tp_mac, ppd->tp_snaplen); + s_ll = (struct sockaddr_ll *)&ppd[1]; - status_bar_update(); - total_packets++; + if (ifindex == s_ll->sll_ifindex) { + test_payload((uint8_t *) ppd + ppd->tp_mac, + ppd->tp_snaplen); + + status_bar_update(); + total_packets++; + } ppd = (struct tpacket3_hdr *) ((uint8_t *) ppd + ppd->tp_next_offset); __sync_synchronize(); @@ -616,7 +631,7 @@ static void walk_v3_rx(int sock, struct ring *ring) while ((pbd->h1.block_status & TP_STATUS_USER) == 0) poll(&pfd, 1, 1); - __v3_walk_block(pbd, block_num); + __v3_walk_block(pbd, block_num, ring->ll.sll_ifindex); __v3_flush_block(pbd); block_num = (block_num + 1) % ring->rd_num; -- 1.7.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 2/2] tools: psock_tpacket: verify that packet was received on lo before counting it 2017-01-03 23:27 ` [PATCH net-next 2/2] tools: psock_tpacket: verify that packet was received on lo before counting it Sowmini Varadhan @ 2017-01-04 14:30 ` Willem de Bruijn 2017-01-04 14:44 ` Sowmini Varadhan 0 siblings, 1 reply; 12+ messages in thread From: Willem de Bruijn @ 2017-01-04 14:30 UTC (permalink / raw) To: Sowmini Varadhan Cc: linux-kselftest, Network Development, Daniel Borkmann, Willem de Bruijn, David Miller, shuah On Tue, Jan 3, 2017 at 6:27 PM, Sowmini Varadhan <sowmini.varadhan@oracle.com> wrote: > Packets from any/all interfaces may be queued up on the PF_PACKET socket > before it is bound to the loopback interface by psock_tpacket, and > when these are passed up by the kernel, they should not be counted > toward the conditions needed to pass/fail the Rx tests. The common and simpler solution to this problem is to open the socket with protocol 0 to reject all packets, add the BPF filter and only then bind with sll_ifindex set to lo. That way no false positives can arrive. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 2/2] tools: psock_tpacket: verify that packet was received on lo before counting it 2017-01-04 14:30 ` Willem de Bruijn @ 2017-01-04 14:44 ` Sowmini Varadhan 2017-01-04 15:03 ` Willem de Bruijn 0 siblings, 1 reply; 12+ messages in thread From: Sowmini Varadhan @ 2017-01-04 14:44 UTC (permalink / raw) To: Willem de Bruijn Cc: linux-kselftest, Network Development, Daniel Borkmann, Willem de Bruijn, David Miller, shuah On (01/04/17 09:30), Willem de Bruijn wrote: > > The common and simpler solution to this problem is to open the socket > with protocol 0 to reject all packets, add the BPF filter and only then bind > with sll_ifindex set to lo. That way no false positives can arrive. Yes, I thought of that too (and I've seen that done in one commercial implementation), but given that tpacket nicely returns the incoming interface, I figured, why not use the test prog to use this (thus verifying it, and also showing how to use it) --Sowmini ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 2/2] tools: psock_tpacket: verify that packet was received on lo before counting it 2017-01-04 14:44 ` Sowmini Varadhan @ 2017-01-04 15:03 ` Willem de Bruijn 2017-01-04 15:13 ` Sowmini Varadhan 0 siblings, 1 reply; 12+ messages in thread From: Willem de Bruijn @ 2017-01-04 15:03 UTC (permalink / raw) To: Sowmini Varadhan Cc: linux-kselftest, Network Development, Daniel Borkmann, Willem de Bruijn, David Miller, shuah On Wed, Jan 4, 2017 at 9:44 AM, Sowmini Varadhan <sowmini.varadhan@oracle.com> wrote: > On (01/04/17 09:30), Willem de Bruijn wrote: >> >> The common and simpler solution to this problem is to open the socket >> with protocol 0 to reject all packets, add the BPF filter and only then bind >> with sll_ifindex set to lo. That way no false positives can arrive. > > Yes, I thought of that too (and I've seen that done in one commercial > implementation), but given that tpacket nicely returns the incoming > interface, I figured, why not use the test prog to use this (thus > verifying it, and also showing how to use it) This approach is less restrictive. It still allows incorrect packets to be enqueued in the time between the socket call and attaching the bpf filter. Also, if packets are restricted to a single packet, using bind with sll_ifindex is simpler. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 2/2] tools: psock_tpacket: verify that packet was received on lo before counting it 2017-01-04 15:03 ` Willem de Bruijn @ 2017-01-04 15:13 ` Sowmini Varadhan 2017-01-04 16:07 ` Willem de Bruijn 0 siblings, 1 reply; 12+ messages in thread From: Sowmini Varadhan @ 2017-01-04 15:13 UTC (permalink / raw) To: Willem de Bruijn Cc: linux-kselftest, Network Development, Daniel Borkmann, Willem de Bruijn, David Miller, shuah On (01/04/17 10:03), Willem de Bruijn wrote: > > This approach is less restrictive. It still allows incorrect packets > to be enqueued in the time between the socket call and attaching the > bpf filter. Also, if packets are restricted to a single packet, using > bind with sll_ifindex is simpler. Do you want me to change this to first set up pfsocket() with proto 0, then set up filter, and then bind_ring() to the desired ifindex with ETH_P_ALL? I can spin out v2 (and if I have to that, I can also fix the comments) if you feel strongly about it. --Sowmini ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 2/2] tools: psock_tpacket: verify that packet was received on lo before counting it 2017-01-04 15:13 ` Sowmini Varadhan @ 2017-01-04 16:07 ` Willem de Bruijn 2017-01-04 16:12 ` Sowmini Varadhan 0 siblings, 1 reply; 12+ messages in thread From: Willem de Bruijn @ 2017-01-04 16:07 UTC (permalink / raw) To: Sowmini Varadhan Cc: linux-kselftest, Network Development, Daniel Borkmann, Willem de Bruijn, David Miller, shuah On Wed, Jan 4, 2017 at 10:13 AM, Sowmini Varadhan <sowmini.varadhan@oracle.com> wrote: > On (01/04/17 10:03), Willem de Bruijn wrote: >> >> This approach is less restrictive. It still allows incorrect packets >> to be enqueued in the time between the socket call and attaching the >> bpf filter. Also, if packets are restricted to a single packet, using >> bind with sll_ifindex is simpler. > > Do you want me to change this to first set up pfsocket() with > proto 0, then set up filter, and then bind_ring() to the desired > ifindex with ETH_P_ALL? Please do. Then the patch is just a one-line change to the third argument of the socket call. Thanks! ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 2/2] tools: psock_tpacket: verify that packet was received on lo before counting it 2017-01-04 16:07 ` Willem de Bruijn @ 2017-01-04 16:12 ` Sowmini Varadhan 2017-01-04 16:24 ` Willem de Bruijn 0 siblings, 1 reply; 12+ messages in thread From: Sowmini Varadhan @ 2017-01-04 16:12 UTC (permalink / raw) To: Willem de Bruijn Cc: linux-kselftest, Network Development, Daniel Borkmann, Willem de Bruijn, David Miller, shuah On (01/04/17 11:07), Willem de Bruijn wrote: > > Please do. Then the patch is just a one-line change to > the third argument of the socket call. Thanks! ok but it's going to be more than a one-line change. Today you have sock = pfsocket(version); memset(&ring, 0, sizeof(ring)); setup_ring(sock, &ring, version, type); mmap_ring(sock, &ring); bind_ring(sock, &ring); walk_ring(sock, &ring); And the pair_udp_setfilter() only gets set up out of walk_ring. I think you want to move that up to be done before bind_ring. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 2/2] tools: psock_tpacket: verify that packet was received on lo before counting it 2017-01-04 16:12 ` Sowmini Varadhan @ 2017-01-04 16:24 ` Willem de Bruijn 2017-01-04 16:27 ` Sowmini Varadhan 0 siblings, 1 reply; 12+ messages in thread From: Willem de Bruijn @ 2017-01-04 16:24 UTC (permalink / raw) To: Sowmini Varadhan Cc: linux-kselftest, Network Development, Daniel Borkmann, Willem de Bruijn, David Miller, shuah On Wed, Jan 4, 2017 at 11:12 AM, Sowmini Varadhan <sowmini.varadhan@oracle.com> wrote: > On (01/04/17 11:07), Willem de Bruijn wrote: >> >> Please do. Then the patch is just a one-line change to >> the third argument of the socket call. Thanks! > > ok but it's going to be more than a one-line change. Today you > have > sock = pfsocket(version); > memset(&ring, 0, sizeof(ring)); > setup_ring(sock, &ring, version, type); > mmap_ring(sock, &ring); > bind_ring(sock, &ring); > walk_ring(sock, &ring); > > And the pair_udp_setfilter() only gets set up out of > walk_ring. I think you want to move that up to be done before > bind_ring. Oh, good point. It may require some more refactoring. Feel free to leave it for me if you prefer. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next 2/2] tools: psock_tpacket: verify that packet was received on lo before counting it 2017-01-04 16:24 ` Willem de Bruijn @ 2017-01-04 16:27 ` Sowmini Varadhan 0 siblings, 0 replies; 12+ messages in thread From: Sowmini Varadhan @ 2017-01-04 16:27 UTC (permalink / raw) To: Willem de Bruijn Cc: linux-kselftest, Network Development, Daniel Borkmann, Willem de Bruijn, David Miller, shuah On (01/04/17 11:24), Willem de Bruijn wrote: > > Oh, good point. It may require some more refactoring. Feel free to > leave it for me if you prefer. actually it may not be so bad, let me do it, since I already have a reliable way of reproducing this.. --Sowmini ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2017-01-04 16:27 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-01-03 23:27 [PATCH net-next 0/2] tools: psock_tpacket bug fixes Sowmini Varadhan 2017-01-03 23:27 ` [PATCH net-next 1/2] tools: psock_lib: tighten conditions checked in sock_setfilter Sowmini Varadhan 2017-01-04 14:27 ` Willem de Bruijn 2017-01-03 23:27 ` [PATCH net-next 2/2] tools: psock_tpacket: verify that packet was received on lo before counting it Sowmini Varadhan 2017-01-04 14:30 ` Willem de Bruijn 2017-01-04 14:44 ` Sowmini Varadhan 2017-01-04 15:03 ` Willem de Bruijn 2017-01-04 15:13 ` Sowmini Varadhan 2017-01-04 16:07 ` Willem de Bruijn 2017-01-04 16:12 ` Sowmini Varadhan 2017-01-04 16:24 ` Willem de Bruijn 2017-01-04 16:27 ` Sowmini Varadhan
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.