* Re: Single socket with TX_RING and RX_RING
From: Paul Chavent @ 2013-05-20 20:50 UTC
To: Ricardo Tubío; +Cc: netdev
[-- Attachment #1: Type: text/plain, Size: 3388 bytes --]
On 05/15/2013 03:32 PM, Ricardo Tubío wrote:
> Daniel Borkmann <dborkman <at> redhat.com> writes:
>
>>
>> On 05/15/2013 02:53 PM, Ricardo Tubío wrote:
>>> Once I tell kernel to export the TX_RING through setsockopt() (see code
>>> below) I always get an error (EBUSY) if i try to tell kernel to export the
>>> RX_RING with the same socket descriptor. Therefore, I have to open an
>>> additional socket for the RX_RING and I require of two sockets when I though
>>> that I would only require of one socket for both TX and RX using mmap()ed
>>> memory.
>>>
>>> Do I need both sockets or am I doing something wrong?
>>
>> The second time you call init_ring() in your code e.g. with TX_RING, where
>> you have previously set it up for the RX_RING. The kernel will give you
>> -EBUSY because the packet socket is already mmap(2)'ed.
>>
>
> Ok, so if I make the following system calls:
>
> void *ring=NULL;
> setsockopt(socket_fd, SOL_PACKET, PACKET_RX_RING, p, LEN__TPACKET_REQ);
> ring = mmap(NULL, ring_len, ring_access_flags, MAP_SHARED, socket_fd, 0);
>
> Would I be permitted to use the ring map obtained both for RX and for TX? If
> so, for me it is confusing to use PACKET_RX_RING if I can also TX data
> through that ring...
>
Hello Ricardo.
I managed to use the same socket and a single mmap()ed area for both the RX_RING and the TX_RING. Here is some sample code:
/* open socket */
sock_fd = socket(PF_PACKET, socket_type, htons(socket_protocol));
/* socket tuning and init */
[...]
/* rings geometry */
rx_packet_req.tp_block_size = pagesize << order;
rx_packet_req.tp_block_nr = 1;
rx_packet_req.tp_frame_size = frame_size;
rx_packet_req.tp_frame_nr = (rx_packet_req.tp_block_size / rx_packet_req.tp_frame_size) * rx_packet_req.tp_block_nr;
tx_packet_req = rx_packet_req;
/* set packet version */
setsockopt(sock_fd, SOL_PACKET, PACKET_VERSION, &version, sizeof(version));
/* set RX ring option */
setsockopt(sock_fd, SOL_PACKET, PACKET_RX_RING, &rx_packet_req, sizeof(rx_packet_req));
/* set TX ring option */
setsockopt(sock_fd, SOL_PACKET, PACKET_TX_RING, &tx_packet_req, sizeof(tx_packet_req));
/* map rx + tx buffer to userspace : they are in this order */
mmap_size =
rx_packet_req.tp_block_size * rx_packet_req.tp_block_nr +
tx_packet_req.tp_block_size * tx_packet_req.tp_block_nr ;
mmap_base = mmap(0, mmap_size, PROT_READ|PROT_WRITE, MAP_SHARED, sock_fd, 0);
/* get rx and tx buffer description */
rx_buffer_size = rx_packet_req.tp_block_size * rx_packet_req.tp_block_nr;
rx_buffer_addr = mmap_base;
rx_buffer_idx = 0;
rx_buffer_cnt = rx_packet_req.tp_block_size * rx_packet_req.tp_block_nr / rx_packet_req.tp_frame_size;
tx_buffer_size = tx_packet_req.tp_block_size * tx_packet_req.tp_block_nr;
tx_buffer_addr = mmap_base + rx_buffer_size;
tx_buffer_idx = 0;
tx_buffer_cnt = tx_packet_req.tp_block_size * tx_packet_req.tp_block_nr / tx_packet_req.tp_frame_size;
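The size and offset arithmetic above can be sketched on its own, without opening a socket. This is only an illustration under stated assumptions: `ring_layout_t`, `ring_layout_init` and `ring_frame_offset` are hypothetical names that do not appear in the attached code, and both rings are assumed to share one geometry, as in the snippet above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of the single mapping; not part of ethernet.c. */
typedef struct {
    size_t frame_size;  /* bytes per frame (same for RX and TX here) */
    size_t frame_nr;    /* frames per ring */
    size_t rx_offset;   /* RX ring sits at the start of the mapping */
    size_t tx_offset;   /* TX ring follows directly after the RX ring */
    size_t mmap_size;   /* length to pass to mmap() */
} ring_layout_t;

/* Round n (n >= 1) up to a power of two, as ethernet.c does for frame_size. */
static unsigned next_power_of_two(unsigned n)
{
    n--;
    n |= n >> 1;
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    n++;
    return n;
}

/* Both rings use the same geometry (tx_packet_req = rx_packet_req). */
static ring_layout_t ring_layout_init(size_t block_size, size_t block_nr,
                                      size_t frame_size)
{
    ring_layout_t l;

    /* the frame size is expected to divide the block size */
    assert(frame_size > 0 && block_size % frame_size == 0);

    l.frame_size = frame_size;
    l.frame_nr   = (block_size / frame_size) * block_nr;
    l.rx_offset  = 0;
    l.tx_offset  = block_size * block_nr;
    l.mmap_size  = 2 * block_size * block_nr;
    return l;
}

/* Byte offset of frame idx from mmap_base; is_tx selects the ring. */
static size_t ring_frame_offset(const ring_layout_t *l, int is_tx, size_t idx)
{
    assert(idx < l->frame_nr);
    return (is_tx ? l->tx_offset : l->rx_offset) + idx * l->frame_size;
}
```

Assuming a 4096-byte page, order 1 and a 1500-byte MTU, `ring_layout_init(4096 << 1, 1, next_power_of_two(1500 + 128))` yields 2048-byte frames, 4 frames per ring, a TX ring that starts 8192 bytes into the mapping, and a 16384-byte mapping.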
I have attached a complete (but probably outdated) code sample to this mail.
I have also started writing a kind of howto (in French) on packet mmap at this page: http://paul.chavent.free.fr/packet_mmap.html (it is a work in progress; I will add information on timestamping).
Regards.
Paul.
[-- Attachment #2: ethernet.c --]
[-- Type: text/plain, Size: 38648 bytes --]
/*
* This module allows sending and receiving Ethernet frames.
* The type of ethernet frames must be specified at compile time :
* - use 8021Q or not
* - tpid and tci
* - ethertype
* - filtering or not
*
* See /usr/src/linux/Documentation/networking/packet_mmap.txt for improvement
*
*
* Notes on packet mmap
*
* For tx example see :
* http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap#Example
* For rx example see :
* http://www.scaramanga.co.uk/code-fu/lincap.c
*
* (1) If we open the socket with SOCK_DGRAM, the tp_mac and the
* tp_net are the same (the mac header isn't provided by the
* user). Eg tp_mac=80 and tp_net=80. If we open the socket with
* SOCK_RAW, the tp_net = tp_mac + 14. Eg tp_mac=66 and tp_net=80.
* (see (6) for alignment)
*
* (2) Tx and rx are asymmetric. On tx we fill data at
* TPACKET2_HDRLEN - sizeof(struct sockaddr_ll)
* on rx we get data at (see (1))
* tp_mac
* or
* tp_net
*
* (3) The mmaping is made only once for the two sides. The map gives
* rx before tx.
*
* (4) tp_len is the real length of the frame; tp_snaplen is the
* length of the data stored in the ring buffer. If the
* tp_frame_size given in struct tpacket_req is too small for the
* frame, the data is truncated (tp_snaplen < tp_len) and, if the
* PACKET_COPY_THRESH sockopt is set, TP_STATUS_COPY is set in
* tp_status.
*
* (5) The minimum tp_frame_size for tx is the minimum size of the
* payload (including the mac header if SOCK_RAW is selected) plus :
* TPACKET2_HDRLEN - sizeof(struct sockaddr_ll) = 32
* The TPACKET2_HDRLEN - sizeof(struct sockaddr_ll) is always aligned
* to 16 bytes
*
*
* (6) The minimum tp_frame_size for rx is the minimum size of the
* payload (including the mac header if SOCK_RAW is selected) plus :
* ALIGN_16(TPACKET2_HDRLEN) + 16 + tp_reserve (=0) = 80 = tp_net
* The tp_net will always be aligned to 16 bytes boundaries
*
*
* RX FRAME STRUCTURE :
*
* Start (aligned to TPACKET_ALIGNMENT=16)   TPACKET_ALIGNMENT=16          TPACKET_ALIGNMENT=16
* v                                         v                             v
* |                                         |                             | tp_mac                 | tp_net
* | struct tpacket_hdr ... pad              | struct sockaddr_ll ... gap  | min(16, maclen) = 16   |
* |<--------------------------------------------------------------------->|<---------------------->|<---- ...
*                       tp_hdrlen = TPACKET2_HDRLEN                              if SOCK_RAW         user data
*
*
* TX FRAME STRUCTURE :
*
* Start (aligned to TPACKET_ALIGNMENT=16)   TPACKET_ALIGNMENT=16
* v                                         v
* |                                         |
* | struct tpacket_hdr ... pad              | struct sockaddr_ll ... gap
* |<--------------------------------------------------------------------->|
*                       tp_hdrlen = TPACKET2_HDRLEN
*                                           |<---- ...
*                                             user data
*
*
* TODO / IMPROVEMENTS
* vlan 802Q
* timestamp
* filtering
* set the mtu according to the tp_frame_size or set tp_frame_size according
* to the mtu ?
*/
#undef USE_FILTER
#define COOKED_PACKET
#undef P_8021Q
#define PATCHED_PACKET
#define _GNU_SOURCE
#include <assert.h> /* assert */
#include <stdio.h> /* printf */
#include <stdlib.h> /* calloc, free */
#include <string.h> /* memcpy */
#include <errno.h> /* errno, perror, etc */
#include <unistd.h> /* close */
#include <sys/ioctl.h> /* ioctl */
#include <arpa/inet.h> /* htons, ntohs */
#include <poll.h> /* poll */
#include <time.h> /* struct timespec */
#include <sys/timerfd.h> /* timerfd_create etc. */
#include <sys/mman.h> /* mmap */
#include <sys/socket.h> /* socket */
#include <net/if.h> /* ifreq, ifconf */
#include <net/ethernet.h> /* struct ether_header, ETH_ALEN, ... */
#include <linux/if_packet.h> /* packet mmap*/
#if defined(USE_FILTER)
#include <linux/types.h> /* attach filter */
#include <linux/filter.h> /* attach filter */
#endif
#include "ethernet.h"
#if !defined(NDEBUG)
#include "debug.h"
#endif
#define MIN(x,y) ((x)<(y)?(x):(y))
/******************************************************************************
* *
* *
*****************************************************************************/
static inline unsigned next_power_of_two(unsigned n)
{
n--;
n |= n >> 1;
n |= n >> 2;
n |= n >> 4;
n |= n >> 8;
n |= n >> 16;
n++;
return n;
}
/******************************************************************************
* *
* *
*****************************************************************************/
static const uint8_t broadcast_addr[6] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
/******************************************************************************
* *
* *
*****************************************************************************/
struct ethernet_s
{
#if !defined(NDEBUG)
int debug;
#endif
int timer_fd;
int sock_fd;
struct sockaddr_ll local_addr;
struct sockaddr_ll remote_addr;
unsigned mtu;
struct tpacket_req rx_packet_req;
struct tpacket_req tx_packet_req;
void * mmap_base;
unsigned mmap_size;
unsigned rx_buffer_size;
void * rx_buffer_addr;
unsigned rx_buffer_cnt;
unsigned rx_buffer_idx;
unsigned rx_buffer_payload_offset;
unsigned rx_buffer_payload_max_size;
unsigned tx_buffer_size;
void * tx_buffer_addr;
unsigned tx_buffer_cnt;
unsigned tx_buffer_idx;
unsigned tx_buffer_payload_offset;
unsigned tx_buffer_payload_max_size;
struct pollfd pollfd[2];
};
/******************************************************************************
* *
* *
*****************************************************************************/
/* http://standards.ieee.org/develop/regauth/ethertype/eth.txt */
#define ETH_TYPE 0x88b5
/******************************************************************************
* *
* *
*****************************************************************************/
#if !defined(COOKED_PACKET)
static const int socket_type = SOCK_DGRAM;
static const int socket_protocol = ETH_P_802_3;
static const int bind_protocol = ETH_P_802_2; // man packet section Notes
static const int send_protocol = ETH_TYPE;
#endif /* !defined(COOKED_PACKET) */
/******************************************************************************
* *
* *
*****************************************************************************/
#if defined(COOKED_PACKET) && !defined(P_8021Q)
static const int socket_type = SOCK_RAW;
static const int socket_protocol = ETH_P_802_3;
static const int bind_protocol = ETH_P_802_2; // man packet section Notes
static const int send_protocol = ETH_TYPE;
struct ether_header_s
{
uint8_t dhost[ETH_ALEN];
uint8_t shost[ETH_ALEN];
uint16_t type;
} __attribute__ ((__packed__));
typedef struct ether_header_s ether_header_t;
#endif /* defined(COOKED_PACKET) && !defined(P_8021Q) */
/******************************************************************************
* *
* *
*****************************************************************************/
#if defined(COOKED_PACKET) && defined(P_8021Q)
static const int socket_type = SOCK_RAW;
static const int socket_protocol = ETH_P_ALL;
static const int bind_protocol = ETH_P_ALL;
static const int send_protocol = ETH_TYPE;
struct ether_header_s
{
uint8_t dhost[ETH_ALEN];
uint8_t shost[ETH_ALEN];
uint16_t tpid;
uint16_t tci;
uint16_t type;
} __attribute__ ((__packed__));
typedef struct ether_header_s ether_header_t;
#define E_8021Q_TPID 0x8100
#define E_8021Q_TCI 0xEFFE
#define E_8021Q_PCP 0x7 /* priority : highest -> better, from 0 to 7 */
#define E_8021Q_CFI 0
#define E_8021Q_VID 0xFFE /* vlan id, from 0 (reserved) to 0xFFF (reserved) */
#endif /* defined(COOKED_PACKET) && defined(P_8021Q) */
/******************************************************************************
* *
* *
*****************************************************************************/
#if defined(USE_FILTER)
static struct sock_filter filt_prog_code[] =
{
#if defined(P_8021Q)
/* load and check tpid */
BPF_STMT(BPF_LD | BPF_H | BPF_ABS, 12), /* Load tpid */
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, E_8021Q_TPID, 1, 0),/* equal 8021Q_TPID */
BPF_STMT(BPF_RET | BPF_K, 0), /* reject */
/* load and check tci */
BPF_STMT(BPF_LD | BPF_H | BPF_ABS, 14), /* Load tci */
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, E_8021Q_TCI, 1, 0),/* equal 8021Q_TCI */
BPF_STMT(BPF_RET | BPF_K, 0), /* reject */
#endif /* defined(P_8021Q) */
BPF_STMT(BPF_LD | BPF_H | BPF_ABS, ETH_HDR_LEN - 2), /* Load ether type */
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ETH_TYPE, 1, 0), /* equal ETHER_TYPE */
BPF_STMT(BPF_RET | BPF_K, 0), /* reject */
BPF_STMT(BPF_RET | BPF_K, 65535), /* accept */
};
static struct sock_fprog filt_prog =
{
sizeof(filt_prog_code) / sizeof(filt_prog_code[0]),
filt_prog_code
};
#endif /* defined(USE_FILTER) */
/******************************************************************************
* *
* *
*****************************************************************************/
#if !defined(NDEBUG)
static void ethernet_debug_frame(const void * base);
static void ethernet_debug_packet_req(const struct tpacket_req * rx_packet_req, const struct tpacket_req * tx_packet_req);
#endif
/******************************************************************************
* *
* *
*****************************************************************************/
ethernet_t * ethernet_alloc()
{
ethernet_t * itf = calloc(1, sizeof(*itf));
if(itf)
{
itf->timer_fd = -1;
itf->sock_fd = -1;
itf->mmap_base = (void *)-1;
}
return itf;
}
/******************************************************************************
* *
* *
*****************************************************************************/
void ethernet_free(ethernet_t *itf)
{
/* check parameters */
assert(itf);
ethernet_close(itf);
free(itf);
}
/******************************************************************************
* *
* *
*****************************************************************************/
int ethernet_open(ethernet_t *itf, const char *itf_name)
{
struct ifreq ifr;
int err = 0;
socklen_t errlen = sizeof(err);
/* fill ifr name field */
memset(&ifr, 0, sizeof(ifr));
/* copy at most size - 1 bytes so the zeroed buffer stays NUL-terminated */
strncpy(ifr.ifr_name, itf_name, sizeof(ifr.ifr_name) - 1);
/* check parameters */
assert(itf);
/* cleanup */
ethernet_close(itf);
/* setup timer fd */
itf->timer_fd = timerfd_create(CLOCK_REALTIME, 0);
if(itf->timer_fd < 0)
{
perror("timerfd_create failed");
return -1;
}
/* open socket */
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "socket\n");
}
#endif
itf->sock_fd = socket(PF_PACKET, socket_type, htons(socket_protocol));
if(itf->sock_fd < 0)
{
perror("socket failed");
return -1;
}
#if defined(USE_FILTER)
/* attach filter */
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "setsockopt SO_ATTACH_FILTER\n");
}
#endif
if(setsockopt(itf->sock_fd, SOL_SOCKET, SO_ATTACH_FILTER, &filt_prog, sizeof(filt_prog)))
{
perror("setsockopt SO_ATTACH_FILTER failed");
return -1;
}
#endif /* defined(USE_FILTER) */
/* set local addr */
memset(&itf->local_addr, 0, sizeof(itf->local_addr));
itf->local_addr.sll_family = AF_PACKET;
itf->local_addr.sll_protocol = htons(bind_protocol);
/* get itf index */
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "ioctl SIOCGIFINDEX\n");
}
#endif
if(ioctl(itf->sock_fd, SIOCGIFINDEX, &ifr) == -1)
{
perror("ioctl SIOCGIFINDEX failed");
return -1;
}
itf->local_addr.sll_ifindex = ifr.ifr_ifindex;
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "if index %d\n", ifr.ifr_ifindex);
}
#endif
/* get own MAC address */
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "ioctl SIOCGIFHWADDR\n");
}
#endif
if(ioctl(itf->sock_fd, SIOCGIFHWADDR, &ifr) < 0)
{
perror("ioctl SIOCGIFHWADDR failed");
return -1;
}
itf->local_addr.sll_halen = ETH_ALEN;
memcpy(&itf->local_addr.sll_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "if mac addr %02x:%02x:%02x:%02x:%02x:%02x\n",
itf->local_addr.sll_addr[0], itf->local_addr.sll_addr[1],
itf->local_addr.sll_addr[2], itf->local_addr.sll_addr[3],
itf->local_addr.sll_addr[4], itf->local_addr.sll_addr[5]);
}
#endif
/* bind to eth */
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "bind\n");
}
#endif
if(bind(itf->sock_fd, (const void *)&itf->local_addr, sizeof(itf->local_addr)) == -1)
{
perror("bind failed");
return -1;
}
/* any pending errors, e.g., network is down? */
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "getsockopt SO_ERROR\n");
}
#endif
if(getsockopt(itf->sock_fd, SOL_SOCKET, SO_ERROR, &err, &errlen) == -1)
{
perror("getsockopt SO_ERROR failed");
return -1;
}
if(err > 0)
{
fprintf(stderr, "network is down ?\n");
return -1;
}
/* set remote addr */
itf->remote_addr = itf->local_addr;
itf->remote_addr.sll_protocol = htons(send_protocol);
memcpy(&itf->remote_addr.sll_addr, broadcast_addr, ETH_ALEN);
/* get own MTU */
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "ioctl SIOCGIFMTU\n");
}
#endif
if (ioctl(itf->sock_fd, SIOCGIFMTU, &ifr) < 0)
{
perror("ioctl SIOCGIFMTU failed");
return -1;
}
itf->mtu = ifr.ifr_mtu;
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "Mtu %d\n", itf->mtu);
}
#endif
/* prepare packet mmaping */
const long pagesize = sysconf(_SC_PAGESIZE); /* assume 4096 */
const unsigned order = 1;
const unsigned frame_size = next_power_of_two(itf->mtu + 128); /* 128 is an arbitrary value */
/* tp_block_size must be a multiple of PAGE_SIZE (here PAGE_SIZE << order) */
itf->rx_packet_req.tp_block_size = pagesize << order;
/* tp_block_nr */
itf->rx_packet_req.tp_block_nr = 1;
/* tp_frame_size must be greater than TPACKET2_HDRLEN and a multiple
* of TPACKET_ALIGNMENT. It should also be a divisor of tp_block_size */
itf->rx_packet_req.tp_frame_size = frame_size;
/* tp_frame_nr */
itf->rx_packet_req.tp_frame_nr = (itf->rx_packet_req.tp_block_size / itf->rx_packet_req.tp_frame_size) * itf->rx_packet_req.tp_block_nr;
/* sanity checks */
if(frame_size <= TPACKET2_HDRLEN)
{
fprintf(stderr, "frame_size (%u) must be greater than TPACKET2_HDRLEN (%u)\n", frame_size, (unsigned)TPACKET2_HDRLEN);
return -1;
}
if((frame_size % TPACKET_ALIGNMENT) != 0)
{
fprintf(stderr, "frame_size (%u) must be a multiple of TPACKET_ALIGNMENT (%u)\n", frame_size, TPACKET_ALIGNMENT);
return -1;
}
if((itf->rx_packet_req.tp_block_size % frame_size) != 0)
{
fprintf(stderr, "frame_size (%u) must be a divisor of tp_block_size (%u)\n", frame_size, itf->rx_packet_req.tp_block_size);
return -1;
}
/* same settings for tx */
itf->tx_packet_req = itf->rx_packet_req;
#if !defined(NDEBUG)
if(itf->debug)
{
ethernet_debug_packet_req(&itf->rx_packet_req, &itf->tx_packet_req);
}
#endif
/* set packet version option */
int version = TPACKET_V2;
if(setsockopt(itf->sock_fd, SOL_PACKET, PACKET_VERSION, &version, sizeof(version)) < 0)
{
perror("setsockopt: PACKET_VERSION");
return -1;
}
/* set RX ring option */
if (setsockopt(itf->sock_fd, SOL_PACKET, PACKET_RX_RING, &itf->rx_packet_req, sizeof(itf->rx_packet_req)) < 0)
{
perror("setsockopt: PACKET_RX_RING");
return -1;
}
/* set TX ring option*/
if (setsockopt(itf->sock_fd, SOL_PACKET, PACKET_TX_RING, &itf->tx_packet_req, sizeof(itf->tx_packet_req)) < 0)
{
perror("setsockopt: PACKET_TX_RING");
return -1;
}
/* map rx + tx buffer to userspace : they are in this order */
itf->mmap_size =
itf->rx_packet_req.tp_block_size * itf->rx_packet_req.tp_block_nr +
itf->tx_packet_req.tp_block_size * itf->tx_packet_req.tp_block_nr ;
itf->mmap_base = mmap(0, itf->mmap_size, PROT_READ|PROT_WRITE, MAP_SHARED, itf->sock_fd, 0);
if (itf->mmap_base == MAP_FAILED)
{
perror("mmap rx + tx buffers failed");
return -1;
}
/* get rx and tx buffer description */
itf->rx_buffer_size = itf->rx_packet_req.tp_block_size * itf->rx_packet_req.tp_block_nr;
itf->rx_buffer_addr = itf->mmap_base;
itf->rx_buffer_idx = 0;
itf->rx_buffer_cnt = itf->rx_packet_req.tp_block_size * itf->rx_packet_req.tp_block_nr / itf->rx_packet_req.tp_frame_size;
itf->tx_buffer_size = itf->tx_packet_req.tp_block_size * itf->tx_packet_req.tp_block_nr;
itf->tx_buffer_addr = itf->mmap_base + itf->rx_buffer_size;
itf->tx_buffer_idx = 0;
itf->tx_buffer_cnt = itf->tx_packet_req.tp_block_size * itf->tx_packet_req.tp_block_nr / itf->tx_packet_req.tp_frame_size;
/*
* Precompute payload offset and max size
* Warning : tx and rx are asymmetric
*/
/*
* - on rx we get data at tp_net (SOCK_DGRAM) and tp_mac if we need mac
* header (SOCK_RAW)
* the rx_buffer_payload_offset is the offset from the tp_net of the frame !
* For computing max size we consider the tp_net to be :
* TPACKET2_HDRLEN + 16 + reserve (=80)
* or
* TPACKET2_HDRLEN + min(16, maclen) + reserve
* see src/linux/net/packet/af_packet.c tpacket_rcv
*/
itf->rx_buffer_payload_offset = TPACKET_ALIGN(TPACKET2_HDRLEN + MIN(sizeof(ether_header_t), 16)); // only used here, use tp_net elsewhere
itf->rx_buffer_payload_max_size = itf->rx_packet_req.tp_frame_size - itf->rx_buffer_payload_offset;
/*
* - on tx we fill data at
* TPACKET2_HDRLEN - sizeof(struct sockaddr_ll)
* or
* TPACKET2_HDRLEN + min(16, maclen)
* see src/linux/net/packet/af_packet.c tpacket_fill_skb
*/
#if defined(PATCHED_PACKET)
itf->tx_buffer_payload_offset = TPACKET_ALIGN(TPACKET2_HDRLEN + MIN(sizeof(ether_header_t), 16));
#else /* defined(PATCHED_PACKET) */
itf->tx_buffer_payload_offset = (TPACKET2_HDRLEN - sizeof(struct sockaddr_ll));
#endif /* defined(PATCHED_PACKET) */
itf->tx_buffer_payload_max_size = itf->tx_packet_req.tp_frame_size - itf->tx_buffer_payload_offset;
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "rx_buffer_payload_max_size %u\n",
itf->rx_buffer_payload_max_size);
fprintf(stdout, "tx_buffer_payload_max_size %u\n",
itf->tx_buffer_payload_max_size);
}
#endif
#if defined(COOKED_PACKET)
/* for each packet we initialize the ethernet header */
ether_header_t ether_header;
memcpy(ether_header.dhost, &itf->remote_addr.sll_addr, sizeof(ether_header.dhost));
memcpy(ether_header.shost, &itf->local_addr.sll_addr, sizeof(ether_header.shost));
#if defined(P_8021Q)
ether_header.tpid = htons(E_8021Q_TPID);
ether_header.tci = htons(E_8021Q_TCI);
#endif /* defined(P_8021Q) */
ether_header.type = htons(send_protocol);
for(unsigned i = 0; i < itf->tx_buffer_cnt; i++)
{
void * base = itf->tx_buffer_addr + i * itf->tx_packet_req.tp_frame_size;
memcpy(base + itf->tx_buffer_payload_offset - sizeof(ether_header_t), &ether_header, sizeof(ether_header_t));
}
/* override the setting of the tx data offset and size */
/* apply the diffs */
itf->rx_buffer_payload_max_size -= sizeof(ether_header);
itf->tx_buffer_payload_max_size -= sizeof(ether_header);
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "rx_buffer_payload_max_size %u\n",
itf->rx_buffer_payload_max_size);
fprintf(stdout, "tx_buffer_payload_max_size %u\n",
itf->tx_buffer_payload_max_size);
}
#endif
#endif /* defined(COOKED_PACKET) */
/* threshold payload max size according to the mtu */
if(itf->mtu < itf->rx_buffer_payload_max_size)
{
itf->rx_buffer_payload_max_size = itf->mtu;
}
if(itf->mtu < itf->tx_buffer_payload_max_size)
{
itf->tx_buffer_payload_max_size = itf->mtu;
}
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "rx_buffer_payload_max_size %u\n", itf->rx_buffer_payload_max_size);
fprintf(stdout, "tx_buffer_payload_max_size %u\n", itf->tx_buffer_payload_max_size);
}
#endif
/* setup poll fd */
itf->pollfd[0].fd = itf->timer_fd;
itf->pollfd[0].events = POLLIN;
itf->pollfd[0].revents = 0;
itf->pollfd[1].fd = itf->sock_fd;
itf->pollfd[1].events = POLLIN|POLLRDNORM|POLLERR;
itf->pollfd[1].revents = 0;
return 0;
}
/******************************************************************************
* *
* *
*****************************************************************************/
void ethernet_close(ethernet_t * itf)
{
/* check parameters */
assert(itf);
/* */
if(itf->mmap_base != (void *)-1)
{
munmap(itf->mmap_base, itf->mmap_size);
itf->mmap_base = (void *)-1;
itf->mmap_size = 0;
}
/* close socket */
if(0 <= itf->sock_fd)
{
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "close\n");
}
#endif
close(itf->sock_fd);
itf->sock_fd = -1;
}
/* close timer */
if(0 <= itf->timer_fd)
{
close(itf->timer_fd);
itf->timer_fd = -1;
}
}
/******************************************************************************
* *
* *
* *
*****************************************************************************/
void ethernet_purge(ethernet_t * itf)
{
/* check parameters */
assert(itf);
/* get base address of the current rx frame */
void * base = itf->rx_buffer_addr + itf->rx_buffer_idx * itf->rx_packet_req.tp_frame_size;
volatile struct tpacket2_hdr * header = (struct tpacket2_hdr *)base;
while(header->tp_status != TP_STATUS_KERNEL)
{
/* load the next rx frame index */
if(itf->rx_buffer_idx < (itf->rx_buffer_cnt - 1))
{
itf->rx_buffer_idx ++;
}
else
{
itf->rx_buffer_idx = 0;
}
/* clear the status */
header->tp_status = TP_STATUS_KERNEL;
/* get base address of the current rx frame */
base = itf->rx_buffer_addr + itf->rx_buffer_idx * itf->rx_packet_req.tp_frame_size;
header = (struct tpacket2_hdr *)base;
}
}
/******************************************************************************
* *
* *
* *
*****************************************************************************/
int ethernet_rx_request(ethernet_t * itf, ethernet_msg_t * msg)
{
/* check parameters */
assert(itf && msg);
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "ethernet_rx_request\n");
}
#endif
if(msg->data || msg->data_len)
{
fprintf(stderr, "Rx message must be released before it is requested again.\n");
return -1;
}
/* get base address of the current rx frame */
void * base = itf->rx_buffer_addr + itf->rx_buffer_idx * itf->rx_packet_req.tp_frame_size;
volatile struct tpacket2_hdr * header = (struct tpacket2_hdr *)base;
/* check if we need to poll */
if(header->tp_status == TP_STATUS_KERNEL)
{
int err;
/* setup read timeout */
struct itimerspec to = {{0,0}, msg->to};
int flags = (msg->to_is_relative)?0:TFD_TIMER_ABSTIME;
err = timerfd_settime(itf->timer_fd, flags, &to, NULL);
if(err < 0)
{
perror("timerfd_settime failed");
return -1;
}
/* poll input */
itf->pollfd[0].revents = 0;
itf->pollfd[1].revents = 0;
err = ppoll(itf->pollfd, 2, NULL, NULL);
if(err < 0)
{
perror("ppoll failed");
fprintf(stderr, "revents = %hd %hd\n", itf->pollfd[0].revents, itf->pollfd[1].revents);
return -1;
}
#if !defined(NDEBUG)
else if(err == 0)
{
fprintf(stderr, "ppoll timeout unexpected\n");
return -1;
}
#endif
else if(itf->pollfd[0].revents == POLLIN)
{
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "timerfd timeout\n");
}
#endif
return 0;
}
#if !defined(NDEBUG)
else if(!itf->pollfd[1].revents)
{
fprintf(stderr, "event on socket expected\n");
return -1;
}
#endif
else if(itf->pollfd[1].revents & POLLERR)
{
fprintf(stderr, "error on socket poll\n");
return -1;
}
}
#if !defined(NDEBUG)
if(itf->debug)
{
ethernet_debug_frame(base);
}
#endif
/* so, here we have a frame ready to process */
/* load the next rx frame index */
if(itf->rx_buffer_idx < (itf->rx_buffer_cnt - 1))
{
itf->rx_buffer_idx ++;
}
else
{
itf->rx_buffer_idx = 0;
}
/* if the frame is good for reading */
if((header->tp_status == TP_STATUS_USER) && header->tp_snaplen)
{
/* give the caller the payload address and size */
msg->data = base + header->tp_net;
msg->data_len = header->tp_snaplen;
#if defined(COOKED_PACKET) // hope that header->tp_net - sizeof(ether_header_t) == header->tp_mac
assert((header->tp_net - sizeof(ether_header_t)) == header->tp_mac);
msg->data_len -= sizeof(ether_header_t);
#endif
return 0;
}
else
{
fprintf(stderr, "capture failed : revents %x, status %d, snap_len %d\n", itf->pollfd[1].revents, header->tp_status, header->tp_snaplen);
header->tp_status = TP_STATUS_KERNEL;
return -1;
}
}
/******************************************************************************
* *
* *
* *
*****************************************************************************/
int ethernet_rx_release(ethernet_t * itf, ethernet_msg_t * msg)
{
/* check parameters */
assert(itf && msg);
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "ethernet_rx_release\n");
}
#endif
if(!msg->data || !msg->data_len)
{
fprintf(stderr, "Rx message must be requested before it is released.\n");
return -1;
}
/* find the index of the frame associated to this data pointer */
int i = (msg->data - itf->rx_buffer_addr) / itf->rx_packet_req.tp_frame_size;
if((0 <= i) && ((unsigned)i < itf->rx_buffer_cnt))
{
void * base = itf->rx_buffer_addr + i * itf->rx_packet_req.tp_frame_size;
volatile struct tpacket2_hdr * header = (struct tpacket2_hdr *)base;
header->tp_status = TP_STATUS_KERNEL;
msg->data = 0;
msg->data_len = 0;
return 0;
}
else
{
fprintf(stderr, "Rx release addr out of range (%p).\n", msg->data);
return -1;
}
}
/******************************************************************************
* *
* *
* *
*****************************************************************************/
int ethernet_tx_request(ethernet_t * itf, ethernet_msg_t * msg)
{
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "ethernet_tx_request\n");
}
#endif
/* check parameters */
assert(itf && msg);
if(msg->data || msg->data_len)
{
fprintf(stderr, "Tx message must be released before it is requested again.\n");
return -1;
}
/* find the next available tx frame */
void * base;
volatile struct tpacket2_hdr * header;
do
{
/* get base address of the current tx frame */
base = itf->tx_buffer_addr + itf->tx_buffer_idx * itf->tx_packet_req.tp_frame_size;
header = (struct tpacket2_hdr *)base;
/* load the next tx frame index */
if(itf->tx_buffer_idx < (itf->tx_buffer_cnt - 1))
{
itf->tx_buffer_idx ++;
}
else
{
itf->tx_buffer_idx = 0;
}
} while(header->tp_status != TP_STATUS_AVAILABLE);
/* give the caller the payload address and size */
msg->data = base + itf->tx_buffer_payload_offset;
msg->data_len = itf->tx_buffer_payload_max_size;
#if !defined(NDEBUG)
if(itf->debug)
{
ethernet_debug_frame(base);
}
#endif
return 0;
}
/******************************************************************************
* *
* *
* *
*****************************************************************************/
int ethernet_tx_release(ethernet_t * itf, ethernet_msg_t * msg)
{
#if !defined(NDEBUG)
if(itf->debug)
{
fprintf(stdout, "ethernet_tx_release\n");
}
#endif
/* check parameters */
assert(itf && msg);
if(!msg->data || !msg->data_len)
{
fprintf(stderr, "Tx message must be requested before it is released.\n");
return -1;
}
if(itf->tx_buffer_payload_max_size < msg->data_len)
{
fprintf(stderr, "Tx request cannot be greater than %d bytes (requested %d).\n", itf->tx_buffer_payload_max_size, msg->data_len);
return -1;
}
/* an Ethernet payload is at least 46 bytes: pad shorter ones with zeros */
if(msg->data_len < 46)
{
memset(msg->data + msg->data_len, 0, 46 - msg->data_len);
msg->data_len = 46;
}
/* find the index of the frame associated to this data pointer */
int i = (msg->data - itf->tx_buffer_addr) / itf->tx_packet_req.tp_frame_size;
if((i < 0) || (itf->tx_buffer_cnt <= (unsigned)i))
{
fprintf(stderr, "Tx release addr out of range (%p).\n", msg->data);
return -1;
}
/* get base address of this tx frame */
void * base = itf->tx_buffer_addr + i * itf->tx_packet_req.tp_frame_size;
volatile struct tpacket2_hdr * header = (struct tpacket2_hdr *)base;
#if defined(PATCHED_PACKET)
/* update packet offset */
header->tp_net = itf->tx_buffer_payload_offset;
#endif /* defined(PATCHED_PACKET) */
/* update packet len */
header->tp_len = msg->data_len;
#if defined(COOKED_PACKET)
header->tp_len += sizeof(ether_header_t);
#endif
/* set the status to SEND_REQUEST (marks the frame for transmission) */
header->tp_status = TP_STATUS_SEND_REQUEST;
/* ask the kernel to send data */
ssize_t err;
err = sendto(itf->sock_fd, NULL, 0, 0, (const struct sockaddr *)&itf->remote_addr, sizeof(itf->remote_addr));
if(err < 0)
{
perror("sendto failed");
fprintf(stderr, "errno = %d\n", errno);
return -1;
}
else if(err == 0 )
{
/* nothing to do */
fprintf(stderr, "Kernel had nothing to send.\n");
return -1;
}
/* reset the tp_len : optional */
header->tp_len = 0;
/* release the buffer */
msg->data = 0;
msg->data_len = 0;
return 0;
}
/******************************************************************************
* Set the debug mode *
*****************************************************************************/
int ethernet_set_debug(ethernet_t * itf, int debug)
{
#if !defined(NDEBUG)
/* check parameters */
assert(itf);
int old_debug = itf->debug;
itf->debug = debug;
return old_debug;
#else
return 0;
#endif
}
/******************************************************************************
* Get the MAC address *
*****************************************************************************/
void ethernet_fill_with_mac_addr(ethernet_t * itf, uint8_t * addr, unsigned addr_len)
{
/* check parameters */
assert(itf && addr);
unsigned i;
for(i = 0; (i < itf->local_addr.sll_halen) && (i < addr_len); i++)
{
addr[i] = itf->local_addr.sll_addr[i];
}
for(; i < addr_len; i++)
{
addr[i] = 0;
}
}
/******************************************************************************
* *
* *
*****************************************************************************/
#if !defined(NDEBUG)
static void ethernet_debug_frame(const void * base)
{
fprintf(stdout, "buffer base addr %p\n", base);
const struct tpacket2_hdr * header = (const struct tpacket2_hdr *)base;
fprintf(stdout, "tpacket2_header :\n");
fprintf(stdout, " tp_status : 0x%02x\n", header->tp_status);
fprintf(stdout, " tp_len : %d\n", header->tp_len);
fprintf(stdout, " tp_snaplen : %d\n", header->tp_snaplen);
fprintf(stdout, " tp_mac : %d\n", header->tp_mac);
fprintf(stdout, " tp_net : %d\n", header->tp_net);
fprintf(stdout, " tp_sec : %d\n", header->tp_sec);
fprintf(stdout, " tp_nsec : %d\n", header->tp_nsec);
fprintf(stdout, " tp_vlan_tci : 0x%04x\n", header->tp_vlan_tci);
const struct sockaddr_ll * sll = (const struct sockaddr_ll *)(base + TPACKET_ALIGN(sizeof(struct tpacket2_hdr)));
fprintf(stdout, "sockaddr_ll :\n");
fprintf(stdout, " sll_family : 0x%02x\n", sll->sll_family);
fprintf(stdout, " sll_protocol : 0x%04x\n", sll->sll_protocol);
fprintf(stdout, " sll_ifindex : %d\n", sll->sll_ifindex);
fprintf(stdout, " sll_hatype : %d\n", sll->sll_hatype);
fprintf(stdout, " sll_pkttype : %d\n", sll->sll_pkttype);
fprintf(stdout, " sll_halen : %d\n", sll->sll_halen);
fprintf(stdout, " sll_addr[8] : %02x:%02x:%02x:%02x:%02x:%02x\n",
sll->sll_addr[0], sll->sll_addr[1], sll->sll_addr[2],
sll->sll_addr[3], sll->sll_addr[4], sll->sll_addr[5]);
}
#endif
/******************************************************************************
* *
* *
*****************************************************************************/
#if !defined(NDEBUG)
static void ethernet_debug_packet_req(const struct tpacket_req * rx_packet_req, const struct tpacket_req * tx_packet_req)
{
fprintf(stdout, "Pagesize = %ld\n", sysconf(_SC_PAGESIZE));
fprintf(stdout, "TPACKET_ALIGNMENT = %d\n", TPACKET_ALIGNMENT);
fprintf(stdout, "TPACKET2_HDRLEN = %zu\n", (size_t)TPACKET2_HDRLEN);
fprintf(stdout, "sizeof(struct sockaddr_ll) = %zu\n", sizeof(struct sockaddr_ll));
fprintf(stdout, "Rx packet req :\n");
fprintf(stdout, " tp_block_size = %d\n", rx_packet_req->tp_block_size);
fprintf(stdout, " tp_block_nr = %d\n", rx_packet_req->tp_block_nr);
fprintf(stdout, " tp_frame_size = %d\n", rx_packet_req->tp_frame_size);
fprintf(stdout, " tp_frame_nr = %d\n", rx_packet_req->tp_frame_nr);
fprintf(stdout, "Tx packet req :\n");
fprintf(stdout, " tp_block_size = %d\n", tx_packet_req->tp_block_size);
fprintf(stdout, " tp_block_nr = %d\n", tx_packet_req->tp_block_nr);
fprintf(stdout, " tp_frame_size = %d\n", tx_packet_req->tp_frame_size);
fprintf(stdout, " tp_frame_nr = %d\n", tx_packet_req->tp_frame_nr);
}
#endif
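Notes (1), (5) and (6) in the header comment of ethernet.c quote 32 as the tx data offset and 80 as tp_net on rx. The sketch below recomputes those figures from the macros in <linux/if_packet.h>, following the formulas written in the notes. It assumes tp_reserve is 0 unless stated otherwise, and `tx_data_offset` and `rx_net_offset` are hypothetical names used only for this illustration. The results depend on the structure sizes in the installed kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

/* The notes assume 16-byte frame alignment; fail the build otherwise. */
static_assert(TPACKET_ALIGNMENT == 16, "notes assume TPACKET_ALIGNMENT == 16");

/* Note (5): on tx, user data starts where struct sockaddr_ll would start. */
static size_t tx_data_offset(void)
{
    return TPACKET2_HDRLEN - sizeof(struct sockaddr_ll);
}

/* Note (6): tp_net on rx = ALIGN_16(TPACKET2_HDRLEN) + 16 + tp_reserve. */
static size_t rx_net_offset(size_t reserve)
{
    return TPACKET_ALIGN(TPACKET2_HDRLEN) + 16 + reserve;
}
```

With a 14-byte Ethernet header on a SOCK_RAW socket, tp_mac is then `rx_net_offset(0) - 14`, which matches the tp_mac=66 / tp_net=80 example in note (1).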