From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chema Gonzalez
Subject: Re: [PATCH v2] filter: added BPF random opcode
Date: Mon, 21 Apr 2014 14:54:12 -0700
Message-ID:
References: <1397585816-1267-1-git-send-email-chema@google.com>
 <1398097284-20528-1-git-send-email-chema@google.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Cc: David Miller, Eric Dumazet, Daniel Borkmann, Network Development
To: Alexei Starovoitov
Return-path:
Received: from mail-ig0-f176.google.com ([209.85.213.176]:45999 "EHLO
 mail-ig0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1755341AbaDUVyN (ORCPT ); Mon, 21 Apr 2014 17:54:13 -0400
Received: by mail-ig0-f176.google.com with SMTP id uy17so2265979igb.15
 for ; Mon, 21 Apr 2014 14:54:12 -0700 (PDT)
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

On Mon, Apr 21, 2014 at 2:46 PM, Alexei Starovoitov wrote:
> as I was saying in the other thread, would be nice to see more
> realistic example, since "icmp 1 in 4" can be done in user space...
> What is the real problem being solved?
> I suspect for true packet sampling you'd need to have the knowledge
> of packet rate, potentially computing time delta within filter with
> another extension?
> The patch itself looks good to me.

Random sampling. There's a huge performance penalty if you do this in
user-space. You don't want to send all the packets to user-space to
just get (e.g.) 1 in 1000 and discard all the others.

From http://www.icir.org/vern/papers/secondary-path-raid06.pdf:

When dealing with large volumes of network traffic, we can often
derive significant benefit while minimizing the processing cost by
employing sampling. Generally, this is done on either a per-packet or
per-connection basis.
BPF does not provide access to pseudo-random numbers, so applications
have had to rely on proxies for randomness in terms of network header
fields with some semblance of entropy across packets (checksum and IP
fragment identifier fields) or connections (ephemeral ports). These
sometimes provide acceptable approximations to random sampling, but
can also suffer from significant irregularities due to lack of
entropy or aliasing; see [11] for an analysis.

-Chema
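
To make the aliasing concern in the quoted passage concrete, here is a
minimal sketch. It is a user-space Python simulation, not BPF code and
not part of the patch. It compares 1-in-N sampling keyed on a header
field against sampling keyed on a pseudo-random number. The sender
behaviour (an IP ID that starts at 1 and increments by 2) and the names
keep_by_field, keep_by_prng and simulate are assumptions made up for
illustration.

```python
import random

ID_SPACE = 1 << 16  # the IPv4 identification field is 16 bits wide


def keep_by_field(field_value, n):
    """Proxy sampling: keep the packet when a header field is divisible by n."""
    return field_value % n == 0


def keep_by_prng(rng, n):
    """Random sampling: keep the packet with probability 1/n."""
    return rng.randrange(n) == 0


def simulate(num_packets=100_000, n=1000, seed=0):
    """Count packets kept by each strategy for one hypothetical sender."""
    rng = random.Random(seed)
    by_field = 0
    by_prng = 0
    for i in range(num_packets):
        # Hypothetical sender: IP ID starts at 1 and increments by 2,
        # so every ID is odd (wrapping at 2**16 preserves parity).
        ip_id = (1 + 2 * i) % ID_SPACE
        by_field += keep_by_field(ip_id, n)
        by_prng += keep_by_prng(rng, n)
    return by_field, by_prng


if __name__ == "__main__":
    by_field, by_prng = simulate()
    print("kept via IP ID proxy:", by_field)
    print("kept via PRNG:       ", by_prng)
```

With this made-up sender the field-based filter never matches, because
an odd ID is never divisible by 1000, while the PRNG-based filter keeps
roughly 1 in 1000 regardless of how the sender fills in its headers.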