From: Cong Wang
Date: Fri, 15 Feb 2019 11:04:38 -0800
Subject: Re: Three questions about busy poll
To: Willem de Bruijn
Cc: Alexander Duyck, Eric Dumazet, sridhar.samudrala@intel.com,
    Linux Kernel Network Developers
X-Mailing-List: netdev@vger.kernel.org

On Thu, Feb 14, 2019 at 4:39 PM Willem de Bruijn wrote:
>
> On Thu, Feb 14, 2019 at 3:15 PM Cong Wang wrote:
> >
> > Hello,
> >
> > While looking into the busy polling in Linux kernel, three questions
> > come into my mind:
> >
> > 1. In the document[1], it claims sysctl.net.busy_poll depends on
> > either SO_BUSY_POLL or sysctl.net.busy_read. However, from the code in
> > ep_set_busy_poll_napi_id(), I don't see such a dependency. It simply
> > checks sysctl_net_busy_poll and sk->sk_napi_id, but sk->sk_napi_id is
> > always set as long as we enable CONFIG_NET_RX_BUSY_POLL. So what I am
> > missing here?
>
> That documentation refers to sock_poll. This does call sk_busy_loop
> individually on each socket in the pollset and thus respects those values.
> Epoll was added later, after both sock_poll and that documentation.

Ah, yeah, this explains my confusion. I thought busy_poll refers to all
polling related syscalls, that is select()/poll()/epoll(), it looks like
epoll() is so special here.

Probably we need some clarification in net.txt.

>
> > 2. Why there is no socket option for sysctl.net.busy_poll? Clearly
> > sysctl_net_busy_poll is global and SO_BUSY_POLL only works for
> > sysctl.net.busy_read.
>
> I guess because of how sock_poll works. In that case it is not needed.
> The poll duration applies more to the pollset than any of the
> individual sockets, too.

Good point, it's probably like struct eventpoll vs. struct epitem.

The reason why I am looking for a per-socket tuning is to minimize
the impact of setting busy_poll. I don't know if it is possible to
somehow make this per-socket via epoll interfaces, perhaps fundamentally
it is impossible?

>
> > 3. How is SO_INCOMING_NAPI_ID supposed to be used? I can't find any
> > useful documents online. Any example or more detailed doc?
>
> From the commit message of 6d4339028b35 ("net: Introduce
> SO_INCOMING_NAPI_ID") it sounds like a sharding mechanism that
> maintains flow affinity by sharding based on rxqueue (assuming that
> something like RSS was used to ensure flow affinity in the first
> place).

That commit message is the only thing I can find too. I kinda need
a formal documentation in man page and hopefully an example too.

Thanks for your explanations!
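
For illustration only, here is a rough userspace sketch of the two
per-socket knobs discussed above. It is not taken from any man page or
from the commit itself. It assumes Linux and the option numbers from
include/uapi/asm-generic/socket.h (SO_BUSY_POLL = 46,
SO_INCOMING_NAPI_ID = 56); some architectures use different values.
The helper names set_busy_read(), incoming_napi_id() and pick_worker(),
and the round-robin assignment of NAPI IDs to workers, are hypothetical
choices made for this example.

```python
import socket

# Assumed Linux option numbers (asm-generic); fall back to them when the
# Python socket module does not export the constants.
SO_BUSY_POLL = getattr(socket, "SO_BUSY_POLL", 46)
SO_INCOMING_NAPI_ID = getattr(socket, "SO_INCOMING_NAPI_ID", 56)


def set_busy_read(sock, usecs):
    """Try to set the per-socket busy-read budget in microseconds.

    Returns True on success and False if the kernel refuses, for example
    when raising the value without CAP_NET_ADMIN or when the option is
    not supported.
    """
    try:
        sock.setsockopt(socket.SOL_SOCKET, SO_BUSY_POLL, usecs)
        return True
    except OSError:
        return False


def incoming_napi_id(sock):
    """Return the NAPI ID the socket last received on, or None on error.

    A value of 0 means the kernel has no valid NAPI ID for this socket
    yet (for example, nothing has been received through a NAPI device).
    """
    try:
        return sock.getsockopt(socket.SOL_SOCKET, SO_INCOMING_NAPI_ID)
    except OSError:
        return None


def pick_worker(napi_id, napi_to_worker, num_workers):
    """Map a NAPI ID to a worker index so flows from one queue stay together.

    New NAPI IDs are assigned to workers round-robin in the order they are
    first seen. Unknown IDs (0 or None) fall back to worker 0, which is an
    arbitrary choice for this sketch.
    """
    if not napi_id:
        return 0
    if napi_id not in napi_to_worker:
        napi_to_worker[napi_id] = len(napi_to_worker) % num_workers
    return napi_to_worker[napi_id]
```

A server along these lines would call incoming_napi_id() on each socket
returned by accept() and hand the connection to the worker (for example
one epoll instance per thread) chosen by pick_worker(), so that all
connections arriving on the same RX queue end up polled by the same
thread. Whether that matches the usage intended by the commit is a guess
based on its commit message.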