From: Willem de Bruijn <willemdebruijn.kernel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Cc: Network Development
	<netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Linux API <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Willem de Bruijn
	<willemb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Subject: Re: [PATCH net-next v3 02/13] sock: skb_copy_ubufs support for compound pages
Date: Tue, 27 Jun 2017 11:53:25 -0400	[thread overview]
Message-ID: <CAF=yD-LJ-EvWbKtxqrcK1D=nRSzWzpPcFVOX1poZ3+P=aa0QfA@mail.gmail.com> (raw)
In-Reply-To: <CAF=yD-JGekEkBsFfgZa+-TN3-QBPy4sfK0QLPx83Uh3DAoodvg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

>>>> I looked at some kmap_atomic() implementations and I do not think
>>>> it supports compound pages.
>>>
>>> Indeed. Thanks. It appears that I can do the obvious thing and
>>> kmap the individual page that is being copied inside the loop:
>>>
>>>   kmap_atomic(skb_frag_page(f) + (f_off >> PAGE_SHIFT));
>>>
>>> This is similar to existing logic in copy_huge_page_from_user
>>> and __flush_dcache_page in arch/arm/mm/flush.c
>>>
>>> But, this also applies to other skb operations that call kmap_atomic,
>>> such as skb_copy_bits and __skb_checksum. Not all can be called
>>> from a codepath with a compound user page, but I have to address
>>> the ones that can.
>>
>> Yeah that's quite a mess, it looks like this assumption that
>> kmap can handle compound pages exists in quite a few places.
>
> I hadn't even considered that skbs can already hold compound
> page frags without zerocopy.
>
> Open coding all call sites to iterate is tedious and unnecessary
> in the common case where a page is not highmem.
>
> kmap_atomic has enough slots to map an entire order-3 compound
> page at once. But kmap_atomic cannot fail and there may be edge
> cases that are larger than order-3.
>
> Packet rings allocate with __GFP_COMP and an order derived
> from (user supplied) tp_block_size, for instance. But each
> skb_frag_t points at an individual page, so this case seems okay.
>
> Perhaps calls to kmap_atomic can be replaced with a
> kmap_compound(..) that checks
>
>  __this_cpu_read(__kmap_atomic_idx) + (1 << compound_order(p)) < KM_TYPE_NR
>
> before calling kmap_atomic on all pages in the compound page. In
> the common case that the page is not high mem, a single call is
> enough, as there is no per-page operation.

This does not work. Some callers, such as __skb_checksum, cannot
fail, so neither can kmap_compound. Also, the vaddrs returned by
consecutive kmap_atomic calls are not guaranteed to be in ascending
order. Indeed, on x86 and arm the vaddr appears to grow down:
(FIXADDR_TOP - ((x) << PAGE_SHIFT))

An alternative is to change the kmap_atomic callers in skbuff.c. To
avoid open coding, we can wrap the kmap_atomic; op; kunmap_atomic
sequence in a macro that loops only when needed:

static inline bool skb_frag_must_loop(struct page *p)
{
#if defined(CONFIG_HIGHMEM) || defined(CONFIG_X86_32)
        if (PageHighMem(p))
                return true;
#endif
        return false;
}

#define skb_frag_map_foreach(f, start, size, p, p_off, cp, copied)      \
        for (p = skb_frag_page(f) + ((start) >> PAGE_SHIFT),            \
             p_off = (start) & (PAGE_SIZE - 1),                         \
             copied = 0,                                                \
             cp = skb_frag_must_loop(p) ?                               \
                    min_t(u32, size, PAGE_SIZE - p_off) : size;         \
             copied < size;                                             \
             copied += cp, p++, p_off = 0,                              \
             cp = min_t(u32, size - copied, PAGE_SIZE))

This does not change behavior on machines without high mem
or on low mem pages.
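
For illustration, a caller in skbuff.c (say, the copy loop in
skb_copy_ubufs) might then look roughly like the sketch below. This is
only a sketch: 'f', 'start', 'size' and the destination buffer 'to'
are placeholders, not variables from the actual patch.

struct page *p;
u32 p_off, cp, copied;
u8 *vaddr;

skb_frag_map_foreach(f, start, size, p, p_off, cp, copied) {
        /* On a low mem page this body runs once with cp == size,
         * since the compound page is virtually contiguous in the
         * direct map. On a high mem page it maps and copies at
         * most PAGE_SIZE bytes per iteration.
         */
        vaddr = kmap_atomic(p);
        memcpy(to + copied, vaddr + p_off, cp);
        kunmap_atomic(vaddr);
}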

skb_seq_read keeps a mapping between calls to the function, so it
will need a separate approach.

