netdev.vger.kernel.org archive mirror
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Vasily Averin <vvs@virtuozzo.com>, Eric Dumazet <edumazet@google.com>
Cc: netdev <netdev@vger.kernel.org>, Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH] tcp: detect use sendpage for slab-based objects
Date: Mon, 4 Mar 2019 07:51:07 -0800
Message-ID: <cb5993a4-9b00-2c9a-60ca-9cfa4c5c15b3@gmail.com>
In-Reply-To: <4a7d3903-971d-26ea-1d70-514abff88f91@virtuozzo.com>



On 03/04/2019 04:58 AM, Vasily Averin wrote:
> On 2/21/19 7:00 PM, Eric Dumazet wrote:
>> On Thu, Feb 21, 2019 at 7:30 AM Vasily Averin <vvs@virtuozzo.com> wrote:
>>>
>>> There were a few incidents where XFS over a network block device generated
>>> IO requests with slab-based metadata. If these requests are processed
>>> via the sendpage path, tcp_sendpage() calls skb_can_coalesce() and merges
>>> neighbouring slab objects into one skb fragment.
>>>
>>> If the receiving side is located on the same host, tcp_recvmsg() can trigger
>>> the following BUG_ON:
>>> usercopy: kernel memory exposure attempt detected
>>>                 from XXXXXX (kmalloc-512) (1024 bytes)
>>>
>>> This patch helps to detect the cause of similar incidents on the sending side.
>>>
>>> Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
>>> ---
>>>  net/ipv4/tcp.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
>>> index 2079145a3b7c..cf9572f4fc0f 100644
>>> --- a/net/ipv4/tcp.c
>>> +++ b/net/ipv4/tcp.c
>>> @@ -996,6 +996,7 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
>>>                         goto wait_for_memory;
>>>
>>>                 if (can_coalesce) {
>>> +                       WARN_ON_ONCE(PageSlab(page));
>>
>> Please use VM_WARN_ON_ONCE() to make this a nop for CONFIG_DEBUG_VM=n.
>> Also the whole tcp_sendpage() should be protected, not only the coalescing part.
>> (The get_page() done a few lines later should not be attempted either.)
> 
> Eric, what do you think about the following patch?
> I validated its backported version on a RHEL7-based OpenVZ kernel before sending it to mainline.
> 
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index cf3c5095c10e..7be7b6abe8b5 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -943,6 +943,11 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
>  	ssize_t copied;
>  	long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
>  
> +	if (PageSlab(page)) {
> +		VM_WARN_ONCE(true, "sendpage should not handle Slab objects,"
> +				   " please fix callers\n");
> +		return sock_no_sendpage_locked(sk, page, offset, size, flags);
> +	}
>  	/* Wait for a connection to finish. One exception is TCP Fast Open
>  	 * (passive side) where data is allowed to be sent before a connection
>  	 * is fully established.
> 

There are at least four problems with this approach:

1) VM_WARN_ONCE() might be a NOP, and if not, it is simply some lines in syslog,
among thousands.

2) Falling back will give no incentive for callers to fix their code.

3) It slows down TCP just because of some weird in-kernel users.
   I agree with adding sanity checks for everything user space can think of (aka syzbot),
   but kernel users need to be fixed, without adding code to TCP.

4) The sendpage() API provides one page at a time.
   We therefore call the very expensive lock_sock() and release_sock() for every page.
   sendfile() is suboptimal (compared to sendmsg(MSG_ZEROCOPY)).
   There is an effort to provide batches of pages per round.
   Your patch would cancel this effort, or make it very complicated.
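
To make points 2) and 3) concrete, here is a minimal sketch of where the check
could live instead: in the caller, before it ever reaches the sendpage path.
This is a userspace model, not kernel code. struct fake_page, enum send_path
and pick_send_path() are hypothetical names invented for illustration; the
slab flag only stands in for what PageSlab(page) would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct page. Only the one
 * property relevant to this thread is modelled. */
struct fake_page {
	bool slab; /* models the result of PageSlab(page) */
};

/* The two ways a caller could hand data to the socket. */
enum send_path {
	SEND_PATH_SENDPAGE, /* page is referenced by the skb, not copied */
	SEND_PATH_COPY      /* data is copied, e.g. via a sendmsg-style call */
};

/* Caller-side decision (illustrative only): a slab-backed buffer must
 * not go through sendpage, because the stack would take a page
 * reference and could coalesce neighbouring slab objects into one
 * fragment. Everything else may use the zero-copy path. */
static enum send_path pick_send_path(const struct fake_page *page)
{
	assert(page != NULL);
	return page->slab ? SEND_PATH_COPY : SEND_PATH_SENDPAGE;
}
```

With the decision made in the caller, the TCP fast path needs no extra branch,
and the subsystem that produced the slab-backed buffer is the one that carries
the cost of the copy.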


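The cost argument in point 4) comes down to simple counting, sketched below
under stated assumptions. Both helpers are hypothetical and only count
lock_sock()/release_sock() pairs; the batch size is an illustrative parameter,
not something taken from an existing interface.

```c
#include <assert.h>
#include <stddef.h>

/* Lock/unlock pairs when every page is sent through its own call,
 * as with a one-page-at-a-time API. */
static size_t lock_pairs_per_page(size_t npages)
{
	return npages;
}

/* Lock/unlock pairs if up to `batch` pages were handed over per call.
 * The batched interface is an assumption made for this sketch. */
static size_t lock_pairs_batched(size_t npages, size_t batch)
{
	assert(batch > 0);
	/* Round up: a partially filled last batch still needs one pair. */
	return (npages + batch - 1) / batch;
}
```

For 64 pages, the per-page model takes the socket lock 64 times, while a batch
size of 16 would take it 4 times. A per-page fallback inside
do_tcp_sendpages() would have to be handled page by page within any such batch.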

Thread overview: 12+ messages
2019-02-21 15:30 [PATCH] tcp: detect use sendpage for slab-based objects Vasily Averin
2019-02-21 16:00 ` Eric Dumazet
2019-02-22 14:02   ` Vasily Averin
2019-02-22 16:39     ` Eric Dumazet
2019-02-25  9:15       ` Vasily Averin
2019-02-25  9:32         ` Vasily Averin
2019-03-04 12:58   ` Vasily Averin
2019-03-04 15:51     ` Eric Dumazet [this message]
2019-03-05 14:24       ` Vasily Averin
     [not found]         ` <CANn89iKss+mzwbeZgy3Bzct6sBe3UeyezXXGocAYtOe9pP8a9w@mail.gmail.com>
2019-03-05 15:11           ` Eric Dumazet
2019-03-05 16:44             ` Eric Dumazet
2019-03-05 18:35               ` Vasily Averin
