Re: [PATCH] tcp: detect use sendpage for slab-based objects

From: Vasily Averin <vvs@virtuozzo.com>
To: Eric Dumazet <edumazet@google.com>
Cc: netdev <netdev@vger.kernel.org>
Subject: Re: [PATCH] tcp: detect use sendpage for slab-based objects
Date: Fri, 22 Feb 2019 17:02:22 +0300	[thread overview]
Message-ID: <80f35733-e3cf-c7da-1822-87054903dc67@virtuozzo.com> (raw)
In-Reply-To: <CANn89i+tJA1zHN9Hbw58UQOEDY2iZCAssJcZ-+Y7C=zsNW3VSA@mail.gmail.com>

On 2/21/19 7:00 PM, Eric Dumazet wrote:
> On Thu, Feb 21, 2019 at 7:30 AM Vasily Averin <vvs@virtuozzo.com> wrote:
>> index 2079145a3b7c..cf9572f4fc0f 100644
>> --- a/net/ipv4/tcp.c
>> +++ b/net/ipv4/tcp.c
>> @@ -996,6 +996,7 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
>>                         goto wait_for_memory;
>>
>>                 if (can_coalesce) {
>> +                       WARN_ON_ONCE(PageSlab(page));
> 
> Please use VM_WARN_ON_ONCE() to make this a nop for CONFIG_VM_DEBUG=n
> 
> Also the whole tcp_sendpage() should be protected, not only the coalescing part.
> 
> (The get_page()  done few lines later should not be attempted either)
> 
>>                         skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy);
>>                 } else {
>>                         get_page(page);
>> --
> 
> It seems the bug has nothing to do with TCP, and belongs to the caller.
> 
> Otherwise you need to add the check to all existing .sendpage() /
> .sendpage_locked() handler out there.

Eric, could you please elaborate once again why tcp_sendpage() should not handle slab objects?

There is known restriction: sendpage should not be called for pages with counter=0,
because internal put_page() releases the page. All sendpage callers I know have such check.

However why they should add one check for PageSlab?

Let me explain the problem once again:
I do not see any bugs neither in tcp nor in any sendpage callers,
there is false alert on receiving side that crashes correctly worked host.

There is network block device with XFS, 
XFS submit IO request with slab objects, 
block device driver checks that page count is positive and decides to use sendpage.
sendpage calls tcp_sendpage() that can merge 2 neighbour slab objects into one tcp fragment.

If data is transferred outside -- nothing bad happen, network device successfully send data outside.
However if data is received locally tcp_recvmsg detects strange vector with "merged" slab objects.
It is not real problem, data can be accessed correctly, however this check calls BUG_ON and crashes the host.

By this way recently added hardening check forces all .sendpage callers modify code that worked correctly for ages.

It looks abnormal to me, but I do not understand how to fix this problem correctly.

I do not like an idea to keep current state -- it can trigger crash of correctly worked hosts in some rare corner cases.
I do not like an idea to fix all callers -- why they need modify correctly worked code to protect from false positive?
I do not like an idea to modify tcp -- to block merge of fragments with slab objects like I proposed earlier.
We can trigger warning in tcp code -- to inform .sendpage callers that they are under fire,
however I agree with yours "bug has nothing to do with TCP" and do not understand why we need to modify tcp_sendpage().

May be it's better to replace BUG_ON to WARN_ON in hardening check?
Could you probably advise some other solution?

Thank you,
	Vasily Averin