All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wengang Wang <wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
To: santosh shilimkar
	<santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: leon-2ukJVAZIZ/Y@public.gmane.org
Subject: Re: [PATCH] RDS: sync congestion map updating
Date: Thu, 31 Mar 2016 09:51:51 +0800	[thread overview]
Message-ID: <56FC82B7.3070504@oracle.com> (raw)
In-Reply-To: <56FC09D6.7090602-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>



在 2016年03月31日 01:16, santosh shilimkar 写道:
> Hi Wengang,
>
> On 3/30/2016 9:19 AM, Leon Romanovsky wrote:
>> On Wed, Mar 30, 2016 at 05:08:22PM +0800, Wengang Wang wrote:
>>> Problem is found that some among a lot of parallel RDS 
>>> communications hang.
>>> In my test ten or so among 33 communications hang. The send requests 
>>> got
>>> -ENOBUF error meaning the peer socket (port) is congested. But 
>>> meanwhile,
>>> peer socket (port) is not congested.
>>>
>>> The congestion map updating can happen in two paths: one is in 
>>> rds_recvmsg path
>>> and the other is when it receives packets from the hardware. There 
>>> is no
>>> synchronization when updating the congestion map. So a bit operation 
>>> (clearing)
>>> in the rds_recvmsg path can be skipped by another bit operation 
>>> (setting) in
>>> hardware packet receving path.
>>>
>>> Fix is to add a spin lock per congestion map to sync the update on it.
>>> No performance drop found during the test for the fix.
>>
>> I assume that this change fixed your issue, however it looks suspicious
>> that performance wasn't change.
>>
> First of all thanks for finding the issue and posting patch
> for it. I do agree with Leon on performance comment.
> We shouldn't need locks for map updates.
>
Here is the performance data I collected yesterday.
Settings:
net.core.rmem_default = 4194304
net.core.wmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_max = 2097152

test case:  rds-stress -s 192.168.111.16 -q 1m -d 10 -T 300 -t 10
With 1M size sends, the 10 pending send request is enough to trigger the 
congestion on receiver side. And the test last 5 mins.

result is like this:
without patch:
10   2231   2355 4697759.63       0.00       0.00  473.38 19123.89 
-1.00  (average)
receiver
  10   2356   2231 4698350.06       0.00       0.00  486.28 18537.23 
-1.00  (average)

with patch applied:
sender
10   2230   2396 47x.53       0.00       0.00  475.87 31954.35 -1.00  
(average)
receiver
10   2396   2230 4738051.76       0.00       0.00  480.85 18408.13 
-1.00  (average)

So I don't see performance drops. On a previous test, the test result is 
reverted that is it's faster when patch not applied, but the numbers is 
till 47xxxxx VS 46xxxxx.  So I don't have a very stable test result. But 
in average, no obvious performance drop.

Let me try to explain from theory:
Firstly, No matter the rds_recvmsg path or the hardware receiving data 
path, we have rds_sock->rs_recv_lock (this is not enough to fix our 
issue here since there could be many different rds_socks) locked very 
near before we lock the congestion map.  So the performance drop on CPU 
cache refilling is small.
Secondly, though the problem exist,  the malformed map may be not 
happening that frequent especially for this test case, 10 parallel 
communication.

> Moreover the parallel receive path on which this patch
> is based of doesn't exist in upstream code. I have kept
> that out so far because of similar issue like one you
> encountered.
But I don't see how rds_recvmsg path is different from UEK kernels. Can 
you explain more here or offline?

>
> Anyways lets discuss offline about the fix even for the
> downstream kernel. I suspect we can address it without locks.
>
If in normal use we have no performace issue (and before we found import 
use case that would hit), I think locking is fine.
Well, what ideas do you have to prevent using locks? After all we are 
updating a 8KB bitmap, not a single uint64 or less length variable. No 
matter we use lock or not, we need to make sure the bits to update can't 
be cached on different CPUs.

thanks,
wengang

> Reagrds,
> Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2016-03-31  1:51 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-30  9:08 [PATCH] RDS: sync congestion map updating Wengang Wang
     [not found] ` <1459328902-31968-1-git-send-email-wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2016-03-30 16:19   ` Leon Romanovsky
     [not found]     ` <20160330161952.GA2670-2ukJVAZIZ/Y@public.gmane.org>
2016-03-30 17:16       ` santosh shilimkar
     [not found]         ` <56FC09D6.7090602-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2016-03-31  1:51           ` Wengang Wang [this message]
     [not found]             ` <56FC82B7.3070504-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2016-03-31  2:59               ` Wengang Wang
     [not found]                 ` <56FC927E.9090404-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2016-04-01 19:47                   ` santosh shilimkar
2016-04-02  1:14                     ` Leon Romanovsky
     [not found]                       ` <20160402011459.GC8565-2ukJVAZIZ/Y@public.gmane.org>
2016-04-02  4:30                         ` santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA
2016-03-31  1:24       ` Wengang Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56FC82B7.3070504@oracle.com \
    --to=wen.gang.wang-qhclzuegtsvqt0dzr+alfa@public.gmane.org \
    --cc=leon-2ukJVAZIZ/Y@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.