From: Kevin Traynor <ktraynor@redhat.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>,
	Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: dev@dpdk.org, stable@dpdk.org
Subject: Re: [PATCH] vhost: fix virtio_net cache sharing of broadcast_rarp
Date: Mon, 20 Mar 2017 11:13:49 +0000
Message-ID: <f04e9162-9b5e-9b03-4d65-994d957c8241@redhat.com>
In-Reply-To: <9cd39232-5b26-30cd-c51d-c6ce11068bee@redhat.com>

On 03/17/2017 10:01 AM, Maxime Coquelin wrote:
> 
> 
> On 03/17/2017 06:47 AM, Yuanhan Liu wrote:
>> On Thu, Mar 16, 2017 at 10:10:05AM +0000, Kevin Traynor wrote:
>>> On 03/16/2017 06:21 AM, Yuanhan Liu wrote:
>>>> On Wed, Mar 15, 2017 at 07:10:49PM +0000, Kevin Traynor wrote:
>>>>> The virtio_net structure is used in both enqueue and dequeue datapaths.
>>>>> broadcast_rarp is checked with cmpset in the dequeue datapath regardless
>>>>> of whether descriptors are available or not.
>>>>>
>>>>> It is observed in some cases where dequeue and enqueue are performed by
>>>>> different cores and no packets are available on the dequeue datapath
>>>>> (i.e. uni-directional traffic), the frequent checking of broadcast_rarp
>>>>> in dequeue causes performance degradation for the enqueue datapath.
>>>>>
>>>>> In OVS the issue can cause a uni-directional performance drop of up
>>>>> to 15%.
>>>>>
>>>>> Fix that by moving broadcast_rarp to a different cache line in
>>>>> virtio_net struct.
>>>>
>>>> Thanks, but I'm a bit confused. The drop looks like it is caused by
>>>> cache false sharing, but I don't see anything that would lead to false
>>>> sharing. I mean, there is no write in the same cache line where
>>>> broadcast_rarp belongs. Or is the "volatile" type the culprit here?
>>>>
>>>
>>> Yes, the cmpset code uses cmpxchg and that performs a write regardless
>>> of the result - it either writes the new value or writes back the old
>>> value.
>>
>> Oh, right, I missed this part!
>>
>>>> Talking about that, I had actually considered turning "broadcast_rarp"
>>>> into a simple "int" or "uint16_t" type, to make it more lightweight.
>>>> The reason I used an atomic type is to send exactly one broadcast RARP
>>>> packet once a SEND_RARP request is received. Otherwise, we may send more
>>>> than one RARP packet when MQ is involved. But I think we don't have
>>>> to be that accurate: it's tolerable when more RARPs are sent. I saw 4
>>>> SEND_RARP requests (aka 4 RARP packets) the last time I tried
>>>> vhost-user live migration, after all. I don't quite remember why
>>>> it was 4 though.
>>>>
>>>> That said, I think it also would resolve the performance issue if you
>>>> change "rte_atomic16_t" to "uint16_t", without moving the place?
>>>>
>>>
>>> Yes, that should work fine, with the side effect you mentioned of
>>> possibly some more rarps - no big deal.
>>>
>>> I tested another solution also - as it is unlikely we would need to send
>>> the broadcast_rarp, you can first read and only do the cmpset if it is
>>> likely to succeed. This resolved the issue too.
>>>
>>> --- a/lib/librte_vhost/virtio_net.c
>>> +++ b/lib/librte_vhost/virtio_net.c
>>> @@ -1057,7 +1057,8 @@ static inline bool __attribute__((always_inline))
>>>          *
>>>          * Check user_send_rarp() for more information.
>>>          */
>>> -       if (unlikely(rte_atomic16_cmpset((volatile uint16_t *)
>>> +       if (unlikely(rte_atomic16_read(&dev->broadcast_rarp) &&
>>> +                       rte_atomic16_cmpset((volatile uint16_t *)
>>>                                          &dev->broadcast_rarp.cnt, 1, 0))) {
>>>                 rarp_mbuf = rte_pktmbuf_alloc(mbuf_pool);
>>>                 if (rarp_mbuf == NULL) {
>>
>> I'm okay with this one. It's simple and clean enough, that it could
>> be picked to a stable release. Later, I'd like to send another patch
>> to turn it to "uint16_t". Since it changes the behaviour a bit, it
>> is not a good candidate for stable release.
>>
>> BTW, would you please include the root cause (false sharing) into
>> your commit log?
> And maybe also add the info to the comment just above?
> It will help people wondering why we read before cmpset.
> 

Sure, I will re-spin, do some testing and submit a v2.

> Maxime
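
For readers following the thread, the read-before-cmpset idea in the diff
quoted above can be sketched outside DPDK with C11 atomics. This is only an
illustration under stated assumptions, not the code from the patch:
struct dev_flags and take_rarp_request() are hypothetical names, and the
flag is modelled as an _Atomic uint16_t rather than rte_atomic16_t. On the
common path, where the flag is 0, the function performs only a load, so the
cache line is not written to; the compare-exchange runs only when the flag
was observed as set.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the flag kept in struct virtio_net. */
struct dev_flags {
	_Atomic uint16_t broadcast_rarp;
};

/*
 * Return true for at most one caller per request (flag set to 1).
 *
 * The plain load is the cheap check: it does not modify the cache line.
 * The compare-exchange, which per the discussion above performs a write
 * even when it fails, is attempted only if the flag was seen as non-zero.
 */
static inline bool
take_rarp_request(struct dev_flags *d)
{
	uint16_t expected = 1;

	if (atomic_load_explicit(&d->broadcast_rarp,
				 memory_order_relaxed) == 0)
		return false;

	/* Only one thread can swap 1 -> 0, so only one sends the RARP. */
	return atomic_compare_exchange_strong(&d->broadcast_rarp,
					      &expected, 0);
}
```

The exactly-once behaviour is kept because the decision is still made by
the compare-exchange; the extra load only filters out the calls that would
have failed anyway.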

Thread overview: 9+ messages
2017-03-15 19:10 [PATCH] vhost: fix virtio_net cache sharing of broadcast_rarp Kevin Traynor
2017-03-16  6:21 ` Yuanhan Liu
2017-03-16 10:10   ` Kevin Traynor
2017-03-17  5:47     ` Yuanhan Liu
2017-03-17 10:01       ` Maxime Coquelin
2017-03-20 11:13         ` Kevin Traynor [this message]
2017-03-23 15:44 ` [PATCH v2] vhost: fix virtio_net false sharing Kevin Traynor
2017-03-27  7:34   ` Maxime Coquelin
2017-03-27  8:33     ` Yuanhan Liu
