From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934459AbbDXI7Y (ORCPT ); Fri, 24 Apr 2015 04:59:24 -0400 Received: from mx1.redhat.com ([209.132.183.28]:39826 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934344AbbDXI7U (ORCPT ); Fri, 24 Apr 2015 04:59:20 -0400 From: Vitaly Kuznetsov To: Dexuan Cui Cc: KY Srinivasan , Haiyang Zhang , "devel\@linuxdriverproject.org" , "linux-kernel\@vger.kernel.org" Subject: Re: [PATCH 6/6] Drivers: hv: vmbus: do a fair round robin when selecting an outgoing channel References: <1429626460-7947-1-git-send-email-vkuznets@redhat.com> <1429626460-7947-7-git-send-email-vkuznets@redhat.com> Date: Fri, 24 Apr 2015 10:59:16 +0200 In-Reply-To: (Dexuan Cui's message of "Fri, 24 Apr 2015 08:42:11 +0000") Message-ID: <87sibpgavv.fsf@vitty.brq.redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Dexuan Cui writes: >> -----Original Message----- >> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com] >> Sent: Tuesday, April 21, 2015 22:28 >> To: KY Srinivasan >> Cc: Haiyang Zhang; devel@linuxdriverproject.org; linux- >> kernel@vger.kernel.org; Dexuan Cui >> Subject: [PATCH 6/6] Drivers: hv: vmbus: do a fair round robin when >> selecting an outgoing channel >> >> vmbus_get_outgoing_channel() implements the following algorithm for >> selecting >> an outgoing channel (despite the comment before the function saying it >> distributes the load equally): > > Yeah, I also found the issue. > >> 1) If we have no subchannels return the primary channel; >> 2) If primary->next_oc is grater than primary->num_sc reset the primary- >> >next_oc >> to 0 and return the primary channel; >> 3) Aim for the primary->next_oc subchannel, increment primary->next_oc; >> 4) Loop through all opened subchannels. If we see a channel which has >> target_cpu == current_cpu return it. If we reached the primary->next_oc'th >> open subchannel return it; >> 5) Return the primary channel. >> The implementation also skips the subchannel No. 0 unless it matches the >> current >> cpu as we assign i to 1 in the initialization. >> >> This is not a fair round robin as subchannels in the beginning of the list are >> more likely to be returned and checking for current cpu aslo creates > > I suppose the current algorithm is trying to make use of cache locality? > KY may share more information. > >> additional >> complexity. Simplify the vmbus_get_outgoing_channel() function, make it >> do what the comment before it says. > > Hi Vitaly, > It looks your algorithm also has an issue: > Assuming primary->num_sc == 3 (SC1, SC2, SC3) > 1st time: we choose SC1 and primary->next_oc is set to 1. > 2nd time: we choose SC2 and primary->next_oc is set to 2. > 3rd time: we choose SC3 and primary->next_oc is set to 3. > 4th time and later: since i varies among 1~3 and can't be bigger than 3, > we always choose the primary channel. You're right, it seems 'if (primary->next_oc > primary->num_sc)' is off-by-one, it should be >= > > BTW, IMO it's not easy to achieve complete fairness because > vmbus_get_outgoing_channel() can run simultaneously on different > CPUs (so primary->next_oc can be modified at the same time by multiple > CPUs), and we believe it should be lockless. > Maybe atomic_t can help(?) This thing bothered me a bit but then I realized that we don't actually care - doing a mistake once is better than suffering from the slowness of locks/atomics. I'd suggest we keep it this way (with the fix mentioned above). Thanks! > > Thanks, > -- Dexuan > >> >> Signed-off-by: Vitaly Kuznetsov >> --- >> drivers/hv/channel_mgmt.c | 27 +++++++++------------------ >> 1 file changed, 9 insertions(+), 18 deletions(-) >> >> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c >> index daa6417..df82442 100644 >> --- a/drivers/hv/channel_mgmt.c >> +++ b/drivers/hv/channel_mgmt.c >> @@ -827,39 +827,30 @@ cleanup: >> struct vmbus_channel *vmbus_get_outgoing_channel(struct >> vmbus_channel *primary) >> { >> struct list_head *cur, *tmp; >> - int cur_cpu; >> struct vmbus_channel *cur_channel; >> - struct vmbus_channel *outgoing_channel = primary; >> - int next_channel; >> - int i = 1; >> + int i = 0; >> >> if (list_empty(&primary->sc_list)) >> - return outgoing_channel; >> + return primary; >> >> - next_channel = primary->next_oc++; >> - >> - if (next_channel > (primary->num_sc)) { >> + if (primary->next_oc > primary->num_sc) { >> primary->next_oc = 0; >> - return outgoing_channel; >> + return primary; >> } >> >> - cur_cpu = hv_context.vp_index[get_cpu()]; >> - put_cpu(); >> list_for_each_safe(cur, tmp, &primary->sc_list) { >> + i++; >> cur_channel = list_entry(cur, struct vmbus_channel, sc_list); >> if (cur_channel->state != CHANNEL_OPENED_STATE) >> continue; >> >> - if (cur_channel->target_vp == cur_cpu) >> - return cur_channel; >> - >> - if (i == next_channel) >> + if (i > primary->next_oc) { >> + primary->next_oc = i; >> return cur_channel; >> - >> - i++; >> + } >> } >> >> - return outgoing_channel; >> + return primary; >> } >> EXPORT_SYMBOL_GPL(vmbus_get_outgoing_channel); >> >> -- >> 1.9.3 -- Vitaly