Subject: Re: [PATCH v4 17/25] ibnbd: client: main functionality
From: Bart Van Assche
To: Roman Penyaev
Cc: Danil Kipnis, Jack Wang, linux-block@vger.kernel.org,
    linux-rdma@vger.kernel.org, Jens Axboe, Christoph Hellwig,
    Sagi Grimberg, Jason Gunthorpe, Doug Ledford, rpenyaev@suse.de,
    Jack Wang
Date: Fri, 27 Sep 2019 09:37:39 -0700
References: <20190620150337.7847-1-jinpuwang@gmail.com>
    <20190620150337.7847-18-jinpuwang@gmail.com>
    <5c5ff7df-2cce-ec26-7893-55911e4d8595@acm.org>
    <6f677d56-82b3-a321-f338-cbf8ff4e83eb@acm.org>
X-Mailing-List: linux-block@vger.kernel.org

On 9/27/19 1:52 AM, Roman Penyaev wrote:
> No, it seems this thingy is a bit different. According to my
> understanding patches 3 and 4 from this patchset do the
> following: 1# split equally the whole queue depth on number
> of hardware queues and 2# return tag number which is unique
> host-wide (more or less similar to unique_tag, right?).
>
> 2# is not needed for ibtrs, and 1# can be easy done by dividing
> queue_depth on number of hw queues on tag set allocation, e.g.
> something like the following:
>
>    ...
>    tags->nr_hw_queues = num_online_cpus();
>    tags->queue_depth  = sess->queue_deph / tags->nr_hw_queues;
>
>    blk_mq_alloc_tag_set(tags);
>
>
> And this trick won't work out for the performance.
> ibtrs client has a single resource: set of buffer chunks received
> from a server side. And these buffers should be dynamically
> distributed between IO producers according to the load. Having a
> hard split of the whole queue depth between hw queues we can forget
> about a dynamic load distribution, here is an example:
>
> - say server shares 1024 buffer chunks for a session (do not
>   remember what is the actual number).
>
> - 1024 buffers are equally divided between hw queues, let's
>   say 64 (number of cpus), so each queue is 16 requests depth.
>
> - only several CPUs produce IO, and instead of occupying the
>   whole "bandwidth" of a session, i.e. 1024 buffer chunks,
>   we limit ourselves to a small queue depth of an each hw
>   queue.
>
> And performance drops significantly when number of IO producers
> is smaller than number of hw queues (CPUs), and it can be easily
> tested and proved.
>
> So for this particular ibtrs case tags should be globally shared,
> and seems (unfortunately) there is no any other similar requirements
> for other block devices.

Hi Roman,

I agree that BLK_MQ_F_HOST_TAGS partitions a tag set across hardware
queues while ibnbd shares a single tag set across multiple hardware
queues. Since such sharing may be useful for other block drivers, isn't
that something that should be implemented in the block layer core
instead of in the ibnbd driver? If that logic were moved into the block
layer core, would that allow reusing the queue restarting logic that
already exists in the block layer core?

Thanks,

Bart.