From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756312AbbEUUph (ORCPT <rfc822;w@1wt.eu>);
	Thu, 21 May 2015 16:45:37 -0400
Received: from mga11.intel.com ([192.55.52.93]:29917 "EHLO mga11.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1756082AbbEUUpf (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Thu, 21 May 2015 16:45:35 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.13,471,1427785200"; 
   d="scan'208";a="698540489"
Message-ID: <1432241133.2454.53.camel@jpfreyen-mobl5.amr.corp.intel.com>
Subject: Re: [PATCHv1] NVMe: nvme_queue made cache friendly.
From: J Freyensee <james_p_freyensee@linux.intel.com>
To: Parav Pandit <parav.pandit@avagotech.com>
Cc: linux-nvme@lists.infradead.org, willy@linux.intel.com, axboe@kernel.dk,
        linux-kernel@vger.kernel.org
Date: Thu, 21 May 2015 13:45:33 -0700
In-Reply-To: <1432154627-12336-1-git-send-email-parav.pandit@avagotech.com>
References: <1432154627-12336-1-git-send-email-parav.pandit@avagotech.com>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.10.4 (3.10.4-4.fc20) 
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2015-05-20 at 16:43 -0400, Parav Pandit wrote:
> nvme_queue structure made 64B cache friendly so that majority of the
> data elements of the structure during IO and completion path can be
> found in typical single 64B cache line size which was previously spanning
> beyond single 64B cache line size.
> 
> By aligning most of the fields are found at start of the structure.
> Elements which are not used in frequent IO path are moved at the
> end of structure.

I'll repeat the question Matthew asked last time:

"Have you done any performance measurements on this?"

If the answer is no, then I'm not sure why the patch is being submitted
at all, given that its main justification is performance. Judging by the
comments on the last patch attempt, it did not even sound like there was
a good understanding of where the q_lock should go for best performance.

I think it would be better to have some results to go along with the
patch request.  At least then it would be known for sure where the
q_lock should go, and that would be useful knowledge for future work.

> 
> Signed-off-by: Parav Pandit <parav.pandit@avagotech.com>
> ---
>  drivers/block/nvme-core.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
> index b9ba36f..58041c7 100644
> --- a/drivers/block/nvme-core.c
> +++ b/drivers/block/nvme-core.c
> @@ -98,23 +98,23 @@ struct async_cmd_info {
>  struct nvme_queue {
>  	struct device *q_dmadev;
>  	struct nvme_dev *dev;
> -	char irqname[24];	/* nvme4294967295-65535\0 */
> -	spinlock_t q_lock;
>  	struct nvme_command *sq_cmds;
> +	struct blk_mq_hw_ctx *hctx;
>  	volatile struct nvme_completion *cqes;
> -	dma_addr_t sq_dma_addr;
> -	dma_addr_t cq_dma_addr;
>  	u32 __iomem *q_db;
> +	spinlock_t q_lock;
>  	u16 q_depth;
> -	s16 cq_vector;
>  	u16 sq_head;
>  	u16 sq_tail;
>  	u16 cq_head;
>  	u16 qid;
> +	s16 cq_vector;
>  	u8 cq_phase;
>  	u8 cqe_seen;
>  	struct async_cmd_info cmdinfo;
> -	struct blk_mq_hw_ctx *hctx;
> +	char irqname[24];	/* nvme4294967295-65535\0 */
> +	dma_addr_t sq_dma_addr;
> +	dma_addr_t cq_dma_addr;
>  };
>  
>  /*