Linux-NVME Archive on lore.kernel.org
 help / color / Atom feed
From: Max Gurtovoy <maxg@mellanox.com>
To: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: axboe@kernel.dk, sagi@grimberg.me, vladimirk@mellanox.com,
	shlomin@mellanox.com, israelr@mellanox.com,
	linux-nvme@lists.infradead.org, idanb@mellanox.com,
	oren@mellanox.com, kbusch@kernel.org, hch@lst.de
Subject: Re: [PATCH 13/15] nvmet: Add metadata/T10-PI support
Date: Mon, 27 Jan 2020 19:17:30 +0200
Message-ID: <99c92a01-be7b-fd47-3d28-b75e5372c3a6@mellanox.com> (raw)
In-Reply-To: <yq1sgkpxqf5.fsf@oracle.com>


On 1/9/2020 5:24 AM, Martin K. Petersen wrote:
> Max,


Martin,


>
>> -	/* no limit on data transfer sizes for now */
>> -	id->mdts = 0;
>> +	/* Limit MDTS for metadata capable controllers (assume mpsmin is 4k) */
>> +	id->mdts = ctrl->pi_support ? ilog2(NVMET_T10_PI_MAX_KB_SZ >> 2) : 0;
> [...]
>
>> +#define NVMET_T10_PI_MAX_KB_SZ 128
> Where does this limit come from?

This is coming from few reasons:

1. In the current driver implementation of the ib_core (RDMA) we are 
limited to 1MB IO for T10-DIF offload so we need to limit the mdts 
anyway. For that, I want to add a ctrl->ops->get_ctrl_mdts(ctrl).

2. There is some unclear (to me) behavior in the block layer regarding 
splitting integrity bios. We get guard error over the fabric transaction 
in case we need to split a bio (the error is at the target side). I'm 
debugging it and found potential bug that I fixed using:

@@ -378,6 +378,12 @@ void bio_integrity_advance(struct bio *bio, 
unsigned int bytes_done)

         bip->bip_iter.bi_sector += bytes_done >> 9;
         bvec_iter_advance(bip->bip_vec, &bip->bip_iter, bytes);
+
+       if (split && bip->bip_iter.bi_bvec_done) {
+               bip->bip_vec[bip->bip_iter.bi_idx].bv_len -= 
bip->bip_iter.bi_bvec_done;
+               bip->bip_vec[bip->bip_iter.bi_idx].bv_offset += 
bip->bip_iter.bi_bvec_done;
+               bip->bip_iter.bi_bvec_done = 0;
+       }
  }

but unfortunately it helped only if we split the bio once. In case we 
split it more than once, we get to guard check error again.

maybe you have an idea how to overcome this ?


3. The maximum_integrity_segments = 1 for NVMe devices, so we can use 
4096/8=512 integrity sectors before reaching the limit of 
max_integrity_segments. This leads to mdts=6 (256KB) if the bs=512 . I'm 
afraid that we might have issues with the fact that the target prot_sgl 
 > 1 and the NVMe driver set maximum_integrity_segments = 1.


Therefore, I suggest to limit the mdts per transport (and internally 
check if this ctrl is pi_enabled).


All the above is happening when using submit_bio --> ... --> bio_split 
function that call bio_advance as well (that is called from completion 
context as well). This path is not used in local NVMe IO.

In the local NVMe IO (e.g using FIO application), there is no call to 
bio_split/bio_advance in the submission flow, and there is only one call 
to bio_advance in the completion flow.


-Max.


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

  reply index

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-06 13:37 [PATCH 00/15 V3] nvme-rdma/nvmet-rdma: " Max Gurtovoy
2020-01-06 13:37 ` [PATCH] nvme-cli/fabrics: Add pi_enable param to connect cmd Max Gurtovoy
2020-01-06 13:37 ` [PATCH 01/15] nvme: Introduce namespace features flag Max Gurtovoy
2020-01-07 18:07   ` Keith Busch
2020-01-08 12:00     ` Max Gurtovoy
2020-01-09  3:11   ` Martin K. Petersen
2020-01-09 10:38     ` Max Gurtovoy
2020-01-09 16:26       ` Keith Busch
2020-01-12  9:40         ` Max Gurtovoy
2020-01-13 20:31           ` Keith Busch
2020-01-14 16:04             ` Max Gurtovoy
2020-01-12  9:40         ` Max Gurtovoy
2020-01-06 13:37 ` [PATCH 02/15] nvme: Enforce extended LBA format for fabrics metadata Max Gurtovoy
2020-01-16 23:53   ` James Smart
2020-01-19 11:20     ` Max Gurtovoy
2020-01-21 17:40       ` James Smart
2020-01-06 13:37 ` [PATCH 03/15] nvme: Introduce max_integrity_segments ctrl attribute Max Gurtovoy
2020-01-09  3:12   ` Martin K. Petersen
2020-01-06 13:37 ` [PATCH 04/15] nvme-fabrics: Allow user enabling metadata/T10-PI support Max Gurtovoy
2020-01-06 13:37 ` [PATCH 05/15] nvme: Introduce NVME_INLINE_PROT_SG_CNT Max Gurtovoy
2020-01-09  3:13   ` Martin K. Petersen
2020-01-06 13:37 ` [PATCH 06/15] nvme-rdma: Introduce nvme_rdma_sgl structure Max Gurtovoy
2020-01-06 13:37 ` [PATCH 07/15] nvme-rdma: Add metadata/T10-PI support Max Gurtovoy
2020-01-06 13:37 ` [PATCH 08/15] nvmet: Prepare metadata request Max Gurtovoy
2020-01-06 13:37 ` [PATCH 09/15] nvmet: Add metadata characteristics for a namespace Max Gurtovoy
2020-01-09  3:16   ` Martin K. Petersen
2020-01-06 13:37 ` [PATCH 10/15] nvmet: Rename nvmet_rw_len to nvmet_rw_data_len Max Gurtovoy
2020-01-09  3:17   ` Martin K. Petersen
2020-01-06 13:37 ` [PATCH 11/15] nvmet: Rename nvmet_check_data_len to nvmet_check_transfer_len Max Gurtovoy
2020-01-09  3:19   ` Martin K. Petersen
2020-01-06 13:37 ` [PATCH 12/15] nvme: Add Metadata Capabilities enumerations Max Gurtovoy
2020-01-06 13:37 ` [PATCH 13/15] nvmet: Add metadata/T10-PI support Max Gurtovoy
2020-01-09  3:24   ` Martin K. Petersen
2020-01-27 17:17     ` Max Gurtovoy [this message]
2020-01-29  2:32       ` Martin K. Petersen
2020-01-17 16:46   ` James Smart
2020-01-19 13:47     ` Max Gurtovoy
2020-01-06 13:37 ` [PATCH 14/15] nvmet: Add metadata support for block devices Max Gurtovoy
2020-01-06 13:37 ` [PATCH 15/15] nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
2020-01-09  3:29   ` Martin K. Petersen
  -- strict thread matches above, loose matches on Subject: below --
2020-04-28 13:11 [PATCH 00/15 V6] nvme-rdma/nvmet-rdma: " Max Gurtovoy
2020-04-28 13:11 ` [PATCH 13/15] nvmet: add " Max Gurtovoy
2020-05-01 15:58   ` Christoph Hellwig
2020-05-01 16:19   ` Christoph Hellwig
2019-11-05 16:20 [PATCH 00/15] nvme-rdma/nvmet-rdma: Add " Max Gurtovoy
2019-11-05 16:20 ` [PATCH 13/15] nvmet: " Max Gurtovoy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=99c92a01-be7b-fd47-3d28-b75e5372c3a6@mellanox.com \
    --to=maxg@mellanox.com \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=idanb@mellanox.com \
    --cc=israelr@mellanox.com \
    --cc=kbusch@kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=martin.petersen@oracle.com \
    --cc=oren@mellanox.com \
    --cc=sagi@grimberg.me \
    --cc=shlomin@mellanox.com \
    --cc=vladimirk@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-NVME Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nvme/0 linux-nvme/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nvme linux-nvme/ https://lore.kernel.org/linux-nvme \
		linux-nvme@lists.infradead.org
	public-inbox-index linux-nvme

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.infradead.lists.linux-nvme


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git