From: Niklas Schnelle <schnelle@linux.ibm.com>
To: Keith Busch <kbusch@kernel.org>
Cc: hch@lst.de, kbusch@meta.com, linux-nvme@lists.infradead.org,
	sagi@grimberg.me
Subject: Re: [PATCH 0/4] nvme patches for 6.3
Date: Fri, 10 Feb 2023 17:29:31 +0100	[thread overview]
Message-ID: <128ea5592001c2a11fbd18c6c2726f35978dfa54.camel@linux.ibm.com> (raw)
In-Reply-To: <Y+Zkn3bj80lR4bZo@kbusch-mbp.dhcp.thefacebook.com>

On Fri, 2023-02-10 at 08:37 -0700, Keith Busch wrote:
> On Fri, Feb 10, 2023 at 08:24:46AM -0700, Keith Busch wrote:
> > On Fri, Feb 10, 2023 at 03:34:09PM +0100, Niklas Schnelle wrote:
> > > Hi Christoph, Hi Keith,
> > > 
> > > It looks like this series causes crashes on s390x.
> > > With current linux-next-20230210 and a Samsung PM173X I get the
> > > below[0] crash. Reverting patches 1-3 makes the NVMe work again. I
> > > tried reverting just patch 3, and patches 2 and 3 together, but this
> > > results in crashes as well, and as far as I can see patches 2 and 3
> > > depend on each other. I'm not entirely sure what's going on, but
> > > patch 3 mentions padding to the cache line size, and our 256-byte
> > > cache lines are definitely unusual. I didn't
> > > see any obvious place where this would break things, though. I did
> > > determine that in the crashing nvme_unmap_data(), iod->nr_allocations
> > > is -1 and iod->use_sgl is true, which is weird since it looks to me
> > > like iod->nr_allocations should only be -1 if sg_list couldn't be
> > > allocated from the pool.
> > 
> > Thanks for the notice.
> > 
> > I think the driver must have received a request with multiple physical
> > segments, but the DMA mapping collapsed it to one. In that case, the driver
> > won't use the "simple" sgl, but we don't allocate a descriptor either, so we
> > need to account for that. I'll send a fix shortly.
> 
> This should fix it:
> 
> ---
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index a331fbfa9a667..6e8fcbf9306d2 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -553,14 +553,16 @@ static void nvme_unmap_data(struct nvme_dev *dev, struct request *req)
>  
>  	dma_unmap_sgtable(dev->dev, &iod->sgt, rq_dma_dir(req), 0);
>  
> -	if (iod->nr_allocations == 0)
> +	if (iod->nr_allocations == 0) {
>  		dma_pool_free(dev->prp_small_pool, iod->list[0].sg_list,
>  			      iod->first_dma);
> -	else if (iod->use_sgl)
> -		dma_pool_free(dev->prp_page_pool, iod->list[0].sg_list,
> -			      iod->first_dma);
> -	else
> -		nvme_free_prps(dev, req);
> +	} else if (iod->nr_allocations > 0) {
> +		if (iod->use_sgl)
> +			dma_pool_free(dev->prp_page_pool, iod->list[0].sg_list,
> +				      iod->first_dma);
> +		else
> +			nvme_free_prps(dev, req);
> +	}
>  	mempool_free(iod->sgt.sgl, dev->iod_mempool);
>  }
>  
> --

Thanks, yes, that seems to have been it: with the above patch applied
on top of linux-next-20230210 the crash disappears, and my standard fio
script runs fine too.

So if I understand things correctly, possibly thanks to "nvme-pci: use
mapped entries for sgl decision", a collapsed DMA mapping lands us in
the entries == 1 case in nvme_pci_setup_sgls(), so no extra descriptor
allocation is needed, and thus the dma_unmap_sgtable() is the only
unmapping we need in nvme_unmap_data()? Does that mean that this case
might previously have used PRPs and thus now needs fewer mapping
operations?

Either way, feel free to add my:

Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>



Thread overview: 16+ messages
2023-01-05 20:28 [PATCH 0/4] nvme patches for 6.3 Keith Busch
2023-01-05 20:28 ` [PATCH 1/4] nvme-pci: remove SGL segment descriptors Keith Busch
2023-01-09  7:07   ` Chaitanya Kulkarni
2023-02-13 10:15   ` Sagi Grimberg
2023-01-05 20:28 ` [PATCH 2/4] nvme-pci: use mapped entries for sgl decision Keith Busch
2023-01-09  7:09   ` Chaitanya Kulkarni
2023-01-05 20:28 ` [PATCH 3/4] nvme-pci: place descriptor addresses in iod Keith Busch
2023-02-13 10:19   ` Sagi Grimberg
2023-01-05 20:28 ` [PATCH 4/4] nvme: always initialize known command effects Keith Busch
2023-01-08 18:21   ` Christoph Hellwig
2023-01-08 18:22 ` [PATCH 0/4] nvme patches for 6.3 Christoph Hellwig
2023-02-10 14:34   ` Niklas Schnelle
2023-02-10 15:24     ` Keith Busch
2023-02-10 15:37       ` Keith Busch
2023-02-10 16:29         ` Niklas Schnelle [this message]
2023-02-10 17:14           ` Keith Busch