iommu.lists.linux-foundation.org archive mirror
From: Will Deacon <will@kernel.org>
To: John Garry <john.garry@huawei.com>
Cc: Vijay Kilary <vkilari@codeaurora.org>,
	Jean-Philippe Brucker <jean-philippe.brucker@arm.com>,
	Jon Masters <jcm@redhat.com>, Jan Glauber <jglauber@marvell.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	iommu@lists.linux-foundation.org,
	Jayachandran Chandrasekharan Nair <jnair@marvell.com>,
	Robin Murphy <robin.murphy@arm.com>
Subject: Re: [RFC PATCH v2 18/19] iommu/arm-smmu-v3: Reduce contention during command-queue insertion
Date: Wed, 24 Jul 2019 13:15:48 +0100
Message-ID: <20190724121548.j5tekad45kwlobvs@willie-the-truck>
In-Reply-To: <8a1be404-f22a-1f96-2f0d-4cf35ca99d2d@huawei.com>

Hi John,

Thanks for reading the code!

On Fri, Jul 19, 2019 at 12:04:15PM +0100, John Garry wrote:
> > +static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> > +				       u64 *cmds, int n, bool sync)
> > +{
> > +	u64 cmd_sync[CMDQ_ENT_DWORDS];
> > +	u32 prod;
> >  	unsigned long flags;
> > -	bool wfe = !!(smmu->features & ARM_SMMU_FEAT_SEV);
> > -	struct arm_smmu_cmdq_ent ent = { .opcode = CMDQ_OP_CMD_SYNC };
> > -	int ret;
> > +	bool owner;
> > +	struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
> > +	struct arm_smmu_ll_queue llq = {
> > +		.max_n_shift = cmdq->q.llq.max_n_shift,
> > +	}, head = llq;
> > +	int ret = 0;
> > 
> > -	arm_smmu_cmdq_build_cmd(cmd, &ent);
> > +	/* 1. Allocate some space in the queue */
> > +	local_irq_save(flags);
> > +	llq.val = READ_ONCE(cmdq->q.llq.val);
> > +	do {
> > +		u64 old;
> > +
> > +		while (!queue_has_space(&llq, n + sync)) {
> > +			local_irq_restore(flags);
> > +			if (arm_smmu_cmdq_poll_until_not_full(smmu, &llq))
> > +				dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
> > +			local_irq_save(flags);
> > +		}
> > +
> > +		head.cons = llq.cons;
> > +		head.prod = queue_inc_prod_n(&llq, n + sync) |
> > +					     CMDQ_PROD_OWNED_FLAG;
> > +
> > +		old = cmpxchg_relaxed(&cmdq->q.llq.val, llq.val, head.val);
> > +		if (old == llq.val)
> > +			break;
> > +
> > +		llq.val = old;
> > +	} while (1);
> > +	owner = !(llq.prod & CMDQ_PROD_OWNED_FLAG);
> > +
> > +	/*
> > +	 * 2. Write our commands into the queue
> > +	 * Dependency ordering from the cmpxchg() loop above.
> > +	 */
> > +	arm_smmu_cmdq_write_entries(cmdq, cmds, llq.prod, n);
> > +	if (sync) {
> > +		prod = queue_inc_prod_n(&llq, n);
> > +		arm_smmu_cmdq_build_sync_cmd(cmd_sync, smmu, prod);
> > +		queue_write(Q_ENT(&cmdq->q, prod), cmd_sync, CMDQ_ENT_DWORDS);
> > +
> > +		/*
> > +		 * In order to determine completion of our CMD_SYNC, we must
> > +		 * ensure that the queue can't wrap twice without us noticing.
> > +		 * We achieve that by taking the cmdq lock as shared before
> > +		 * marking our slot as valid.
> > +		 */
> > +		arm_smmu_cmdq_shared_lock(cmdq);
> > +	}
> > +
> > +	/* 3. Mark our slots as valid, ensuring commands are visible first */
> > +	dma_wmb();
> > +	prod = queue_inc_prod_n(&llq, n + sync);
> > +	arm_smmu_cmdq_set_valid_map(cmdq, llq.prod, prod);
> > +
> > +	/* 4. If we are the owner, take control of the SMMU hardware */
> > +	if (owner) {
> > +		/* a. Wait for previous owner to finish */
> > +		atomic_cond_read_relaxed(&cmdq->owner_prod, VAL == llq.prod);
> > +
> > +		/* b. Stop gathering work by clearing the owned flag */
> > +		prod = atomic_fetch_andnot_relaxed(CMDQ_PROD_OWNED_FLAG,
> > +						   &cmdq->q.llq.atomic.prod);
> > +		prod &= ~CMDQ_PROD_OWNED_FLAG;
> > +		head.prod &= ~CMDQ_PROD_OWNED_FLAG;
> > +
> 
> Could it be a minor optimisation to advance the HW producer pointer at this
> stage for the owner only? We know that its entries are written, and it
> should be first in the new batch of commands (right?), so we could advance
> the pointer to at least get the HW started.
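
For readers following the ownership question, the allocation step can be modelled outside the kernel. What follows is a minimal userspace sketch using C11 atomics, not the driver code: OWNED_FLAG, alloc_slots() and stop_gathering() are hypothetical names, and the wrap bit, the queue_has_space() check and the irq handling from the patch are deliberately left out. It only illustrates that whoever finds the flag clear becomes the owner, and that the owner's slots start at the old producer index, i.e. at the front of the new batch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for CMDQ_PROD_OWNED_FLAG; the real bit layout may differ. */
#define OWNED_FLAG (1u << 31)
#define IDX_MASK   (OWNED_FLAG - 1)

struct alloc_result {
	uint32_t start;	/* first slot index reserved for this caller */
	bool owner;	/* true if the owned flag was clear when we allocated */
};

/* Reserve n slots and set the owned flag, mirroring the cmpxchg() loop. */
static struct alloc_result alloc_slots(_Atomic uint32_t *prod, uint32_t n)
{
	uint32_t old = atomic_load_explicit(prod, memory_order_relaxed);
	uint32_t new;

	assert(n <= IDX_MASK);
	do {
		new = (((old & IDX_MASK) + n) & IDX_MASK) | OWNED_FLAG;
		/* On failure, 'old' is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak_explicit(prod, &old, new,
							memory_order_relaxed,
							memory_order_relaxed));

	return (struct alloc_result){
		.start = old & IDX_MASK,
		.owner = !(old & OWNED_FLAG),
	};
}

/* Owner only: clear the flag and return the end of the gathered batch. */
static uint32_t stop_gathering(_Atomic uint32_t *prod)
{
	return atomic_fetch_and_explicit(prod, ~OWNED_FLAG,
					 memory_order_relaxed) & IDX_MASK;
}
```

In this model, the first caller after stop_gathering() sees the flag clear and becomes the next owner; every later caller joins that owner's batch until the flag is cleared again.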

I think that would be a valid thing to do, but it depends on the relative
cost of writing to prod compared to how long we're likely to wait. Given
that everybody has irqs disabled when writing out their commands, I wouldn't
expect the waiting to be a big issue, although we could probably optimise
arm_smmu_cmdq_write_entries() into a memcpy() if we needed to.

In other words, I think we need numbers to justify that change.
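
To make the memcpy() remark above concrete, here is a rough userspace sketch of what such a change could look like. It is an illustration under assumptions, not the driver's implementation: the ring size, ENT_DWORDS and both function names are made up for the example, and any byte-order conversion the real queue_write() helper may perform is ignored. The loop version copies one dword at a time; the memcpy() version splits the copy at the point where the ring wraps.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ENT_DWORDS 2			/* assumed entry size in 64-bit words */
#define RING_SHIFT 3			/* hypothetical: 8-entry ring */
#define RING_ENTS  (1u << RING_SHIFT)

/* Entry-by-entry copy, one dword at a time, wrapping on each entry. */
static void write_entries_loop(uint64_t *ring, const uint64_t *cmds,
			       uint32_t prod, uint32_t n)
{
	for (uint32_t i = 0; i < n; i++) {
		uint32_t slot = (prod + i) & (RING_ENTS - 1);

		for (uint32_t d = 0; d < ENT_DWORDS; d++)
			ring[slot * ENT_DWORDS + d] = cmds[i * ENT_DWORDS + d];
	}
}

/* Same result with at most two memcpy() calls: up to the end, then the wrap. */
static void write_entries_memcpy(uint64_t *ring, const uint64_t *cmds,
				 uint32_t prod, uint32_t n)
{
	uint32_t start = prod & (RING_ENTS - 1);
	uint32_t first = RING_ENTS - start;	/* entries before the wrap */

	assert(n <= RING_ENTS);
	if (first > n)
		first = n;

	memcpy(&ring[start * ENT_DWORDS], cmds,
	       (size_t)first * ENT_DWORDS * sizeof(uint64_t));
	memcpy(ring, &cmds[first * ENT_DWORDS],
	       (size_t)(n - first) * ENT_DWORDS * sizeof(uint64_t));
}
```

Whether the second form is actually faster on a given machine is exactly the kind of thing that would need measuring.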

Thanks,

Will
