linux-kernel.vger.kernel.org archive mirror
From: David Laight <David.Laight@ACULAB.COM>
To: "'Andy Shevchenko'" <andy.shevchenko@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>,
	James Smart <james.smart@broadcom.com>,
	Dick Kennedy <dick.kennedy@broadcom.com>,
	"James E.J. Bottomley" <jejb@linux.vnet.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Hannes Reinecke <hare@suse.com>,
	Johannes Thumshirn <jthumshirn@suse.de>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH] scsi: lpfc: use memcpy_toio instead of writeq
Date: Fri, 23 Feb 2018 17:09:09 +0000
Message-ID: <04b6c673208b4680b7d6cf41ccdfd3f0@AcuMS.aculab.com>
In-Reply-To: <CAHp75VfNKVjAmgEor2veC6sDnuLacxK1BVgKbqdZ-mxOX=9CSw@mail.gmail.com>

From: Andy Shevchenko
> Sent: 23 February 2018 16:51
> On Fri, Feb 23, 2018 at 6:41 PM, David Laight <David.Laight@aculab.com> wrote:
> > From: Arnd Bergmann
> >> Sent: 23 February 2018 15:37
> >>
> >> 32-bit architectures generally cannot use writeq(), so we now get a build
> >> failure for the lpfc driver:
> >>
> >> drivers/scsi/lpfc/lpfc_sli.c: In function 'lpfc_sli4_wq_put':
> >> drivers/scsi/lpfc/lpfc_sli.c:145:4: error: implicit declaration of function 'writeq'; did you mean 'writeb'? [-Werror=implicit-function-declaration]
> >>
> >> Another problem here is that writing out actual data (unlike accessing
> >> mmio registers) means we must write the data with the same endianness
> >> that we have read from memory, but writeq() will perform byte swaps
> >> and add barriers in between accesses as we do for registers.
> >>
> >> Using memcpy_toio() should do the right thing here, using register
> >> sized stores with correct endianness conversion and barriers (i.e. none),
> >> but on some architectures might fall back to byte-size access.
> > ...
> >
> > Have you looked at the performance impact of this on x86?
> > Last time I looked memcpy_toio() aliased directly to memcpy().
> > memcpy() is run-time patched between several different algorithms.
> > On recent Intel cpus memcpy() is implemented as 'rep movsb' relying
> > on the hardware to DTRT.
> > For uncached accesses (typical for io) the 'RT' has to be byte transfers.
> > So instead of the 8 byte transfers (on 64 bit) you get single bytes.
> > This won't be what is intended!
> > memcpy_toio() should probably use 'rep movsd' for the bulk of the transfer.
> 
> Maybe I'm wrong but it uses movsq on 64-bit and movsl on 32-bit.

(Let's not argue about the instruction mnemonic.)

You might expect that, but last time I looked at the bus cycles on a PCIe slave
that wasn't what I saw.

> The side effect I referred to previously is about tails, i.e. unaligned
> bytes are transferred in portions
> like
>   7 on 64-bit will be 4 + 2 + 1,
>   5 will be 4 + 1

On 64-bit, memcpy() is allowed to do:
	((long *)(tgt + len))[-1] = ((long *)(src + len))[-1];
	rep_movsq(tgt, src, len >> 3);
provided the length is at least 8.
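
As a rough userspace illustration of that trick (a sketch only, not the kernel's memcpy()): the last 8 bytes are stored first with a store that may overlap the final whole word, then len >> 3 whole words are copied from the start. The name copy_tail_first() is made up for this example, a plain loop stands in for 'rep movsq', and the fixed-size memcpy() calls stand in for single 8-byte loads and stores, assuming the compiler turns them into such.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): copy len bytes, len >= 8,
 * using only 8-byte accesses and no byte-sized tail.
 */
static void copy_tail_first(unsigned char *tgt, const unsigned char *src,
			    size_t len)
{
	uint64_t word;
	size_t i;

	assert(len >= 8);

	/* Final 8 bytes first; this may overlap the last whole word below. */
	memcpy(&word, src + len - 8, sizeof(word));
	memcpy(tgt + len - 8, &word, sizeof(word));

	/* Stand-in for 'rep movsq': len >> 3 whole words from the start. */
	for (i = 0; i < (len >> 3); i++) {
		memcpy(&word, src + 8 * i, sizeof(word));
		memcpy(tgt + 8 * i, &word, sizeof(word));
	}
}
```

For a 13-byte copy this issues two 8-byte stores (bytes 5..12, then bytes 0..7) rather than 8 + 4 + 1. Rewriting the overlapping bytes is harmless for ordinary memory, but may not be for a device with write side effects.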

The misaligned PCIe transfer generates a single TLP covering 12 bytes with the
relevant byte enables set for the first and last 32-bit words.
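
To make the byte-enable arithmetic concrete, here is a small sketch that works out how many 32-bit words a write covers and which byte lanes are enabled in the first and last word. The struct and function names are invented for illustration, and the sketch assumes the whole write fits in one TLP (it ignores maximum payload size and 4K boundary rules).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a memory-write TLP, for illustration only. */
struct tlp_shape {
	unsigned int dwords;	/* length in 32-bit words */
	unsigned int first_be;	/* byte enables for the first word, bit 0 = lowest byte */
	unsigned int last_be;	/* byte enables for the last word, 0 if only one word */
};

static struct tlp_shape tlp_shape_for(uint64_t addr, unsigned int len)
{
	struct tlp_shape s;
	unsigned int start = addr & 3;			/* first byte's lane */
	unsigned int end = (start + len - 1) & 3;	/* last byte's lane */

	assert(len > 0);

	s.dwords = (start + len + 3) / 4;
	s.first_be = (0xFu << start) & 0xFu;
	s.last_be = 0xFu >> (3 - end);

	/* A one-word request carries all its enables in the first-word field. */
	if (s.dwords == 1) {
		s.first_be &= s.last_be;
		s.last_be = 0;
	}
	return s;
}
```

With an 8-byte write starting one byte past a 32-bit boundary, tlp_shape_for(0x1001, 8) gives 3 words (12 bytes) with enables 0xE on the first word and 0x1 on the last, which is the shape described above.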

	David

