From: martin sperl <kernel-TqfNSX0MhmxHKSADF0wUEw@public.gmane.org>
To: linux-spi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: Mark Brown <broonie-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Subject: Re: [PATCH] added API: spi_message_compile/optimize/hint/...
Date: Wed, 20 Nov 2013 10:11:44 +0100 [thread overview]
Message-ID: <528C7CD0.3060608@martin.sperl.org> (raw)
In-Reply-To: <0C7D5B1E-E561-4F52-BEA8-572EB0CA26A6-TqfNSX0MhmxHKSADF0wUEw@public.gmane.org>
Here the missing sign-off:
Signed-Off-By: Martin Sperl <kernel-TqfNSX0MhmxHKSADF0wUEw@public.gmane.org>
Is this sufficient or do you need to send the patch again?
By the way: I am thinking of extending the spi_transfer
with another set of compiled_flags, so that individual
spi_transfers may get marked as "modifiable".
Adding also something like the following defines:
#define SPI_MESSAGE_COMPILE_MODIFIES_RXBUF (1<<0)
#define SPI_MESSAGE_COMPILE_MODIFIES_TXBUF (1<<1)
#define SPI_MESSAGE_COMPILE_MODIFIES_RXDMA ((1<<2)|(1<<0))
#define SPI_MESSAGE_COMPILE_MODIFIES_TXDMA ((1<<3)|(1<<1))
#define SPI_MESSAGE_COMPILE_MODIFIES_LEN (1<<4)
#define SPI_MESSAGE_COMPILE_MODIFIES_DELAY (1<<5)
#define SPI_MESSAGE_COMPILE_IN_ATOMIC_CONTEXT (1<<30)
#define SPI_MESSAGE_COMPILE_FLAGS_SET_IN_XFER (1<<31)
Note that SPI_MESSAGE_COMPILE_FLAGS_SET_IN XFER is there mainly to
allow copying the default flags from spi_message_compile to the
individual transfers if xfer.compile_flags=0. This copy would happen
in spi_message_verify (formerly in __spi_async).
adding the following in the list_for_each_entry:
if (!xfer->compile_flags)
xfer->compile_flags = message->compiled_flags;
(but it would also require a slight change in spi_message_compile
to set message->compiled_flags before calling spi_message_verify.
That would allow spi_read, spi_write, spi_write_then_read to define
its "variation" in the structure and then compile the message.
Which would give a slight boost - and if I think of it it may make
things faster for a dma driver, as there could be some "shortcuts"...
for the different cases that could be made quite efficient if it
can rely on the above hints... (essentially a list with
copy from X to Y and maybe the required DMA mapping computation - if
necessary)
That way we at least avoid the memory allocation and the computation
of structures.
Please tell me your preferences.
Martin
On 19.11.2013 19:33, Martin Sperl wrote:
> Hi!
>
> Here the patch for a spi_message_compile interface
> (as discussed in a separate thread).
>
> Naming of the API is subject to discussion - instead of "compile" it could
> also be "optimize" or "hint" or ...
>
> The patch creates:
> * 3 additional variables in spi_message
> * 2 additional function pointers in spi_master
> * spi_message_compile/uncompile using the above
> * __spi_async modified, so that the "validation" step for the message
> does not run every time.
>
> Here some measurements using my "standard" CAN test:
> * runs on Raspberry Pi without over-clocking
> * optimized mcp2515 driver with 3 Message chunks and callbacks running
> in listen only mode
> * CAN bus is running at 125kHz (not 500 as before to simulate "less" load
> - 25% to be exact)
> * single can device retransmitting single frame repeatedly. (every 2.1s there
> is a gap of 15ms when no traffic happens - device reboots after watchdog
> timeout)
> * spi-bcm2835 stock kernel driver is used with patch to run message pump with
> realtime priority.
> * kernel 3.10.16 (foundation based, as the vanilla RPI kernel still misses
> dmaengine, so it is easier to develop against this kernel)
>
> Main metrics reported are taken with "vmstat 10 12"
>
> spi-bcm2835 realtime=0, unpatched spi.h, mcp2515 without any "compiles":
> Messages received: 178523
> Messages in second HW buffer: 20
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 0 0 0 411516 14380 26436 0 0 588 19 1340 653 12 31 37 20
> 0 0 0 411232 14388 26436 0 0 0 4 45880 9412 0 35 61 4
> 0 0 0 411104 14388 26436 0 0 0 0 45843 9378 0 34 66 0
> 0 0 0 411040 14388 26436 0 0 0 8 45803 9528 0 41 59 0
> 0 0 0 410816 14396 26436 0 0 0 6 45703 9394 0 38 53 9
> 0 0 0 410624 14396 26436 0 0 0 0 45623 9270 0 33 67 0
> 0 0 0 410560 14396 26436 0 0 0 0 45793 9335 0 34 66 0
> 0 0 0 410464 14396 26436 0 0 0 3 45786 9367 0 35 65 0
> 0 0 0 410464 14396 26436 0 0 0 0 45741 9364 0 35 65 0
> 0 0 0 410496 14396 26436 0 0 0 0 45772 9360 0 35 65 0
> 0 0 0 410464 14396 26436 0 0 0 0 45731 9309 0 34 66 0
> 0 0 0 410528 14396 26436 0 0 0 0 45726 9287 0 34 66 0
>
> spi-bcm2835 realtime=1 unpatched spi.h, mcp2515 without any "compiles":
> Messages received: 178031
> Messages in second HW buffer: 4
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 0 0 0 411512 14408 26440 0 0 57 2 7223 1497 1 8 89 2
> 0 0 0 411384 14416 26440 0 0 0 2 45815 9827 0 34 61 5
> 0 0 0 411288 14416 26440 0 0 0 0 45778 9822 0 33 67 0
> 0 0 0 411128 14416 26440 0 0 0 2 45630 9807 0 35 65 0
> 0 0 0 411064 14424 26440 0 0 0 6 45705 10029 0 40 54 6
> 0 0 0 411032 14424 26440 0 0 0 0 45628 9807 0 33 67 0
> 0 0 0 410840 14424 26440 0 0 0 0 45611 9763 0 33 67 0
> 0 0 0 410744 14424 26440 0 0 0 0 45544 9716 0 33 67 0
> 0 0 0 410712 14424 26440 0 0 0 0 45574 9756 0 33 67 0
> 0 0 0 410712 14424 26440 0 0 0 0 45660 9807 0 33 67 0
> 0 0 0 410712 14424 26440 0 0 0 0 45494 9768 0 33 67 0
> 0 0 0 410712 14424 26440 0 0 0 0 45549 9778 0 34 66 0
>
> spi-bcm2835 realtime=0 patched spi.h, mcp2515 without any "compiles":
> Messages received: 178378
> Messages in second HW buffer: 6
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 0 0 0 355064 20460 71900 0 0 182 20 8400 1786 15 17 57 11
> 0 0 0 355180 20468 71900 0 0 0 2 45380 8788 0 33 59 8
> 0 0 0 355116 20468 71900 0 0 0 0 46032 8863 0 33 67 0
> 1 0 0 355020 20468 71900 0 0 0 2 45768 8865 0 36 64 0
> 0 0 0 354924 20476 71900 0 0 0 6 45749 8869 0 39 51 10
> 0 0 0 354828 20476 71900 0 0 0 0 45452 8747 0 33 67 0
> 0 0 0 354732 20476 71900 0 0 0 0 45644 8757 0 33 67 0
> 0 0 0 354540 20476 71900 0 0 0 0 45404 8746 0 33 67 0
> 0 0 0 354508 20476 71900 0 0 0 0 45678 8739 0 32 68 0
> 0 0 0 354548 20476 71900 0 0 0 0 45959 8804 0 33 67 0
> 0 0 0 354516 20476 71900 0 0 0 0 45984 8831 0 33 67 0
> 0 0 0 354548 20476 71900 0 0 0 0 45925 8835 0 34 66 0
>
> spi-bcm2835 realtime=1 patched spi.h, mcp2515 without any "compiles":
> Messages received: 177648
> Messages in second HW buffer: 6
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 0 0 0 355456 20488 71908 0 0 116 13 11779 2372 10 15 68 7
> 0 0 0 355296 20496 71908 0 0 0 2 45963 8916 0 35 61 3
> 0 0 0 355168 20496 71908 0 0 0 0 45879 8913 0 35 65 0
> 0 0 0 355104 20496 71908 0 0 0 2 45803 8880 0 36 64 0
> 0 0 0 355008 20504 71908 0 0 0 8 45352 8888 0 41 48 11
> 0 0 0 354880 20504 71908 0 0 0 0 45424 8846 0 36 64 0
> 0 0 0 354816 20504 71908 0 0 0 0 45247 8804 0 34 66 0
> 0 0 0 354720 20504 71908 0 0 0 0 45532 8824 0 35 65 0
> 0 0 0 354688 20504 71908 0 0 0 0 45553 8826 0 36 64 0
> 0 0 0 354720 20504 71908 0 0 0 0 45490 8834 0 34 66 0
> 0 0 0 354688 20504 71908 0 0 0 0 45673 8880 0 35 65 0
> 0 0 0 354688 20504 71908 0 0 0 0 45041 8674 0 34 66 0
>
> spi-bcm2835 realtime=0 patched spi.h, mcp2515 with "compile":
> Messages received: 178493
> Messages in second HW buffer: 38
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 0 0 0 353868 20792 73116 0 0 74 9 15549 3084 8 16 71 5
> 0 0 0 353804 20800 73116 0 0 0 2 45717 9051 0 32 63 5
> 0 0 0 353676 20800 73116 0 0 0 0 45844 9044 0 33 67 0
> 0 0 0 353620 20800 73116 0 0 0 1 45829 9120 0 36 64 0
> 0 0 0 353588 20804 73116 0 0 0 2 45885 9164 0 37 58 5
> 0 0 0 353460 20804 73116 0 0 0 0 45737 9070 0 32 68 0
> 0 0 0 353364 20804 73116 0 0 0 0 45752 9056 0 31 69 0
> 0 0 0 353204 20804 73116 0 0 0 0 45847 9078 0 33 67 0
> 0 0 0 353140 20804 73116 0 0 0 0 45629 9035 0 33 67 0
> 0 0 0 353108 20804 73116 0 0 0 0 45677 9052 0 32 68 0
> 0 0 0 353140 20804 73116 0 0 0 0 45692 9025 0 33 67 0
> 0 0 0 353172 20804 73116 0 0 0 0 45687 9023 0 33 67 0
>
> spi-bcm2835 realtime=1 patched spi.h, mcp2515 with "compile":
> Messages received: 177623
> Messages in second HW buffer: 3
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 0 0 0 353964 20760 73112 0 0 87 10 13506 2687 9 15 70 6
> 0 0 0 353836 20768 73112 0 0 0 2 45629 8960 0 34 63 4
> 0 0 0 353740 20768 73112 0 0 0 0 45360 8920 0 33 67 0
> 0 0 0 353676 20768 73112 0 0 0 2 45660 9033 0 35 65 0
> 0 0 0 353580 20776 73112 0 0 0 6 45714 9007 0 36 57 6
> 0 0 0 353452 20776 73112 0 0 0 0 45489 8897 0 32 68 0
> 0 0 0 353292 20776 73112 0 0 0 0 45615 8976 0 33 67 0
> 0 0 0 353164 20776 73112 0 0 0 0 45467 8942 0 34 66 0
> 0 0 0 353164 20776 73112 0 0 0 0 45464 8933 0 32 68 0
> 0 0 0 353196 20776 73112 0 0 0 0 45535 8937 0 33 67 0
> 0 0 0 353196 20776 73112 0 0 0 0 45329 8916 0 33 67 0
> 0 0 0 353228 20776 73112 0 0 0 0 45610 8953 0 33 67 0
>
> So in summary:
> * there is no regression in performance with the patch
> * The improved locking does not show any real difference between
> installations, but then we would nto be expecting an interrupt while running
> spi_async (at least not with this driver).
> * there is a slight indication that System CPU values are slightly lower as we
> have 31% and 32% System CPU that we have not seen in the tests prior to
> applying the patch. This is thus related to the optimization of avoiding the
> repeated message checks in case of an optimized driver.
>
> So this has a slightly positive impact on overall performance.
> we could add a means to optimize all those spi_write/read/write_then_read
> calls by compiling them. But that depends still on what we want to implement.
>
> Assumptions by default should be:
> * is_dma_mapped is set
> * no change to structure, length, speed, bits, delays are allowed.
>
> We may think of adding either flags for "global" changes like:
> * allow length of transfer to change
> * allow the dma_address of the transfer to change (so still with
> is_dma_mapped)
> * allow the address of the transfer to change (would mean overhead for
> mapping dma on a per call basis)
>
> From an optimization perspective it might be better to give "hints" to what
> may change per spi_transfer, so adding a "u32 compile_hint" to spi_transfer
> might be preferable to make it work best.
>
> Please review and comment.
>
> Thanks,
> Martin
>
> P.s: The patch itself - it is against 3.10.16, so it might not apply cleanly...
>
> diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
> index 32b7bb1..8f1aede 100644
> --- a/drivers/spi/spi.c
> +++ b/drivers/spi/spi.c
> @@ -1346,7 +1346,18 @@ int spi_setup(struct spi_device *spi)
> }
> EXPORT_SYMBOL_GPL(spi_setup);
>
> -static int __spi_async(struct spi_device *spi, struct spi_message *message)
> +/**
> + * spi_message_verify - verifies if the spi message is supported
> + * by the bus-master
> + * @spi: device with which data will be exchanged
> + * @message: describes the data transfers, including completion callback
> + * Context: any (irqs may be blocked, etc)
> + *
> + * This call may be used in_irq and other contexts which can't sleep,
> + * as well as from task contexts which can sleep.
> + */
> +extern int spi_message_verify(struct spi_device *spi,
> + struct spi_message *message)
> {
> struct spi_master *master = spi->master;
> struct spi_transfer *xfer;
> @@ -1388,6 +1399,20 @@ static int __spi_async(struct spi_device *spi, struct spi_message *message)
> return -EINVAL;
> }
> }
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(spi_message_verify);
> +
> +static int __spi_async(struct spi_device *spi, struct spi_message *message)
> +{
> + struct spi_master *master = spi->master;
> + int ret;
> +
> + if (!message->is_compiled) {
> + ret=spi_message_verify(spi,message);
> + if (ret)
> + return ret;
> + }
>
> message->spi = spi;
> message->status = -EINPROGRESS;
> @@ -1430,15 +1455,12 @@ int spi_async(struct spi_device *spi, struct spi_message *message)
> unsigned long flags;
>
> spin_lock_irqsave(&master->bus_lock_spinlock, flags);
> -
> - if (master->bus_lock_flag)
> - ret = -EBUSY;
> - else
> - ret = __spi_async(spi, message);
> -
> + ret=master->bus_lock_flag;
> spin_unlock_irqrestore(&master->bus_lock_spinlock, flags);
> + if (ret)
> + return -EBUSY;
>
> - return ret;
> + return __spi_async(spi, message);
> }
> EXPORT_SYMBOL_GPL(spi_async);
>
> diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
> index 6ff26c8..cea622f 100644
> --- a/include/linux/spi/spi.h
> +++ b/include/linux/spi/spi.h
> @@ -378,6 +378,14 @@ struct spi_master {
> int (*unprepare_transfer_hardware)(struct spi_master *master);
> /* gpio chip select */
> int *cs_gpios;
> + /* compile (static) message handlers */
> + int (*message_compile)(struct spi_master *master,
> + struct spi_message *mesg,
> + u32 flags
> + );
> + void (*message_uncompile)(struct spi_master *master,
> + struct spi_message *mesg
> + );
> };
>
> static inline void *spi_master_get_devdata(struct spi_master *master)
> @@ -532,6 +540,12 @@ struct spi_transfer {
> * @spi: SPI device to which the transaction is queued
> * @is_dma_mapped: if true, the caller provided both dma and cpu virtual
> * addresses for each transfer buffer
> + * @is_compiled: if true, then the message has been initialized
> + * via spi_compile_message
> + * @compiled_flags: flags given when the message got compiled
> + * via spi_message_compile
> + * @compiled_data: data owned by bus-master driver between
> + * spi_message_compile and spi_message_uncompile
> * @complete: called to report transaction completions
> * @context: the argument to complete() when it's called
> * @actual_length: the total number of bytes that were transferred in all
> @@ -561,6 +575,13 @@ struct spi_message {
>
> unsigned is_dma_mapped:1;
>
> + /* bus-driver owned data for initialized messages
> + * between spi_message_compile and spi_message_uncompile
> + */
> + unsigned is_compiled:1;
> + u32 compiled_flags;
> + void *compiled_data;
> +
> /* REVISIT: we might want a flag affecting the behavior of the
> * last transfer ... allowing things like "read 16 bit length L"
> * immediately followed by "read L bytes". Basically imposing
> @@ -604,6 +625,41 @@ spi_transfer_del(struct spi_transfer *t)
> list_del(&t->transfer_list);
> }
>
> +extern int spi_message_verify(struct spi_device *spi,
> + struct spi_message *m);
> +
> +static inline int spi_message_compile(struct spi_device *spi,
> + struct spi_message *m,
> + u32 flags)
> +{
> + int verify;
> + if (m->is_compiled)
> + return -EOPNOTSUPP;
> +
> + if ((verify=spi_message_verify(spi,m)))
> + return verify;
> +
> + m->is_compiled=1;
> + m->compiled_flags=flags;
> + m->compiled_data=NULL;
> +
> + if (spi->master->message_compile)
> + return spi->master->message_compile(spi->master,m,flags);
> + else
> + return 0;
> +}
> +
> +static inline void spi_message_uncompile(struct spi_device *spi,
> + struct spi_message *m)
> +{
> + if (spi->master->message_uncompile)
> + spi->master->message_uncompile(spi->master,m);
> +
> + m->is_compiled=0;
> + m->compiled_flags=0;
> + m->compiled_data=NULL;
> +}
> +
> /**
> * spi_message_init_with_transfers - Initialize spi_message and append transfers
> * @m: spi_message to be initialized
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2013-11-20 9:11 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-11-19 18:33 [PATCH] added API: spi_message_compile/optimize/hint/ Martin Sperl
[not found] ` <0C7D5B1E-E561-4F52-BEA8-572EB0CA26A6-TqfNSX0MhmxHKSADF0wUEw@public.gmane.org>
2013-11-19 19:25 ` Mark Brown
2013-11-20 9:11 ` martin sperl [this message]
2013-11-20 11:29 ` Mark Brown
[not found] ` <20131120112920.GX2674-GFdadSzt00ze9xe1eoZjHA@public.gmane.org>
2013-11-20 15:27 ` Martin Sperl
[not found] ` <05C64EF6-0F3B-4CBC-A462-45ED8F1564A7-TqfNSX0MhmxHKSADF0wUEw@public.gmane.org>
2013-11-20 16:24 ` Mark Brown
[not found] ` <20131120162449.GJ2674-GFdadSzt00ze9xe1eoZjHA@public.gmane.org>
2013-11-20 18:42 ` Martin Sperl
[not found] ` <9DE09AF6-AD13-45DE-A36F-BD65E619C342-TqfNSX0MhmxHKSADF0wUEw@public.gmane.org>
2013-11-21 19:10 ` Mark Brown
[not found] ` <20131121191004.GH8120-GFdadSzt00ze9xe1eoZjHA@public.gmane.org>
2013-11-22 6:53 ` Martin Sperl
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=528C7CD0.3060608@martin.sperl.org \
--to=kernel-tqfnsx0mhmxhksadf0wuew@public.gmane.org \
--cc=broonie-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
--cc=linux-spi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).