Re: [PATCH v8 2/2] mmc: support packed write command for eMMC4.5 device

* Re: [PATCH v8 2/2] mmc: support packed write command for eMMC4.5 device
@ 2012-11-14  8:14 merez
  2012-11-15 10:05 ` Seungwon Jeon
  0 siblings, 1 reply; 11+ messages in thread
From: merez @ 2012-11-14  8:14 UTC
  To: Chris Ball
  Cc: merez, Seungwon Jeon, linux-mmc, 'Subhash Jadavani',
	'S, Venkatraman', 'Saugata Das',
	'Namjae Jeon'

Hi Chris,

The amount of improvement from packed commands, as with any other
eMMC4.5 feature, depends on several parameters:
1. The card's support for the feature. If the card implements only the
feature interface and gains nothing from it internally, you will see no
improvement when using the feature.
2. The benchmark tool used. Packed command preparation stops when a FLUSH
request arrives, so a benchmark that issues many FLUSH requests results in
very little packing and you will see no improvement.
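
To illustrate point 2, here is a minimal Python sketch of how FLUSH
requests limit the amount of packing. It is a toy model and not the
kernel code: pack_sizes(), stream() and the cap of 8 writes per pack are
names and values made up for this example.

```python
def pack_sizes(requests, max_packed=8):
    """Group consecutive writes into packs.

    A FLUSH request, or reaching max_packed (a hypothetical cap),
    closes the pack that is currently being prepared.
    """
    packs, current = [], 0
    for req in requests:
        if req == "FLUSH":
            if current:
                packs.append(current)
            current = 0
        else:  # "WRITE"
            current += 1
            if current == max_packed:
                packs.append(current)
                current = 0
    if current:
        packs.append(current)
    return packs


def stream(n_writes, flush_every):
    """Build a request stream with a FLUSH after every flush_every writes.

    flush_every=0 means the benchmark never issues a FLUSH.
    """
    out = []
    for i in range(1, n_writes + 1):
        out.append("WRITE")
        if flush_every and i % flush_every == 0:
            out.append("FLUSH")
    return out


print(pack_sizes(stream(16, flush_every=0)))  # [8, 8]
print(pack_sizes(stream(16, flush_every=4)))  # [4, 4, 4, 4]
print(pack_sizes(stream(16, flush_every=1)))  # sixteen packs of one write
```

In this model a benchmark that flushes after every write never gets a
pack larger than one request, so there is nothing for the card to gain.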

You can use the following patch to get the packed commands statistics:
http://marc.info/?l=linux-mmc&m=134374508625826&w=2
With this patch you will be able to see the amount of packing and what
caused the packed preparation to stop.

We tested the packed commands feature with SanDisk cards and got an
improvement of 30% when using lmdd and tiotest. We don't use iozone for
sequential tests, but if you send me the exact command that you use we
can try it as well.

It is true that packed commands can degrade read performance in
read-write collisions. However, it is only natural that with a longer
write request, a read request has to wait longer and its latency
increases. I believe it is not our duty to decide whether this is a
reason to exclude the feature. Everyone should make their own decision
about whether to benefit from the write improvement while risking the
read-write collision scenarios.
eMMC4.5 introduces the HPI and stop transmission to overcome the
degradation of read latency due to write (regardless of the packed
commands).
The packing control is our own enhancement, which we believe can also be
used to overcome this degradation. It is tunable and must be explicitly
enabled, so it is the developer’s decision whether to use it or not.
Since it is not a standard feature, we can discuss separately whether it
should be accepted and what the best way to use it is.
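
The trade-off between write throughput and read latency described above
can be sketched with a toy model in Python. Everything here is an
assumption for illustration: the function names, the fixed per-command
overhead, and the numbers (1 ms overhead, 2 ms of data transfer per
write, 8 writes) are invented and are not measurements from any card. The
model also ignores HPI and stop transmission, so a read cannot interrupt
a transfer that is already in flight.

```python
def unpacked(n_writes, overhead_ms, data_ms):
    """n separate write commands, each paying the per-command overhead.

    A read can be served between two writes, so in the worst case it
    waits for one full write command.
    """
    per_write = overhead_ms + data_ms
    return {"total_ms": n_writes * per_write,
            "worst_read_wait_ms": per_write}


def packed(n_writes, overhead_ms, data_ms):
    """One packed command carrying n writes, paying the overhead once.

    A read that arrives just after the packed command starts has to
    wait for the whole transfer.
    """
    total = overhead_ms + n_writes * data_ms
    return {"total_ms": total, "worst_read_wait_ms": total}


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
print(unpacked(8, overhead_ms=1, data_ms=2))
# {'total_ms': 24, 'worst_read_wait_ms': 3}
print(packed(8, overhead_ms=1, data_ms=2))
# {'total_ms': 17, 'worst_read_wait_ms': 17}
```

With these made-up numbers the packed case finishes the writes sooner
(17 ms instead of 24 ms) while the worst-case read wait grows from 3 ms
to 17 ms. Which side matters more depends on the workload.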

Packed commands are not the only eMMC4.5 feature that can cause
degradation in specific scenarios. If we look at the cache feature, it
degrades the performance of random operations by almost half when FLUSH
is used.
With the following iozone command and the cache enabled, you will see
degradation in the iozone results:
./data/iozone -i0 -i2 -r4k -s50m -O -o -I -f /data/mmc0/file3
However, cache support was accepted regardless of this degradation, and
it is the developer’s responsibility to decide whether to use this
feature or not.
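
For reference, the iozone command above can be broken down option by
option. The Python list below only rebuilds the same command line with a
comment per flag, based on my reading of the iozone documentation; the
name IOZONE_CMD is just for this example. The -o option (synchronous
writes) is presumably what makes this run issue so many FLUSH requests.

```python
# The iozone invocation from the text, one argument per line.
IOZONE_CMD = [
    "./data/iozone",
    "-i0",    # test 0: write / rewrite
    "-i2",    # test 2: random read / random write
    "-r4k",   # record size of 4 KB
    "-s50m",  # file size of 50 MB
    "-O",     # report results in operations per second
    "-o",     # synchronous writes (O_SYNC)
    "-I",     # direct I/O (O_DIRECT), bypassing the page cache
    "-f",     # the next argument is the test file
    "/data/mmc0/file3",
]

print(" ".join(IOZONE_CMD))
```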

To summarize, all eMMC4.5 features that were added are tunable and
disabled by default.
I believe that anyone who enables a certain feature will do all the
testing required to determine whether they can benefit from it in their
own environment.

Thanks,
Maya

On Tue, November 13, 2012 6:54 pm, Chris Ball wrote:
> Hi Maya,
>
> On Sun, Nov 04 2012, merez@codeaurora.org wrote:
>> Packed commands is a mandatory eMMC4.5 feature and is supported by all
>> the card vendors.
>
> We're still only talking about using packed writes, though, right?
>
>> It was proven to be beneficial for eMMC4.5 cards and harmless for non
>> eMMC4.5 cards.
>
> My understanding is that write packing causes a regression in read
> performance that can be tuned/fixed by your num_wr_reqs_to_start_packing
> tunable (and read packing causes a read regression with current eMMC 4.5
> cards).  Is that wrong?
>
>> I don't see a point to hold it back while it can be enabled or
>> disabled by a flag and most of the code it adds is guarded in specific
>> functions and is not active when packed commands is disabled.
>
> Earlier in the thread I wrote:
>
>>> * I still don't have a good set of representative benchmarks showing
>>>   what kind of performance changes come with this patchset. It seems
>>>   like we've had a small amount of testing on one controller/eMMC part
>>>   combo from Seungwon, and an entirely different test from Maya, and
>>>   the results aren't documented fully anywhere to the level of
>>>   describing what the hardware was, what the test was, and what the
>>>   results were before and after the patchset.
>
> I still feel this way.  I'm worried that we might be merging code that
> works well on your controller/card but causes large regressions for
> everyone else.  I don't want to handle this by making a tunable that
> everyone has to tune for their system, because I don't think anyone will
> tune it.  I don't think that shipping a capability that will probably
> lead to performance regressions if you turn it on is a good idea.
>
> I'm in a better position to help now, though -- I have some motherboards
> with Marvell SoCs and a socketed eMMC slot, and I have eMMC 4.5 parts
> from Sandisk and Toshiba.  So I can try to help work out how
> generalizable your results are across other controllers and cards.
>
> So far I've only tried the Sandisk part, but it didn't show any write
> improvement with write packing.  I've verified that the switch command
> to turn on packed_event_en happens and succeeds, and that the caps are
> set correctly, so I'm not sure what's wrong yet.  With iozone I get:
>
>                        KB  reclen   write rewrite
> Unpacked writes:    10240    8192   17250   16794
> Packed writes:      10240    8192   16930   17353
>
> I'll try the Toshiba part next, and I'll start using lmdd as well as
> iozone.  Any ideas on why I might not be seeing improvements with
> Sandisk?
>
> I'm not opposed to merging packed write support in principle, I just
> want to be convinced that we're not causing regressions for most users
> who turn it on.  (And more than that, I want to see that it leads to
> improvements that make it worth adding the code complexity for.)
>
> Thanks,
>
> - Chris.
> --
> Chris Ball   <cjb@laptop.org>   <http://printf.net/>
> One Laptop Per Child
>

-- 
QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
