All of lore.kernel.org
 help / color / mirror / Atom feed
* RE: [RFC] consistent_sync and non L1 cache line aligned buffers
@ 2003-07-16  0:12 Darin.Johnson
  0 siblings, 0 replies; 18+ messages in thread
From: Darin.Johnson @ 2003-07-16  0:12 UTC (permalink / raw)
  To: linuxppc-embedded


> If you think about it, you will see that if you are doing DMA to an
> unaligned buffer, and some other unrelated part of the kernel is
> accessing another part of the cache line, you are in trouble no matter
> what sequence of cache flushes/invalidates/whatever you do.

I sort of assumed this was a given.  I'm used to things like
invalidating just 1514 bytes in a larger buffer, because I know
that it's safe.  However, I can agree with your sentiment, since
the actual reason that I made my invalidate routine safer was
because this was simpler than convincing some other programmers
how to do the right thing :-)

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 18+ messages in thread
* RE: [RFC] consistent_sync and non L1 cache line aligned buffers
@ 2003-07-15 23:04 Darin.Johnson
  2003-07-15 23:34 ` Paul Mackerras
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Darin.Johnson @ 2003-07-15 23:04 UTC (permalink / raw)
  To: linuxppc-embedded


> IMHO, the easiest solution is
> alignment of buffers.....plus it's likely to be a performance
> improvement.

True, it's the easiest solution for the kernel developer, but
requires more work from driver authors.  Which is ok, *if* it's
well documented and everyone knows buffers must be aligned,
and that's the problem.  I think some people implicitly understand
these issues, and assume that everyone else thinks the same way.
The driver authors on the other hand often come from completely
different environments and may be porting code that's been working
fine.

My headache example was the 4 byte buffer that MPC860 CPM
was using to talk to an I2C device, there just wasn't an
elegant solution...

A big snag is that problems in this area are not readily apparent,
and bugs may not surface early.  "Option 1", to provide assertions,
has some appeal when I think about it, because it'll point out
problems that may arise on other architectures (ie, the author
develops on PowerPC with a 'smart' invalidate_dcache_region, but
then the code mysteriously fails twice a week on a pentium).

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 18+ messages in thread
* RE: [RFC] consistent_sync and non L1 cache line aligned buffers
@ 2003-07-15 20:47 Darin.Johnson
  0 siblings, 0 replies; 18+ messages in thread
From: Darin.Johnson @ 2003-07-15 20:47 UTC (permalink / raw)
  To: linuxppc-embedded


> Well, I would be, of the USB code.  SCSI might have had it fixed, and
> others that we haven't found yet may or may not.  But the important
> point is that doing this is a driver bug and it's OK to beat driver
> authors over the head with patches to fix the behavior. :)

I don't necessarily agree.  It's an argument over what is
broken - the invalidate routine that makes assumptions, or the
device writer that assumes the interface functions work as
claimed, or the documentation that doesn't exist?

It's fine to beat the driver authors over the head with the
fact that the functions don't necessarily do what their
names imply.  And it's fine to discourage authors from doing
DMA on the stack (that's a separate issue from unaligned
buffers).

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 18+ messages in thread
[parent not found: <F0B628F30F48064289D8CCC1EE21B7A80C48A5@mvebe001.americas.n okia.com>]
* RE: [RFC] consistent_sync and non L1 cache line aligned buffers
@ 2003-07-15 20:18 Darin.Johnson
  0 siblings, 0 replies; 18+ messages in thread
From: Darin.Johnson @ 2003-07-15 20:18 UTC (permalink / raw)
  To: linuxppc-embedded


> invalidate_dcache_range works in L1_CACHE_LINE chunks, so if
> start and/or
> end of the buffer are not aligned we may corrupt data located
> in the same
> cache line (usually stack variable(s) declared before or after buffer
> declaration).

I had noticed this a long time ago when looking to see how PowerPC
Linux had solved this problem, but discovered that it just sidestepped
the problem.

I solved the problem (in a non-Linux system) by just flushing the first
and last lines in the requested range, and invalidating the rest.  The
very slight performance hit is probably less than testing to see if the
buffer is unaligned.

> To be safe I think it's better to modify consistent_sync to
> handle such
> "bad" buffers.

"Bad" is the wrong word to use, in my opinion.  If the function is
not called "invalidate_aligned_dcache_range", or is not otherwise
clearly and unambiguously documented (in an easy to locate place)
that there are severe restrictions on the input parameters, then
the function itself is "bad".  The "principle of least astonishment"
implies that a function should do just what its name implies without
hidden surprises.

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 18+ messages in thread
* [RFC] consistent_sync and non L1 cache line aligned buffers
@ 2003-07-15  4:32 Eugene Surovegin
  2003-07-15 15:46 ` Tom Rini
  2003-07-15 16:17 ` Matt Porter
  0 siblings, 2 replies; 18+ messages in thread
From: Eugene Surovegin @ 2003-07-15  4:32 UTC (permalink / raw)
  To: linuxppc-embedded


Hi!

I think this is a known problem.

There are drivers or even subsystems which use stack allocated DMA buffers.
To make things worse, those buffers usually non L1 cache line aligned
(start and/or end).

When they use pci_map_* with PCI_DMA_FROMDEVICE, consistent_sync calls
invalidate_dcache_range for the buffer.

invalidate_dcache_range works in L1_CACHE_LINE chunks, so if start and/or
end of the buffer are not aligned we may corrupt data located in the same
cache line (usually stack variable(s) declared before or after buffer
declaration).

According to MV kernel, there are USB devices that use such buffers.

After spending last weekend with RISCWatch :) I can say that SCSI subsystem
is also guilty of this behavior (drivers/scsi/scsi_scan.c::scan_scsis,
scsi_result0).

Unfortunately, I don't know how many similar places of code are still
waiting to be found :(.
To be safe I think it's better to modify consistent_sync to handle such
"bad" buffers.

If start and/or end of the buffer are not properly aligned I use "dcbf" to
flush corresponding cache line(s) and then call invalidate_dcache_range.

This change doesn't affect performance of consistent_sync noticeably (like
in the variant I found in MV kernel, where invalidate_dcache_range was
changed to flush_dcache_range if USB was enabled)

I don't know whether we should "ifdef" this for CONFIG_4xx and I know this
fix is ugly :)
I'm not even sure that such hacks should be included in the kernel :)))
(but I will definitely use it in my tree)

Comments/suggestions are welcome!

Thanks,

Eugene


===== arch/ppc/mm/cachemap.c 1.13 vs edited =====
--- 1.13/arch/ppc/mm/cachemap.c	Thu Feb 27 11:40:16 2003
+++ edited/arch/ppc/mm/cachemap.c	Mon Jul 14 20:49:28 2003
@@ -150,6 +150,21 @@
  	case PCI_DMA_NONE:
  		BUG();
  	case PCI_DMA_FROMDEVICE:	/* invalidate only */
+
+		/* Handle cases when the buffer start and/or end
+		   are not L1 cache line aligned.
+		   Some drivers/subsystems (e.g. USB, SCSI) do DMA
+		   from the stack allocated buffers, to prevent
+		   corruption of the other stack variables located
+		   near the buffer, we flush (instead of invalidate)
+		   these "dangerous" areas                     --ebs
+		*/
+		if (unlikely(start & (L1_CACHE_LINE_SIZE - 1)))
+			__asm__ __volatile__("dcbf 0,%0" : : "r" (start));
+
+		if (unlikely(end & (L1_CACHE_LINE_SIZE - 1)))
+			__asm__ __volatile__("dcbf 0,%0" : : "r" (end));
+
  		invalidate_dcache_range(start, end);
  		break;
  	case PCI_DMA_TODEVICE:		/* writeback only */


** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2003-07-16 14:01 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-07-16  0:12 [RFC] consistent_sync and non L1 cache line aligned buffers Darin.Johnson
  -- strict thread matches above, loose matches on Subject: below --
2003-07-15 23:04 Darin.Johnson
2003-07-15 23:34 ` Paul Mackerras
2003-07-15 23:50   ` Eugene Surovegin
2003-07-15 23:45 ` Matt Porter
2003-07-16 14:01 ` Dan Malek
2003-07-15 20:47 Darin.Johnson
     [not found] <F0B628F30F48064289D8CCC1EE21B7A80C48A5@mvebe001.americas.n okia.com>
2003-07-15 20:39 ` Eugene Surovegin
2003-07-15 21:26   ` David Blythe
2003-07-15 22:15     ` Dan Malek
2003-07-15 20:18 Darin.Johnson
2003-07-15  4:32 Eugene Surovegin
2003-07-15 15:46 ` Tom Rini
2003-07-15 16:20   ` Eugene Surovegin
2003-07-15 16:25     ` Tom Rini
2003-07-15 16:17 ` Matt Porter
2003-07-15 16:27   ` Eugene Surovegin
2003-07-15 23:51     ` Matt Porter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.