linuxppc-dev.lists.ozlabs.org archive mirror
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Mikhail Zolotaryov <lebon@lebon.org.ua>
Cc: Prodyut Hazarika <phazarika@amcc.com>,
	tburns@datacast.com, Andrea Zypchen <azypchen@intldata.ca>,
	linuxppc-dev@lists.ozlabs.org, azilkie@datacast.com
Subject: Re: AW: PowerPC PCI DMA issues (prefetch/coherency?)
Date: Fri, 11 Sep 2009 17:31:56 +1000
Message-ID: <1252654316.8566.100.camel@pasglop>
In-Reply-To: <4AA9F98E.2010800@lebon.org.ua>

On Fri, 2009-09-11 at 10:17 +0300, Mikhail Zolotaryov wrote:
> Benjamin Herrenschmidt wrote:
> > On Wed, 2009-09-09 at 17:40 +0300, Mikhail Zolotaryov wrote:
> >   
> >> Hi Tom,
> >>
> >> In my case __dma_sync() calls flush_dcache_range() (it's due to 
> >> alignment) from a tasklet - no OOPS. It uses dcbf instruction instead of 
> >> dcbi - that's the difference as dcbf is not privileged.
> >>     
> >
> > What it calls depends on the direction of the transfer.

> Would not agree with you in this point as __dma_sync() code is:

Well, it -does- depend on the direction of the transfer... and -also- on
the size & alignment :-)

Anyway, that is probably not the problem. From the log I've seen, it
just looks like a page fault due to a bad virtual address passed there.

Cheers,
Ben.

>         case DMA_FROM_DEVICE:
>                 /*
>                  * invalidate only when cache-line aligned otherwise
>                  * there is the potential for discarding uncommitted
>                  * data from the cache
>                  */
>                 if ((start & (L1_CACHE_BYTES - 1)) ||
>                     (size & (L1_CACHE_BYTES - 1)))
>                         flush_dcache_range(start, end);
>                 else
>                         invalidate_dcache_range(start, end);
>                 break;
> 
> So, actual instruction used depends on address/size alignment.
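The alignment test in the quoted DMA_FROM_DEVICE branch can be tried out in
isolation. What follows is a minimal user-space sketch, not kernel code: the
function name from_device_sync_op() and the enum are hypothetical, and
L1_CACHE_BYTES is assumed to be 32 (the usual value for 44x cores). It only
reports which path the quoted code would take.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-byte L1 cache lines, as commonly configured for 44x. */
#define L1_CACHE_BYTES 32u

/* Hypothetical labels for the two paths in the quoted branch. */
enum sync_op { SYNC_FLUSH, SYNC_INVALIDATE };

/*
 * Mirrors only the DMA_FROM_DEVICE case quoted above: invalidate when both
 * the start address and the size are line-aligned, otherwise flush so that
 * unrelated dirty data sharing the first or last line is written back
 * instead of being discarded.
 */
enum sync_op from_device_sync_op(uintptr_t start, size_t size)
{
	/* The mask trick below only works for power-of-two line sizes. */
	assert((L1_CACHE_BYTES & (L1_CACHE_BYTES - 1)) == 0);

	if ((start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 1)))
		return SYNC_FLUSH;	/* flush_dcache_range(), i.e. dcbf */
	return SYNC_INVALIDATE;		/* invalidate_dcache_range(), i.e. dcbi */
}
```

Under these assumptions a 192-byte block at a line-aligned address takes the
invalidate (dcbi) path, while any start address or size that is not a
multiple of 32 takes the flush (dcbf) path.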
> 
> >  The tasklet runs
> > in privileged mode, dcbi should work just fine... if passed a correct
> > address :-)
> >
> > Cheers,
> > Ben.
> >
> >   
> >> Tom Burns wrote:
> >>     
> >>> Hi Mikhail,
> >>>
> >>> Sorry, this DMA code is in a tasklet.  Are you suggesting the 
> >>> processor is in supervisor mode at that time?  Calling 
> >>> pci_dma_sync_sg_for_cpu() from the tasklet context is what generates 
> >>> the OOPS.  The entire oops is as follows, if it's relevant:
> >>>
> >>> Oops: kernel access of bad area, sig: 11 [#1]
> >>> NIP: c0003ab0 LR: c0010c30 CTR: 02400001
> >>> REGS: df117bd0 TRAP: 0300   Tainted: P         (2.6.24.2)
> >>> MSR: 00029000 <EE,ME>  CR: 44224042  XER: 20000000
> >>> DEAR: 3fd39000, ESR: 00800000
> >>> TASK = de5db7d0[157] 'cat' THREAD: df116000
> >>> GPR00: e11e5854 df117c80 de5db7d0 3fd39000 02400001 0000001f 00000002 0079a169
> >>> GPR08: 00000001 c0310000 00000000 c0010c84 24224042 101c0dac c0310000 10177000
> >>> GPR16: deb14200 df116000 e12062d0 e11f6104 de0f16c0 e11f0000 c0310000 e11f59cc
> >>> GPR24: e11f62d0 e11f0000 e11f0000 00000000 00000002 defee014 3fd39008 87d39009
> >>> NIP [c0003ab0] invalidate_dcache_range+0x1c/0x30
> >>> LR [c0010c30] __dma_sync+0x58/0xac
> >>> Call Trace:
> >>> [df117c80] [0000000a] 0xa (unreliable)
> >>> [df117c90] [e11e5854] DoTasklet+0x67c/0xc90 [ideDriverDuo_cyph]
> >>> [df117ce0] [c001ee24] tasklet_action+0x60/0xcc
> >>> [df117cf0] [c001ef04] __do_softirq+0x74/0xe0
> >>> [df117d10] [c00067a8] do_softirq+0x54/0x58
> >>> [df117d20] [c001edb4] irq_exit+0x48/0x58
> >>> [df117d30] [c00069d0] do_IRQ+0x6c/0xc0
> >>> [df117d40] [c00020e0] ret_from_except+0x0/0x18
> >>> [df117e00] [c00501e0] unmap_vmas+0x2c4/0x560
> >>> [df117e90] [c0053ebc] exit_mmap+0x64/0xec
> >>> [df117ec0] [c00171ac] mmput+0x50/0xd4
> >>> [df117ed0] [c001aef8] exit_mm+0x80/0xe0
> >>> [df117ef0] [c001c818] do_exit+0x134/0x6f8
> >>> [df117f30] [c001ce14] do_group_exit+0x38/0x74
> >>> [df117f40] [c0001a80] ret_from_syscall+0x0/0x3c
> >>> Instruction dump:
> >>> 7c0018ac 38630020 4200fff8 7c0004ac 4e800020 38a0001f 7c632878 7c832050
> >>> 7c842a14 5484d97f 4d820020 7c8903a6 <7c001bac> 38630020 4200fff8 7c0004ac
> >>> Kernel panic - not syncing: Aiee, killing interrupt handler!
> >>> Rebooting in 180 seconds..
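The faulting word in the instruction dump, <7c001bac>, can be decoded by hand
to check which instruction trapped and which register supplied the address.
A small user-space sketch follows. The helper names are hypothetical, only
the three extended opcodes that occur in this dump are handled, and the
0xc0000000 kernel base is an assumption (the common default for 32-bit
PowerPC kernels).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fields of a PowerPC X-form instruction word. */
struct xform {
	unsigned opcd;	/* primary opcode, top 6 bits */
	unsigned ra;	/* base register field (0 means a literal zero) */
	unsigned rb;	/* index register field */
	unsigned xo;	/* 10-bit extended opcode */
};

struct xform decode_xform(uint32_t insn)
{
	struct xform f;

	f.opcd = insn >> 26;
	f.ra = (insn >> 16) & 0x1f;
	f.rb = (insn >> 11) & 0x1f;
	f.xo = (insn >> 1) & 0x3ff;
	return f;
}

/* Names only the cache and sync operations that appear in this dump. */
const char *cache_op_name(uint32_t insn)
{
	struct xform f = decode_xform(insn);

	if (f.opcd != 31)
		return "other";
	switch (f.xo) {
	case 86:  return "dcbf";
	case 470: return "dcbi";
	case 598: return "sync";
	default:  return "other";
	}
}

/* Assumption: kernel virtual addresses start at 0xc0000000. */
int at_or_above_kernel_base(uint32_t ea)
{
	return ea >= 0xc0000000u;
}

/* Checks the values taken from the oops text. */
void check_oops_values(void)
{
	struct xform f = decode_xform(0x7c001bac);

	assert(strcmp(cache_op_name(0x7c001bac), "dcbi") == 0);
	assert(f.ra == 0 && f.rb == 3);			/* dcbi 0,r3 */
	assert(!at_or_above_kernel_base(0x3fd39000));	/* GPR3 and DEAR */
	assert(at_or_above_kernel_base(0xc0003ab0));	/* NIP, for contrast */
}
```

The word decodes as dcbi 0,r3. GPR3 and DEAR in the oops both hold 3fd39000,
which lies below the assumed kernel base, so the dump is consistent with a
bad address being passed in rather than with a privilege problem.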
> >>>
> >>>
> >>> Cheers,
> >>> Tom
> >>>
> >>> Mikhail Zolotaryov wrote:
> >>>       
> >>>> Hi Tom,
> >>>>
> >>>> possible solution could be to use tasklet to perform DMA-related job 
> >>>> (as in most cases DMA transfer is interrupt driven - makes sense).
> >>>>
> >>>>
> >>>> Tom Burns wrote:
> >>>>         
> >>>>> Hi,
> >>>>>
> >>>>> With the default config for the Sequoia board on 2.6.24, calling 
> >>>>> pci_dma_sync_sg_for_cpu() results in executing
> >>>>> invalidate_dcache_range() in arch/ppc/kernel/misc.S from 
> >>>>> __dma_sync().  This OOPses on PPC440 since it tries to call directly 
> >>>>> the assembly instruction dcbi, which can only be executed in 
> >>>>> supervisor mode.  We tried that before resorting to manual cache 
> >>>>> line management with usermode-safe assembly calls.
> >>>>>
> >>>>> Regards,
> >>>>> Tom Burns
> >>>>> International Datacasting Corporation
> >>>>>
> >>>>> Mikhail Zolotaryov wrote:
> >>>>>           
> >>>>>> Hi,
> >>>>>>
> >>>>>> Why manage cache lines  manually, if appropriate code is a part of 
> >>>>>> __dma_sync / dma_sync_single_for_device of DMA API ? (implies 
> >>>>>> CONFIG_NOT_COHERENT_CACHE enabled, as default for Sequoia Board)
> >>>>>>
> >>>>>> Prodyut Hazarika wrote:
> >>>>>>             
> >>>>>>> Hi Adam,
> >>>>>>>
> >>>>>>>> Yes, I am using the 440EPx (same as the sequoia board). Our
> >>>>>>>> ideDriver is DMA'ing blocks of 192-byte data over the PCI bus (using
> >>>>>>>> the Sil0680A PCI-IDE bridge). Most of the DMA's (depending on timing)
> >>>>>>>> end up being partially corrupted when we try to parse the data in the
> >>>>>>>> virtual page. We have confirmed the data is good before the PCI-IDE
> >>>>>>>> bridge. We are creating two 8K pages and map them to physical DMA
> >>>>>>>> memory using single-entry scatter/gather structs. When a DMA block is
> >>>>>>>> corrupted, we see a random portion of it (always a multiple of 16byte
> >>>>>>>> cache lines) is overwritten with old data from the last time the
> >>>>>>>> buffer was used.
> >>>>>>>
> >>>>>>> This looks like a cache coherency problem.
> >>>>>>> Can you ensure that the TLB entries corresponding to the DMA 
> >>>>>>> region has
> >>>>>>> the CacheInhibit bit set.
> >>>>>>> You will need a BDI connected to your system.
> >>>>>>>
> >>>>>>> Also, you will need to invalidate and flush the lines appropriately,
> >>>>>>> since in 440 cores,
> >>>>>>> L1Cache coherency is managed entirely by software.
> >>>>>>> Please look at drivers/net/ibm_newemac/mal.c and core.c for 
> >>>>>>> example on
> >>>>>>> how to do it.
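Why a missing invalidate would produce exactly this symptom (whole cache
lines of old data in a reused buffer) can be illustrated with a toy model.
This is a user-space sketch under stated assumptions, not a model of the 440
cache: the structure and function names are hypothetical, the line size is
assumed to be 32 bytes, and "eviction" is simulated by simply marking lines
invalid.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE   32	/* assumed cache-line size in bytes */
#define NLINES 6	/* 6 * 32 = 192 bytes, one DMA block as in the thread */

struct toy_cache {
	uint8_t ram[NLINES * LINE];	/* memory as the device sees it */
	uint8_t data[NLINES * LINE];	/* cached copies of each line */
	int valid[NLINES];
};

/* CPU reads go through the cache; a miss fills the line from RAM. */
uint8_t cpu_read(struct toy_cache *c, size_t off)
{
	size_t l = off / LINE;

	assert(off < sizeof(c->ram));
	if (!c->valid[l]) {
		memcpy(&c->data[l * LINE], &c->ram[l * LINE], LINE);
		c->valid[l] = 1;
	}
	return c->data[off];
}

/* Device DMA writes RAM directly and never touches the cache. */
void dma_write(struct toy_cache *c, uint8_t value)
{
	memset(c->ram, value, sizeof(c->ram));
}

/* Stand-in for the invalidate done by a sync-for-CPU call. */
void invalidate_all(struct toy_cache *c)
{
	memset(c->valid, 0, sizeof(c->valid));
}

/* Counts stale bytes seen by the CPU after the buffer is reused. */
size_t stale_bytes(int do_invalidate, size_t lines_still_cached)
{
	struct toy_cache c;
	size_t i, stale = 0;

	assert(lines_still_cached <= NLINES);
	memset(&c, 0, sizeof(c));
	dma_write(&c, 0xAA);			/* first transfer */
	for (i = 0; i < sizeof(c.ram); i++)	/* CPU parses it, filling the cache */
		(void)cpu_read(&c, i);
	for (i = lines_still_cached; i < NLINES; i++)
		c.valid[i] = 0;			/* some lines get evicted meanwhile */
	dma_write(&c, 0xBB);			/* second transfer, same buffer */
	if (do_invalidate)
		invalidate_all(&c);
	for (i = 0; i < sizeof(c.ram); i++)
		if (cpu_read(&c, i) != 0xBB)
			stale++;
	return stale;
}
```

In this model, skipping the invalidate leaves stale data in every line that
happened to stay cached, so the damage always comes in multiples of the line
size and varies with timing; with the invalidate, the CPU sees only the new
data.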
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>> Prodyut
> >>>>>>>
> >>>>>>> On Thu, 2009-09-03 at 13:27 -0700, Prodyut Hazarika wrote:
> >>>>>>>> Hi Adam,
> >>>>>>>>
> >>>>>>>>> Are you sure there is L2 cache on the 440?
> >>>>>>>>
> >>>>>>>> It depends on the SoC you are using. SoC like 460EX (Canyonlands
> >>>>>>>> board) have L2Cache.
> >>>>>>>> It seems you are using a Sequoia board, which has a 440EPx SoC.
> >>>>>>>> 440EPx has a 440 cpu core, but no L2Cache.
> >>>>>>>> Could you please tell me which SoC you are using?
> >>>>>>>> You can also refer to the appropriate dts file to see if there is
> >>>>>>>> L2C.
> >>>>>>>> For example, in canyonlands.dts (460EX based board), we have the L2C
> >>>>>>>> entry.
> >>>>>>>>         L2C0: l2c {
> >>>>>>>>               ...
> >>>>>>>>         }
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> I am seeing this problem with our custom IDE driver which is
> >>>>>>>>> based on pretty old code. Our driver uses pci_alloc_consistent()
> >>>>>>>>> to allocate the physical DMA memory and alloc_pages() to allocate
> >>>>>>>>> a virtual page. It then uses pci_map_sg() to map to a
> >>>>>>>>> scatter/gather buffer. Perhaps I should convert these to the DMA
> >>>>>>>>> API calls as you suggest.
> >>>>>>>>
> >>>>>>>> Could you give more details on the consistency problem? It is a good
> >>>>>>>> idea to change to the new DMA APIs, but pci_alloc_consistent()
> >>>>>>>> should work too
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>> Prodyut
> >>>>>>>>
> >>>>>>>> On Thu, 2009-09-03 at 19:57 +1000, Benjamin Herrenschmidt wrote:
> >>>>>>>>> On Thu, 2009-09-03 at 09:05 +0100, Chris Pringle wrote:
> >>>>>>>>>> Hi Adam,
> >>>>>>>>>>
> >>>>>>>>>> If you have a look in include/asm-ppc/pgtable.h for the following
> >>>>>>>>>> section:
> >>>>>>>>>>
> >>>>>>>>>> #ifdef CONFIG_44x
> >>>>>>>>>> #define _PAGE_BASE    (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_GUARDED)
> >>>>>>>>>> #else
> >>>>>>>>>> #define _PAGE_BASE    (_PAGE_PRESENT | _PAGE_ACCESSED)
> >>>>>>>>>> #endif
> >>>>>>>>>>
> >>>>>>>>>> Try adding _PAGE_COHERENT to the appropriate line above and see if
> >>>>>>>>>> that fixes your issue - this causes the 'M' bit to be set on the
> >>>>>>>>>> page which should enforce cache coherency. If it doesn't, you'll
> >>>>>>>>>> need to check the 'M' bit isn't being masked out in head_44x.S (it
> >>>>>>>>>> was originally masked out on arch/powerpc, but was fixed in later
> >>>>>>>>>> kernels when the cache coherency issues with non-SMP systems were
> >>>>>>>>>> resolved).
> >>>>>>>>>
> >>>>>>>>> I have some doubts about the usefulness of doing that for 4xx.
> >>>>>>>>> AFAIK, the 440 core just ignores M.
> >>>>>>>>>
> >>>>>>>>> The problem lies probably elsewhere. Maybe the L2 cache coherency
> >>>>>>>>> isn't enabled or not working ?
> >>>>>>>>>
> >>>>>>>>> The L1 cache on 440 is simply not coherent, so drivers have to make
> >>>>>>>>> sure they use the appropriate DMA APIs which will do cache flushing
> >>>>>>>>> when needed.
> >>>>>>>>>
> >>>>>>>>> Adam, what driver is causing you that sort of problems ?
> >>>>>>>>>
> >>>>>>>>> Cheers,
> >>>>>>>>> Ben.


Thread overview: 42+ messages
2009-09-02 21:22 AW: PowerPC PCI DMA issues (prefetch/coherency?) Adam Zilkie
2009-09-03  8:05 ` Chris Pringle
2009-09-03  9:57   ` Benjamin Herrenschmidt
2009-09-03 16:04     ` Adam Zilkie
2009-09-03 16:21       ` Josh Boyer
2009-09-03 20:27       ` Prodyut Hazarika
2009-09-08 18:01         ` Adam Zilkie
2009-09-08 18:59           ` Prodyut Hazarika
2009-09-08 19:30             ` Adam Zilkie
2009-09-08 19:56               ` Prodyut Hazarika
2009-09-08 20:00                 ` Adam Zilkie
2009-09-09  1:34                   ` Benjamin Herrenschmidt
2009-09-08 21:34               ` Benjamin Herrenschmidt
2009-09-09 13:28             ` Mikhail Zolotaryov
2009-09-09 13:43               ` Tom Burns
2009-09-09 14:12                 ` Mikhail Zolotaryov
2009-09-09 14:10                   ` Tom Burns
2009-09-09 14:40                     ` Mikhail Zolotaryov
2009-09-11  1:57                       ` Benjamin Herrenschmidt
2009-09-11  7:17                         ` Mikhail Zolotaryov
2009-09-11  7:31                           ` Benjamin Herrenschmidt [this message]
2009-09-11  1:57                     ` Benjamin Herrenschmidt
2009-09-10 19:53                   ` Tom Burns
2009-09-10 20:30                     ` Pravin Bathija
2009-09-11  2:44                       ` Benjamin Herrenschmidt
2009-09-11  5:12                         ` Stefan Roese
2009-09-11  5:17                           ` Benjamin Herrenschmidt
2009-09-11  5:25                             ` Stefan Roese
2009-09-11  5:35                               ` Pravin Bathija
2009-09-11  5:40                                 ` Benjamin Herrenschmidt
2009-09-11  9:23                                   ` Pravin Bathija
2009-09-11  1:59                     ` Benjamin Herrenschmidt
2009-09-11 16:05                     ` Prodyut Hazarika
2009-09-11  1:55                 ` Benjamin Herrenschmidt
2009-09-11 13:51                   ` Tom Burns
2009-09-08 21:29           ` Benjamin Herrenschmidt
2009-09-03 12:20   ` Wrobel Heinz-R39252
2009-09-03 12:43     ` Chris Pringle
2009-09-06 21:32     ` Benjamin Herrenschmidt
2009-09-03 15:54   ` Adam Zilkie
     [not found] <4A37A503.3030209@oxtel.com>
     [not found] ` <20090616162114.GA5051@loki.buserror.net>
     [not found]   ` <4A37C97A.5050508@oxtel.com>
2009-06-16 16:46     ` Scott Wood
2009-06-16 16:57       ` Chris Pringle
2009-06-16 17:03         ` Scott Wood
2009-06-17  7:58           ` Chris Pringle
2009-06-17 13:18             ` Chris Pringle
2009-06-18 11:24               ` Chris Pringle
2009-06-22 14:31                 ` AW: " Sergej.Stepanov
2009-06-29  8:11                   ` Chris Pringle
