From mboxrd@z Thu Jan 1 00:00:00 1970
From: Arvind R
Subject: Re: Nouveau on dom0
Date: Thu, 4 Mar 2010 14:47:58 +0530
References: <20100225125552.GC9040@phenom.dumpdata.com> <20100225174411.GA13270@phenom.dumpdata.com> <20100301160130.GB7881@phenom.dumpdata.com> <20100303181303.GA21078@phenom.dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
In-Reply-To: <20100303181303.GA21078@phenom.dumpdata.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Konrad Rzeszutek Wilk
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On Wed, Mar 3, 2010 at 11:43 PM, Konrad Rzeszutek Wilk wrote:
>> > aio-write -
>> >> which triggers do_page_fault, handle_mm_fault, do_linear_fault, __do_fault
>> and finally ttm_bo_vm_fault.
> I've attached a simple patch I wrote some time ago to get the real MFNs
> and its page protection. I think you can adapt it (print_data function to be exact)
> to peek at the PTE and its protection values.

Have patched - did not apply clean. Will compile and get some info.

> There is an extra flag that the PTE can have when running under Xen: _PAGE_IOMAP.
> This signifies that the PFN is actually the MFN. In this case though
> it shouldn't be enabled b/c the memory is actually gathered from
> alloc_page. But if it is, it might be the culprit.
>> What can possibly cause the fault-handler to repeat endlessly?

FYI: about 2000 times a second - slowed by printk

>> If a wrong page is backed at the user-address, it should create bad_access or
>> some other subsequent events - but the system is running fine minus all local
> So you see this fault handler being called endlessly while the machine
> is still running and other pieces of code work just fine, right?

Right. Can ssh in - but no local console

>> ttm_tt_get_page calls alloc in a loop - so it may allocate multiple pages from
>> start/end depending on Highmem memory or not - implying asynchronous allocation
>> and mapping.
>
> I thought it had some logic to figure out that it already handled this
> page and would return an already allocated page?

Right. I think the problem lies in the vm_insert_pfn/page/mixed family of
functions. These are used in only a few places (grep'ed the kernel tree), and
invariably for mmapping: scsi-tgt, mspec, some media/video, poch and android in
staging, and ttm - and, surprise - xen/blktap/ring.c and device.c, which both
check XENFEAT_auto_translated_physmap.

Pls. look at xen/blktap/ring.c - it looks to be what we need
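
To make the PFN/MFN point above concrete, here is a user-space toy model, not kernel code. It only sketches the decision a paravirtualized guest has to make when a frame number goes into a PTE. Everything in it is an assumption made for illustration: the names toy_pte_frame, toy_p2m and TOY_PAGE_IOMAP are hypothetical, the flag bit is not the real _PAGE_IOMAP value, and the table contents are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for _PAGE_IOMAP; the real bit value differs. */
#define TOY_PAGE_IOMAP (1u << 0)

#define TOY_P2M_SIZE 4

/* Toy physical-to-machine table: index is the guest PFN, value is the MFN.
 * The numbers are invented for this sketch. */
static const uint64_t toy_p2m[TOY_P2M_SIZE] = { 0x1000, 0x1003, 0x2040, 0x0777 };

/* Return the frame number that would end up in the PTE for 'frame'.
 * auto_translated models a guest where XENFEAT_auto_translated_physmap
 * is set, i.e. the guest works with PFNs and does no translation itself. */
static uint64_t toy_pte_frame(uint64_t frame, unsigned flags, bool auto_translated)
{
    if (auto_translated)
        return frame;               /* no guest-side translation needed */
    if (flags & TOY_PAGE_IOMAP)
        return frame;               /* caller says this is already an MFN */
    assert(frame < TOY_P2M_SIZE);   /* toy table bounds check */
    return toy_p2m[frame];          /* ordinary RAM: translate PFN -> MFN */
}
```

In this model, a page that came from alloc_page (PFN 2, say) maps to machine frame 0x2040 when the flag is clear, but to frame 2 when the IOMAP-style flag is set by mistake. That is the kind of mismatch suspected above, under the stated assumptions.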