From mboxrd@z Thu Jan 1 00:00:00 1970
From: r.sricharan@ti.com (Sricharan R)
Date: Thu, 25 Jul 2013 09:19:37 +0530
Subject: [PATCH 6/8] ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical addresses
In-Reply-To:
References: <1371858502-10083-1-git-send-email-santosh.shilimkar@ti.com>
 <1371858502-10083-7-git-send-email-santosh.shilimkar@ti.com>
 <51EF356D.2050902@ti.com> <51EFBF8F.4010201@ti.com> <51EFC37B.80906@ti.com>
Message-ID: <51F0A051.20203@ti.com>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

Hi Nicolas,

On Thursday 25 July 2013 01:51 AM, Nicolas Pitre wrote:
> On Wed, 24 Jul 2013, Sricharan R wrote:
>
>> On Wednesday 24 July 2013 05:20 PM, Sricharan R wrote:
>>> On Wednesday 24 July 2013 08:19 AM, Nicolas Pitre wrote:
>>>> On Tue, 23 Jul 2013, Santosh Shilimkar wrote:
>>>>
>>>>> On Tuesday 23 July 2013 09:10 PM, Nicolas Pitre wrote:
>>>>>> On Fri, 21 Jun 2013, Santosh Shilimkar wrote:
>>>>>>
>>>>>>> From: Sricharan R
>>>>>>>
>>>>>>> The current phys_to_virt patching mechanism does not work
>>>>>>> for 64 bit physical addresses. Note that the constant used in add/sub
>>>>>>> instructions is encoded into the last 8 bits of the opcode, so shift
>>>>>>> the __pv_offset constant by 24 to get it into the correct place.
>>>>>>>
>>>>>>> The v2p patching mechanism patches the higher 32 bits of the physical
>>>>>>> address with a constant. While this is correct, on those platforms
>>>>>>> where the lowmem addressable physical memory spans the 4GB boundary,
>>>>>>> a carry bit can be produced as a result of the addition of the lower
>>>>>>> 32 bits. This has to be taken into account and added into the upper
>>>>>>> 32 bits. The patched __pv_offset and va are added in the lower 32 bits,
>>>>>>> where __pv_offset can be in two's complement form when PA_START <
>>>>>>> VA_START, and that can result in a false carry bit.
>>>>>>>
>>>>>>> e.g. PA = 0x80000000, VA = 0xC0000000
>>>>>>> __pv_offset = PA - VA = 0xC0000000 (2's complement)
>>>>>>>
>>>>>>> So adding __pv_offset + VA should never result in a true overflow. So in
>>>>>>> order to differentiate a true carry, an extra flag __pv_sign_flag
>>>>>>> is introduced.
>>>>> First of all thanks for the review.
>>>>>
>>>>>> I'm still wondering if this is worth bothering about.
>>>>>>
>>>>>> If PA = 0x80000000 and VA = 0xC0000000 there will never be a real carry
>>>>>> to propagate to the high word of the physical address as the VA space
>>>>>> cannot be larger than 0x40000000.
>>>>>>
>>>>> Agreed.
>>>>>
>>>>>> So is there really a case where:
>>>>>>
>>>>>> 1) physical memory is crossing the 4GB mark, and ...
>>>>>>
>>>>>> 2) physical memory start address is higher than virtual memory start
>>>>>> address, needing a carry due to the 32-bit add overflow?
>>>>>>
>>>>> Consider the below two cases of memory layout, apart from the one mentioned
>>>>> above where the carry bit is irrelevant as you rightly said.
>>>>>
>>>>> 1) PA = 0x8_0000_0000, VA = 0xC000_0000, absolute pv_offset = 0x7_4000_0000
>>>> This can be patched as:
>>>>
>>>> mov phys_hi, #0x8
>>>> add phys_lo, virt, #0x40000000 @ carry ignored
>>>>
>>>>> 2) PA = 0x2_8000_0000, VA = 0xC000_0000, absolute pv_offset = 0x1_C000_0000
>>>> mov phys_hi, #0x2
>>>> add phys_lo, virt, #0xc0000000 @ carry ignored
>>>>
>>>>> In both of these cases there is a true carry which needs to be
>>>>> considered.
>>>> Well, not really. However, if you have:
>>>>
>>>> 3) PA = 0x2_8000_0000, VA = 0x4000_0000, pv_offset = 0x2_4000_0000
>>>>
>>>> ... then you need:
>>>>
>>>> mov phys_hi, #0x2
>>>> adds phys_lo, virt, #0x40000000
>>>> adc phys_hi, phys_hi, #0
>>>>
>>>> My question is: how likely is this?
>>>>
>>>> What is your actual physical memory start address?
>>> Agreed. In our case we do not have the physical address crossing the
>>> 4GB boundary, so ignoring the carry would have been OK. But we are
>>> also addressing the other case where it would really cross over.
>>>
>>>> If we really need to cope with the carry, then the __pv_sign_flag should
>>>> instead be represented in pv_offset directly:
>>>>
>>>> Taking example #2 above, that would be:
>>>>
>>>> mov phys_hi, #0x1
>>>> adds phys_lo, virt, #0xc0000000
>>>> adc phys_hi, phys_hi, #0
>>>>
>>>> If PA = 0x8000-0000 and VA = 0xc000-0000 then pv_offset is
>>>> 0xffff-ffff-c000-0000, meaning:
>>>>
>>>> mvn phys_hi, #0
>>>> adds phys_lo, virt, #0xc0000000
>>>> adc phys_hi, phys_hi, #0
>>>>
>>>> So that would require a special case in the patching code where a mvn
>>>> with 0 is used if the high part of pv_offset is 0xffffffff.
>>>>
>>>>
>>>> Nicolas
>>> Extending pv_offset to 64 bits is a really neat way. When PA > VA, pv_offset
>>> is going to be the actual value and not its 2's complement. Fine here.
>>> When running from a higher physical address space, we will always fall here.
>>>
>>> The second case, where the high word of pv_offset is 0xffffffff (PA < VA),
>>> is a problem only when we run from a lower physical address. So we can safely
>>> assume that the higher 32 bits of PA are '0' and stub it initially. In this
>>> way we can avoid the special case.
>> Sorry, I missed one more point here. In the second case, we should patch it with
>> 0x0 when (PA > VA) and with 0xffffffff when (PA < VA).
> I don't think I follow you here.
>
> Let's assume:
>
> phys_addr_t __pv_offset = PHYS_START - VIRT_START;
>
> If PA = 0x0-8000-0000 and VA = 0xc000-0000 then
> __pv_offset = 0xffff-ffff-c000-0000.
>
> If PA = 0x2-8000-0000 and VA = 0xc000-0000 then
> __pv_offset = 0x1-c000-0000.
>
> So the __virt_to_phys() assembly stub could look like:
>
> static inline phys_addr_t __virt_to_phys(unsigned long x)
> {
> 	phys_addr_t t;
>
> 	if (sizeof(phys_addr_t) == 4) {
> 		__pv_stub(x, t, "add", __PV_BITS_31_24);
> 	} else {
> 		__pv_movhi_stub(t);
> 		__pv_add_carry_stub(x, t);
> 	}
>
> 	return t;
> }
>
> And...
>
> #define __pv_movhi_stub(y) \
> __asm__("@ __pv_movhi_stub\n" \
> "1: mov %R0, %1\n" \
> " .pushsection .pv_table,\"a\"\n" \
> " .long 1b\n" \
> " .popsection\n" \
> : "=r" (y) \
> : "I" (__PV_BITS_8_0))
>
> #define __pv_add_carry_stub(x, y) \
> __asm__("@ __pv_add_carry_stub\n" \
> "1: adds %Q0, %1, %2\n" \
> " adc %R0, %R0, #0\n" \
> " .pushsection .pv_table,\"a\"\n" \
> " .long 1b\n" \
> " .popsection\n" \
> : "+r" (y) \
> : "r" (x), "I" (__PV_BITS_31_24) \
> : "cc")
>
> The stub bits such as __PV_BITS_8_0 can be augmented with more bits in
> the middle to determine the type of fixup needed. The fixup code would
> determine the shift needed on the value, and whether the low or high
> word of __pv_offset should be used, according to those bits.
>
> Then, in the case where a mov is patched, you need to check if the high
> word of __pv_offset is 0xffffffff and if so the mov should be turned
> into a "mvn rn, #0".
>
> And there you are with all possible cases handled.
>
>
> Nicolas

Thanks, you have given the full details here. Sorry if I was not clear
in my previous response.

1) When I said the special case can be avoided, I meant that we need not
differentiate the 0xffffffff case inside the __virt_to_phys macro, but
can handle it at the time of patching. Your above code makes that clear.

2) I would have ended up creating separate tables for the 'mov' and 'add'
cases. But again, thanks to your above idea of augmenting the __PV_BITS,
we can find this out at run time. And 'mvn' would be needed for moving
'0xffffffff'. Now I can get rid of the separate section that I created
for 'mov' in my previous version.

I will incorporate the above suggestions and come back.

Regards,
 Sricharan