From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Jan Beulich" Subject: Re: [PATCH v4 5/8] hvmloader: Correct bug in low mmio region accounting Date: Fri, 21 Jun 2013 13:58:49 +0100 Message-ID: <51C46A2902000078000DFA9A@nat28.tlf.novell.com> References: <1371811594-31135-1-git-send-email-george.dunlap@eu.citrix.com> <1371811594-31135-6-git-send-email-george.dunlap@eu.citrix.com> <20932.14035.475814.607949@mariner.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20932.14035.475814.607949@mariner.uk.xensource.com> Content-Disposition: inline List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: George Dunlap , Ian Jackson Cc: Keir Fraser , xen-devel@lists.xen.org, Stefano Stabellini , Ian Campbell , Hanweidong List-Id: xen-devel@lists.xenproject.org >>> On 21.06.13 at 13:19, Ian Jackson wrote: > George Dunlap writes ("[PATCH v4 5/8] hvmloader: Correct bug in low mmio > region accounting"): >> When deciding whether to map a device in low MMIO space (<4GiB), >> hvmloader compares it with "mmio_left", which is set to the size of >> the low MMIO range (pci_mem_end - pci_mem_start). However, even if it >> does map a device in high MMIO space, it still removes the size of its >> BAR from mmio_left. >> >> In reality we don't need to do a separate accounting of the low memory >> available -- this can be calculated from mem_resource. Just get rid >> of the variable and the duplicate accounting entirely. This will make >> the code more robust. > ... >> - using_64bar = bars[i].is_64bar && bar64_relocate && (mmio_left < bar_sz); >> + using_64bar = bars[i].is_64bar && bar64_relocate >> + && (bar_sz > (mem_resource.max - mem_resource.base)); > > This is not entirely straightforward I think. > > The actual calculation about whether it will actually fit, rather than > a precalculation of whether it is going to fit, is done here: > > base = (resource->base + bar_sz - 1) & ~(uint64_t)(bar_sz - 1); > bar_data |= (uint32_t)base; > bar_data_upper = (uint32_t)(base >> 32); > base += bar_sz; > > if ( (base < resource->base) || (base > resource->max) ) > [ ... doesn't fit ... ] > > The first test rounds the base up to a multiple of bar_sz. I assume > that this is a requirement of the PCI spec. > > (While I'm here I'll note that the (uint64_t) cast in that line is > unneccessary and confusing. If bar_sz weren't 64-bit this code would > be quite wrong, and putting that cast there suggests that it might not > be.) > > In infer (from "bar_sz &= ~(bar_sz - 1)") that bar_sz is supposed to > be always a power of two. And we have devices in descending order of > size. So at least after the first device, this rounding does nothing. > > But for the first device I think it may be possible for resource->base > not to be a multiple of the bar_sz, and in that case it might be that > the precalculation thinks it will fit when the actual placement > calculation doesn't. > > Do you think this is possible ? This is possible only from an abstract perspective, not in reality: PCI_MEM_START being 0x{f,e,c,8}0000000, PCI_MEM_END being 0xfc000000, and allocations starting with the biggest BARs (where you already correctly noted that BARs are always a power of 2 in size), the current base address can be misaligned only when the BAR size is too large to fit anyway. In which case it'll go into the space above 4Gb, and to that range the precalculation doesn't apply. Jan