linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joe Jin <joe.jin@oracle.com>
To: Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: "DONGLI.ZHANG" <dongli.zhang@oracle.com>,
	konrad@kernel.org, Christoph Helwig <hch@lst.de>,
	John Sobecki <john.sobecki@oracle.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	"linux-kernel@vger.kernel.org\"" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] xen-swiotlb: exchange memory with Xen only when pages are contiguous
Date: Thu, 25 Oct 2018 07:23:35 -0700	[thread overview]
Message-ID: <922914c9-22db-c5d1-33da-d07691ebd7d7@oracle.com> (raw)
In-Reply-To: <24a62522-1629-5d0b-398e-6d2c1a0b97f7@oracle.com>

On 10/25/18 4:45 AM, Boris Ostrovsky wrote:
> On 10/24/18 10:43 AM, Joe Jin wrote:
>> On 10/24/18 6:57 AM, Boris Ostrovsky wrote:
>>> On 10/24/18 9:02 AM, Konrad Rzeszutek Wilk wrote:
>>>> On Tue, Oct 23, 2018 at 08:09:04PM -0700, Joe Jin wrote:
>>>>> Commit 4855c92dbb7 "xen-swiotlb: fix the check condition for
>>>>> xen_swiotlb_free_coherent" only fixed memory address check condition
>>>>> on xen_swiotlb_free_coherent(), when memory was not physically
>>>>> contiguous and tried to exchanged with Xen via 
>>>>> xen_destroy_contiguous_region it will lead kernel panic.
>>>> s/it will lead/which lead to/?
>>>>
>>>>> The correct check condition should be memory is in DMA area and
>>>>> physically contiguous.
>>>> "The correct check condition to make Xen hypercall to revert the
>>>> memory back from its 32-bit pool is if it is:
>>>>  1) Above its DMA bit mask (for example 32-bit devices can only address
>>>> up to 4GB, and we may want 4GB+2K), and
>>> Is this "and' or 'or'?
>>>
>>>>  2) If it not physically contingous
>>>>
>>>> N.B. The logic in the code is inverted, which leads to all sorts of
>>>> confusions."
>>>
>>> I would, in fact, suggest to make the logic the same in both
>>> xen_swiotlb_alloc_coherent() and xen_swiotlb_free_coherent() to avoid
>>> this. This will involve swapping if and else in the former.
>>>
>>>
>>>> Does that sound correct?
>>>>
>>>>> Thank you Boris for pointing it out.
>>>>>
>>>> Fixes: 4855c92dbb7 ("xen-sw..") ?
>>>>
>>>>> Signed-off-by: Joe Jin <joe.jin@oracle.com>
>>>>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>>>>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>> Reported-by: Boris Ostrovs... ?
>>>>> Cc: Christoph Helwig <hch@lst.de>
>>>>> Cc: Dongli Zhang <dongli.zhang@oracle.com>
>>>>> Cc: John Sobecki <john.sobecki@oracle.com>
>>>>> ---
>>>>>  drivers/xen/swiotlb-xen.c | 4 ++--
>>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
>>>>> index f5c1af4ce9ab..aed92fa019f9 100644
>>>>> --- a/drivers/xen/swiotlb-xen.c
>>>>> +++ b/drivers/xen/swiotlb-xen.c
>>>>> @@ -357,8 +357,8 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
>>>>>  	/* Convert the size to actually allocated. */
>>>>>  	size = 1UL << (order + XEN_PAGE_SHIFT);
>>>>>  
>>>>> -	if (((dev_addr + size - 1 <= dma_mask)) ||
>>>>> -	    range_straddles_page_boundary(phys, size))
>>>>> +	if ((dev_addr + size - 1 <= dma_mask) &&
>>>>> +	    !range_straddles_page_boundary(phys, size))
>>>>>  		xen_destroy_contiguous_region(phys, order);
>>>
>>> I don't think this is right.
>>>
>>> if ((dev_addr + size - 1 > dma_mask) || range_straddles_page_boundary(phys, size))
>>>
>>> No?
>> No this is not correct.
>>
>> When allocate memory, it tried to allocated from Dom0/Guest, then check if physical
>> address is DMA memory also contiguous, if no, exchange with Hypervisor, code as below:
>>
>> 326         phys = *dma_handle;                                                     
>> 327         dev_addr = xen_phys_to_bus(phys);                                       
>> 328         if (((dev_addr + size - 1 <= dma_mask)) &&                              
>> 329             !range_straddles_page_boundary(phys, size))                         
>> 330                 *dma_handle = dev_addr;                                         
>> 331         else {                                                                  
>> 332                 if (xen_create_contiguous_region(phys, order,                   
>> 333                                                  fls64(dma_mask), dma_handle) != 0) {
>> 334                         xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs);
>> 335                         return NULL;                                            
>> 336                 }                                                               
>> 337         }                                                                       
>>                                                                      
>>
>> On freeing, need to return the memory to Xen, otherwise DMA memory will be used
>> up(this is the issue the patch intend to fix), so when memory is DMAable and
>> contiguous then call xen_destroy_contiguous_region(), return DMA memory to Xen.
> 
> So if you want to allocate 1 byte at address 0 (and dev_addr=phys),
> xen_create_contiguous_region() will not be called. And yet you will call
> xen_destroy_contiguous_region() in the free path.
> 
> Is this the expected behavior?

I could not say it's expected behavior, but I think it's reasonable.

On allocating, it used __get_free_pages() to allocate memory, if lucky the memory is 
DMAable, will not exchange memory with hypervisor, obviously this is not guaranteed.

And on freeing it could not be identified if memory from Dom0/guest own memory
or hypervisor, if don't back memory to hypervisor which will lead hypervisor DMA 
memory be used up, then on Dom0/guest, DMA request maybe failed, the worse thing is
could not start any new guest.

Thanks,
Joe

> 
> -boris
> 

  reply	other threads:[~2018-10-25 14:23 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-24  3:09 [PATCH] xen-swiotlb: exchange memory with Xen only when pages are contiguous Joe Jin
2018-10-24 13:02 ` Konrad Rzeszutek Wilk
2018-10-24 13:57   ` Boris Ostrovsky
2018-10-24 14:43     ` Joe Jin
2018-10-25 11:45       ` Boris Ostrovsky
2018-10-25 14:23         ` Joe Jin [this message]
2018-10-25 16:10           ` Boris Ostrovsky
2018-10-25 16:28             ` Joe Jin
2018-10-25 18:56               ` Joe Jin
2018-10-26  7:48                 ` Christoph Helwig
2018-10-26  8:54                   ` Dongli Zhang
2018-10-26 14:48                     ` Joe Jin
2018-10-26 14:40                   ` Joe Jin
2018-10-30  2:51                 ` Joe Jin
     [not found]                   ` <57e5593233c64dc0a36c7d4c750a1ed4@AMSPEX02CL03.citrite.net>
2018-10-30 14:12                     ` [Xen-devel] " Joe Jin
     [not found]                       ` <097f1f6f16f7415aa3a52a7c4f5e52dc@AMSPEX02CL03.citrite.net>
2018-10-30 14:48                         ` Joe Jin
2018-10-24 16:05   ` Joe Jin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=922914c9-22db-c5d1-33da-d07691ebd7d7@oracle.com \
    --to=joe.jin@oracle.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=dongli.zhang@oracle.com \
    --cc=hch@lst.de \
    --cc=john.sobecki@oracle.com \
    --cc=konrad.wilk@oracle.com \
    --cc=konrad@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).