* [PATCH 1/2] kexec: remove unnecessary unusable_pages
@ 2016-07-12 4:56 zhongjiang
2016-07-12 4:56 ` [PATCH 2/2] kexec: add a pmd huge entry condition during the page table zhongjiang
2016-07-12 15:19 ` [PATCH 1/2] kexec: remove unnecessary unusable_pages Eric W. Biederman
0 siblings, 2 replies; 11+ messages in thread
From: zhongjiang @ 2016-07-12 4:56 UTC (permalink / raw)
To: ebiederm, dyoung, horms, vgoyal, yinghai, akpm; +Cc: kexec, linux-mm
From: zhong jiang <zhongjiang@huawei.com>
In general, kexec allocates pages from the buddy system, so it cannot
receive a page beyond the maximum physical address present in the system.
This patch just removes the unnecessary code; no functional change.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
include/linux/kexec.h | 1 -
kernel/kexec_core.c | 13 -------------
2 files changed, 14 deletions(-)
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index e8acb2b..26e4917 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -162,7 +162,6 @@ struct kimage {
struct list_head control_pages;
struct list_head dest_pages;
- struct list_head unusable_pages;
/* Address of next control page to allocate for crash kernels. */
unsigned long control_page;
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 56b3ed0..448127d 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -257,9 +257,6 @@ struct kimage *do_kimage_alloc_init(void)
/* Initialize the list of destination pages */
INIT_LIST_HEAD(&image->dest_pages);
- /* Initialize the list of unusable pages */
- INIT_LIST_HEAD(&image->unusable_pages);
-
return image;
}
@@ -517,10 +514,6 @@ static void kimage_free_extra_pages(struct kimage *image)
{
/* Walk through and free any extra destination pages I may have */
kimage_free_page_list(&image->dest_pages);
-
- /* Walk through and free any unusable pages I have cached */
- kimage_free_page_list(&image->unusable_pages);
-
}
void kimage_terminate(struct kimage *image)
{
@@ -647,12 +640,6 @@ static struct page *kimage_alloc_page(struct kimage *image,
page = kimage_alloc_pages(gfp_mask, 0);
if (!page)
return NULL;
- /* If the page cannot be used file it away */
- if (page_to_pfn(page) >
- (KEXEC_SOURCE_MEMORY_LIMIT >> PAGE_SHIFT)) {
- list_add(&page->lru, &image->unusable_pages);
- continue;
- }
addr = page_to_pfn(page) << PAGE_SHIFT;
/* If it is the destination page we want use it */
--
1.8.3.1
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
* [PATCH 2/2] kexec: add a pmd huge entry condition during the page table
2016-07-12 4:56 [PATCH 1/2] kexec: remove unnecessary unusable_pages zhongjiang
@ 2016-07-12 4:56 ` zhongjiang
2016-07-12 15:46 ` Eric W. Biederman
2016-07-12 15:19 ` [PATCH 1/2] kexec: remove unnecessary unusable_pages Eric W. Biederman
1 sibling, 1 reply; 11+ messages in thread
From: zhongjiang @ 2016-07-12 4:56 UTC (permalink / raw)
To: ebiederm, dyoung, horms, vgoyal, yinghai, akpm; +Cc: kexec, linux-mm
From: zhong jiang <zhongjiang@huawei.com>
When an image is loaded into the kernel, we need to set up a page table
for it, and every valid pfn also gets a new mapping. If pud_present() is
true, the code tends to establish the pmd level of the page table as a
large page. The code segment that relocate_kernel points to can therefore
land in a huge pmd entry in init_transition_pgtable(), so we need to take
that situation into account.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
arch/x86/kernel/machine_kexec_64.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 5a294e4..c33e344 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -14,6 +14,7 @@
#include <linux/gfp.h>
#include <linux/reboot.h>
#include <linux/numa.h>
+#include <linux/hugetlb.h>
#include <linux/ftrace.h>
#include <linux/io.h>
#include <linux/suspend.h>
@@ -34,6 +35,17 @@ static struct kexec_file_ops *kexec_file_loaders[] = {
};
#endif
+static void split_pmd(pmd_t *pmd, pte_t *pte)
+{
+ unsigned long pfn = pmd_pfn(*pmd);
+ int i = 0;
+
+ do {
+ set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC));
+ pfn++;
+ } while (pte++, i++, i < PTRS_PER_PTE);
+}
+
static void free_transition_pgtable(struct kimage *image)
{
free_page((unsigned long)image->arch.pud);
@@ -68,15 +80,19 @@ static int init_transition_pgtable(struct kimage *image, pgd_t *pgd)
set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
}
pmd = pmd_offset(pud, vaddr);
- if (!pmd_present(*pmd)) {
+ if (!pmd_present(*pmd) || pmd_huge(*pmd)) {
pte = (pte_t *)get_zeroed_page(GFP_KERNEL);
if (!pte)
goto err;
image->arch.pte = pte;
- set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
+ if (pmd_huge(*pmd))
+ split_pmd(pmd, pte);
+ else
+ set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
}
pte = pte_offset_kernel(pmd, vaddr);
set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, PAGE_KERNEL_EXEC));
+
return 0;
err:
free_transition_pgtable(image);
--
1.8.3.1
* Re: [PATCH 2/2] kexec: add a pmd huge entry condition during the page table
2016-07-12 4:56 ` [PATCH 2/2] kexec: add a pmd huge entry condition during the page table zhongjiang
@ 2016-07-12 15:46 ` Eric W. Biederman
2016-07-13 7:01 ` zhong jiang
0 siblings, 1 reply; 11+ messages in thread
From: Eric W. Biederman @ 2016-07-12 15:46 UTC (permalink / raw)
To: zhongjiang; +Cc: dyoung, horms, vgoyal, yinghai, akpm, linux-mm, kexec
zhongjiang <zhongjiang@huawei.com> writes:
> From: zhong jiang <zhongjiang@huawei.com>
>
> When an image is loaded into the kernel, we need to set up a page table
> for it, and every valid pfn also gets a new mapping. If pud_present() is
> true, the code tends to establish the pmd level of the page table as a
> large page. The code segment that relocate_kernel points to can therefore
> land in a huge pmd entry in init_transition_pgtable(), so we need to take
> that situation into account.
I can see how in theory this might be necessary, but when is a kernel
virtual address on x86_64, which is above 0x8000000000000000, in conflict
with the identity-mapped physical addresses, which are all below
0x8000000000000000?
If anything the code could be simplified to always assume those mappings
are unoccupied.
Did you run into an actual failure somewhere?
Eric
> Signed-off-by: zhong jiang <zhongjiang@huawei.com>
> ---
> arch/x86/kernel/machine_kexec_64.c | 20 ++++++++++++++++++--
> 1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
> index 5a294e4..c33e344 100644
> --- a/arch/x86/kernel/machine_kexec_64.c
> +++ b/arch/x86/kernel/machine_kexec_64.c
> @@ -14,6 +14,7 @@
> #include <linux/gfp.h>
> #include <linux/reboot.h>
> #include <linux/numa.h>
> +#include <linux/hugetlb.h>
> #include <linux/ftrace.h>
> #include <linux/io.h>
> #include <linux/suspend.h>
> @@ -34,6 +35,17 @@ static struct kexec_file_ops *kexec_file_loaders[] = {
> };
> #endif
>
> +static void split_pmd(pmd_t *pmd, pte_t *pte)
> +{
> + unsigned long pfn = pmd_pfn(*pmd);
> + int i = 0;
> +
> + do {
> + set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC));
> + pfn++;
> + } while (pte++, i++, i < PTRS_PER_PTE);
> +}
> +
> static void free_transition_pgtable(struct kimage *image)
> {
> free_page((unsigned long)image->arch.pud);
> @@ -68,15 +80,19 @@ static int init_transition_pgtable(struct kimage *image, pgd_t *pgd)
> set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
> }
> pmd = pmd_offset(pud, vaddr);
> - if (!pmd_present(*pmd)) {
> + if (!pmd_present(*pmd) || pmd_huge(*pmd)) {
> pte = (pte_t *)get_zeroed_page(GFP_KERNEL);
> if (!pte)
> goto err;
> image->arch.pte = pte;
> - set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
> + if (pmd_huge(*pmd))
> + split_pmd(pmd, pte);
> + else
> + set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
> }
> pte = pte_offset_kernel(pmd, vaddr);
> set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, PAGE_KERNEL_EXEC));
> +
> return 0;
> err:
> free_transition_pgtable(image);
* Re: [PATCH 2/2] kexec: add a pmd huge entry condition during the page table
2016-07-12 15:46 ` Eric W. Biederman
@ 2016-07-13 7:01 ` zhong jiang
2016-07-14 13:19 ` Eric W. Biederman
0 siblings, 1 reply; 11+ messages in thread
From: zhong jiang @ 2016-07-13 7:01 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: dyoung, horms, vgoyal, yinghai, akpm, linux-mm, kexec
On 2016/7/12 23:46, Eric W. Biederman wrote:
> zhongjiang <zhongjiang@huawei.com> writes:
>
>> From: zhong jiang <zhongjiang@huawei.com>
>>
>> When an image is loaded into the kernel, we need to set up a page table
>> for it, and every valid pfn also gets a new mapping. If pud_present() is
>> true, the code tends to establish the pmd level of the page table as a
>> large page. The code segment that relocate_kernel points to can therefore
>> land in a huge pmd entry in init_transition_pgtable(), so we need to take
>> that situation into account.
> I can see how in theory this might be necessary, but when is a kernel
> virtual address on x86_64, which is above 0x8000000000000000, in conflict
> with the identity-mapped physical addresses, which are all below
> 0x8000000000000000?
>
> If anything the code could be simplified to always assume those mappings
> are unoccupied.
>
> Did you run into an actual failure somewhere?
>
> Eric
>
I do not understand what you are trying to say; maybe I am missing your point.
The key question is how to ensure that the pmd entry that relocate_kernel
points into is not a huge page.
Thanks
zhongjiang
* Re: [PATCH 2/2] kexec: add a pmd huge entry condition during the page table
2016-07-13 7:01 ` zhong jiang
@ 2016-07-14 13:19 ` Eric W. Biederman
2016-07-20 7:25 ` zhong jiang
0 siblings, 1 reply; 11+ messages in thread
From: Eric W. Biederman @ 2016-07-14 13:19 UTC (permalink / raw)
To: zhong jiang; +Cc: dyoung, horms, vgoyal, yinghai, akpm, linux-mm, kexec
zhong jiang <zhongjiang@huawei.com> writes:
> On 2016/7/12 23:46, Eric W. Biederman wrote:
>> zhongjiang <zhongjiang@huawei.com> writes:
>>
>>> From: zhong jiang <zhongjiang@huawei.com>
>>>
>>> When an image is loaded into the kernel, we need to set up a page table
>>> for it, and every valid pfn also gets a new mapping. If pud_present() is
>>> true, the code tends to establish the pmd level of the page table as a
>>> large page. The code segment that relocate_kernel points to can therefore
>>> land in a huge pmd entry in init_transition_pgtable(), so we need to take
>>> that situation into account.
>> I can see how in theory this might be necessary, but when is a kernel
>> virtual address on x86_64, which is above 0x8000000000000000, in conflict
>> with the identity-mapped physical addresses, which are all below
>> 0x8000000000000000?
>>
>> If anything the code could be simplified to always assume those mappings
>> are unoccupied.
>>
>> Did you run into an actual failure somewhere?
>>
>> Eric
>>
> I do not understand what you are trying to say; maybe I am missing your point.
>
> The key question is how to ensure that the pmd entry that
> relocate_kernel points into is not a huge page.
Kernel virtual addresses are in the negative half of the address space.
Identity-mapped physical addresses are in the positive half of the
address space.
As the entire negative half of the address space is empty at the time
that page table entry is being created, there are no huge pages present.
Even testing pmd_present() is redundant, and that is probably the bug.
Eric
* Re: [PATCH 2/2] kexec: add a pmd huge entry condition during the page table
2016-07-14 13:19 ` Eric W. Biederman
@ 2016-07-20 7:25 ` zhong jiang
0 siblings, 0 replies; 11+ messages in thread
From: zhong jiang @ 2016-07-20 7:25 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: dyoung, horms, vgoyal, yinghai, akpm, linux-mm, kexec
On 2016/7/14 21:19, Eric W. Biederman wrote:
> zhong jiang <zhongjiang@huawei.com> writes:
>
>> On 2016/7/12 23:46, Eric W. Biederman wrote:
>>> zhongjiang <zhongjiang@huawei.com> writes:
>>>
>>>> From: zhong jiang <zhongjiang@huawei.com>
>>>>
>>>> When an image is loaded into the kernel, we need to set up a page table
>>>> for it, and every valid pfn also gets a new mapping. If pud_present() is
>>>> true, the code tends to establish the pmd level of the page table as a
>>>> large page. The code segment that relocate_kernel points to can therefore
>>>> land in a huge pmd entry in init_transition_pgtable(), so we need to take
>>>> that situation into account.
>>> I can see how in theory this might be necessary, but when is a kernel
>>> virtual address on x86_64, which is above 0x8000000000000000, in conflict
>>> with the identity-mapped physical addresses, which are all below
>>> 0x8000000000000000?
>>>
>>> If anything the code could be simplified to always assume those mappings
>>> are unoccupied.
>>>
>>> Did you run into an actual failure somewhere?
>>>
>>> Eric
>>>
>> I do not understand what you are trying to say; maybe I am missing your point.
>>
>> The key question is how to ensure that the pmd entry that
>> relocate_kernel points into is not a huge page.
> Kernel virtual addresses are in the negative half of the address space.
> Identity-mapped physical addresses are in the positive half of the
> address space.
>
> As the entire negative half of the address space is empty at the time
> that page table entry is being created, there are no huge pages present.
>
> Even testing pmd_present() is redundant, and that is probably the bug.
>
> Eric
>
> .
OK, I see what you mean. We allocate a new pgd page, that is, the
control_code_page, to rebuild the new mapping in init_pgtable(). Because
relocate_kernel is in the negative half of the address space, and no page
table has been established yet for the new pgd, I am surprised: if the
page table does not exist, why do we need to check p(g,u,m)d_present() at
all? And if we do need to, then I still think a huge pmd can exist there.
Or maybe I misunderstand its meaning.
Thanks
zhongjiang
* Re: [PATCH 1/2] kexec: remove unnecessary unusable_pages
2016-07-12 4:56 [PATCH 1/2] kexec: remove unnecessary unusable_pages zhongjiang
2016-07-12 4:56 ` [PATCH 2/2] kexec: add a pmd huge entry condition during the page table zhongjiang
@ 2016-07-12 15:19 ` Eric W. Biederman
2016-07-13 4:08 ` zhong jiang
1 sibling, 1 reply; 11+ messages in thread
From: Eric W. Biederman @ 2016-07-12 15:19 UTC (permalink / raw)
To: zhongjiang; +Cc: dyoung, horms, vgoyal, yinghai, akpm, kexec, linux-mm
zhongjiang <zhongjiang@huawei.com> writes:
> From: zhong jiang <zhongjiang@huawei.com>
>
> In general, kexec allocates pages from the buddy system, so it cannot
> receive a page beyond the maximum physical address present in the system.
>
> This patch just removes the unnecessary code; no functional change.
On 32bit systems with highmem support kexec can very easily receive a
page from the buddy allocator that can exceed 4GiB. This doesn't show
up on 64bit systems as typically the memory limits are less than the
address space. But this code is very necessary on some systems and
removing it is not ok.
Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com>
* Re: [PATCH 1/2] kexec: remove unnecessary unusable_pages
2016-07-12 15:19 ` [PATCH 1/2] kexec: remove unnecessary unusable_pages Eric W. Biederman
@ 2016-07-13 4:08 ` zhong jiang
2016-07-13 5:07 ` Eric W. Biederman
0 siblings, 1 reply; 11+ messages in thread
From: zhong jiang @ 2016-07-13 4:08 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: dyoung, horms, vgoyal, yinghai, akpm, kexec, linux-mm
On 2016/7/12 23:19, Eric W. Biederman wrote:
> zhongjiang <zhongjiang@huawei.com> writes:
>
>> From: zhong jiang <zhongjiang@huawei.com>
>>
>> In general, kexec allocates pages from the buddy system, so it cannot
>> receive a page beyond the maximum physical address present in the system.
>>
>> This patch just removes the unnecessary code; no functional change.
> On 32bit systems with highmem support kexec can very easily receive a
> page from the buddy allocator that can exceed 4GiB. This doesn't show
> up on 64bit systems as typically the memory limits are less than the
> address space. But this code is very necessary on some systems and
> removing it is not ok.
>
> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com>
>
I see it the other way: architecturally, a 32-bit system cannot address
beyond 4GiB, highmem or not. There is one exception, though: when PAE is
enabled, the physical address is extended to 36 bits and a new paging
mechanism is established for it, so a page from the buddy allocator can
indeed exceed 4GiB.
Moreover, on 32-bit systems I cannot understand why KEXEC_SOURCE_MEMORY_LIMIT
is defined as -1UL; with that value, every page kimage_alloc_page()
allocates would always be added to unusable_pages.
Thanks
zhongjiang
* Re: [PATCH 1/2] kexec: remove unnecessary unusable_pages
2016-07-13 4:08 ` zhong jiang
@ 2016-07-13 5:07 ` Eric W. Biederman
2016-07-13 7:07 ` zhong jiang
0 siblings, 1 reply; 11+ messages in thread
From: Eric W. Biederman @ 2016-07-13 5:07 UTC (permalink / raw)
To: zhong jiang; +Cc: dyoung, horms, vgoyal, yinghai, akpm, kexec, linux-mm
zhong jiang <zhongjiang@huawei.com> writes:
> On 2016/7/12 23:19, Eric W. Biederman wrote:
>> zhongjiang <zhongjiang@huawei.com> writes:
>>
>>> From: zhong jiang <zhongjiang@huawei.com>
>>>
>>> In general, kexec allocates pages from the buddy system, so it cannot
>>> receive a page beyond the maximum physical address present in the system.
>>>
>>> This patch just removes the unnecessary code; no functional change.
>> On 32bit systems with highmem support kexec can very easily receive a
>> page from the buddy allocator that can exceed 4GiB. This doesn't show
>> up on 64bit systems as typically the memory limits are less than the
>> address space. But this code is very necessary on some systems and
>> removing it is not ok.
>>
>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com>
>>
> I see it the other way: architecturally, a 32-bit system cannot address
> beyond 4GiB, highmem or not. There is one exception, though: when PAE is
> enabled, the physical address is extended to 36 bits and a new paging
> mechanism is established for it, so a page from the buddy allocator can
> indeed exceed 4GiB.
Exactly. And I was dealing with PAE systems in 2001 or so with > 4GiB
of RAM, which is where the unusable_pages work comes from.
Other architectures such as ARM also followed a similar path, so
it isn't just x86 that has 32bit systems with > 32 address lines.
> Moreover, on 32-bit systems I cannot understand why KEXEC_SOURCE_MEMORY_LIMIT
> is defined as -1UL; with that value, every page kimage_alloc_page()
> allocates would always be added to unusable_pages.
-1UL is a short way of writing 0xffffffffUL, which is as close as you
can get to writing 0x100000000UL in 32 bits.
kimage_alloc_page won't always add to unusable_pages, as there is memory
below 4GiB; it just isn't easily found, so there may temporarily be a
memory shortage as the allocator works its way there. Unfortunately,
whenever I have looked, there are memory zones that line up with the
memory kexec is looking for, so it does a little bit of a weird dance to
get the memory it needs and to discard the memory it can't use.
Eric
* Re: [PATCH 1/2] kexec: remove unnecessary unusable_pages
2016-07-13 5:07 ` Eric W. Biederman
@ 2016-07-13 7:07 ` zhong jiang
0 siblings, 0 replies; 11+ messages in thread
From: zhong jiang @ 2016-07-13 7:07 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: dyoung, horms, vgoyal, yinghai, akpm, kexec, linux-mm
On 2016/7/13 13:07, Eric W. Biederman wrote:
> -1UL is a short way of writing 0xffffffffUL, which is as close as you
> can get to writing 0x100000000UL in 32 bits.
>
> kimage_alloc_page won't always add to unusable_pages, as there is memory
> below 4GiB; it just isn't easily found, so there may temporarily be a
> memory shortage as the allocator works its way there. Unfortunately,
> whenever I have looked, there are memory zones that line up with the
> memory kexec is looking for, so it does a little bit of a weird dance to
> get the memory it needs and to discard the memory it can't use.
> Eric
>
Thanks, I get it.
* [PATCH 1/2] kexec: remove unnecessary unusable_pages
@ 2016-07-11 6:36 zhongjiang
0 siblings, 0 replies; 11+ messages in thread
From: zhongjiang @ 2016-07-11 6:36 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, linux-kernel
From: zhong jiang <zhongjiang@huawei.com>
In general, kexec allocates pages from the buddy system, so it cannot
receive a page beyond the maximum physical address present in the system.
This patch just removes the code; no functional change.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
include/linux/kexec.h | 1 -
kernel/kexec_core.c | 13 -------------
2 files changed, 14 deletions(-)
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index e8acb2b..26e4917 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -162,7 +162,6 @@ struct kimage {
struct list_head control_pages;
struct list_head dest_pages;
- struct list_head unusable_pages;
/* Address of next control page to allocate for crash kernels. */
unsigned long control_page;
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 56b3ed0..448127d 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -257,9 +257,6 @@ struct kimage *do_kimage_alloc_init(void)
/* Initialize the list of destination pages */
INIT_LIST_HEAD(&image->dest_pages);
- /* Initialize the list of unusable pages */
- INIT_LIST_HEAD(&image->unusable_pages);
-
return image;
}
@@ -517,10 +514,6 @@ static void kimage_free_extra_pages(struct kimage *image)
{
/* Walk through and free any extra destination pages I may have */
kimage_free_page_list(&image->dest_pages);
-
- /* Walk through and free any unusable pages I have cached */
- kimage_free_page_list(&image->unusable_pages);
-
}
void kimage_terminate(struct kimage *image)
{
@@ -647,12 +640,6 @@ static struct page *kimage_alloc_page(struct kimage *image,
page = kimage_alloc_pages(gfp_mask, 0);
if (!page)
return NULL;
- /* If the page cannot be used file it away */
- if (page_to_pfn(page) >
- (KEXEC_SOURCE_MEMORY_LIMIT >> PAGE_SHIFT)) {
- list_add(&page->lru, &image->unusable_pages);
- continue;
- }
addr = page_to_pfn(page) << PAGE_SHIFT;
/* If it is the destination page we want use it */
--
1.8.3.1