* question about migration @ 2015-12-24 2:29 Wen Congyang 2015-12-24 12:36 ` Andrew Cooper 0 siblings, 1 reply; 21+ messages in thread From: Wen Congyang @ 2015-12-24 2:29 UTC (permalink / raw) To: Andrew Cooper; +Cc: xen devel

Hi Andrew Cooper:

I rebased the COLO code onto the newest upstream Xen and tested it. I found a problem in the test, and I can reproduce it via a plain migration.

How to reproduce:
1. xl cr -p hvm_nopv
2. xl migrate hvm_nopv 192.168.3.1

The migration succeeds, but the VM doesn't run on the target machine. You can see the reason in 'xl dmesg':

(XEN) HVM2 restore: VMCE_VCPU 1
(XEN) HVM2 restore: TSC_ADJUST 0
(XEN) HVM2 restore: TSC_ADJUST 1
(d2) HVM Loader
(d2) Detected Xen v4.7-unstable
(d2) Get guest memory maps[128] failed. (-38)
(d2) *** HVMLoader bug at e820.c:39
(d2) *** HVMLoader crashed.

The reason is that we don't call xc_domain_set_memory_map() on the target machine. When we create an HVM domain:

libxl__domain_build()
  libxl__build_hvm()
    libxl__arch_domain_construct_memmap()
      xc_domain_set_memory_map()

Should we migrate the guest memory map from the source machine to the target machine?

Thanks
Wen Congyang

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2015-12-24 2:29 question about migration Wen Congyang @ 2015-12-24 12:36 ` Andrew Cooper 2015-12-25 0:55 ` Wen Congyang 2015-12-25 1:45 ` Wen Congyang 0 siblings, 2 replies; 21+ messages in thread From: Andrew Cooper @ 2015-12-24 12:36 UTC (permalink / raw) To: Wen Congyang; +Cc: xen devel

On 24/12/15 02:29, Wen Congyang wrote:
> Hi Andrew Cooper:
>
> I rebase the COLO codes to the newest upstream xen, and test it. I found
> a problem in the test, and I can reproduce this problem via the migration.
>
> How to reproduce:
> 1. xl cr -p hvm_nopv
> 2. xl migrate hvm_nopv 192.168.3.1

You are the very first person to try a usecase like this.

It works as much as it does because of your changes to the uncooperative HVM domain logic. I have said repeatedly during review that this is not necessarily a safe change to make without an in-depth analysis of the knock-on effects; it looks as if you have found the first knock-on effect.

> The migration succeeds, but the vm doesn't run in the target machine.
> You can get the reason from 'xl dmesg':
> (XEN) HVM2 restore: VMCE_VCPU 1
> (XEN) HVM2 restore: TSC_ADJUST 0
> (XEN) HVM2 restore: TSC_ADJUST 1
> (d2) HVM Loader
> (d2) Detected Xen v4.7-unstable
> (d2) Get guest memory maps[128] failed. (-38)
> (d2) *** HVMLoader bug at e820.c:39
> (d2) *** HVMLoader crashed.
>
> The reason is that:
> We don't call xc_domain_set_memory_map() in the target machine.
> When we create a hvm domain:
> libxl__domain_build()
>   libxl__build_hvm()
>     libxl__arch_domain_construct_memmap()
>       xc_domain_set_memory_map()
>
> Should we migrate the guest memory from source machine to target machine?

This bug specifically is because HVMLoader is expected to have run and turned the hypercall information into an E820 table in the guest before a migration occurs.

Unfortunately, the current codebase is riddled with such assumptions and expectations (e.g.
the HVM save code assumed that FPU context is valid when it is saving register state) which is a direct side effect of how it was developed.

Having said all of the above, I agree that your example is a usecase which should work. It is the ultimate test of whether the migration stream contains enough information to faithfully reproduce the domain on the far side. Clearly at the moment, this is not the case.

I have an upcoming project to work on the domain memory layout logic, because it is unsuitable for a number of XenServer usecases. Part of that will require moving it in the migration stream.

~Andrew

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2015-12-24 12:36 ` Andrew Cooper @ 2015-12-25 0:55 ` Wen Congyang 2015-12-29 10:57 ` Andrew Cooper 2015-12-25 1:45 ` Wen Congyang 1 sibling, 1 reply; 21+ messages in thread From: Wen Congyang @ 2015-12-25 0:55 UTC (permalink / raw) To: Andrew Cooper; +Cc: xen devel On 12/24/2015 08:36 PM, Andrew Cooper wrote: > On 24/12/15 02:29, Wen Congyang wrote: >> Hi Andrew Cooper: >> >> I rebase the COLO codes to the newest upstream xen, and test it. I found >> a problem in the test, and I can reproduce this problem via the migration. >> >> How to reproduce: >> 1. xl cr -p hvm_nopv >> 2. xl migrate hvm_nopv 192.168.3.1 > > You are the very first person to try a usecase like this. > > It works as much as it does because of your changes to the uncooperative HVM domain logic. I have said repeatedly during review, this is not necessarily a safe change to make without an in-depth analysis of the knock-on effects; it looks as if you have found the first knock-on effect. > >> >> The migration successes, but the vm doesn't run in the target machine. >> You can get the reason from 'xl dmesg': >> (XEN) HVM2 restore: VMCE_VCPU 1 >> (XEN) HVM2 restore: TSC_ADJUST 0 >> (XEN) HVM2 restore: TSC_ADJUST 1 >> (d2) HVM Loader >> (d2) Detected Xen v4.7-unstable >> (d2) Get guest memory maps[128] failed. (-38) >> (d2) *** HVMLoader bug at e820.c:39 >> (d2) *** HVMLoader crashed. >> >> The reason is that: >> We don't call xc_domain_set_memory_map() in the target machine. >> When we create a hvm domain: >> libxl__domain_build() >> libxl__build_hvm() >> libxl__arch_domain_construct_memmap() >> xc_domain_set_memory_map() >> >> Should we migrate the guest memory from source machine to target machine? > > This bug specifically is because HVMLoader is expected to have run and turned the hypercall information in an E820 table in the guest before a migration occurs. > > Unfortunately, the current codebase is riddled with such assumption and expectations (e.g. 
the HVM save code assumed that FPU context is valid when it is saving register state) which is a direct side effect of how it was developed.

Does the FPU context have a similar problem? IIRC, I tested COLO before 4.6 was released, and it worked. In my tests, I always use the option '-p' to start the HVM guest.

>
> Having said all of the above, I agree that your example is a usecase which should work. It is the ultimate test of whether the migration stream contains enough information to faithfully reproduce the domain on the far side. Clearly at the moment, this is not the case.

I think it should work, but users don't normally use migration like this, so it is not a serious problem.

Thanks
Wen Congyang

> I have an upcoming project to work on the domain memory layout logic, because it is unsuitable for a number of XenServer usecases. Part of that will require moving it in the migration stream.
>
> ~Andrew

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2015-12-25 0:55 ` Wen Congyang @ 2015-12-29 10:57 ` Andrew Cooper 0 siblings, 0 replies; 21+ messages in thread From: Andrew Cooper @ 2015-12-29 10:57 UTC (permalink / raw) To: Wen Congyang; +Cc: xen devel On 25/12/2015 00:55, Wen Congyang wrote: > On 12/24/2015 08:36 PM, Andrew Cooper wrote: >> On 24/12/15 02:29, Wen Congyang wrote: >>> Hi Andrew Cooper: >>> >>> I rebase the COLO codes to the newest upstream xen, and test it. I found >>> a problem in the test, and I can reproduce this problem via the migration. >>> >>> How to reproduce: >>> 1. xl cr -p hvm_nopv >>> 2. xl migrate hvm_nopv 192.168.3.1 >> You are the very first person to try a usecase like this. >> >> It works as much as it does because of your changes to the uncooperative HVM domain logic. I have said repeatedly during review, this is not necessarily a safe change to make without an in-depth analysis of the knock-on effects; it looks as if you have found the first knock-on effect. >> >>> The migration successes, but the vm doesn't run in the target machine. >>> You can get the reason from 'xl dmesg': >>> (XEN) HVM2 restore: VMCE_VCPU 1 >>> (XEN) HVM2 restore: TSC_ADJUST 0 >>> (XEN) HVM2 restore: TSC_ADJUST 1 >>> (d2) HVM Loader >>> (d2) Detected Xen v4.7-unstable >>> (d2) Get guest memory maps[128] failed. (-38) >>> (d2) *** HVMLoader bug at e820.c:39 >>> (d2) *** HVMLoader crashed. >>> >>> The reason is that: >>> We don't call xc_domain_set_memory_map() in the target machine. >>> When we create a hvm domain: >>> libxl__domain_build() >>> libxl__build_hvm() >>> libxl__arch_domain_construct_memmap() >>> xc_domain_set_memory_map() >>> >>> Should we migrate the guest memory from source machine to target machine? >> This bug specifically is because HVMLoader is expected to have run and turned the hypercall information in an E820 table in the guest before a migration occurs. >> >> Unfortunately, the current codebase is riddled with such assumption and expectations (e.g. 
the HVM save code assumed that FPU context is valid when it is saving register state) which is a direct side effect of how it was developed.

> Does the FPU context have a similar problem?

Yes, although it is far harder to spot, and no software is likely to crash as a result.

> IIRC, I tested COLO before 4.6 was released. It worked. In my tests, I always
> use the option '-p' to start the HVM guest.

If the FPU wasn't initialised, the save code memset()'s the x87 register block to 0. On the restore side, this is taken and loaded back. The problem is that a block of zeroes is valid x87 state, but it is not the default state the vcpu would expect to observe, given that it never reset the FPU itself. However, the first thing any real software will do is reset the values properly.

>
>> Having said all of the above, I agree that your example is a usecase which should work. It is the ultimate test of whether the migration stream contains enough information to faithfully reproduce the domain on the far side. Clearly at the moment, this is not the case.
>
> I think it should work. But the user doesn't use the migration like this.
> So it is not a serious problem.

It is still worth identifying as an issue.

~Andrew

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2015-12-24 12:36 ` Andrew Cooper 2015-12-25 0:55 ` Wen Congyang @ 2015-12-25 1:45 ` Wen Congyang 2015-12-25 3:06 ` Wen Congyang 2015-12-29 11:24 ` Andrew Cooper 1 sibling, 2 replies; 21+ messages in thread From: Wen Congyang @ 2015-12-25 1:45 UTC (permalink / raw) To: Andrew Cooper; +Cc: xen devel On 12/24/2015 08:36 PM, Andrew Cooper wrote: > On 24/12/15 02:29, Wen Congyang wrote: >> Hi Andrew Cooper: >> >> I rebase the COLO codes to the newest upstream xen, and test it. I found >> a problem in the test, and I can reproduce this problem via the migration. >> >> How to reproduce: >> 1. xl cr -p hvm_nopv >> 2. xl migrate hvm_nopv 192.168.3.1 > > You are the very first person to try a usecase like this. > > It works as much as it does because of your changes to the uncooperative HVM domain logic. I have said repeatedly during review, this is not necessarily a safe change to make without an in-depth analysis of the knock-on effects; it looks as if you have found the first knock-on effect. > >> >> The migration successes, but the vm doesn't run in the target machine. >> You can get the reason from 'xl dmesg': >> (XEN) HVM2 restore: VMCE_VCPU 1 >> (XEN) HVM2 restore: TSC_ADJUST 0 >> (XEN) HVM2 restore: TSC_ADJUST 1 >> (d2) HVM Loader >> (d2) Detected Xen v4.7-unstable >> (d2) Get guest memory maps[128] failed. (-38) >> (d2) *** HVMLoader bug at e820.c:39 >> (d2) *** HVMLoader crashed. >> >> The reason is that: >> We don't call xc_domain_set_memory_map() in the target machine. >> When we create a hvm domain: >> libxl__domain_build() >> libxl__build_hvm() >> libxl__arch_domain_construct_memmap() >> xc_domain_set_memory_map() >> >> Should we migrate the guest memory from source machine to target machine? > > This bug specifically is because HVMLoader is expected to have run and turned the hypercall information in an E820 table in the guest before a migration occurs. 
>
> Unfortunately, the current codebase is riddled with such assumptions and expectations (e.g. the HVM save code assumed that FPU context is valid when it is saving register state) which is a direct side effect of how it was developed.
>
> Having said all of the above, I agree that your example is a usecase which should work. It is the ultimate test of whether the migration stream contains enough information to faithfully reproduce the domain on the far side. Clearly at the moment, this is not the case.
>
> I have an upcoming project to work on the domain memory layout logic, because it is unsuitable for a number of XenServer usecases. Part of that will require moving it in the migration stream.

I found another migration problem in the test: if the migration fails, we resume the guest on the source side, but the HVM guest doesn't respond any more.

In my test environment, the migration always succeeds, so I use a hack to reproduce the failure:

1. Modify the target xen tools:

diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index 258dec4..da95606 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -767,6 +767,8 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
         goto err;
     }
 
+    rc = ERROR_FAIL;
+
 err:
     check_all_finished(egc, stream, rc);

2. xl cr hvm_nopv, and wait some time (you can log in to the guest)
3. xl migrate hvm_nopv 192.168.3.1

The reason is that we create a default ioreq server when we get the hvm param HVM_PARAM_IOREQ_PFN. This means the problem occurs only when the migration fails after we have fetched HVM_PARAM_IOREQ_PFN.

In hvm_select_ioreq_server():
If the I/O will be handled by a non-default ioreq server, we return that non-default ioreq server. In this case the I/O is handled by qemu.
If the I/O will not be handled by a non-default ioreq server, we return the default ioreq server. Before migration we return NULL; after migration it is not NULL.
See the caller, hvmemul_do_io():

    case X86EMUL_UNHANDLEABLE:
    {
        struct hvm_ioreq_server *s =
            hvm_select_ioreq_server(curr->domain, &p);

        /* If there is no suitable backing DM, just ignore accesses */
        if ( !s )
        {
            rc = hvm_process_io_intercept(&null_handler, &p);
            vio->io_req.state = STATE_IOREQ_NONE;
        }
        else
        {
            rc = hvm_send_ioreq(s, &p, 0);
            if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
                vio->io_req.state = STATE_IOREQ_NONE;
            else if ( data_is_addr )
                rc = X86EMUL_OKAY;
        }
        break;

We send the I/O request to the default ioreq server, but no backing DM handles it. We will wait for the I/O forever...

Thanks
Wen Congyang

> ~Andrew

^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: question about migration 2015-12-25 1:45 ` Wen Congyang @ 2015-12-25 3:06 ` Wen Congyang 2015-12-29 12:46 ` Andrew Cooper 2015-12-29 11:24 ` Andrew Cooper 1 sibling, 1 reply; 21+ messages in thread From: Wen Congyang @ 2015-12-25 3:06 UTC (permalink / raw) To: Andrew Cooper; +Cc: xen devel On 12/25/2015 09:45 AM, Wen Congyang wrote: > On 12/24/2015 08:36 PM, Andrew Cooper wrote: >> On 24/12/15 02:29, Wen Congyang wrote: >>> Hi Andrew Cooper: >>> >>> I rebase the COLO codes to the newest upstream xen, and test it. I found >>> a problem in the test, and I can reproduce this problem via the migration. >>> >>> How to reproduce: >>> 1. xl cr -p hvm_nopv >>> 2. xl migrate hvm_nopv 192.168.3.1 >> >> You are the very first person to try a usecase like this. >> >> It works as much as it does because of your changes to the uncooperative HVM domain logic. I have said repeatedly during review, this is not necessarily a safe change to make without an in-depth analysis of the knock-on effects; it looks as if you have found the first knock-on effect. >> >>> >>> The migration successes, but the vm doesn't run in the target machine. >>> You can get the reason from 'xl dmesg': >>> (XEN) HVM2 restore: VMCE_VCPU 1 >>> (XEN) HVM2 restore: TSC_ADJUST 0 >>> (XEN) HVM2 restore: TSC_ADJUST 1 >>> (d2) HVM Loader >>> (d2) Detected Xen v4.7-unstable >>> (d2) Get guest memory maps[128] failed. (-38) >>> (d2) *** HVMLoader bug at e820.c:39 >>> (d2) *** HVMLoader crashed. >>> >>> The reason is that: >>> We don't call xc_domain_set_memory_map() in the target machine. >>> When we create a hvm domain: >>> libxl__domain_build() >>> libxl__build_hvm() >>> libxl__arch_domain_construct_memmap() >>> xc_domain_set_memory_map() >>> >>> Should we migrate the guest memory from source machine to target machine? >> >> This bug specifically is because HVMLoader is expected to have run and turned the hypercall information in an E820 table in the guest before a migration occurs. 
>>
>> Unfortunately, the current codebase is riddled with such assumptions and expectations (e.g. the HVM save code assumed that FPU context is valid when it is saving register state) which is a direct side effect of how it was developed.
>>
>> Having said all of the above, I agree that your example is a usecase which should work. It is the ultimate test of whether the migration stream contains enough information to faithfully reproduce the domain on the far side. Clearly at the moment, this is not the case.
>>
>> I have an upcoming project to work on the domain memory layout logic, because it is unsuitable for a number of XenServer usecases. Part of that will require moving it in the migration stream.
>
> I found another migration problem in the test: if the migration fails, we resume the guest on the source side, but the HVM guest doesn't respond any more.
>
> In my test environment, the migration always succeeds, so I use a hack to reproduce the failure:
> 1. Modify the target xen tools:
>
> diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
> index 258dec4..da95606 100644
> --- a/tools/libxl/libxl_stream_read.c
> +++ b/tools/libxl/libxl_stream_read.c
> @@ -767,6 +767,8 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
>         goto err;
>     }
>
> +    rc = ERROR_FAIL;
> +
> err:
>     check_all_finished(egc, stream, rc);
>
> 2. xl cr hvm_nopv, and wait some time (you can log in to the guest)
> 3. xl migrate hvm_nopv 192.168.3.1

Another problem: if the migration fails after the guest is suspended, we resume it on the source. In this case, we cannot shut the guest down, because no process handles the shutdown event.

The log in /var/log/xen/xl-hvm_nopv.log:
Waiting for domain hvm_nopv (domid 1) to die [pid 5508]
Domain 1 has shut down, reason code 2 0x2
Domain has suspended.
Done. Exiting now

The xl process has exited...

Thanks
Wen Congyang

>
> The reason is that:
> We create a default ioreq server when we get the hvm param HVM_PARAM_IOREQ_PFN.
> It means that: the problem occurs only when the migration fails after we get > the hvm param HVM_PARAM_IOREQ_PFN. > > In the function hvm_select_ioreq_server() > If the I/O will be handed by non-default ioreq server, we will return the > non-default ioreq server. In this case, it is handed by qemu. > If the I/O will not be handed by non-default ioreq server, we will return > the default ioreq server. Before migration, we return NULL, and after migration > it is not NULL. > See the caller is hvmemul_do_io(): > case X86EMUL_UNHANDLEABLE: > { > struct hvm_ioreq_server *s = > hvm_select_ioreq_server(curr->domain, &p); > > /* If there is no suitable backing DM, just ignore accesses */ > if ( !s ) > { > rc = hvm_process_io_intercept(&null_handler, &p); > vio->io_req.state = STATE_IOREQ_NONE; > } > else > { > rc = hvm_send_ioreq(s, &p, 0); > if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down ) > vio->io_req.state = STATE_IOREQ_NONE; > else if ( data_is_addr ) > rc = X86EMUL_OKAY; > } > break; > > We send the I/O request to the default I/O request server, but no backing > DM hands it. We wil wait the I/O forever...... > > Thanks > Wen Congyang > >> >> ~Andrew >> >> >> . >> > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel > . > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2015-12-25 3:06 ` Wen Congyang @ 2015-12-29 12:46 ` Andrew Cooper 2016-01-04 15:31 ` Ian Jackson 0 siblings, 1 reply; 21+ messages in thread From: Andrew Cooper @ 2015-12-29 12:46 UTC (permalink / raw) To: Wen Congyang; +Cc: Wei Liu, Ian Jackson, Ian.Campbell, xen devel On 25/12/2015 03:06, Wen Congyang wrote: > On 12/25/2015 09:45 AM, Wen Congyang wrote: >> On 12/24/2015 08:36 PM, Andrew Cooper wrote: >>> On 24/12/15 02:29, Wen Congyang wrote: >>>> Hi Andrew Cooper: >>>> >>>> I rebase the COLO codes to the newest upstream xen, and test it. I found >>>> a problem in the test, and I can reproduce this problem via the migration. >>>> >>>> How to reproduce: >>>> 1. xl cr -p hvm_nopv >>>> 2. xl migrate hvm_nopv 192.168.3.1 >>> You are the very first person to try a usecase like this. >>> >>> It works as much as it does because of your changes to the uncooperative HVM domain logic. I have said repeatedly during review, this is not necessarily a safe change to make without an in-depth analysis of the knock-on effects; it looks as if you have found the first knock-on effect. >>> >>>> The migration successes, but the vm doesn't run in the target machine. >>>> You can get the reason from 'xl dmesg': >>>> (XEN) HVM2 restore: VMCE_VCPU 1 >>>> (XEN) HVM2 restore: TSC_ADJUST 0 >>>> (XEN) HVM2 restore: TSC_ADJUST 1 >>>> (d2) HVM Loader >>>> (d2) Detected Xen v4.7-unstable >>>> (d2) Get guest memory maps[128] failed. (-38) >>>> (d2) *** HVMLoader bug at e820.c:39 >>>> (d2) *** HVMLoader crashed. >>>> >>>> The reason is that: >>>> We don't call xc_domain_set_memory_map() in the target machine. >>>> When we create a hvm domain: >>>> libxl__domain_build() >>>> libxl__build_hvm() >>>> libxl__arch_domain_construct_memmap() >>>> xc_domain_set_memory_map() >>>> >>>> Should we migrate the guest memory from source machine to target machine? 
>>> This bug specifically is because HVMLoader is expected to have run and turned the hypercall information in an E820 table in the guest before a migration occurs. >>> >>> Unfortunately, the current codebase is riddled with such assumption and expectations (e.g. the HVM save code assumed that FPU context is valid when it is saving register state) which is a direct side effect of how it was developed. >>> >>> >>> Having said all of the above, I agree that your example is a usecase which should work. It is the ultimate test of whether the migration stream contains enough information to faithfully reproduce the domain on the far side. Clearly at the moment, this is not the case. >>> >>> I have an upcoming project to work on the domain memory layout logic, because it is unsuitable for a number of XenServer usecases. Part of that will require moving it in the migration stream. >> I found another migration problem in the test: >> If the migration fails, we will resume it in the source side. >> But the hvm guest doesn't response any more. >> >> In my test envirionment, the migration always successses, so I >> use a hack way to reproduce it: >> 1. modify the target xen tools: >> >> diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c >> index 258dec4..da95606 100644 >> --- a/tools/libxl/libxl_stream_read.c >> +++ b/tools/libxl/libxl_stream_read.c >> @@ -767,6 +767,8 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void, >> goto err; >> } >> >> + rc = ERROR_FAIL; >> + >> err: >> check_all_finished(egc, stream, rc); >> >> 2. xl cr hvm_nopv, and wait some time(You can login to the guest) >> 3. xl migrate hvm_nopv 192.168.3.1 > Another problem: > If migration fails after the guest is suspended, we will resume it in the source. > In this case, we cannot shutdown it. because no process hanlds the shutdown event. 
> The log in /var/log/xen/xl-hvm_nopv.log: > Waiting for domain hvm_nopv (domid 1) to die [pid 5508] > Domain 1 has shut down, reason code 2 0x2 > Domain has suspended. > Done. Exiting now > > The xl has exited... > > Thanks > Wen Congyang Hmm yes. This is a libxl bug in libxl_evenable_domain_death(). CC'ing the toolstack maintainers. It waits for the @releasedomain watch, but doesn't interpret the results correctly. In particular, if it can still make successful hypercalls with the provided domid, that domain was not the subject of @releasedomain. (I also observe that domain_death_xswatch_callback() is very inefficient. It only needs to make a single hypercall, not query the entire state of all domains.) ~Andrew ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2015-12-29 12:46 ` Andrew Cooper @ 2016-01-04 15:31 ` Ian Jackson 2016-01-04 15:44 ` Ian Campbell ` (2 more replies) 0 siblings, 3 replies; 21+ messages in thread From: Ian Jackson @ 2016-01-04 15:31 UTC (permalink / raw) To: Andrew Cooper; +Cc: Wei Liu, Ian.Campbell, Wen Congyang, xen devel Andrew Cooper writes ("Re: [Xen-devel] question about migration"): > On 25/12/2015 03:06, Wen Congyang wrote: > > Another problem: > > If migration fails after the guest is suspended, we will resume it in the source. > > In this case, we cannot shutdown it. because no process hanlds the shutdown event. > > The log in /var/log/xen/xl-hvm_nopv.log: > > Waiting for domain hvm_nopv (domid 1) to die [pid 5508] > > Domain 1 has shut down, reason code 2 0x2 > > Domain has suspended. > > Done. Exiting now > > > > The xl has exited... ... > Hmm yes. This is a libxl bug in libxl_evenable_domain_death(). CC'ing > the toolstack maintainers. AIUI this is a response to Wen's comments above. > It waits for the @releasedomain watch, but doesn't interpret the results > correctly. In particular, if it can still make successful hypercalls > with the provided domid, that domain was not the subject of > @releasedomain. (I also observe that domain_death_xswatch_callback() is > very inefficient. It only needs to make a single hypercall, not query > the entire state of all domains.) I don't understand precisely what you allege this bug to be, but: * libxl_evenable_domain_death may generate two events, a DOMAIN_SHUTDOWN and a DOMAIN_DEATH, or only one, a DOMAIN_DEATH. This is documented in libxl.h (although it refers to DESTROY rather than DEATH - see patch below to fix the doc). * @releaseDomain usually triggers twice for each domain: once when it goes to SHUTDOWN and once when it is actually destroyed. (This is obviously necessary to implement the above.) * @releaseDomain does not have a specific domain which is the "subject of @releaseDomain". 
Arguably this is unhelpful, but it is not libxl's fault. It arises from the VIRQ generated by Xen. Note that xenstored needs to search its own list of active domains to see what has happened; it generates the @releaseDomain event and throws away the domid. * domain_death_xswatch_callback handles the reporting for all domains that this instance of libxl is interested in. * That libxl searches its own smaller list of domains it is interested in is I think not a huge performance problem in a context where xenstored has to search all the domains on the system. In the worst case where each libxl process is interested in one domain it makes O(number of domains) hypercalls requesting info for 200 domains each. If this `200' is a problem a better heuristic would be quite easy. * Wen is using xl. The domain death events in xl are handled in xl_cmdimpl.c in the big while loop near the end of create_domain. The message in the log comes from this (in `create_domain'): case LIBXL_EVENT_TYPE_DOMAIN_SHUTDOWN: LOG("Domain %d has shut down, reason code %d 0x%x", domid, event->u.domain_shutdown.shutdown_reason, event->u.domain_shutdown.shutdown_reason); and then this (in `handle_domain_death') case LIBXL_SHUTDOWN_REASON_SUSPEND: LOG("Domain has suspended."); return 0; This is anomalous coding style because handle_domain_death should return one of the DOMAIN_RESTART values. This anomaly was introduced in 1a0e1789 "xl: support on_{poweroff,reboot,crash} domain configuration options", which introduced ACTION_* but left the suspend case to `return 0'. 0 now means in this context DOMAIN_RESTART_NONE which means that the daemonic xl should simply exit. But this stylistic anomaly is of no functional import: even before that (eg in 0e89a4ce~), shutdown with the reason code suspend causes xl to exit. The daemonic xl exits in this situation because it expects the migration machinery to tidy up the domain. * It is not possible to resume the domain in the source after it has suspended. 
The domain is, in the hypervisor, in the lifecycle state SHUTDOWN. For this reason in general, it is a bad idea to suspend the guest until the migration is known to be likely to succeed.

After a late migration failure the domain should probably be treated as crashed. This is slightly awkward to implement because the daemonic xl would have to hang around to see what happened next and perhaps be told what to do. Or maybe the migrating xl would have to know what the on_crash configuration was.

In the meantime the failure is rather less graceful: debris is simply left for the user. This is not ideal but tolerable I think.

Ian.

From 00c4fc6495078026d68f3fdd88bfd29a8ad483d7 Mon Sep 17 00:00:00 2001
From: Ian Jackson <ian.jackson@eu.citrix.com>
Date: Mon, 4 Jan 2016 15:13:14 +0000
Subject: [PATCH] libxl: Fix doc comment ref to DOMAIN_DEATH

The doc comment for libxl_evdisable_domain_death mistakenly referred
to DOMAIN_DESTROY but the event type name is actually DOMAIN_DEATH.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
---
 tools/libxl/libxl_event.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h
index fad4c14..1ea789e 100644
--- a/tools/libxl/libxl_event.h
+++ b/tools/libxl/libxl_event.h
@@ -179,9 +179,9 @@ typedef struct libxl__evgen_domain_death libxl_evgen_domain_death;
 int libxl_evenable_domain_death(libxl_ctx *ctx, uint32_t domid,
                         libxl_ev_user, libxl_evgen_domain_death **evgen_out);
 void libxl_evdisable_domain_death(libxl_ctx *ctx, libxl_evgen_domain_death*);
-  /* Arranges for the generation of DOMAIN_SHUTDOWN and DOMAIN_DESTROY
+  /* Arranges for the generation of DOMAIN_SHUTDOWN and DOMAIN_DEATH
   * events.  A domain which is destroyed before it shuts down
-  * may generate only a DESTROY event.
+  * may generate only a DEATH event.
*/ typedef struct libxl__evgen_disk_eject libxl_evgen_disk_eject; -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: question about migration 2016-01-04 15:31 ` Ian Jackson @ 2016-01-04 15:44 ` Ian Campbell 2016-01-04 15:48 ` Ian Campbell 2016-01-04 16:38 ` Andrew Cooper 2 siblings, 0 replies; 21+ messages in thread From: Ian Campbell @ 2016-01-04 15:44 UTC (permalink / raw) To: Ian Jackson, Andrew Cooper; +Cc: Wei Liu, Wen Congyang, xen devel On Mon, 2016-01-04 at 15:31 +0000, Ian Jackson wrote: > > From 00c4fc6495078026d68f3fdd88bfd29a8ad483d7 Mon Sep 17 00:00:00 2001 > From: Ian Jackson <ian.jackson@eu.citrix.com> > Date: Mon, 4 Jan 2016 15:13:14 +0000 > Subject: [PATCH] libxl: Fix doc comment ref to DOMAIN_DEATH > > The doc comment for libxl_evdisable_domain_death mistakenly referred > to DOMAIN_DESTROY but the event type name is actually DOMAIN_DEATH. > > Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> > CC: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> > --- > tools/libxl/libxl_event.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h > index fad4c14..1ea789e 100644 > --- a/tools/libxl/libxl_event.h > +++ b/tools/libxl/libxl_event.h > @@ -179,9 +179,9 @@ typedef struct libxl__evgen_domain_death > libxl_evgen_domain_death; > int libxl_evenable_domain_death(libxl_ctx *ctx, uint32_t domid, > libxl_ev_user, libxl_evgen_domain_death > **evgen_out); > void libxl_evdisable_domain_death(libxl_ctx *ctx, > libxl_evgen_domain_death*); > - /* Arranges for the generation of DOMAIN_SHUTDOWN and DOMAIN_DESTROY > + /* Arranges for the generation of DOMAIN_SHUTDOWN and DOMAIN_DEATH > * events. A domain which is destroyed before it shuts down > - * may generate only a DESTROY event. > + * may generate only a DEATH event. 
> */ > > typedef struct libxl__evgen_disk_eject libxl_evgen_disk_eject; _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2016-01-04 15:31 ` Ian Jackson 2016-01-04 15:44 ` Ian Campbell @ 2016-01-04 15:48 ` Ian Campbell 2016-01-04 16:38 ` Andrew Cooper 2 siblings, 0 replies; 21+ messages in thread From: Ian Campbell @ 2016-01-04 15:48 UTC (permalink / raw) To: Ian Jackson, Andrew Cooper; +Cc: Wei Liu, Wen Congyang, xen devel On Mon, 2016-01-04 at 15:31 +0000, Ian Jackson wrote: > [...] > The daemonic xl exits in this situation because it expects the > migration machinery to tidy up the domain. > > * It is not possible to resume the domain in the source after it has > suspended. Isn't supposed to be, isn't that why SCHEDOP_suspend returns a flag to indicate if it was cancelled or not, so the guest can be resumed on the source if "something" went wrong? (And isn't such a resume the same as what happens after a checkpoint?) Unless you are referring not to the general case but to some more specific scenario/window where this is no longer possible? > The domain is, in the hypervisor, in the lifecycle state > SHUTDOWN. For this reason in general, it is a bad idea to suspend > the guest until the migration is known to be likely to succeed. > > After a late migration failure the domain should probably be treated > as crashed. This is slightly awkward to implement because the > daemonic xl would have to hang around to see what happened next and > perhaps be told what to do. Or maybe the migrating xl would have to > know what the on_crash configuration was. > > In the meantime the failure is rather less graceful: debris is > simply left for the user. This is not ideal but tolerable I think. > > Ian. > > > From 00c4fc6495078026d68f3fdd88bfd29a8ad483d7 Mon Sep 17 00:00:00 2001 > From: Ian Jackson <ian.jackson@eu.citrix.com> > Date: Mon, 4 Jan 2016 15:13:14 +0000 > Subject: [PATCH] libxl: Fix doc comment ref to DOMAIN_DEATH > > The doc comment for libxl_evdisable_domain_death mistakenly referred > to DOMAIN_DESTROY but the event type name is actually DOMAIN_DEATH. 
> > Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> > CC: Andrew Cooper <andrew.cooper3@citrix.com> > --- > tools/libxl/libxl_event.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h > index fad4c14..1ea789e 100644 > --- a/tools/libxl/libxl_event.h > +++ b/tools/libxl/libxl_event.h > @@ -179,9 +179,9 @@ typedef struct libxl__evgen_domain_death > libxl_evgen_domain_death; > int libxl_evenable_domain_death(libxl_ctx *ctx, uint32_t domid, > libxl_ev_user, libxl_evgen_domain_death > **evgen_out); > void libxl_evdisable_domain_death(libxl_ctx *ctx, > libxl_evgen_domain_death*); > - /* Arranges for the generation of DOMAIN_SHUTDOWN and DOMAIN_DESTROY > + /* Arranges for the generation of DOMAIN_SHUTDOWN and DOMAIN_DEATH > * events. A domain which is destroyed before it shuts down > - * may generate only a DESTROY event. > + * may generate only a DEATH event. > */ > > typedef struct libxl__evgen_disk_eject libxl_evgen_disk_eject; _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2016-01-04 15:31 ` Ian Jackson 2016-01-04 15:44 ` Ian Campbell 2016-01-04 15:48 ` Ian Campbell @ 2016-01-04 16:38 ` Andrew Cooper 2016-01-04 17:46 ` Ian Jackson 2 siblings, 1 reply; 21+ messages in thread From: Andrew Cooper @ 2016-01-04 16:38 UTC (permalink / raw) To: Ian Jackson; +Cc: Wei Liu, Ian.Campbell, Wen Congyang, xen devel On 04/01/16 15:31, Ian Jackson wrote: > Andrew Cooper writes ("Re: [Xen-devel] question about migration"): >> On 25/12/2015 03:06, Wen Congyang wrote: >>> Another problem: >>> If migration fails after the guest is suspended, we will resume it in the source. >>> In this case, we cannot shutdown it. because no process hanlds the shutdown event. >>> The log in /var/log/xen/xl-hvm_nopv.log: >>> Waiting for domain hvm_nopv (domid 1) to die [pid 5508] >>> Domain 1 has shut down, reason code 2 0x2 >>> Domain has suspended. >>> Done. Exiting now >>> >>> The xl has exited... > ... >> Hmm yes. This is a libxl bug in libxl_evenable_domain_death(). CC'ing >> the toolstack maintainers. > AIUI this is a response to Wen's comments above. > >> It waits for the @releasedomain watch, but doesn't interpret the results >> correctly. In particular, if it can still make successful hypercalls >> with the provided domid, that domain was not the subject of >> @releasedomain. (I also observe that domain_death_xswatch_callback() is >> very inefficient. It only needs to make a single hypercall, not query >> the entire state of all domains.) > I don't understand precisely what you allege this bug to be, but: > > * libxl_evenable_domain_death may generate two events, a > DOMAIN_SHUTDOWN and a DOMAIN_DEATH, or only one, a DOMAIN_DEATH. > This is documented in libxl.h (although it refers to DESTROY rather > than DEATH - see patch below to fix the doc). > > * @releaseDomain usually triggers twice for each domain: once when it > goes to SHUTDOWN and once when it is actually destroyed. (This is > obviously necessary to implement the above.) 
So it does. I clearly had an accident with `git grep` when I came to the opposite conclusion. Apologies for the noise generated from this. > > * @releaseDomain does not have a specific domain which is the "subject > of @releaseDomain". Arguably this is unhelpful, but it is not > libxl's fault. It arises from the VIRQ generated by Xen. Note that > xenstored needs to search its own list of active domains to see what > has happened; it generates the @releaseDomain event and throws away > the domid. The semantics of @releaseDomain are quite mad, but this is how it has always been. The current semantics are a scalability limitation which someone in XenServer will likely get around to in due course (we support 1000 VMs per host). > * It is not possible to resume the domain in the source after it has > suspended. This functionality exists and is already used in several circumstances, both by libxl, and other toolstacks. xl has an added split-brain problem here that plain daemonic toolstacks don't have; specifically that there are two completely independent processes playing with the domain state at the same time. The daemonic xl needs to ignore DOMAIN_SHUTDOWN and tidy up only after DOMAIN_DEATH. Under these circumstances, a failed migrate which resumes the domain won't result in qemu being cleaned up. ~Andrew ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2016-01-04 16:38 ` Andrew Cooper @ 2016-01-04 17:46 ` Ian Jackson 2016-01-04 18:05 ` Andrew Cooper 0 siblings, 1 reply; 21+ messages in thread From: Ian Jackson @ 2016-01-04 17:46 UTC (permalink / raw) To: Andrew Cooper; +Cc: Wei Liu, Ian.Campbell, Wen Congyang, xen devel Andrew Cooper writes ("Re: [Xen-devel] question about migration"): > On 04/01/16 15:31, Ian Jackson wrote: > > * It is not possible to resume the domain in the source after it has > > suspended. > > This functionality exists and is already used in several circumstances, > both by libxl, and other toolstacks. Oh! > xl has an added split-brain problem here that plain demonic toolstacks > don't have; specifically that there are two completely independent > processes playing with the domain state at the same time. > > The daemonic xl needs to ignore DOMAIN_SHUTDOWN and tidy up only after > DOMAIN_DEATH. Under these circumstances, a failed migrate which resumes > the domain won't result in qemu being cleaned up. I think there is some kind of further underlying problem here. Suppose a domain goes into SHUTDOWN with reason code SUSPEND. Then later it is resumed again (perhaps the migration failed). And later it shuts down cleanly. Will this generate a second @releaseDomain ? How is a toolstack supposed to discover this situation ? Should libxl simply treat domains in state SHUTDOWN/SUSPEND as if they were running, and not issue a notification to anyone ? > > * @releaseDomain does not have a specific domain which is the "subject > > of @releaseDomain". Arguably this is unhelpful, but it is not > > libxl's fault. It arises from the VIRQ generated by Xen. Note that > > xenstored needs to search its own list of active domains to see what > > has happened; it generates the @releaseDomain event and throws away > > the domid. > > The semantics of @releaseDomain are quite mad, but this is have it has > always been. Yes. 
> The current semantics are a scalability limitation which someone in > XenServer will likely get around to in due course (we support 1000 VMs > per host). Jolly good :-). Ian. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2016-01-04 17:46 ` Ian Jackson @ 2016-01-04 18:05 ` Andrew Cooper 2016-01-05 15:40 ` Ian Jackson 0 siblings, 1 reply; 21+ messages in thread From: Andrew Cooper @ 2016-01-04 18:05 UTC (permalink / raw) To: Ian Jackson; +Cc: Wei Liu, Ian.Campbell, Wen Congyang, xen devel On 04/01/16 17:46, Ian Jackson wrote: > Andrew Cooper writes ("Re: [Xen-devel] question about migration"): >> On 04/01/16 15:31, Ian Jackson wrote: >>> * It is not possible to resume the domain in the source after it has >>> suspended. >> This functionality exists and is already used in several circumstances, >> both by libxl, and other toolstacks. > Oh! > >> xl has an added split-brain problem here that plain demonic toolstacks >> don't have; specifically that there are two completely independent >> processes playing with the domain state at the same time. >> >> The daemonic xl needs to ignore DOMAIN_SHUTDOWN and tidy up only after >> DOMAIN_DEATH. Under these circumstances, a failed migrate which resumes >> the domain won't result in qemu being cleaned up. > I think there is some kind of further underlying problem here. > > Suppose a domain goes into SHUTDOWN with reason code SUSPEND. Then > later it is resumed again (perhaps the migration failed). And later > it shuts down cleanly. > > Will this generate a second @releaseDomain ? How is a toolstack > supposed to discover this situation ? By the looks of it in Xen, domain_resume() sets d->is_shut_down = 0, which means a subsequent domain_shutdown() will generate a new VIRQ. Therefore, a new @releaseDomain will arrive at a later point. (Actually, there are 3 VIRQs on a standard shutdown path, but the final two are very close together. It is exceedingly likely that they are amortised in the time it takes dom0 to respond to the evtchn for the penultimate notification.) > > Should libxl simply treat domains in state SHUTDOWN/SUSPEND as if they > were running, and not issue a notification to anyone ? 
As far as libxl_evenable_domain_death() is concerned, yes. In general however, no. Consider a libxl user (such as a snapshotting utility) which asks a domain to suspend to quiesce the disks and then intends to resume the domain. ~Andrew ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2016-01-04 18:05 ` Andrew Cooper @ 2016-01-05 15:40 ` Ian Jackson 2016-01-05 17:39 ` Andrew Cooper 0 siblings, 1 reply; 21+ messages in thread From: Ian Jackson @ 2016-01-05 15:40 UTC (permalink / raw) To: Andrew Cooper; +Cc: Wei Liu, Ian.Campbell, Wen Congyang, xen devel Andrew Cooper writes ("Re: [Xen-devel] question about migration"): > On 04/01/16 17:46, Ian Jackson wrote: > > Suppose a domain goes into SHUTDOWN with reason code SUSPEND. Then > > later it is resumed again (perhaps the migration failed). And later > > it shuts down cleanly. > > > > Will this generate a second @releaseDomain ? How is a toolstack > > supposed to discover this situation ? > > By the looks of it in Xen, domain_resume() sets d->is_shut_down = 0, > which means a subsequent domain_shutdown() will generate a new VIRQ. > > Therefore, a new @releaseDomain will arrive at a later point. Well, jolly good. > > Should libxl simply treat domains in state SHUTDOWN/SUSPEND as if they > > were running, and not issue a notification to anyone ? > > As far as libxl_evenable_domain_death() is concerned, yes. Would you like to prepare a patch ? > In general however, no. Consider a libxl user (such as a snapshotting > utility) which asks a domain to suspend to quiesce the disks and then > intends to resume the domain. Currently such a utility would have to use libxl_evenable_domain_death to discover when the domain suspends, but that would cause breakage as discussed in this thread. So I think such a utility would need a new libxl evenable interface. Do you agree ? Ian. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2016-01-05 15:40 ` Ian Jackson @ 2016-01-05 17:39 ` Andrew Cooper 2016-01-05 18:17 ` Ian Jackson 0 siblings, 1 reply; 21+ messages in thread From: Andrew Cooper @ 2016-01-05 17:39 UTC (permalink / raw) To: Ian Jackson; +Cc: Wei Liu, Ian.Campbell, Wen Congyang, xen devel On 05/01/16 15:40, Ian Jackson wrote: > Andrew Cooper writes ("Re: [Xen-devel] question about migration"): >> On 04/01/16 17:46, Ian Jackson wrote: >>> Suppose a domain goes into SHUTDOWN with reason code SUSPEND. Then >>> later it is resumed again (perhaps the migration failed). And later >>> it shuts down cleanly. >>> >>> Will this generate a second @releaseDomain ? How is a toolstack >>> supposed to discover this situation ? >> By the looks of it in Xen, domain_resume() sets d->is_shut_down = 0, >> which means a subsequent domain_shutdown() will generate a new VIRQ. >> >> Therefore, a new @releaseDomain will arrive at a later point. > Well, jolly good. > >>> Should libxl simply treat domains in state SHUTDOWN/SUSPEND as if they >>> were running, and not issue a notification to anyone ? >> As far as libxl_evenable_domain_death() is concerned, yes. > Would you like to prepare a patch ? I don't have a repro of the issue. This thread was merely me triaging an issue reported by Wen, given some `git grep`. (There are actually 3 different bugs on different subthreads of this thread.) > >> In general however, no. Consider a libxl user (such as a snapshotting >> utility) which asks a domain to suspend to quiesce the disks and then >> intends to resume the domain. > Currently such a utility would have to use > libxl_evenable_domain_death to discover when the domain suspends, but > that would cause breakage as discussed in this thread. So I think > such a utility would need a new libxl evenable interface. Do you > agree ? This looks like a yakk. Changing the behaviour of libxl_evenable_domain_death() will break the API and also break in-guest users. 
It also doesn't return the shutdown code, so the caller can't loop while shutdown && suspend. I can't think of a solution which doesn't involve making a brand new function written from scratch. ~Andrew ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2016-01-05 17:39 ` Andrew Cooper @ 2016-01-05 18:17 ` Ian Jackson 2016-01-06 10:21 ` Ian Campbell 0 siblings, 1 reply; 21+ messages in thread From: Ian Jackson @ 2016-01-05 18:17 UTC (permalink / raw) To: Andrew Cooper; +Cc: Wei Liu, Ian.Campbell, Wen Congyang, xen devel Andrew Cooper writes ("Re: [Xen-devel] question about migration"): > On 05/01/16 15:40, Ian Jackson wrote: > > Would you like to prepare a patch ? > > I don't have a repro of the issue. This thread was merely me triaging > an issue reported by Wen, given some `git grep`. (There are actually 3 > different bugs on different subthreads of this thread.) I will do so then. > >> In general however, no. Consider a libxl user (such as a snapshotting > >> utility) which asks a domain to suspend to quiesce the disks and then > >> intends to resume the domain. > > > > Currently such a utility would have to use > > libxl_evenable_domain_death to discover when the domain suspends, but > > that would cause breakage as discussed in this thread. So I think > > such a utility would need a new libxl evenable interface. Do you > > agree ? > > This looks like a yakk. It's a yak that someone would have to shave when they wanted to write such a utility. > Changing the behaviour of libxl_evenable_domain_death() will break the > API and also break in-guest users. It also doesn't return the shutdown > code, so the caller can't loop while shutdown && suspend. I don't understand what you mean. Currently libxl_evenable_domain_death reports DOMAIN_SHUTDOWN and provides the reason code. But it only does so once, so when a domain is migrated it triggers during the suspend-for-migrate and then generates no further SHUTDOWN event. So any code trying to use this for your snapshotting case is already broken and cannot be fixed without introducing a new API (probably one which generates separate suspend/unsuspend events). 
(We can't have libxl_evenable_domain_death generate new unsuspend events because existing callers won't understand.) I therefore propose that: libxl_evenable_domain_death should never generate DOMAIN_SHUTDOWN with reason code suspended. Instead, it should hope that the domain will go back to running. If the domain ends up shut down for some other reason, or is destroyed, it should report those things. In the future we introduce libxl_evenable_domain_state which generates the existing events from libxl_evenable_domain_death but also two new events DOMAIN_SUSPENDED or DOMAIN_RUNNING. Ian. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2016-01-05 18:17 ` Ian Jackson @ 2016-01-06 10:21 ` Ian Campbell 0 siblings, 0 replies; 21+ messages in thread From: Ian Campbell @ 2016-01-06 10:21 UTC (permalink / raw) To: Ian Jackson, Andrew Cooper; +Cc: Wei Liu, Wen Congyang, xen devel On Tue, 2016-01-05 at 18:17 +0000, Ian Jackson wrote: > So any code trying to use this for your snapshotting case is already > broken and cannot be fixed without introducing a new API (probably one > which generates separate suspend/unsuspend events). Remus does seem to work at least to the extent that it seems not to be hitting this case though? Or at least I assume so since people have reported various cases as working. Or maybe no one ever did "xl shutdown" after checkpointing so this went unnoticed. I'm trying to decide if this is "it's completely knackered" or "fails in some corner cases". > (We can't have libxl_evenable_domain_death generate new unsuspend > events because existing callers won't understand.) Can we make it so that the new API can be extended in this way, even if we just document "ignore unknown events"? > I therefore propose that: > > libxl_evenable_domain_death should never generate DOMAIN_SHUTDOWN with > reason code suspended. FWIW libvirt ignores these (and as we know xl incorrectly exits). I guess we think it unlikely that anything could be relying on the current behaviour? OOI would calling libxl_evdisable_domain_death+libxl_evenable_domain_death reset the one-shot nature of this event? > Instead, it should hope that the domain will > go back to running. If the domain ends up shut down for some other > reason, or is destroyed, it should report those things. > > In the future we introduce libxl_evenable_domain_state which generates > the existing events from libxl_evenable_domain_death but also two new > events DOMAIN_SUSPENDED or DOMAIN_RUNNING. I infer that this new function will take a mask of domain states which the caller is interested in (and hence is extensible)? 
Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2015-12-25 1:45 ` Wen Congyang 2015-12-25 3:06 ` Wen Congyang @ 2015-12-29 11:24 ` Andrew Cooper 2016-01-04 10:28 ` Paul Durrant 1 sibling, 1 reply; 21+ messages in thread From: Andrew Cooper @ 2015-12-29 11:24 UTC (permalink / raw) To: Wen Congyang; +Cc: Paul Durrant, xen devel On 25/12/2015 01:45, Wen Congyang wrote: > On 12/24/2015 08:36 PM, Andrew Cooper wrote: >> On 24/12/15 02:29, Wen Congyang wrote: >>> Hi Andrew Cooper: >>> >>> I rebase the COLO codes to the newest upstream xen, and test it. I found >>> a problem in the test, and I can reproduce this problem via the migration. >>> >>> How to reproduce: >>> 1. xl cr -p hvm_nopv >>> 2. xl migrate hvm_nopv 192.168.3.1 >> You are the very first person to try a usecase like this. >> >> It works as much as it does because of your changes to the uncooperative HVM domain logic. I have said repeatedly during review, this is not necessarily a safe change to make without an in-depth analysis of the knock-on effects; it looks as if you have found the first knock-on effect. >> >>> The migration successes, but the vm doesn't run in the target machine. >>> You can get the reason from 'xl dmesg': >>> (XEN) HVM2 restore: VMCE_VCPU 1 >>> (XEN) HVM2 restore: TSC_ADJUST 0 >>> (XEN) HVM2 restore: TSC_ADJUST 1 >>> (d2) HVM Loader >>> (d2) Detected Xen v4.7-unstable >>> (d2) Get guest memory maps[128] failed. (-38) >>> (d2) *** HVMLoader bug at e820.c:39 >>> (d2) *** HVMLoader crashed. >>> >>> The reason is that: >>> We don't call xc_domain_set_memory_map() in the target machine. >>> When we create a hvm domain: >>> libxl__domain_build() >>> libxl__build_hvm() >>> libxl__arch_domain_construct_memmap() >>> xc_domain_set_memory_map() >>> >>> Should we migrate the guest memory from source machine to target machine? >> This bug specifically is because HVMLoader is expected to have run and turned the hypercall information in an E820 table in the guest before a migration occurs. 
>> Unfortunately, the current codebase is riddled with such assumptions and expectations (e.g. the HVM save code assumed that FPU context is valid when it is saving register state) which is a direct side effect of how it was developed. >> >> >> Having said all of the above, I agree that your example is a usecase which should work. It is the ultimate test of whether the migration stream contains enough information to faithfully reproduce the domain on the far side. Clearly at the moment, this is not the case. >> >> I have an upcoming project to work on the domain memory layout logic, because it is unsuitable for a number of XenServer usecases. Part of that will require moving it in the migration stream. > I found another migration problem in the test: > If the migration fails, we will resume it in the source side. > But the hvm guest doesn't respond any more. > > In my test environment, the migration always succeeds, so I > use a hack to reproduce it: > 1. modify the target xen tools: > > diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c > index 258dec4..da95606 100644 > --- a/tools/libxl/libxl_stream_read.c > +++ b/tools/libxl/libxl_stream_read.c > @@ -767,6 +767,8 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void, > goto err; > } > > + rc = ERROR_FAIL; > + > err: > check_all_finished(egc, stream, rc); > > 2. xl cr hvm_nopv, and wait some time (you can log in to the guest) > 3. xl migrate hvm_nopv 192.168.3.1 > > The reason is that: > We create a default ioreq server when we get the hvm param HVM_PARAM_IOREQ_PFN. > It means that: the problem occurs only when the migration fails after we get > the hvm param HVM_PARAM_IOREQ_PFN. > > In the function hvm_select_ioreq_server() > If the I/O will be handled by a non-default ioreq server, we will return the > non-default ioreq server. In this case, it is handled by qemu. 
> If the I/O will not be handled by a non-default ioreq server, we will return > the default ioreq server. Before migration, we return NULL, and after migration > it is not NULL. > See the caller, hvmemul_do_io(): > case X86EMUL_UNHANDLEABLE: > { > struct hvm_ioreq_server *s = > hvm_select_ioreq_server(curr->domain, &p); > > /* If there is no suitable backing DM, just ignore accesses */ > if ( !s ) > { > rc = hvm_process_io_intercept(&null_handler, &p); > vio->io_req.state = STATE_IOREQ_NONE; > } > else > { > rc = hvm_send_ioreq(s, &p, 0); > if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down ) > vio->io_req.state = STATE_IOREQ_NONE; > else if ( data_is_addr ) > rc = X86EMUL_OKAY; > } > break; > > We send the I/O request to the default I/O request server, but no backing > DM handles it. We will wait for the I/O forever... Hmm yes. This needs fixing. CC'ing Paul who did the ioreq server work. This bug is caused by the read side effects of HVM_PARAM_IOREQ_PFN. The migration code needs a way of being able to query whether a default ioreq server exists, without creating one. Can you remember what the justification for the read side effects was? ISTR that it was only for qemu compatibility until the ioreq server work got in upstream. If that was the case, can we drop the read side effects now and mandate that all qemus explicitly create their ioreq servers (even if this involves creating a default ioreq server for qemu-trad)? ~Andrew ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2015-12-29 11:24 ` Andrew Cooper @ 2016-01-04 10:28 ` Paul Durrant 2016-01-04 10:36 ` Andrew Cooper 0 siblings, 1 reply; 21+ messages in thread From: Paul Durrant @ 2016-01-04 10:28 UTC (permalink / raw) To: Andrew Cooper, Wen Congyang; +Cc: xen devel > -----Original Message----- [snip] > > We send the I/O request to the default I/O request server, but no backing > > DM hands it. We wil wait the I/O forever...... > > Hmm yes. This needs fixing. > > CC'ing Paul who did the ioreq server work. > > This bug is caused by the read side effects of HVM_PARAM_IOREQ_PFN. The > migration code needs a way of being able to query whether a default > ioreq server exists, without creating one. > > Can you remember what the justification for the read side effects were? > ISTR that it was only for qemu compatibility until the ioreq server work > got in upstream. If that was the case, can we drop the read side > effects now and mandate that all qemus explicitly create their ioreq > servers (even if this involves creating a default ioreq server for > qemu-trad)? > Yes, you're correct that the read side-effect was the only way of keeping compatibility with existing QEMU at the time. It would be nice to remove that hackery but it would indeed require a patch to qemu-trad and would clearly break compatibility with distro qemu packages built prior to my modification. An alternative solution to avoid breaking compatibility (albeit a bit icky) would be to turn off the side effects when HVMOP_create_ioreq_server is handled. There should be no case where a non-default server needs to be created in advance of a default server. Paul > ~Andrew ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2016-01-04 10:28 ` Paul Durrant @ 2016-01-04 10:36 ` Andrew Cooper 2016-01-04 11:08 ` Paul Durrant 0 siblings, 1 reply; 21+ messages in thread From: Andrew Cooper @ 2016-01-04 10:36 UTC (permalink / raw) To: Paul Durrant, Wen Congyang; +Cc: Wei Liu, Ian Jackson, Ian Campbell, xen devel On 04/01/16 10:28, Paul Durrant wrote: >> -----Original Message----- > [snip] >>> We send the I/O request to the default I/O request server, but no backing >>> DM hands it. We wil wait the I/O forever...... >> Hmm yes. This needs fixing. >> >> CC'ing Paul who did the ioreq server work. >> >> This bug is caused by the read side effects of HVM_PARAM_IOREQ_PFN. The >> migration code needs a way of being able to query whether a default >> ioreq server exists, without creating one. >> >> Can you remember what the justification for the read side effects were? >> ISTR that it was only for qemu compatibility until the ioreq server work >> got in upstream. If that was the case, can we drop the read side >> effects now and mandate that all qemus explicitly create their ioreq >> servers (even if this involves creating a default ioreq server for >> qemu-trad)? >> > Yes, you're correct that the read side-effect was the only way of keeping compatibility with existing QEMU at the time. It would be nice to remove that hackery but it would indeed require a patch to qemu-trad and would clearly break compatibility with distro qemu packages built prior to my modification. > An alternative solution to avoid breaking compatibility (albeit a bit icky) would be to turn off the side effects when HVMOP_create_ioreq_server is handled. There should be no case where a non-default server needs to be created in advance of a default server. I was under the impression that qemu-trad (like libxc) was expected to exactly match. There is deliberately no facility to use a distro packaged qemu-trad in the Xen build system. CC'ing toolstack maintainers for their input. 
If this is indeed the case, then the former option is the better course of action. ~Andrew ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: question about migration 2016-01-04 10:36 ` Andrew Cooper @ 2016-01-04 11:08 ` Paul Durrant 0 siblings, 0 replies; 21+ messages in thread From: Paul Durrant @ 2016-01-04 11:08 UTC (permalink / raw) To: Andrew Cooper, Wen Congyang; +Cc: Ian Jackson, Wei Liu, Ian Campbell, xen devel > -----Original Message----- > From: Andrew Cooper [mailto:andrew.cooper3@citrix.com] > Sent: 04 January 2016 10:36 > To: Paul Durrant; Wen Congyang > Cc: xen devel; Ian Campbell; Ian Jackson; Wei Liu > Subject: Re: [Xen-devel] question about migration > > On 04/01/16 10:28, Paul Durrant wrote: > >> -----Original Message----- > > [snip] > >>> We send the I/O request to the default I/O request server, but no > backing > >>> DM hands it. We wil wait the I/O forever...... > >> Hmm yes. This needs fixing. > >> > >> CC'ing Paul who did the ioreq server work. > >> > >> This bug is caused by the read side effects of HVM_PARAM_IOREQ_PFN. > The > >> migration code needs a way of being able to query whether a default > >> ioreq server exists, without creating one. > >> > >> Can you remember what the justification for the read side effects were? > >> ISTR that it was only for qemu compatibility until the ioreq server work > >> got in upstream. If that was the case, can we drop the read side > >> effects now and mandate that all qemus explicitly create their ioreq > >> servers (even if this involves creating a default ioreq server for > >> qemu-trad)? > >> > > Yes, you're correct that the read side-effect was the only way of keeping > compatibility with existing QEMU at the time. It would be nice to remove that > hackery but it would indeed require a patch to qemu-trad and would clearly > break compatibility with distro qemu packages built prior to my modification. > > An alternative solution to avoid breaking compatibility (albeit a bit icky) > would be to turn off the side effects when HVMOP_create_ioreq_server is > handled. 
There should be no case where a non-default server needs to be > created in advance of a default server. > > I was under the impression that qemu-trad (like libxc) was expected to > exactly match. There is deliberately no facility to use a distro > packaged qemu-trad in the Xen build system. CC'ing toolstack > maintainers for their input. > It wasn't clear but I meant that compatibility with *upstream* QEMU builds prior to my patch would be broken. It depends whether we care about this or not. Paul > If this is indeed the case, then the former option is the better course > of action. > > ~Andrew ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2016-01-06 10:21 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-12-24 2:29 question about migration Wen Congyang 2015-12-24 12:36 ` Andrew Cooper 2015-12-25 0:55 ` Wen Congyang 2015-12-29 10:57 ` Andrew Cooper 2015-12-25 1:45 ` Wen Congyang 2015-12-25 3:06 ` Wen Congyang 2015-12-29 12:46 ` Andrew Cooper 2016-01-04 15:31 ` Ian Jackson 2016-01-04 15:44 ` Ian Campbell 2016-01-04 15:48 ` Ian Campbell 2016-01-04 16:38 ` Andrew Cooper 2016-01-04 17:46 ` Ian Jackson 2016-01-04 18:05 ` Andrew Cooper 2016-01-05 15:40 ` Ian Jackson 2016-01-05 17:39 ` Andrew Cooper 2016-01-05 18:17 ` Ian Jackson 2016-01-06 10:21 ` Ian Campbell 2015-12-29 11:24 ` Andrew Cooper 2016-01-04 10:28 ` Paul Durrant 2016-01-04 10:36 ` Andrew Cooper 2016-01-04 11:08 ` Paul Durrant