* [PATCH v4 0/4] Assorted fixes and improvements @ 2016-02-16 17:37 Roger Pau Monne 2016-02-16 17:37 ` [PATCH v4 1/4] x86/HVM: update the start info structure layout Roger Pau Monne ` (3 more replies) 0 siblings, 4 replies; 39+ messages in thread From: Roger Pau Monne @ 2016-02-16 17:37 UTC (permalink / raw) To: xen-devel Hello, This series contains some assorted patches and improvements. They are completely ortogonal one fro meach other, so they can be applied independently without issues. I've just decided to group them together because I prefer it rather than spamming the list with 4 different threads. #1 updates the HVMlite boot ABI according to the discussion we had arround the design document. #2 introduces a new UNKNOWN VGA type that can be used to let libxl decide which emulated VGA it's going to be used. #3 rewrites and simplifies the strtab/symtab loading, and get rid of a bunch of duplicated code. #4 is a fix for cd-eject issues, which consists in fixing libxl__device_disk_set_backend in order to avoid trying to guess the backend type for LIBXL_DISK_FORMAT_EMPTY disks. Thanks, Roger. NB: this is labeled as v4 because #2 is at v4, I'm not sure about the version of the others, but it's certainly < 4. ^ permalink raw reply [flat|nested] 39+ messages in thread
* [PATCH v4 1/4] x86/HVM: update the start info structure layout 2016-02-16 17:37 [PATCH v4 0/4] Assorted fixes and improvements Roger Pau Monne @ 2016-02-16 17:37 ` Roger Pau Monne 2016-02-16 19:13 ` Andrew Cooper ` (4 more replies) 2016-02-16 17:37 ` [PATCH v4 2/4] libxl: introduce LIBXL_VGA_INTERFACE_TYPE_UNKNOWN Roger Pau Monne ` (2 subsequent siblings) 3 siblings, 5 replies; 39+ messages in thread From: Roger Pau Monne @ 2016-02-16 17:37 UTC (permalink / raw) To: xen-devel Cc: Wei Liu, Ian Campbell, Andrew Cooper, Ian Jackson, Jan Beulich, Samuel Thibault, Roger Pau Monne After some discussion around the new boot ABI consensus has been reached about the layout and contents of the start info. The following patch updates the layout to what has been agreed. Also, the new layout is described in binary terms in order to avoid issues with alignments when using C structs. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> --- tools/libxc/include/xc_dom.h | 31 +++++++++++++++++++++++++ xen/include/public/xen.h | 55 ++++++++++++++++++++++++++++++-------------- 2 files changed, 69 insertions(+), 17 deletions(-) diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h index cac4698..6ebe946 100644 --- a/tools/libxc/include/xc_dom.h +++ b/tools/libxc/include/xc_dom.h @@ -216,6 +216,37 @@ struct xc_dom_image { struct xc_hvm_firmware_module smbios_module; }; +#if defined(__i386__) || defined(__x86_64__) +/* C representation of the x86/HVM start info layout. + * + * The canonical definition of this layout resides in public/xen.h, this + * is just a way to represent the layout described there using C types. + * + * NB: the packed attribute is not really needed, but it helps us enforce + * the fact this this is just a representation, and it might indeed + * be required in the future if there are alignment changes. + */ +struct hvm_start_info { + uint32_t magic; /* Contains the magic value 0x336ec578 */ + /* ("xEn3" with the 0x80 bit of the "E" set).*/ + uint32_t version; /* Version of this structure. */ + uint32_t flags; /* SIF_xxx flags. */ + uint32_t cmdline_paddr; /* Physical address of the command line. */ + uint32_t nr_modules; /* Number of modules passed to the kernel. */ + uint32_t modlist_paddr; /* Physical address of an array of */ + /* hvm_modlist_entry. */ + uint32_t rsdp_paddr; /* Physical address of the RSDP ACPI data */ + /* structure. */ +} __attribute__((packed)); + +struct hvm_modlist_entry { + uint64_t paddr; /* Physical address of the module. */ + uint64_t size; /* Size of the module in bytes. */ + uint64_t cmdline_paddr; /* Physical address of the command line. */ + uint64_t reserved; +} __attribute__((packed)); +#endif /* x86 */ + /* --- pluggable kernel loader ------------------------------------- */ struct xc_dom_loader { diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h index 7b629b1..6ba060f 100644 --- a/xen/include/public/xen.h +++ b/xen/include/public/xen.h @@ -787,25 +787,46 @@ typedef struct start_info start_info_t; /* * Start of day structure passed to PVH guests in %ebx. * - * NOTE: nothing will be loaded at physical address 0, so - * a 0 value in any of the address fields should be treated - * as not present. + * NOTE: nothing will be loaded at physical address 0, so a 0 value in any + * of the address fields should be treated as not present. + * + * 0 +----------------+ + * | magic | Contains the magic value HVM_START_MAGIC_VALUE + * | | ("xEn3" with the 0x80 bit of the "E" set). + * 4 +----------------+ + * | version | Version of this structure. Current version is 0. New + * | | versions are guaranteed to be backwards-compatible. + * 8 +----------------+ + * | flags | SIF_xxx flags. + * 12 +----------------+ + * | cmdline_paddr | Physical address of the command line, + * | | a zero-terminated ASCII string. + * 16 +----------------+ + * | nr_modules | Number of modules passed to the kernel. + * 20 +----------------+ + * | modlist_paddr | Physical address of an array of modules + * | | (layout of the structure below). + * 24 +----------------+ + * | rsdp_paddr | Physical address of the RSDP ACPI data structure. + * 28 +----------------+ + * + * The layout of each entry in the module structure is the following: + * + * 0 +----------------+ + * | paddr | Physical address of the module. + * 8 +----------------+ + * | size | Size of the module in bytes. + * 16 +----------------+ + * | cmdline_paddr | Physical address of the command line, + * | | a zero-terminated ASCII string. + * 24 +----------------+ + * | reserved | + * 32 +----------------+ + * + * The address and size of the modules is a 64bit unsigned integer. However + * Xen will always try to place all modules below the 4GiB boundary. */ -struct hvm_start_info { #define HVM_START_MAGIC_VALUE 0x336ec578 - uint32_t magic; /* Contains the magic value 0x336ec578 */ - /* ("xEn3" with the 0x80 bit of the "E" set).*/ - uint32_t flags; /* SIF_xxx flags. */ - uint32_t cmdline_paddr; /* Physical address of the command line. */ - uint32_t nr_modules; /* Number of modules passed to the kernel. */ - uint32_t modlist_paddr; /* Physical address of an array of */ - /* hvm_modlist_entry. */ -}; - -struct hvm_modlist_entry { - uint32_t paddr; /* Physical address of the module. */ - uint32_t size; /* Size of the module in bytes. */ -}; /* New console union for dom0 introduced in 0x00030203. */ #if __XEN_INTERFACE_VERSION__ < 0x00030203 -- 2.5.4 (Apple Git-61) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply related [flat|nested] 39+ messages in thread
* Re: [PATCH v4 1/4] x86/HVM: update the start info structure layout 2016-02-16 17:37 ` [PATCH v4 1/4] x86/HVM: update the start info structure layout Roger Pau Monne @ 2016-02-16 19:13 ` Andrew Cooper 2016-02-16 20:06 ` Konrad Rzeszutek Wilk ` (3 subsequent siblings) 4 siblings, 0 replies; 39+ messages in thread From: Andrew Cooper @ 2016-02-16 19:13 UTC (permalink / raw) To: Roger Pau Monne, xen-devel Cc: Samuel Thibault, Ian Jackson, Ian Campbell, Jan Beulich, Wei Liu On 16/02/16 17:37, Roger Pau Monne wrote: > After some discussion around the new boot ABI consensus has been reached > about the layout and contents of the start info. The following patch updates > the layout to what has been agreed. > > Also, the new layout is described in binary terms in order to avoid issues > with alignments when using C structs. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 1/4] x86/HVM: update the start info structure layout 2016-02-16 17:37 ` [PATCH v4 1/4] x86/HVM: update the start info structure layout Roger Pau Monne 2016-02-16 19:13 ` Andrew Cooper @ 2016-02-16 20:06 ` Konrad Rzeszutek Wilk 2016-02-17 10:01 ` Roger Pau Monné 2016-02-16 21:26 ` Boris Ostrovsky ` (2 subsequent siblings) 4 siblings, 1 reply; 39+ messages in thread From: Konrad Rzeszutek Wilk @ 2016-02-16 20:06 UTC (permalink / raw) To: Roger Pau Monne, boris.ostrovsky Cc: Wei Liu, Ian Campbell, Andrew Cooper, Ian Jackson, Jan Beulich, Samuel Thibault, xen-devel On Tue, Feb 16, 2016 at 06:37:46PM +0100, Roger Pau Monne wrote: > After some discussion around the new boot ABI consensus has been reached > about the layout and contents of the start info. The following patch updates > the layout to what has been agreed. .. It would be good to have an URL to the discussion? > > Also, the new layout is described in binary terms in order to avoid issues > with alignments when using C structs. Should the title say more about HVMLite/PVH instead of just 'HVM'? > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > --- > Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> > Cc: Ian Jackson <ian.jackson@eu.citrix.com> > Cc: Ian Campbell <ian.campbell@citrix.com> > Cc: Wei Liu <wei.liu2@citrix.com> > Cc: Jan Beulich <jbeulich@suse.com> > Cc: Andrew Cooper <andrew.cooper3@citrix.com> You forgot Boris who is working on the Linux side. CC-ing him. Actually please CC him on _ALL_ PVH related work. > --- > tools/libxc/include/xc_dom.h | 31 +++++++++++++++++++++++++ > xen/include/public/xen.h | 55 ++++++++++++++++++++++++++++++-------------- > 2 files changed, 69 insertions(+), 17 deletions(-) > > diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h > index cac4698..6ebe946 100644 > --- a/tools/libxc/include/xc_dom.h > +++ b/tools/libxc/include/xc_dom.h > @@ -216,6 +216,37 @@ struct xc_dom_image { > struct xc_hvm_firmware_module smbios_module; > }; > > +#if defined(__i386__) || defined(__x86_64__) > +/* C representation of the x86/HVM start info layout. > + * > + * The canonical definition of this layout resides in public/xen.h, this > + * is just a way to represent the layout described there using C types. > + * > + * NB: the packed attribute is not really needed, but it helps us enforce > + * the fact this this is just a representation, and it might indeed > + * be required in the future if there are alignment changes. > + */ > +struct hvm_start_info { > + uint32_t magic; /* Contains the magic value 0x336ec578 */ > + /* ("xEn3" with the 0x80 bit of the "E" set).*/ > + uint32_t version; /* Version of this structure. */ > + uint32_t flags; /* SIF_xxx flags. */ > + uint32_t cmdline_paddr; /* Physical address of the command line. */ > + uint32_t nr_modules; /* Number of modules passed to the kernel. */ > + uint32_t modlist_paddr; /* Physical address of an array of */ > + /* hvm_modlist_entry. */ > + uint32_t rsdp_paddr; /* Physical address of the RSDP ACPI data */ > + /* structure. */ > +} __attribute__((packed)); > + > +struct hvm_modlist_entry { > + uint64_t paddr; /* Physical address of the module. */ > + uint64_t size; /* Size of the module in bytes. */ > + uint64_t cmdline_paddr; /* Physical address of the command line. */ > + uint64_t reserved; > +} __attribute__((packed)); > +#endif /* x86 */ > + > /* --- pluggable kernel loader ------------------------------------- */ > > struct xc_dom_loader { > diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h > index 7b629b1..6ba060f 100644 > --- a/xen/include/public/xen.h > +++ b/xen/include/public/xen.h > @@ -787,25 +787,46 @@ typedef struct start_info start_info_t; > /* > * Start of day structure passed to PVH guests in %ebx. > * > - * NOTE: nothing will be loaded at physical address 0, so > - * a 0 value in any of the address fields should be treated > - * as not present. > + * NOTE: nothing will be loaded at physical address 0, so a 0 value in any > + * of the address fields should be treated as not present. > + * > + * 0 +----------------+ > + * | magic | Contains the magic value HVM_START_MAGIC_VALUE > + * | | ("xEn3" with the 0x80 bit of the "E" set). > + * 4 +----------------+ > + * | version | Version of this structure. Current version is 0. New > + * | | versions are guaranteed to be backwards-compatible. > + * 8 +----------------+ > + * | flags | SIF_xxx flags. > + * 12 +----------------+ > + * | cmdline_paddr | Physical address of the command line, > + * | | a zero-terminated ASCII string. > + * 16 +----------------+ > + * | nr_modules | Number of modules passed to the kernel. > + * 20 +----------------+ > + * | modlist_paddr | Physical address of an array of modules > + * | | (layout of the structure below). > + * 24 +----------------+ > + * | rsdp_paddr | Physical address of the RSDP ACPI data structure. > + * 28 +----------------+ > + * > + * The layout of each entry in the module structure is the following: > + * > + * 0 +----------------+ > + * | paddr | Physical address of the module. > + * 8 +----------------+ > + * | size | Size of the module in bytes. > + * 16 +----------------+ > + * | cmdline_paddr | Physical address of the command line, > + * | | a zero-terminated ASCII string. > + * 24 +----------------+ > + * | reserved | > + * 32 +----------------+ > + * > + * The address and size of the modules is a 64bit unsigned integer. However > + * Xen will always try to place all modules below the 4GiB boundary. > */ > -struct hvm_start_info { > #define HVM_START_MAGIC_VALUE 0x336ec578 > - uint32_t magic; /* Contains the magic value 0x336ec578 */ > - /* ("xEn3" with the 0x80 bit of the "E" set).*/ > - uint32_t flags; /* SIF_xxx flags. */ > - uint32_t cmdline_paddr; /* Physical address of the command line. */ > - uint32_t nr_modules; /* Number of modules passed to the kernel. */ > - uint32_t modlist_paddr; /* Physical address of an array of */ > - /* hvm_modlist_entry. */ > -}; > - > -struct hvm_modlist_entry { > - uint32_t paddr; /* Physical address of the module. */ > - uint32_t size; /* Size of the module in bytes. */ > -}; > > /* New console union for dom0 introduced in 0x00030203. */ > #if __XEN_INTERFACE_VERSION__ < 0x00030203 > -- > 2.5.4 (Apple Git-61) > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 1/4] x86/HVM: update the start info structure layout 2016-02-16 20:06 ` Konrad Rzeszutek Wilk @ 2016-02-17 10:01 ` Roger Pau Monné 0 siblings, 0 replies; 39+ messages in thread From: Roger Pau Monné @ 2016-02-17 10:01 UTC (permalink / raw) To: Konrad Rzeszutek Wilk, boris.ostrovsky Cc: Wei Liu, Ian Campbell, Andrew Cooper, Ian Jackson, Jan Beulich, Samuel Thibault, xen-devel El 16/2/16 a les 21:06, Konrad Rzeszutek Wilk ha escrit: > On Tue, Feb 16, 2016 at 06:37:46PM +0100, Roger Pau Monne wrote: >> After some discussion around the new boot ABI consensus has been reached >> about the layout and contents of the start info. The following patch updates >> the layout to what has been agreed. > > .. It would be good to have an URL to the discussion? >> >> Also, the new layout is described in binary terms in order to avoid issues >> with alignments when using C structs. > > Should the title say more about HVMLite/PVH instead of just 'HVM'? I guess x86/PVHv2 would be a better tag. >> >> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> >> --- >> Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> >> Cc: Ian Jackson <ian.jackson@eu.citrix.com> >> Cc: Ian Campbell <ian.campbell@citrix.com> >> Cc: Wei Liu <wei.liu2@citrix.com> >> Cc: Jan Beulich <jbeulich@suse.com> >> Cc: Andrew Cooper <andrew.cooper3@citrix.com> > > You forgot Boris who is working on the Linux side. CC-ing him. > > Actually please CC him on _ALL_ PVH related work. Sure, I should have CC'ed him, David and yourself on this one, sorry. Roger. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 1/4] x86/HVM: update the start info structure layout 2016-02-16 17:37 ` [PATCH v4 1/4] x86/HVM: update the start info structure layout Roger Pau Monne 2016-02-16 19:13 ` Andrew Cooper 2016-02-16 20:06 ` Konrad Rzeszutek Wilk @ 2016-02-16 21:26 ` Boris Ostrovsky 2016-02-17 9:58 ` Jan Beulich 2016-02-17 10:45 ` Samuel Thibault 2016-02-17 13:00 ` Jan Beulich 4 siblings, 1 reply; 39+ messages in thread From: Boris Ostrovsky @ 2016-02-16 21:26 UTC (permalink / raw) To: Roger Pau Monne, xen-devel Cc: Wei Liu, Ian Campbell, Andrew Cooper, Ian Jackson, Jan Beulich, Samuel Thibault On 02/16/2016 12:37 PM, Roger Pau Monne wrote: > diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h > index 7b629b1..6ba060f 100644 > --- a/xen/include/public/xen.h > +++ b/xen/include/public/xen.h > @@ -787,25 +787,46 @@ typedef struct start_info start_info_t; > /* > * Start of day structure passed to PVH guests in %ebx. > * > - * NOTE: nothing will be loaded at physical address 0, so > - * a 0 value in any of the address fields should be treated > - * as not present. > + * NOTE: nothing will be loaded at physical address 0, so a 0 value in any > + * of the address fields should be treated as not present. > + * > + * 0 +----------------+ > + * | magic | Contains the magic value HVM_START_MAGIC_VALUE > + * | | ("xEn3" with the 0x80 bit of the "E" set). > + * 4 +----------------+ > + * | version | Version of this structure. Current version is 0. New > + * | | versions are guaranteed to be backwards-compatible. #define XEN_HVM_START_INFO_VERSION 0 is needed then? ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 1/4] x86/HVM: update the start info structure layout 2016-02-16 21:26 ` Boris Ostrovsky @ 2016-02-17 9:58 ` Jan Beulich 2016-02-17 10:05 ` Roger Pau Monné 0 siblings, 1 reply; 39+ messages in thread From: Jan Beulich @ 2016-02-17 9:58 UTC (permalink / raw) To: Boris Ostrovsky Cc: Wei Liu, Ian Campbell, Andrew Cooper, Ian Jackson, xen-devel, Samuel Thibault, Roger Pau Monne >>> On 16.02.16 at 22:26, <boris.ostrovsky@oracle.com> wrote: > On 02/16/2016 12:37 PM, Roger Pau Monne wrote: >> diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h >> index 7b629b1..6ba060f 100644 >> --- a/xen/include/public/xen.h >> +++ b/xen/include/public/xen.h >> @@ -787,25 +787,46 @@ typedef struct start_info start_info_t; >> /* >> * Start of day structure passed to PVH guests in %ebx. >> * >> - * NOTE: nothing will be loaded at physical address 0, so >> - * a 0 value in any of the address fields should be treated >> - * as not present. >> + * NOTE: nothing will be loaded at physical address 0, so a 0 value in any >> + * of the address fields should be treated as not present. >> + * >> + * 0 +----------------+ >> + * | magic | Contains the magic value HVM_START_MAGIC_VALUE >> + * | | ("xEn3" with the 0x80 bit of the "E" set). >> + * 4 +----------------+ >> + * | version | Version of this structure. Current version is 0. New >> + * | | versions are guaranteed to be backwards-compatible. > > #define XEN_HVM_START_INFO_VERSION 0 What would that buy us? Once it gets bumped to 1, consumers would need to check against literal zero anyway. Jan ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 1/4] x86/HVM: update the start info structure layout 2016-02-17 9:58 ` Jan Beulich @ 2016-02-17 10:05 ` Roger Pau Monné 2016-02-17 14:39 ` Boris Ostrovsky 0 siblings, 1 reply; 39+ messages in thread From: Roger Pau Monné @ 2016-02-17 10:05 UTC (permalink / raw) To: Jan Beulich, Boris Ostrovsky Cc: Wei Liu, Ian Campbell, Andrew Cooper, Ian Jackson, Samuel Thibault, xen-devel El 17/2/16 a les 10:58, Jan Beulich ha escrit: >>>> On 16.02.16 at 22:26, <boris.ostrovsky@oracle.com> wrote: >> On 02/16/2016 12:37 PM, Roger Pau Monne wrote: >>> diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h >>> index 7b629b1..6ba060f 100644 >>> --- a/xen/include/public/xen.h >>> +++ b/xen/include/public/xen.h >>> @@ -787,25 +787,46 @@ typedef struct start_info start_info_t; >>> /* >>> * Start of day structure passed to PVH guests in %ebx. >>> * >>> - * NOTE: nothing will be loaded at physical address 0, so >>> - * a 0 value in any of the address fields should be treated >>> - * as not present. >>> + * NOTE: nothing will be loaded at physical address 0, so a 0 value in any >>> + * of the address fields should be treated as not present. >>> + * >>> + * 0 +----------------+ >>> + * | magic | Contains the magic value HVM_START_MAGIC_VALUE >>> + * | | ("xEn3" with the 0x80 bit of the "E" set). >>> + * 4 +----------------+ >>> + * | version | Version of this structure. Current version is 0. New >>> + * | | versions are guaranteed to be backwards-compatible. >> >> #define XEN_HVM_START_INFO_VERSION 0 > > What would that buy us? Once it gets bumped to 1, consumers > would need to check against literal zero anyway. I agree with Jan that this doesn't seem very useful, the headers don't have to be in sync with the underlying hypervisor, so it's better to just use literals instead of defines. Roger. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 1/4] x86/HVM: update the start info structure layout 2016-02-17 10:05 ` Roger Pau Monné @ 2016-02-17 14:39 ` Boris Ostrovsky 2016-02-17 14:54 ` Jan Beulich 0 siblings, 1 reply; 39+ messages in thread From: Boris Ostrovsky @ 2016-02-17 14:39 UTC (permalink / raw) To: Roger Pau Monné, Jan Beulich Cc: Wei Liu, Ian Campbell, Andrew Cooper, Ian Jackson, Samuel Thibault, xen-devel On 02/17/2016 05:05 AM, Roger Pau Monné wrote: > El 17/2/16 a les 10:58, Jan Beulich ha escrit: >>>>> On 16.02.16 at 22:26, <boris.ostrovsky@oracle.com> wrote: >>> On 02/16/2016 12:37 PM, Roger Pau Monne wrote: >>>> diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h >>>> index 7b629b1..6ba060f 100644 >>>> --- a/xen/include/public/xen.h >>>> +++ b/xen/include/public/xen.h >>>> @@ -787,25 +787,46 @@ typedef struct start_info start_info_t; >>>> /* >>>> * Start of day structure passed to PVH guests in %ebx. >>>> * >>>> - * NOTE: nothing will be loaded at physical address 0, so >>>> - * a 0 value in any of the address fields should be treated >>>> - * as not present. >>>> + * NOTE: nothing will be loaded at physical address 0, so a 0 value in any >>>> + * of the address fields should be treated as not present. >>>> + * >>>> + * 0 +----------------+ >>>> + * | magic | Contains the magic value HVM_START_MAGIC_VALUE >>>> + * | | ("xEn3" with the 0x80 bit of the "E" set). >>>> + * 4 +----------------+ >>>> + * | version | Version of this structure. Current version is 0. New >>>> + * | | versions are guaranteed to be backwards-compatible. >>> #define XEN_HVM_START_INFO_VERSION 0 >> What would that buy us? Once it gets bumped to 1, consumers >> would need to check against literal zero anyway. Consumers would need to check against what their header file's version is, not necessarily zero. And they, for example, may decide not to run if the version provided by the structure is smaller than what they support. -boris > I agree with Jan that this doesn't seem very useful, the headers don't > have to be in sync with the underlying hypervisor, so it's better to > just use literals instead of defines. > > Roger. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 1/4] x86/HVM: update the start info structure layout 2016-02-17 14:39 ` Boris Ostrovsky @ 2016-02-17 14:54 ` Jan Beulich 0 siblings, 0 replies; 39+ messages in thread From: Jan Beulich @ 2016-02-17 14:54 UTC (permalink / raw) To: Boris Ostrovsky Cc: Wei Liu, Ian Campbell, Andrew Cooper, Ian Jackson, xen-devel, Samuel Thibault, roger.pau >>> On 17.02.16 at 15:39, <boris.ostrovsky@oracle.com> wrote: > On 02/17/2016 05:05 AM, Roger Pau Monné wrote: >> El 17/2/16 a les 10:58, Jan Beulich ha escrit: >>>>>> On 16.02.16 at 22:26, <boris.ostrovsky@oracle.com> wrote: >>>> On 02/16/2016 12:37 PM, Roger Pau Monne wrote: >>>>> diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h >>>>> index 7b629b1..6ba060f 100644 >>>>> --- a/xen/include/public/xen.h >>>>> +++ b/xen/include/public/xen.h >>>>> @@ -787,25 +787,46 @@ typedef struct start_info start_info_t; >>>>> /* >>>>> * Start of day structure passed to PVH guests in %ebx. >>>>> * >>>>> - * NOTE: nothing will be loaded at physical address 0, so >>>>> - * a 0 value in any of the address fields should be treated >>>>> - * as not present. >>>>> + * NOTE: nothing will be loaded at physical address 0, so a 0 value in any >>>>> + * of the address fields should be treated as not present. >>>>> + * >>>>> + * 0 +----------------+ >>>>> + * | magic | Contains the magic value HVM_START_MAGIC_VALUE >>>>> + * | | ("xEn3" with the 0x80 bit of the "E" set). >>>>> + * 4 +----------------+ >>>>> + * | version | Version of this structure. Current version is 0. > New >>>>> + * | | versions are guaranteed to be backwards-compatible. >>>> #define XEN_HVM_START_INFO_VERSION 0 >>> What would that buy us? Once it gets bumped to 1, consumers >>> would need to check against literal zero anyway. > > Consumers would need to check against what their header file's version > is, not necessarily zero. Only if they aren't capable to deal with more than one version. Plus - an update to the header would then go unnoticed, breaking the code. > And they, for example, may decide not to run > if the version provided by the structure is smaller than what they support. Achievable by doing checks against literal numbers. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 1/4] x86/HVM: update the start info structure layout 2016-02-16 17:37 ` [PATCH v4 1/4] x86/HVM: update the start info structure layout Roger Pau Monne ` (2 preceding siblings ...) 2016-02-16 21:26 ` Boris Ostrovsky @ 2016-02-17 10:45 ` Samuel Thibault 2016-02-17 13:00 ` Jan Beulich 4 siblings, 0 replies; 39+ messages in thread From: Samuel Thibault @ 2016-02-17 10:45 UTC (permalink / raw) To: Roger Pau Monne Cc: Wei Liu, Ian Campbell, Andrew Cooper, Ian Jackson, Jan Beulich, xen-devel Roger Pau Monne, on Tue 16 Feb 2016 18:37:46 +0100, wrote: > After some discussion around the new boot ABI consensus has been reached > about the layout and contents of the start info. The following patch updates > the layout to what has been agreed. > > Also, the new layout is described in binary terms in order to avoid issues > with alignments when using C structs. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org> > --- > Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> > Cc: Ian Jackson <ian.jackson@eu.citrix.com> > Cc: Ian Campbell <ian.campbell@citrix.com> > Cc: Wei Liu <wei.liu2@citrix.com> > Cc: Jan Beulich <jbeulich@suse.com> > Cc: Andrew Cooper <andrew.cooper3@citrix.com> > --- > tools/libxc/include/xc_dom.h | 31 +++++++++++++++++++++++++ > xen/include/public/xen.h | 55 ++++++++++++++++++++++++++++++-------------- > 2 files changed, 69 insertions(+), 17 deletions(-) > > diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h > index cac4698..6ebe946 100644 > --- a/tools/libxc/include/xc_dom.h > +++ b/tools/libxc/include/xc_dom.h > @@ -216,6 +216,37 @@ struct xc_dom_image { > struct xc_hvm_firmware_module smbios_module; > }; > > +#if defined(__i386__) || defined(__x86_64__) > +/* C representation of the x86/HVM start info layout. > + * > + * The canonical definition of this layout resides in public/xen.h, this > + * is just a way to represent the layout described there using C types. > + * > + * NB: the packed attribute is not really needed, but it helps us enforce > + * the fact this this is just a representation, and it might indeed > + * be required in the future if there are alignment changes. > + */ > +struct hvm_start_info { > + uint32_t magic; /* Contains the magic value 0x336ec578 */ > + /* ("xEn3" with the 0x80 bit of the "E" set).*/ > + uint32_t version; /* Version of this structure. */ > + uint32_t flags; /* SIF_xxx flags. */ > + uint32_t cmdline_paddr; /* Physical address of the command line. */ > + uint32_t nr_modules; /* Number of modules passed to the kernel. */ > + uint32_t modlist_paddr; /* Physical address of an array of */ > + /* hvm_modlist_entry. */ > + uint32_t rsdp_paddr; /* Physical address of the RSDP ACPI data */ > + /* structure. */ > +} __attribute__((packed)); > + > +struct hvm_modlist_entry { > + uint64_t paddr; /* Physical address of the module. */ > + uint64_t size; /* Size of the module in bytes. */ > + uint64_t cmdline_paddr; /* Physical address of the command line. */ > + uint64_t reserved; > +} __attribute__((packed)); > +#endif /* x86 */ > + > /* --- pluggable kernel loader ------------------------------------- */ > > struct xc_dom_loader { > diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h > index 7b629b1..6ba060f 100644 > --- a/xen/include/public/xen.h > +++ b/xen/include/public/xen.h > @@ -787,25 +787,46 @@ typedef struct start_info start_info_t; > /* > * Start of day structure passed to PVH guests in %ebx. > * > - * NOTE: nothing will be loaded at physical address 0, so > - * a 0 value in any of the address fields should be treated > - * as not present. > + * NOTE: nothing will be loaded at physical address 0, so a 0 value in any > + * of the address fields should be treated as not present. > + * > + * 0 +----------------+ > + * | magic | Contains the magic value HVM_START_MAGIC_VALUE > + * | | ("xEn3" with the 0x80 bit of the "E" set). > + * 4 +----------------+ > + * | version | Version of this structure. Current version is 0. New > + * | | versions are guaranteed to be backwards-compatible. > + * 8 +----------------+ > + * | flags | SIF_xxx flags. > + * 12 +----------------+ > + * | cmdline_paddr | Physical address of the command line, > + * | | a zero-terminated ASCII string. > + * 16 +----------------+ > + * | nr_modules | Number of modules passed to the kernel. > + * 20 +----------------+ > + * | modlist_paddr | Physical address of an array of modules > + * | | (layout of the structure below). > + * 24 +----------------+ > + * | rsdp_paddr | Physical address of the RSDP ACPI data structure. > + * 28 +----------------+ > + * > + * The layout of each entry in the module structure is the following: > + * > + * 0 +----------------+ > + * | paddr | Physical address of the module. > + * 8 +----------------+ > + * | size | Size of the module in bytes. > + * 16 +----------------+ > + * | cmdline_paddr | Physical address of the command line, > + * | | a zero-terminated ASCII string. > + * 24 +----------------+ > + * | reserved | > + * 32 +----------------+ > + * > + * The address and size of the modules is a 64bit unsigned integer. However > + * Xen will always try to place all modules below the 4GiB boundary. > */ > -struct hvm_start_info { > #define HVM_START_MAGIC_VALUE 0x336ec578 > - uint32_t magic; /* Contains the magic value 0x336ec578 */ > - /* ("xEn3" with the 0x80 bit of the "E" set).*/ > - uint32_t flags; /* SIF_xxx flags. */ > - uint32_t cmdline_paddr; /* Physical address of the command line. */ > - uint32_t nr_modules; /* Number of modules passed to the kernel. */ > - uint32_t modlist_paddr; /* Physical address of an array of */ > - /* hvm_modlist_entry. */ > -}; > - > -struct hvm_modlist_entry { > - uint32_t paddr; /* Physical address of the module. */ > - uint32_t size; /* Size of the module in bytes. */ > -}; > > /* New console union for dom0 introduced in 0x00030203. */ > #if __XEN_INTERFACE_VERSION__ < 0x00030203 > -- > 2.5.4 (Apple Git-61) > -- Samuel <N> un driver qui fait quoi, alors ? <y> ben pour les bips <s> pour passer les oops en morse -+- #ens-mim - vive les rapports de bug -+- ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 1/4] x86/HVM: update the start info structure layout 2016-02-16 17:37 ` [PATCH v4 1/4] x86/HVM: update the start info structure layout Roger Pau Monne ` (3 preceding siblings ...) 2016-02-17 10:45 ` Samuel Thibault @ 2016-02-17 13:00 ` Jan Beulich 4 siblings, 0 replies; 39+ messages in thread From: Jan Beulich @ 2016-02-17 13:00 UTC (permalink / raw) To: Roger Pau Monne Cc: Wei Liu, Ian Campbell, Andrew Cooper, Ian Jackson, Samuel Thibault, xen-devel >>> On 16.02.16 at 18:37, <roger.pau@citrix.com> wrote: > --- a/xen/include/public/xen.h > +++ b/xen/include/public/xen.h > @@ -787,25 +787,46 @@ typedef struct start_info start_info_t; > /* > * Start of day structure passed to PVH guests in %ebx. > * > - * NOTE: nothing will be loaded at physical address 0, so > - * a 0 value in any of the address fields should be treated > - * as not present. > + * NOTE: nothing will be loaded at physical address 0, so a 0 value in any > + * of the address fields should be treated as not present. > + * > + * 0 +----------------+ > + * | magic | Contains the magic value HVM_START_MAGIC_VALUE > + * | | ("xEn3" with the 0x80 bit of the "E" set). > + * 4 +----------------+ > + * | version | Version of this structure. Current version is 0. New > + * | | versions are guaranteed to be backwards-compatible. > + * 8 +----------------+ > + * | flags | SIF_xxx flags. > + * 12 +----------------+ > + * | cmdline_paddr | Physical address of the command line, > + * | | a zero-terminated ASCII string. > + * 16 +----------------+ > + * | nr_modules | Number of modules passed to the kernel. > + * 20 +----------------+ > + * | modlist_paddr | Physical address of an array of modules > + * | | (layout of the structure below). > + * 24 +----------------+ > + * | rsdp_paddr | Physical address of the RSDP ACPI data structure. > + * 28 +----------------+ > + * > + * The layout of each entry in the module structure is the following: > + * > + * 0 +----------------+ > + * | paddr | Physical address of the module. > + * 8 +----------------+ > + * | size | Size of the module in bytes. > + * 16 +----------------+ > + * | cmdline_paddr | Physical address of the command line, > + * | | a zero-terminated ASCII string. > + * 24 +----------------+ > + * | reserved | > + * 32 +----------------+ > + * > + * The address and size of the modules is a 64bit unsigned integer. However > + * Xen will always try to place all modules below the 4GiB boundary. > */ > -struct hvm_start_info { > #define HVM_START_MAGIC_VALUE 0x336ec578 > - uint32_t magic; /* Contains the magic value 0x336ec578 */ I would have wanted to take the opportunity and prefix this constant with XEN_, but I see that the tools already use it. May I please ask for a follow-up patch to correct this name space issue? Jan ^ permalink raw reply [flat|nested] 39+ messages in thread
* [PATCH v4 2/4] libxl: introduce LIBXL_VGA_INTERFACE_TYPE_UNKNOWN 2016-02-16 17:37 [PATCH v4 0/4] Assorted fixes and improvements Roger Pau Monne 2016-02-16 17:37 ` [PATCH v4 1/4] x86/HVM: update the start info structure layout Roger Pau Monne @ 2016-02-16 17:37 ` Roger Pau Monne 2016-02-24 12:08 ` Wei Liu 2016-02-16 17:37 ` [PATCH v4 3/4] libelf: rewrite symtab/strtab loading Roger Pau Monne 2016-02-16 17:37 ` [PATCH v4 4/4] libxl: fix cd-eject Roger Pau Monne 3 siblings, 1 reply; 39+ messages in thread From: Roger Pau Monne @ 2016-02-16 17:37 UTC (permalink / raw) To: xen-devel; +Cc: Wei Liu, Ian Jackson, Ian Campbell, Roger Pau Monne And use it as the default value for the VGA kind. This allows libxl to set it to the default value later on when the domain type is known. For HVM guests the default value is LIBXL_VGA_INTERFACE_TYPE_CIRRUS while for HVMlite the default value is LIBXL_VGA_INTERFACE_TYPE_NONE. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> --- Changes since v3: - s/UNDEF/UNKNOWN/. - Add a LIBXL_HAVE_VGA_INTERFACE_TYPE_UNKNOWN. --- tools/libxl/libxl.h | 10 ++++++++++ tools/libxl/libxl_create.c | 8 ++++++-- tools/libxl/libxl_dm.c | 6 ++++++ tools/libxl/libxl_types.idl | 3 ++- 4 files changed, 24 insertions(+), 3 deletions(-) diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index fa87f53..1d11e02 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -876,6 +876,16 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src); */ #define LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE 1 +/* + * LIBXL_HAVE_VGA_INTERFACE_TYPE_UNKNOWN + * + * In the case that LIBXL_HAVE_VGA_INTERFACE_TYPE_UNKNOWN is set the + * libxl_vga_interface_type enumeration type contains a + * LIBXL_VGA_INTERFACE_TYPE_UNKNOWN identifier. This is used to signal + * that a libxl_vga_interface_type type has not been initialized yet. + */ +#define LIBXL_HAVE_VGA_INTERFACE_TYPE_UNKNOWN 1 + typedef char **libxl_string_list; void libxl_string_list_dispose(libxl_string_list *sl); int libxl_string_list_length(const libxl_string_list *sl); diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index de5d27f..e1a20a9 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -222,8 +222,12 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc, if (b_info->u.hvm.mmio_hole_memkb == LIBXL_MEMKB_DEFAULT) b_info->u.hvm.mmio_hole_memkb = 0; - if (!b_info->u.hvm.vga.kind) - b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_CIRRUS; + if (b_info->u.hvm.vga.kind == LIBXL_VGA_INTERFACE_TYPE_UNKNOWN) { + if (b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE) + b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_NONE; + else + b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_CIRRUS; + } if (!b_info->u.hvm.hdtype) b_info->u.hvm.hdtype = LIBXL_HDTYPE_IDE; diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index a088d71..9aa0cc8 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -531,6 +531,9 @@ static int libxl__build_device_model_args_old(libxl__gc *gc, break; case LIBXL_VGA_INTERFACE_TYPE_QXL: break; + default: + LOG(ERROR, "Invalid emulated video card specified"); + return ERROR_INVAL; } if (b_info->u.hvm.boot) { @@ -970,6 +973,9 @@ static int libxl__build_device_model_args_new(libxl__gc *gc, GCSPRINTF("qxl-vga,vram_size_mb=%"PRIu64",ram_size_mb=%"PRIu64, (b_info->video_memkb/2/1024), (b_info->video_memkb/2/1024) ) ); break; + default: + LOG(ERROR, "Invalid emulated video card specified"); + return ERROR_INVAL; } if (b_info->u.hvm.boot) { diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 9ad7eba..e9e0da0 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -204,11 +204,12 @@ libxl_shutdown_reason = Enumeration("shutdown_reason", [ ], init_val = "LIBXL_SHUTDOWN_REASON_UNKNOWN") libxl_vga_interface_type = Enumeration("vga_interface_type", [ + (0, "UNKNOWN"), (1, "CIRRUS"), (2, "STD"), (3, "NONE"), (4, "QXL"), - ], init_val = "LIBXL_VGA_INTERFACE_TYPE_CIRRUS") + ], init_val = "LIBXL_VGA_INTERFACE_TYPE_UNKNOWN") libxl_vendor_device = Enumeration("vendor_device", [ (0, "NONE"), -- 2.5.4 (Apple Git-61) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply related [flat|nested] 39+ messages in thread
* Re: [PATCH v4 2/4] libxl: introduce LIBXL_VGA_INTERFACE_TYPE_UNKNOWN 2016-02-16 17:37 ` [PATCH v4 2/4] libxl: introduce LIBXL_VGA_INTERFACE_TYPE_UNKNOWN Roger Pau Monne @ 2016-02-24 12:08 ` Wei Liu 2016-03-01 16:06 ` Ian Jackson 0 siblings, 1 reply; 39+ messages in thread From: Wei Liu @ 2016-02-24 12:08 UTC (permalink / raw) To: Roger Pau Monne; +Cc: xen-devel, Ian Jackson, Ian Campbell, Wei Liu On Tue, Feb 16, 2016 at 06:37:47PM +0100, Roger Pau Monne wrote: > And use it as the default value for the VGA kind. This allows libxl to set > it to the default value later on when the domain type is known. For HVM > guests the default value is LIBXL_VGA_INTERFACE_TYPE_CIRRUS while for > HVMlite the default value is LIBXL_VGA_INTERFACE_TYPE_NONE. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 2/4] libxl: introduce LIBXL_VGA_INTERFACE_TYPE_UNKNOWN 2016-02-24 12:08 ` Wei Liu @ 2016-03-01 16:06 ` Ian Jackson 0 siblings, 0 replies; 39+ messages in thread From: Ian Jackson @ 2016-03-01 16:06 UTC (permalink / raw) To: Wei Liu; +Cc: xen-devel, Roger Pau Monne Wei Liu writes ("Re: [PATCH v4 2/4] libxl: introduce LIBXL_VGA_INTERFACE_TYPE_UNKNOWN"): > On Tue, Feb 16, 2016 at 06:37:47PM +0100, Roger Pau Monne wrote: > > And use it as the default value for the VGA kind. This allows libxl to set > > it to the default value later on when the domain type is known. For HVM > > guests the default value is LIBXL_VGA_INTERFACE_TYPE_CIRRUS while for > > HVMlite the default value is LIBXL_VGA_INTERFACE_TYPE_NONE. > > > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > > Acked-by: Wei Liu <wei.liu2@citrix.com> I tried to apply this but it didn't apply to staging. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* [PATCH v4 3/4] libelf: rewrite symtab/strtab loading 2016-02-16 17:37 [PATCH v4 0/4] Assorted fixes and improvements Roger Pau Monne 2016-02-16 17:37 ` [PATCH v4 1/4] x86/HVM: update the start info structure layout Roger Pau Monne 2016-02-16 17:37 ` [PATCH v4 2/4] libxl: introduce LIBXL_VGA_INTERFACE_TYPE_UNKNOWN Roger Pau Monne @ 2016-02-16 17:37 ` Roger Pau Monne 2016-02-26 13:15 ` Jan Beulich 2016-02-16 17:37 ` [PATCH v4 4/4] libxl: fix cd-eject Roger Pau Monne 3 siblings, 1 reply; 39+ messages in thread From: Roger Pau Monne @ 2016-02-16 17:37 UTC (permalink / raw) To: xen-devel; +Cc: Roger Pau Monne Current implementation of elf_load_bsdsyms is broken when loading inside of a HVM guest, because it assumes elf_memcpy_safe is able to write into guest memory space, which it is not. Take the oportunity to do some cleanup and properly document how elf_{parse/load}_bsdsyms works. The new implementation uses elf_load_image when dealing with data that needs to be copied to the guest memory space. Also reduce the number of section headers copied to the minimum necessary. This patch also removes the duplication of code found in the libxc ELF loader, since the libelf symtab/strtab loading code will also handle this case without having to duplicate it. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- tools/libxc/xc_dom_elfloader.c | 203 ---------------------------------- xen/common/libelf/libelf-loader.c | 223 ++++++++++++++++++++++++++++---------- 2 files changed, 163 insertions(+), 263 deletions(-) diff --git a/tools/libxc/xc_dom_elfloader.c b/tools/libxc/xc_dom_elfloader.c index 5039f3f..62d421a 100644 --- a/tools/libxc/xc_dom_elfloader.c +++ b/tools/libxc/xc_dom_elfloader.c @@ -133,204 +133,6 @@ static elf_negerrnoval xc_dom_probe_elf_kernel(struct xc_dom_image *dom) return 0; } -static elf_errorstatus xc_dom_load_elf_symtab(struct xc_dom_image *dom, - struct elf_binary *elf, bool load) -{ - struct elf_binary syms; - ELF_HANDLE_DECL(elf_shdr) shdr; ELF_HANDLE_DECL(elf_shdr) shdr2; - xen_vaddr_t symtab, maxaddr; - elf_ptrval hdr; - size_t size; - unsigned h, count, type, i, tables = 0; - unsigned long *strtab_referenced = NULL; - - if ( elf_swap(elf) ) - { - DOMPRINTF("%s: non-native byte order, bsd symtab not supported", - __FUNCTION__); - return 0; - } - - size = elf->bsd_symtab_pend - elf->bsd_symtab_pstart; - - if ( load ) - { - char *hdr_ptr; - size_t allow_size; - - if ( !dom->bsd_symtab_start ) - return 0; - hdr_ptr = xc_dom_vaddr_to_ptr(dom, dom->bsd_symtab_start, &allow_size); - if ( hdr_ptr == NULL ) - { - DOMPRINTF("%s: xc_dom_vaddr_to_ptr(dom,dom->bsd_symtab_start" - " => NULL", __FUNCTION__); - return -1; - } - elf->caller_xdest_base = hdr_ptr; - elf->caller_xdest_size = allow_size; - hdr = ELF_REALPTR2PTRVAL(hdr_ptr); - elf_store_val(elf, unsigned, hdr, size - sizeof(unsigned)); - } - else - { - char *hdr_ptr; - - hdr_ptr = xc_dom_malloc(dom, size); - if ( hdr_ptr == NULL ) - return 0; - elf->caller_xdest_base = hdr_ptr; - elf->caller_xdest_size = size; - hdr = ELF_REALPTR2PTRVAL(hdr_ptr); - dom->bsd_symtab_start = elf_round_up(elf, dom->kernel_seg.vend); - dom->kernel_seg.vend = elf_round_up(elf, dom->bsd_symtab_start + size); - return 0; - } - - elf_memcpy_safe(elf, hdr + sizeof(unsigned), - ELF_IMAGE_BASE(elf), - elf_size(elf, elf->ehdr)); - elf_memcpy_safe(elf, hdr + sizeof(unsigned) + elf_size(elf, elf->ehdr), - ELF_IMAGE_BASE(elf) + elf_uval(elf, elf->ehdr, e_shoff), - elf_shdr_count(elf) * elf_size(elf, shdr)); - if ( elf_64bit(elf) ) - { - Elf64_Ehdr *ehdr = (Elf64_Ehdr *)(hdr + sizeof(unsigned)); - ehdr->e_phoff = 0; - ehdr->e_phentsize = 0; - ehdr->e_phnum = 0; - ehdr->e_shoff = elf_size(elf, elf->ehdr); - ehdr->e_shstrndx = SHN_UNDEF; - } - else - { - Elf32_Ehdr *ehdr = (Elf32_Ehdr *)(hdr + sizeof(unsigned)); - ehdr->e_phoff = 0; - ehdr->e_phentsize = 0; - ehdr->e_phnum = 0; - ehdr->e_shoff = elf_size(elf, elf->ehdr); - ehdr->e_shstrndx = SHN_UNDEF; - } - if ( elf->caller_xdest_size < sizeof(unsigned) ) - { - DOMPRINTF("%s: header size %"PRIx64" too small", - __FUNCTION__, (uint64_t)elf->caller_xdest_size); - return -1; - } - if ( elf_init(&syms, elf->caller_xdest_base + sizeof(unsigned), - elf->caller_xdest_size - sizeof(unsigned)) ) - return -1; - - /* - * The caller_xdest_{base,size} and dest_{base,size} need to - * remain valid so long as each struct elf_image does. The - * principle we adopt is that these values are set when the - * memory is allocated or mapped, and cleared when (and if) - * they are unmapped. - * - * Mappings of the guest are normally undone by xc_dom_unmap_all - * (directly or via xc_dom_release). We do not explicitly clear - * these because in fact that happens only at the end of - * xc_dom_boot_image, at which time all of these ELF loading - * functions have returned. No relevant struct elf_binary* - * escapes this file. - */ - - xc_elf_set_logfile(dom->xch, &syms, 1); - - symtab = dom->bsd_symtab_start + sizeof(unsigned); - maxaddr = elf_round_up(&syms, symtab + elf_size(&syms, syms.ehdr) + - elf_shdr_count(&syms) * elf_size(&syms, shdr)); - - DOMPRINTF("%s: bsd_symtab_start=%" PRIx64 ", kernel.end=0x%" PRIx64 - " -- symtab=0x%" PRIx64 ", maxaddr=0x%" PRIx64 "", - __FUNCTION__, dom->bsd_symtab_start, dom->kernel_seg.vend, - symtab, maxaddr); - - count = elf_shdr_count(&syms); - /* elf_shdr_count guarantees that count is reasonable */ - - strtab_referenced = xc_dom_malloc(dom, bitmap_size(count)); - if ( strtab_referenced == NULL ) - return -1; - bitmap_clear(strtab_referenced, count); - /* Note the symtabs @h linked to by any strtab @i. */ - for ( i = 0; i < count; i++ ) - { - shdr2 = elf_shdr_by_index(&syms, i); - if ( elf_uval(&syms, shdr2, sh_type) == SHT_SYMTAB ) - { - h = elf_uval(&syms, shdr2, sh_link); - if (h < count) - set_bit(h, strtab_referenced); - } - } - - for ( h = 0; h < count; h++ ) - { - shdr = elf_shdr_by_index(&syms, h); - if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) - /* input has an insane section header count field */ - break; - type = elf_uval(&syms, shdr, sh_type); - if ( type == SHT_STRTAB ) - { - /* Skip symtab @h if we found no corresponding strtab @i. */ - if ( !test_bit(h, strtab_referenced) ) - { - if ( elf_64bit(&syms) ) - elf_store_field(elf, shdr, e64.sh_offset, 0); - else - elf_store_field(elf, shdr, e32.sh_offset, 0); - continue; - } - } - - if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) ) - { - /* Mangled to be based on ELF header location. */ - if ( elf_64bit(&syms) ) - elf_store_field(elf, shdr, e64.sh_offset, maxaddr - symtab); - else - elf_store_field(elf, shdr, e32.sh_offset, maxaddr - symtab); - size = elf_uval(&syms, shdr, sh_size); - maxaddr = elf_round_up(&syms, maxaddr + size); - tables++; - DOMPRINTF("%s: h=%u %s, size=0x%zx, maxaddr=0x%" PRIx64 "", - __FUNCTION__, h, - type == SHT_SYMTAB ? "symtab" : "strtab", - size, maxaddr); - - shdr2 = elf_shdr_by_index(elf, h); - elf_memcpy_safe(elf, elf_section_start(&syms, shdr), - elf_section_start(elf, shdr2), - size); - } - - /* Name is NULL. */ - if ( elf_64bit(&syms) ) - elf_store_field(elf, shdr, e64.sh_name, 0); - else - elf_store_field(elf, shdr, e32.sh_name, 0); - } - - if ( elf_check_broken(&syms) ) - DOMPRINTF("%s: symbols ELF broken: %s", __FUNCTION__, - elf_check_broken(&syms)); - if ( elf_check_broken(elf) ) - DOMPRINTF("%s: ELF broken: %s", __FUNCTION__, - elf_check_broken(elf)); - - if ( tables == 0 ) - { - DOMPRINTF("%s: no symbol table present", __FUNCTION__); - dom->bsd_symtab_start = 0; - return 0; - } - - return 0; -} - static elf_errorstatus xc_dom_parse_elf_kernel(struct xc_dom_image *dom) /* * This function sometimes returns -1 for error and sometimes @@ -385,9 +187,6 @@ static elf_errorstatus xc_dom_parse_elf_kernel(struct xc_dom_image *dom) dom->kernel_seg.vstart = dom->parms.virt_kstart; dom->kernel_seg.vend = dom->parms.virt_kend; - if ( dom->parms.bsd_symtab ) - xc_dom_load_elf_symtab(dom, elf, 0); - dom->guest_type = xc_dom_guest_type(dom, elf); DOMPRINTF("%s: %s: 0x%" PRIx64 " -> 0x%" PRIx64 "", __FUNCTION__, dom->guest_type, @@ -422,8 +221,6 @@ static elf_errorstatus xc_dom_load_elf_kernel(struct xc_dom_image *dom) DOMPRINTF("%s: failed to load elf binary", __FUNCTION__); return rc; } - if ( dom->parms.bsd_symtab ) - xc_dom_load_elf_symtab(dom, elf, 1); return 0; } diff --git a/xen/common/libelf/libelf-loader.c b/xen/common/libelf/libelf-loader.c index 6f42bea..5875795 100644 --- a/xen/common/libelf/libelf-loader.c +++ b/xen/common/libelf/libelf-loader.c @@ -153,7 +153,7 @@ void elf_parse_bsdsyms(struct elf_binary *elf, uint64_t pstart) { uint64_t sz; ELF_HANDLE_DECL(elf_shdr) shdr; - unsigned i, type; + unsigned int i; if ( !ELF_HANDLE_VALID(elf->sym_tab) ) return; @@ -164,20 +164,33 @@ void elf_parse_bsdsyms(struct elf_binary *elf, uint64_t pstart) sz = sizeof(uint32_t); /* Space for the elf and elf section headers */ - sz += (elf_uval(elf, elf->ehdr, e_ehsize) + - elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize)); + sz += elf_uval(elf, elf->ehdr, e_ehsize) + + 3 * elf_uval(elf, elf->ehdr, e_shentsize); sz = elf_round_up(elf, sz); - /* Space for the symbol and string tables. */ + /* Space for the symbol and string table. */ for ( i = 0; i < elf_shdr_count(elf); i++ ) { shdr = elf_shdr_by_index(elf, i); if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) /* input has an insane section header count field */ break; - type = elf_uval(elf, shdr, sh_type); - if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) ) - sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); + + if ( elf_uval(elf, shdr, sh_type) != SHT_SYMTAB ) + continue; + + sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); + shdr = elf_shdr_by_index(elf, elf_uval(elf, shdr, sh_link)); + + if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) + /* input has an insane section header count field */ + break; + + if ( elf_uval(elf, shdr, sh_type) != SHT_STRTAB ) + /* Invalid symtab -> strtab link */ + break; + + sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); } elf->bsd_symtab_pstart = pstart; @@ -186,79 +199,169 @@ void elf_parse_bsdsyms(struct elf_binary *elf, uint64_t pstart) static void elf_load_bsdsyms(struct elf_binary *elf) { - ELF_HANDLE_DECL(elf_ehdr) sym_ehdr; - unsigned long sz; - elf_ptrval maxva; - elf_ptrval symbase; - elf_ptrval symtab_addr; - ELF_HANDLE_DECL(elf_shdr) shdr; - unsigned i, type; + /* + * Header that is placed at the end of the kernel and allows + * the OS to find where the symtab and strtab have been loaded. + * It mimics a valid ELF file header, although it only contains + * a symtab and a strtab section. + * + * NB: according to the ELF spec there's only ONE symtab per ELF + * file, and accordingly we will only load the corresponding + * strtab, so we only need three section headers in our fake ELF + * header (first section header is always a dummy). + */ + struct { + uint32_t size; + struct { + elf_ehdr header; + elf_shdr section[3]; + } __attribute__((packed)) elf_header; + } __attribute__((packed)) header; + + ELF_HANDLE_DECL(elf_ehdr) header_handle; + unsigned long shdr_size; + ELF_HANDLE_DECL(elf_shdr) section_handle; + ELF_HANDLE_DECL(elf_shdr) image_handle; + unsigned int i, link; + elf_ptrval header_base; + elf_ptrval elf_header_base; + elf_ptrval symtab_base; + elf_ptrval strtab_base; if ( !elf->bsd_symtab_pstart ) return; -#define elf_hdr_elm(_elf, _hdr, _elm, _val) \ -do { \ - if ( elf_64bit(_elf) ) \ - elf_store_field(_elf, _hdr, e64._elm, _val); \ - else \ - elf_store_field(_elf, _hdr, e32._elm, _val); \ +#define elf_store_field_bitness(_elf, _hdr, _elm, _val) \ +do { \ + if ( elf_64bit(_elf) ) \ + elf_store_field(_elf, _hdr, e64._elm, _val); \ + else \ + elf_store_field(_elf, _hdr, e32._elm, _val); \ } while ( 0 ) - symbase = elf_get_ptr(elf, elf->bsd_symtab_pstart); - symtab_addr = maxva = symbase + sizeof(uint32_t); - - /* Set up Elf header. */ - sym_ehdr = ELF_MAKE_HANDLE(elf_ehdr, symtab_addr); - sz = elf_uval(elf, elf->ehdr, e_ehsize); - elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(sym_ehdr), ELF_HANDLE_PTRVAL(elf->ehdr), sz); - maxva += sz; /* no round up */ +#define SYMTAB_INDEX 1 +#define STRTAB_INDEX 2 - elf_hdr_elm(elf, sym_ehdr, e_phoff, 0); - elf_hdr_elm(elf, sym_ehdr, e_shoff, elf_uval(elf, elf->ehdr, e_ehsize)); - elf_hdr_elm(elf, sym_ehdr, e_phentsize, 0); - elf_hdr_elm(elf, sym_ehdr, e_phnum, 0); + /* Allow elf_memcpy_safe to write to symbol_header. */ + elf->caller_xdest_base = &header; + elf->caller_xdest_size = sizeof(header); - /* Copy Elf section headers. */ - shdr = ELF_MAKE_HANDLE(elf_shdr, maxva); - sz = elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize); - elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(shdr), - ELF_IMAGE_BASE(elf) + elf_uval(elf, elf->ehdr, e_shoff), - sz); - maxva = elf_round_up(elf, (unsigned long)maxva + sz); + /* + * Calculate the position of the various elements in GUEST MEMORY SPACE. + * This addresses MUST only be used with elf_load_image. + * + * NB: strtab_base cannot be calculated at this point because we don't + * know the size of the symtab yet, and the strtab will be placed after it. + */ + header_base = elf_get_ptr(elf, elf->bsd_symtab_pstart); + elf_header_base = elf_get_ptr(elf, elf->bsd_symtab_pstart) + + sizeof(uint32_t); + symtab_base = elf_round_up(elf, header_base + sizeof(header)); + + /* Fill the ELF header, copied from the original ELF header. */ + header_handle = ELF_MAKE_HANDLE(elf_ehdr, + ELF_REALPTR2PTRVAL(&header.elf_header.header)); + elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(header_handle), + ELF_HANDLE_PTRVAL(elf->ehdr), + elf_uval(elf, elf->ehdr, e_ehsize)); + + /* Set the offset to the shdr array. */ + elf_store_field_bitness(elf, header_handle, e_shoff, + offsetof(typeof(header.elf_header), section)); + + /* Set the right number of section headers. */ + elf_store_field_bitness(elf, header_handle, e_shnum, 3); + + /* Clear a couple of fields we don't use. */ + elf_store_field_bitness(elf, header_handle, e_phoff, 0); + elf_store_field_bitness(elf, header_handle, e_phentsize, 0); + elf_store_field_bitness(elf, header_handle, e_phnum, 0); + + /* Zero the dummy section. */ + section_handle = ELF_MAKE_HANDLE(elf_shdr, + ELF_REALPTR2PTRVAL(&header.elf_header.section[SHN_UNDEF])); + shdr_size = elf_uval(elf, elf->ehdr, e_shentsize); + elf_memset_safe(elf, ELF_HANDLE_PTRVAL(section_handle), 0, shdr_size); + /* + * Find the actual symtab and strtab in the ELF. + * + * The symtab section header is going to reside in section[SYMTAB_INDEX], + * while the corresponding strtab is going to be placed in + * section[STRTAB_INDEX]. sh_offset is mangled so it points to the offset + * where the sections are actually loaded (relative to the ELF header + * location). + */ + section_handle = ELF_MAKE_HANDLE(elf_shdr, + ELF_REALPTR2PTRVAL(&header.elf_header.section[SYMTAB_INDEX])); for ( i = 0; i < elf_shdr_count(elf); i++ ) { - elf_ptrval old_shdr_p; - elf_ptrval new_shdr_p; - type = elf_uval(elf, shdr, sh_type); - if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) ) + image_handle = elf_shdr_by_index(elf, i); + if ( elf_uval(elf, image_handle, sh_type) != SHT_SYMTAB ) + continue; + + elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(section_handle), + ELF_HANDLE_PTRVAL(image_handle), + shdr_size); + + link = elf_uval(elf, section_handle, sh_link); + if ( link == SHN_UNDEF ) { - elf_msg(elf, "%s: shdr %i at 0x%"ELF_PRPTRVAL" -> 0x%"ELF_PRPTRVAL"\n", __func__, i, - elf_section_start(elf, shdr), maxva); - sz = elf_uval(elf, shdr, sh_size); - elf_memcpy_safe(elf, maxva, elf_section_start(elf, shdr), sz); - /* Mangled to be based on ELF header location. */ - elf_hdr_elm(elf, shdr, sh_offset, maxva - symtab_addr); - maxva = elf_round_up(elf, (unsigned long)maxva + sz); + elf_mark_broken(elf, "bad link in symtab"); + break; } - old_shdr_p = ELF_HANDLE_PTRVAL(shdr); - new_shdr_p = old_shdr_p + elf_uval(elf, elf->ehdr, e_shentsize); - if ( new_shdr_p <= old_shdr_p ) /* wrapped or stuck */ + + /* Load symtab into guest memory. */ + elf_load_image(elf, symtab_base, elf_section_start(elf, section_handle), + elf_uval(elf, section_handle, sh_size), + elf_uval(elf, section_handle, sh_size)); + elf_store_field_bitness(elf, section_handle, sh_offset, + symtab_base - elf_header_base); + elf_store_field_bitness(elf, section_handle, sh_link, + STRTAB_INDEX); + + /* Calculate the guest address where strtab is loaded. */ + strtab_base = elf_round_up(elf, symtab_base + + elf_uval(elf, section_handle, sh_size)); + + /* Load strtab section header. */ + section_handle = ELF_MAKE_HANDLE(elf_shdr, + ELF_REALPTR2PTRVAL(&header.elf_header.section[STRTAB_INDEX])); + elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(section_handle), + ELF_HANDLE_PTRVAL(elf_shdr_by_index(elf, link)), + shdr_size); + + if ( elf_uval(elf, section_handle, sh_type) != SHT_STRTAB ) { - elf_mark_broken(elf, "bad section header length"); + elf_mark_broken(elf, "strtab not found"); break; } - if ( !elf_access_ok(elf, new_shdr_p, 1) ) /* outside image */ - break; - shdr = ELF_MAKE_HANDLE(elf_shdr, new_shdr_p); + + /* Load strtab into guest memory. */ + elf_load_image(elf, strtab_base, elf_section_start(elf, section_handle), + elf_uval(elf, section_handle, sh_size), + elf_uval(elf, section_handle, sh_size)); + elf_store_field_bitness(elf, section_handle, sh_offset, + strtab_base - elf_header_base); + + /* Store the whole size (including headers and loaded sections). */ + header.size = strtab_base + elf_uval(elf, section_handle, sh_size) - + elf_header_base; + break; } - /* Write down the actual sym size. */ - elf_store_val(elf, uint32_t, symbase, maxva - symtab_addr); + /* Load the headers. */ + elf_load_image(elf, header_base, ELF_REALPTR2PTRVAL(&header), + sizeof(header), sizeof(header)); + + /* Remove permissions from elf_memcpy_safe. */ + elf->caller_xdest_base = NULL; + elf->caller_xdest_size = 0; -#undef elf_ehdr_elm +#undef SYMTAB_INDEX +#undef STRTAB_INDEX +#undef elf_store_field_bitness } void elf_parse_binary(struct elf_binary *elf) -- 2.5.4 (Apple Git-61) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply related [flat|nested] 39+ messages in thread
* Re: [PATCH v4 3/4] libelf: rewrite symtab/strtab loading 2016-02-16 17:37 ` [PATCH v4 3/4] libelf: rewrite symtab/strtab loading Roger Pau Monne @ 2016-02-26 13:15 ` Jan Beulich 2016-02-26 17:02 ` Roger Pau Monné 0 siblings, 1 reply; 39+ messages in thread From: Jan Beulich @ 2016-02-26 13:15 UTC (permalink / raw) To: Roger Pau Monne; +Cc: xen-devel >>> On 16.02.16 at 18:37, <roger.pau@citrix.com> wrote: > --- a/xen/common/libelf/libelf-loader.c > +++ b/xen/common/libelf/libelf-loader.c > @@ -164,20 +164,33 @@ void elf_parse_bsdsyms(struct elf_binary *elf, uint64_t pstart) > sz = sizeof(uint32_t); > > /* Space for the elf and elf section headers */ > - sz += (elf_uval(elf, elf->ehdr, e_ehsize) + > - elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize)); > + sz += elf_uval(elf, elf->ehdr, e_ehsize) + > + 3 * elf_uval(elf, elf->ehdr, e_shentsize); I think a literal 3 like this either needs a #define (at once serving as documentation) or a comment. Perhaps rather the former, considering that the (same?) 3 appears again further down. Plus - what guarantees there are 3 section headers in the image? > sz = elf_round_up(elf, sz); > > - /* Space for the symbol and string tables. */ > + /* Space for the symbol and string table. */ > for ( i = 0; i < elf_shdr_count(elf); i++ ) > { > shdr = elf_shdr_by_index(elf, i); > if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) > /* input has an insane section header count field */ > break; > - type = elf_uval(elf, shdr, sh_type); > - if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) ) > - sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); > + > + if ( elf_uval(elf, shdr, sh_type) != SHT_SYMTAB ) > + continue; > + > + sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); > + shdr = elf_shdr_by_index(elf, elf_uval(elf, shdr, sh_link)); > + > + if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) > + /* input has an insane section header count field */ > + break; > + > + if ( elf_uval(elf, shdr, sh_type) != SHT_STRTAB ) > + /* Invalid symtab -> strtab link */ > + break; This is not sufficient - what if sh_link is out of bounds, but the bogusly accessed field happens to hold SHT_STRTAB? (Oddly enough you have at least an SHN_UNDEF check in the second loop below.) Also even if (I assume) elf_load_image() would refuse to read outside the bounds of the image, wouldn't you better check here that both symtab and strtab are within image bounds? The more that you ignore errors from elf_load_image() (and elf_mem{cpy,set}_safe() don't even return any)? > @@ -186,79 +199,169 @@ void elf_parse_bsdsyms(struct elf_binary *elf, > uint64_t pstart) > > static void elf_load_bsdsyms(struct elf_binary *elf) > { > - ELF_HANDLE_DECL(elf_ehdr) sym_ehdr; > - unsigned long sz; > - elf_ptrval maxva; > - elf_ptrval symbase; > - elf_ptrval symtab_addr; > - ELF_HANDLE_DECL(elf_shdr) shdr; > - unsigned i, type; > + /* > + * Header that is placed at the end of the kernel and allows > + * the OS to find where the symtab and strtab have been loaded. > + * It mimics a valid ELF file header, although it only contains > + * a symtab and a strtab section. > + * > + * NB: according to the ELF spec there's only ONE symtab per ELF > + * file, and accordingly we will only load the corresponding > + * strtab, so we only need three section headers in our fake ELF > + * header (first section header is always a dummy). > + */ > + struct { > + uint32_t size; > + struct { > + elf_ehdr header; > + elf_shdr section[3]; > + } __attribute__((packed)) elf_header; > + } __attribute__((packed)) header; If this is placed at the end, how can the OS reasonably use it without knowing that there are exactly 3 section header entries? I.e. how would it find the base of this structure? > + ELF_HANDLE_DECL(elf_ehdr) header_handle; > + unsigned long shdr_size; > + ELF_HANDLE_DECL(elf_shdr) section_handle; > + ELF_HANDLE_DECL(elf_shdr) image_handle; > + unsigned int i, link; > + elf_ptrval header_base; > + elf_ptrval elf_header_base; > + elf_ptrval symtab_base; > + elf_ptrval strtab_base; > > if ( !elf->bsd_symtab_pstart ) > return; > > -#define elf_hdr_elm(_elf, _hdr, _elm, _val) \ > -do { \ > - if ( elf_64bit(_elf) ) \ > - elf_store_field(_elf, _hdr, e64._elm, _val); \ > - else \ > - elf_store_field(_elf, _hdr, e32._elm, _val); \ > +#define elf_store_field_bitness(_elf, _hdr, _elm, _val) \ > +do { \ > + if ( elf_64bit(_elf) ) \ > + elf_store_field(_elf, _hdr, e64._elm, _val); \ > + else \ > + elf_store_field(_elf, _hdr, e32._elm, _val); \ > } while ( 0 ) > > - symbase = elf_get_ptr(elf, elf->bsd_symtab_pstart); > - symtab_addr = maxva = symbase + sizeof(uint32_t); > - > - /* Set up Elf header. */ > - sym_ehdr = ELF_MAKE_HANDLE(elf_ehdr, symtab_addr); > - sz = elf_uval(elf, elf->ehdr, e_ehsize); > - elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(sym_ehdr), ELF_HANDLE_PTRVAL(elf->ehdr), sz); > - maxva += sz; /* no round up */ > +#define SYMTAB_INDEX 1 > +#define STRTAB_INDEX 2 > > - elf_hdr_elm(elf, sym_ehdr, e_phoff, 0); > - elf_hdr_elm(elf, sym_ehdr, e_shoff, elf_uval(elf, elf->ehdr, e_ehsize)); > - elf_hdr_elm(elf, sym_ehdr, e_phentsize, 0); > - elf_hdr_elm(elf, sym_ehdr, e_phnum, 0); > + /* Allow elf_memcpy_safe to write to symbol_header. */ > + elf->caller_xdest_base = &header; > + elf->caller_xdest_size = sizeof(header); > > - /* Copy Elf section headers. */ > - shdr = ELF_MAKE_HANDLE(elf_shdr, maxva); > - sz = elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize); > - elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(shdr), > - ELF_IMAGE_BASE(elf) + elf_uval(elf, elf->ehdr, e_shoff), > - sz); > - maxva = elf_round_up(elf, (unsigned long)maxva + sz); > + /* > + * Calculate the position of the various elements in GUEST MEMORY SPACE. > + * This addresses MUST only be used with elf_load_image. > + * > + * NB: strtab_base cannot be calculated at this point because we don't > + * know the size of the symtab yet, and the strtab will be placed after it. > + */ > + header_base = elf_get_ptr(elf, elf->bsd_symtab_pstart); > + elf_header_base = elf_get_ptr(elf, elf->bsd_symtab_pstart) + > + sizeof(uint32_t); > + symtab_base = elf_round_up(elf, header_base + sizeof(header)); > + > + /* Fill the ELF header, copied from the original ELF header. */ > + header_handle = ELF_MAKE_HANDLE(elf_ehdr, > + ELF_REALPTR2PTRVAL(&header.elf_header.header)); > + elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(header_handle), > + ELF_HANDLE_PTRVAL(elf->ehdr), > + elf_uval(elf, elf->ehdr, e_ehsize)); > + > + /* Set the offset to the shdr array. */ > + elf_store_field_bitness(elf, header_handle, e_shoff, > + offsetof(typeof(header.elf_header), section)); > + > + /* Set the right number of section headers. */ > + elf_store_field_bitness(elf, header_handle, e_shnum, 3); > + > + /* Clear a couple of fields we don't use. */ > + elf_store_field_bitness(elf, header_handle, e_phoff, 0); > + elf_store_field_bitness(elf, header_handle, e_phentsize, 0); > + elf_store_field_bitness(elf, header_handle, e_phnum, 0); Perhaps better just give the structure above an initializer? > + /* Zero the dummy section. */ Dummy? And anyway I think you mean "section header" here. > + section_handle = ELF_MAKE_HANDLE(elf_shdr, > + ELF_REALPTR2PTRVAL(&header.elf_header.section[SHN_UNDEF])); > + shdr_size = elf_uval(elf, elf->ehdr, e_shentsize); > + elf_memset_safe(elf, ELF_HANDLE_PTRVAL(section_handle), 0, shdr_size); > > + /* > + * Find the actual symtab and strtab in the ELF. > + * > + * The symtab section header is going to reside in section[SYMTAB_INDEX], > + * while the corresponding strtab is going to be placed in > + * section[STRTAB_INDEX]. sh_offset is mangled so it points to the offset > + * where the sections are actually loaded (relative to the ELF header > + * location). > + */ > + section_handle = ELF_MAKE_HANDLE(elf_shdr, > + ELF_REALPTR2PTRVAL(&header.elf_header.section[SYMTAB_INDEX])); > for ( i = 0; i < elf_shdr_count(elf); i++ ) > { > - elf_ptrval old_shdr_p; > - elf_ptrval new_shdr_p; > > - type = elf_uval(elf, shdr, sh_type); > - if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) ) > + image_handle = elf_shdr_by_index(elf, i); > + if ( elf_uval(elf, image_handle, sh_type) != SHT_SYMTAB ) > + continue; > + > + elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(section_handle), > + ELF_HANDLE_PTRVAL(image_handle), > + shdr_size); > + > + link = elf_uval(elf, section_handle, sh_link); > + if ( link == SHN_UNDEF ) > { > - elf_msg(elf, "%s: shdr %i at 0x%"ELF_PRPTRVAL" -> 0x%"ELF_PRPTRVAL"\n", __func__, i, > - elf_section_start(elf, shdr), maxva); > - sz = elf_uval(elf, shdr, sh_size); > - elf_memcpy_safe(elf, maxva, elf_section_start(elf, shdr), sz); > - /* Mangled to be based on ELF header location. */ > - elf_hdr_elm(elf, shdr, sh_offset, maxva - symtab_addr); > - maxva = elf_round_up(elf, (unsigned long)maxva + sz); > + elf_mark_broken(elf, "bad link in symtab"); > + break; > } > - old_shdr_p = ELF_HANDLE_PTRVAL(shdr); > - new_shdr_p = old_shdr_p + elf_uval(elf, elf->ehdr, e_shentsize); > - if ( new_shdr_p <= old_shdr_p ) /* wrapped or stuck */ > + > + /* Load symtab into guest memory. */ > + elf_load_image(elf, symtab_base, elf_section_start(elf, section_handle), > + elf_uval(elf, section_handle, sh_size), > + elf_uval(elf, section_handle, sh_size)); > + elf_store_field_bitness(elf, section_handle, sh_offset, > + symtab_base - elf_header_base); > + elf_store_field_bitness(elf, section_handle, sh_link, > + STRTAB_INDEX); > + > + /* Calculate the guest address where strtab is loaded. */ > + strtab_base = elf_round_up(elf, symtab_base + > + elf_uval(elf, section_handle, sh_size)); > + > + /* Load strtab section header. */ > + section_handle = ELF_MAKE_HANDLE(elf_shdr, > + ELF_REALPTR2PTRVAL(&header.elf_header.section[STRTAB_INDEX])); > + elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(section_handle), > + ELF_HANDLE_PTRVAL(elf_shdr_by_index(elf, link)), > + shdr_size); > + > + if ( elf_uval(elf, section_handle, sh_type) != SHT_STRTAB ) > { > - elf_mark_broken(elf, "bad section header length"); > + elf_mark_broken(elf, "strtab not found"); > break; > } > - if ( !elf_access_ok(elf, new_shdr_p, 1) ) /* outside image */ > - break; > - shdr = ELF_MAKE_HANDLE(elf_shdr, new_shdr_p); > + > + /* Load strtab into guest memory. */ > + elf_load_image(elf, strtab_base, elf_section_start(elf, section_handle), > + elf_uval(elf, section_handle, sh_size), > + elf_uval(elf, section_handle, sh_size)); > + elf_store_field_bitness(elf, section_handle, sh_offset, > + strtab_base - elf_header_base); > + > + /* Store the whole size (including headers and loaded sections). */ > + header.size = strtab_base + elf_uval(elf, section_handle, sh_size) > - > + elf_header_base; > + break; > } So you're properly breaking out of the loop here - why don't you also do this in the parsing one? Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 3/4] libelf: rewrite symtab/strtab loading 2016-02-26 13:15 ` Jan Beulich @ 2016-02-26 17:02 ` Roger Pau Monné 2016-02-29 9:31 ` Jan Beulich 0 siblings, 1 reply; 39+ messages in thread From: Roger Pau Monné @ 2016-02-26 17:02 UTC (permalink / raw) To: Jan Beulich; +Cc: xen-devel El 26/2/16 a les 14:15, Jan Beulich ha escrit: >>>> On 16.02.16 at 18:37, <roger.pau@citrix.com> wrote: >> --- a/xen/common/libelf/libelf-loader.c >> +++ b/xen/common/libelf/libelf-loader.c >> @@ -164,20 +164,33 @@ void elf_parse_bsdsyms(struct elf_binary *elf, uint64_t pstart) >> sz = sizeof(uint32_t); >> >> /* Space for the elf and elf section headers */ >> - sz += (elf_uval(elf, elf->ehdr, e_ehsize) + >> - elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize)); >> + sz += elf_uval(elf, elf->ehdr, e_ehsize) + >> + 3 * elf_uval(elf, elf->ehdr, e_shentsize); > > I think a literal 3 like this either needs a #define (at once serving > as documentation) or a comment. Perhaps rather the former, > considering that the (same?) 3 appears again further down. Plus - > what guarantees there are 3 section headers in the image? This is not related to the image itself, but to the metadata that libelf places at the end of the loaded kernel in order to stash the symtab and strtab for the guest to use. The layout is as follows (I should add this to the patch itself as a comment, since I guess this is still quite confusing): +------------------------+ | | | KERNEL | | | +------------------------+ pend | ALIGNMENT | +------------------------+ bsd_symtab_pstart | | | size | bsd_symtab_pend - bsd_symtab_pstart | | +------------------------+ | | | ELF header | | | +------------------------+ | | | ELF section header 0 | Dummy section header | | +------------------------+ | | | ELF section header 1 | SYMTAB section header | | +------------------------+ | | | ELF section header 2 | STRTAB section header | | +------------------------+ | | | SYMTAB | | | +------------------------+ | | | STRTAB | | | +------------------------+ bsd_symtab_pend There are always only tree sections because that's all we need, section 0 is ignored, section 1 is used to describe the SYMTAB, and section 2 is used to describe the STRTAB. According to the ELF spec there can only be one SYMTAB, so that's all we need. > >> sz = elf_round_up(elf, sz); >> >> - /* Space for the symbol and string tables. */ >> + /* Space for the symbol and string table. */ >> for ( i = 0; i < elf_shdr_count(elf); i++ ) >> { >> shdr = elf_shdr_by_index(elf, i); >> if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) >> /* input has an insane section header count field */ >> break; >> - type = elf_uval(elf, shdr, sh_type); >> - if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) ) >> - sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); >> + >> + if ( elf_uval(elf, shdr, sh_type) != SHT_SYMTAB ) >> + continue; >> + >> + sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); >> + shdr = elf_shdr_by_index(elf, elf_uval(elf, shdr, sh_link)); >> + >> + if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) >> + /* input has an insane section header count field */ >> + break; >> + >> + if ( elf_uval(elf, shdr, sh_type) != SHT_STRTAB ) >> + /* Invalid symtab -> strtab link */ >> + break; > > This is not sufficient - what if sh_link is out of bounds, but the > bogusly accessed field happens to hold SHT_STRTAB? (Oddly > enough you have at least an SHN_UNDEF check in the second > loop below.) The out-of-bounds access would be detected by elf_access_ok (if it's out of the scope of the ELF file, which is all we care about). In fact the elf_access_ok above could be removed because elf_uval already performs out-of-bounds checks. The result is definitely no worse that what we are doing ATM. > > Also even if (I assume) elf_load_image() would refuse to read > outside the bounds of the image, wouldn't you better check here > that both symtab and strtab are within image bounds? The more > that you ignore errors from elf_load_image() (and > elf_mem{cpy,set}_safe() don't even return any)? Yes, I will add error checks to elf_load_image calls below. >> @@ -186,79 +199,169 @@ void elf_parse_bsdsyms(struct elf_binary *elf, >> uint64_t pstart) >> >> static void elf_load_bsdsyms(struct elf_binary *elf) >> { >> - ELF_HANDLE_DECL(elf_ehdr) sym_ehdr; >> - unsigned long sz; >> - elf_ptrval maxva; >> - elf_ptrval symbase; >> - elf_ptrval symtab_addr; >> - ELF_HANDLE_DECL(elf_shdr) shdr; >> - unsigned i, type; >> + /* >> + * Header that is placed at the end of the kernel and allows >> + * the OS to find where the symtab and strtab have been loaded. >> + * It mimics a valid ELF file header, although it only contains >> + * a symtab and a strtab section. >> + * >> + * NB: according to the ELF spec there's only ONE symtab per ELF >> + * file, and accordingly we will only load the corresponding >> + * strtab, so we only need three section headers in our fake ELF >> + * header (first section header is always a dummy). >> + */ >> + struct { >> + uint32_t size; >> + struct { >> + elf_ehdr header; >> + elf_shdr section[3]; >> + } __attribute__((packed)) elf_header; >> + } __attribute__((packed)) header; > > If this is placed at the end, how can the OS reasonably use it > without knowing that there are exactly 3 section header > entries? I.e. how would it find the base of this structure? See the commend below, the base is found at the end of the kernel, and then the ELF header contains the right pointers to the sections headers (by appropriately setting e_shoff). IMHO, this is an overly complex way to pass a SYMTAB and STRTAB, but it's baked in the ABI, so I don't see us changing it. >> + ELF_HANDLE_DECL(elf_ehdr) header_handle; >> + unsigned long shdr_size; >> + ELF_HANDLE_DECL(elf_shdr) section_handle; >> + ELF_HANDLE_DECL(elf_shdr) image_handle; >> + unsigned int i, link; >> + elf_ptrval header_base; >> + elf_ptrval elf_header_base; >> + elf_ptrval symtab_base; >> + elf_ptrval strtab_base; >> >> if ( !elf->bsd_symtab_pstart ) >> return; >> >> -#define elf_hdr_elm(_elf, _hdr, _elm, _val) \ >> -do { \ >> - if ( elf_64bit(_elf) ) \ >> - elf_store_field(_elf, _hdr, e64._elm, _val); \ >> - else \ >> - elf_store_field(_elf, _hdr, e32._elm, _val); \ >> +#define elf_store_field_bitness(_elf, _hdr, _elm, _val) \ >> +do { \ >> + if ( elf_64bit(_elf) ) \ >> + elf_store_field(_elf, _hdr, e64._elm, _val); \ >> + else \ >> + elf_store_field(_elf, _hdr, e32._elm, _val); \ >> } while ( 0 ) >> >> - symbase = elf_get_ptr(elf, elf->bsd_symtab_pstart); >> - symtab_addr = maxva = symbase + sizeof(uint32_t); >> - >> - /* Set up Elf header. */ >> - sym_ehdr = ELF_MAKE_HANDLE(elf_ehdr, symtab_addr); >> - sz = elf_uval(elf, elf->ehdr, e_ehsize); >> - elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(sym_ehdr), ELF_HANDLE_PTRVAL(elf->ehdr), sz); >> - maxva += sz; /* no round up */ >> +#define SYMTAB_INDEX 1 >> +#define STRTAB_INDEX 2 >> >> - elf_hdr_elm(elf, sym_ehdr, e_phoff, 0); >> - elf_hdr_elm(elf, sym_ehdr, e_shoff, elf_uval(elf, elf->ehdr, e_ehsize)); >> - elf_hdr_elm(elf, sym_ehdr, e_phentsize, 0); >> - elf_hdr_elm(elf, sym_ehdr, e_phnum, 0); >> + /* Allow elf_memcpy_safe to write to symbol_header. */ >> + elf->caller_xdest_base = &header; >> + elf->caller_xdest_size = sizeof(header); >> >> - /* Copy Elf section headers. */ >> - shdr = ELF_MAKE_HANDLE(elf_shdr, maxva); >> - sz = elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize); >> - elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(shdr), >> - ELF_IMAGE_BASE(elf) + elf_uval(elf, elf->ehdr, e_shoff), >> - sz); >> - maxva = elf_round_up(elf, (unsigned long)maxva + sz); >> + /* >> + * Calculate the position of the various elements in GUEST MEMORY SPACE. >> + * This addresses MUST only be used with elf_load_image. >> + * >> + * NB: strtab_base cannot be calculated at this point because we don't >> + * know the size of the symtab yet, and the strtab will be placed after it. >> + */ >> + header_base = elf_get_ptr(elf, elf->bsd_symtab_pstart); >> + elf_header_base = elf_get_ptr(elf, elf->bsd_symtab_pstart) + >> + sizeof(uint32_t); >> + symtab_base = elf_round_up(elf, header_base + sizeof(header)); >> + >> + /* Fill the ELF header, copied from the original ELF header. */ >> + header_handle = ELF_MAKE_HANDLE(elf_ehdr, >> + ELF_REALPTR2PTRVAL(&header.elf_header.header)); >> + elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(header_handle), >> + ELF_HANDLE_PTRVAL(elf->ehdr), >> + elf_uval(elf, elf->ehdr, e_ehsize)); >> + >> + /* Set the offset to the shdr array. */ >> + elf_store_field_bitness(elf, header_handle, e_shoff, >> + offsetof(typeof(header.elf_header), section)); >> + >> + /* Set the right number of section headers. */ >> + elf_store_field_bitness(elf, header_handle, e_shnum, 3); >> + >> + /* Clear a couple of fields we don't use. */ >> + elf_store_field_bitness(elf, header_handle, e_phoff, 0); >> + elf_store_field_bitness(elf, header_handle, e_phentsize, 0); >> + elf_store_field_bitness(elf, header_handle, e_phnum, 0); > > Perhaps better just give the structure above an initializer? Not really, the problem is that we copy the original header from the ELF binary, and then we mangle it. This is just a fixup to make it clear that no program headers are present (we just use section headers in order to describe the SYMTAB and STRTAB positions). >> + /* Zero the dummy section. */ > > Dummy? And anyway I think you mean "section header" here. Yes, the right comment would be: zero the dummy section header. > >> + section_handle = ELF_MAKE_HANDLE(elf_shdr, >> + ELF_REALPTR2PTRVAL(&header.elf_header.section[SHN_UNDEF])); >> + shdr_size = elf_uval(elf, elf->ehdr, e_shentsize); >> + elf_memset_safe(elf, ELF_HANDLE_PTRVAL(section_handle), 0, shdr_size); >> >> + /* >> + * Find the actual symtab and strtab in the ELF. >> + * >> + * The symtab section header is going to reside in section[SYMTAB_INDEX], >> + * while the corresponding strtab is going to be placed in >> + * section[STRTAB_INDEX]. sh_offset is mangled so it points to the offset >> + * where the sections are actually loaded (relative to the ELF header >> + * location). >> + */ >> + section_handle = ELF_MAKE_HANDLE(elf_shdr, >> + ELF_REALPTR2PTRVAL(&header.elf_header.section[SYMTAB_INDEX])); >> for ( i = 0; i < elf_shdr_count(elf); i++ ) >> { >> - elf_ptrval old_shdr_p; >> - elf_ptrval new_shdr_p; >> >> - type = elf_uval(elf, shdr, sh_type); >> - if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) ) >> + image_handle = elf_shdr_by_index(elf, i); >> + if ( elf_uval(elf, image_handle, sh_type) != SHT_SYMTAB ) >> + continue; >> + >> + elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(section_handle), >> + ELF_HANDLE_PTRVAL(image_handle), >> + shdr_size); >> + >> + link = elf_uval(elf, section_handle, sh_link); >> + if ( link == SHN_UNDEF ) >> { >> - elf_msg(elf, "%s: shdr %i at 0x%"ELF_PRPTRVAL" -> 0x%"ELF_PRPTRVAL"\n", __func__, i, >> - elf_section_start(elf, shdr), maxva); >> - sz = elf_uval(elf, shdr, sh_size); >> - elf_memcpy_safe(elf, maxva, elf_section_start(elf, shdr), sz); >> - /* Mangled to be based on ELF header location. */ >> - elf_hdr_elm(elf, shdr, sh_offset, maxva - symtab_addr); >> - maxva = elf_round_up(elf, (unsigned long)maxva + sz); >> + elf_mark_broken(elf, "bad link in symtab"); >> + break; >> } >> - old_shdr_p = ELF_HANDLE_PTRVAL(shdr); >> - new_shdr_p = old_shdr_p + elf_uval(elf, elf->ehdr, e_shentsize); >> - if ( new_shdr_p <= old_shdr_p ) /* wrapped or stuck */ >> + >> + /* Load symtab into guest memory. */ >> + elf_load_image(elf, symtab_base, elf_section_start(elf, section_handle), >> + elf_uval(elf, section_handle, sh_size), >> + elf_uval(elf, section_handle, sh_size)); >> + elf_store_field_bitness(elf, section_handle, sh_offset, >> + symtab_base - elf_header_base); >> + elf_store_field_bitness(elf, section_handle, sh_link, >> + STRTAB_INDEX); >> + >> + /* Calculate the guest address where strtab is loaded. */ >> + strtab_base = elf_round_up(elf, symtab_base + >> + elf_uval(elf, section_handle, sh_size)); >> + >> + /* Load strtab section header. */ >> + section_handle = ELF_MAKE_HANDLE(elf_shdr, >> + ELF_REALPTR2PTRVAL(&header.elf_header.section[STRTAB_INDEX])); >> + elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(section_handle), >> + ELF_HANDLE_PTRVAL(elf_shdr_by_index(elf, link)), >> + shdr_size); >> + >> + if ( elf_uval(elf, section_handle, sh_type) != SHT_STRTAB ) >> { >> - elf_mark_broken(elf, "bad section header length"); >> + elf_mark_broken(elf, "strtab not found"); >> break; >> } >> - if ( !elf_access_ok(elf, new_shdr_p, 1) ) /* outside image */ >> - break; >> - shdr = ELF_MAKE_HANDLE(elf_shdr, new_shdr_p); >> + >> + /* Load strtab into guest memory. */ >> + elf_load_image(elf, strtab_base, elf_section_start(elf, section_handle), >> + elf_uval(elf, section_handle, sh_size), >> + elf_uval(elf, section_handle, sh_size)); >> + elf_store_field_bitness(elf, section_handle, sh_offset, >> + strtab_base - elf_header_base); >> + >> + /* Store the whole size (including headers and loaded sections). */ >> + header.size = strtab_base + elf_uval(elf, section_handle, sh_size) >> - >> + elf_header_base; >> + break; >> } > > So you're properly breaking out of the loop here - why don't you > also do this in the parsing one? Oh, my bad, this is a bug in the parsing code, will fix it, thanks. FWIW, it works because proper ELF binaries only have one SYMTAB, but we shouldn't rely on this. Roger. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 3/4] libelf: rewrite symtab/strtab loading 2016-02-26 17:02 ` Roger Pau Monné @ 2016-02-29 9:31 ` Jan Beulich 2016-02-29 10:57 ` Roger Pau Monné 0 siblings, 1 reply; 39+ messages in thread From: Jan Beulich @ 2016-02-29 9:31 UTC (permalink / raw) To: Roger Pau Monné; +Cc: xen-devel >>> On 26.02.16 at 18:02, <roger.pau@citrix.com> wrote: > El 26/2/16 a les 14:15, Jan Beulich ha escrit: >>>>> On 16.02.16 at 18:37, <roger.pau@citrix.com> wrote: >>> --- a/xen/common/libelf/libelf-loader.c >>> +++ b/xen/common/libelf/libelf-loader.c >>> @@ -164,20 +164,33 @@ void elf_parse_bsdsyms(struct elf_binary *elf, uint64_t pstart) >>> sz = sizeof(uint32_t); >>> >>> /* Space for the elf and elf section headers */ >>> - sz += (elf_uval(elf, elf->ehdr, e_ehsize) + >>> - elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize)); >>> + sz += elf_uval(elf, elf->ehdr, e_ehsize) + >>> + 3 * elf_uval(elf, elf->ehdr, e_shentsize); >> >> I think a literal 3 like this either needs a #define (at once serving >> as documentation) or a comment. Perhaps rather the former, >> considering that the (same?) 3 appears again further down. Plus - >> what guarantees there are 3 section headers in the image? > > This is not related to the image itself, but to the metadata that libelf > places at the end of the loaded kernel in order to stash the symtab and > strtab for the guest to use. I understand this, yet imo the literal 3 still should be replaced. > The layout is as follows (I should add this to the patch itself as a > comment, since I guess this is still quite confusing): > > +------------------------+ > | | > | KERNEL | > | | > +------------------------+ pend > | ALIGNMENT | > +------------------------+ bsd_symtab_pstart > | | > | size | bsd_symtab_pend - bsd_symtab_pstart > | | > +------------------------+ > | | > | ELF header | > | | > +------------------------+ > | | > | ELF section header 0 | Dummy section header > | | > +------------------------+ > | | > | ELF section header 1 | SYMTAB section header > | | > +------------------------+ > | | > | ELF section header 2 | STRTAB section header > | | > +------------------------+ > | | > | SYMTAB | > | | > +------------------------+ > | | > | STRTAB | > | | > +------------------------+ bsd_symtab_pend > > There are always only tree sections because that's all we need, section > 0 is ignored, section 1 is used to describe the SYMTAB, and section 2 is > used to describe the STRTAB. All fine, but this still doesn't clarify how the kernel learns where bsd_symtab_pstart is. > According to the ELF spec there can only be > one SYMTAB, so that's all we need. Did you perhaps overlook the spec saying "... but this restriction may be relaxed in the future", plus the fact that we're talking about an executable file here (which commonly have two symbol tables - .dynsym and .symtab), not an object one? (This isn't to say that you need to make the code handle multiple ones, if you know that *BSD will only ever have one.) >>> sz = elf_round_up(elf, sz); >>> >>> - /* Space for the symbol and string tables. */ >>> + /* Space for the symbol and string table. */ >>> for ( i = 0; i < elf_shdr_count(elf); i++ ) >>> { >>> shdr = elf_shdr_by_index(elf, i); >>> if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) >>> /* input has an insane section header count field */ >>> break; >>> - type = elf_uval(elf, shdr, sh_type); >>> - if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) ) >>> - sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); >>> + >>> + if ( elf_uval(elf, shdr, sh_type) != SHT_SYMTAB ) >>> + continue; >>> + >>> + sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); >>> + shdr = elf_shdr_by_index(elf, elf_uval(elf, shdr, sh_link)); >>> + >>> + if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) >>> + /* input has an insane section header count field */ >>> + break; >>> + >>> + if ( elf_uval(elf, shdr, sh_type) != SHT_STRTAB ) >>> + /* Invalid symtab -> strtab link */ >>> + break; >> >> This is not sufficient - what if sh_link is out of bounds, but the >> bogusly accessed field happens to hold SHT_STRTAB? (Oddly >> enough you have at least an SHN_UNDEF check in the second >> loop below.) > > The out-of-bounds access would be detected by elf_access_ok (if it's out > of the scope of the ELF file, which is all we care about). In fact the > elf_access_ok above could be removed because elf_uval already performs > out-of-bounds checks. The result is definitely no worse that what we are > doing ATM. No, the out of bounds check should be more strict than just considering the whole image: The image is broken if the link points to a non-existing section. >>> @@ -186,79 +199,169 @@ void elf_parse_bsdsyms(struct elf_binary *elf, >>> uint64_t pstart) >>> >>> static void elf_load_bsdsyms(struct elf_binary *elf) >>> { >>> - ELF_HANDLE_DECL(elf_ehdr) sym_ehdr; >>> - unsigned long sz; >>> - elf_ptrval maxva; >>> - elf_ptrval symbase; >>> - elf_ptrval symtab_addr; >>> - ELF_HANDLE_DECL(elf_shdr) shdr; >>> - unsigned i, type; >>> + /* >>> + * Header that is placed at the end of the kernel and allows >>> + * the OS to find where the symtab and strtab have been loaded. >>> + * It mimics a valid ELF file header, although it only contains >>> + * a symtab and a strtab section. >>> + * >>> + * NB: according to the ELF spec there's only ONE symtab per ELF >>> + * file, and accordingly we will only load the corresponding >>> + * strtab, so we only need three section headers in our fake ELF >>> + * header (first section header is always a dummy). >>> + */ >>> + struct { >>> + uint32_t size; >>> + struct { >>> + elf_ehdr header; >>> + elf_shdr section[3]; >>> + } __attribute__((packed)) elf_header; >>> + } __attribute__((packed)) header; >> >> If this is placed at the end, how can the OS reasonably use it >> without knowing that there are exactly 3 section header >> entries? I.e. how would it find the base of this structure? > > See the commend below, the base is found at the end of the kernel, and > then the ELF header contains the right pointers to the sections headers > (by appropriately setting e_shoff). Thing is - I can't see which "comment below" you try to refer me to. >>> + ELF_HANDLE_DECL(elf_ehdr) header_handle; >>> + unsigned long shdr_size; >>> + ELF_HANDLE_DECL(elf_shdr) section_handle; >>> + ELF_HANDLE_DECL(elf_shdr) image_handle; >>> + unsigned int i, link; >>> + elf_ptrval header_base; >>> + elf_ptrval elf_header_base; >>> + elf_ptrval symtab_base; >>> + elf_ptrval strtab_base; >>> >>> if ( !elf->bsd_symtab_pstart ) >>> return; >>> >>> -#define elf_hdr_elm(_elf, _hdr, _elm, _val) \ >>> -do { \ >>> - if ( elf_64bit(_elf) ) \ >>> - elf_store_field(_elf, _hdr, e64._elm, _val); \ >>> - else \ >>> - elf_store_field(_elf, _hdr, e32._elm, _val); \ >>> +#define elf_store_field_bitness(_elf, _hdr, _elm, _val) \ >>> +do { \ >>> + if ( elf_64bit(_elf) ) \ >>> + elf_store_field(_elf, _hdr, e64._elm, _val); \ >>> + else \ >>> + elf_store_field(_elf, _hdr, e32._elm, _val); \ >>> } while ( 0 ) >>> >>> - symbase = elf_get_ptr(elf, elf->bsd_symtab_pstart); >>> - symtab_addr = maxva = symbase + sizeof(uint32_t); >>> - >>> - /* Set up Elf header. */ >>> - sym_ehdr = ELF_MAKE_HANDLE(elf_ehdr, symtab_addr); >>> - sz = elf_uval(elf, elf->ehdr, e_ehsize); >>> - elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(sym_ehdr), ELF_HANDLE_PTRVAL(elf->ehdr), sz); >>> - maxva += sz; /* no round up */ >>> +#define SYMTAB_INDEX 1 >>> +#define STRTAB_INDEX 2 >>> >>> - elf_hdr_elm(elf, sym_ehdr, e_phoff, 0); >>> - elf_hdr_elm(elf, sym_ehdr, e_shoff, elf_uval(elf, elf->ehdr, e_ehsize)); >>> - elf_hdr_elm(elf, sym_ehdr, e_phentsize, 0); >>> - elf_hdr_elm(elf, sym_ehdr, e_phnum, 0); >>> + /* Allow elf_memcpy_safe to write to symbol_header. */ >>> + elf->caller_xdest_base = &header; >>> + elf->caller_xdest_size = sizeof(header); >>> >>> - /* Copy Elf section headers. */ >>> - shdr = ELF_MAKE_HANDLE(elf_shdr, maxva); >>> - sz = elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize); >>> - elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(shdr), >>> - ELF_IMAGE_BASE(elf) + elf_uval(elf, elf->ehdr, e_shoff), >>> - sz); >>> - maxva = elf_round_up(elf, (unsigned long)maxva + sz); >>> + /* >>> + * Calculate the position of the various elements in GUEST MEMORY SPACE. >>> + * This addresses MUST only be used with elf_load_image. >>> + * >>> + * NB: strtab_base cannot be calculated at this point because we don't >>> + * know the size of the symtab yet, and the strtab will be placed after it. >>> + */ >>> + header_base = elf_get_ptr(elf, elf->bsd_symtab_pstart); >>> + elf_header_base = elf_get_ptr(elf, elf->bsd_symtab_pstart) + >>> + sizeof(uint32_t); >>> + symtab_base = elf_round_up(elf, header_base + sizeof(header)); >>> + >>> + /* Fill the ELF header, copied from the original ELF header. */ >>> + header_handle = ELF_MAKE_HANDLE(elf_ehdr, >>> + ELF_REALPTR2PTRVAL(&header.elf_header.header)); >>> + elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(header_handle), >>> + ELF_HANDLE_PTRVAL(elf->ehdr), >>> + elf_uval(elf, elf->ehdr, e_ehsize)); >>> + >>> + /* Set the offset to the shdr array. */ >>> + elf_store_field_bitness(elf, header_handle, e_shoff, >>> + offsetof(typeof(header.elf_header), section)); >>> + >>> + /* Set the right number of section headers. */ >>> + elf_store_field_bitness(elf, header_handle, e_shnum, 3); >>> + >>> + /* Clear a couple of fields we don't use. */ >>> + elf_store_field_bitness(elf, header_handle, e_phoff, 0); >>> + elf_store_field_bitness(elf, header_handle, e_phentsize, 0); >>> + elf_store_field_bitness(elf, header_handle, e_phnum, 0); >> >> Perhaps better just give the structure above an initializer? > > Not really, the problem is that we copy the original header from the ELF > binary, and then we mangle it. This is just a fixup to make it clear > that no program headers are present (we just use section headers in > order to describe the SYMTAB and STRTAB positions). Ah, I see now. >>> + /* Zero the dummy section. */ >> >> Dummy? And anyway I think you mean "section header" here. > > Yes, the right comment would be: zero the dummy section header. But then still - why "dummy"? The 0-th section header has a purpose beyond being a filler, even if that purpose doesn't matter for the intentions you have here. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 3/4] libelf: rewrite symtab/strtab loading 2016-02-29 9:31 ` Jan Beulich @ 2016-02-29 10:57 ` Roger Pau Monné 2016-02-29 12:14 ` Jan Beulich 0 siblings, 1 reply; 39+ messages in thread From: Roger Pau Monné @ 2016-02-29 10:57 UTC (permalink / raw) To: Jan Beulich; +Cc: xen-devel El 29/2/16 a les 10:31, Jan Beulich ha escrit: >>>> On 26.02.16 at 18:02, <roger.pau@citrix.com> wrote: >> El 26/2/16 a les 14:15, Jan Beulich ha escrit: >>>>>> On 16.02.16 at 18:37, <roger.pau@citrix.com> wrote: >>>> --- a/xen/common/libelf/libelf-loader.c >>>> +++ b/xen/common/libelf/libelf-loader.c >>>> @@ -164,20 +164,33 @@ void elf_parse_bsdsyms(struct elf_binary *elf, uint64_t pstart) >>>> sz = sizeof(uint32_t); >>>> >>>> /* Space for the elf and elf section headers */ >>>> - sz += (elf_uval(elf, elf->ehdr, e_ehsize) + >>>> - elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize)); >>>> + sz += elf_uval(elf, elf->ehdr, e_ehsize) + >>>> + 3 * elf_uval(elf, elf->ehdr, e_shentsize); >>> >>> I think a literal 3 like this either needs a #define (at once serving >>> as documentation) or a comment. Perhaps rather the former, >>> considering that the (same?) 3 appears again further down. Plus - >>> what guarantees there are 3 section headers in the image? >> >> This is not related to the image itself, but to the metadata that libelf >> places at the end of the loaded kernel in order to stash the symtab and >> strtab for the guest to use. > > I understand this, yet imo the literal 3 still should be replaced. Yes, I agree. >> The layout is as follows (I should add this to the patch itself as a >> comment, since I guess this is still quite confusing): >> >> +------------------------+ >> | | >> | KERNEL | >> | | >> +------------------------+ pend >> | ALIGNMENT | >> +------------------------+ bsd_symtab_pstart >> | | >> | size | bsd_symtab_pend - bsd_symtab_pstart >> | | >> +------------------------+ >> | | >> | ELF header | >> | | >> +------------------------+ >> | | >> | ELF section header 0 | Dummy section header >> | | >> +------------------------+ >> | | >> | ELF section header 1 | SYMTAB section header >> | | >> +------------------------+ >> | | >> | ELF section header 2 | STRTAB section header >> | | >> +------------------------+ >> | | >> | SYMTAB | >> | | >> +------------------------+ >> | | >> | STRTAB | >> | | >> +------------------------+ bsd_symtab_pend >> >> There are always only tree sections because that's all we need, section >> 0 is ignored, section 1 is used to describe the SYMTAB, and section 2 is >> used to describe the STRTAB. > > All fine, but this still doesn't clarify how the kernel learns where > bsd_symtab_pstart is. The BSDs linker scripts places an "end" symbol after all the loaded sections: https://svnweb.freebsd.org/base/head/sys/conf/ldscript.amd64?revision=284870&view=co If you execute: $ nm -n /boot/kernel/kernel Against a FreeBSD kernel the output is as follows: [...] ffffffff81dc4760 B cpu_info ffffffff81dc4b60 B dpcpu ffffffff81dc4b68 B smp_tlb_pmap ffffffff81dc4b70 B drq_rman ffffffff81dc4bb8 B mem_rman ffffffff81dc4c00 B irq_rman ffffffff81dc4c48 B port_rman ffffffff81dc4c90 B tsc_is_invariant ffffffff81dc4c94 B tsc_perf_stat ffffffff81dc4c98 B tsc_freq ffffffff81dc4ca0 B smp_tsc ffffffff81dc4ca8 B HYPERVISOR_shared_info ffffffff81dc4cb0 B xen_vector_callback_enabled ffffffff81dc4cb8 B HYPERVISOR_start_info ffffffff81dc4cc0 A _end ffffffff81dc4cc0 A end >> According to the ELF spec there can only be >> one SYMTAB, so that's all we need. > > Did you perhaps overlook the spec saying "... but this restriction > may be relaxed in the future", plus the fact that we're talking > about an executable file here (which commonly have two symbol > tables - .dynsym and .symtab), not an object one? (This isn't to > say that you need to make the code handle multiple ones, if you > know that *BSD will only ever have one.) DYNSYM is just a subset of SYMTAB, BSDs prefer (it's not a strict requirement) the SYMTAB because of the in-kernel debugger. Also DYNSYM is already loaded by default because it's covered by the program headers. I can add support for loading multiple SYMTABs/STRTABs, but shouldn't we wait until the spec is updated? As I read the spec ATM, an ELF file with multiple SYMTABs is invalid. I assume that if the ELF format ever allows more than one SYMTAB, the version is going to be bumped at least (or some other field is going to be used to signal this change). >>>> sz = elf_round_up(elf, sz); >>>> >>>> - /* Space for the symbol and string tables. */ >>>> + /* Space for the symbol and string table. */ >>>> for ( i = 0; i < elf_shdr_count(elf); i++ ) >>>> { >>>> shdr = elf_shdr_by_index(elf, i); >>>> if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) >>>> /* input has an insane section header count field */ >>>> break; >>>> - type = elf_uval(elf, shdr, sh_type); >>>> - if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) ) >>>> - sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); >>>> + >>>> + if ( elf_uval(elf, shdr, sh_type) != SHT_SYMTAB ) >>>> + continue; >>>> + >>>> + sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); >>>> + shdr = elf_shdr_by_index(elf, elf_uval(elf, shdr, sh_link)); >>>> + >>>> + if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) >>>> + /* input has an insane section header count field */ >>>> + break; >>>> + >>>> + if ( elf_uval(elf, shdr, sh_type) != SHT_STRTAB ) >>>> + /* Invalid symtab -> strtab link */ >>>> + break; >>> >>> This is not sufficient - what if sh_link is out of bounds, but the >>> bogusly accessed field happens to hold SHT_STRTAB? (Oddly >>> enough you have at least an SHN_UNDEF check in the second >>> loop below.) >> >> The out-of-bounds access would be detected by elf_access_ok (if it's out >> of the scope of the ELF file, which is all we care about). In fact the >> elf_access_ok above could be removed because elf_uval already performs >> out-of-bounds checks. The result is definitely no worse that what we are >> doing ATM. > > No, the out of bounds check should be more strict than just > considering the whole image: The image is broken if the link > points to a non-existing section. Ah, do you mean I should mark the image as broken if either of the checks fail? In this case the image is broken if the sh_link field points to anything different than a STRTAB section, so I should add a elf_mark_broken before the break. elf_access_ok already marks the image as broken if an out-of-bounds access is attempted. >>>> @@ -186,79 +199,169 @@ void elf_parse_bsdsyms(struct elf_binary *elf, >>>> uint64_t pstart) >>>> >>>> static void elf_load_bsdsyms(struct elf_binary *elf) >>>> { >>>> - ELF_HANDLE_DECL(elf_ehdr) sym_ehdr; >>>> - unsigned long sz; >>>> - elf_ptrval maxva; >>>> - elf_ptrval symbase; >>>> - elf_ptrval symtab_addr; >>>> - ELF_HANDLE_DECL(elf_shdr) shdr; >>>> - unsigned i, type; >>>> + /* >>>> + * Header that is placed at the end of the kernel and allows >>>> + * the OS to find where the symtab and strtab have been loaded. >>>> + * It mimics a valid ELF file header, although it only contains >>>> + * a symtab and a strtab section. >>>> + * >>>> + * NB: according to the ELF spec there's only ONE symtab per ELF >>>> + * file, and accordingly we will only load the corresponding >>>> + * strtab, so we only need three section headers in our fake ELF >>>> + * header (first section header is always a dummy). >>>> + */ >>>> + struct { >>>> + uint32_t size; >>>> + struct { >>>> + elf_ehdr header; >>>> + elf_shdr section[3]; >>>> + } __attribute__((packed)) elf_header; >>>> + } __attribute__((packed)) header; >>> >>> If this is placed at the end, how can the OS reasonably use it >>> without knowing that there are exactly 3 section header >>> entries? I.e. how would it find the base of this structure? >> >> See the commend below, the base is found at the end of the kernel, and >> then the ELF header contains the right pointers to the sections headers >> (by appropriately setting e_shoff). > > Thing is - I can't see which "comment below" you try to refer me to. I was trying to point you to the big diagram/layout that I've posted at the start of the email. It doesn't need to know there are exactly 3 sections, this is fetched form the ELF header e_shnum field. And the ELF header is fetched using the "end" symbol as a reference to pend. >>>> + ELF_HANDLE_DECL(elf_ehdr) header_handle; >>>> + unsigned long shdr_size; >>>> + ELF_HANDLE_DECL(elf_shdr) section_handle; >>>> + ELF_HANDLE_DECL(elf_shdr) image_handle; >>>> + unsigned int i, link; >>>> + elf_ptrval header_base; >>>> + elf_ptrval elf_header_base; >>>> + elf_ptrval symtab_base; >>>> + elf_ptrval strtab_base; >>>> >>>> if ( !elf->bsd_symtab_pstart ) >>>> return; >>>> >>>> -#define elf_hdr_elm(_elf, _hdr, _elm, _val) \ >>>> -do { \ >>>> - if ( elf_64bit(_elf) ) \ >>>> - elf_store_field(_elf, _hdr, e64._elm, _val); \ >>>> - else \ >>>> - elf_store_field(_elf, _hdr, e32._elm, _val); \ >>>> +#define elf_store_field_bitness(_elf, _hdr, _elm, _val) \ >>>> +do { \ >>>> + if ( elf_64bit(_elf) ) \ >>>> + elf_store_field(_elf, _hdr, e64._elm, _val); \ >>>> + else \ >>>> + elf_store_field(_elf, _hdr, e32._elm, _val); \ >>>> } while ( 0 ) >>>> >>>> - symbase = elf_get_ptr(elf, elf->bsd_symtab_pstart); >>>> - symtab_addr = maxva = symbase + sizeof(uint32_t); >>>> - >>>> - /* Set up Elf header. */ >>>> - sym_ehdr = ELF_MAKE_HANDLE(elf_ehdr, symtab_addr); >>>> - sz = elf_uval(elf, elf->ehdr, e_ehsize); >>>> - elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(sym_ehdr), ELF_HANDLE_PTRVAL(elf->ehdr), sz); >>>> - maxva += sz; /* no round up */ >>>> +#define SYMTAB_INDEX 1 >>>> +#define STRTAB_INDEX 2 >>>> >>>> - elf_hdr_elm(elf, sym_ehdr, e_phoff, 0); >>>> - elf_hdr_elm(elf, sym_ehdr, e_shoff, elf_uval(elf, elf->ehdr, e_ehsize)); >>>> - elf_hdr_elm(elf, sym_ehdr, e_phentsize, 0); >>>> - elf_hdr_elm(elf, sym_ehdr, e_phnum, 0); >>>> + /* Allow elf_memcpy_safe to write to symbol_header. */ >>>> + elf->caller_xdest_base = &header; >>>> + elf->caller_xdest_size = sizeof(header); >>>> >>>> - /* Copy Elf section headers. */ >>>> - shdr = ELF_MAKE_HANDLE(elf_shdr, maxva); >>>> - sz = elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize); >>>> - elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(shdr), >>>> - ELF_IMAGE_BASE(elf) + elf_uval(elf, elf->ehdr, e_shoff), >>>> - sz); >>>> - maxva = elf_round_up(elf, (unsigned long)maxva + sz); >>>> + /* >>>> + * Calculate the position of the various elements in GUEST MEMORY SPACE. >>>> + * This addresses MUST only be used with elf_load_image. >>>> + * >>>> + * NB: strtab_base cannot be calculated at this point because we don't >>>> + * know the size of the symtab yet, and the strtab will be placed after it. >>>> + */ >>>> + header_base = elf_get_ptr(elf, elf->bsd_symtab_pstart); >>>> + elf_header_base = elf_get_ptr(elf, elf->bsd_symtab_pstart) + >>>> + sizeof(uint32_t); >>>> + symtab_base = elf_round_up(elf, header_base + sizeof(header)); >>>> + >>>> + /* Fill the ELF header, copied from the original ELF header. */ >>>> + header_handle = ELF_MAKE_HANDLE(elf_ehdr, >>>> + ELF_REALPTR2PTRVAL(&header.elf_header.header)); >>>> + elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(header_handle), >>>> + ELF_HANDLE_PTRVAL(elf->ehdr), >>>> + elf_uval(elf, elf->ehdr, e_ehsize)); >>>> + >>>> + /* Set the offset to the shdr array. */ >>>> + elf_store_field_bitness(elf, header_handle, e_shoff, >>>> + offsetof(typeof(header.elf_header), section)); >>>> + >>>> + /* Set the right number of section headers. */ >>>> + elf_store_field_bitness(elf, header_handle, e_shnum, 3); >>>> + >>>> + /* Clear a couple of fields we don't use. */ >>>> + elf_store_field_bitness(elf, header_handle, e_phoff, 0); >>>> + elf_store_field_bitness(elf, header_handle, e_phentsize, 0); >>>> + elf_store_field_bitness(elf, header_handle, e_phnum, 0); >>> >>> Perhaps better just give the structure above an initializer? >> >> Not really, the problem is that we copy the original header from the ELF >> binary, and then we mangle it. This is just a fixup to make it clear >> that no program headers are present (we just use section headers in >> order to describe the SYMTAB and STRTAB positions). > > Ah, I see now. > >>>> + /* Zero the dummy section. */ >>> >>> Dummy? And anyway I think you mean "section header" here. >> >> Yes, the right comment would be: zero the dummy section header. > > But then still - why "dummy"? The 0-th section header has a purpose > beyond being a filler, even if that purpose doesn't matter for the > intentions you have here. OK, what about if I replace the comment with: /* Zero the undefined section header. */ I think that's more inline with the specification. I could also fill it with specific values if you think it's clearer: elf_store_field_bitness(..., sh_name, 0); elf_store_field_bitness(..., sh_type, SHT_NULL); elf_store_field_bitness(..., sh_flags, 0); [...] Which is equivalent to zeroing it. Thanks for the review, Roger. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 3/4] libelf: rewrite symtab/strtab loading 2016-02-29 10:57 ` Roger Pau Monné @ 2016-02-29 12:14 ` Jan Beulich 2016-02-29 16:20 ` Roger Pau Monné 0 siblings, 1 reply; 39+ messages in thread From: Jan Beulich @ 2016-02-29 12:14 UTC (permalink / raw) To: Roger Pau Monné; +Cc: xen-devel >>> On 29.02.16 at 11:57, <roger.pau@citrix.com> wrote: > El 29/2/16 a les 10:31, Jan Beulich ha escrit: >>>>> On 26.02.16 at 18:02, <roger.pau@citrix.com> wrote: >>> The layout is as follows (I should add this to the patch itself as a >>> comment, since I guess this is still quite confusing): >>> >>> +------------------------+ >>> | | >>> | KERNEL | >>> | | >>> +------------------------+ pend >>> | ALIGNMENT | >>> +------------------------+ bsd_symtab_pstart >>> | | >>> | size | bsd_symtab_pend - bsd_symtab_pstart >>> | | >>> +------------------------+ >>> | | >>> | ELF header | >>> | | >>> +------------------------+ >>> | | >>> | ELF section header 0 | Dummy section header >>> | | >>> +------------------------+ >>> | | >>> | ELF section header 1 | SYMTAB section header >>> | | >>> +------------------------+ >>> | | >>> | ELF section header 2 | STRTAB section header >>> | | >>> +------------------------+ >>> | | >>> | SYMTAB | >>> | | >>> +------------------------+ >>> | | >>> | STRTAB | >>> | | >>> +------------------------+ bsd_symtab_pend >>> >>> There are always only tree sections because that's all we need, section >>> 0 is ignored, section 1 is used to describe the SYMTAB, and section 2 is >>> used to describe the STRTAB. >> >> All fine, but this still doesn't clarify how the kernel learns where >> bsd_symtab_pstart is. > > The BSDs linker scripts places an "end" symbol after all the loaded > sections: > > https://svnweb.freebsd.org/base/head/sys/conf/ldscript.amd64?revision=284870& > view=co That's fine. But how do they know it is legitimate to even touch what follows, not to speak of assign meaning to the value found there? And are there no alignment/padding considerations necessary? >>> According to the ELF spec there can only be >>> one SYMTAB, so that's all we need. >> >> Did you perhaps overlook the spec saying "... but this restriction >> may be relaxed in the future", plus the fact that we're talking >> about an executable file here (which commonly have two symbol >> tables - .dynsym and .symtab), not an object one? (This isn't to >> say that you need to make the code handle multiple ones, if you >> know that *BSD will only ever have one.) > > DYNSYM is just a subset of SYMTAB, BSDs prefer (it's not a strict > requirement) the SYMTAB because of the in-kernel debugger. Also DYNSYM > is already loaded by default because it's covered by the program headers. > > I can add support for loading multiple SYMTABs/STRTABs, but shouldn't we > wait until the spec is updated? As I read the spec ATM, an ELF file with > multiple SYMTABs is invalid. I assume that if the ELF format ever allows > more than one SYMTAB, the version is going to be bumped at least (or > some other field is going to be used to signal this change). As said I don't see the need for you to add multiple symbol table support. I only meant to point out that the potential exists. >>>>> sz = elf_round_up(elf, sz); >>>>> >>>>> - /* Space for the symbol and string tables. */ >>>>> + /* Space for the symbol and string table. */ >>>>> for ( i = 0; i < elf_shdr_count(elf); i++ ) >>>>> { >>>>> shdr = elf_shdr_by_index(elf, i); >>>>> if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) >>>>> /* input has an insane section header count field */ >>>>> break; >>>>> - type = elf_uval(elf, shdr, sh_type); >>>>> - if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) ) >>>>> - sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); >>>>> + >>>>> + if ( elf_uval(elf, shdr, sh_type) != SHT_SYMTAB ) >>>>> + continue; >>>>> + >>>>> + sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); >>>>> + shdr = elf_shdr_by_index(elf, elf_uval(elf, shdr, sh_link)); >>>>> + >>>>> + if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) >>>>> + /* input has an insane section header count field */ >>>>> + break; >>>>> + >>>>> + if ( elf_uval(elf, shdr, sh_type) != SHT_STRTAB ) >>>>> + /* Invalid symtab -> strtab link */ >>>>> + break; >>>> >>>> This is not sufficient - what if sh_link is out of bounds, but the >>>> bogusly accessed field happens to hold SHT_STRTAB? (Oddly >>>> enough you have at least an SHN_UNDEF check in the second >>>> loop below.) >>> >>> The out-of-bounds access would be detected by elf_access_ok (if it's out >>> of the scope of the ELF file, which is all we care about). In fact the >>> elf_access_ok above could be removed because elf_uval already performs >>> out-of-bounds checks. The result is definitely no worse that what we are >>> doing ATM. >> >> No, the out of bounds check should be more strict than just >> considering the whole image: The image is broken if the link >> points to a non-existing section. > > Ah, do you mean I should mark the image as broken if either of the > checks fail? Perhaps, but my main point continue to be that there is a check missing here. >>>>> @@ -186,79 +199,169 @@ void elf_parse_bsdsyms(struct elf_binary *elf, >>>>> uint64_t pstart) >>>>> >>>>> static void elf_load_bsdsyms(struct elf_binary *elf) >>>>> { >>>>> - ELF_HANDLE_DECL(elf_ehdr) sym_ehdr; >>>>> - unsigned long sz; >>>>> - elf_ptrval maxva; >>>>> - elf_ptrval symbase; >>>>> - elf_ptrval symtab_addr; >>>>> - ELF_HANDLE_DECL(elf_shdr) shdr; >>>>> - unsigned i, type; >>>>> + /* >>>>> + * Header that is placed at the end of the kernel and allows >>>>> + * the OS to find where the symtab and strtab have been loaded. >>>>> + * It mimics a valid ELF file header, although it only contains >>>>> + * a symtab and a strtab section. >>>>> + * >>>>> + * NB: according to the ELF spec there's only ONE symtab per ELF >>>>> + * file, and accordingly we will only load the corresponding >>>>> + * strtab, so we only need three section headers in our fake ELF >>>>> + * header (first section header is always a dummy). >>>>> + */ >>>>> + struct { >>>>> + uint32_t size; >>>>> + struct { >>>>> + elf_ehdr header; >>>>> + elf_shdr section[3]; >>>>> + } __attribute__((packed)) elf_header; >>>>> + } __attribute__((packed)) header; >>>> >>>> If this is placed at the end, how can the OS reasonably use it >>>> without knowing that there are exactly 3 section header >>>> entries? I.e. how would it find the base of this structure? >>> >>> See the commend below, the base is found at the end of the kernel, and >>> then the ELF header contains the right pointers to the sections headers >>> (by appropriately setting e_shoff). >> >> Thing is - I can't see which "comment below" you try to refer me to. > > I was trying to point you to the big diagram/layout that I've posted at > the start of the email. > > It doesn't need to know there are exactly 3 sections, this is fetched > form the ELF header e_shnum field. And the ELF header is fetched using > the "end" symbol as a reference to pend. Ah, okay, so the ABI is _not_ for the kernel to start looking from the end, but to start looking from its own image end. >>>>> + /* Zero the dummy section. */ >>>> >>>> Dummy? And anyway I think you mean "section header" here. >>> >>> Yes, the right comment would be: zero the dummy section header. >> >> But then still - why "dummy"? The 0-th section header has a purpose >> beyond being a filler, even if that purpose doesn't matter for the >> intentions you have here. > > OK, what about if I replace the comment with: > > /* Zero the undefined section header. */ > > I think that's more inline with the specification. I could also fill it > with specific values if you think it's clearer: > > elf_store_field_bitness(..., sh_name, 0); > elf_store_field_bitness(..., sh_type, SHT_NULL); > elf_store_field_bitness(..., sh_flags, 0); > [...] > > Which is equivalent to zeroing it. Just zeroing is fine afaic. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 3/4] libelf: rewrite symtab/strtab loading 2016-02-29 12:14 ` Jan Beulich @ 2016-02-29 16:20 ` Roger Pau Monné 2016-02-29 16:41 ` Jan Beulich 0 siblings, 1 reply; 39+ messages in thread From: Roger Pau Monné @ 2016-02-29 16:20 UTC (permalink / raw) To: Jan Beulich; +Cc: xen-devel El 29/2/16 a les 13:14, Jan Beulich ha escrit: >>>> On 29.02.16 at 11:57, <roger.pau@citrix.com> wrote: >> El 29/2/16 a les 10:31, Jan Beulich ha escrit: >>>>>> On 26.02.16 at 18:02, <roger.pau@citrix.com> wrote: >>>> The layout is as follows (I should add this to the patch itself as a >>>> comment, since I guess this is still quite confusing): >>>> >>>> +------------------------+ >>>> | | >>>> | KERNEL | >>>> | | >>>> +------------------------+ pend >>>> | ALIGNMENT | >>>> +------------------------+ bsd_symtab_pstart >>>> | | >>>> | size | bsd_symtab_pend - bsd_symtab_pstart >>>> | | >>>> +------------------------+ >>>> | | >>>> | ELF header | >>>> | | >>>> +------------------------+ >>>> | | >>>> | ELF section header 0 | Dummy section header >>>> | | >>>> +------------------------+ >>>> | | >>>> | ELF section header 1 | SYMTAB section header >>>> | | >>>> +------------------------+ >>>> | | >>>> | ELF section header 2 | STRTAB section header >>>> | | >>>> +------------------------+ >>>> | | >>>> | SYMTAB | >>>> | | >>>> +------------------------+ >>>> | | >>>> | STRTAB | >>>> | | >>>> +------------------------+ bsd_symtab_pend >>>> >>>> There are always only tree sections because that's all we need, section >>>> 0 is ignored, section 1 is used to describe the SYMTAB, and section 2 is >>>> used to describe the STRTAB. >>> >>> All fine, but this still doesn't clarify how the kernel learns where >>> bsd_symtab_pstart is. >> >> The BSDs linker scripts places an "end" symbol after all the loaded >> sections: >> >> https://svnweb.freebsd.org/base/head/sys/conf/ldscript.amd64?revision=284870& >> view=co > > That's fine. But how do they know it is legitimate to even touch what > follows, not to speak of assign meaning to the value found there? The kernel signals that it wants it's SYMTAB/STRTAB loaded using the XEN_ELFNOTE_BSD_SYMTAB ELFNOTE. Then AFAICT it's just a matter of 'faith', there's no signal from Xen to the guest kernel in order to confirm that the SYMTAB has indeed been loaded... There's the ELF magic in the header, that one can check in order to make sure, but apart from that I don't think there's anything else. > And are there no alignment/padding considerations necessary? bsd_symtab_pstart is aligned to 4 or 8 bytes (depending on the kernel bitness). IMHO, the best way to solve this would be to pass the SYMTAB and STRTAB as modules for PVH (using the new module list that was introduced with the new boot ABI), and use the module command line to signal which one is the strtab and which one is the symtab. I think this can be done in a backwards compatible way, but this is out of the scope of this patch. >>>>>> sz = elf_round_up(elf, sz); >>>>>> >>>>>> - /* Space for the symbol and string tables. */ >>>>>> + /* Space for the symbol and string table. */ >>>>>> for ( i = 0; i < elf_shdr_count(elf); i++ ) >>>>>> { >>>>>> shdr = elf_shdr_by_index(elf, i); >>>>>> if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) >>>>>> /* input has an insane section header count field */ >>>>>> break; >>>>>> - type = elf_uval(elf, shdr, sh_type); >>>>>> - if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) ) >>>>>> - sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); >>>>>> + >>>>>> + if ( elf_uval(elf, shdr, sh_type) != SHT_SYMTAB ) >>>>>> + continue; >>>>>> + >>>>>> + sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); >>>>>> + shdr = elf_shdr_by_index(elf, elf_uval(elf, shdr, sh_link)); >>>>>> + >>>>>> + if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) >>>>>> + /* input has an insane section header count field */ >>>>>> + break; >>>>>> + >>>>>> + if ( elf_uval(elf, shdr, sh_type) != SHT_STRTAB ) >>>>>> + /* Invalid symtab -> strtab link */ >>>>>> + break; >>>>> >>>>> This is not sufficient - what if sh_link is out of bounds, but the >>>>> bogusly accessed field happens to hold SHT_STRTAB? (Oddly >>>>> enough you have at least an SHN_UNDEF check in the second >>>>> loop below.) >>>> >>>> The out-of-bounds access would be detected by elf_access_ok (if it's out >>>> of the scope of the ELF file, which is all we care about). In fact the >>>> elf_access_ok above could be removed because elf_uval already performs >>>> out-of-bounds checks. The result is definitely no worse that what we are >>>> doing ATM. >>> >>> No, the out of bounds check should be more strict than just >>> considering the whole image: The image is broken if the link >>> points to a non-existing section. >> >> Ah, do you mean I should mark the image as broken if either of the >> checks fail? > > Perhaps, but my main point continue to be that there is a check > missing here. I'm quite sure I'm missing something, but what kind of extra checks do you envision? AFAICT, we already check that the section we are trying to load is not out-of-bounds (both elf_access_ok and elf_uval make sure of that), and that it has the right type (STRTAB). Apart from that, I don't know what else to check. There's no signature in the section headers in order to check it's integrity. Roger. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 3/4] libelf: rewrite symtab/strtab loading 2016-02-29 16:20 ` Roger Pau Monné @ 2016-02-29 16:41 ` Jan Beulich 0 siblings, 0 replies; 39+ messages in thread From: Jan Beulich @ 2016-02-29 16:41 UTC (permalink / raw) To: Roger Pau Monné; +Cc: xen-devel >>> On 29.02.16 at 17:20, <roger.pau@citrix.com> wrote: > El 29/2/16 a les 13:14, Jan Beulich ha escrit: >>>>> On 29.02.16 at 11:57, <roger.pau@citrix.com> wrote: >>> El 29/2/16 a les 10:31, Jan Beulich ha escrit: >>>>>>> On 26.02.16 at 18:02, <roger.pau@citrix.com> wrote: >>>>>>> - /* Space for the symbol and string tables. */ >>>>>>> + /* Space for the symbol and string table. */ >>>>>>> for ( i = 0; i < elf_shdr_count(elf); i++ ) >>>>>>> { >>>>>>> shdr = elf_shdr_by_index(elf, i); >>>>>>> if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) >>>>>>> /* input has an insane section header count field */ >>>>>>> break; >>>>>>> - type = elf_uval(elf, shdr, sh_type); >>>>>>> - if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) ) >>>>>>> - sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); >>>>>>> + >>>>>>> + if ( elf_uval(elf, shdr, sh_type) != SHT_SYMTAB ) >>>>>>> + continue; >>>>>>> + >>>>>>> + sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); >>>>>>> + shdr = elf_shdr_by_index(elf, elf_uval(elf, shdr, sh_link)); >>>>>>> + >>>>>>> + if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) >>>>>>> + /* input has an insane section header count field */ >>>>>>> + break; >>>>>>> + >>>>>>> + if ( elf_uval(elf, shdr, sh_type) != SHT_STRTAB ) >>>>>>> + /* Invalid symtab -> strtab link */ >>>>>>> + break; >>>>>> >>>>>> This is not sufficient - what if sh_link is out of bounds, but the >>>>>> bogusly accessed field happens to hold SHT_STRTAB? (Oddly >>>>>> enough you have at least an SHN_UNDEF check in the second >>>>>> loop below.) >>>>> >>>>> The out-of-bounds access would be detected by elf_access_ok (if it's out >>>>> of the scope of the ELF file, which is all we care about). In fact the >>>>> elf_access_ok above could be removed because elf_uval already performs >>>>> out-of-bounds checks. The result is definitely no worse that what we are >>>>> doing ATM. >>>> >>>> No, the out of bounds check should be more strict than just >>>> considering the whole image: The image is broken if the link >>>> points to a non-existing section. >>> >>> Ah, do you mean I should mark the image as broken if either of the >>> checks fail? >> >> Perhaps, but my main point continue to be that there is a check >> missing here. > > I'm quite sure I'm missing something, but what kind of extra checks do > you envision? 0 < sh_link < elf_shdr_count(elf) Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* [PATCH v4 4/4] libxl: fix cd-eject 2016-02-16 17:37 [PATCH v4 0/4] Assorted fixes and improvements Roger Pau Monne ` (2 preceding siblings ...) 2016-02-16 17:37 ` [PATCH v4 3/4] libelf: rewrite symtab/strtab loading Roger Pau Monne @ 2016-02-16 17:37 ` Roger Pau Monne 2016-02-16 17:58 ` Ian Jackson 3 siblings, 1 reply; 39+ messages in thread From: Roger Pau Monne @ 2016-02-16 17:37 UTC (permalink / raw) To: xen-devel Cc: Wei Liu, Alex Braunegg, Ian Jackson, Ian Campbell, Roger Pau Monne Current libxl__device_disk_set_backend implementation tried to guess the backend of devices with format LIBXL_DISK_FORMAT_EMPTY, which is of course doomed to fail since the disk is empty. Instead just return early from the function if an empty disk is detected. This fixes cd ejection. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reported-by: Alex Braunegg <alex.braunegg@gmail.com> --- Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Alex Braunegg <alex.braunegg@gmail.com> --- tools/libxl/libxl_device.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c index 8bb5e93..b93cbbf 100644 --- a/tools/libxl/libxl_device.c +++ b/tools/libxl/libxl_device.c @@ -273,7 +273,8 @@ int libxl__device_disk_set_backend(libxl__gc *gc, libxl_device_disk *disk) { LOG(ERROR, "Disk vdev=%s is empty but not cdrom", disk->vdev); return ERROR_INVAL; } - memset(&a.stab, 0, sizeof(a.stab)); + /* Disk is empty, so it's useless to try to guess the backend type. */ + return 0; } else if ((disk->backend == LIBXL_DISK_BACKEND_UNKNOWN || disk->backend == LIBXL_DISK_BACKEND_PHY) && disk->backend_domid == LIBXL_TOOLSTACK_DOMID && -- 2.5.4 (Apple Git-61) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply related [flat|nested] 39+ messages in thread
* Re: [PATCH v4 4/4] libxl: fix cd-eject 2016-02-16 17:37 ` [PATCH v4 4/4] libxl: fix cd-eject Roger Pau Monne @ 2016-02-16 17:58 ` Ian Jackson 2016-02-17 11:20 ` Roger Pau Monné 0 siblings, 1 reply; 39+ messages in thread From: Ian Jackson @ 2016-02-16 17:58 UTC (permalink / raw) To: Roger Pau Monne; +Cc: xen-devel, Wei Liu, Ian Campbell, Alex Braunegg Roger Pau Monne writes ("[PATCH v4 4/4] libxl: fix cd-eject"): > Current libxl__device_disk_set_backend implementation tried to guess the > backend of devices with format LIBXL_DISK_FORMAT_EMPTY, which is of course > doomed to fail since the disk is empty. Instead just return early from the > function if an empty disk is detected. > > This fixes cd ejection. DYK when this was broken ? Or, to put it another way, how did this ever work ? ...looking at the code... AFAICT disk_try_backend should succeed for both LIBXL_DISK_BACKEND_PHY and LIBXL_DISK_BACKEND_QDISK. So even before your patch: > } > - memset(&a.stab, 0, sizeof(a.stab)); > + /* Disk is empty, so it's useless to try to guess the backend type. */ > + return 0; > } else if ((disk->backend == LIBXL_DISK_BACKEND_UNKNOWN || libxl__device_disk_set_backend should work. Worse, this change seems to leave disk->backend unset on return from libxl__device_disk_set_backend, which seems quite wrong to me. Ian. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 4/4] libxl: fix cd-eject 2016-02-16 17:58 ` Ian Jackson @ 2016-02-17 11:20 ` Roger Pau Monné 2016-02-17 11:42 ` Ian Campbell 2016-02-17 12:15 ` Ian Jackson 0 siblings, 2 replies; 39+ messages in thread From: Roger Pau Monné @ 2016-02-17 11:20 UTC (permalink / raw) To: Ian Jackson; +Cc: xen-devel, Wei Liu, Ian Campbell, Alex Braunegg El 16/2/16 a les 18:58, Ian Jackson ha escrit: > Roger Pau Monne writes ("[PATCH v4 4/4] libxl: fix cd-eject"): >> Current libxl__device_disk_set_backend implementation tried to guess the >> backend of devices with format LIBXL_DISK_FORMAT_EMPTY, which is of course >> doomed to fail since the disk is empty. Instead just return early from the >> function if an empty disk is detected. >> >> This fixes cd ejection. > > DYK when this was broken ? Or, to put it another way, how did this > ever work ? > > ...looking at the code... > > AFAICT disk_try_backend should succeed for both LIBXL_DISK_BACKEND_PHY > and LIBXL_DISK_BACKEND_QDISK. So even before your patch: > >> } >> - memset(&a.stab, 0, sizeof(a.stab)); >> + /* Disk is empty, so it's useless to try to guess the backend type. */ >> + return 0; >> } else if ((disk->backend == LIBXL_DISK_BACKEND_UNKNOWN || > > libxl__device_disk_set_backend should work. I've been looking at the code and I cannot really understand how this is supposed to work before, none of the recent changes seem to have broken it AFAICT, and the issue has been there for a long time (~1year), which IMHO makes no sense, so I'm quite sure I'm missing something. The problem is that in libxl_cdrom_insert the backend of the input "disk" struct is set based on the backend that the disk with the same vdev is using (see libxl.c:~2910). So if you have a CD plugged in with a PHY backend, the backend of the input disk will also be set to PHY. Then when you try to unplug it, the backend of the empty disk will also be set to PHY, and disk_try_backend will fail miserably. In this case since the backend is set _before_ calling libxl__device_disk_set_backend no other backend is tested, and an error is returned. I guess that all this steams from when we switched from handling RAW files from QDISK to blkback (PHY). That was quite some time ago IIRC, and I found it weird that nobody noticed this before. Prior to this change the backend used for CDROM would be QDISK, which should make everything work as expected (I've locally reverted a0a2dc and it solves the issue). Should I just force all devices of type CDROM to use QDISK as the backend? Should we allow the PHY backend to handle empty files? (pdev_path == NULL || pdev_path == ""). Roger. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 4/4] libxl: fix cd-eject 2016-02-17 11:20 ` Roger Pau Monné @ 2016-02-17 11:42 ` Ian Campbell 2016-02-17 12:15 ` Ian Jackson 1 sibling, 0 replies; 39+ messages in thread From: Ian Campbell @ 2016-02-17 11:42 UTC (permalink / raw) To: Roger Pau Monné, Ian Jackson; +Cc: xen-devel, Wei Liu, Alex Braunegg On Wed, 2016-02-17 at 12:20 +0100, Roger Pau Monné wrote: > El 16/2/16 a les 18:58, Ian Jackson ha escrit: > > Roger Pau Monne writes ("[PATCH v4 4/4] libxl: fix cd-eject"): > > > Current libxl__device_disk_set_backend implementation tried to guess > > > the > > > backend of devices with format LIBXL_DISK_FORMAT_EMPTY, which is of > > > course > > > doomed to fail since the disk is empty. Instead just return early > > > from the > > > function if an empty disk is detected. > > > > > > This fixes cd ejection. > > > > DYK when this was broken ? Or, to put it another way, how did this > > ever work ? > > > > ...looking at the code... > > > > AFAICT disk_try_backend should succeed for both LIBXL_DISK_BACKEND_PHY > > and LIBXL_DISK_BACKEND_QDISK. So even before your patch: > > > > > } > > > - memset(&a.stab, 0, sizeof(a.stab)); > > > + /* Disk is empty, so it's useless to try to guess the > > > backend type. */ > > > + return 0; > > > } else if ((disk->backend == LIBXL_DISK_BACKEND_UNKNOWN || > > > > libxl__device_disk_set_backend should work. > > I've been looking at the code and I cannot really understand how this is > supposed to work before, none of the recent changes seem to have broken > it AFAICT, and the issue has been there for a long time (~1year), which > IMHO makes no sense, so I'm quite sure I'm missing something. cd-insert/eject seems to me to either be perpetually broken or is incredibly prone to regressing (i..e this keeps coming up). Getting a test step into osstest which used cd-insert/eject would be a good idea IMHO, either a deliberate test step which checks it works or trying to use it as a matter of course during e.g. guest install. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 4/4] libxl: fix cd-eject 2016-02-17 11:20 ` Roger Pau Monné 2016-02-17 11:42 ` Ian Campbell @ 2016-02-17 12:15 ` Ian Jackson 2016-02-17 17:20 ` [PATCH v6] libxl: allow 'phy' backend to use empty files Roger Pau Monne 1 sibling, 1 reply; 39+ messages in thread From: Ian Jackson @ 2016-02-17 12:15 UTC (permalink / raw) To: Roger Pau Monné; +Cc: xen-devel, Wei Liu, Ian Campbell, Alex Braunegg Roger Pau Monné writes ("Re: [PATCH v4 4/4] libxl: fix cd-eject"): > Should we allow the PHY backend to handle empty files? > (pdev_path == NULL || pdev_path == ""). Yes. That is how it is supposed to work. I think this was broken in in 97ee1f5d "libxl: add support for image files for NetBSD" in 2011 (!) where this happened: - if (a->disk->format != LIBXL_DISK_FORMAT_EMPTY && - !S_ISBLK(a->stab.st_mode)) { - LIBXL__LOG(ctx, LIBXL__LOG_DEBUG, "Disk vdev=%s, backend phy" - " unsuitable as phys path not a block device", - a->disk->vdev); - return 0; - } - return backend; + if (libxl__try_phy_backend(a->stab.st_mode)) + return backend; ... [libxl_linux.c:] ... +int libxl__try_phy_backend(mode_t st_mode) +{ + if (!S_ISBLK(st_mode)) { + return 0; + } + + return 1; +} Note that the "a->disk->format != LIBXL_DISK_FORMAT_EMPTY" clause has been lost. Ian. ^ permalink raw reply [flat|nested] 39+ messages in thread
* [PATCH v6] libxl: allow 'phy' backend to use empty files 2016-02-17 12:15 ` Ian Jackson @ 2016-02-17 17:20 ` Roger Pau Monne 2016-02-18 10:27 ` Alex Braunegg ` (2 more replies) 0 siblings, 3 replies; 39+ messages in thread From: Roger Pau Monne @ 2016-02-17 17:20 UTC (permalink / raw) To: xen-devel Cc: Wei Liu, Alex Braunegg, Ian Jackson, Ian Campbell, Roger Pau Monne This was introduced by 97ee1f (~5 years ago), but was probably never surfaced because most people used regular files as CDROM images, so the PHY backend was actually never selected. A year ago this was changed, and now regular RAW files are also handled by the PHY backend, which has made this bug surface. Fix it by allowing empty disks to use the PHY backend, skipping the stat tests. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reported-by: Alex Braunegg <alex.braunegg@gmail.com> --- Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Alex Braunegg <alex.braunegg@gmail.com> --- Changes since v4: - Split form the rest of the series. - Fix disk_try_backend. --- tools/libxl/libxl_device.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c index 8bb5e93..2e08108 100644 --- a/tools/libxl/libxl_device.c +++ b/tools/libxl/libxl_device.c @@ -196,6 +196,14 @@ static int disk_try_backend(disk_try_backend_args *a, goto bad_format; } + if (a->disk->format == LIBXL_DISK_FORMAT_EMPTY) { + assert(a->disk->pdev_path == NULL || + !strcmp(a->disk->pdev_path, "")); + LOG(DEBUG, "Disk vdev=%s is empty, skipping physical device check", + a->disk->vdev); + return backend; + } + if (a->disk->backend_domid != LIBXL_TOOLSTACK_DOMID) { LOG(DEBUG, "Disk vdev=%s, is using a storage driver domain, " "skipping physical device check", a->disk->vdev); -- 2.5.4 (Apple Git-61) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply related [flat|nested] 39+ messages in thread
* Re: [PATCH v6] libxl: allow 'phy' backend to use empty files 2016-02-17 17:20 ` [PATCH v6] libxl: allow 'phy' backend to use empty files Roger Pau Monne @ 2016-02-18 10:27 ` Alex Braunegg 2016-02-19 17:30 ` Ian Jackson 2016-04-05 16:48 ` [PATCH v6] " George Dunlap 2 siblings, 0 replies; 39+ messages in thread From: Alex Braunegg @ 2016-02-18 10:27 UTC (permalink / raw) To: 'Roger Pau Monne', xen-devel Cc: 'Wei Liu', 'Ian Jackson', 'Ian Campbell' Hi Roger, I have tested and applied the patch locally, and can confirm that the issue reported is now resolved. Best regards, Alex -----Original Message----- From: Roger Pau Monne [mailto:roger.pau@citrix.com] Sent: Thursday, 18 February 2016 4:21 AM To: xen-devel@lists.xenproject.org Cc: Roger Pau Monne; Ian Jackson; Ian Campbell; Wei Liu; Alex Braunegg Subject: [PATCH v6] libxl: allow 'phy' backend to use empty files This was introduced by 97ee1f (~5 years ago), but was probably never surfaced because most people used regular files as CDROM images, so the PHY backend was actually never selected. A year ago this was changed, and now regular RAW files are also handled by the PHY backend, which has made this bug surface. Fix it by allowing empty disks to use the PHY backend, skipping the stat tests. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reported-by: Alex Braunegg <alex.braunegg@gmail.com> --- Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Alex Braunegg <alex.braunegg@gmail.com> --- Changes since v4: - Split form the rest of the series. - Fix disk_try_backend. --- tools/libxl/libxl_device.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c index 8bb5e93..2e08108 100644 --- a/tools/libxl/libxl_device.c +++ b/tools/libxl/libxl_device.c @@ -196,6 +196,14 @@ static int disk_try_backend(disk_try_backend_args *a, goto bad_format; } + if (a->disk->format == LIBXL_DISK_FORMAT_EMPTY) { + assert(a->disk->pdev_path == NULL || + !strcmp(a->disk->pdev_path, "")); + LOG(DEBUG, "Disk vdev=%s is empty, skipping physical device check", + a->disk->vdev); + return backend; + } + if (a->disk->backend_domid != LIBXL_TOOLSTACK_DOMID) { LOG(DEBUG, "Disk vdev=%s, is using a storage driver domain, " "skipping physical device check", a->disk->vdev); -- 2.5.4 (Apple Git-61) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply related [flat|nested] 39+ messages in thread
* Re: [PATCH v6] libxl: allow 'phy' backend to use empty files 2016-02-17 17:20 ` [PATCH v6] libxl: allow 'phy' backend to use empty files Roger Pau Monne 2016-02-18 10:27 ` Alex Braunegg @ 2016-02-19 17:30 ` Ian Jackson 2016-02-19 17:41 ` Roger Pau Monné 2016-04-05 16:48 ` [PATCH v6] " George Dunlap 2 siblings, 1 reply; 39+ messages in thread From: Ian Jackson @ 2016-02-19 17:30 UTC (permalink / raw) To: Roger Pau Monne; +Cc: xen-devel, Wei Liu, Ian Campbell, Alex Braunegg Roger Pau Monne writes ("[PATCH v6] libxl: allow 'phy' backend to use empty files"): > This was introduced by 97ee1f (~5 years ago), but was probably never > surfaced because most people used regular files as CDROM images, so the PHY > backend was actually never selected. A year ago this was changed, and now > regular RAW files are also handled by the PHY backend, which has made this > bug surface. > > Fix it by allowing empty disks to use the PHY backend, skipping the stat > tests. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > Reported-by: Alex Braunegg <alex.braunegg@gmail.com> Thanks, and thanks to Alex for the testing, but: > diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c > index 8bb5e93..2e08108 100644 > --- a/tools/libxl/libxl_device.c > +++ b/tools/libxl/libxl_device.c > @@ -196,6 +196,14 @@ static int disk_try_backend(disk_try_backend_args *a, > goto bad_format; > } > > + if (a->disk->format == LIBXL_DISK_FORMAT_EMPTY) { > + assert(a->disk->pdev_path == NULL || > + !strcmp(a->disk->pdev_path, "")); I agree that these things ought to be true but I don't see what code in libxl ensures that they definitely are. I think this check ought to be moved to libxl__device_disk_set_backend or perhaps even earlier, and should generate an ERROR_INVAL rather than an assertion failure. The rest of the logic LGTM. Thanks, Ian. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v6] libxl: allow 'phy' backend to use empty files 2016-02-19 17:30 ` Ian Jackson @ 2016-02-19 17:41 ` Roger Pau Monné 2016-02-19 18:01 ` [PATCH v7] " Roger Pau Monne 0 siblings, 1 reply; 39+ messages in thread From: Roger Pau Monné @ 2016-02-19 17:41 UTC (permalink / raw) To: Ian Jackson; +Cc: xen-devel, Wei Liu, Ian Campbell, Alex Braunegg El 19/2/16 a les 18:30, Ian Jackson ha escrit: > Roger Pau Monne writes ("[PATCH v6] libxl: allow 'phy' backend to use empty files"): >> This was introduced by 97ee1f (~5 years ago), but was probably never >> surfaced because most people used regular files as CDROM images, so the PHY >> backend was actually never selected. A year ago this was changed, and now >> regular RAW files are also handled by the PHY backend, which has made this >> bug surface. >> >> Fix it by allowing empty disks to use the PHY backend, skipping the stat >> tests. >> >> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> >> Reported-by: Alex Braunegg <alex.braunegg@gmail.com> > > Thanks, and thanks to Alex for the testing, but: > >> diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c >> index 8bb5e93..2e08108 100644 >> --- a/tools/libxl/libxl_device.c >> +++ b/tools/libxl/libxl_device.c >> @@ -196,6 +196,14 @@ static int disk_try_backend(disk_try_backend_args *a, >> goto bad_format; >> } >> >> + if (a->disk->format == LIBXL_DISK_FORMAT_EMPTY) { >> + assert(a->disk->pdev_path == NULL || >> + !strcmp(a->disk->pdev_path, "")); > > I agree that these things ought to be true but I don't see what code > in libxl ensures that they definitely are. > > I think this check ought to be moved to libxl__device_disk_set_backend, > or perhaps even earlier, and should generate an ERROR_INVAL rather > than an assertion failure. Thanks, libxl can return EINVAL without problems from libxl__device_disk_set_backend, so I think it's a fine place to put this check. libxl__device_disk_setdefault should also be a good place, but I feel _backend is where it makes more sense. v7 is on the way. Roger. ^ permalink raw reply [flat|nested] 39+ messages in thread
* [PATCH v7] libxl: allow 'phy' backend to use empty files 2016-02-19 17:41 ` Roger Pau Monné @ 2016-02-19 18:01 ` Roger Pau Monne 2016-03-01 9:51 ` Roger Pau Monné 2016-03-03 15:41 ` Ian Jackson 0 siblings, 2 replies; 39+ messages in thread From: Roger Pau Monne @ 2016-02-19 18:01 UTC (permalink / raw) To: xen-devel Cc: Wei Liu, Alex Braunegg, Ian Jackson, Ian Campbell, Roger Pau Monne This was introduced by 97ee1f (~5 years ago), but was probably never surfaced because most people used regular files as CDROM images, so the PHY backend was actually never selected. A year ago this was changed, and now regular RAW files are also handled by the PHY backend, which has made this bug suface. Fix it by allowing empty disks to use the PHY backend, skipping the stat tests. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reported-by: Alex Braunegg <alex.braunegg@gmail.com> --- Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Alex Braunegg <alex.braunegg@gmail.com> --- Changes since v6: - Turn the assert into a check at libxl__device_disk_set_backend. Changes since v4: - Split form the rest of the series. - Fix disk_try_backend. --- tools/libxl/libxl_device.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c index 8bb5e93..e0a81e3 100644 --- a/tools/libxl/libxl_device.c +++ b/tools/libxl/libxl_device.c @@ -196,6 +196,12 @@ static int disk_try_backend(disk_try_backend_args *a, goto bad_format; } + if (a->disk->format == LIBXL_DISK_FORMAT_EMPTY) { + LOG(DEBUG, "Disk vdev=%s is empty, skipping physical device check", + a->disk->vdev); + return backend; + } + if (a->disk->backend_domid != LIBXL_TOOLSTACK_DOMID) { LOG(DEBUG, "Disk vdev=%s, is using a storage driver domain, " "skipping physical device check", a->disk->vdev); @@ -273,6 +279,12 @@ int libxl__device_disk_set_backend(libxl__gc *gc, libxl_device_disk *disk) { LOG(ERROR, "Disk vdev=%s is empty but not cdrom", disk->vdev); return ERROR_INVAL; } + if (disk->pdev_path != NULL && strcmp(disk->pdev_path, "")) { + LOG(ERROR, + "Disk vdev=%s is empty but an image has been provided: %s", + disk->vdev, disk->pdev_path); + return ERROR_INVAL; + } memset(&a.stab, 0, sizeof(a.stab)); } else if ((disk->backend == LIBXL_DISK_BACKEND_UNKNOWN || disk->backend == LIBXL_DISK_BACKEND_PHY) && -- 2.5.4 (Apple Git-61) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply related [flat|nested] 39+ messages in thread
* Re: [PATCH v7] libxl: allow 'phy' backend to use empty files 2016-02-19 18:01 ` [PATCH v7] " Roger Pau Monne @ 2016-03-01 9:51 ` Roger Pau Monné 2016-03-03 15:41 ` Ian Jackson 1 sibling, 0 replies; 39+ messages in thread From: Roger Pau Monné @ 2016-03-01 9:51 UTC (permalink / raw) To: xen-devel; +Cc: Wei Liu, Ian Jackson, Ian Campbell, Alex Braunegg El 19/2/16 a les 19:01, Roger Pau Monne ha escrit: > This was introduced by 97ee1f (~5 years ago), but was probably never > surfaced because most people used regular files as CDROM images, so the PHY > backend was actually never selected. A year ago this was changed, and now > regular RAW files are also handled by the PHY backend, which has made this > bug suface. > > Fix it by allowing empty disks to use the PHY backend, skipping the stat > tests. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > Reported-by: Alex Braunegg <alex.braunegg@gmail.com> > --- > Cc: Ian Jackson <ian.jackson@eu.citrix.com> > Cc: Ian Campbell <ian.campbell@citrix.com> > Cc: Wei Liu <wei.liu2@citrix.com> > Cc: Alex Braunegg <alex.braunegg@gmail.com> > --- > Changes since v6: > - Turn the assert into a check at libxl__device_disk_set_backend. > > Changes since v4: > - Split form the rest of the series. > - Fix disk_try_backend. Ping? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v7] libxl: allow 'phy' backend to use empty files 2016-02-19 18:01 ` [PATCH v7] " Roger Pau Monne 2016-03-01 9:51 ` Roger Pau Monné @ 2016-03-03 15:41 ` Ian Jackson 2016-03-31 16:20 ` Roger Pau Monné 1 sibling, 1 reply; 39+ messages in thread From: Ian Jackson @ 2016-03-03 15:41 UTC (permalink / raw) To: Roger Pau Monne; +Cc: xen-devel, Wei Liu, Ian Campbell, Alex Braunegg Roger Pau Monne writes ("[PATCH v7] libxl: allow 'phy' backend to use empty files"): > This was introduced by 97ee1f (~5 years ago), but was probably never > surfaced because most people used regular files as CDROM images, so the PHY > backend was actually never selected. A year ago this was changed, and now > regular RAW files are also handled by the PHY backend, which has made this > bug suface. > > Fix it by allowing empty disks to use the PHY backend, skipping the stat > tests. > diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c > index 8bb5e93..e0a81e3 100644 > --- a/tools/libxl/libxl_device.c > +++ b/tools/libxl/libxl_device.c > @@ -196,6 +196,12 @@ static int disk_try_backend(disk_try_backend_args *a, > goto bad_format; > } > > + if (a->disk->format == LIBXL_DISK_FORMAT_EMPTY) { > + LOG(DEBUG, "Disk vdev=%s is empty, skipping physical device check", > + a->disk->vdev); > + return backend; This implicitly assumes that every backend can cope with absent devices. I don't think that is necessarily true. I think this check should be in what is now libxl__try_phy_backend. And skipping the other tests is probably not right either. For example, checking the backend_domid is probably still necessary (and maybe the script too). For LIBXL_DISK_BACKEND_TAP, we still need to check libxl__blktap_enabled. etc. > @@ -273,6 +279,12 @@ int libxl__device_disk_set_backend(libxl__gc *gc, libxl_device_disk *disk) { > LOG(ERROR, "Disk vdev=%s is empty but not cdrom", disk->vdev); > return ERROR_INVAL; > } > + if (disk->pdev_path != NULL && strcmp(disk->pdev_path, "")) { > + LOG(ERROR, > + "Disk vdev=%s is empty but an image has been provided: %s", > + disk->vdev, disk->pdev_path); > + return ERROR_INVAL; > + } This hunk is fine. Thanks, Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v7] libxl: allow 'phy' backend to use empty files 2016-03-03 15:41 ` Ian Jackson @ 2016-03-31 16:20 ` Roger Pau Monné 2016-04-01 14:06 ` Ian Jackson 0 siblings, 1 reply; 39+ messages in thread From: Roger Pau Monné @ 2016-03-31 16:20 UTC (permalink / raw) To: Ian Jackson Cc: xen-devel, Alex Braunegg, Wei Liu, Ian Campbell, Roger Pau Monne On Thu, 3 Mar 2016, Ian Jackson wrote: > Roger Pau Monne writes ("[PATCH v7] libxl: allow 'phy' backend to use empty files"): > > This was introduced by 97ee1f (~5 years ago), but was probably never > > surfaced because most people used regular files as CDROM images, so the PHY > > backend was actually never selected. A year ago this was changed, and now > > regular RAW files are also handled by the PHY backend, which has made this > > bug suface. > > > > Fix it by allowing empty disks to use the PHY backend, skipping the stat > > tests. > > > diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c > > index 8bb5e93..e0a81e3 100644 > > --- a/tools/libxl/libxl_device.c > > +++ b/tools/libxl/libxl_device.c > > @@ -196,6 +196,12 @@ static int disk_try_backend(disk_try_backend_args *a, > > goto bad_format; > > } > > > > + if (a->disk->format == LIBXL_DISK_FORMAT_EMPTY) { > > + LOG(DEBUG, "Disk vdev=%s is empty, skipping physical device check", > > + a->disk->vdev); > > + return backend; > > This implicitly assumes that every backend can cope with absent > devices. I don't think that is necessarily true. I think this check > should be in what is now libxl__try_phy_backend. This check is done inside of a switch branch that's only used by LIBXL_DISK_BACKEND_PHY, so it doesn't assume that every backend can handle empty files. > And skipping the other tests is probably not right either. For > example, checking the backend_domid is probably still necessary (and > maybe the script too). For LIBXL_DISK_BACKEND_TAP, we still need to > check libxl__blktap_enabled. etc. I can move the check to a lower place (after the other checks), but those are not error checking, they are just checks to make sure the right backend is used, just like this one. I don't see how altering their order is going to make any difference. Roger. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v7] libxl: allow 'phy' backend to use empty files 2016-03-31 16:20 ` Roger Pau Monné @ 2016-04-01 14:06 ` Ian Jackson 0 siblings, 0 replies; 39+ messages in thread From: Ian Jackson @ 2016-04-01 14:06 UTC (permalink / raw) To: Roger Pau Monné; +Cc: xen-devel, Wei Liu, Ian Campbell, Alex Braunegg Roger Pau Monné writes ("Re: [PATCH v7] libxl: allow 'phy' backend to use empty files"): > This check is done inside of a switch branch that's only used by > LIBXL_DISK_BACKEND_PHY, so it doesn't assume that every backend can handle > empty files. ... > I can move the check to a lower place (after the other checks), but those > are not error checking, they are just checks to make sure the right > backend is used, just like this one. I don't see how altering their order > is going to make any difference. Thanks for your replies, which convinced me. I have rebased your patch onto staging, fixing the trivial conflict with the colo series, and queued it for my next push. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v6] libxl: allow 'phy' backend to use empty files 2016-02-17 17:20 ` [PATCH v6] libxl: allow 'phy' backend to use empty files Roger Pau Monne 2016-02-18 10:27 ` Alex Braunegg 2016-02-19 17:30 ` Ian Jackson @ 2016-04-05 16:48 ` George Dunlap 2016-04-05 21:45 ` Alex Braunegg 2 siblings, 1 reply; 39+ messages in thread From: George Dunlap @ 2016-04-05 16:48 UTC (permalink / raw) To: Roger Pau Monne Cc: Ian Jackson, xen-devel, Wei Liu, Alex Braunegg, Ian Campbell On Wed, Feb 17, 2016 at 5:20 PM, Roger Pau Monne <roger.pau@citrix.com> wrote: > This was introduced by 97ee1f (~5 years ago), but was probably never > surfaced because most people used regular files as CDROM images, so the PHY > backend was actually never selected. A year ago this was changed, and now > regular RAW files are also handled by the PHY backend, which has made this > bug surface. > > Fix it by allowing empty disks to use the PHY backend, skipping the stat > tests. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > Reported-by: Alex Braunegg <alex.braunegg@gmail.com> So first of all, it's not 100% clear from this commit what the problem was that Alex reported. I take it that if he used a phy backend for a cdrom, that "xl cd-eject" failed? Unfortunately, after this changeset, creating a VM with an empty cdrom fails, because it tries to run the block script and the block script fails: # cat c6-01.cfg builder="hvm" name = "c6-01" memory = "2048" disk = [ 'format=raw,vdev=sda,target=/images/c6-01.raw','vdev=hdc,access=ro,devtype=cdrom' ] vif = [ 'mac=5a:39:bb:b6:84:b4' ] vcpus=1 on_crash = 'destroy' serial='pty' # xl create c6-01.cfg libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: /etc/xen/scripts/block add [7236] exited with error status 1 libxl: error: libxl_device.c:1114:device_hotplug_child_death_cb: script: /etc/xen/scripts/block failed; error detected. libxl: error: libxl_create.c:1247:domcreate_launch_dm: unable to add disk devices libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: /etc/xen/scripts/block remove [7319] exited with error status 1 libxl: error: libxl_device.c:1114:device_hotplug_child_death_cb: script: /etc/xen/scripts/block failed; error detected. libxl: error: libxl.c:1565:libxl__destroy_domid: non-existant domain 5 libxl: error: libxl.c:1523:domain_destroy_callback: unable to destroy guest with domid 5 libxl: error: libxl.c:1452:domain_destroy_cb: destruction of domain 5 failed -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v6] libxl: allow 'phy' backend to use empty files 2016-04-05 16:48 ` [PATCH v6] " George Dunlap @ 2016-04-05 21:45 ` Alex Braunegg 0 siblings, 0 replies; 39+ messages in thread From: Alex Braunegg @ 2016-04-05 21:45 UTC (permalink / raw) To: 'George Dunlap', 'Roger Pau Monne' Cc: 'xen-devel', 'Wei Liu', 'Ian Jackson', 'Ian Campbell' Hi George, > I take it that if he used a phy backend for a cdrom, that "xl cd-eject" failed? No - I was always using ISO images for cd-rom devices. In the 4.4 configuration I was specifying it as a file. In Xen 4.4 and 4.6 (before patching) whenever I attempted to perform an 'xl cd-eject' it always failed. I didn’t perform the triage of what commit broke the functionality - so I cant advise on if this is what broke it. A sample of the configurations of what I used is below: ================ # Xen 4.4 builder='hvm' memory = 512 name = 'CentOS_Test' disk = [ 'phy:/dev/zvol/storage/xen/CentOS_Test/disk_sda,hda,w','file:/storage/samba/xeniso/CentOS-6.5-x86_64-minimal.iso,hdc:cdrom,r' ] # boot on floppy (a), hard disk (c) or CD-ROM (d) # default: hard disk, cd-rom, floppy boot='cd' # Xen 4.6 builder='hvm' memory = 512 name = 'CentOS_Test' disk = [ '/dev/zvol/storage/xen/CentOS_Test/disk_sda,,hda','/storage/samba/xeniso/CentOS-6.5-x86_64-minimal.iso,,hdc,cdrom' ] # boot on floppy (a), hard disk (c) or CD-ROM (d) # default: hard disk, cd-rom, floppy boot='cd' ================ After patching Xen 4.6, I can now perform the xl cd-eject and load an alternate ISO into the VM without issue. However if I just want to eject the ISO as per what you would normally do on a physical system after install, I have to 'eject' but then insert a 'dummy' ISO file to keep the cd-rom device available to the VM through reboots / shutdowns: disk = [ '/dev/zvol/storage/xen/CentOS_Test/disk_sda,,hda','/storage/samba/xeniso/dummy.iso,,hdc,cdrom' ] Without having some sort of valid 'dummy' ISO file that contains nothing, Xen has issues about creating the cd-rom device on creation and reboots, and certainly in the VM no cd-rom device is then available - meaning down the track if I want to load an ISO I cannot easily 'insert' another ISO file without powering off the VM, making the configuration change and powering the VM back on again. It would be nice at some point to have the configuration where the cd-rom drive can presented to the VM without having a valid ISO / file loaded - which would then operate as per physical devices - cd-rom devices show up, but drive contents are empty: disk = [ '/dev/zvol/storage/xen/CentOS_Test/disk_sda,,hda',',,hdc,cdrom' ] Best regards, Alex _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 39+ messages in thread
end of thread, other threads:[~2016-04-05 21:45 UTC | newest] Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-02-16 17:37 [PATCH v4 0/4] Assorted fixes and improvements Roger Pau Monne 2016-02-16 17:37 ` [PATCH v4 1/4] x86/HVM: update the start info structure layout Roger Pau Monne 2016-02-16 19:13 ` Andrew Cooper 2016-02-16 20:06 ` Konrad Rzeszutek Wilk 2016-02-17 10:01 ` Roger Pau Monné 2016-02-16 21:26 ` Boris Ostrovsky 2016-02-17 9:58 ` Jan Beulich 2016-02-17 10:05 ` Roger Pau Monné 2016-02-17 14:39 ` Boris Ostrovsky 2016-02-17 14:54 ` Jan Beulich 2016-02-17 10:45 ` Samuel Thibault 2016-02-17 13:00 ` Jan Beulich 2016-02-16 17:37 ` [PATCH v4 2/4] libxl: introduce LIBXL_VGA_INTERFACE_TYPE_UNKNOWN Roger Pau Monne 2016-02-24 12:08 ` Wei Liu 2016-03-01 16:06 ` Ian Jackson 2016-02-16 17:37 ` [PATCH v4 3/4] libelf: rewrite symtab/strtab loading Roger Pau Monne 2016-02-26 13:15 ` Jan Beulich 2016-02-26 17:02 ` Roger Pau Monné 2016-02-29 9:31 ` Jan Beulich 2016-02-29 10:57 ` Roger Pau Monné 2016-02-29 12:14 ` Jan Beulich 2016-02-29 16:20 ` Roger Pau Monné 2016-02-29 16:41 ` Jan Beulich 2016-02-16 17:37 ` [PATCH v4 4/4] libxl: fix cd-eject Roger Pau Monne 2016-02-16 17:58 ` Ian Jackson 2016-02-17 11:20 ` Roger Pau Monné 2016-02-17 11:42 ` Ian Campbell 2016-02-17 12:15 ` Ian Jackson 2016-02-17 17:20 ` [PATCH v6] libxl: allow 'phy' backend to use empty files Roger Pau Monne 2016-02-18 10:27 ` Alex Braunegg 2016-02-19 17:30 ` Ian Jackson 2016-02-19 17:41 ` Roger Pau Monné 2016-02-19 18:01 ` [PATCH v7] " Roger Pau Monne 2016-03-01 9:51 ` Roger Pau Monné 2016-03-03 15:41 ` Ian Jackson 2016-03-31 16:20 ` Roger Pau Monné 2016-04-01 14:06 ` Ian Jackson 2016-04-05 16:48 ` [PATCH v6] " George Dunlap 2016-04-05 21:45 ` Alex Braunegg
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).