On 18.11.21 22:00, Stefano Stabellini wrote: > On Thu, 18 Nov 2021, Juergen Gross wrote: >> On 18.11.21 09:47, Jan Beulich wrote: >>> On 18.11.2021 06:32, Juergen Gross wrote: >>>> On 18.11.21 03:37, Stefano Stabellini wrote: >>>>> --- a/drivers/xen/xenbus/xenbus_probe.c >>>>> +++ b/drivers/xen/xenbus/xenbus_probe.c >>>>> @@ -951,6 +951,28 @@ static int __init xenbus_init(void) >>>>> err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v); >>>>> if (err) >>>>> goto out_error; >>>>> + /* >>>>> + * Return error on an invalid value. >>>>> + * >>>>> + * Uninitialized hvm_params are zero and return no error. >>>>> + * Although it is theoretically possible to have >>>>> + * HVM_PARAM_STORE_PFN set to zero on purpose, in reality it >>>>> is >>>>> + * not zero when valid. If zero, it means that Xenstore hasn't >>>>> + * been properly initialized. Instead of attempting to map a >>>>> + * wrong guest physical address return error. >>>>> + */ >>>>> + if (v == 0) { >>>> >>>> Make this "if (v == ULONG_MAX || v== 0)" instead? >>>> This would result in the same err on a new and an old hypervisor >>>> (assuming we switch the hypervisor to init params with ~0UL). > > Sure, I can do that > > >>>>> + err = -ENOENT; >>>>> + goto out_error; >>>>> + } >>>>> + /* >>>>> + * ULONG_MAX is invalid on 64-bit because is INVALID_PFN. >>>>> + * On 32-bit return error to avoid truncation. >>>>> + */ >>>>> + if (v >= ULONG_MAX) { >>>>> + err = -EINVAL; >>>>> + goto out_error; >>>>> + } >>>> >>>> Does it make sense to continue the system running in case of >>>> truncation? This would be a 32-bit guest with more than 16TB of RAM >>>> and the Xen tools decided to place the Xenstore ring page above the >>>> 16TB boundary. This is a completely insane scenario IMO. >>>> >>>> A proper panic() in this case would make diagnosis of that much >>>> easier (me having doubts that this will ever be hit, though). >>> >>> While I agree panic() may be an option here (albeit I'm not sure why >>> that would be better than trying to cope with 0 and hence without >> >> I could imagine someone wanting to run a guest without Xenstore access, >> which BTW will happen in case of a guest created by the hypervisor at >> boot time. >> >>> xenbus), I'd like to point out that the amount of RAM assigned to a >>> guest is unrelated to the choice of GFNs for the various "magic" >>> items. >> >> Yes, but this would still be a major tools problem which probably >> would render the whole guest rather unusable. > > First let's distinguish between an error due to "hvm_param not > initialized" and an error due to more serious conditions, such as "pfn > above max". > > "hvm_param not initialized" could mean v == 0 (as it would be today) or > v == ~0UL (if we change Xen to initialize all hvm_param to ~0UL). I > don't think we want to panic in these cases as they are not actually > true erroneous configurations. We should just stop trying to initialize > xenstore and continue with the rest. > > > The "pfn above max" case could happen if v is greater than the max pfn. > This is a true error in the configuration because the toolstack should > know that the guest is 32-bit so it should give it a pfn that the guest I don't think so. All x86 PVH/HVM guests start booting in 32-bit mode. > is able to use. As Jan wrote in another email, for 32-bit the actual > limit depends on the physical address bits but actually Linux has never > been able to cope with a pfn > ULONG_MAX on 32-bit because xen_store_gfn > is defined as unsigned long. So Linux 32-bit has been truncating > HVM_PARAM_STORE_PFN all along. The question is whether the number of physical address bits as presented to the guest is always >= 44. If not the actual limit is less than ULONG_MAX. Other than that you are right: a PFN larger than a 32-bit ULONG_MAX will be truncated by a 32-bit guest. > There is also an argument that depending on kconfig Linux 32-bit might > only be able to handle addresses < 4G, so I don't think the toolstack > can assume that a 32-bit guest is able to cope with HVM_PARAM_STORE_PFN >> ULONG_MAX. If Linux is 32-bit and HVM_PARAM_STORE_PFN > ULONG_MAX, > even if HVM_PARAM_STORE_PFN < address_bits_max I think it would be fair > to still consider it an error, but I can see it could be argued either > way. Certainly if HVM_PARAM_STORE_PFN > address_bits_max is an error. Right. The tools should NEVER put the frame above 4G for a non-PV guest. > In any case, I think it is still better for Linux to stop trying to > initialize Xenstore but continue with the rest because there is a bunch > of other useful things Linux can do without it. Panic should only be the > last resort if there is nothing else to do. In this case we haven't even > initialized the service and the service is not essential, at least it is > not essential in certain ARM setups. > > > So in conclusion, I think this patch should: > - if v == 0 return error (uninitialized) > - if v == ~0ULL (INVALID_PFN) return error (uinitialized) > - if v >= ~0UL (32-bit) return error (even if this case could be made to > work for v < max_address_bits depending on kconfig) > > Which leads to something like: > > /* uninitialized */ > if (v == 0 || v == ~0ULL) { > err = -ENOENT; > goto out_error; > } > /* > * Avoid truncation on 32-bit. > * TODO: handle addresses >= 4G > */ > if ( v >= ~0UL ) { > err = -EINVAL; > goto out_error; I think at least in this case a pr_err("...") should be added. Silent failure is not nice. Juergen