Date: Thu, 15 Aug 2019 21:22:26 +0200 (CEST)
From: Thomas Gleixner
To: Dmitry Safonov
Cc: linux-kernel@vger.kernel.org, Dmitry Safonov <0x7f454c46@gmail.com>,
    Adrian Reber, Andrei Vagin, Andy Lutomirski, Arnd Bergmann,
    Christian Brauner, Cyrill Gorcunov, "Eric W. Biederman",
    "H. Peter Anvin", Ingo Molnar, Jann Horn, Jeff Dike, Oleg Nesterov,
    Pavel Emelyanov, Shuah Khan, Vincenzo Frascino,
    containers@lists.linux-foundation.org, criu@openvz.org,
    linux-api@vger.kernel.org, x86@kernel.org
Subject: Re: [PATCHv6 28/36] posix-clocks: Add align for timens_offsets
In-Reply-To: <20190815163836.2927-29-dima@arista.com>
References: <20190815163836.2927-1-dima@arista.com> <20190815163836.2927-29-dima@arista.com>

On Thu, 15 Aug 2019, Dmitry Safonov wrote:

> Align offsets so that time namespace will work for ia32 applications on
> x86_64 host.

That's true for any 64 bit arch which supports 32bit user space and should
be folded into the patch which introduces the offset store.

> +/*
> + * Time offsets need align as they're placed on VVAR page,
> + * which is used by x86_64 and ia32 VDSO code.
> + * On ia32 offset::tv_sec (u64) has align(4), so re-align offsets
> + * to the same positions as 64-bit offsets.

This is generic code. Please do not add x86'isms here. The alignment
problem is more or less the same for any 64bit arch which supports 32bit
user space. And it's even worse on BE.

> + * On 64-bit big-endian systems VDSO should convert to timespec64
> + * to timespec ...

What?

> ... because of a padding occurring between the fields.

There is no padding between the fields.

32bit BE (powerpc)

struct timespec64 {
        time64_t        tv_sec;         /*     0     8 */
        long int        tv_nsec;        /*     8     4 */  <- tv_nsec is directly after tv_sec
};

64bit LE and BE (x86, powerpc64)

struct timespec64 {
        time64_t        tv_sec;         /*     0     8 */
        long int        tv_nsec;        /*     8     8 */
};

The problem for BE is that the 64bit host uses long int to store
tv_nsec. So the 32bit userspace will always read 0 because it reads byte
2/3 as seen from the 64bit host side.

So using struct timespec64 for the offset is wrong. You really need to open
code that offset storage if you don't want to end up with weird workarounds
for BE.

Something like this:

struct timens_offs {
        time64_t        tv_sec;
        s64             tv_nsec;
};

Then your offset store becomes:

struct timens_offsets {
        struct timens_offs      monotonic;
        struct timens_offs      boottime;
};

which needs tweaks to your conversion functions:

static inline void timens_add_monotonic(struct timespec64 *ts)
{
        struct timens_offsets *ns_offsets = current->nsproxy->time_ns->offsets;
        struct timens_offs *mo = &ns_offsets->monotonic;

        if (ns_offsets) {
                set_normalized_timespec64(ts, ts->tv_sec + mo->tv_sec,
                                          ts->tv_nsec + mo->tv_nsec);
        }
}

And for your to-host conversion you need:

        case CLOCK_MONOTONIC:
                mo = &ns_offsets->monotonic;
                offset = ktime_set(mo->tv_sec, mo->tv_nsec);
                break;

Similar changes are needed in the VDSO and the proc interface
obviously. Then this works for any arch without magic BE fixups.

You get the idea.

And ideally you change that storage to:

struct timens_offs {
        time64_t        tv_sec;
        s64             tv_nsec;
        ktime_t         nsecs;
};

and do the conversion once in the proc write. Then your to-host conversion
can use 'nsecs' and spare the multiplication on every invocation.

        case CLOCK_MONOTONIC:
                offset = ns_offsets->monotonic.nsecs;

Thanks,

        tglx
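
For illustration only, the BE problem described above can be modelled in plain user-space C. This is a rough sketch, not kernel code: int64_t stands in for time64_t and s64, and the helpers store_be64(), load_be32(), load_be64(), nsec_seen_as_long32() and nsec_seen_as_s64() are hypothetical names made up for this example. It writes tv_nsec the way a 64bit BE host stores a long, then reads it back once as a 4 byte long (what 32bit BE user space would do with struct timespec64) and once as an explicit 64bit field.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space mirror of the proposed offset storage. */
struct timens_offs {
        int64_t tv_sec;         /* stands in for time64_t */
        int64_t tv_nsec;        /* stands in for s64 */
};

/* Explicit widths pin the layout regardless of sizeof(long). */
_Static_assert(offsetof(struct timens_offs, tv_nsec) == 8, "tv_nsec at byte 8");
_Static_assert(sizeof(struct timens_offs) == 16, "16 bytes in total");

/* Store v in big-endian byte order, as a 64bit BE host writes a long. */
void store_be64(unsigned char *p, uint64_t v)
{
        for (int i = 0; i < 8; i++)
                p[i] = (unsigned char)(v >> (56 - 8 * i));
}

/* Read 4 bytes in big-endian order, as 32bit BE user space reads a long. */
uint32_t load_be32(const unsigned char *p)
{
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read 8 bytes in big-endian order, as any reader of an s64 field does. */
uint64_t load_be64(const unsigned char *p)
{
        uint64_t v = 0;

        for (int i = 0; i < 8; i++)
                v = (v << 8) | p[i];
        return v;
}

/* 32bit reader using 'long tv_nsec': sees only the upper half. */
uint32_t nsec_seen_as_long32(uint64_t nsec)
{
        unsigned char field[8];

        store_be64(field, nsec);
        return load_be32(field);
}

/* Reader using an explicit 64bit field: sees the full value. */
uint64_t nsec_seen_as_s64(uint64_t nsec)
{
        unsigned char field[8];

        store_be64(field, nsec);
        return load_be64(field);
}
```

Calling nsec_seen_as_long32() with any valid nanosecond value (below 10^9) gives 0, because all significant bits sit in the lower four bytes of the 64bit field, while nsec_seen_as_s64() gives the stored value back. That is the reason for open coding the storage with fixed-width members instead of reusing struct timespec64.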