* [3.1 patch] x86: default to vsyscall=native
@ 2011-10-03  9:08 Adrian Bunk
  2011-10-03 13:04 ` Andrew Lutomirski
  2011-10-03 13:19 ` richard -rw- weinberger
  0 siblings, 2 replies; 50+ messages in thread
From: Adrian Bunk @ 2011-10-03  9:08 UTC (permalink / raw)
  To: Andy Lutomirski, H. Peter Anvin, Linus Torvalds, Thomas Gleixner,
      Ingo Molnar, x86, linux-kernel

After upgrading a kernel the existing userspace should just work
(assuming it did work before ;-) ), but when I upgraded my kernel
from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly.

dmesg said:
linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790
linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790

Looking through the changelog I ended up at commit 3ae36655
("x86-64: Rework vsyscall emulation and add vsyscall= parameter").

Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to
vsyscall=native.

That sounds reasonable to me, and fixes the problem for me.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
---
 Documentation/kernel-parameters.txt |    7 ++++---
 arch/x86/kernel/vsyscall_64.c       |    2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 854ed5ca..d6e6724 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2706,10 +2706,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
             functions are at fixed addresses, they make nice
             targets for exploits that can control RIP.
 
-            emulate     [default] Vsyscalls turn into traps and are
-                        emulated reasonably safely.
+            emulate     Vsyscalls turn into traps and are emulated
+                        reasonably safely.
 
-            native      Vsyscalls are native syscall instructions.
+            native      [default] Vsyscalls are native syscall
+                        instructions.
                         This is a little bit faster than trapping
                         and makes a few dynamic recompilers work
                         better than they would in emulation mode.
diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c
index 18ae83d..b56c65de 100644
--- a/arch/x86/kernel/vsyscall_64.c
+++ b/arch/x86/kernel/vsyscall_64.c
@@ -56,7 +56,7 @@ DEFINE_VVAR(struct vsyscall_gtod_data, vsyscall_gtod_data) =
     .lock = __SEQLOCK_UNLOCKED(__vsyscall_gtod_data.lock),
 };
 
-static enum { EMULATE, NATIVE, NONE } vsyscall_mode = EMULATE;
+static enum { EMULATE, NATIVE, NONE } vsyscall_mode = NATIVE;
 
 static int __init vsyscall_setup(char *str)
 {
-- 
1.7.6.3

^ permalink raw reply related [flat|nested] 50+ messages in thread
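[Editorial sketch, not part of the original message.] When comparing host logs across kernels it can help to pull the fields out of the "vsyscall fault" lines quoted in the report above. The following is a minimal Python sketch under stated assumptions: the regular expression is inferred only from the two dmesg lines in this thread, and the names `VSYSCALL_FAULT_RE` and `parse_vsyscall_fault` are hypothetical helpers, not anything provided by the kernel.

```python
import re
from typing import Optional

# Assumption: the line layout is taken from the two dmesg lines quoted in
# this thread; other kernel versions may print the message differently.
VSYSCALL_FAULT_RE = re.compile(
    r"^(?P<comm>\S+)\[(?P<pid>\d+)\] vsyscall fault \(exploit attempt\?\) "
    r"(?P<regs>\w+:[0-9a-f]+(?: \w+:[0-9a-f]+)*)$"
)


def parse_vsyscall_fault(line: str) -> Optional[dict]:
    """Return the process name, pid and register dump, or None if no match."""
    match = VSYSCALL_FAULT_RE.match(line.strip())
    if match is None:
        return None
    regs = {}
    for field in match.group("regs").split():
        name, value = field.split(":", 1)
        regs[name] = int(value, 16)  # register values are printed in hex
    return {
        "comm": match.group("comm"),
        "pid": int(match.group("pid")),
        "regs": regs,
    }
```

In both reported lines the `ip` field is 0xffffffffff600000, the fixed address that also appears in `ax`.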
* Re: [3.1 patch] x86: default to vsyscall=native
  2011-10-03  9:08 [3.1 patch] x86: default to vsyscall=native Adrian Bunk
@ 2011-10-03 13:04 ` Andrew Lutomirski
  2011-10-03 17:33   ` Adrian Bunk
  2011-10-03 13:19 ` richard -rw- weinberger
  1 sibling, 1 reply; 50+ messages in thread
From: Andrew Lutomirski @ 2011-10-03 13:04 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: H. Peter Anvin, Linus Torvalds, Thomas Gleixner, Ingo Molnar, x86,
      linux-kernel

On Mon, Oct 3, 2011 at 2:08 AM, Adrian Bunk <bunk@stusta.de> wrote:
> After upgrading a kernel the existing userspace should just work
> (assuming it did work before ;-) ), but when I upgraded my kernel
> from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly.
>
> dmesg said:
> linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790
> linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790
>
> Looking through the changelog I ended up at commit 3ae36655
> ("x86-64: Rework vsyscall emulation and add vsyscall= parameter").
>
> Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to
> vsyscall=native.
>
> That sounds reasonable to me, and fixes the problem for me.

At this point in the -rc cycle, this sounds fine.

That being said, I'd like to fix it for real for 3.2. This particular
failure is suspicious -- the "vsyscall fault" message means that
sys_gettimeofday returned EFAULT, which means that the old (3.0 and
before) vgettimeofday should *also* have segfaulted. We do have a bit
of a bug in that the new code doesn't report si_addr properly, but
that sounds unlikely as a culprit. Did you try with the offending
commit reverted (i.e. fce8dc0)? I bet that it also fails there.

What's the .config for your UML binary? I'd like to see if I can
reproduce this.

--Andy

>
> Signed-off-by: Adrian Bunk <bunk@kernel.org>
> ---
>  Documentation/kernel-parameters.txt |    7 ++++---
>  arch/x86/kernel/vsyscall_64.c       |    2 +-
>  2 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index 854ed5ca..d6e6724 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2706,10 +2706,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>              functions are at fixed addresses, they make nice
>              targets for exploits that can control RIP.
>
> -            emulate     [default] Vsyscalls turn into traps and are
> -                        emulated reasonably safely.
> +            emulate     Vsyscalls turn into traps and are emulated
> +                        reasonably safely.
>
> -            native      Vsyscalls are native syscall instructions.
> +            native      [default] Vsyscalls are native syscall
> +                        instructions.
>                          This is a little bit faster than trapping
>                          and makes a few dynamic recompilers work
>                          better than they would in emulation mode.
> diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c
> index 18ae83d..b56c65de 100644
> --- a/arch/x86/kernel/vsyscall_64.c
> +++ b/arch/x86/kernel/vsyscall_64.c
> @@ -56,7 +56,7 @@ DEFINE_VVAR(struct vsyscall_gtod_data, vsyscall_gtod_data) =
>      .lock = __SEQLOCK_UNLOCKED(__vsyscall_gtod_data.lock),
>  };
>
> -static enum { EMULATE, NATIVE, NONE } vsyscall_mode = EMULATE;
> +static enum { EMULATE, NATIVE, NONE } vsyscall_mode = NATIVE;
>
>  static int __init vsyscall_setup(char *str)
>  {
> --
> 1.7.6.3
>
>

^ permalink raw reply [flat|nested] 50+ messages in thread
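[Editorial sketch, not part of the original message.] Since the compiled-in default is exactly what the patch changes, a tester may want to know whether a host was booted with an explicit vsyscall= setting. The sketch below is only an illustration in Python: the three mode names come from the enum in the patch (EMULATE, NATIVE, NONE), while the function name `vsyscall_mode_from_cmdline` and the "last valid occurrence wins" behaviour are assumptions of this example. When the parameter is absent it returns None, because the effective default depends on the kernel build.

```python
from typing import Optional

# Mode names mirror the enum in the patch; lower-case spellings as used
# in the kernel-parameters.txt hunk (emulate, native) plus "none".
VALID_MODES = ("emulate", "native", "none")


def vsyscall_mode_from_cmdline(cmdline: str) -> Optional[str]:
    """Return the explicit vsyscall= mode from a kernel command line, if any."""
    mode = None
    for token in cmdline.split():
        if token.startswith("vsyscall="):
            value = token[len("vsyscall="):]
            if value in VALID_MODES:
                mode = value  # assumption of this sketch: last valid one wins
    return mode


def read_host_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line (Linux only)."""
    with open(path, "r", encoding="ascii", errors="replace") as handle:
        return handle.read()
```

On a Linux host, `vsyscall_mode_from_cmdline(read_host_cmdline())` gives the explicit setting, or None if the kernel is running with whatever default it was built with.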
* Re: [3.1 patch] x86: default to vsyscall=native
  2011-10-03 13:04 ` Andrew Lutomirski
@ 2011-10-03 17:33   ` Adrian Bunk
  2011-10-03 18:06     ` Andrew Lutomirski
  2011-10-05 22:13     ` Andrew Lutomirski
  0 siblings, 2 replies; 50+ messages in thread
From: Adrian Bunk @ 2011-10-03 17:33 UTC (permalink / raw)
  To: Andrew Lutomirski
  Cc: H. Peter Anvin, Linus Torvalds, Thomas Gleixner, Ingo Molnar, x86,
      linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2934 bytes --]

On Mon, Oct 03, 2011 at 06:04:53AM -0700, Andrew Lutomirski wrote:
> On Mon, Oct 3, 2011 at 2:08 AM, Adrian Bunk <bunk@stusta.de> wrote:
> > After upgrading a kernel the existing userspace should just work
> > (assuming it did work before ;-) ), but when I upgraded my kernel
> > from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly.
> >
> > dmesg said:
> > linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790
> > linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790
> >
> > Looking through the changelog I ended up at commit 3ae36655
> > ("x86-64: Rework vsyscall emulation and add vsyscall= parameter").
> >
> > Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to
> > vsyscall=native.
> >
> > That sounds reasonable to me, and fixes the problem for me.
>
> At this point in the -rc cycle, this sounds fine.
>
> That being said, I'd like to fix it for real for 3.2. This particular
> failure is suspicious -- the "vsyscall fault" message means that
> sys_gettimeofday returned EFAULT, which means that the old (3.0 and
> before) vgettimeofday should *also* have segfaulted.

This 2.6.30.1 UML kernel binary from 2009 worked for me for all host
kernels from 2.6.30 to 3.0, and with 3.1.0-rc8 and vsyscall=native
it also seems to run nicely.

Looking deeper into "a UML instance didn't come up properly",
the problem is that it comes up in a strange (readonly) state.

There are "Using makefile-style concurrent boot in runlevel S."
and "Using makefile-style concurrent boot in runlevel 2." in the
logs with a Debian userspace, but no output from the init scripts
in these broken bootups (normal messages are in non-broken bootups).

Perhaps the two messages I see in dmesg on the host are from the
processes running rcS and rc2 failing early?

In a working startup with a Debian userspace, I'm getting during rcS
Setting the system clock.
Cannot access the Hardware Clock via any known method.
Use the --debug option to see the details of our search for an access method.
Unable to set System Clock to: Mon Oct 3 17:01:35 UTC 2011 ... (warning).

> We do have a bit
> of a bug in that the new code doesn't report si_addr properly, but
> that sounds unlikely as a culprit. Did you try with the offending
> commit reverted (i.e. fce8dc0)? I bet that it also fails there.

fce8dc0 is "x86-64: Wire up getcpu syscall", is that really the one you
want me to revert?

> What's the .config for your UML binary? I'd like to see if I can
> reproduce this.

It's attached.

> --Andy

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

[-- Attachment #2: config-uml --]
[-- Type: text/plain, Size: 11035 bytes --]

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.30-rc4
# Thu Apr 30 22:55:45 2009
#
CONFIG_DEFCONFIG_LIST="arch/$ARCH/defconfig"
CONFIG_GENERIC_HARDIRQS=y
CONFIG_UML=y
CONFIG_MMU=y
CONFIG_NO_IOMEM=y
# CONFIG_TRACE_IRQFLAGS_SUPPORT is not set
CONFIG_LOCKDEP_SUPPORT=y
# CONFIG_STACKTRACE_SUPPORT is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_IRQ_RELEASE_METHOD=y
CONFIG_HZ=100

#
# UML-specific options
#

#
# Host processor type and features
#
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
CONFIG_MK8=y
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
# CONFIG_MPSC is not set
# CONFIG_MCORE2 is not set
# CONFIG_GENERIC_CPU is not set
CONFIG_X86_CPU=y
CONFIG_X86_L1_CACHE_BYTES=64
CONFIG_X86_INTERNODE_CACHE_BYTES=64
# CONFIG_X86_CMPXCHG is not set
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=3
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_CENTAUR=y
CONFIG_UML_X86=y
CONFIG_64BIT=y
# CONFIG_X86_32 is not set
# CONFIG_RWSEM_XCHGADD_ALGORITHM is not set
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_3_LEVEL_PGTABLES=y
# CONFIG_ARCH_HAS_SC_SIGNALS is not set
# CONFIG_ARCH_REUSE_HOST_VSYSCALL_AREA is not set
CONFIG_SMP_BROKEN=y
CONFIG_GENERIC_HWEIGHT=y
# CONFIG_STATIC_LINK is not set
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_DISCONTIGMEM_MANUAL is not set
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
CONFIG_PAGEFLAGS_EXTENDED=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_ZONE_DMA_FLAG=0
CONFIG_VIRT_TO_BUS=y
CONFIG_UNEVICTABLE_LRU=y
CONFIG_HAVE_MLOCK=y
CONFIG_HAVE_MLOCKED_PAGE_BIT=y
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_LD_SCRIPT_DYN=y
CONFIG_BINFMT_ELF=y
# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
# CONFIG_HAVE_AOUT is not set
# CONFIG_BINFMT_MISC is not set
CONFIG_HOSTFS=y
# CONFIG_HPPFS is not set
CONFIG_MCONSOLE=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_KERNEL_STACK_ORDER=1

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=128
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
# CONFIG_SWAP is not set
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_AUDIT is not set

#
# RCU Subsystem
#
CONFIG_CLASSIC_RCU=y
# CONFIG_TREE_RCU is not set
# CONFIG_PREEMPT_RCU is not set
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_PREEMPT_RCU_TRACE is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=14
# CONFIG_GROUP_SCHED is not set
# CONFIG_CGROUPS is not set
CONFIG_SYSFS_DEPRECATED=y
CONFIG_SYSFS_DEPRECATED_V2=y
# CONFIG_RELAY is not set
CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_IPC_NS is not set
# CONFIG_USER_NS is not set
# CONFIG_PID_NS is not set
# CONFIG_NET_NS is not set
# CONFIG_BLK_DEV_INITRD is not set
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_EXTRA_PASS=y
# CONFIG_STRIP_ASM_SYMS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_COMPAT_BRK=y
CONFIG_SLAB=y
# CONFIG_SLUB is not set
# CONFIG_SLOB is not set
# CONFIG_PROFILING is not set
# CONFIG_MARKERS is not set
# CONFIG_SLOW_WORK is not set
# CONFIG_HAVE_GENERIC_DMA_COHERENT is not set
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
# CONFIG_MODULES is not set
CONFIG_BLOCK=y
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_BLK_DEV_INTEGRITY is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
# CONFIG_IOSCHED_AS is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
# CONFIG_DEFAULT_CFQ is not set
CONFIG_DEFAULT_NOOP=y
CONFIG_DEFAULT_IOSCHED="noop"
# CONFIG_FREEZER is not set
CONFIG_BLK_DEV=y
CONFIG_BLK_DEV_UBD=y
# CONFIG_BLK_DEV_UBD_SYNC is not set
CONFIG_BLK_DEV_COW_COMMON=y
CONFIG_BLK_DEV_LOOP=y
# CONFIG_BLK_DEV_CRYPTOLOOP is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_RAM is not set
# CONFIG_ATA_OVER_ETH is not set

#
# Character Devices
#
CONFIG_STDERR_CONSOLE=y
CONFIG_STDIO_CONSOLE=y
CONFIG_SSL=y
CONFIG_NULL_CHAN=y
CONFIG_PORT_CHAN=y
CONFIG_PTY_CHAN=y
CONFIG_TTY_CHAN=y
CONFIG_XTERM_CHAN=y
# CONFIG_NOCONFIG_CHAN is not set
CONFIG_CON_ZERO_CHAN="fd:0,fd:1"
CONFIG_CON_CHAN="xterm"
CONFIG_SSL_CHAN="pts"
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
# CONFIG_RAW_DRIVER is not set
CONFIG_LEGACY_PTY_COUNT=32
# CONFIG_WATCHDOG is not set
# CONFIG_UML_SOUND is not set
# CONFIG_SOUND is not set
# CONFIG_SOUND_OSS_CORE is not set
# CONFIG_HOSTAUDIO is not set
# CONFIG_HW_RANDOM is not set
CONFIG_UML_RANDOM=y
# CONFIG_MMAPPER is not set

#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
CONFIG_FIRMWARE_IN_KERNEL=y
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_SYS_HYPERVISOR is not set
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
# CONFIG_NET_KEY is not set
CONFIG_INET=y
# CONFIG_IP_MULTICAST is not set
# CONFIG_IP_ADVANCED_ROUTER is not set
CONFIG_IP_FIB_HASH=y
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_ARPD is not set
# CONFIG_SYN_COOKIES is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
# CONFIG_INET_XFRM_TUNNEL is not set
# CONFIG_INET_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_TRANSPORT is not set
# CONFIG_INET_XFRM_MODE_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_BEET is not set
# CONFIG_INET_LRO is not set
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
# CONFIG_TCP_CONG_ADVANCED is not set
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
# CONFIG_TCP_MD5SIG is not set
# CONFIG_IPV6 is not set
# CONFIG_NETWORK_SECMARK is not set
# CONFIG_NETFILTER is not set
# CONFIG_IP_DCCP is not set
# CONFIG_IP_SCTP is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
# CONFIG_BRIDGE is not set
# CONFIG_NET_DSA is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_PHONET is not set
# CONFIG_NET_SCHED is not set
# CONFIG_DCB is not set

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_HAMRADIO is not set
# CONFIG_CAN is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
# CONFIG_AF_RXRPC is not set
# CONFIG_WIRELESS is not set
# CONFIG_WIMAX is not set
# CONFIG_RFKILL is not set
# CONFIG_NET_9P is not set

#
# UML Network Devices
#
CONFIG_UML_NET=y
CONFIG_UML_NET_ETHERTAP=y
CONFIG_UML_NET_TUNTAP=y
# CONFIG_UML_NET_SLIP is not set
# CONFIG_UML_NET_DAEMON is not set
# CONFIG_UML_NET_VDE is not set
# CONFIG_UML_NET_MCAST is not set
# CONFIG_UML_NET_PCAP is not set
# CONFIG_UML_NET_SLIRP is not set
CONFIG_NETDEVICES=y
CONFIG_COMPAT_NET_DEV_OPS=y
CONFIG_DUMMY=y
# CONFIG_BONDING is not set
# CONFIG_MACVLAN is not set
# CONFIG_EQUALIZER is not set
CONFIG_TUN=y
# CONFIG_VETH is not set

#
# Wireless LAN
#
# CONFIG_WLAN_PRE80211 is not set
# CONFIG_WLAN_80211 is not set

#
# Enable WiMAX (Networking options) to see the WiMAX drivers
#
# CONFIG_WAN is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
# CONFIG_NETCONSOLE is not set
# CONFIG_NETPOLL is not set
# CONFIG_NET_POLL_CONTROLLER is not set
# CONFIG_CONNECTOR is not set

#
# File systems
#
CONFIG_EXT2_FS=y
# CONFIG_EXT2_FS_XATTR is not set
# CONFIG_EXT2_FS_XIP is not set
CONFIG_EXT3_FS=y
# CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set
# CONFIG_EXT3_FS_XATTR is not set
# CONFIG_EXT4_FS is not set
CONFIG_JBD=y
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
# CONFIG_FS_POSIX_ACL is not set
CONFIG_FILE_LOCKING=y
# CONFIG_XFS_FS is not set
# CONFIG_GFS2_FS is not set
# CONFIG_OCFS2_FS is not set
# CONFIG_BTRFS_FS is not set
# CONFIG_DNOTIFY is not set
CONFIG_INOTIFY=y
CONFIG_INOTIFY_USER=y
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS_FS is not set
# CONFIG_AUTOFS4_FS is not set
# CONFIG_FUSE_FS is not set

#
# Caches
#
# CONFIG_FSCACHE is not set

#
# CD-ROM/DVD Filesystems
#
# CONFIG_ISO9660_FS is not set
# CONFIG_UDF_FS is not set

#
# DOS/FAT/NT Filesystems
#
# CONFIG_MSDOS_FS is not set
# CONFIG_VFAT_FS is not set
# CONFIG_NTFS_FS is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
# CONFIG_PROC_KCORE is not set
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
# CONFIG_TMPFS_POSIX_ACL is not set
# CONFIG_HUGETLB_PAGE is not set
# CONFIG_CONFIGFS_FS is not set
# CONFIG_MISC_FILESYSTEMS is not set
# CONFIG_NETWORK_FILESYSTEMS is not set

#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
# CONFIG_NLS is not set
# CONFIG_DLM is not set

#
# Security options
#
# CONFIG_KEYS is not set
# CONFIG_SECURITY is not set
# CONFIG_SECURITYFS is not set
# CONFIG_SECURITY_FILE_CAPABILITIES is not set
# CONFIG_CRYPTO is not set
# CONFIG_BINARY_PRINTF is not set

#
# Library routines
#
CONFIG_BITREVERSE=y
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_FIND_NEXT_BIT=y
CONFIG_GENERIC_FIND_LAST_BIT=y
# CONFIG_CRC_CCITT is not set
# CONFIG_CRC16 is not set
# CONFIG_CRC_T10DIF is not set
# CONFIG_CRC_ITU_T is not set
CONFIG_CRC32=y
# CONFIG_CRC7 is not set
# CONFIG_LIBCRC32C is not set
CONFIG_HAS_DMA=y
CONFIG_NLATTR=y

#
# SCSI device support
#
# CONFIG_RAID_ATTRS is not set
# CONFIG_SCSI is not set
# CONFIG_SCSI_DMA is not set
# CONFIG_SCSI_NETLINK is not set
# CONFIG_MD is not set
# CONFIG_NEW_LEDS is not set
# CONFIG_INPUT is not set

#
# Kernel hacking
#
# CONFIG_PRINTK_TIME is not set
# CONFIG_ENABLE_WARN_DEPRECATED is not set
# CONFIG_ENABLE_MUST_CHECK is not set
CONFIG_FRAME_WARN=1024
# CONFIG_UNUSED_SYMBOLS is not set
# CONFIG_DEBUG_FS is not set
# CONFIG_DEBUG_KERNEL is not set
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_MEMORY_INIT=y
# CONFIG_RCU_CPU_STALL_DETECTOR is not set
# CONFIG_SYSCTL_SYSCALL_CHECK is not set
# CONFIG_SAMPLES is not set
# CONFIG_DEBUG_STACK_USAGE is not set

^ permalink raw reply [flat|nested] 50+ messages in thread
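[Editorial sketch, not part of the original message.] Anyone trying to reproduce the report from the attached config-uml may want to query individual options rather than read the whole file. The Python sketch below assumes the usual .config layout of one option per line, with unset options written as "# CONFIG_FOO is not set"; the function name `parse_kconfig` is a hypothetical helper for this illustration only.

```python
from typing import Dict, Optional

NOT_SET_SUFFIX = " is not set"


def parse_kconfig(text: str) -> Dict[str, Optional[str]]:
    """Map option names to their values; unset options map to None."""
    options: Dict[str, Optional[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET_SUFFIX):
            # "# CONFIG_FOO is not set" -> CONFIG_FOO: None
            name = line[2:-len(NOT_SET_SUFFIX)]
            options[name] = None
        elif line.startswith("CONFIG_") and "=" in line:
            # CONFIG_FOO=y, CONFIG_HZ=100, CONFIG_CON_CHAN="xterm"
            name, value = line.split("=", 1)
            options[name] = value.strip('"')
        # Section banners and blank lines are ignored.
    return options
```

For the attachment above, such a lookup would report CONFIG_UML as "y" and CONFIG_STATIC_LINK as unset (None).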
* Re: [3.1 patch] x86: default to vsyscall=native
  2011-10-03 17:33   ` Adrian Bunk
@ 2011-10-03 18:06     ` Andrew Lutomirski
  2011-10-03 18:41       ` Adrian Bunk
  2011-10-05 22:13     ` Andrew Lutomirski
  1 sibling, 1 reply; 50+ messages in thread
From: Andrew Lutomirski @ 2011-10-03 18:06 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: H. Peter Anvin, Linus Torvalds, Thomas Gleixner, Ingo Molnar, x86,
      linux-kernel

On Mon, Oct 3, 2011 at 10:33 AM, Adrian Bunk <bunk@stusta.de> wrote:
> On Mon, Oct 03, 2011 at 06:04:53AM -0700, Andrew Lutomirski wrote:
>> On Mon, Oct 3, 2011 at 2:08 AM, Adrian Bunk <bunk@stusta.de> wrote:
>> > After upgrading a kernel the existing userspace should just work
>> > (assuming it did work before ;-) ), but when I upgraded my kernel
>> > from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly.
>> >
>> > dmesg said:
>> > linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790
>> > linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790
>> >
>> > Looking through the changelog I ended up at commit 3ae36655
>> > ("x86-64: Rework vsyscall emulation and add vsyscall= parameter").
>> >
>> > Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to
>> > vsyscall=native.
>> >
>> > That sounds reasonable to me, and fixes the problem for me.
>>
>> At this point in the -rc cycle, this sounds fine.
>>
>> That being said, I'd like to fix it for real for 3.2. This particular
>> failure is suspicious -- the "vsyscall fault" message means that
>> sys_gettimeofday returned EFAULT, which means that the old (3.0 and
>> before) vgettimeofday should *also* have segfaulted.
>
> This 2.6.30.1 UML kernel binary from 2009 worked for me for all host
> kernels from 2.6.30 to 3.0, and with 3.1.0-rc8 and vsyscall=native
> it also seems to run nicely.
>
> Looking deeper into "a UML instance didn't come up properly",
> the problem is that it comes up in a strange (readonly) state.
>
> There are "Using makefile-style concurrent boot in runlevel S."
> and "Using makefile-style concurrent boot in runlevel 2." in the
> logs with a Debian userspace, but no output from the init scripts
> in these broken bootups (normal messages are in non-broken bootups).
>
> Perhaps the two messages I see in dmesg on the host are from the
> processes running rcS and rc2 failing early?
>
> In a working startup with a Debian userspace, I'm getting during rcS
> Setting the system clock.
> Cannot access the Hardware Clock via any known method.
> Use the --debug option to see the details of our search for an access method.
> Unable to set System Clock to: Mon Oct 3 17:01:35 UTC 2011 ... (warning).
>
>> We do have a bit
>> of a bug in that the new code doesn't report si_addr properly, but
>> that sounds unlikely as a culprit. Did you try with the offending
>> commit reverted (i.e. fce8dc0)? I bet that it also fails there.
>
> fce8dc0 is "x86-64: Wire up getcpu syscall", is that really the one you
> want me to revert?

No -- I actually meant to try running that revision or to try with the
vsyscall= patch reverted.

>
>> What's the .config for your UML binary? I'd like to see if I can
>> reproduce this.
>
> It's attached.

I'll play around with it.

--Andy

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native
  2011-10-03 18:06     ` Andrew Lutomirski
@ 2011-10-03 18:41       ` Adrian Bunk
  0 siblings, 0 replies; 50+ messages in thread
From: Adrian Bunk @ 2011-10-03 18:41 UTC (permalink / raw)
  To: Andrew Lutomirski
  Cc: H. Peter Anvin, Linus Torvalds, Thomas Gleixner, Ingo Molnar, x86,
      linux-kernel

On Mon, Oct 03, 2011 at 11:06:13AM -0700, Andrew Lutomirski wrote:
> On Mon, Oct 3, 2011 at 10:33 AM, Adrian Bunk <bunk@stusta.de> wrote:
> > On Mon, Oct 03, 2011 at 06:04:53AM -0700, Andrew Lutomirski wrote:
>...
> >> of a bug in that the new code doesn't report si_addr properly, but
> >> that sounds unlikely as a culprit. Did you try with the offending
> >> commit reverted (i.e. fce8dc0)? I bet that it also fails there.
> >
> > fce8dc0 is "x86-64: Wire up getcpu syscall", is that really the one you
> > want me to revert?
>
> No -- I actually meant to try running that revision or to try with the
> vsyscall= patch reverted.

I now tried both, and your bet was right.

>...
> --Andy

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native
  2011-10-03 17:33   ` Adrian Bunk
  2011-10-03 18:06     ` Andrew Lutomirski
@ 2011-10-05 22:13     ` Andrew Lutomirski
  2011-10-05 22:22       ` richard -rw- weinberger
  2011-10-05 22:24       ` [3.1 patch] x86: default to vsyscall=native Adrian Bunk
  1 sibling, 2 replies; 50+ messages in thread
From: Andrew Lutomirski @ 2011-10-05 22:13 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: H. Peter Anvin, Linus Torvalds, Thomas Gleixner, Ingo Molnar, x86,
      linux-kernel

On Mon, Oct 3, 2011 at 10:33 AM, Adrian Bunk <bunk@stusta.de> wrote:
> On Mon, Oct 03, 2011 at 06:04:53AM -0700, Andrew Lutomirski wrote:
>> On Mon, Oct 3, 2011 at 2:08 AM, Adrian Bunk <bunk@stusta.de> wrote:
>> > After upgrading a kernel the existing userspace should just work
>> > (assuming it did work before ;-) ), but when I upgraded my kernel
>> > from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly.
>> >
>> > dmesg said:
>> > linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790
>> > linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790
>> >
>> > Looking through the changelog I ended up at commit 3ae36655
>> > ("x86-64: Rework vsyscall emulation and add vsyscall= parameter").
>> >
>> > Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to
>> > vsyscall=native.
>> >
>> > That sounds reasonable to me, and fixes the problem for me.
>>
>> At this point in the -rc cycle, this sounds fine.
>>
>> That being said, I'd like to fix it for real for 3.2. This particular
>> failure is suspicious -- the "vsyscall fault" message means that
>> sys_gettimeofday returned EFAULT, which means that the old (3.0 and
>> before) vgettimeofday should *also* have segfaulted.
>
> This 2.6.30.1 UML kernel binary from 2009 worked for me for all host
> kernels from 2.6.30 to 3.0, and with 3.1.0-rc8 and vsyscall=native
> it also seems to run nicely.
>
> Looking deeper into "a UML instance didn't come up properly",
> the problem is that it comes up in a strange (readonly) state.
>
> There are "Using makefile-style concurrent boot in runlevel S."
> and "Using makefile-style concurrent boot in runlevel 2." in the
> logs with a Debian userspace, but no output from the init scripts
> in these broken bootups (normal messages are in non-broken bootups).
>
> Perhaps the two messages I see in dmesg on the host are from the
> processes running rcS and rc2 failing early?
>
> In a working startup with a Debian userspace, I'm getting during rcS
> Setting the system clock.
> Cannot access the Hardware Clock via any known method.
> Use the --debug option to see the details of our search for an access method.
> Unable to set System Clock to: Mon Oct 3 17:01:35 UTC 2011 ... (warning).
>
>> We do have a bit
>> of a bug in that the new code doesn't report si_addr properly, but
>> that sounds unlikely as a culprit. Did you try with the offending
>> commit reverted (i.e. fce8dc0)? I bet that it also fails there.
>
> fce8dc0 is "x86-64: Wire up getcpu syscall", is that really the one you
> want me to revert?
>
>> What's the .config for your UML binary? I'd like to see if I can
>> reproduce this.
>
> It's attached.
>

I can't reproduce it. What distro is running inside the UML instance?

--Andy

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native
  2011-10-05 22:13     ` Andrew Lutomirski
@ 2011-10-05 22:22       ` richard -rw- weinberger
  2011-10-05 22:30         ` Adrian Bunk
  2011-10-05 22:24       ` [3.1 patch] x86: default to vsyscall=native Adrian Bunk
  1 sibling, 1 reply; 50+ messages in thread
From: richard -rw- weinberger @ 2011-10-05 22:22 UTC (permalink / raw)
  To: Andrew Lutomirski
  Cc: Adrian Bunk, H. Peter Anvin, Linus Torvalds, Thomas Gleixner,
      Ingo Molnar, x86, linux-kernel

On Thu, Oct 6, 2011 at 12:13 AM, Andrew Lutomirski <luto@mit.edu> wrote:
> On Mon, Oct 3, 2011 at 10:33 AM, Adrian Bunk <bunk@stusta.de> wrote:
>> On Mon, Oct 03, 2011 at 06:04:53AM -0700, Andrew Lutomirski wrote:
>>> On Mon, Oct 3, 2011 at 2:08 AM, Adrian Bunk <bunk@stusta.de> wrote:
>>> > After upgrading a kernel the existing userspace should just work
>>> > (assuming it did work before ;-) ), but when I upgraded my kernel
>>> > from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly.
>>> >
>>> > dmesg said:
>>> > linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790
>>> > linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790
>>> >
>>> > Looking through the changelog I ended up at commit 3ae36655
>>> > ("x86-64: Rework vsyscall emulation and add vsyscall= parameter").
>>> >
>>> > Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to
>>> > vsyscall=native.
>>> >
>>> > That sounds reasonable to me, and fixes the problem for me.
>>>
>>> At this point in the -rc cycle, this sounds fine.
>>>
>>> That being said, I'd like to fix it for real for 3.2. This particular
>>> failure is suspicious -- the "vsyscall fault" message means that
>>> sys_gettimeofday returned EFAULT, which means that the old (3.0 and
>>> before) vgettimeofday should *also* have segfaulted.
>>
>> This 2.6.30.1 UML kernel binary from 2009 worked for me for all host
>> kernels from 2.6.30 to 3.0, and with 3.1.0-rc8 and vsyscall=native
>> it also seems to run nicely.
>>
>> Looking deeper into "a UML instance didn't come up properly",
>> the problem is that it comes up in a strange (readonly) state.
>>
>> There are "Using makefile-style concurrent boot in runlevel S."
>> and "Using makefile-style concurrent boot in runlevel 2." in the
>> logs with a Debian userspace, but no output from the init scripts
>> in these broken bootups (normal messages are in non-broken bootups).
>>
>> Perhaps the two messages I see in dmesg on the host are from the
>> processes running rcS and rc2 failing early?
>>
>> In a working startup with a Debian userspace, I'm getting during rcS
>> Setting the system clock.
>> Cannot access the Hardware Clock via any known method.
>> Use the --debug option to see the details of our search for an access method.
>> Unable to set System Clock to: Mon Oct 3 17:01:35 UTC 2011 ... (warning).
>>
>>> We do have a bit
>>> of a bug in that the new code doesn't report si_addr properly, but
>>> that sounds unlikely as a culprit. Did you try with the offending
>>> commit reverted (i.e. fce8dc0)? I bet that it also fails there.
>>
>> fce8dc0 is "x86-64: Wire up getcpu syscall", is that really the one you
>> want me to revert?
>>
>>> What's the .config for your UML binary? I'd like to see if I can
>>> reproduce this.
>>
>> It's attached.
>>
>
> I can't reproduce it. What distro is running inside the UML instance?

Same here.
Adrian, is the UML kernel crashing before executing init?
We definitely need more information...

-- 
Thanks,
//richard

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-05 22:22 ` richard -rw- weinberger @ 2011-10-05 22:30 ` Adrian Bunk 2011-10-05 22:41 ` richard -rw- weinberger 2011-10-05 22:46 ` Andrew Lutomirski 0 siblings, 2 replies; 50+ messages in thread From: Adrian Bunk @ 2011-10-05 22:30 UTC (permalink / raw) To: richard -rw- weinberger Cc: Andrew Lutomirski, H. Peter Anvin, Linus Torvalds, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Thu, Oct 06, 2011 at 12:22:34AM +0200, richard -rw- weinberger wrote: > On Thu, Oct 6, 2011 at 12:13 AM, Andrew Lutomirski <luto@mit.edu> wrote: > > On Mon, Oct 3, 2011 at 10:33 AM, Adrian Bunk <bunk@stusta.de> wrote: > >> On Mon, Oct 03, 2011 at 06:04:53AM -0700, Andrew Lutomirski wrote: > >>> On Mon, Oct 3, 2011 at 2:08 AM, Adrian Bunk <bunk@stusta.de> wrote: > >>> > After upgrading a kernel the existing userspace should just work > >>> > (assuming it did work before ;-) ), but when I upgraded my kernel > >>> > from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly. > >>> > > >>> > dmesg said: > >>> > linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790 > >>> > linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790 > >>> > > >>> > Looking throught the changelog I ended up at commit 3ae36655 > >>> > ("x86-64: Rework vsyscall emulation and add vsyscall= parameter"). > >>> > > >>> > Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to > >>> > vsyscall=native. > >>> > > >>> > That sounds reasonable to me, and fixes the problem for me. > >>> > >>> At this point in the -rc cycle, this sounds fine. > >>> > >>> That being said, I'd like to fix it for real for 3.2. 
This particular > >>> failure is suspicious -- the "vsyscall fault" message means that > >>> sys_gettimeofday returned EFAULT, which means that the old (3.0 and > >>> before) vgettimeofday should *also* have segfaulted. > >> > >> This 2.6.30.1 UML kernel binary from 2009 worked for me for all host > >> kernels from 2.6.30 to 3.0, and with 3.1.0-rc8 and vsyscall=native > >> it also seems to run nicely. > >> > >> Looking deeper into "a UML instance didn't come up properly", > >> the problem is that it comes up in a strange (readonly) state. > >> > >> There are "Using makefile-style concurrent boot in runlevel S." > >> and "Using makefile-style concurrent boot in runlevel 2." in the > >> logs with a Debian userspace, but no output from the init scripts > >> in these broken bootups (normal messages are in non-broken bootups). > >> > >> Perhaps the two the messages I see in dmesg on the host are from the > >> processes running rcS and rc2 failing early? > >> > >> In a working startup with a Debian userspace, I'm getting during rcS > >> Setting the system clock. > >> Cannot access the Hardware Clock via any known method. > >> Use the --debug option to see the details of our search for an access method. > >> Unable to set System Clock to: Mon Oct 3 17:01:35 UTC 2011 ... (warning). > >> > >>> We do have a bit > >>> of a bug in that the new code doesn't report si_addr properly, but > >>> that sounds unlikely as a culprit. Did you try with the offending > >>> commit reverted (i.e. fce8dc0)? I bet that it also fails there. > >> > >> fce8dc0 is "x86-64: Wire up getcpu syscall", is that really the one you > >> want me to revert? > >> > >>> What's the .config for your UML binary? I'd like to see if I can > >>> reproduce this. > >> > >> It's attached. > >> > > > > I can't reproduce it. What distro is running inside the UML instance? > > Same here. > Adrian, is the UML kernel crashing before executing init? 
As I wrote: Looking deeper into "a UML instance didn't come up properly", the problem is that it comes up in a strange (readonly) state. The UML kernel is running happily without crashing, and as I wrote my guess about my problems is: Perhaps the two messages I see in dmesg on the host are from the processes running rcS and rc2 failing early? > We definitely need more information... I gave the information that was requested, plus my observations. What more information exactly do you need from me? > Thanks, > //richard cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-05 22:30 ` Adrian Bunk @ 2011-10-05 22:41 ` richard -rw- weinberger 2011-10-05 22:46 ` Andrew Lutomirski 1 sibling, 0 replies; 50+ messages in thread From: richard -rw- weinberger @ 2011-10-05 22:41 UTC (permalink / raw) To: Adrian Bunk Cc: Andrew Lutomirski, H. Peter Anvin, Linus Torvalds, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Thu, Oct 6, 2011 at 12:30 AM, Adrian Bunk <bunk@stusta.de> wrote: > On Thu, Oct 06, 2011 at 12:22:34AM +0200, richard -rw- weinberger wrote: >> On Thu, Oct 6, 2011 at 12:13 AM, Andrew Lutomirski <luto@mit.edu> wrote: >> > On Mon, Oct 3, 2011 at 10:33 AM, Adrian Bunk <bunk@stusta.de> wrote: >> >> On Mon, Oct 03, 2011 at 06:04:53AM -0700, Andrew Lutomirski wrote: >> >>> On Mon, Oct 3, 2011 at 2:08 AM, Adrian Bunk <bunk@stusta.de> wrote: >> >>> > After upgrading a kernel the existing userspace should just work >> >>> > (assuming it did work before ;-) ), but when I upgraded my kernel >> >>> > from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly. >> >>> > >> >>> > dmesg said: >> >>> > linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790 >> >>> > linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790 >> >>> > >> >>> > Looking throught the changelog I ended up at commit 3ae36655 >> >>> > ("x86-64: Rework vsyscall emulation and add vsyscall= parameter"). >> >>> > >> >>> > Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to >> >>> > vsyscall=native. >> >>> > >> >>> > That sounds reasonable to me, and fixes the problem for me. >> >>> >> >>> At this point in the -rc cycle, this sounds fine. >> >>> >> >>> That being said, I'd like to fix it for real for 3.2. 
This particular >> >>> failure is suspicious -- the "vsyscall fault" message means that >> >>> sys_gettimeofday returned EFAULT, which means that the old (3.0 and >> >>> before) vgettimeofday should *also* have segfaulted. >> >> >> >> This 2.6.30.1 UML kernel binary from 2009 worked for me for all host >> >> kernels from 2.6.30 to 3.0, and with 3.1.0-rc8 and vsyscall=native >> >> it also seems to run nicely. >> >> >> >> Looking deeper into "a UML instance didn't come up properly", >> >> the problem is that it comes up in a strange (readonly) state. >> >> >> >> There are "Using makefile-style concurrent boot in runlevel S." >> >> and "Using makefile-style concurrent boot in runlevel 2." in the >> >> logs with a Debian userspace, but no output from the init scripts >> >> in these broken bootups (normal messages are in non-broken bootups). >> >> >> >> Perhaps the two the messages I see in dmesg on the host are from the >> >> processes running rcS and rc2 failing early? >> >> >> >> In a working startup with a Debian userspace, I'm getting during rcS >> >> Setting the system clock. >> >> Cannot access the Hardware Clock via any known method. >> >> Use the --debug option to see the details of our search for an access method. >> >> Unable to set System Clock to: Mon Oct 3 17:01:35 UTC 2011 ... (warning). >> >> >> >>> We do have a bit >> >>> of a bug in that the new code doesn't report si_addr properly, but >> >>> that sounds unlikely as a culprit. Did you try with the offending >> >>> commit reverted (i.e. fce8dc0)? I bet that it also fails there. >> >> >> >> fce8dc0 is "x86-64: Wire up getcpu syscall", is that really the one you >> >> want me to revert? >> >> >> >>> What's the .config for your UML binary? I'd like to see if I can >> >>> reproduce this. >> >> >> >> It's attached. >> >> >> > >> > I can't reproduce it. What distro is running inside the UML instance? >> >> Same here. >> Adrian, is the UML kernel crashing before executing init? 
> > As I wrote: > Looking deeper into "a UML instance didn't come up properly", > the problem is that it comes up in a strange (readonly) state. > > The UML kernel is running happily without crashing, and as I wrote my > guess about my problems is: > Perhaps the two the messages I see in dmesg on the host are from the > processes running rcS and rc2 failing early? > >> We definitely need more information... > > I gave the information that was requested. plus my observations. > Whoops, the mail containing that information did not make it into my head, sorry. Now I know where to look for... BTW: Can you please test 3.1-rcX as UML kernel? It contains vDSO/vsyscall fixes... -- Thanks, //richard ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-05 22:30 ` Adrian Bunk 2011-10-05 22:41 ` richard -rw- weinberger @ 2011-10-05 22:46 ` Andrew Lutomirski 2011-10-05 23:36 ` Andrew Lutomirski 1 sibling, 1 reply; 50+ messages in thread From: Andrew Lutomirski @ 2011-10-05 22:46 UTC (permalink / raw) To: Adrian Bunk Cc: richard -rw- weinberger, H. Peter Anvin, Linus Torvalds, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Wed, Oct 5, 2011 at 3:30 PM, Adrian Bunk <bunk@stusta.de> wrote: > On Thu, Oct 06, 2011 at 12:22:34AM +0200, richard -rw- weinberger wrote: >> On Thu, Oct 6, 2011 at 12:13 AM, Andrew Lutomirski <luto@mit.edu> wrote: >> > On Mon, Oct 3, 2011 at 10:33 AM, Adrian Bunk <bunk@stusta.de> wrote: >> >> On Mon, Oct 03, 2011 at 06:04:53AM -0700, Andrew Lutomirski wrote: >> >>> On Mon, Oct 3, 2011 at 2:08 AM, Adrian Bunk <bunk@stusta.de> wrote: >> >>> > After upgrading a kernel the existing userspace should just work >> >>> > (assuming it did work before ;-) ), but when I upgraded my kernel >> >>> > from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly. >> >>> > >> >>> > dmesg said: >> >>> > linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790 >> >>> > linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790 >> >>> > >> >>> > Looking throught the changelog I ended up at commit 3ae36655 >> >>> > ("x86-64: Rework vsyscall emulation and add vsyscall= parameter"). >> >>> > >> >>> > Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to >> >>> > vsyscall=native. >> >>> > >> >>> > That sounds reasonable to me, and fixes the problem for me. >> >>> >> >>> At this point in the -rc cycle, this sounds fine. >> >>> >> >>> That being said, I'd like to fix it for real for 3.2. 
This particular >> >>> failure is suspicious -- the "vsyscall fault" message means that >> >>> sys_gettimeofday returned EFAULT, which means that the old (3.0 and >> >>> before) vgettimeofday should *also* have segfaulted. >> >> >> >> This 2.6.30.1 UML kernel binary from 2009 worked for me for all host >> >> kernels from 2.6.30 to 3.0, and with 3.1.0-rc8 and vsyscall=native >> >> it also seems to run nicely. >> >> >> >> Looking deeper into "a UML instance didn't come up properly", >> >> the problem is that it comes up in a strange (readonly) state. >> >> >> >> There are "Using makefile-style concurrent boot in runlevel S." >> >> and "Using makefile-style concurrent boot in runlevel 2." in the >> >> logs with a Debian userspace, but no output from the init scripts >> >> in these broken bootups (normal messages are in non-broken bootups). >> >> >> >> Perhaps the two the messages I see in dmesg on the host are from the >> >> processes running rcS and rc2 failing early? >> >> >> >> In a working startup with a Debian userspace, I'm getting during rcS >> >> Setting the system clock. >> >> Cannot access the Hardware Clock via any known method. >> >> Use the --debug option to see the details of our search for an access method. >> >> Unable to set System Clock to: Mon Oct 3 17:01:35 UTC 2011 ... (warning). >> >> >> >>> We do have a bit >> >>> of a bug in that the new code doesn't report si_addr properly, but >> >>> that sounds unlikely as a culprit. Did you try with the offending >> >>> commit reverted (i.e. fce8dc0)? I bet that it also fails there. >> >> >> >> fce8dc0 is "x86-64: Wire up getcpu syscall", is that really the one you >> >> want me to revert? >> >> >> >>> What's the .config for your UML binary? I'd like to see if I can >> >>> reproduce this. >> >> >> >> It's attached. >> >> >> > >> > I can't reproduce it. What distro is running inside the UML instance? >> >> Same here. >> Adrian, is the UML kernel crashing before executing init? 
> > As I wrote: > Looking deeper into "a UML instance didn't come up properly", > the problem is that it comes up in a strange (readonly) state. > > The UML kernel is running happily without crashing, and as I wrote my > guess about my problems is: > Perhaps the two the messages I see in dmesg on the host are from the > processes running rcS and rc2 failing early? > >> We definitely need more information... > > I gave the information that was requested. plus my observations. > > What more information exactly do you need from me? None :) I just reproduced the problem with Debian Squeeze. Lenny works fine. --Andy ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-05 22:46 ` Andrew Lutomirski @ 2011-10-05 23:36 ` Andrew Lutomirski 2011-10-06 3:06 ` Andrew Lutomirski 0 siblings, 1 reply; 50+ messages in thread From: Andrew Lutomirski @ 2011-10-05 23:36 UTC (permalink / raw) To: Adrian Bunk, richard -rw- weinberger Cc: H. Peter Anvin, Linus Torvalds, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Wed, Oct 5, 2011 at 3:46 PM, Andrew Lutomirski <luto@mit.edu> wrote: > On Wed, Oct 5, 2011 at 3:30 PM, Adrian Bunk <bunk@stusta.de> wrote: >> On Thu, Oct 06, 2011 at 12:22:34AM +0200, richard -rw- weinberger wrote: >>> On Thu, Oct 6, 2011 at 12:13 AM, Andrew Lutomirski <luto@mit.edu> wrote: >>> > On Mon, Oct 3, 2011 at 10:33 AM, Adrian Bunk <bunk@stusta.de> wrote: >>> >> On Mon, Oct 03, 2011 at 06:04:53AM -0700, Andrew Lutomirski wrote: >>> >>> On Mon, Oct 3, 2011 at 2:08 AM, Adrian Bunk <bunk@stusta.de> wrote: >>> >>> > After upgrading a kernel the existing userspace should just work >>> >>> > (assuming it did work before ;-) ), but when I upgraded my kernel >>> >>> > from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly. >>> >>> > >>> >>> > dmesg said: >>> >>> > linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790 >>> >>> > linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790 >>> >>> > >>> >>> > Looking throught the changelog I ended up at commit 3ae36655 >>> >>> > ("x86-64: Rework vsyscall emulation and add vsyscall= parameter"). >>> >>> > >>> >>> > Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to >>> >>> > vsyscall=native. >>> >>> > >>> >>> > That sounds reasonable to me, and fixes the problem for me. >>> >>> >>> >>> At this point in the -rc cycle, this sounds fine. >>> >>> >>> >>> That being said, I'd like to fix it for real for 3.2. 
This particular >>> >>> failure is suspicious -- the "vsyscall fault" message means that >>> >>> sys_gettimeofday returned EFAULT, which means that the old (3.0 and >>> >>> before) vgettimeofday should *also* have segfaulted. >>> >> >>> >> This 2.6.30.1 UML kernel binary from 2009 worked for me for all host >>> >> kernels from 2.6.30 to 3.0, and with 3.1.0-rc8 and vsyscall=native >>> >> it also seems to run nicely. >>> >> >>> >> Looking deeper into "a UML instance didn't come up properly", >>> >> the problem is that it comes up in a strange (readonly) state. >>> >> >>> >> There are "Using makefile-style concurrent boot in runlevel S." >>> >> and "Using makefile-style concurrent boot in runlevel 2." in the >>> >> logs with a Debian userspace, but no output from the init scripts >>> >> in these broken bootups (normal messages are in non-broken bootups). >>> >> >>> >> Perhaps the two the messages I see in dmesg on the host are from the >>> >> processes running rcS and rc2 failing early? >>> >> >>> >> In a working startup with a Debian userspace, I'm getting during rcS >>> >> Setting the system clock. >>> >> Cannot access the Hardware Clock via any known method. >>> >> Use the --debug option to see the details of our search for an access method. >>> >> Unable to set System Clock to: Mon Oct 3 17:01:35 UTC 2011 ... (warning). >>> >> >>> >>> We do have a bit >>> >>> of a bug in that the new code doesn't report si_addr properly, but >>> >>> that sounds unlikely as a culprit. Did you try with the offending >>> >>> commit reverted (i.e. fce8dc0)? I bet that it also fails there. >>> >> >>> >> fce8dc0 is "x86-64: Wire up getcpu syscall", is that really the one you >>> >> want me to revert? >>> >> >>> >>> What's the .config for your UML binary? I'd like to see if I can >>> >>> reproduce this. >>> >> >>> >> It's attached. >>> >> >>> > >>> > I can't reproduce it. What distro is running inside the UML instance? >>> >>> Same here. 
>>> Adrian, is the UML kernel crashing before executing init? >> >> As I wrote: >> Looking deeper into "a UML instance didn't come up properly", >> the problem is that it comes up in a strange (readonly) state. >> >> The UML kernel is running happily without crashing, and as I wrote my >> guess about my problems is: >> Perhaps the two the messages I see in dmesg on the host are from the >> processes running rcS and rc2 failing early? >> >>> We definitely need more information... >> >> I gave the information that was requested. plus my observations. >> >> What more information exactly do you need from me? > > None :) I just reproduced the problem with Debian Squeeze. Lenny works fine. This is strange. The problem appears to be in startpar. That same exact Debian image works fine on KVM running 3.1-rc8 (with vsyscall=emulate) and on 2.6.40 (i.e. Fedora 15's kernel). If I set print-fatal-signals=1 I don't see a fatal signal in startpar. Richard, is it possible that UML 2.6.30.1 generates a bogus vgettimeofday and recovers successfully on older kernels because the resulting SIGSEGV had a valid sigcontext? I can try hacking the "vsyscall fault" path to generate full sigcontext and info. This seems rather unlikely, though. --Andy ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-05 23:36 ` Andrew Lutomirski @ 2011-10-06 3:06 ` Andrew Lutomirski 2011-10-06 12:12 ` richard -rw- weinberger 2011-10-06 15:37 ` richard -rw- weinberger 0 siblings, 2 replies; 50+ messages in thread From: Andrew Lutomirski @ 2011-10-06 3:06 UTC (permalink / raw) To: Adrian Bunk, richard -rw- weinberger Cc: H. Peter Anvin, Linus Torvalds, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Wed, Oct 5, 2011 at 4:36 PM, Andrew Lutomirski <luto@mit.edu> wrote: > On Wed, Oct 5, 2011 at 3:46 PM, Andrew Lutomirski <luto@mit.edu> wrote: >> On Wed, Oct 5, 2011 at 3:30 PM, Adrian Bunk <bunk@stusta.de> wrote: >>> On Thu, Oct 06, 2011 at 12:22:34AM +0200, richard -rw- weinberger wrote: >>>> On Thu, Oct 6, 2011 at 12:13 AM, Andrew Lutomirski <luto@mit.edu> wrote: >>>> > On Mon, Oct 3, 2011 at 10:33 AM, Adrian Bunk <bunk@stusta.de> wrote: >>>> >> On Mon, Oct 03, 2011 at 06:04:53AM -0700, Andrew Lutomirski wrote: >>>> >>> On Mon, Oct 3, 2011 at 2:08 AM, Adrian Bunk <bunk@stusta.de> wrote: >>>> >>> > After upgrading a kernel the existing userspace should just work >>>> >>> > (assuming it did work before ;-) ), but when I upgraded my kernel >>>> >>> > from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly. >>>> >>> > >>>> >>> > dmesg said: >>>> >>> > linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790 >>>> >>> > linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790 >>>> >>> > >>>> >>> > Looking throught the changelog I ended up at commit 3ae36655 >>>> >>> > ("x86-64: Rework vsyscall emulation and add vsyscall= parameter"). >>>> >>> > >>>> >>> > Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to >>>> >>> > vsyscall=native. >>>> >>> > >>>> >>> > That sounds reasonable to me, and fixes the problem for me. 
>>>> >>> >>>> >>> At this point in the -rc cycle, this sounds fine. >>>> >>> >>>> >>> That being said, I'd like to fix it for real for 3.2. This particular >>>> >>> failure is suspicious -- the "vsyscall fault" message means that >>>> >>> sys_gettimeofday returned EFAULT, which means that the old (3.0 and >>>> >>> before) vgettimeofday should *also* have segfaulted. >>>> >> >>>> >> This 2.6.30.1 UML kernel binary from 2009 worked for me for all host >>>> >> kernels from 2.6.30 to 3.0, and with 3.1.0-rc8 and vsyscall=native >>>> >> it also seems to run nicely. >>>> >> >>>> >> Looking deeper into "a UML instance didn't come up properly", >>>> >> the problem is that it comes up in a strange (readonly) state. >>>> >> >>>> >> There are "Using makefile-style concurrent boot in runlevel S." >>>> >> and "Using makefile-style concurrent boot in runlevel 2." in the >>>> >> logs with a Debian userspace, but no output from the init scripts >>>> >> in these broken bootups (normal messages are in non-broken bootups). >>>> >> >>>> >> Perhaps the two the messages I see in dmesg on the host are from the >>>> >> processes running rcS and rc2 failing early? >>>> >> >>>> >> In a working startup with a Debian userspace, I'm getting during rcS >>>> >> Setting the system clock. >>>> >> Cannot access the Hardware Clock via any known method. >>>> >> Use the --debug option to see the details of our search for an access method. >>>> >> Unable to set System Clock to: Mon Oct 3 17:01:35 UTC 2011 ... (warning). >>>> >> >>>> >>> We do have a bit >>>> >>> of a bug in that the new code doesn't report si_addr properly, but >>>> >>> that sounds unlikely as a culprit. Did you try with the offending >>>> >>> commit reverted (i.e. fce8dc0)? I bet that it also fails there. >>>> >> >>>> >> fce8dc0 is "x86-64: Wire up getcpu syscall", is that really the one you >>>> >> want me to revert? >>>> >> >>>> >>> What's the .config for your UML binary? I'd like to see if I can >>>> >>> reproduce this. 
>>>> >> >>>> >> It's attached. >>>> >> >>>> > >>>> > I can't reproduce it. What distro is running inside the UML instance? >>>> >>>> Same here. >>>> Adrian, is the UML kernel crashing before executing init? >>> >>> As I wrote: >>> Looking deeper into "a UML instance didn't come up properly", >>> the problem is that it comes up in a strange (readonly) state. >>> >>> The UML kernel is running happily without crashing, and as I wrote my >>> guess about my problems is: >>> Perhaps the two the messages I see in dmesg on the host are from the >>> processes running rcS and rc2 failing early? >>> >>>> We definitely need more information... >>> >>> I gave the information that was requested. plus my observations. >>> >>> What more information exactly do you need from me? >> >> None :) I just reproduced the problem with Debian Squeeze. Lenny works fine. > > This is strange. The problem appears to be in startpar. That same > exact Debian image works fine on KVM running 3.1-rc8 (with > vsyscall=emulate) and on 2.6.40 (i.e. Fedora 15's kernel). If I set > print-fatal-signals=1 I don't see a fatal signal in startpar. > > Richard, is it possible that UML 2.6.30.1 generates a bogus > vgettimeofday and recovers successfully on older kernels because the > resulting SIGSEGV had a valid sigcontext? I can try hacking the > "vsyscall fault" path to generate full sigcontext and info. This > seems rather unlikely, though. I think that is the problem. UML appears to lazily set up "page tables" just like a real machine; it does this by handling SIGSEGV and calling handle_mm_fault. If cr2 isn't set right, though, it doesn't know where the fault was and it can't handle it, so it just sends SIGSEGV to userspace. In 3.0 and earlier, we don't crash but we malfunction differently: UML doesn't intercept the vsyscall at all and the guest sees the hosts's time. This should be fixed in a newer version of UML. In vsyscall=native mode, we DTRT because UML handles the syscall itself. 
I'll see how ugly the patch to get this all correct is. It may not be all that pretty because we won't be able to use sys_gettimeofday anymore. --Andy ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-06 3:06 ` Andrew Lutomirski @ 2011-10-06 12:12 ` richard -rw- weinberger 2011-10-06 15:37 ` richard -rw- weinberger 1 sibling, 0 replies; 50+ messages in thread From: richard -rw- weinberger @ 2011-10-06 12:12 UTC (permalink / raw) To: Andrew Lutomirski Cc: Adrian Bunk, H. Peter Anvin, Linus Torvalds, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Thu, Oct 6, 2011 at 5:06 AM, Andrew Lutomirski <luto@mit.edu> wrote: > On Wed, Oct 5, 2011 at 4:36 PM, Andrew Lutomirski <luto@mit.edu> wrote: >> On Wed, Oct 5, 2011 at 3:46 PM, Andrew Lutomirski <luto@mit.edu> wrote: >>> On Wed, Oct 5, 2011 at 3:30 PM, Adrian Bunk <bunk@stusta.de> wrote: >>>> On Thu, Oct 06, 2011 at 12:22:34AM +0200, richard -rw- weinberger wrote: >>>>> On Thu, Oct 6, 2011 at 12:13 AM, Andrew Lutomirski <luto@mit.edu> wrote: >>>>> > On Mon, Oct 3, 2011 at 10:33 AM, Adrian Bunk <bunk@stusta.de> wrote: >>>>> >> On Mon, Oct 03, 2011 at 06:04:53AM -0700, Andrew Lutomirski wrote: >>>>> >>> On Mon, Oct 3, 2011 at 2:08 AM, Adrian Bunk <bunk@stusta.de> wrote: >>>>> >>> > After upgrading a kernel the existing userspace should just work >>>>> >>> > (assuming it did work before ;-) ), but when I upgraded my kernel >>>>> >>> > from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly. >>>>> >>> > >>>>> >>> > dmesg said: >>>>> >>> > linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790 >>>>> >>> > linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790 >>>>> >>> > >>>>> >>> > Looking throught the changelog I ended up at commit 3ae36655 >>>>> >>> > ("x86-64: Rework vsyscall emulation and add vsyscall= parameter"). >>>>> >>> > >>>>> >>> > Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to >>>>> >>> > vsyscall=native. 
>>>>> >>> > >>>>> >>> > That sounds reasonable to me, and fixes the problem for me. >>>>> >>> >>>>> >>> At this point in the -rc cycle, this sounds fine. >>>>> >>> >>>>> >>> That being said, I'd like to fix it for real for 3.2. This particular >>>>> >>> failure is suspicious -- the "vsyscall fault" message means that >>>>> >>> sys_gettimeofday returned EFAULT, which means that the old (3.0 and >>>>> >>> before) vgettimeofday should *also* have segfaulted. >>>>> >> >>>>> >> This 2.6.30.1 UML kernel binary from 2009 worked for me for all host >>>>> >> kernels from 2.6.30 to 3.0, and with 3.1.0-rc8 and vsyscall=native >>>>> >> it also seems to run nicely. >>>>> >> >>>>> >> Looking deeper into "a UML instance didn't come up properly", >>>>> >> the problem is that it comes up in a strange (readonly) state. >>>>> >> >>>>> >> There are "Using makefile-style concurrent boot in runlevel S." >>>>> >> and "Using makefile-style concurrent boot in runlevel 2." in the >>>>> >> logs with a Debian userspace, but no output from the init scripts >>>>> >> in these broken bootups (normal messages are in non-broken bootups). >>>>> >> >>>>> >> Perhaps the two the messages I see in dmesg on the host are from the >>>>> >> processes running rcS and rc2 failing early? >>>>> >> >>>>> >> In a working startup with a Debian userspace, I'm getting during rcS >>>>> >> Setting the system clock. >>>>> >> Cannot access the Hardware Clock via any known method. >>>>> >> Use the --debug option to see the details of our search for an access method. >>>>> >> Unable to set System Clock to: Mon Oct 3 17:01:35 UTC 2011 ... (warning). >>>>> >> >>>>> >>> We do have a bit >>>>> >>> of a bug in that the new code doesn't report si_addr properly, but >>>>> >>> that sounds unlikely as a culprit. Did you try with the offending >>>>> >>> commit reverted (i.e. fce8dc0)? I bet that it also fails there. >>>>> >> >>>>> >> fce8dc0 is "x86-64: Wire up getcpu syscall", is that really the one you >>>>> >> want me to revert? 
>>>>> >> >>>>> >>> What's the .config for your UML binary? I'd like to see if I can >>>>> >>> reproduce this. >>>>> >> >>>>> >> It's attached. >>>>> >> >>>>> > >>>>> > I can't reproduce it. What distro is running inside the UML instance? >>>>> >>>>> Same here. >>>>> Adrian, is the UML kernel crashing before executing init? >>>> >>>> As I wrote: >>>> Looking deeper into "a UML instance didn't come up properly", >>>> the problem is that it comes up in a strange (readonly) state. >>>> >>>> The UML kernel is running happily without crashing, and as I wrote my >>>> guess about my problems is: >>>> Perhaps the two the messages I see in dmesg on the host are from the >>>> processes running rcS and rc2 failing early? >>>> >>>>> We definitely need more information... >>>> >>>> I gave the information that was requested. plus my observations. >>>> >>>> What more information exactly do you need from me? >>> >>> None :) I just reproduced the problem with Debian Squeeze. Lenny works fine. >> >> This is strange. The problem appears to be in startpar. That same >> exact Debian image works fine on KVM running 3.1-rc8 (with >> vsyscall=emulate) and on 2.6.40 (i.e. Fedora 15's kernel). If I set >> print-fatal-signals=1 I don't see a fatal signal in startpar. >> >> Richard, is it possible that UML 2.6.30.1 generates a bogus >> vgettimeofday and recovers successfully on older kernels because the >> resulting SIGSEGV had a valid sigcontext? I can try hacking the >> "vsyscall fault" path to generate full sigcontext and info. This >> seems rather unlikely, though. > > I think that is the problem. UML appears to lazily set up "page > tables" just like a real machine; it does this by handling SIGSEGV and > calling handle_mm_fault. If cr2 isn't set right, though, it doesn't > know where the fault was and it can't handle it, so it just sends > SIGSEGV to userspace. 
> > In 3.0 and earlier, we don't crash but we malfunction differently: UML > doesn't intercept the vsyscall at all and the guest sees the hosts's > time. This should be fixed in a newer version of UML. How can we intercept a vsyscall? It's not trivial. Starting with Linux 3.1 UML (x86_64) has a vDSO page which transforms all vDSO calls to real system calls which can be intercepted. So, only statically linked binaries will use the host's vsyscall interface. > In vsyscall=native mode, we DTRT because UML handles the syscall itself. > > I'll see how ugly the patch to get this all correct is. It may not be > all that pretty because we won't be able to use sys_gettimeofday > anymore. > vsyscall=emulate would be okay for UML if the SEGV has a valid signal context. -- Thanks, //richard ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-06 3:06 ` Andrew Lutomirski 2011-10-06 12:12 ` richard -rw- weinberger @ 2011-10-06 15:37 ` richard -rw- weinberger 2011-10-06 18:16 ` Andrew Lutomirski 1 sibling, 1 reply; 50+ messages in thread
From: richard -rw- weinberger @ 2011-10-06 15:37 UTC (permalink / raw)
To: Andrew Lutomirski
Cc: Adrian Bunk, H. Peter Anvin, Linus Torvalds, Thomas Gleixner, Ingo Molnar, x86, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 603 bytes --]

On Thu, Oct 6, 2011 at 5:06 AM, Andrew Lutomirski <luto@mit.edu> wrote:
> I'll see how ugly the patch to get this all correct is. It may not be
> all that pretty because we won't be able to use sys_gettimeofday
> anymore.

BTW: The attached program triggers the issue.

on 3.1-rc8+:
# ./sig.dyn
faulting address: 0xdeadbeef
# ./sig.static
[ 19.075106] sig.static[863] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fff9e53d8c8 ax:ffffffffff600000 si:0 di:deadbeef
faulting address: 0x0

I guess UML is not the only user of this feature...

-- 
Thanks,
//richard

[-- Attachment #2: sig.c --]
[-- Type: text/x-csrc, Size: 454 bytes --]

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/time.h>

static void sighandler(int sig, siginfo_t *si, void *uc)
{
	printf("faulting address: 0x%lx\n", (unsigned long)si->si_addr);
	exit(1);
}

int main()
{
	struct sigaction sa;

	sa.sa_sigaction = (void *)sighandler;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_SIGINFO | SA_NODEFER;
	sigaction(SIGSEGV, &sa, NULL);

	gettimeofday((void *)0xdeadbeef, NULL);

	return 0;
}

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-06 15:37 ` richard -rw- weinberger @ 2011-10-06 18:16 ` Andrew Lutomirski 2011-10-06 18:34 ` Linus Torvalds 0 siblings, 1 reply; 50+ messages in thread From: Andrew Lutomirski @ 2011-10-06 18:16 UTC (permalink / raw) To: richard -rw- weinberger Cc: Adrian Bunk, H. Peter Anvin, Linus Torvalds, Thomas Gleixner, Ingo Molnar, x86, linux-kernel [-- Attachment #1: Type: text/plain, Size: 1029 bytes --] On Thu, Oct 6, 2011 at 8:37 AM, richard -rw- weinberger <richard.weinberger@gmail.com> wrote: > On Thu, Oct 6, 2011 at 5:06 AM, Andrew Lutomirski <luto@mit.edu> wrote: >> I'll see how ugly the patch to get this all correct is. It may not be >> all that pretty because we won't be able to use sys_gettimeofday >> anymore. > > BTW: The attached program triggers the issue. > > on 3.1-rc8+: > # ./sig.dyn > faulting address: 0xdeadbeef > # ./sig.static > [ 19.075106] sig.static[863] vsyscall fault (exploit attempt?) > ip:ffffffffff600000 cs:33 sp:7fff9e53d8c8 ax:ffffffffff600000 si:0 > di:deadbeef > faulting address: 0x0 > > I guess UML is not the only user of this feature... I assume you wrote this to detect the problem :) Fixing it will be annoying because the attached fancier version needs to work, too. I could implement the whole mess in software, but it might be nicer to arrange for uaccess errors to stash some information somewhere (like in the thread_struct cr2 variable). 
--Andy [-- Attachment #2: sig.c --] [-- Type: text/x-csrc, Size: 691 bytes --] #include <stdio.h> #include <stdlib.h> #include <signal.h> #include <sys/time.h> #include <sys/mman.h> static void sighandler(int sig, siginfo_t *si, void *uc) { printf("faulting address: 0x%lx\n", (unsigned long)si->si_addr); exit(1); } int main() { char *page = mmap(0, 8192, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); mprotect(page, 4096, PROT_READ | PROT_WRITE); struct sigaction sa; sa.sa_sigaction = (void *)sighandler; sigemptyset(&sa.sa_mask); sa.sa_flags = SA_SIGINFO| SA_NODEFER; sigaction(SIGSEGV, &sa, NULL); void *access_addr = page + 4095; printf("Mapped page = %p; will access %p\n", page, access_addr); gettimeofday(access_addr, NULL); return 0; } ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-06 18:16 ` Andrew Lutomirski @ 2011-10-06 18:34 ` Linus Torvalds 2011-10-07 0:48 ` Andrew Lutomirski 0 siblings, 1 reply; 50+ messages in thread From: Linus Torvalds @ 2011-10-06 18:34 UTC (permalink / raw) To: Andrew Lutomirski Cc: richard -rw- weinberger, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Thu, Oct 6, 2011 at 11:16 AM, Andrew Lutomirski <luto@mit.edu> wrote: > > Fixing it will be annoying because the attached fancier version needs > to work, too. I could implement the whole mess in software, but it > might be nicer to arrange for uaccess errors to stash some information > somewhere (like in the thread_struct cr2 variable). That should be easy enough to do. Just add it to the "fixup_exception()" case in no_context(). Linus ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-06 18:34 ` Linus Torvalds @ 2011-10-07 0:48 ` Andrew Lutomirski 2011-10-10 11:19 ` richard -rw- weinberger 0 siblings, 1 reply; 50+ messages in thread From: Andrew Lutomirski @ 2011-10-07 0:48 UTC (permalink / raw) To: Linus Torvalds Cc: richard -rw- weinberger, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Thu, Oct 6, 2011 at 11:34 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Thu, Oct 6, 2011 at 11:16 AM, Andrew Lutomirski <luto@mit.edu> wrote: >> >> Fixing it will be annoying because the attached fancier version needs >> to work, too. I could implement the whole mess in software, but it >> might be nicer to arrange for uaccess errors to stash some information >> somewhere (like in the thread_struct cr2 variable). > > That should be easy enough to do. Just add it to the > "fixup_exception()" case in no_context(). This code is rather messy. We stash the cr2, err, and trap fields of sigcontext in thread_struct and we *never* reset them until the next segfault. So userspace sees stale garbage on every signal that isn't a (genuine) segfault. I can imagine this breaking UML in remarkably bizarre ways even without vsyscall emulation because UML actually seems to rely on that stuff to determine the source of a signal. The nice fix would be to move this information into siginfo. cr2 appears to be duplicated by si_addr. trap_no is apparently redundant except for SIGTRAP. error_code is interesting. Any objection to using some padding bytes to move this into siginfo and remove the fields (except for uaccess) from thread_struct? Better ideas? Without some kind of cleanup, I'm a bit worried about breakage if a uaccess fault happens between something else setting the flags and a signal getting delivered, resulting in corruption of the sigcontext, unless I add more crud to thread_struct and waste memory for every process. 
--Andy ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-07 0:48 ` Andrew Lutomirski @ 2011-10-10 11:19 ` richard -rw- weinberger 2011-10-10 11:48 ` Ingo Molnar 0 siblings, 1 reply; 50+ messages in thread From: richard -rw- weinberger @ 2011-10-10 11:19 UTC (permalink / raw) To: Andrew Lutomirski Cc: Linus Torvalds, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Fri, Oct 7, 2011 at 2:48 AM, Andrew Lutomirski <luto@mit.edu> wrote: > On Thu, Oct 6, 2011 at 11:34 AM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> On Thu, Oct 6, 2011 at 11:16 AM, Andrew Lutomirski <luto@mit.edu> wrote: >>> >>> Fixing it will be annoying because the attached fancier version needs >>> to work, too. I could implement the whole mess in software, but it >>> might be nicer to arrange for uaccess errors to stash some information >>> somewhere (like in the thread_struct cr2 variable). >> >> That should be easy enough to do. Just add it to the >> "fixup_exception()" case in no_context(). > > This code is rather messy. We stash the cr2, err, and trap fields of > sigcontext in thread_struct and we *never* reset them until the next > segfault. So userspace sees stale garbage on every signal that isn't > a (genuine) segfault. I can imagine this breaking UML in remarkably > bizarre ways even without vsyscall emulation because UML actually > seems to rely on that stuff to determine the source of a signal. > From UML's point of view the current situation is odd. UML will no longer run on top of a default 3.1 kernel. Why is this odd? One of the major reasons why people are still using UML is because you can run it as a non-privileged user on any x86 Linux host. A user who has root privileges can set up and use KVM, which is much nicer than UML... -- Thanks, //richard ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-10 11:19 ` richard -rw- weinberger @ 2011-10-10 11:48 ` Ingo Molnar 2011-10-10 15:31 ` Andrew Lutomirski 0 siblings, 1 reply; 50+ messages in thread From: Ingo Molnar @ 2011-10-10 11:48 UTC (permalink / raw) To: richard -rw- weinberger Cc: Andrew Lutomirski, Linus Torvalds, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel * richard -rw- weinberger <richard.weinberger@gmail.com> wrote: > From UML's point of view the current situation is odd. UML will no > longer run on top of a default 3.1 kernel. This needs to be fixed (perhaps worked around in UML if that's possible and if you agree with that) - or barring a real obvious fix needs to be reverted to the last-known-working state. We are in -rc9 so nothing but really, really obvious patches can be applied. > Why is this odd? One of the major reasons why people are still > using UML is because you can run it as a non-privileged user on any > x86 Linux host. A user who has root privileges can set up and use > KVM, which is much nicer than UML... No, your complaint is entirely justified. Andrew? Thanks, Ingo ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-10 11:48 ` Ingo Molnar @ 2011-10-10 15:31 ` Andrew Lutomirski 2011-10-11 6:22 ` Ingo Molnar 0 siblings, 1 reply; 50+ messages in thread From: Andrew Lutomirski @ 2011-10-10 15:31 UTC (permalink / raw) To: Ingo Molnar Cc: richard -rw- weinberger, Linus Torvalds, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Mon, Oct 10, 2011 at 4:48 AM, Ingo Molnar <mingo@elte.hu> wrote: > > * richard -rw- weinberger <richard.weinberger@gmail.com> wrote: > >> From UML's point of view the current situation is odd. UML will no >> longer run on top of a default 3.1 kernel. > > This needs to be fixed (perhaps worked around in UML if that's > possible and if you agree with that) - or barring a real obvious fix > needs to be reverted to the last-known-working state. We are in -rc9 > so nothing but really, really obvious patches can be applied. > >> Why is this odd? One of the major reasons why people are still >> using UML is because you can run it as a non-privileged user on any >> x86 Linux host. A user who has root privileges can set up and use >> KVM, which is much nicer than UML... > > No, your complaint is entirely justified. > > Andrew? I think I know what the root cause is and I have most of a patch to fix it. It doesn't compile (yet), it's a little less trivial than I'd like for something this late in the -rc cycle, and it adds 16 bytes to thread_struct (ugh!). I think I can make a follow-up patch that removes 32 bytes of per-thread state to restore my karma, though, but that will definitely not be 3.1 material. The issue is that the existing trap_no, error_code, and cr2 fields are used in ways that appear rather broken and extremely fragile to report detailed exception info to user space when SIGSEGV, SIGBUS, and SIGTRAP happen. Touching them from the failed uaccess paths might have unfortunate side effects like breaking vm86. 
I suspect that nothing other than UML and vm86 users care because they're only used for the old sigcontext data and not for modern siginfo. The tricky case for vsyscall emulation is if gettimeofday is called with a buffer that crosses a page boundary and the second page causes the fault. I'll email something out in a day or two (maybe today). --Andy ^ permalink raw reply [flat|nested] 50+ messages in thread
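To make the page-boundary case concrete, here is a minimal user-space sketch (not kernel code), assuming 4096-byte pages. The helper names `spans_page_boundary` and `second_page_start` are made up for illustration; they only show the arithmetic behind the tricky case described above, where the buffer starts near the end of a valid page and runs into the next one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, the default page size on x86-64. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical helper: true if [addr, addr + len) touches more than one
 * page. Overflow of addr + len is deliberately ignored in this sketch.
 */
static bool spans_page_boundary(uintptr_t addr, size_t len)
{
	assert(len > 0);
	return (addr / SKETCH_PAGE_SIZE) !=
	       ((addr + len - 1) / SKETCH_PAGE_SIZE);
}

/* Hypothetical helper: first address of the page after the one holding addr. */
static uintptr_t second_page_start(uintptr_t addr)
{
	return (addr / SKETCH_PAGE_SIZE + 1) * SKETCH_PAGE_SIZE;
}
```

For a 16-byte buffer placed at offset 4095 of a page, `spans_page_boundary` is true, and `second_page_start` gives the address where a fault on the second page would be reported.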
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-10 15:31 ` Andrew Lutomirski @ 2011-10-11 6:22 ` Ingo Molnar 2011-10-11 17:24 ` [RFC] fixing the UML failure root cause Andrew Lutomirski 0 siblings, 1 reply; 50+ messages in thread From: Ingo Molnar @ 2011-10-11 6:22 UTC (permalink / raw) To: Andrew Lutomirski Cc: richard -rw- weinberger, Linus Torvalds, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel * Andrew Lutomirski <luto@mit.edu> wrote: > > Andrew? > > I think I know what the root cause is and I have most of a patch to > fix it. It doesn't compile (yet), it's a little less trivial than > I'd like for something this late in the -rc cycle, and it adds 16 > bytes to thread_struct (ugh!). > > I think I can make a follow-up patch that removes 32 bytes of > per-thread state to restore my karma, though, but that will > definitely not be 3.1 material. Ok, i've queued up the vsyscall=native patch in tip:x86/urgent for now - we can re-try in v3.2 (perhaps) if a satisfactory solution is found. Thanks, Ingo ^ permalink raw reply [flat|nested] 50+ messages in thread
* [RFC] fixing the UML failure root cause 2011-10-11 6:22 ` Ingo Molnar @ 2011-10-11 17:24 ` Andrew Lutomirski 2011-10-13 6:19 ` Linus Torvalds 2011-10-14 19:53 ` [RFC] fixing the UML failure root cause richard -rw- weinberger 0 siblings, 2 replies; 50+ messages in thread From: Andrew Lutomirski @ 2011-10-11 17:24 UTC (permalink / raw) To: Ingo Molnar Cc: richard -rw- weinberger, Linus Torvalds, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel [-- Attachment #1: Type: text/plain, Size: 1259 bytes --] On 10/10/2011 11:22 PM, Ingo Molnar wrote: > > * Andrew Lutomirski <luto@mit.edu> wrote: > >>> Andrew? >> >> I think I know what the root cause is and I have most of a patch to >> fix it. It doesn't compile (yet), it's a little less trivial than >> I'd like for something this late in the -rc cycle, and it adds 16 >> bytes to thread_struct (ugh!). >> >> I think I can make a follow-up patch that removes 32 bytes of >> per-thread state to restore my karma, though, but that will >> definitely not be 3.1 material. > > Ok, i've queued up the vsyscall=native patch in tip:x86/urgent for > now - we can re-try in v3.2 (perhaps) if a satisfactory solution is > found. Getting full cause information for uaccess failure was messy enough that I gave up. There are a *lot* of uaccess failure paths to work through. So here's a different approach. It's not perfect: it always blames SEGV_MAPERR instead of SEGV_ACCERR. I implemented it for vgettimeofday but not the other two vsyscalls. What do you think of this approach? If it seems good, I'll finish the patch and submit it. With this patch applied, UML appears to work, but it fills the log with exploit attempt warnings. Any ideas on what to do about that? 
--Andy > > Thanks, > > Ingo [-- Attachment #2: vsyscall_hack.patch --] [-- Type: text/plain, Size: 3033 bytes --] diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c index 18ae83d..c0bafec 100644 --- a/arch/x86/kernel/vsyscall_64.c +++ b/arch/x86/kernel/vsyscall_64.c @@ -139,6 +139,42 @@ static int addr_to_vsyscall_nr(unsigned long addr) return nr; } +/* Copy data to user space, forcing signals on failure. */ +static int copy_to_user_sig(unsigned long dest, const void *src, size_t len) +{ + /* + * This may be the slowest memcpy ever written. We don't really care. + */ + size_t i; + for (i = 0; i < len; i++) { + char __user *user_byte = (char __user *)(dest + i); + if (put_user(((char*)src)[i], user_byte) != 0) { + /* Report full siginfo and context */ + struct task_struct *tsk = current; + siginfo_t info; + memset(&info, 0, sizeof(info)); + info.si_signo = SIGSEGV; + /* + * Could be SEGV_ACCERR -- we don't distinguish it + * correctly. + */ + info.si_code = SEGV_MAPERR; + info.si_addr = user_byte; + /* + * Write fault in user mode. We don't distinguish + * protection fault from no page found. 
+ */ + tsk->thread.error_code = 6; + tsk->thread.cr2 = (unsigned long)user_byte; + tsk->thread.trap_no = 14; + force_sig_info(SIGSEGV, &info, tsk); + return -EFAULT; + } + } + + return 0; +} + bool emulate_vsyscall(struct pt_regs *regs, unsigned long address) { struct task_struct *tsk; @@ -181,10 +217,19 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address) switch (vsyscall_nr) { case 0: - ret = sys_gettimeofday( - (struct timeval __user *)regs->di, - (struct timezone __user *)regs->si); + { + struct timeval tv; + do_gettimeofday(&tv); + + if (regs->di && copy_to_user_sig(regs->di, &tv, sizeof(tv))) + goto warn_fault; + if (regs->si && copy_to_user_sig(regs->si, &sys_tz, + sizeof(struct timezone))) + goto warn_fault; + + ret = 0; break; + } case 1: ret = sys_time((time_t __user *)regs->di); @@ -197,19 +242,6 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address) break; } - if (ret == -EFAULT) { - /* - * Bad news -- userspace fed a bad pointer to a vsyscall. - * - * With a real vsyscall, that would have caused SIGSEGV. - * To make writing reliable exploits using the emulated - * vsyscalls harder, generate SIGSEGV here as well. - */ - warn_bad_vsyscall(KERN_INFO, regs, - "vsyscall fault (exploit attempt?)"); - goto sigsegv; - } - regs->ax = ret; /* Emulate a ret instruction. */ @@ -221,6 +253,19 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address) sigsegv: force_sig(SIGSEGV, current); return true; + +warn_fault: + /* + * Bad news -- userspace fed a bad pointer to a vsyscall. + * + * With a real vsyscall, that would have caused SIGSEGV. + * To make writing reliable exploits using the emulated + * vsyscalls harder, generate SIGSEGV here as well. + */ + + warn_bad_vsyscall(KERN_INFO, regs, + "vsyscall fault (exploit attempt?)"); + return true; } /* ^ permalink raw reply related [flat|nested] 50+ messages in thread
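The magic numbers in the patch above (error_code = 6, trap_no = 14) can be read with the x86 page-fault error code layout: trap 14 is the page-fault vector, and the low three bits of the error code are present/protection, write, and user. The following is a small user-space sketch of that decoding; the `SKETCH_` macros and `decode_pf_error` are illustrative names, not kernel identifiers.

```c
#include <stdbool.h>

/* Low bits of the x86 page-fault error code (hardware-defined layout). */
#define SKETCH_PF_PROT  (1UL << 0) /* 0: page not present, 1: protection fault */
#define SKETCH_PF_WRITE (1UL << 1) /* 0: read access,      1: write access */
#define SKETCH_PF_USER  (1UL << 2) /* 0: kernel mode,      1: user mode */

/* Hypothetical decoded view of an error code. */
struct pf_info {
	bool protection;
	bool write;
	bool user;
};

static struct pf_info decode_pf_error(unsigned long error_code)
{
	struct pf_info info;

	info.protection = (error_code & SKETCH_PF_PROT) != 0;
	info.write = (error_code & SKETCH_PF_WRITE) != 0;
	info.user = (error_code & SKETCH_PF_USER) != 0;
	return info;
}
```

Decoding 6 yields a user-mode write to a non-present page, which matches the comment in the patch; a protection fault on an otherwise identical access would be 7.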
* Re: [RFC] fixing the UML failure root cause 2011-10-11 17:24 ` [RFC] fixing the UML failure root cause Andrew Lutomirski @ 2011-10-13 6:19 ` Linus Torvalds 2011-10-13 8:40 ` Andrew Lutomirski 2011-10-14 19:53 ` [RFC] fixing the UML failure root cause richard -rw- weinberger 1 sibling, 1 reply; 50+ messages in thread From: Linus Torvalds @ 2011-10-13 6:19 UTC (permalink / raw) To: Andrew Lutomirski Cc: Ingo Molnar, richard -rw- weinberger, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Wed, Oct 12, 2011 at 5:24 AM, Andrew Lutomirski <luto@mit.edu> wrote: > > So here's a different approach. It's not perfect: it always blames > SEGV_MAPERR instead of SEGV_ACCERR. I implemented it for vgettimeofday > but not the other two vsyscalls. > > What do you think of this approach? If it seems good, I'll finish the > patch and submit it. I think the approach is valid, but you should *not* do this as some kind of crazy byte-by-byte copy_to_user() emulation. Do the "copy tz to user mode" as individual "put_user()" calls for tv_sec/tv_usec/timezone. IOW, there are three words being written to user mode, not "two memcpy's". Other than that, there doesn't seem to be anything wrong. Linus ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC] fixing the UML failure root cause 2011-10-13 6:19 ` Linus Torvalds @ 2011-10-13 8:40 ` Andrew Lutomirski 2011-10-14 4:46 ` Linus Torvalds 0 siblings, 1 reply; 50+ messages in thread From: Andrew Lutomirski @ 2011-10-13 8:40 UTC (permalink / raw) To: Linus Torvalds Cc: Ingo Molnar, richard -rw- weinberger, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Wed, Oct 12, 2011 at 11:19 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Wed, Oct 12, 2011 at 5:24 AM, Andrew Lutomirski <luto@mit.edu> wrote: >> >> So here's a different approach. It's not perfect: it always blames >> SEGV_MAPERR instead of SEGV_ACCERR. I implemented it for vgettimeofday >> but not the other two vsyscalls. >> >> What do you think of this approach? If it seems good, I'll finish the >> patch and submit it. > > I think the approach is valid, but you should *not* do this as some > kind of crazy byte-by-byte copy_to_user() emulation. > > Do the "copy tz to user mode" as individual "put_user()" calls for > tv_sec/tv_usec/timezone. IOW, there are three words being written to > user mode, not "two memcpy's". How does that work? The tricky case is when one of those three words spans a page boundary if the access to the first page is valid, but the access to the second page is not. When that happens, if we report the fault as coming from the first page, then UML is likely to think the fault was spurious and enter an infinite loop. To handle that case, I'll need 4- and 8-byte versions of put_user_sig (IIRC vgetcpu uses unsigneds) that check whether their destinations span page boundaries and complain accordingly, which will end up as more code than I have now. --Andy > > Other than that, there doesn't seem to be anything wrong. > > Linus > ^ permalink raw reply [flat|nested] 50+ messages in thread
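The concern about which address to report can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct sketch_mapping` pretends the address space has a single writable range, and `first_fault_addr` is a hypothetical helper, not anything from the kernel. It shows that when a word straddles the end of the valid range, the address to report is the first inaccessible byte, not the start of the word.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one writable range [start, end) and nothing else. */
struct sketch_mapping {
	uintptr_t start;
	uintptr_t end; /* exclusive */
};

static bool sketch_writable(const struct sketch_mapping *m, uintptr_t addr)
{
	return addr >= m->start && addr < m->end;
}

/*
 * Returns true and stores the first inaccessible address in *fault_addr
 * if any byte of [dest, dest + len) is not writable; returns false if
 * the whole range is writable.
 */
static bool first_fault_addr(const struct sketch_mapping *m, uintptr_t dest,
			     size_t len, uintptr_t *fault_addr)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (!sketch_writable(m, dest + i)) {
			*fault_addr = dest + i;
			return true;
		}
	}
	return false;
}
```

With one valid page at 0x10000 and an 8-byte write starting 4 bytes before its end, the reported address is the start of the following page. Reporting the word's start address instead would point into the valid page, which is the situation described above as looking spurious to UML.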
* Re: [RFC] fixing the UML failure root cause 2011-10-13 8:40 ` Andrew Lutomirski @ 2011-10-14 4:46 ` Linus Torvalds 2011-10-14 6:30 ` Andrew Lutomirski 0 siblings, 1 reply; 50+ messages in thread From: Linus Torvalds @ 2011-10-14 4:46 UTC (permalink / raw) To: Andrew Lutomirski Cc: Ingo Molnar, richard -rw- weinberger, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel [-- Attachment #1: Type: text/plain, Size: 857 bytes --] On Thu, Oct 13, 2011 at 8:40 PM, Andrew Lutomirski <luto@mit.edu> wrote: > > How does that work? The tricky case is when one of those three words > spans a page boundary if the access to the first page is valid, but > the access to the second page is not. When that happens, if we report > the fault as coming from the first page, then UML is likely to > think the fault was spurious and enter an infinite loop. Hmm. Gaah, I just find that memcpy loop disgusting. We already have that ugly "uaccess_error" crap in handle_exception(), we might as well do something like the attached and just say "hey, now you can catch the page fault information for a get_user/put_user fault". Isn't that much nicer? You don't even have to check each word, you can just take the last exception info from the thread-info. 
Linus [-- Attachment #2: patch.diff --] [-- Type: text/x-patch, Size: 1062 bytes --] arch/x86/include/asm/thread_info.h | 2 ++ arch/x86/mm/fault.c | 6 +++++- 2 files changed, 7 insertions(+), 1 deletions(-) diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index a1fe5c127b52..e8d245febfae 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -41,6 +41,8 @@ struct thread_info { __u8 supervisor_stack[0]; #endif int uaccess_err; + int uaccess_error_code; + unsigned long uaccess_addr; }; #define INIT_THREAD_INFO(tsk) \ diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 0d17c8c50acd..bbbee6e6a95b 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -628,8 +628,12 @@ no_context(struct pt_regs *regs, unsigned long error_code, int sig; /* Are we prepared to handle this kernel fault? */ - if (fixup_exception(regs)) + if (fixup_exception(regs)) { + struct thread_info *ti = current_thread_info(); + ti->uaccess_error_code = error_code; + ti->uaccess_addr = address; return; + } /* * 32-bit: ^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [RFC] fixing the UML failure root cause 2011-10-14 4:46 ` Linus Torvalds @ 2011-10-14 6:30 ` Andrew Lutomirski 2011-10-14 20:10 ` Linus Torvalds 0 siblings, 1 reply; 50+ messages in thread From: Andrew Lutomirski @ 2011-10-14 6:30 UTC (permalink / raw) To: Linus Torvalds Cc: Ingo Molnar, richard -rw- weinberger, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Thu, Oct 13, 2011 at 9:46 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Thu, Oct 13, 2011 at 8:40 PM, Andrew Lutomirski <luto@mit.edu> wrote: >> >> How does that work? The tricky case is when one of those three words >> spans a page boundary if the access to the first page is valid, but >> the access to the second page is not. When that happens, if we report >> the fault as coming from the first page, then UML is likely to >> think the fault was spurious and enter an infinite loop. > > Hmm. Gaah, I just find that memcpy loop disgusting. > Yeah, it's not pretty. > We already have that ugly "uaccess_error" crap in handle_exception(), > we might as well do something like the attached and just say "hey, now > you can catch the page fault information for a get_user/put_user > fault". > > Isn't that much nicer? I actually tried this. To really get it right, though, I also need to either hook the access_ok failure paths (either every single one or just the ones that matter for those three syscalls, which could be fragile) or to check access_ok separately in the vsyscall emulation code. This also takes up 16 bytes of stack just to support a corner case of a legacy code path. Another idea is to have a flag that asks the fault handlers to call force_sig_info for us. That's just one bit of per-thread state. Then the vsyscall emulation code could check access_ok, force a signal if access is not ok, then set the flag and do the syscall. 
And maybe some processes would want to opt in to that mode anyway -- arguably EFAULT is a serious programmer error and should be dealt with more harshly than other syscall misuses. Admittedly, UML probably doesn't care about recovering vgettimeofday pointed at kernel space... --Andy ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC] fixing the UML failure root cause 2011-10-14 6:30 ` Andrew Lutomirski @ 2011-10-14 20:10 ` Linus Torvalds 2011-10-21 21:01 ` [PATCH] x86-64: Set siginfo and context on vsyscall emulation faults Andy Lutomirski 0 siblings, 1 reply; 50+ messages in thread From: Linus Torvalds @ 2011-10-14 20:10 UTC (permalink / raw) To: Andrew Lutomirski Cc: Ingo Molnar, richard -rw- weinberger, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Fri, Oct 14, 2011 at 6:30 PM, Andrew Lutomirski <luto@mit.edu> wrote: > > Another idea is to have a flag that asks the fault handlers to call > force_sig_info for us. That's just one bit of per-thread state. Then > the vsyscall emulation code could check access_ok, force a signal if > access is not ok, then set the flag and do the syscall. And maybe > some processes would want to opt in to that mode anyway -- arguably > EFAULT is a serious programmer error and should be dealt with more > harshly than other syscall misuses. Ok, so I really like that approach. I could easily see some process saying "I want a SIGSEGV in addition to the EFAULT that I always get". And yes, it would fix the vsyscall emulation code, which could just save the thread flag, set it, do the accesses, and restore it to the old value. Please make it so, Linus ^ permalink raw reply [flat|nested] 50+ messages in thread
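The save, set, access, restore sequence described above can be sketched in plain user-space C. Everything here is a stand-in chosen for illustration: `struct sketch_thread`, its `signal_on_fault` field, and both functions are hypothetical and only model the control flow, with a counter in place of real signal delivery.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-thread state under discussion. */
struct sketch_thread {
	bool signal_on_fault; /* the one bit of per-thread state */
	int signals_sent;     /* counts signals the fault path would raise */
};

/* Stand-in for a user-memory access; 'faults' simulates a bad pointer. */
static int sketch_user_access(struct sketch_thread *t, bool faults)
{
	if (!faults)
		return 0;
	if (t->signal_on_fault)
		t->signals_sent++; /* fault path raises a signal on request */
	return -EFAULT;
}

/* Emulation path: save the flag, set it, do the access, restore it. */
static int sketch_emulate(struct sketch_thread *t, bool faults)
{
	bool prev = t->signal_on_fault;
	int ret;

	t->signal_on_fault = true;
	ret = sketch_user_access(t, faults);
	t->signal_on_fault = prev;
	return ret;
}
```

Restoring the previous value rather than clearing the flag keeps the sketch compatible with the idea that a process might have opted in to the mode on its own.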
* [PATCH] x86-64: Set siginfo and context on vsyscall emulation faults 2011-10-14 20:10 ` Linus Torvalds @ 2011-10-21 21:01 ` Andy Lutomirski 2011-10-22 4:46 ` Linus Torvalds 0 siblings, 1 reply; 50+ messages in thread From: Andy Lutomirski @ 2011-10-21 21:01 UTC (permalink / raw) To: Linus Torvalds, x86 Cc: Ingo Molnar, richard -rw- weinberger, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel, Andy Lutomirski To make this work, we teach the page fault handler how to send signals on failed uaccess. This only works for user addresses (kernel addresses will never hit the page fault handler in the first place), so we need to generate signals for those separately. This gets the tricky case right: if the user buffer spans multiple pages and only the second page is invalid, we set cr2 and si_addr correctly. UML relies on this behavior to "fault in" pages as needed. We steal a bit from thread_info.uaccess_err to enable this. Before this change, uaccess_err was a 32-bit boolean value. This fixes issues with UML when vsyscall=emulate. Reported-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andy Lutomirski <luto@amacapital.net> --- I've tested this briefly on the UML image that used to blow it. It seems to work. It also passes my little sigcontext test. 
arch/x86/include/asm/thread_info.h | 3 +- arch/x86/include/asm/uaccess.h | 2 +- arch/x86/kernel/vsyscall_64.c | 67 +++++++++++++++++++++++++++++++---- arch/x86/mm/extable.c | 2 +- arch/x86/mm/fault.c | 22 ++++++++--- 5 files changed, 79 insertions(+), 17 deletions(-) diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index a1fe5c1..25ebd79 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -40,7 +40,8 @@ struct thread_info { */ __u8 supervisor_stack[0]; #endif - int uaccess_err; + int sig_on_uaccess_error:1; + int uaccess_err:1; /* uaccess failed */ }; #define INIT_THREAD_INFO(tsk) \ diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 36361bf..8be5f54 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -462,7 +462,7 @@ struct __large_struct { unsigned long buf[100]; }; barrier(); #define uaccess_catch(err) \ - (err) |= current_thread_info()->uaccess_err; \ + (err) |= (current_thread_info()->uaccess_err ? 
-EFAULT : 0); \ current_thread_info()->uaccess_err = prev_err; \ } while (0) diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c index 18ae83d..c6dd0e6 100644 --- a/arch/x86/kernel/vsyscall_64.c +++ b/arch/x86/kernel/vsyscall_64.c @@ -139,11 +139,38 @@ static int addr_to_vsyscall_nr(unsigned long addr) return nr; } +static bool write_ok_or_segv(unsigned long ptr, size_t size) +{ + if (ptr == 0) + return true; + + if (!access_ok(VERIFY_WRITE, (void __user *)ptr, size)) { + siginfo_t info; + struct thread_struct *thread = &current->thread; + + thread->error_code = 6; /* user fault, no page, write */ + thread->cr2 = ptr; + thread->trap_no = 14; + + memset(&info, 0, sizeof(info)); + info.si_signo = SIGSEGV; + info.si_errno = 0; + info.si_code = SEGV_MAPERR; + info.si_addr = (void __user *)ptr; + + force_sig_info(SIGSEGV, &info, current); + return false; + } else { + return true; + } +} + bool emulate_vsyscall(struct pt_regs *regs, unsigned long address) { struct task_struct *tsk; unsigned long caller; int vsyscall_nr; + int prev_sig_on_uaccess_error; long ret; /* @@ -179,35 +206,59 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address) if (seccomp_mode(&tsk->seccomp)) do_exit(SIGKILL); + /* + * With a real vsyscall, page faults cause SIGSEGV. We want to + * preserve that behavior to make writing exploits harder. 
+ */ + prev_sig_on_uaccess_error = current_thread_info()->sig_on_uaccess_error; + current_thread_info()->sig_on_uaccess_error = 1; + + ret = -EFAULT; switch (vsyscall_nr) { case 0: + if (!write_ok_or_segv(regs->di, sizeof(struct timeval)) || + !write_ok_or_segv(regs->si, sizeof(struct timezone))) + break; + ret = sys_gettimeofday( (struct timeval __user *)regs->di, (struct timezone __user *)regs->si); break; case 1: + if (!write_ok_or_segv(regs->di, sizeof(time_t))) + break; + ret = sys_time((time_t __user *)regs->di); break; case 2: + if (!write_ok_or_segv(regs->di, sizeof(unsigned)) || + !write_ok_or_segv(regs->si, sizeof(unsigned))) + break; + ret = sys_getcpu((unsigned __user *)regs->di, (unsigned __user *)regs->si, 0); break; } + current_thread_info()->sig_on_uaccess_error = prev_sig_on_uaccess_error; + if (ret == -EFAULT) { - /* - * Bad news -- userspace fed a bad pointer to a vsyscall. - * - * With a real vsyscall, that would have caused SIGSEGV. - * To make writing reliable exploits using the emulated - * vsyscalls harder, generate SIGSEGV here as well. - */ + /* Bad news -- userspace fed a bad pointer to a vsyscall. */ warn_bad_vsyscall(KERN_INFO, regs, "vsyscall fault (exploit attempt?)"); - goto sigsegv; + + /* + * If we failed to generate a signal for any reason, + * generate one here. (This should be impossible.) + */ + if (WARN_ON_ONCE(!sigismember(&tsk->pending.signal, SIGBUS) && + !sigismember(&tsk->pending.signal, SIGSEGV))) + goto sigsegv; + + return true; /* Don't emulate the ret. 
*/ } regs->ax = ret; diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c index d0474ad..1fb85db 100644 --- a/arch/x86/mm/extable.c +++ b/arch/x86/mm/extable.c @@ -25,7 +25,7 @@ int fixup_exception(struct pt_regs *regs) if (fixup) { /* If fixup is less than 16, it means uaccess error */ if (fixup->fixup < 16) { - current_thread_info()->uaccess_err = -EFAULT; + current_thread_info()->uaccess_err = 1; regs->ip += fixup->fixup; return 1; } diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 0d17c8c..85bec26 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -620,7 +620,7 @@ pgtable_bad(struct pt_regs *regs, unsigned long error_code, static noinline void no_context(struct pt_regs *regs, unsigned long error_code, - unsigned long address) + unsigned long address, int signal, int si_code) { struct task_struct *tsk = current; unsigned long *stackend; @@ -628,8 +628,17 @@ no_context(struct pt_regs *regs, unsigned long error_code, int sig; /* Are we prepared to handle this kernel fault? */ - if (fixup_exception(regs)) + if (fixup_exception(regs)) { + if (current_thread_info()->sig_on_uaccess_error && signal) { + tsk->thread.trap_no = 14; + tsk->thread.error_code = error_code | PF_USER; + tsk->thread.cr2 = address; + + /* XXX: hwpoison faults will set the wrong code. */ + force_sig_info_fault(signal, si_code, address, tsk, 0); + } return; + } /* * 32-bit: @@ -749,7 +758,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code, if (is_f00f_bug(regs, address)) return; - no_context(regs, error_code, address); + no_context(regs, error_code, address, SIGSEGV, si_code); } static noinline void @@ -813,7 +822,7 @@ do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address, /* Kernel mode? 
Handle exceptions or die: */ if (!(error_code & PF_USER)) { - no_context(regs, error_code, address); + no_context(regs, error_code, address, SIGBUS, BUS_ADRERR); return; } @@ -848,7 +857,7 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code, if (!(fault & VM_FAULT_RETRY)) up_read(&current->mm->mmap_sem); if (!(error_code & PF_USER)) - no_context(regs, error_code, address); + no_context(regs, error_code, address, 0, 0); return 1; } if (!(fault & VM_FAULT_ERROR)) @@ -858,7 +867,8 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code, /* Kernel mode? Handle exceptions or die: */ if (!(error_code & PF_USER)) { up_read(&current->mm->mmap_sem); - no_context(regs, error_code, address); + no_context(regs, error_code, address, + SIGSEGV, SEGV_MAPERR); return 1; } -- 1.7.6.4 ^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCH] x86-64: Set siginfo and context on vsyscall emulation faults 2011-10-21 21:01 ` [PATCH] x86-64: Set siginfo and context on vsyscall emulation faults Andy Lutomirski @ 2011-10-22 4:46 ` Linus Torvalds 2011-10-22 9:07 ` Andy Lutomirski 0 siblings, 1 reply; 50+ messages in thread From: Linus Torvalds @ 2011-10-22 4:46 UTC (permalink / raw) To: Andy Lutomirski Cc: x86, Ingo Molnar, richard -rw- weinberger, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel On Sat, Oct 22, 2011 at 12:01 AM, Andy Lutomirski <luto@amacapital.net> wrote: > > +static bool write_ok_or_segv(unsigned long ptr, size_t size) > +{ > + if (ptr == 0) > + return true; Why is ptr==0 special? That makes no sense. Also, this whole function makes the notion of setting the "sigsegv on fault" flag much less interesting. It would be much better if access_ok() (including the cases embedded in get_user/put_user/etc) just did it right automatically for everything, rather than special-casing it for just this. I wonder if we could just make access_ok() use a trap instead of just the regular compares (and then in the trap handler do the same logic as in the page fault handler)? Sadly, the 'bounds' instruction doesn't work for this (in 32-bit mode it does a *signed* compare, and in 64-bit mode it no longer exists), but something like that might. That said, I think that your patch looks acceptable as a "let's fix vsyscalls without doing the bigger change". But I really don't see why ptr==0 would be special. So I think your write_ok_or_segv() function should just be static bool write_ok_or_segv(unsigned long ptr, size_t size) { if (access_ok(ptr, size)) return true; .. send signal ... return false; } instead of that odd thing you have now. Linus ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] x86-64: Set siginfo and context on vsyscall emulation faults 2011-10-22 4:46 ` Linus Torvalds @ 2011-10-22 9:07 ` Andy Lutomirski 2011-11-08 0:33 ` [PATCH 0/2] Fix and re-enable vsyscall=emulate Andy Lutomirski 0 siblings, 1 reply; 50+ messages in thread From: Andy Lutomirski @ 2011-10-22 9:07 UTC (permalink / raw) To: Linus Torvalds Cc: x86, Ingo Molnar, richard -rw- weinberger, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel On Fri, Oct 21, 2011 at 9:46 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Sat, Oct 22, 2011 at 12:01 AM, Andy Lutomirski <luto@amacapital.net> wrote: >> >> +static bool write_ok_or_segv(unsigned long ptr, size_t size) >> +{ >> + if (ptr == 0) >> + return true; > > Why is ptr==0 special? That makes no sense. > Pure laziness. Null pointers to the vsyscalls are valid and mean that userspace doesn't care about the result. I could have put the check in the caller just as easily. > Also, this whole function makes the notion of setting the "sigsegv on > fault" flag much less interesting. It would be much better if > access_ok() (including the cases embedded in get_user/put_user/etc) > just did it right automatically for everything, rather than > special-casing it for just this. > Agreed. If I add an option to let userspace opt in to the signal-sending behavior, I'd want to convince myself that all callers of access_ok should be affected. > I wonder if we could just make access_ok() use a trap instead of just > the regular compares (and then in the trap handler do the same logic > as in the page fault handler)? Sadly, the 'bounds' instruction doesn't > work for this (in 32-bit mode it does a *signed* compare, and in > 64-bit mode it no longer exists), but something like that might. > I suspect that bounds is considerably slower than a comparison anyway. FWIW, there's a different optimization that could make a lot of this code much nicer: using asm goto for the failure path in get_user, etc. 
The failure path is already a branch, and if gcc could be convinced to generate sensible code for: if (put_user(...)) goto out; if (put_user(...)) goto out; if (put_user(...)) goto out; if (put_user(...)) goto out; then the uaccess_err mechanism and a whole lot of bitwise ors could go away. Sadly, gcc (at least 4.5 and 4.6) has weird limitations on the kind of constraints allowed on asm goto that, IIRC, make get_user impossible and put_user a little dicey. (I could have that backwards.) > That said, I think that your patch looks acceptable as a "let's fix > vsyscalls without doing the bigger change". But I really don't see why > ptr==0 would be special. > > So I think your write_ok_or_segv() function should just be > > static bool write_ok_or_segv(unsigned long ptr, size_t size) > { > if (access_ok(ptr, size)) > return true; > > .. send signal ... > > return false; > } > > instead of that odd thing you have now. Or a comment to clarify it. Alternatively I could ignore the issue because access to 0 is okay in the access_ok sense anyway. --Andy ^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH 0/2] Fix and re-enable vsyscall=emulate 2011-10-22 9:07 ` Andy Lutomirski @ 2011-11-08 0:33 ` Andy Lutomirski 2011-11-08 0:33 ` [PATCH 1/2] x86-64: Set siginfo and context on vsyscall emulation faults Andy Lutomirski ` (2 more replies) 0 siblings, 3 replies; 50+ messages in thread From: Andy Lutomirski @ 2011-11-08 0:33 UTC (permalink / raw) To: Linus Torvalds Cc: x86, Ingo Molnar, richard -rw- weinberger, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel, Andy Lutomirski The really nice fix (wiring up access_ok failures to be able to raise signals) won't be ready on time for 3.2, so let's try the simpler fix for now. Changes from the earlier version: - Clean up the odd ptr==0 check. - Flip the default back to vsyscall=emulate Andy Lutomirski (2): x86-64: Set siginfo and context on vsyscall emulation faults x86: Default to vsyscall=emulate Documentation/kernel-parameters.txt | 7 +-- arch/x86/include/asm/thread_info.h | 3 +- arch/x86/include/asm/uaccess.h | 2 +- arch/x86/kernel/vsyscall_64.c | 77 ++++++++++++++++++++++++++++++---- arch/x86/mm/extable.c | 2 +- arch/x86/mm/fault.c | 22 +++++++--- 6 files changed, 91 insertions(+), 22 deletions(-) -- 1.7.6.4 ^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH 1/2] x86-64: Set siginfo and context on vsyscall emulation faults 2011-11-08 0:33 ` [PATCH 0/2] Fix and re-enable vsyscall=emulate Andy Lutomirski @ 2011-11-08 0:33 ` Andy Lutomirski 2011-12-05 13:23 ` [tip:x86/asm] " tip-bot for Andy Lutomirski 2011-11-08 0:33 ` [PATCH 2/2] x86: Default to vsyscall=emulate Andy Lutomirski 2011-12-02 22:47 ` [PATCH 0/2] Fix and re-enable vsyscall=emulate Andy Lutomirski 2 siblings, 1 reply; 50+ messages in thread From: Andy Lutomirski @ 2011-11-08 0:33 UTC (permalink / raw) To: Linus Torvalds Cc: x86, Ingo Molnar, richard -rw- weinberger, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel, Andy Lutomirski To make this work, we teach the page fault handler how to send signals on failed uaccess. This only works for user addresses (kernel addresses will never hit the page fault handler in the first place), so we need to generate signals for those separately. This gets the tricky case right: if the user buffer spans multiple pages and only the second page is invalid, we set cr2 and si_addr correctly. UML relies on this behavior to "fault in" pages as needed. We steal a bit from thread_info.uaccess_err to enable this. Before this change, uaccess_err was a 32-bit boolean value. This fixes issues with UML when vsyscall=emulate. 
Reported-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andy Lutomirski <luto@amacapital.net> --- arch/x86/include/asm/thread_info.h | 3 +- arch/x86/include/asm/uaccess.h | 2 +- arch/x86/kernel/vsyscall_64.c | 75 ++++++++++++++++++++++++++++++++---- arch/x86/mm/extable.c | 2 +- arch/x86/mm/fault.c | 22 ++++++++--- 5 files changed, 87 insertions(+), 17 deletions(-) diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index a1fe5c1..25ebd79 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -40,7 +40,8 @@ struct thread_info { */ __u8 supervisor_stack[0]; #endif - int uaccess_err; + int sig_on_uaccess_error:1; + int uaccess_err:1; /* uaccess failed */ }; #define INIT_THREAD_INFO(tsk) \ diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 36361bf..8be5f54 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -462,7 +462,7 @@ struct __large_struct { unsigned long buf[100]; }; barrier(); #define uaccess_catch(err) \ - (err) |= current_thread_info()->uaccess_err; \ + (err) |= (current_thread_info()->uaccess_err ? -EFAULT : 0); \ current_thread_info()->uaccess_err = prev_err; \ } while (0) diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c index b56c65de..9b05546 100644 --- a/arch/x86/kernel/vsyscall_64.c +++ b/arch/x86/kernel/vsyscall_64.c @@ -139,11 +139,40 @@ static int addr_to_vsyscall_nr(unsigned long addr) return nr; } +static bool write_ok_or_segv(unsigned long ptr, size_t size) +{ + /* + * XXX: if access_ok, get_user, and put_user handled + * sig_on_uaccess_error, this could go away. 
+ */ + + if (!access_ok(VERIFY_WRITE, (void __user *)ptr, size)) { + siginfo_t info; + struct thread_struct *thread = &current->thread; + + thread->error_code = 6; /* user fault, no page, write */ + thread->cr2 = ptr; + thread->trap_no = 14; + + memset(&info, 0, sizeof(info)); + info.si_signo = SIGSEGV; + info.si_errno = 0; + info.si_code = SEGV_MAPERR; + info.si_addr = (void __user *)ptr; + + force_sig_info(SIGSEGV, &info, current); + return false; + } else { + return true; + } +} + bool emulate_vsyscall(struct pt_regs *regs, unsigned long address) { struct task_struct *tsk; unsigned long caller; int vsyscall_nr; + int prev_sig_on_uaccess_error; long ret; /* @@ -179,35 +208,65 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address) if (seccomp_mode(&tsk->seccomp)) do_exit(SIGKILL); + /* + * With a real vsyscall, page faults cause SIGSEGV. We want to + * preserve that behavior to make writing exploits harder. + */ + prev_sig_on_uaccess_error = current_thread_info()->sig_on_uaccess_error; + current_thread_info()->sig_on_uaccess_error = 1; + + /* + * 0 is a valid user pointer (in the access_ok sense) on 32-bit and + * 64-bit, so we don't need to special-case it here. For all the + * vsyscalls, 0 means "don't write anything" not "write it at + * address 0".
+ */ + ret = -EFAULT; switch (vsyscall_nr) { case 0: + if (!write_ok_or_segv(regs->di, sizeof(struct timeval)) || + !write_ok_or_segv(regs->si, sizeof(struct timezone))) + break; + ret = sys_gettimeofday( (struct timeval __user *)regs->di, (struct timezone __user *)regs->si); break; case 1: + if (!write_ok_or_segv(regs->di, sizeof(time_t))) + break; + ret = sys_time((time_t __user *)regs->di); break; case 2: + if (!write_ok_or_segv(regs->di, sizeof(unsigned)) || + !write_ok_or_segv(regs->si, sizeof(unsigned))) + break; + ret = sys_getcpu((unsigned __user *)regs->di, (unsigned __user *)regs->si, 0); break; } + current_thread_info()->sig_on_uaccess_error = prev_sig_on_uaccess_error; + if (ret == -EFAULT) { - /* - * Bad news -- userspace fed a bad pointer to a vsyscall. - * - * With a real vsyscall, that would have caused SIGSEGV. - * To make writing reliable exploits using the emulated - * vsyscalls harder, generate SIGSEGV here as well. - */ + /* Bad news -- userspace fed a bad pointer to a vsyscall. */ warn_bad_vsyscall(KERN_INFO, regs, "vsyscall fault (exploit attempt?)"); - goto sigsegv; + + /* + * If we failed to generate a signal for any reason, + * generate one here. (This should be impossible.) + */ + if (WARN_ON_ONCE(!sigismember(&tsk->pending.signal, SIGBUS) && + !sigismember(&tsk->pending.signal, SIGSEGV))) + goto sigsegv; + + return true; /* Don't emulate the ret. 
*/ } regs->ax = ret; diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c index d0474ad..1fb85db 100644 --- a/arch/x86/mm/extable.c +++ b/arch/x86/mm/extable.c @@ -25,7 +25,7 @@ int fixup_exception(struct pt_regs *regs) if (fixup) { /* If fixup is less than 16, it means uaccess error */ if (fixup->fixup < 16) { - current_thread_info()->uaccess_err = -EFAULT; + current_thread_info()->uaccess_err = 1; regs->ip += fixup->fixup; return 1; } diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 0d17c8c..85bec26 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -620,7 +620,7 @@ pgtable_bad(struct pt_regs *regs, unsigned long error_code, static noinline void no_context(struct pt_regs *regs, unsigned long error_code, - unsigned long address) + unsigned long address, int signal, int si_code) { struct task_struct *tsk = current; unsigned long *stackend; @@ -628,8 +628,17 @@ no_context(struct pt_regs *regs, unsigned long error_code, int sig; /* Are we prepared to handle this kernel fault? */ - if (fixup_exception(regs)) + if (fixup_exception(regs)) { + if (current_thread_info()->sig_on_uaccess_error && signal) { + tsk->thread.trap_no = 14; + tsk->thread.error_code = error_code | PF_USER; + tsk->thread.cr2 = address; + + /* XXX: hwpoison faults will set the wrong code. */ + force_sig_info_fault(signal, si_code, address, tsk, 0); + } return; + } /* * 32-bit: @@ -749,7 +758,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code, if (is_f00f_bug(regs, address)) return; - no_context(regs, error_code, address); + no_context(regs, error_code, address, SIGSEGV, si_code); } static noinline void @@ -813,7 +822,7 @@ do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address, /* Kernel mode? 
Handle exceptions or die: */ if (!(error_code & PF_USER)) { - no_context(regs, error_code, address); + no_context(regs, error_code, address, SIGBUS, BUS_ADRERR); return; } @@ -848,7 +857,7 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code, if (!(fault & VM_FAULT_RETRY)) up_read(&current->mm->mmap_sem); if (!(error_code & PF_USER)) - no_context(regs, error_code, address); + no_context(regs, error_code, address, 0, 0); return 1; } if (!(fault & VM_FAULT_ERROR)) @@ -858,7 +867,8 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code, /* Kernel mode? Handle exceptions or die: */ if (!(error_code & PF_USER)) { up_read(&current->mm->mmap_sem); - no_context(regs, error_code, address); + no_context(regs, error_code, address, + SIGSEGV, SEGV_MAPERR); return 1; } -- 1.7.6.4 ^ permalink raw reply related [flat|nested] 50+ messages in thread
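The page-spanning case described in the patch above can be observed from userspace. What follows is a minimal sketch, assuming Linux; it is not code from the patch, and the helper name second_page_fault_offset is hypothetical. It maps two pages, makes the second one inaccessible, writes across the boundary, and reports the offset that si_addr gives for the fault. A process that wants to "fault in" the right page, as UML does, needs that address to point at the first inaccessible byte rather than at the start of the buffer.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for MAP_ANONYMOUS on strict toolchains */
#endif
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_jmp;
static void *volatile fault_addr;

/* SA_SIGINFO handler: record the faulting address and unwind. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)ctx;
	fault_addr = info->si_addr;
	siglongjmp(fault_jmp, 1);
}

/*
 * Hypothetical helper: returns the offset of the faulting byte from
 * the start of a two-page buffer whose second page is PROT_NONE,
 * or -1 if setup failed or no fault was observed.
 */
long second_page_fault_offset(void)
{
	long page = sysconf(_SC_PAGESIZE);
	struct sigaction sa, old;
	long result = -1;
	char *buf;

	buf = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;
	if (mprotect(buf + page, page, PROT_NONE) != 0) {
		munmap(buf, 2 * page);
		return -1;
	}

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_segv;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGSEGV, &sa, &old) != 0) {
		munmap(buf, 2 * page);
		return -1;
	}

	if (sigsetjmp(fault_jmp, 1) == 0) {
		volatile char *p = buf + page - 8;
		int i;

		/* Bytes 0..7 land on the valid page; byte 8 does not. */
		for (i = 0; i < 16; i++)
			p[i] = 1;
	} else {
		result = (long)((char *)fault_addr - buf);
	}

	sigaction(SIGSEGV, &old, NULL);
	munmap(buf, 2 * page);
	return result;
}
```

On a system that reports the exact faulting address, the returned offset should equal the page size, even though the write started eight bytes earlier on a valid page.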
* [tip:x86/asm] x86-64: Set siginfo and context on vsyscall emulation faults 2011-11-08 0:33 ` [PATCH 1/2] x86-64: Set siginfo and context on vsyscall emulation faults Andy Lutomirski @ 2011-12-05 13:23 ` tip-bot for Andy Lutomirski 0 siblings, 0 replies; 50+ messages in thread From: tip-bot for Andy Lutomirski @ 2011-12-05 13:23 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, hpa, mingo, torvalds, bunk, luto, richard.weinberger, tglx, hpa, mingo Commit-ID: 4fc3490114bb159bd4fff1b3c96f4320fe6fb08f Gitweb: http://git.kernel.org/tip/4fc3490114bb159bd4fff1b3c96f4320fe6fb08f Author: Andy Lutomirski <luto@amacapital.net> AuthorDate: Mon, 7 Nov 2011 16:33:40 -0800 Committer: Ingo Molnar <mingo@elte.hu> CommitDate: Mon, 5 Dec 2011 12:17:27 +0100 x86-64: Set siginfo and context on vsyscall emulation faults To make this work, we teach the page fault handler how to send signals on failed uaccess. This only works for user addresses (kernel addresses will never hit the page fault handler in the first place), so we need to generate signals for those separately. This gets the tricky case right: if the user buffer spans multiple pages and only the second page is invalid, we set cr2 and si_addr correctly. UML relies on this behavior to "fault in" pages as needed. We steal a bit from thread_info.uaccess_err to enable this. Before this change, uaccess_err was a 32-bit boolean value. This fixes issues with UML when vsyscall=emulate. Reported-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: richard -rw- weinberger <richard.weinberger@gmail.com> Cc: H. 
Peter Anvin <hpa@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/4c8f91de7ec5cd2ef0f59521a04e1015f11e42b4.1320712291.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@elte.hu> --- arch/x86/include/asm/thread_info.h | 3 +- arch/x86/include/asm/uaccess.h | 2 +- arch/x86/kernel/vsyscall_64.c | 75 ++++++++++++++++++++++++++++++++---- arch/x86/mm/extable.c | 2 +- arch/x86/mm/fault.c | 22 ++++++++--- 5 files changed, 87 insertions(+), 17 deletions(-) diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index a1fe5c1..25ebd79 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -40,7 +40,8 @@ struct thread_info { */ __u8 supervisor_stack[0]; #endif - int uaccess_err; + int sig_on_uaccess_error:1; + int uaccess_err:1; /* uaccess failed */ }; #define INIT_THREAD_INFO(tsk) \ diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 36361bf..8be5f54 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -462,7 +462,7 @@ struct __large_struct { unsigned long buf[100]; }; barrier(); #define uaccess_catch(err) \ - (err) |= current_thread_info()->uaccess_err; \ + (err) |= (current_thread_info()->uaccess_err ? -EFAULT : 0); \ current_thread_info()->uaccess_err = prev_err; \ } while (0) diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c index e4d4a22..8084bec 100644 --- a/arch/x86/kernel/vsyscall_64.c +++ b/arch/x86/kernel/vsyscall_64.c @@ -140,11 +140,40 @@ static int addr_to_vsyscall_nr(unsigned long addr) return nr; } +static bool write_ok_or_segv(unsigned long ptr, size_t size) +{ + /* + * XXX: if access_ok, get_user, and put_user handled + * sig_on_uaccess_error, this could go away. 
+ */ + + if (!access_ok(VERIFY_WRITE, (void __user *)ptr, size)) { + siginfo_t info; + struct thread_struct *thread = &current->thread; + + thread->error_code = 6; /* user fault, no page, write */ + thread->cr2 = ptr; + thread->trap_no = 14; + + memset(&info, 0, sizeof(info)); + info.si_signo = SIGSEGV; + info.si_errno = 0; + info.si_code = SEGV_MAPERR; + info.si_addr = (void __user *)ptr; + + force_sig_info(SIGSEGV, &info, current); + return false; + } else { + return true; + } +} + bool emulate_vsyscall(struct pt_regs *regs, unsigned long address) { struct task_struct *tsk; unsigned long caller; int vsyscall_nr; + int prev_sig_on_uaccess_error; long ret; /* @@ -180,35 +209,65 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address) if (seccomp_mode(&tsk->seccomp)) do_exit(SIGKILL); + /* + * With a real vsyscall, page faults cause SIGSEGV. We want to + * preserve that behavior to make writing exploits harder. + */ + prev_sig_on_uaccess_error = current_thread_info()->sig_on_uaccess_error; + current_thread_info()->sig_on_uaccess_error = 1; + + /* + * 0 is a valid user pointer (in the access_ok sense) on 32-bit and + * 64-bit, so we don't need to special-case it here. For all the + * vsyscalls, 0 means "don't write anything" not "write it at + * address 0".
+ */ + ret = -EFAULT; switch (vsyscall_nr) { case 0: + if (!write_ok_or_segv(regs->di, sizeof(struct timeval)) || + !write_ok_or_segv(regs->si, sizeof(struct timezone))) + break; + ret = sys_gettimeofday( (struct timeval __user *)regs->di, (struct timezone __user *)regs->si); break; case 1: + if (!write_ok_or_segv(regs->di, sizeof(time_t))) + break; + ret = sys_time((time_t __user *)regs->di); break; case 2: + if (!write_ok_or_segv(regs->di, sizeof(unsigned)) || + !write_ok_or_segv(regs->si, sizeof(unsigned))) + break; + ret = sys_getcpu((unsigned __user *)regs->di, (unsigned __user *)regs->si, 0); break; } + current_thread_info()->sig_on_uaccess_error = prev_sig_on_uaccess_error; + if (ret == -EFAULT) { - /* - * Bad news -- userspace fed a bad pointer to a vsyscall. - * - * With a real vsyscall, that would have caused SIGSEGV. - * To make writing reliable exploits using the emulated - * vsyscalls harder, generate SIGSEGV here as well. - */ + /* Bad news -- userspace fed a bad pointer to a vsyscall. */ warn_bad_vsyscall(KERN_INFO, regs, "vsyscall fault (exploit attempt?)"); - goto sigsegv; + + /* + * If we failed to generate a signal for any reason, + * generate one here. (This should be impossible.) + */ + if (WARN_ON_ONCE(!sigismember(&tsk->pending.signal, SIGBUS) && + !sigismember(&tsk->pending.signal, SIGSEGV))) + goto sigsegv; + + return true; /* Don't emulate the ret. 
*/ } regs->ax = ret; diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c index d0474ad..1fb85db 100644 --- a/arch/x86/mm/extable.c +++ b/arch/x86/mm/extable.c @@ -25,7 +25,7 @@ int fixup_exception(struct pt_regs *regs) if (fixup) { /* If fixup is less than 16, it means uaccess error */ if (fixup->fixup < 16) { - current_thread_info()->uaccess_err = -EFAULT; + current_thread_info()->uaccess_err = 1; regs->ip += fixup->fixup; return 1; } diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 5db0490..9d74824 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -626,7 +626,7 @@ pgtable_bad(struct pt_regs *regs, unsigned long error_code, static noinline void no_context(struct pt_regs *regs, unsigned long error_code, - unsigned long address) + unsigned long address, int signal, int si_code) { struct task_struct *tsk = current; unsigned long *stackend; @@ -634,8 +634,17 @@ no_context(struct pt_regs *regs, unsigned long error_code, int sig; /* Are we prepared to handle this kernel fault? */ - if (fixup_exception(regs)) + if (fixup_exception(regs)) { + if (current_thread_info()->sig_on_uaccess_error && signal) { + tsk->thread.trap_no = 14; + tsk->thread.error_code = error_code | PF_USER; + tsk->thread.cr2 = address; + + /* XXX: hwpoison faults will set the wrong code. */ + force_sig_info_fault(signal, si_code, address, tsk, 0); + } return; + } /* * 32-bit: @@ -755,7 +764,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code, if (is_f00f_bug(regs, address)) return; - no_context(regs, error_code, address); + no_context(regs, error_code, address, SIGSEGV, si_code); } static noinline void @@ -819,7 +828,7 @@ do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address, /* Kernel mode? 
Handle exceptions or die: */ if (!(error_code & PF_USER)) { - no_context(regs, error_code, address); + no_context(regs, error_code, address, SIGBUS, BUS_ADRERR); return; } @@ -854,7 +863,7 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code, if (!(fault & VM_FAULT_RETRY)) up_read(&current->mm->mmap_sem); if (!(error_code & PF_USER)) - no_context(regs, error_code, address); + no_context(regs, error_code, address, 0, 0); return 1; } if (!(fault & VM_FAULT_ERROR)) @@ -864,7 +873,8 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code, /* Kernel mode? Handle exceptions or die: */ if (!(error_code & PF_USER)) { up_read(&current->mm->mmap_sem); - no_context(regs, error_code, address); + no_context(regs, error_code, address, + SIGSEGV, SEGV_MAPERR); return 1; } ^ permalink raw reply related [flat|nested] 50+ messages in thread
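The commit above turns uaccess_err into a one-bit flag and has uaccess_catch() translate it back with "? -EFAULT : 0". A small userspace sketch of why that translation matters follows; the struct and function names are hypothetical, and the fields are declared unsigned here, which differs from the patch (it uses plain int bit-fields, whose signedness is implementation-defined, so a stored 1 may read back as -1). OR-ing the raw flag into an error value would therefore yield 1 or -1, neither of which is -EFAULT.

```c
#include <errno.h>

/* Hypothetical stand-in for the two flag bits added to thread_info. */
struct uaccess_flags {
	unsigned int sig_on_uaccess_error:1;	/* raise a signal on fault */
	unsigned int uaccess_err:1;		/* a uaccess failed */
};

/*
 * Mirrors the shape of the patched uaccess_catch():
 *   (err) |= (uaccess_err ? -EFAULT : 0);
 * The flag is only tested for truth, so its stored representation
 * (1 or -1) never leaks into the returned error code.
 */
int fold_uaccess_err(int err, const struct uaccess_flags *f)
{
	return err | (f->uaccess_err ? -EFAULT : 0);
}
```

With the flag set and err initially 0, the result is -EFAULT; with the flag clear, err passes through unchanged.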
* [PATCH 2/2] x86: Default to vsyscall=emulate 2011-11-08 0:33 ` [PATCH 0/2] Fix and re-enable vsyscall=emulate Andy Lutomirski 2011-11-08 0:33 ` [PATCH 1/2] x86-64: Set siginfo and context on vsyscall emulation faults Andy Lutomirski @ 2011-11-08 0:33 ` Andy Lutomirski 2011-12-05 13:24 ` [tip:x86/asm] " tip-bot for Andy Lutomirski 2011-12-02 22:47 ` [PATCH 0/2] Fix and re-enable vsyscall=emulate Andy Lutomirski 2 siblings, 1 reply; 50+ messages in thread From: Andy Lutomirski @ 2011-11-08 0:33 UTC (permalink / raw) To: Linus Torvalds Cc: x86, Ingo Molnar, richard -rw- weinberger, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel, Andy Lutomirski This reverts commit 2b666859ec323403ac9a3a441d16eab30945404b. The ABI breakage should be fixed by: commit 48c4206f5b02f28c4c78a1f5b491d3772fb64fb9 Author: Andy Lutomirski <luto@mit.edu> Date: Thu Oct 20 08:48:19 2011 -0700 x86-64: Set siginfo and context on vsyscall emulation faults Signed-off-by: Andy Lutomirski <luto@amacapital.net> --- Documentation/kernel-parameters.txt | 7 +++---- arch/x86/kernel/vsyscall_64.c | 2 +- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index d6e6724..854ed5ca 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2706,11 +2706,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. functions are at fixed addresses, they make nice targets for exploits that can control RIP. - emulate Vsyscalls turn into traps and are emulated - reasonably safely. + emulate [default] Vsyscalls turn into traps and are + emulated reasonably safely. - native [default] Vsyscalls are native syscall - instructions. + native Vsyscalls are native syscall instructions. This is a little bit faster than trapping and makes a few dynamic recompilers work better than they would in emulation mode. 
diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c index 9b05546..02e980a 100644 --- a/arch/x86/kernel/vsyscall_64.c +++ b/arch/x86/kernel/vsyscall_64.c @@ -56,7 +56,7 @@ DEFINE_VVAR(struct vsyscall_gtod_data, vsyscall_gtod_data) = .lock = __SEQLOCK_UNLOCKED(__vsyscall_gtod_data.lock), }; -static enum { EMULATE, NATIVE, NONE } vsyscall_mode = NATIVE; +static enum { EMULATE, NATIVE, NONE } vsyscall_mode = EMULATE; static int __init vsyscall_setup(char *str) { -- 1.7.6.4 ^ permalink raw reply related [flat|nested] 50+ messages in thread
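The hunk above only flips the initial value of vsyscall_mode; the body of vsyscall_setup() is not shown in this thread. As a rough userspace sketch of how a vsyscall= boot parameter value could map onto that enum, assuming a hypothetical helper parse_vsyscall_mode and assuming the string "none" corresponds to the NONE enumerator (the documentation excerpt quoted here only lists emulate and native):

```c
#include <string.h>

/* Same enumerators as the anonymous enum in vsyscall_64.c. */
enum vsyscall_mode { EMULATE, NATIVE, NONE };

/*
 * Hypothetical parser, not the kernel's vsyscall_setup().
 * Returns 0 and stores the mode on success, -1 on an unknown
 * or missing value (in which case *mode is left untouched).
 */
int parse_vsyscall_mode(const char *str, enum vsyscall_mode *mode)
{
	if (!str)
		return -1;

	if (!strcmp(str, "emulate"))
		*mode = EMULATE;
	else if (!strcmp(str, "native"))
		*mode = NATIVE;
	else if (!strcmp(str, "none"))	/* assumed spelling */
		*mode = NONE;
	else
		return -1;

	return 0;
}
```

Leaving *mode untouched on bad input means the compiled-in default (EMULATE after this patch) stays in effect when the parameter is misspelled.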
* [tip:x86/asm] x86: Default to vsyscall=emulate 2011-11-08 0:33 ` [PATCH 2/2] x86: Default to vsyscall=emulate Andy Lutomirski @ 2011-12-05 13:24 ` tip-bot for Andy Lutomirski 0 siblings, 0 replies; 50+ messages in thread From: tip-bot for Andy Lutomirski @ 2011-12-05 13:24 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, hpa, mingo, torvalds, bunk, luto, richard.weinberger, tglx, hpa, mingo Commit-ID: 2e57ae0515124af45dd889bfbd4840fd40fcc07d Gitweb: http://git.kernel.org/tip/2e57ae0515124af45dd889bfbd4840fd40fcc07d Author: Andy Lutomirski <luto@amacapital.net> AuthorDate: Mon, 7 Nov 2011 16:33:41 -0800 Committer: Ingo Molnar <mingo@elte.hu> CommitDate: Mon, 5 Dec 2011 12:17:29 +0100 x86: Default to vsyscall=emulate This essentially reverts: 2b666859ec32: x86: Default to vsyscall=native for now The ABI breakage should now be fixed by: commit 48c4206f5b02f28c4c78a1f5b491d3772fb64fb9 Author: Andy Lutomirski <luto@mit.edu> Date: Thu Oct 20 08:48:19 2011 -0700 x86-64: Set siginfo and context on vsyscall emulation faults Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: richard -rw- weinberger <richard.weinberger@gmail.com> Cc: Adrian Bunk <bunk@stusta.de> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/93154af3b2b6d208906ae02d80d92cf60c6fa94f.1320712291.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@elte.hu> --- Documentation/kernel-parameters.txt | 7 +++---- arch/x86/kernel/vsyscall_64.c | 2 +- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index a0c5c5f..ce7fc8b 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2750,11 +2750,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. functions are at fixed addresses, they make nice targets for exploits that can control RIP. 
- emulate Vsyscalls turn into traps and are emulated - reasonably safely. + emulate [default] Vsyscalls turn into traps and are + emulated reasonably safely. - native [default] Vsyscalls are native syscall - instructions. + native Vsyscalls are native syscall instructions. This is a little bit faster than trapping and makes a few dynamic recompilers work better than they would in emulation mode. diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c index 8084bec..b07ba93 100644 --- a/arch/x86/kernel/vsyscall_64.c +++ b/arch/x86/kernel/vsyscall_64.c @@ -57,7 +57,7 @@ DEFINE_VVAR(struct vsyscall_gtod_data, vsyscall_gtod_data) = .lock = __SEQLOCK_UNLOCKED(__vsyscall_gtod_data.lock), }; -static enum { EMULATE, NATIVE, NONE } vsyscall_mode = NATIVE; +static enum { EMULATE, NATIVE, NONE } vsyscall_mode = EMULATE; static int __init vsyscall_setup(char *str) { ^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCH 0/2] Fix and re-enable vsyscall=emulate 2011-11-08 0:33 ` [PATCH 0/2] Fix and re-enable vsyscall=emulate Andy Lutomirski 2011-11-08 0:33 ` [PATCH 1/2] x86-64: Set siginfo and context on vsyscall emulation faults Andy Lutomirski 2011-11-08 0:33 ` [PATCH 2/2] x86: Default to vsyscall=emulate Andy Lutomirski @ 2011-12-02 22:47 ` Andy Lutomirski 2011-12-05 11:18 ` H. Peter Anvin 2 siblings, 1 reply; 50+ messages in thread From: Andy Lutomirski @ 2011-12-02 22:47 UTC (permalink / raw) To: Linus Torvalds Cc: x86, Ingo Molnar, richard -rw- weinberger, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel, Andy Lutomirski On Mon, Nov 7, 2011 at 4:33 PM, Andy Lutomirski <luto@amacapital.net> wrote: > The really nice fix (wiring up access_ok failures to be able to raise > signals) won't be ready on time for 3.2, so let's try the simpler fix > for now. I spoke to hpa about this a couple days ago, and he pointed out a problem with making access_ok send signals. Userspace expects signals that come with full context information to be restartable, and many system calls are not restartable. read() and write() are the obvious examples: once they've processed the beginning of the buffer, unless they adjust their parameters, they can't safely be restarted. So without massive changes, I think allowing access_ok to raise a signal with full context is asking for trouble. I can still do the patch with two modes: signals without context via arch_prctl and signals with context via vsyscall emulation, but that's probably overkill for fixing this bug. I'd say just apply these patches as is (for 3.3). --Andy ^ permalink raw reply [flat|nested] 50+ messages in thread
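The restartability point in the message above can be illustrated from the caller's side. This is a minimal userspace sketch, with a hypothetical helper name write_all, of what "adjusting the parameters" means: after write() has consumed the beginning of the buffer, the call can only be resumed with an advanced pointer and a reduced count. Re-issuing it with the original arguments would write the first part twice.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write the whole buffer, resuming after
 * partial writes. Returns count on success, -1 on error (errno
 * is left as set by write()).
 */
ssize_t write_all(int fd, const void *buf, size_t count)
{
	const char *p = buf;
	size_t left = count;

	while (left > 0) {
		ssize_t n = write(fd, p, left);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* nothing consumed, safe to retry as is */
			return -1;
		}

		/* Part of the buffer is gone: adjust before retrying. */
		p += n;
		left -= (size_t)n;
	}

	return (ssize_t)count;
}
```

A signal handler that simply restarted the faulting system call from the saved register context would have none of this bookkeeping available, which is the concern raised in the message.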
* Re: [PATCH 0/2] Fix and re-enable vsyscall=emulate 2011-12-02 22:47 ` [PATCH 0/2] Fix and re-enable vsyscall=emulate Andy Lutomirski @ 2011-12-05 11:18 ` H. Peter Anvin 0 siblings, 0 replies; 50+ messages in thread From: H. Peter Anvin @ 2011-12-05 11:18 UTC (permalink / raw) To: Andy Lutomirski Cc: Linus Torvalds, x86, Ingo Molnar, richard -rw- weinberger, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel On 12/02/2011 02:47 PM, Andy Lutomirski wrote: > On Mon, Nov 7, 2011 at 4:33 PM, Andy Lutomirski <luto@amacapital.net> wrote: >> The really nice fix (wiring up access_ok failures to be able to raise >> signals) won't be ready on time for 3.2, so let's try the simpler fix >> for now. > > I spoke to hpa about this a couple days ago, and he pointed out a > problem with making access_ok send signals. Userspace expects signals > that come with full context information to be restartable, and many > system calls are not restartable. read() and write() are the obvious > examples: once they've processed the beginning of the buffer, unless > they adjust their parameters, they can't safely be restarted. So > without massive changes, I think allowing access_ok to raise a signal > with full context is asking for trouble. > > I can still do the patch with two modes: signals without context via > arch_prctl and signals with context via vsyscall emulation, but that's > probably overkill for fixing this bug. I'd say just apply these > patches as is (for 3.3). > It's somewhat questionable whether the "return -EFAULT and deliver SIGSEGV" semantic resolves the problem; obviously the signal handler isn't restartable, but returning from the signal handler will at least cause the application to see the EFAULT and not try to restart a system call in a way that is likely to cause massive failure.
If the handler is aware of what needs to be done then it can correct the situation and restart the system call -- but it would have to have detailed information about the state before the system call. I am also concerned about information leaks from the kernel. The existing kernel paths are not necessarily designed to be robust against giving out additional error information. This may be a theoretical concern, but there have been real security holes in the past from these kinds of changes. -hpa ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC] fixing the UML failure root cause 2011-10-11 17:24 ` [RFC] fixing the UML failure root cause Andrew Lutomirski 2011-10-13 6:19 ` Linus Torvalds @ 2011-10-14 19:53 ` richard -rw- weinberger 2011-10-14 20:17 ` Andrew Lutomirski 1 sibling, 1 reply; 50+ messages in thread From: richard -rw- weinberger @ 2011-10-14 19:53 UTC (permalink / raw) To: Andrew Lutomirski Cc: Ingo Molnar, Linus Torvalds, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Tue, Oct 11, 2011 at 7:24 PM, Andrew Lutomirski <luto@mit.edu> wrote: > What do you think of this approach? If it seems good, I'll finish the > patch and submit it. > > With this patch applied, UML appears to work, but it fills the log with > exploit attempt warnings. Any ideas on what to do about that? > I can confirm that this patch works. And I really like vsyscall=emulate because with that UML can trap vsyscalls. :-) -- Thanks, //richard ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC] fixing the UML failure root cause 2011-10-14 19:53 ` [RFC] fixing the UML failure root cause richard -rw- weinberger @ 2011-10-14 20:17 ` Andrew Lutomirski 2011-10-14 20:23 ` richard -rw- weinberger 2011-10-14 22:28 ` richard -rw- weinberger 0 siblings, 2 replies; 50+ messages in thread From: Andrew Lutomirski @ 2011-10-14 20:17 UTC (permalink / raw) To: richard -rw- weinberger Cc: Ingo Molnar, Linus Torvalds, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Fri, Oct 14, 2011 at 12:53 PM, richard -rw- weinberger <richard.weinberger@gmail.com> wrote: > On Tue, Oct 11, 2011 at 7:24 PM, Andrew Lutomirski <luto@mit.edu> wrote: >> What do you think of this approach? If it seems good, I'll finish the >> patch and submit it. >> >> With this patch applied, UML appears to work, but it fills the log with >> exploit attempt warnings. Any ideas on what to do about that? >> > > I can confirm that this patch works. > And I really like vsyscall=emulate because with that UML can trap vsyscalls. :-) Are you sure you don't mean vsyscall=native? I suspect that UML can't actually trap vsyscalls in emulate mode right now, although that ought to be fixable. --Andy ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC] fixing the UML failure root cause 2011-10-14 20:17 ` Andrew Lutomirski @ 2011-10-14 20:23 ` richard -rw- weinberger 2011-10-14 20:31 ` Andrew Lutomirski 2011-10-14 22:28 ` richard -rw- weinberger 1 sibling, 1 reply; 50+ messages in thread From: richard -rw- weinberger @ 2011-10-14 20:23 UTC (permalink / raw) To: Andrew Lutomirski Cc: Ingo Molnar, Linus Torvalds, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Fri, Oct 14, 2011 at 10:17 PM, Andrew Lutomirski <luto@mit.edu> wrote: > On Fri, Oct 14, 2011 at 12:53 PM, richard -rw- weinberger > <richard.weinberger@gmail.com> wrote: >> On Tue, Oct 11, 2011 at 7:24 PM, Andrew Lutomirski <luto@mit.edu> wrote: >>> What do you think of this approach? If it seems good, I'll finish the >>> patch and submit it. >>> >>> With this patch applied, UML appears to work, but it fills the log with >>> exploit attempt warnings. Any ideas on what to do about that? >>> >> >> I can confirm that this patch works. >> And I really like vsyscall=emulate because with that UML can trap vsyscalls. :-) > > Are you sure you don't mean vsyscall=native? I suspect that UML can't > actually trap vsyscalls in emulate mode right now, although that ought > to be fixable. > Doesn't vsyscall_emu_64.S transform any vsyscall into a real syscall? So UML can trap it. -- Thanks, //richard ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC] fixing the UML failure root cause 2011-10-14 20:23 ` richard -rw- weinberger @ 2011-10-14 20:31 ` Andrew Lutomirski 2011-10-14 20:39 ` richard -rw- weinberger 0 siblings, 1 reply; 50+ messages in thread From: Andrew Lutomirski @ 2011-10-14 20:31 UTC (permalink / raw) To: richard -rw- weinberger Cc: Ingo Molnar, Linus Torvalds, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Fri, Oct 14, 2011 at 1:23 PM, richard -rw- weinberger <richard.weinberger@gmail.com> wrote: > On Fri, Oct 14, 2011 at 10:17 PM, Andrew Lutomirski <luto@mit.edu> wrote: >> On Fri, Oct 14, 2011 at 12:53 PM, richard -rw- weinberger >> <richard.weinberger@gmail.com> wrote: >>> On Tue, Oct 11, 2011 at 7:24 PM, Andrew Lutomirski <luto@mit.edu> wrote: >>>> What do you think of this approach? If it seems good, I'll finish the >>>> patch and submit it. >>>> >>>> With this patch applied, UML appears to work, but it fills the log with >>>> exploit attempt warnings. Any ideas on what to do about that? >>>> >>> >>> I can confirm that this patch works. >>> And I really like vsyscall=emulate because with that UML can trap vsyscalls. :-) >> >> Are you sure you don't mean vsyscall=native? I suspect that UML can't >> actually trap vsyscalls in emulate mode right now, although that ought >> to be fixable. >> > > Doesn't vsyscall_emu_64.S transform any vsyscall into a real syscall? > So UML can trap it. Only if that code actually executes. In vsyscall=emulate mode, the page is not executable and a trap is taken instead. It's not entirely clear what the right thing to do is wrt ptrace users. --Andy ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC] fixing the UML failure root cause 2011-10-14 20:31 ` Andrew Lutomirski @ 2011-10-14 20:39 ` richard -rw- weinberger 0 siblings, 0 replies; 50+ messages in thread From: richard -rw- weinberger @ 2011-10-14 20:39 UTC (permalink / raw) To: Andrew Lutomirski Cc: Ingo Molnar, Linus Torvalds, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Fri, Oct 14, 2011 at 10:31 PM, Andrew Lutomirski <luto@mit.edu> wrote: > On Fri, Oct 14, 2011 at 1:23 PM, richard -rw- weinberger > <richard.weinberger@gmail.com> wrote: >> On Fri, Oct 14, 2011 at 10:17 PM, Andrew Lutomirski <luto@mit.edu> wrote: >>> On Fri, Oct 14, 2011 at 12:53 PM, richard -rw- weinberger >>> <richard.weinberger@gmail.com> wrote: >>>> On Tue, Oct 11, 2011 at 7:24 PM, Andrew Lutomirski <luto@mit.edu> wrote: >>>>> What do you think of this approach? If it seems good, I'll finish the >>>>> patch and submit it. >>>>> >>>>> With this patch applied, UML appears to work, but it fills the log with >>>>> exploit attempt warnings. Any ideas on what to do about that? >>>>> >>>> >>>> I can confirm that this patch works. >>>> And I really like vsyscall=emulate because with that UML can trap vsyscalls. :-) >>> >>> Are you sure you don't mean vsyscall=native? I suspect that UML can't >>> actually trap vsyscalls in emulate mode right now, although that ought >>> to be fixable. >>> >> >> Doesn't vsyscall_emu_64.S transform any vsyscall into a real syscall? >> So UML can trap it. > > Only if that code actually executes. In vsyscall=emulate mode, the > page is not executable and a trap is taken instead. It's not entirely > clear what the right thing to do is wrt ptrace users. Okay. I did some tests: in vsyscall=emulate mode a statically linked program always reports the correct time. On kernels older than 3.1 this was not the case; there the same program always reports the host's time... -- Thanks, //richard ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC] fixing the UML failure root cause 2011-10-14 20:17 ` Andrew Lutomirski 2011-10-14 20:23 ` richard -rw- weinberger @ 2011-10-14 22:28 ` richard -rw- weinberger 2011-10-15 16:57 ` Ingo Molnar 1 sibling, 1 reply; 50+ messages in thread From: richard -rw- weinberger @ 2011-10-14 22:28 UTC (permalink / raw) To: Andrew Lutomirski Cc: Ingo Molnar, Linus Torvalds, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Fri, Oct 14, 2011 at 10:17 PM, Andrew Lutomirski <luto@mit.edu> wrote: > On Fri, Oct 14, 2011 at 12:53 PM, richard -rw- weinberger > <richard.weinberger@gmail.com> wrote: >> On Tue, Oct 11, 2011 at 7:24 PM, Andrew Lutomirski <luto@mit.edu> wrote: >>> What do you think of this approach? If it seems good, I'll finish the >>> patch and submit it. >>> >>> With this patch applied, UML appears to work, but it fills the log with >>> exploit attempt warnings. Any ideas on what to do about that? >>> >> >> I can confirm that this patch works. >> And I really like vsyscall=emulate because with that UML can trap vsyscalls. :-) > > Are you sure you don't mean vsyscall=native? I suspect that UML can't > actually trap vsyscalls in emulate mode right now, although that ought > to be fixable. > §/%)"&!, you are so right! I missed that vsyscall=native is the default setting now. Sorry for the confusion. -- Thanks, //richard ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC] fixing the UML failure root cause 2011-10-14 22:28 ` richard -rw- weinberger @ 2011-10-15 16:57 ` Ingo Molnar 0 siblings, 0 replies; 50+ messages in thread From: Ingo Molnar @ 2011-10-15 16:57 UTC (permalink / raw) To: richard -rw- weinberger Cc: Andrew Lutomirski, Linus Torvalds, Adrian Bunk, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel * richard -rw- weinberger <richard.weinberger@gmail.com> wrote: > On Fri, Oct 14, 2011 at 10:17 PM, Andrew Lutomirski <luto@mit.edu> wrote: > > On Fri, Oct 14, 2011 at 12:53 PM, richard -rw- weinberger > > <richard.weinberger@gmail.com> wrote: > >> On Tue, Oct 11, 2011 at 7:24 PM, Andrew Lutomirski <luto@mit.edu> wrote: > >>> What do you think of this approach? If it seems good, I'll finish the > >>> patch and submit it. > >>> > >>> With this patch applied, UML appears to work, but it fills the log with > >>> exploit attempt warnings. Any ideas on what to do about that? > >>> > >> > >> I can confirm that this patch works. > >> And I really like vsyscall=emulate because with that UML can trap vsyscalls. :-) > > > > Are you sure you don't mean vsyscall=native? I suspect that UML can't > > actually trap vsyscalls in emulate mode right now, although that ought > > to be fixable. > > > > §/%)"&!, you are so right! > I missed that vsyscall=native is the default setting now. > Sorry for the confusion. Switch back to vsyscall=native was just a temporary ABI fix for v3.1 - we'd like to switch to vsyscall=emulate again ASAP (possibly in v3.2), once Andrew is done with the patch and everyone is happy with it. Thanks, Ingo ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-05 22:13 ` Andrew Lutomirski 2011-10-05 22:22 ` richard -rw- weinberger @ 2011-10-05 22:24 ` Adrian Bunk 1 sibling, 0 replies; 50+ messages in thread From: Adrian Bunk @ 2011-10-05 22:24 UTC (permalink / raw) To: Andrew Lutomirski Cc: H. Peter Anvin, Linus Torvalds, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Wed, Oct 05, 2011 at 03:13:51PM -0700, Andrew Lutomirski wrote: > On Mon, Oct 3, 2011 at 10:33 AM, Adrian Bunk <bunk@stusta.de> wrote: > > On Mon, Oct 03, 2011 at 06:04:53AM -0700, Andrew Lutomirski wrote: > >> On Mon, Oct 3, 2011 at 2:08 AM, Adrian Bunk <bunk@stusta.de> wrote: > >> > After upgrading a kernel the existing userspace should just work > >> > (assuming it did work before ;-) ), but when I upgraded my kernel > >> > from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly. > >> > > >> > dmesg said: > >> > linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790 > >> > linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790 > >> > > >> > Looking throught the changelog I ended up at commit 3ae36655 > >> > ("x86-64: Rework vsyscall emulation and add vsyscall= parameter"). > >> > > >> > Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to > >> > vsyscall=native. > >> > > >> > That sounds reasonable to me, and fixes the problem for me. > >> > >> At this point in the -rc cycle, this sounds fine. > >> > >> That being said, I'd like to fix it for real for 3.2. This particular > >> failure is suspicious -- the "vsyscall fault" message means that > >> sys_gettimeofday returned EFAULT, which means that the old (3.0 and > >> before) vgettimeofday should *also* have segfaulted. 
> > > > This 2.6.30.1 UML kernel binary from 2009 worked for me for all host > > kernels from 2.6.30 to 3.0, and with 3.1.0-rc8 and vsyscall=native > > it also seems to run nicely. > > > > Looking deeper into "a UML instance didn't come up properly", > > the problem is that it comes up in a strange (readonly) state. > > > > There are "Using makefile-style concurrent boot in runlevel S." > > and "Using makefile-style concurrent boot in runlevel 2." in the > > logs with a Debian userspace, but no output from the init scripts > > in these broken bootups (normal messages are in non-broken bootups). > > > > Perhaps the two the messages I see in dmesg on the host are from the > > processes running rcS and rc2 failing early? > > > > In a working startup with a Debian userspace, I'm getting during rcS > > Setting the system clock. > > Cannot access the Hardware Clock via any known method. > > Use the --debug option to see the details of our search for an access method. > > Unable to set System Clock to: Mon Oct 3 17:01:35 UTC 2011 ... (warning). > > > >> We do have a bit > >> of a bug in that the new code doesn't report si_addr properly, but > >> that sounds unlikely as a culprit. Did you try with the offending > >> commit reverted (i.e. fce8dc0)? I bet that it also fails there. > > > > fce8dc0 is "x86-64: Wire up getcpu syscall", is that really the one you > > want me to revert? > > > >> What's the .config for your UML binary? I'd like to see if I can > >> reproduce this. > > > > It's attached. > > > > I can't reproduce it. What distro is running inside the UML instance? Debian stable. > --Andy cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-03 9:08 [3.1 patch] x86: default to vsyscall=native Adrian Bunk 2011-10-03 13:04 ` Andrew Lutomirski @ 2011-10-03 13:19 ` richard -rw- weinberger 2011-10-03 17:46 ` Adrian Bunk 1 sibling, 1 reply; 50+ messages in thread From: richard -rw- weinberger @ 2011-10-03 13:19 UTC (permalink / raw) To: Adrian Bunk Cc: Andy Lutomirski, H. Peter Anvin, Linus Torvalds, Thomas Gleixner, Ingo Molnar, x86, linux-kernel Adrian, On Mon, Oct 3, 2011 at 11:08 AM, Adrian Bunk <bunk@stusta.de> wrote: > After upgrading a kernel the existing userspace should just work > (assuming it did work before ;-) ), but when I upgraded my kernel > from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly. > Are only old UML kernels like 2.6.30.1 affected? Anyway, it's time to upgrade my main machine to 3.1.0-rc8 to observe some new UML issues. ;-) -- Thanks, //richard ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-03 13:19 ` richard -rw- weinberger @ 2011-10-03 17:46 ` Adrian Bunk 0 siblings, 0 replies; 50+ messages in thread From: Adrian Bunk @ 2011-10-03 17:46 UTC (permalink / raw) To: richard -rw- weinberger Cc: Andy Lutomirski, H. Peter Anvin, Linus Torvalds, Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Mon, Oct 03, 2011 at 03:19:59PM +0200, richard -rw- weinberger wrote: > Adrian, > > On Mon, Oct 3, 2011 at 11:08 AM, Adrian Bunk <bunk@stusta.de> wrote: > > After upgrading a kernel the existing userspace should just work > > (assuming it did work before ;-) ), but when I upgraded my kernel > > from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly. > > Are only old UML kernels like 2.6.30.1 affected? I don't know, that's my only running UML instance. > Anyway, it's time to upgrade my main machine to 3.1.0-rc8 to observe > some new UML issues. ;-) > > -- > Thanks, > //richard cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 50+ messages in thread
* [3.1 patch] x86: default to vsyscall=native @ 2011-10-05 21:40 Adrian Bunk 2011-10-05 22:01 ` Thomas Gleixner 0 siblings, 1 reply; 50+ messages in thread From: Adrian Bunk @ 2011-10-05 21:40 UTC (permalink / raw) To: Andrew Lutomirski, H. Peter Anvin Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel, Andrew Morton After upgrading a kernel the existing userspace should just work (assuming it did work before ;-) ), but when I upgraded my kernel from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly. dmesg said: linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790 linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790 Looking through the changelog I ended up at commit 3ae36655 ("x86-64: Rework vsyscall emulation and add vsyscall= parameter"). Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to vsyscall=native. That sounds reasonable to me, and fixes the problem for me. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Andrew Lutomirski <luto@mit.edu> --- Documentation/kernel-parameters.txt | 7 ++++--- arch/x86/kernel/vsyscall_64.c | 2 +- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 854ed5ca..d6e6724 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2706,10 +2706,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted. functions are at fixed addresses, they make nice targets for exploits that can control RIP. - emulate [default] Vsyscalls turn into traps and are - emulated reasonably safely. + emulate Vsyscalls turn into traps and are emulated + reasonably safely. - native Vsyscalls are native syscall instructions. + native [default] Vsyscalls are native syscall + instructions. 
This is a little bit faster than trapping and makes a few dynamic recompilers work better than they would in emulation mode. diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c index 18ae83d..b56c65de 100644 --- a/arch/x86/kernel/vsyscall_64.c +++ b/arch/x86/kernel/vsyscall_64.c @@ -56,7 +56,7 @@ DEFINE_VVAR(struct vsyscall_gtod_data, vsyscall_gtod_data) = .lock = __SEQLOCK_UNLOCKED(__vsyscall_gtod_data.lock), }; -static enum { EMULATE, NATIVE, NONE } vsyscall_mode = EMULATE; +static enum { EMULATE, NATIVE, NONE } vsyscall_mode = NATIVE; static int __init vsyscall_setup(char *str) { -- 1.7.6.3 ^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-05 21:40 Adrian Bunk @ 2011-10-05 22:01 ` Thomas Gleixner 2011-10-09 13:45 ` Adrian Bunk 0 siblings, 1 reply; 50+ messages in thread From: Thomas Gleixner @ 2011-10-05 22:01 UTC (permalink / raw) To: Adrian Bunk Cc: Andrew Lutomirski, H. Peter Anvin, Ingo Molnar, x86, LKML, Andrew Morton, Linus Torvalds, Arjan van de Ven On Thu, 6 Oct 2011, Adrian Bunk wrote: > After upgrading a kernel the existing userspace should just work > (assuming it did work before ;-) ), but when I upgraded my kernel > from 3.0.4 to 3.1.0-rc8 a UML instance didn't come up properly. > > dmesg said: > linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790 > linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790 > > Looking throught the changelog I ended up at commit 3ae36655 > ("x86-64: Rework vsyscall emulation and add vsyscall= parameter"). > > Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to > vsyscall=native. > > That sounds reasonable to me, and fixes the problem for me. NAK. We have way too long listened to people who insisted that we keep all known security holes open by default for the sake of backwards compatibility. Default wants to be restricted and not the other way round. Forcing people to loosen restrictions makes them aware of the problem. Not doing so keeps them in the illusion that stuff is just safe to use. We might need better dmesg output, e.g. printk_once("you might run something which requires vsyscall=native, but be aware that you are opening a security hole. See Documentation/....") That's fine, but making the defaults insecure is just ass backwards. Thanks, tglx ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [3.1 patch] x86: default to vsyscall=native 2011-10-05 22:01 ` Thomas Gleixner @ 2011-10-09 13:45 ` Adrian Bunk 0 siblings, 0 replies; 50+ messages in thread From: Adrian Bunk @ 2011-10-09 13:45 UTC (permalink / raw) To: Thomas Gleixner Cc: Andrew Lutomirski, H. Peter Anvin, Ingo Molnar, x86, LKML, Andrew Morton, Linus Torvalds, Arjan van de Ven On Thu, Oct 06, 2011 at 12:01:44AM +0200, Thomas Gleixner wrote: >... > We might need better dmesg output, e.g. > > printk_once("you might run something which requires > vsyscall=native, but be aware that you are > opening a security hole. See Documentation/....") > > That's fine, but making the defaults insecure is just ass backwards. Better dmesg output is in any case a better idea, patch is coming. I stayed with warn_bad_vsyscall() instead of printk_once() for the following reasons: - _once is bad for something that might indicate exploit attempts, warn_bad_vsyscall() is already ratelimited - the name and pid of the process should be shown - the additional output of warn_bad_vsyscall() can help determine what caused it > Thanks, > > tglx cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 50+ messages in thread