linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] module: Overwrite st_size instead of st_info
@ 2018-11-19 16:25 Vincent Whitchurch
  2018-11-19 16:25 ` [PATCH 2/2] ARM: module: Fix function kallsyms on Thumb-2 Vincent Whitchurch
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Vincent Whitchurch @ 2018-11-19 16:25 UTC (permalink / raw)
  To: linux, jeyu
  Cc: dave.martin, linux-arm-kernel, linux-kernel, Vincent Whitchurch

st_info is currently overwritten after relocation and used to store the
elf_type().  However, we're going to need it fix kallsyms on ARM's
Thumb-2 kernels, so preserve st_info and overwrite the st_size field
instead.  st_size is neither used by the module core nor by any
architecture.

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
---
v4: Split out to separate patch.  Use st_size instead of st_other.
v1-v3: See PATCH 2/2

 kernel/module.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 49a405891587..3d86a38b580c 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2682,7 +2682,7 @@ static void add_kallsyms(struct module *mod, const struct load_info *info)
 
 	/* Set types up while we still have access to sections. */
 	for (i = 0; i < mod->kallsyms->num_symtab; i++)
-		mod->kallsyms->symtab[i].st_info
+		mod->kallsyms->symtab[i].st_size
 			= elf_type(&mod->kallsyms->symtab[i], info);
 
 	/* Now populate the cut down core kallsyms for after init. */
@@ -4061,7 +4061,7 @@ int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
 		kallsyms = rcu_dereference_sched(mod->kallsyms);
 		if (symnum < kallsyms->num_symtab) {
 			*value = kallsyms->symtab[symnum].st_value;
-			*type = kallsyms->symtab[symnum].st_info;
+			*type = kallsyms->symtab[symnum].st_size;
 			strlcpy(name, symname(kallsyms, symnum), KSYM_NAME_LEN);
 			strlcpy(module_name, mod->name, MODULE_NAME_LEN);
 			*exported = is_exported(name, *value, mod);
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/2] ARM: module: Fix function kallsyms on Thumb-2
  2018-11-19 16:25 [PATCH 1/2] module: Overwrite st_size instead of st_info Vincent Whitchurch
@ 2018-11-19 16:25 ` Vincent Whitchurch
  2018-11-23 18:34   ` Dave Martin
  2018-11-22 12:01 ` [PATCH 1/2] module: Overwrite st_size instead of st_info Dave Martin
  2018-11-23 17:59 ` Dave Martin
  2 siblings, 1 reply; 11+ messages in thread
From: Vincent Whitchurch @ 2018-11-19 16:25 UTC (permalink / raw)
  To: linux, jeyu
  Cc: dave.martin, linux-arm-kernel, linux-kernel, Vincent Whitchurch

Thumb-2 functions have the lowest bit set in the symbol value in the
symtab.  When kallsyms are generated for the vmlinux, the kallsyms are
generated from the output of nm, and nm clears the lowest bit.

 $ arm-linux-gnueabihf-readelf -a vmlinux | grep show_interrupts
  95947: 8015dc89   686 FUNC    GLOBAL DEFAULT    2 show_interrupts
 $ arm-linux-gnueabihf-nm vmlinux | grep show_interrupts
 8015dc88 T show_interrupts
 $ cat /proc/kallsyms | grep show_interrupts
 8015dc88 T show_interrupts

However, for modules, the kallsyms uses the values in the symbol table
without modification, so for functions in modules, the lowest bit is set
in kallsyms.

 $ arm-linux-gnueabihf-readelf -a drivers/net/tun.ko | grep tun_get_socket
    333: 00002d4d    36 FUNC    GLOBAL DEFAULT    1 tun_get_socket
 $ arm-linux-gnueabihf-nm drivers/net/tun.ko | grep tun_get_socket
 00002d4c T tun_get_socket
 $ cat /proc/kallsyms | grep tun_get_socket
 7f802d4d t tun_get_socket      [tun]

Because of this, the symbol+offset of the crashing instruction shown in
oopses is incorrect when the crash is in a module.  For example, given a
tun_get_socket which starts like this,

 00002d4c <tun_get_socket>:
     2d4c:       6943            ldr     r3, [r0, #20]
     2d4e:       4a07            ldr     r2, [pc, #28]
     2d50:       4293            cmp     r3, r2

a crash when tun_get_socket is called with NULL results in:

 PC is at tun_xdp+0xa3/0xa4 [tun]
 pc : [<7f802d4c>]

As can be seen, the "PC is at" line reports the wrong symbol name, and
the symbol+offset will point to the wrong source line if it is passed to
gdb.

To solve this, add a way for archs to fixup the reading of these module
kallsyms values, and use that to clear the lowest bit for function
symbols on Thumb-2.

After the fix:

 # cat /proc/kallsyms | grep tun_get_socket
 7f802d4c t tun_get_socket       [tun]

 PC is at tun_get_socket+0x0/0x24 [tun]
 pc : [<7f802d4c>]

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
---
v4: Split out st_value overwrite change.  Add HAVE* macro to avoid function call.
v3: Do not overwrite st_value
v2: Fix build warning with !MODULES

 arch/arm/include/asm/module.h | 11 +++++++++++
 include/linux/module.h        |  7 +++++++
 kernel/module.c               | 25 ++++++++++++++-----------
 3 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/arch/arm/include/asm/module.h b/arch/arm/include/asm/module.h
index 9e81b7c498d8..e2ccec651e71 100644
--- a/arch/arm/include/asm/module.h
+++ b/arch/arm/include/asm/module.h
@@ -61,4 +61,15 @@ u32 get_module_plt(struct module *mod, unsigned long loc, Elf32_Addr val);
 	MODULE_ARCH_VERMAGIC_ARMTHUMB \
 	MODULE_ARCH_VERMAGIC_P2V
 
+#ifdef CONFIG_THUMB2_KERNEL
+#define HAVE_ARCH_MODULE_KALLSYMS_SYMBOL_VALUE
+static inline unsigned long module_kallsyms_symbol_value(Elf_Sym *sym)
+{
+	if (ELF_ST_TYPE(sym->st_info) == STT_FUNC)
+		return sym->st_value & ~1;
+
+	return sym->st_value;
+}
+#endif
+
 #endif /* _ASM_ARM_MODULE_H */
diff --git a/include/linux/module.h b/include/linux/module.h
index fce6b4335e36..cfc55f358800 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -486,6 +486,13 @@ struct module {
 #define MODULE_ARCH_INIT {}
 #endif
 
+#ifndef HAVE_ARCH_MODULE_KALLSYMS_SYMBOL_VALUE
+static inline unsigned long module_kallsyms_symbol_value(Elf_Sym *sym)
+{
+	return sym->st_value;
+}
+#endif
+
 extern struct mutex module_mutex;
 
 /* FIXME: It'd be nice to isolate modules during init, too, so they
diff --git a/kernel/module.c b/kernel/module.c
index 3d86a38b580c..a36d7915ed2b 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3934,6 +3934,9 @@ static const char *get_ksymbol(struct module *mod,
 	/* Scan for closest preceding symbol, and next symbol. (ELF
 	   starts real symbols at 1). */
 	for (i = 1; i < kallsyms->num_symtab; i++) {
+		unsigned long thisval = module_kallsyms_symbol_value(&kallsyms->symtab[i]);
+		unsigned long bestval = module_kallsyms_symbol_value(&kallsyms->symtab[best]);
+
 		if (kallsyms->symtab[i].st_shndx == SHN_UNDEF)
 			continue;
 
@@ -3943,21 +3946,21 @@ static const char *get_ksymbol(struct module *mod,
 		    || is_arm_mapping_symbol(symname(kallsyms, i)))
 			continue;
 
-		if (kallsyms->symtab[i].st_value <= addr
-		    && kallsyms->symtab[i].st_value > kallsyms->symtab[best].st_value)
+		if (thisval <= addr
+		    && thisval > bestval)
 			best = i;
-		if (kallsyms->symtab[i].st_value > addr
-		    && kallsyms->symtab[i].st_value < nextval)
-			nextval = kallsyms->symtab[i].st_value;
+		if (thisval > addr
+		    && thisval < nextval)
+			nextval = thisval;
 	}
 
 	if (!best)
 		return NULL;
 
 	if (size)
-		*size = nextval - kallsyms->symtab[best].st_value;
+		*size = nextval - module_kallsyms_symbol_value(&kallsyms->symtab[best]);
 	if (offset)
-		*offset = addr - kallsyms->symtab[best].st_value;
+		*offset = addr - module_kallsyms_symbol_value(&kallsyms->symtab[best]);
 	return symname(kallsyms, best);
 }
 
@@ -4060,7 +4063,7 @@ int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
 			continue;
 		kallsyms = rcu_dereference_sched(mod->kallsyms);
 		if (symnum < kallsyms->num_symtab) {
-			*value = kallsyms->symtab[symnum].st_value;
+			*value = module_kallsyms_symbol_value(&kallsyms->symtab[symnum]);
 			*type = kallsyms->symtab[symnum].st_size;
 			strlcpy(name, symname(kallsyms, symnum), KSYM_NAME_LEN);
 			strlcpy(module_name, mod->name, MODULE_NAME_LEN);
@@ -4082,7 +4085,7 @@ static unsigned long mod_find_symname(struct module *mod, const char *name)
 	for (i = 0; i < kallsyms->num_symtab; i++)
 		if (strcmp(name, symname(kallsyms, i)) == 0 &&
 		    kallsyms->symtab[i].st_shndx != SHN_UNDEF)
-			return kallsyms->symtab[i].st_value;
+			return module_kallsyms_symbol_value(&kallsyms->symtab[i]);
 	return 0;
 }
 
@@ -4131,8 +4134,8 @@ int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
 			if (kallsyms->symtab[i].st_shndx == SHN_UNDEF)
 				continue;
 
-			ret = fn(data, symname(kallsyms, i),
-				 mod, kallsyms->symtab[i].st_value);
+			ret = fn(data, symname(kallsyms, i), mod,
+				 module_kallsyms_symbol_value(&kallsyms->symtab[i]));
 			if (ret != 0)
 				return ret;
 		}
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] module: Overwrite st_size instead of st_info
  2018-11-19 16:25 [PATCH 1/2] module: Overwrite st_size instead of st_info Vincent Whitchurch
  2018-11-19 16:25 ` [PATCH 2/2] ARM: module: Fix function kallsyms on Thumb-2 Vincent Whitchurch
@ 2018-11-22 12:01 ` Dave Martin
  2018-11-22 12:24   ` Vincent Whitchurch
  2018-11-23 17:59 ` Dave Martin
  2 siblings, 1 reply; 11+ messages in thread
From: Dave Martin @ 2018-11-22 12:01 UTC (permalink / raw)
  To: Vincent Whitchurch
  Cc: linux, jeyu, Vincent Whitchurch, linux-arm-kernel, linux-kernel

On Mon, Nov 19, 2018 at 05:25:12PM +0100, Vincent Whitchurch wrote:
> st_info is currently overwritten after relocation and used to store the
> elf_type().  However, we're going to need it fix kallsyms on ARM's
> Thumb-2 kernels, so preserve st_info and overwrite the st_size field
> instead.  st_size is neither used by the module core nor by any
> architecture.
> 
> Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
> ---
> v4: Split out to separate patch.  Use st_size instead of st_other.
> v1-v3: See PATCH 2/2
> 
>  kernel/module.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/module.c b/kernel/module.c
> index 49a405891587..3d86a38b580c 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -2682,7 +2682,7 @@ static void add_kallsyms(struct module *mod, const struct load_info *info)
>  
>  	/* Set types up while we still have access to sections. */
>  	for (i = 0; i < mod->kallsyms->num_symtab; i++)
> -		mod->kallsyms->symtab[i].st_info
> +		mod->kallsyms->symtab[i].st_size
>  			= elf_type(&mod->kallsyms->symtab[i], info);
>  
>  	/* Now populate the cut down core kallsyms for after init. */
> @@ -4061,7 +4061,7 @@ int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
>  		kallsyms = rcu_dereference_sched(mod->kallsyms);
>  		if (symnum < kallsyms->num_symtab) {
>  			*value = kallsyms->symtab[symnum].st_value;
> -			*type = kallsyms->symtab[symnum].st_info;
> +			*type = kallsyms->symtab[symnum].st_size;
>  			strlcpy(name, symname(kallsyms, symnum), KSYM_NAME_LEN);
>  			strlcpy(module_name, mod->name, MODULE_NAME_LEN);
>  			*exported = is_exported(name, *value, mod);

This is fine if st_size is really unused, but how sure are you of that?

grepping for st_size throws up some hits that appear ELF-related, but
I've not investigated them in detail.

(The fact that struct stat has an identically named field throws up
a load of false positives too.)

Cheers
---Dave

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] module: Overwrite st_size instead of st_info
  2018-11-22 12:01 ` [PATCH 1/2] module: Overwrite st_size instead of st_info Dave Martin
@ 2018-11-22 12:24   ` Vincent Whitchurch
  2018-11-22 16:28     ` Jessica Yu
  0 siblings, 1 reply; 11+ messages in thread
From: Vincent Whitchurch @ 2018-11-22 12:24 UTC (permalink / raw)
  To: Dave Martin; +Cc: linux, jeyu, linux-arm-kernel, linux-kernel

On Thu, Nov 22, 2018 at 12:01:54PM +0000, Dave Martin wrote:
> On Mon, Nov 19, 2018 at 05:25:12PM +0100, Vincent Whitchurch wrote:
> > st_info is currently overwritten after relocation and used to store the
> > elf_type().  However, we're going to need it fix kallsyms on ARM's
> > Thumb-2 kernels, so preserve st_info and overwrite the st_size field
> > instead.  st_size is neither used by the module core nor by any
> > architecture.
> > 
> > Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
> > ---
> > v4: Split out to separate patch.  Use st_size instead of st_other.
> > v1-v3: See PATCH 2/2
> > 
> >  kernel/module.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/module.c b/kernel/module.c
> > index 49a405891587..3d86a38b580c 100644
> > --- a/kernel/module.c
> > +++ b/kernel/module.c
> > @@ -2682,7 +2682,7 @@ static void add_kallsyms(struct module *mod, const struct load_info *info)
> >  
> >  	/* Set types up while we still have access to sections. */
> >  	for (i = 0; i < mod->kallsyms->num_symtab; i++)
> > -		mod->kallsyms->symtab[i].st_info
> > +		mod->kallsyms->symtab[i].st_size
> >  			= elf_type(&mod->kallsyms->symtab[i], info);
> >  
> >  	/* Now populate the cut down core kallsyms for after init. */
> > @@ -4061,7 +4061,7 @@ int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
> >  		kallsyms = rcu_dereference_sched(mod->kallsyms);
> >  		if (symnum < kallsyms->num_symtab) {
> >  			*value = kallsyms->symtab[symnum].st_value;
> > -			*type = kallsyms->symtab[symnum].st_info;
> > +			*type = kallsyms->symtab[symnum].st_size;
> >  			strlcpy(name, symname(kallsyms, symnum), KSYM_NAME_LEN);
> >  			strlcpy(module_name, mod->name, MODULE_NAME_LEN);
> >  			*exported = is_exported(name, *value, mod);
> 
> This is fine if st_size is really unused, but how sure are you of that?
> 
> grepping for st_size throws up some hits that appear ELF-related, but
> I've not investigated them in detail.
> 
> (The fact that struct stat has an identically named field throws up
> a load of false positives too.)

$ git describe --tags
v4.20-rc3-93-g92b419289cee
 
$ rg -m1 '[\.>]st_size'  --iglob '!**/tools/**' --iglob '!**/vdso*' --iglob '!**/scripts/**' --iglob '!**/usr/**' --iglob '!**/samples/**' | cat
| kernel/kexec_file.c:	if (sym->st_size != size) {

Symbols in kexec kernel.

| fs/stat.c:	tmp.st_size = stat->size;
| Documentation/networking/tls.txt:  sendfile(sock, file, &offset, stat.st_size);
| net/9p/client.c:		ret->st_rdev, ret->st_size, ret->st_blksize,
| net/9p/protocol.c:					&stbuf->st_rdev, &stbuf->st_size,
| fs/9p/vfs_inode_dotl.c:		i_size_write(inode, stat->st_size);
| fs/hostfs/hostfs_user.c:	p->size = buf->st_size;
| arch/powerpc/boot/mktree.c:	nblks = (st.st_size + IMGBLK) / IMGBLK;
| arch/alpha/kernel/osf_sys.c:	tmp.st_size	= lstat->size;
| arch/x86/ia32/sys_ia32.c:	    __put_user(stat->size, &ubuf->st_size) ||

Not Elf_Sym.

| arch/x86/kernel/machine_kexec_64.c:			 sym->st_size);

Symbols in kexec kernel.

| arch/sparc/boot/piggyback.c:	st4(buffer + 12, s.st_size);
| arch/sparc/kernel/sys_sparc32.c:	err |= put_user(stat->size, &statbuf->st_size);
| arch/um/os-Linux/file.c:		.ust_size    = src->st_size,    /* total size, in bytes */
| arch/um/os-Linux/start_up.c:	size = (buf.st_size + UM_KERN_PAGE_SIZE) & ~(UM_KERN_PAGE_SIZE - 1);
| arch/s390/kernel/compat_linux.c:	tmp.st_size = stat->size;
| arch/arm/kernel/sys_oabi-compat.c:	tmp.st_size = stat->size;
| arch/mips/boot/compressed/calc_vmlinuz_load_addr.c:	vmlinux_size = (uint64_t)sb.st_size;
| drivers/net/ethernet/marvell/sky2.c:		hw->st_idx = RING_NEXT(hw->st_idx, hw->st_size);

Not Elf_Sym.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] module: Overwrite st_size instead of st_info
  2018-11-22 12:24   ` Vincent Whitchurch
@ 2018-11-22 16:28     ` Jessica Yu
  2018-11-22 17:40       ` Ard Biesheuvel
  2018-11-23 12:57       ` Miroslav Benes
  0 siblings, 2 replies; 11+ messages in thread
From: Jessica Yu @ 2018-11-22 16:28 UTC (permalink / raw)
  To: Vincent Whitchurch
  Cc: Miroslav Benes, Dave Martin, linux, linux-arm-kernel, linux-kernel

+++ Vincent Whitchurch [22/11/18 13:24 +0100]:
>On Thu, Nov 22, 2018 at 12:01:54PM +0000, Dave Martin wrote:
>> On Mon, Nov 19, 2018 at 05:25:12PM +0100, Vincent Whitchurch wrote:
>> > st_info is currently overwritten after relocation and used to store the
>> > elf_type().  However, we're going to need it fix kallsyms on ARM's
>> > Thumb-2 kernels, so preserve st_info and overwrite the st_size field
>> > instead.  st_size is neither used by the module core nor by any
>> > architecture.
>> >
>> > Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
>> > ---
>> > v4: Split out to separate patch.  Use st_size instead of st_other.
>> > v1-v3: See PATCH 2/2
>> >
>> >  kernel/module.c | 4 ++--
>> >  1 file changed, 2 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/kernel/module.c b/kernel/module.c
>> > index 49a405891587..3d86a38b580c 100644
>> > --- a/kernel/module.c
>> > +++ b/kernel/module.c
>> > @@ -2682,7 +2682,7 @@ static void add_kallsyms(struct module *mod, const struct load_info *info)
>> >
>> >  	/* Set types up while we still have access to sections. */
>> >  	for (i = 0; i < mod->kallsyms->num_symtab; i++)
>> > -		mod->kallsyms->symtab[i].st_info
>> > +		mod->kallsyms->symtab[i].st_size
>> >  			= elf_type(&mod->kallsyms->symtab[i], info);
>> >
>> >  	/* Now populate the cut down core kallsyms for after init. */
>> > @@ -4061,7 +4061,7 @@ int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
>> >  		kallsyms = rcu_dereference_sched(mod->kallsyms);
>> >  		if (symnum < kallsyms->num_symtab) {
>> >  			*value = kallsyms->symtab[symnum].st_value;
>> > -			*type = kallsyms->symtab[symnum].st_info;
>> > +			*type = kallsyms->symtab[symnum].st_size;
>> >  			strlcpy(name, symname(kallsyms, symnum), KSYM_NAME_LEN);
>> >  			strlcpy(module_name, mod->name, MODULE_NAME_LEN);
>> >  			*exported = is_exported(name, *value, mod);
>>
>> This is fine if st_size is really unused, but how sure are you of that?
>>
>> grepping for st_size throws up some hits that appear ELF-related, but
>> I've not investigated them in detail.
>>
>> (The fact that struct stat has an identically named field throws up
>> a load of false positives too.)
>
>$ git describe --tags
>v4.20-rc3-93-g92b419289cee
>
>$ rg -m1 '[\.>]st_size'  --iglob '!**/tools/**' --iglob '!**/vdso*' --iglob '!**/scripts/**' --iglob '!**/usr/**' --iglob '!**/samples/**' | cat
>| kernel/kexec_file.c:	if (sym->st_size != size) {
>
>Symbols in kexec kernel.
>
>| fs/stat.c:	tmp.st_size = stat->size;
>| Documentation/networking/tls.txt:  sendfile(sock, file, &offset, stat.st_size);
>| net/9p/client.c:		ret->st_rdev, ret->st_size, ret->st_blksize,
>| net/9p/protocol.c:					&stbuf->st_rdev, &stbuf->st_size,
>| fs/9p/vfs_inode_dotl.c:		i_size_write(inode, stat->st_size);
>| fs/hostfs/hostfs_user.c:	p->size = buf->st_size;
>| arch/powerpc/boot/mktree.c:	nblks = (st.st_size + IMGBLK) / IMGBLK;
>| arch/alpha/kernel/osf_sys.c:	tmp.st_size	= lstat->size;
>| arch/x86/ia32/sys_ia32.c:	    __put_user(stat->size, &ubuf->st_size) ||
>
>Not Elf_Sym.
>
>| arch/x86/kernel/machine_kexec_64.c:			 sym->st_size);
>
>Symbols in kexec kernel.
>
>| arch/sparc/boot/piggyback.c:	st4(buffer + 12, s.st_size);
>| arch/sparc/kernel/sys_sparc32.c:	err |= put_user(stat->size, &statbuf->st_size);
>| arch/um/os-Linux/file.c:		.ust_size    = src->st_size,    /* total size, in bytes */
>| arch/um/os-Linux/start_up.c:	size = (buf.st_size + UM_KERN_PAGE_SIZE) & ~(UM_KERN_PAGE_SIZE - 1);
>| arch/s390/kernel/compat_linux.c:	tmp.st_size = stat->size;
>| arch/arm/kernel/sys_oabi-compat.c:	tmp.st_size = stat->size;
>| arch/mips/boot/compressed/calc_vmlinuz_load_addr.c:	vmlinux_size = (uint64_t)sb.st_size;
>| drivers/net/ethernet/marvell/sky2.c:		hw->st_idx = RING_NEXT(hw->st_idx, hw->st_size);
>
>Not Elf_Sym.

[ added Miroslav to CC, just in case he would like to check :) ]

I have just double checked as well, and am fairly certain that the
Elf_Sym st_size field is not used to apply module relocations in any
arches, and it is not used in the core module loader nor in the module
kallsyms code. We'd like to avoid overwriting st_info in any case, to
fix kallsyms on Thumb-2 and also so that livepatch won't run into any
issues with delayed relocations, should livepatch support ever expand
to arches (e.g., arm) that rely on st_info for module relocations.

Thanks,

Jessica

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] module: Overwrite st_size instead of st_info
  2018-11-22 16:28     ` Jessica Yu
@ 2018-11-22 17:40       ` Ard Biesheuvel
  2018-11-22 17:49         ` Russell King - ARM Linux
  2018-11-23 12:57       ` Miroslav Benes
  1 sibling, 1 reply; 11+ messages in thread
From: Ard Biesheuvel @ 2018-11-22 17:40 UTC (permalink / raw)
  To: Jessica Yu
  Cc: vincent.whitchurch, Linux Kernel Mailing List, Miroslav Benes,
	Dave Martin, linux-arm-kernel, Russell King

On Thu, 22 Nov 2018 at 17:29, Jessica Yu <jeyu@kernel.org> wrote:
>
> +++ Vincent Whitchurch [22/11/18 13:24 +0100]:
> >On Thu, Nov 22, 2018 at 12:01:54PM +0000, Dave Martin wrote:
> >> On Mon, Nov 19, 2018 at 05:25:12PM +0100, Vincent Whitchurch wrote:
> >> > st_info is currently overwritten after relocation and used to store the
> >> > elf_type().  However, we're going to need it fix kallsyms on ARM's
> >> > Thumb-2 kernels, so preserve st_info and overwrite the st_size field
> >> > instead.  st_size is neither used by the module core nor by any
> >> > architecture.
> >> >
> >> > Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
> >> > ---
> >> > v4: Split out to separate patch.  Use st_size instead of st_other.
> >> > v1-v3: See PATCH 2/2
> >> >
> >> >  kernel/module.c | 4 ++--
> >> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/kernel/module.c b/kernel/module.c
> >> > index 49a405891587..3d86a38b580c 100644
> >> > --- a/kernel/module.c
> >> > +++ b/kernel/module.c
> >> > @@ -2682,7 +2682,7 @@ static void add_kallsyms(struct module *mod, const struct load_info *info)
> >> >
> >> >    /* Set types up while we still have access to sections. */
> >> >    for (i = 0; i < mod->kallsyms->num_symtab; i++)
> >> > -          mod->kallsyms->symtab[i].st_info
> >> > +          mod->kallsyms->symtab[i].st_size
> >> >                    = elf_type(&mod->kallsyms->symtab[i], info);
> >> >
> >> >    /* Now populate the cut down core kallsyms for after init. */
> >> > @@ -4061,7 +4061,7 @@ int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
> >> >            kallsyms = rcu_dereference_sched(mod->kallsyms);
> >> >            if (symnum < kallsyms->num_symtab) {
> >> >                    *value = kallsyms->symtab[symnum].st_value;
> >> > -                  *type = kallsyms->symtab[symnum].st_info;
> >> > +                  *type = kallsyms->symtab[symnum].st_size;
> >> >                    strlcpy(name, symname(kallsyms, symnum), KSYM_NAME_LEN);
> >> >                    strlcpy(module_name, mod->name, MODULE_NAME_LEN);
> >> >                    *exported = is_exported(name, *value, mod);
> >>
> >> This is fine if st_size is really unused, but how sure are you of that?
> >>
> >> grepping for st_size throws up some hits that appear ELF-related, but
> >> I've not investigated them in detail.
> >>
> >> (The fact that struct stat has an identically named field throws up
> >> a load of false positives too.)
> >
> >$ git describe --tags
> >v4.20-rc3-93-g92b419289cee
> >
> >$ rg -m1 '[\.>]st_size'  --iglob '!**/tools/**' --iglob '!**/vdso*' --iglob '!**/scripts/**' --iglob '!**/usr/**' --iglob '!**/samples/**' | cat
> >| kernel/kexec_file.c: if (sym->st_size != size) {
> >
> >Symbols in kexec kernel.
> >
> >| fs/stat.c:   tmp.st_size = stat->size;
> >| Documentation/networking/tls.txt:  sendfile(sock, file, &offset, stat.st_size);
> >| net/9p/client.c:             ret->st_rdev, ret->st_size, ret->st_blksize,
> >| net/9p/protocol.c:                                   &stbuf->st_rdev, &stbuf->st_size,
> >| fs/9p/vfs_inode_dotl.c:              i_size_write(inode, stat->st_size);
> >| fs/hostfs/hostfs_user.c:     p->size = buf->st_size;
> >| arch/powerpc/boot/mktree.c:  nblks = (st.st_size + IMGBLK) / IMGBLK;
> >| arch/alpha/kernel/osf_sys.c: tmp.st_size     = lstat->size;
> >| arch/x86/ia32/sys_ia32.c:        __put_user(stat->size, &ubuf->st_size) ||
> >
> >Not Elf_Sym.
> >
> >| arch/x86/kernel/machine_kexec_64.c:                   sym->st_size);
> >
> >Symbols in kexec kernel.
> >
> >| arch/sparc/boot/piggyback.c: st4(buffer + 12, s.st_size);
> >| arch/sparc/kernel/sys_sparc32.c:     err |= put_user(stat->size, &statbuf->st_size);
> >| arch/um/os-Linux/file.c:             .ust_size    = src->st_size,    /* total size, in bytes */
> >| arch/um/os-Linux/start_up.c: size = (buf.st_size + UM_KERN_PAGE_SIZE) & ~(UM_KERN_PAGE_SIZE - 1);
> >| arch/s390/kernel/compat_linux.c:     tmp.st_size = stat->size;
> >| arch/arm/kernel/sys_oabi-compat.c:   tmp.st_size = stat->size;
> >| arch/mips/boot/compressed/calc_vmlinuz_load_addr.c:  vmlinux_size = (uint64_t)sb.st_size;
> >| drivers/net/ethernet/marvell/sky2.c:         hw->st_idx = RING_NEXT(hw->st_idx, hw->st_size);
> >
> >Not Elf_Sym.
>
> [ added Miroslav to CC, just in case he would like to check :) ]
>
> I have just double checked as well, and am fairly certain that the
> Elf_Sym st_size field is not used to apply module relocations in any
> arches, and it is not used in the core module loader nor in the module
> kallsyms code. We'd like to avoid overwriting st_info in any case, to
> fix kallsyms on Thumb-2 and also so that livepatch won't run into any
> issues with delayed relocations, should livepatch support ever expand
> to arches (e.g., arm) that rely on st_info for module relocations.
>

Also note that st_size cannot be relied upon in general, since we
overwrite the addresses of undefined symbols in a module's symbol
table when resolve them against ksyms, while the st_size field is kept
at 0. At relocation time, we don't really distinguish anymore between
local and external module symbols, and so relying on st_size to be
accurate would definitely break things.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] module: Overwrite st_size instead of st_info
  2018-11-22 17:40       ` Ard Biesheuvel
@ 2018-11-22 17:49         ` Russell King - ARM Linux
  2018-11-23 17:21           ` Dave Martin
  0 siblings, 1 reply; 11+ messages in thread
From: Russell King - ARM Linux @ 2018-11-22 17:49 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Jessica Yu, vincent.whitchurch, Linux Kernel Mailing List,
	Miroslav Benes, Dave Martin, linux-arm-kernel

On Thu, Nov 22, 2018 at 06:40:45PM +0100, Ard Biesheuvel wrote:
> On Thu, 22 Nov 2018 at 17:29, Jessica Yu <jeyu@kernel.org> wrote:
> >
> > +++ Vincent Whitchurch [22/11/18 13:24 +0100]:
> > >On Thu, Nov 22, 2018 at 12:01:54PM +0000, Dave Martin wrote:
> > >> On Mon, Nov 19, 2018 at 05:25:12PM +0100, Vincent Whitchurch wrote:
> > >> > st_info is currently overwritten after relocation and used to store the
> > >> > elf_type().  However, we're going to need it fix kallsyms on ARM's
> > >> > Thumb-2 kernels, so preserve st_info and overwrite the st_size field
> > >> > instead.  st_size is neither used by the module core nor by any
> > >> > architecture.
> > >> >
> > >> > Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
> > >> > ---
> > >> > v4: Split out to separate patch.  Use st_size instead of st_other.
> > >> > v1-v3: See PATCH 2/2
> > >> >
> > >> >  kernel/module.c | 4 ++--
> > >> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >> >
> > >> > diff --git a/kernel/module.c b/kernel/module.c
> > >> > index 49a405891587..3d86a38b580c 100644
> > >> > --- a/kernel/module.c
> > >> > +++ b/kernel/module.c
> > >> > @@ -2682,7 +2682,7 @@ static void add_kallsyms(struct module *mod, const struct load_info *info)
> > >> >
> > >> >    /* Set types up while we still have access to sections. */
> > >> >    for (i = 0; i < mod->kallsyms->num_symtab; i++)
> > >> > -          mod->kallsyms->symtab[i].st_info
> > >> > +          mod->kallsyms->symtab[i].st_size
> > >> >                    = elf_type(&mod->kallsyms->symtab[i], info);
> > >> >
> > >> >    /* Now populate the cut down core kallsyms for after init. */
> > >> > @@ -4061,7 +4061,7 @@ int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
> > >> >            kallsyms = rcu_dereference_sched(mod->kallsyms);
> > >> >            if (symnum < kallsyms->num_symtab) {
> > >> >                    *value = kallsyms->symtab[symnum].st_value;
> > >> > -                  *type = kallsyms->symtab[symnum].st_info;
> > >> > +                  *type = kallsyms->symtab[symnum].st_size;
> > >> >                    strlcpy(name, symname(kallsyms, symnum), KSYM_NAME_LEN);
> > >> >                    strlcpy(module_name, mod->name, MODULE_NAME_LEN);
> > >> >                    *exported = is_exported(name, *value, mod);
> > >>
> > >> This is fine if st_size is really unused, but how sure are you of that?
> > >>
> > >> grepping for st_size throws up some hits that appear ELF-related, but
> > >> I've not investigated them in detail.
> > >>
> > >> (The fact that struct stat has an identically named field throws up
> > >> a load of false positives too.)
> > >
> > >$ git describe --tags
> > >v4.20-rc3-93-g92b419289cee
> > >
> > >$ rg -m1 '[\.>]st_size'  --iglob '!**/tools/**' --iglob '!**/vdso*' --iglob '!**/scripts/**' --iglob '!**/usr/**' --iglob '!**/samples/**' | cat
> > >| kernel/kexec_file.c: if (sym->st_size != size) {
> > >
> > >Symbols in kexec kernel.
> > >
> > >| fs/stat.c:   tmp.st_size = stat->size;
> > >| Documentation/networking/tls.txt:  sendfile(sock, file, &offset, stat.st_size);
> > >| net/9p/client.c:             ret->st_rdev, ret->st_size, ret->st_blksize,
> > >| net/9p/protocol.c:                                   &stbuf->st_rdev, &stbuf->st_size,
> > >| fs/9p/vfs_inode_dotl.c:              i_size_write(inode, stat->st_size);
> > >| fs/hostfs/hostfs_user.c:     p->size = buf->st_size;
> > >| arch/powerpc/boot/mktree.c:  nblks = (st.st_size + IMGBLK) / IMGBLK;
> > >| arch/alpha/kernel/osf_sys.c: tmp.st_size     = lstat->size;
> > >| arch/x86/ia32/sys_ia32.c:        __put_user(stat->size, &ubuf->st_size) ||
> > >
> > >Not Elf_Sym.
> > >
> > >| arch/x86/kernel/machine_kexec_64.c:                   sym->st_size);
> > >
> > >Symbols in kexec kernel.
> > >
> > >| arch/sparc/boot/piggyback.c: st4(buffer + 12, s.st_size);
> > >| arch/sparc/kernel/sys_sparc32.c:     err |= put_user(stat->size, &statbuf->st_size);
> > >| arch/um/os-Linux/file.c:             .ust_size    = src->st_size,    /* total size, in bytes */
> > >| arch/um/os-Linux/start_up.c: size = (buf.st_size + UM_KERN_PAGE_SIZE) & ~(UM_KERN_PAGE_SIZE - 1);
> > >| arch/s390/kernel/compat_linux.c:     tmp.st_size = stat->size;
> > >| arch/arm/kernel/sys_oabi-compat.c:   tmp.st_size = stat->size;
> > >| arch/mips/boot/compressed/calc_vmlinuz_load_addr.c:  vmlinux_size = (uint64_t)sb.st_size;
> > >| drivers/net/ethernet/marvell/sky2.c:         hw->st_idx = RING_NEXT(hw->st_idx, hw->st_size);
> > >
> > >Not Elf_Sym.
> >
> > [ added Miroslav to CC, just in case he would like to check :) ]
> >
> > I have just double checked as well, and am fairly certain that the
> > Elf_Sym st_size field is not used to apply module relocations in any
> > arches, and it is not used in the core module loader nor in the module
> > kallsyms code. We'd like to avoid overwriting st_info in any case, to
> > fix kallsyms on Thumb-2 and also so that livepatch won't run into any
> > issues with delayed relocations, should livepatch support ever expand
> > to arches (e.g., arm) that rely on st_info for module relocations.
> >
> 
> Also note that st_size cannot be relied upon in general, since we
> overwrite the addresses of undefined symbols in a module's symbol
> table when resolve them against ksyms, while the st_size field is kept
> at 0. At relocation time, we don't really distinguish anymore between
> local and external module symbols, and so relying on st_size to be
> accurate would definitely break things.

Umm.

Undefined symbols of course have a zero size, because it's not known
how large the definition that the symbol refers to is.  However,
the non-undefined symbols should have a size with them - anything
emitted by the compiler should, but only if .size has been used in
the assembler will assembly have correct sizes.

Aren't SHN_UNDEF symbols ignored, except when applying the
relocations, where we rely on st_value to be set on SHN_UNDEF
symbols?

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] module: Overwrite st_size instead of st_info
  2018-11-22 16:28     ` Jessica Yu
  2018-11-22 17:40       ` Ard Biesheuvel
@ 2018-11-23 12:57       ` Miroslav Benes
  1 sibling, 0 replies; 11+ messages in thread
From: Miroslav Benes @ 2018-11-23 12:57 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Vincent Whitchurch, Dave Martin, linux, linux-arm-kernel, linux-kernel

On Thu, 22 Nov 2018, Jessica Yu wrote:

> +++ Vincent Whitchurch [22/11/18 13:24 +0100]:
> >On Thu, Nov 22, 2018 at 12:01:54PM +0000, Dave Martin wrote:
> >> On Mon, Nov 19, 2018 at 05:25:12PM +0100, Vincent Whitchurch wrote:
> >> > st_info is currently overwritten after relocation and used to store the
> >> > elf_type().  However, we're going to need it fix kallsyms on ARM's
> >> > Thumb-2 kernels, so preserve st_info and overwrite the st_size field
> >> > instead.  st_size is neither used by the module core nor by any
> >> > architecture.
> >> >
> >> > Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
> >> > ---
> >> > v4: Split out to separate patch.  Use st_size instead of st_other.
> >> > v1-v3: See PATCH 2/2
> >> >
> >> >  kernel/module.c | 4 ++--
> >> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/kernel/module.c b/kernel/module.c
> >> > index 49a405891587..3d86a38b580c 100644
> >> > --- a/kernel/module.c
> >> > +++ b/kernel/module.c
> >> > @@ -2682,7 +2682,7 @@ static void add_kallsyms(struct module *mod, const
> >> > struct load_info *info)
> >> >
> >> >   /* Set types up while we still have access to sections. */
> >> >   for (i = 0; i < mod->kallsyms->num_symtab; i++)
> >> > -		mod->kallsyms->symtab[i].st_info
> >> > +		mod->kallsyms->symtab[i].st_size
> >> >     = elf_type(&mod->kallsyms->symtab[i], info);
> >> >
> >> > 	/* Now populate the cut down core kallsyms for after init. */
> >> > @@ -4061,7 +4061,7 @@ int module_get_kallsym(unsigned int symnum,
> >> > unsigned long *value, char *type,
> >> >    kallsyms = rcu_dereference_sched(mod->kallsyms);
> >> >    if (symnum < kallsyms->num_symtab) {
> >> > 			*value = kallsyms->symtab[symnum].st_value;
> >> > -			*type = kallsyms->symtab[symnum].st_info;
> >> > +			*type = kallsyms->symtab[symnum].st_size;
> >> >     strlcpy(name, symname(kallsyms, symnum), KSYM_NAME_LEN);
> >> >     strlcpy(module_name, mod->name, MODULE_NAME_LEN);
> >> >     *exported = is_exported(name, *value, mod);
> >>
> >> This is fine if st_size is really unused, but how sure are you of that?
> >>
> >> grepping for st_size throws up some hits that appear ELF-related, but
> >> I've not investigated them in detail.
> >>
> >> (The fact that struct stat has an identically named field throws up
> >> a load of false positives too.)
> >
> >$ git describe --tags
> >v4.20-rc3-93-g92b419289cee
> >
> >$ rg -m1 '[\.>]st_size'  --iglob '!**/tools/**' --iglob '!**/vdso*' --iglob
> >'!**/scripts/**' --iglob '!**/usr/**' --iglob '!**/samples/**' | cat
> >| kernel/kexec_file.c:	if (sym->st_size != size) {
> >
> >Symbols in kexec kernel.
> >
> >| fs/stat.c:	tmp.st_size = stat->size;
> >| Documentation/networking/tls.txt:  sendfile(sock, file, &offset,
> >| stat.st_size);
> >| net/9p/client.c:		ret->st_rdev, ret->st_size, ret->st_blksize,
> >| net/9p/protocol.c:					&stbuf->st_rdev,
> >| &stbuf->st_size,
> >| fs/9p/vfs_inode_dotl.c:		i_size_write(inode, stat->st_size);
> >| fs/hostfs/hostfs_user.c:	p->size = buf->st_size;
> >| arch/powerpc/boot/mktree.c:	nblks = (st.st_size + IMGBLK) / IMGBLK;
> >| arch/alpha/kernel/osf_sys.c:	tmp.st_size	= lstat->size;
> >| arch/x86/ia32/sys_ia32.c:	    __put_user(stat->size, &ubuf->st_size) ||
> >
> >Not Elf_Sym.
> >
> >| arch/x86/kernel/machine_kexec_64.c:			 sym->st_size);
> >
> >Symbols in kexec kernel.
> >
> >| arch/sparc/boot/piggyback.c:	st4(buffer + 12, s.st_size);
> >| arch/sparc/kernel/sys_sparc32.c:	err |= put_user(stat->size,
> >| &statbuf->st_size);
> >| arch/um/os-Linux/file.c:		.ust_size    = src->st_size,    /*
> >| total size, in bytes */
> >| arch/um/os-Linux/start_up.c:	size = (buf.st_size + UM_KERN_PAGE_SIZE) &
> >| ~(UM_KERN_PAGE_SIZE - 1);
> >| arch/s390/kernel/compat_linux.c:	tmp.st_size = stat->size;
> >| arch/arm/kernel/sys_oabi-compat.c:	tmp.st_size = stat->size;
> >| arch/mips/boot/compressed/calc_vmlinuz_load_addr.c:	vmlinux_size =
> >| (uint64_t)sb.st_size;
> >| drivers/net/ethernet/marvell/sky2.c:		hw->st_idx =
> >| RING_NEXT(hw->st_idx, hw->st_size);
> >
> >Not Elf_Sym.
> 
> [ added Miroslav to CC, just in case he would like to check :) ]
> 
> I have just double checked as well, and am fairly certain that the
> Elf_Sym st_size field is not used to apply module relocations in any
> arches, and it is not used in the core module loader nor in the module
> kallsyms code. We'd like to avoid overwriting st_info in any case, to
> fix kallsyms on Thumb-2 and also so that livepatch won't run into any
> issues with delayed relocations, should livepatch support ever expand
> to arches (e.g., arm) that rely on st_info for module relocations.

I checked and came to the same conclusion.

Regards,
Miroslav

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] module: Overwrite st_size instead of st_info
  2018-11-22 17:49         ` Russell King - ARM Linux
@ 2018-11-23 17:21           ` Dave Martin
  0 siblings, 0 replies; 11+ messages in thread
From: Dave Martin @ 2018-11-23 17:21 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Ard Biesheuvel, vincent.whitchurch, Linux Kernel Mailing List,
	Jessica Yu, Miroslav Benes, linux-arm-kernel

On Thu, Nov 22, 2018 at 05:49:23PM +0000, Russell King - ARM Linux wrote:
> On Thu, Nov 22, 2018 at 06:40:45PM +0100, Ard Biesheuvel wrote:
> > On Thu, 22 Nov 2018 at 17:29, Jessica Yu <jeyu@kernel.org> wrote:
> > >
> > > +++ Vincent Whitchurch [22/11/18 13:24 +0100]:
> > > >On Thu, Nov 22, 2018 at 12:01:54PM +0000, Dave Martin wrote:
> > > >> On Mon, Nov 19, 2018 at 05:25:12PM +0100, Vincent Whitchurch wrote:
> > > >> > st_info is currently overwritten after relocation and used to store the
> > > >> > elf_type().  However, we're going to need it fix kallsyms on ARM's
> > > >> > Thumb-2 kernels, so preserve st_info and overwrite the st_size field
> > > >> > instead.  st_size is neither used by the module core nor by any
> > > >> > architecture.
> > > >> >
> > > >> > Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
> > > >> > ---
> > > >> > v4: Split out to separate patch.  Use st_size instead of st_other.
> > > >> > v1-v3: See PATCH 2/2
> > > >> >
> > > >> >  kernel/module.c | 4 ++--
> > > >> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >> >
> > > >> > diff --git a/kernel/module.c b/kernel/module.c
> > > >> > index 49a405891587..3d86a38b580c 100644
> > > >> > --- a/kernel/module.c
> > > >> > +++ b/kernel/module.c
> > > >> > @@ -2682,7 +2682,7 @@ static void add_kallsyms(struct module *mod, const struct load_info *info)
> > > >> >
> > > >> >    /* Set types up while we still have access to sections. */
> > > >> >    for (i = 0; i < mod->kallsyms->num_symtab; i++)
> > > >> > -          mod->kallsyms->symtab[i].st_info
> > > >> > +          mod->kallsyms->symtab[i].st_size
> > > >> >                    = elf_type(&mod->kallsyms->symtab[i], info);
> > > >> >
> > > >> >    /* Now populate the cut down core kallsyms for after init. */
> > > >> > @@ -4061,7 +4061,7 @@ int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
> > > >> >            kallsyms = rcu_dereference_sched(mod->kallsyms);
> > > >> >            if (symnum < kallsyms->num_symtab) {
> > > >> >                    *value = kallsyms->symtab[symnum].st_value;
> > > >> > -                  *type = kallsyms->symtab[symnum].st_info;
> > > >> > +                  *type = kallsyms->symtab[symnum].st_size;
> > > >> >                    strlcpy(name, symname(kallsyms, symnum), KSYM_NAME_LEN);
> > > >> >                    strlcpy(module_name, mod->name, MODULE_NAME_LEN);
> > > >> >                    *exported = is_exported(name, *value, mod);
> > > >>
> > > >> This is fine if st_size is really unused, but how sure are you of that?
> > > >>
> > > >> grepping for st_size throws up some hits that appear ELF-related, but
> > > >> I've not investigated them in detail.
> > > >>
> > > >> (The fact that struct stat has an identically named field throws up
> > > >> a load of false positives too.)
> > > >
> > > >$ git describe --tags
> > > >v4.20-rc3-93-g92b419289cee
> > > >
> > > >$ rg -m1 '[\.>]st_size'  --iglob '!**/tools/**' --iglob '!**/vdso*' --iglob '!**/scripts/**' --iglob '!**/usr/**' --iglob '!**/samples/**' | cat
> > > >| kernel/kexec_file.c: if (sym->st_size != size) {
> > > >
> > > >Symbols in kexec kernel.
> > > >
> > > >| fs/stat.c:   tmp.st_size = stat->size;
> > > >| Documentation/networking/tls.txt:  sendfile(sock, file, &offset, stat.st_size);
> > > >| net/9p/client.c:             ret->st_rdev, ret->st_size, ret->st_blksize,
> > > >| net/9p/protocol.c:                                   &stbuf->st_rdev, &stbuf->st_size,
> > > >| fs/9p/vfs_inode_dotl.c:              i_size_write(inode, stat->st_size);
> > > >| fs/hostfs/hostfs_user.c:     p->size = buf->st_size;
> > > >| arch/powerpc/boot/mktree.c:  nblks = (st.st_size + IMGBLK) / IMGBLK;
> > > >| arch/alpha/kernel/osf_sys.c: tmp.st_size     = lstat->size;
> > > >| arch/x86/ia32/sys_ia32.c:        __put_user(stat->size, &ubuf->st_size) ||
> > > >
> > > >Not Elf_Sym.
> > > >
> > > >| arch/x86/kernel/machine_kexec_64.c:                   sym->st_size);
> > > >
> > > >Symbols in kexec kernel.
> > > >
> > > >| arch/sparc/boot/piggyback.c: st4(buffer + 12, s.st_size);
> > > >| arch/sparc/kernel/sys_sparc32.c:     err |= put_user(stat->size, &statbuf->st_size);
> > > >| arch/um/os-Linux/file.c:             .ust_size    = src->st_size,    /* total size, in bytes */
> > > >| arch/um/os-Linux/start_up.c: size = (buf.st_size + UM_KERN_PAGE_SIZE) & ~(UM_KERN_PAGE_SIZE - 1);
> > > >| arch/s390/kernel/compat_linux.c:     tmp.st_size = stat->size;
> > > >| arch/arm/kernel/sys_oabi-compat.c:   tmp.st_size = stat->size;
> > > >| arch/mips/boot/compressed/calc_vmlinuz_load_addr.c:  vmlinux_size = (uint64_t)sb.st_size;
> > > >| drivers/net/ethernet/marvell/sky2.c:         hw->st_idx = RING_NEXT(hw->st_idx, hw->st_size);
> > > >
> > > >Not Elf_Sym.
> > >
> > > [ added Miroslav to CC, just in case he would like to check :) ]
> > >
> > > I have just double checked as well, and am fairly certain that the
> > > Elf_Sym st_size field is not used to apply module relocations in any
> > > arches, and it is not used in the core module loader nor in the module
> > > kallsyms code. We'd like to avoid overwriting st_info in any case, to
> > > fix kallsyms on Thumb-2 and also so that livepatch won't run into any
> > > issues with delayed relocations, should livepatch support ever expand
> > > to arches (e.g., arm) that rely on st_info for module relocations.
> > >
> > 
> > Also note that st_size cannot be relied upon in general, since we
> > overwrite the addresses of undefined symbols in a module's symbol
> > table when resolve them against ksyms, while the st_size field is kept
> > at 0. At relocation time, we don't really distinguish anymore between
> > local and external module symbols, and so relying on st_size to be
> > accurate would definitely break things.
> 
> Umm.
> 
> Undefined symbols of course have a zero size, because it's not known
> how large the definition that the symbol refers to is.  However,
> the non-undefined symbols should have a size with them - anything
> emitted by the compiler should, but only if .size has been used in
> the assembler will assembly have correct sizes.

which I guess is one reason st_size is a bit unreliable.

For kallsyms purposes we assume that the size of each symbol is just the
offset from it to the next symbol, which seems good enough in practice.

> Aren't SHN_UNDEF symbols ignored, except when applying the
> relocations, where we rely on st_value to be set on SHN_UNDEF
> symbols?

Presumably, but it looks like st_size is pretty much don't care.

I renamed st_size in the struct in elf.h to a garbage name and the
kernel and modules still build fine, though I only tried this for arm64
so far.

This suggests that at runtime at least, there is no runtime usage of
this field at all unless it's in arch-specific code for other arches.
Others seem confident that it's not used anywhere, and although I've not
gone through all the grep hits, I've only encountered false positives so
far.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] module: Overwrite st_size instead of st_info
  2018-11-19 16:25 [PATCH 1/2] module: Overwrite st_size instead of st_info Vincent Whitchurch
  2018-11-19 16:25 ` [PATCH 2/2] ARM: module: Fix function kallsyms on Thumb-2 Vincent Whitchurch
  2018-11-22 12:01 ` [PATCH 1/2] module: Overwrite st_size instead of st_info Dave Martin
@ 2018-11-23 17:59 ` Dave Martin
  2 siblings, 0 replies; 11+ messages in thread
From: Dave Martin @ 2018-11-23 17:59 UTC (permalink / raw)
  To: Vincent Whitchurch
  Cc: linux, jeyu, Vincent Whitchurch, linux-arm-kernel, linux-kernel

On Mon, Nov 19, 2018 at 05:25:12PM +0100, Vincent Whitchurch wrote:
> st_info is currently overwritten after relocation and used to store the
> elf_type().  However, we're going to need it fix kallsyms on ARM's
> Thumb-2 kernels, so preserve st_info and overwrite the st_size field
> instead.  st_size is neither used by the module core nor by any
> architecture.
> 
> Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
> ---
> v4: Split out to separate patch.  Use st_size instead of st_other.
> v1-v3: See PATCH 2/2
> 
>  kernel/module.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/module.c b/kernel/module.c
> index 49a405891587..3d86a38b580c 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -2682,7 +2682,7 @@ static void add_kallsyms(struct module *mod, const struct load_info *info)
>  
>  	/* Set types up while we still have access to sections. */
>  	for (i = 0; i < mod->kallsyms->num_symtab; i++)
> -		mod->kallsyms->symtab[i].st_info
> +		mod->kallsyms->symtab[i].st_size
>  			= elf_type(&mod->kallsyms->symtab[i], info);
>  
>  	/* Now populate the cut down core kallsyms for after init. */
> @@ -4061,7 +4061,7 @@ int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
>  		kallsyms = rcu_dereference_sched(mod->kallsyms);
>  		if (symnum < kallsyms->num_symtab) {
>  			*value = kallsyms->symtab[symnum].st_value;
> -			*type = kallsyms->symtab[symnum].st_info;
> +			*type = kallsyms->symtab[symnum].st_size;
>  			strlcpy(name, symname(kallsyms, symnum), KSYM_NAME_LEN);
>  			strlcpy(module_name, mod->name, MODULE_NAME_LEN);
>  			*exported = is_exported(name, *value, mod);

Based on the discussion here:

Reviewed-by: Dave Martin <Dave.Martin@arm.com>

(I still think this is a bit gross, but without resorting to overkill
there's not an obvious nicer option.  Even before this patch, we were
abusing st_info.  Of course, someday somebody may find a use for
st_size and then we'd need to revisit this code...)

Cheers
---Dave

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] ARM: module: Fix function kallsyms on Thumb-2
  2018-11-19 16:25 ` [PATCH 2/2] ARM: module: Fix function kallsyms on Thumb-2 Vincent Whitchurch
@ 2018-11-23 18:34   ` Dave Martin
  0 siblings, 0 replies; 11+ messages in thread
From: Dave Martin @ 2018-11-23 18:34 UTC (permalink / raw)
  To: Vincent Whitchurch
  Cc: linux, jeyu, Vincent Whitchurch, linux-arm-kernel, linux-kernel

On Mon, Nov 19, 2018 at 05:25:13PM +0100, Vincent Whitchurch wrote:
> Thumb-2 functions have the lowest bit set in the symbol value in the
> symtab.  When kallsyms are generated for the vmlinux, the kallsyms are
> generated from the output of nm, and nm clears the lowest bit.
> 
>  $ arm-linux-gnueabihf-readelf -a vmlinux | grep show_interrupts
>   95947: 8015dc89   686 FUNC    GLOBAL DEFAULT    2 show_interrupts
>  $ arm-linux-gnueabihf-nm vmlinux | grep show_interrupts
>  8015dc88 T show_interrupts
>  $ cat /proc/kallsyms | grep show_interrupts
>  8015dc88 T show_interrupts
> 
> However, for modules, the kallsyms uses the values in the symbol table
> without modification, so for functions in modules, the lowest bit is set
> in kallsyms.
> 
>  $ arm-linux-gnueabihf-readelf -a drivers/net/tun.ko | grep tun_get_socket
>     333: 00002d4d    36 FUNC    GLOBAL DEFAULT    1 tun_get_socket
>  $ arm-linux-gnueabihf-nm drivers/net/tun.ko | grep tun_get_socket
>  00002d4c T tun_get_socket
>  $ cat /proc/kallsyms | grep tun_get_socket
>  7f802d4d t tun_get_socket      [tun]
> 
> Because of this, the symbol+offset of the crashing instruction shown in
> oopses is incorrect when the crash is in a module.  For example, given a
> tun_get_socket which starts like this,
> 
>  00002d4c <tun_get_socket>:
>      2d4c:       6943            ldr     r3, [r0, #20]
>      2d4e:       4a07            ldr     r2, [pc, #28]
>      2d50:       4293            cmp     r3, r2
> 
> a crash when tun_get_socket is called with NULL results in:
> 
>  PC is at tun_xdp+0xa3/0xa4 [tun]
>  pc : [<7f802d4c>]
> 
> As can be seen, the "PC is at" line reports the wrong symbol name, and
> the symbol+offset will point to the wrong source line if it is passed to
> gdb.
> 
> To solve this, add a way for archs to fixup the reading of these module
> kallsyms values, and use that to clear the lowest bit for function
> symbols on Thumb-2.
> 
> After the fix:
> 
>  # cat /proc/kallsyms | grep tun_get_socket
>  7f802d4c t tun_get_socket       [tun]
> 
>  PC is at tun_get_socket+0x0/0x24 [tun]
>  pc : [<7f802d4c>]
> 
> Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
> ---
> v4: Split out st_value overwrite change.  Add HAVE* macro to avoid function call.
> v3: Do not overwrite st_value
> v2: Fix build warning with !MODULES
> 
>  arch/arm/include/asm/module.h | 11 +++++++++++
>  include/linux/module.h        |  7 +++++++
>  kernel/module.c               | 25 ++++++++++++++-----------
>  3 files changed, 32 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/arm/include/asm/module.h b/arch/arm/include/asm/module.h
> index 9e81b7c498d8..e2ccec651e71 100644
> --- a/arch/arm/include/asm/module.h
> +++ b/arch/arm/include/asm/module.h
> @@ -61,4 +61,15 @@ u32 get_module_plt(struct module *mod, unsigned long loc, Elf32_Addr val);
>  	MODULE_ARCH_VERMAGIC_ARMTHUMB \
>  	MODULE_ARCH_VERMAGIC_P2V
>  
> +#ifdef CONFIG_THUMB2_KERNEL
> +#define HAVE_ARCH_MODULE_KALLSYMS_SYMBOL_VALUE
> +static inline unsigned long module_kallsyms_symbol_value(Elf_Sym *sym)
> +{
> +	if (ELF_ST_TYPE(sym->st_info) == STT_FUNC)
> +		return sym->st_value & ~1;
> +
> +	return sym->st_value;
> +}
> +#endif
> +
>  #endif /* _ASM_ARM_MODULE_H */
> diff --git a/include/linux/module.h b/include/linux/module.h
> index fce6b4335e36..cfc55f358800 100644
> --- a/include/linux/module.h
> +++ b/include/linux/module.h
> @@ -486,6 +486,13 @@ struct module {
>  #define MODULE_ARCH_INIT {}
>  #endif
>  
> +#ifndef HAVE_ARCH_MODULE_KALLSYMS_SYMBOL_VALUE
> +static inline unsigned long module_kallsyms_symbol_value(Elf_Sym *sym)
> +{
> +	return sym->st_value;
> +}
> +#endif
> +
>  extern struct mutex module_mutex;
>  
>  /* FIXME: It'd be nice to isolate modules during init, too, so they
> diff --git a/kernel/module.c b/kernel/module.c
> index 3d86a38b580c..a36d7915ed2b 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3934,6 +3934,9 @@ static const char *get_ksymbol(struct module *mod,
>  	/* Scan for closest preceding symbol, and next symbol. (ELF
>  	   starts real symbols at 1). */
>  	for (i = 1; i < kallsyms->num_symtab; i++) {
> +		unsigned long thisval = module_kallsyms_symbol_value(&kallsyms->symtab[i]);
> +		unsigned long bestval = module_kallsyms_symbol_value(&kallsyms->symtab[best]);
> +
>  		if (kallsyms->symtab[i].st_shndx == SHN_UNDEF)
>  			continue;
>  
> @@ -3943,21 +3946,21 @@ static const char *get_ksymbol(struct module *mod,
>  		    || is_arm_mapping_symbol(symname(kallsyms, i)))
>  			continue;
>  
> -		if (kallsyms->symtab[i].st_value <= addr
> -		    && kallsyms->symtab[i].st_value > kallsyms->symtab[best].st_value)
> +		if (thisval <= addr
> +		    && thisval > bestval)

Nit: this fits easily on one line now.

>  			best = i;

Can we declare bestval outside the loop and update it here, so that
we always have
bestval == module_kallsyms_symbol_value(&kallsyms->symtab[best]) ?

Then we wouldn't need to call module_kallsyms_symbol_value() again
afterwards at [1], [2] below.

> -		if (kallsyms->symtab[i].st_value > addr
> -		    && kallsyms->symtab[i].st_value < nextval)
> -			nextval = kallsyms->symtab[i].st_value;
> +		if (thisval > addr
> +		    && thisval < nextval)
> +			nextval = thisval;

Nit: same again.

>  	}
>  
>  	if (!best)
>  		return NULL;
>  
>  	if (size)
> -		*size = nextval - kallsyms->symtab[best].st_value;
> +		*size = nextval - module_kallsyms_symbol_value(&kallsyms->symtab[best]);

[1]

>  	if (offset)
> -		*offset = addr - kallsyms->symtab[best].st_value;
> +		*offset = addr - module_kallsyms_symbol_value(&kallsyms->symtab[best]);

[2]

>  	return symname(kallsyms, best);
>  }
>  
> @@ -4060,7 +4063,7 @@ int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
>  			continue;
>  		kallsyms = rcu_dereference_sched(mod->kallsyms);
>  		if (symnum < kallsyms->num_symtab) {
> -			*value = kallsyms->symtab[symnum].st_value;
> +			*value = module_kallsyms_symbol_value(&kallsyms->symtab[symnum]);
>  			*type = kallsyms->symtab[symnum].st_size;
>  			strlcpy(name, symname(kallsyms, symnum), KSYM_NAME_LEN);
>  			strlcpy(module_name, mod->name, MODULE_NAME_LEN);
> @@ -4082,7 +4085,7 @@ static unsigned long mod_find_symname(struct module *mod, const char *name)
>  	for (i = 0; i < kallsyms->num_symtab; i++)
>  		if (strcmp(name, symname(kallsyms, i)) == 0 &&
>  		    kallsyms->symtab[i].st_shndx != SHN_UNDEF)
> -			return kallsyms->symtab[i].st_value;
> +			return module_kallsyms_symbol_value(&kallsyms->symtab[i]);
>  	return 0;
>  }
>  
> @@ -4131,8 +4134,8 @@ int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
>  			if (kallsyms->symtab[i].st_shndx == SHN_UNDEF)
>  				continue;
>  
> -			ret = fn(data, symname(kallsyms, i),
> -				 mod, kallsyms->symtab[i].st_value);
> +			ret = fn(data, symname(kallsyms, i), mod,
> +				 module_kallsyms_symbol_value(&kallsyms->symtab[i]));

Nit: We have some more overlong lines throughout this file now, which is
mildly annoying (though hardly a big deal).

In may places the expression kallsyms->symtab[foo] appears multiple
times and adds to verbosity.

Is it worth stashing the pointer first?  For example, in this case:

			const Elf_Sym *sym = &kallsyms->symtab[i];

			if (sym->st_shndx == SHN_UNDEF)
				continue;

			ret = fn(data, symname(kallsyms, i), mod,
				module_kallsyms_symbol_value(sym));

This adds two lines in the diffstat of course.  Take your pick!

Cheers
---Dave

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-11-23 18:34 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-19 16:25 [PATCH 1/2] module: Overwrite st_size instead of st_info Vincent Whitchurch
2018-11-19 16:25 ` [PATCH 2/2] ARM: module: Fix function kallsyms on Thumb-2 Vincent Whitchurch
2018-11-23 18:34   ` Dave Martin
2018-11-22 12:01 ` [PATCH 1/2] module: Overwrite st_size instead of st_info Dave Martin
2018-11-22 12:24   ` Vincent Whitchurch
2018-11-22 16:28     ` Jessica Yu
2018-11-22 17:40       ` Ard Biesheuvel
2018-11-22 17:49         ` Russell King - ARM Linux
2018-11-23 17:21           ` Dave Martin
2018-11-23 12:57       ` Miroslav Benes
2018-11-23 17:59 ` Dave Martin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).