* alternatives to null-terminated byte arrays in syscalls in the future?
@ 2016-04-08 21:04 Andrew Kelley
  2016-04-08 21:10 ` Denys Vlasenko
  2016-04-09 12:37 ` One Thousand Gnomes
  0 siblings, 2 replies; 4+ messages in thread
From: Andrew Kelley @ 2016-04-08 21:04 UTC
  To: linux-kernel

The open syscall looks like this:

SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, umode_t, mode)

filename is a null terminated byte array. Null termination is one way
to handle lengths of byte arrays, but arguably a better way is to keep
track of the length in a separate field. Many programming languages
use pointer + length instead of null termination for various reasons.

When it's time to make a syscall such as open, software which does not
have a null character at the end of byte arrays is forced to allocate
memory, do a memcpy, insert a null byte, perform the open syscall,
then deallocate the memory.
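
A minimal sketch of that sequence in C, for illustration only. The
helper name open_with_len is hypothetical (it is not a libc or kernel
interface), and the error handling is an assumption of this sketch:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper, not a real libc or kernel interface: open a path
 * given as pointer + length, with no terminator in the caller's buffer. */
int open_with_len(const char *path, size_t len, int flags, mode_t mode)
{
    if (len == SIZE_MAX) {            /* len + 1 would overflow */
        errno = ENAMETOOLONG;
        return -1;
    }

    char *tmp = malloc(len + 1);      /* allocate memory */
    if (tmp == NULL)
        return -1;

    memcpy(tmp, path, len);           /* do a memcpy */
    tmp[len] = '\0';                  /* insert a null byte */

    int fd = open(tmp, flags, mode);  /* perform the open call */

    int saved_errno = errno;          /* keep open's errno across free() */
    free(tmp);                        /* deallocate the memory */
    errno = saved_errno;
    return fd;
}
```

A caller holding the bytes "/dev/null" somewhere inside a larger buffer
would call open_with_len(ptr, 9, O_RDONLY, 0). Note that a path with an
embedded NUL byte would be silently truncated by this sketch; a real
wrapper would probably want to reject it.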

What are the chances that in the future, Linux will have alternate
syscalls which accept byte array parameters where one can pass the
length of the byte array explicitly instead of using a null byte?

Regards,
Andrew Kelley

* Re: alternatives to null-terminated byte arrays in syscalls in the future?
  2016-04-08 21:04 alternatives to null-terminated byte arrays in syscalls in the future? Andrew Kelley
@ 2016-04-08 21:10 ` Denys Vlasenko
  2016-04-08 21:21   ` Andrew Kelley
  2016-04-09 12:37 ` One Thousand Gnomes
  1 sibling, 1 reply; 4+ messages in thread
From: Denys Vlasenko @ 2016-04-08 21:10 UTC
  To: Andrew Kelley; +Cc: Linux Kernel Mailing List

On Fri, Apr 8, 2016 at 11:04 PM, Andrew Kelley <superjoe30@gmail.com> wrote:
> The open syscall looks like this:
>
> SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, umode_t, mode)
>
> filename is a null terminated byte array. Null termination is one way
> to handle lengths of byte arrays, but arguably a better way is to keep
> track of the length in a separate field. Many programming languages
> use pointer + length instead of null termination for various reasons.
>
> When it's time to make a syscall such as open, software which does not
> have a null character at the end of byte arrays is forced to allocate
> memory, do a memcpy, insert a null byte, perform the open syscall,
> then deallocate the memory.

In many cases, it's possible to just add the NUL byte instead.

> What are the chances that in the future, Linux will have alternate
> syscalls which accept byte array parameters where one can pass the
> length of the byte array explicitly instead of using a null byte?

0% chance. The amount of PITA needed to make that happen far outweighs
the possible benefits.

* Re: alternatives to null-terminated byte arrays in syscalls in the future?
  2016-04-08 21:10 ` Denys Vlasenko
@ 2016-04-08 21:21   ` Andrew Kelley
  0 siblings, 0 replies; 4+ messages in thread
From: Andrew Kelley @ 2016-04-08 21:21 UTC
  To: Denys Vlasenko; +Cc: Linux Kernel Mailing List

On Fri, Apr 8, 2016 at 2:10 PM, Denys Vlasenko <vda.linux@googlemail.com> wrote:
> On Fri, Apr 8, 2016 at 11:04 PM, Andrew Kelley <superjoe30@gmail.com> wrote:
>> The open syscall looks like this:
>>
>> SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, umode_t, mode)
>>
>> filename is a null terminated byte array. Null termination is one way
>> to handle lengths of byte arrays, but arguably a better way is to keep
>> track of the length in a separate field. Many programming languages
>> use pointer + length instead of null termination for various reasons.
>>
>> When it's time to make a syscall such as open, software which does not
>> have a null character at the end of byte arrays is forced to allocate
>> memory, do a memcpy, insert a null byte, perform the open syscall,
>> then deallocate the memory.
>
> In many cases, it's possible to just add the NUL byte instead.

Counterexample: the Rust standard library.
https://github.com/rust-lang/rust/blob/7e996943784dcbabed433b6906510298ad80903b/src/libstd/sys/unix/fs.rs#L420-L423
https://github.com/rust-lang/rust/blob/7e996943784dcbabed433b6906510298ad80903b/src/libstd/sys/unix/fs.rs#L534-L536

The problem is that the open syscall sits at a low level in any given
application, so it is usually wrapped in an abstraction that cannot
guarantee there is space to add the NUL byte. Implementations therefore
have to take the safe bet of copying memory.
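
To illustrate the constraint: a wrapper that receives only a read-only
pointer + length cannot write past the end of the caller's bytes, so it
copies. The sketch below is one way to keep that copy cheap, using a
stack buffer for short paths and the heap otherwise. It is not taken
from the code linked above; the name open_slice and the SMALL_PATH
threshold are assumptions made up for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

#define SMALL_PATH 256  /* arbitrary threshold chosen for this sketch */

/* Hypothetical wrapper: the caller's bytes are read-only and may have no
 * spare byte after them, so the path is always copied before the call. */
int open_slice(const char *path, size_t len, int flags, mode_t mode)
{
    char small[SMALL_PATH];
    char *buf = small;

    if (len == SIZE_MAX) {          /* len + 1 would overflow */
        errno = ENAMETOOLONG;
        return -1;
    }
    if (len >= sizeof small) {      /* no room for path + NUL on the stack */
        buf = malloc(len + 1);
        if (buf == NULL)
            return -1;
    }

    memcpy(buf, path, len);
    buf[len] = '\0';

    int fd = open(buf, flags, mode);

    if (buf != small) {             /* only the heap copy needs freeing */
        int saved_errno = errno;
        free(buf);
        errno = saved_errno;
    }
    return fd;
}
```

Short paths then cost one memcpy and no allocation, but the copy itself
is still unavoidable with a NUL-terminated interface.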

>
>> What are the chances that in the future, Linux will have alternate
>> syscalls which accept byte array parameters where one can pass the
>> length of the byte array explicitly instead of using a null byte?
>
> 0% chances. Amount of PITA to make that happen far outweighs
> possible benefits.

OK, fair enough. If I proposed a patch to the mailing list, would that
change the chances at all?

* Re: alternatives to null-terminated byte arrays in syscalls in the future?
  2016-04-08 21:04 alternatives to null-terminated byte arrays in syscalls in the future? Andrew Kelley
  2016-04-08 21:10 ` Denys Vlasenko
@ 2016-04-09 12:37 ` One Thousand Gnomes
  1 sibling, 0 replies; 4+ messages in thread
From: One Thousand Gnomes @ 2016-04-09 12:37 UTC
  To: Andrew Kelley; +Cc: linux-kernel

On Fri, 8 Apr 2016 14:04:00 -0700
Andrew Kelley <superjoe30@gmail.com> wrote:

> The open syscall looks like this:
> 
> SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, umode_t, mode)
> 
> filename is a null terminated byte array. Null termination is one way
> to handle lengths of byte arrays, but arguably a better way is to keep
> track of the length in a separate field. Many programming languages
> use pointer + length instead of null termination for various reasons.
> 
> When it's time to make a syscall such as open, software which does not
> have a null character at the end of byte arrays is forced to allocate
> memory, do a memcpy, insert a null byte, perform the open syscall,
> then deallocate the memory.

That should only happen if the language wasn't carefully thought out. If
your name objects include both the length and the space available, so
that you can do array offset validation, then

- you can check if the \0 will fit
- your app or interpreter can add space for \0 or even include it
  specifically
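
A rough sketch of both points in C, under stated assumptions: the
struct name_buf and the functions name_init and name_open are
hypothetical names invented for this example, and ENOBUFS is an
arbitrary choice of error for the "no room" case.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical name object: records both the length in use and the
 * space available, so bounds can be checked before writing. */
struct name_buf {
    char  *data;
    size_t len;  /* bytes in use, terminator not counted */
    size_t cap;  /* total bytes available at data */
};

/* Build a name with one spare byte reserved, so a \0 always fits later. */
int name_init(struct name_buf *n, const char *bytes, size_t len)
{
    if (len == SIZE_MAX) {          /* len + 1 would overflow */
        errno = ENAMETOOLONG;
        return -1;
    }
    n->data = malloc(len + 1);
    if (n->data == NULL)
        return -1;
    memcpy(n->data, bytes, len);
    n->len = len;
    n->cap = len + 1;
    return 0;
}

/* Open the name in place: check that the \0 fits, write it, call open.
 * No second allocation or copy is needed. */
int name_open(struct name_buf *n, int flags, mode_t mode)
{
    if (n->len >= n->cap) {         /* the \0 will not fit */
        errno = ENOBUFS;
        return -1;
    }
    n->data[n->len] = '\0';
    return open(n->data, flags, mode);
}
```

This only helps when the code making the call owns the buffer and may
write to it; a borrowed, read-only view of someone else's bytes still
needs a copy.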

I would also be very surprised if most applications doing such
conversions even showed up meaningfully in the profiling. Pathname
syscalls are not the most common ones being executed.

Alan
