[Ksummit-discuss] [TECH TOPIC] Getting better/supplementary error info back to userspace

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary error info back to userspace
@ 2017-07-12 12:43 David Howells
  2017-07-12 14:33 ` Arnaldo Carvalho de Melo
  2017-07-12 14:57 ` David Howells
  0 siblings, 2 replies; 11+ messages in thread
From: David Howells @ 2017-07-12 12:43 UTC (permalink / raw)
  To: ksummit-discuss

Whilst undertaking a foray into container space and, related to that, looking
at overhauling the mounting API, it occurred to me that I could make use of
the mount context (now fs_context) that I was creating to allow the filesystem
driver to pass supplementary error information back to the userspace program
that was driving it in the form of textual messages:

	int fd = fsopen("ext4");
	write(fd, "d /dev/sda2");
	write(fd, "o user_xattr");
	if (fsmount(fd, "/mnt") == -1) {
		/* Something went wrong, read back any error info */
		size = read(fd, buffer, sizeof(buffer));
		/* Now print the supplementary error message */
		fprintf(stderr, "%*.*s\n", size, size, buffer);
	}

This would be particularly useful in the case of mounting a filesystem where
so many things can go wrong that a small number is insufficient to represent
them all.  Yes, you have the dmesg log, but that's not necessarily available
to you and is potentially intermixed with other things.  Further, it's more
user-friendly if the mount command or your GUI gives you the errors directly.

However, it occurred to me that this feature might be useful in other cases,
not just mounting, and there are cases where it's not easy or not possible to
get the message back to userspace because there's no user-accessible context
(eg. automounting), or because the context is buried several levels down the
stack (eg. NFS mount doing a pathwalk).

In which case, would it make sense to attach such a facility to the
task_struct instead?  I implemented a test of this using prctl, but a new
syscall might be a better idea, at least for reading.

 (*) int old_setting = prctl(PR_ERRMSG_ENABLE, int setting);

     Enable (setting == 1) or disable (setting == 0) the facility.
     Disabling the facility clears the error buffer.

 (*) int size = prctl(PR_ERRMSG_READ, char *buffer, int buf_size);

     Read back a message and discard it.  

Anyway, some questions:

 (1) Is this something that would be of interest on a more global scale?

     Or should I just stick with stashing it in the fs_context structure and
     find someway to route around the pathwalk in nfs mount?

     Or is this totally a bad idea and only dmesg should ever be used?

If it is of interest globally:

 (2) How big should I make each task's message buffer?  My current
     implementation allows each task to hold a single message if enabled.

 (3) Should I allow warnings in addition to errors?

 (4) Should I allow wait() and co. to try and retrieve errors from zombies?

 (5) Should execve() disable the facility?

 (6) Could all the messages be static (not kmalloc'd) and cleared/redacted by
     rmmod?  This would potentially prevent the use of formatted messages.

David

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary error info back to userspace
  2017-07-12 12:43 [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary error info back to userspace David Howells
@ 2017-07-12 14:33 ` Arnaldo Carvalho de Melo
  2017-07-12 14:44   ` Arnaldo Carvalho de Melo
  2017-07-12 14:57 ` David Howells
  1 sibling, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-07-12 14:33 UTC (permalink / raw)
  To: David Howells; +Cc: ksummit-discuss

Em Wed, Jul 12, 2017 at 01:43:30PM +0100, David Howells escreveu:
> Whilst undertaking a foray into container space and, related to that, looking
> at overhauling the mounting API, it occurred to me that I could make use of
> the mount context (now fs_context) that I was creating to allow the filesystem
> driver to pass supplementary error information back to the userspace program
> that was driving it in the form of textual messages:
> 
> 	int fd = fsopen("ext4");
> 	write(fd, "d /dev/sda2");
> 	write(fd, "o user_xattr");
> 	if (fsmount(fd, "/mnt") == -1) {
> 		/* Something went wrong, read back any error info */
> 		size = read(fd, buffer, sizeof(buffer));
> 		/* Now print the supplementary error message */
> 		fprintf(stderr, "%*.*s\n", size, size, buffer);
> 	}
> 
> This would be particularly useful in the case of mounting a filesystem where
> so many things can go wrong that a small number is insufficient to represent
> them all.  Yes, you have the dmesg log, but that's not necessarily available
> to you and is potentially intermixed with other things.  Further, it's more
> user-friendly if the mount command or your GUI gives you the errors directly.
> 
> However, it occurred to me that this feature might be useful in other cases,
> not just mounting, and there are cases where it's not easy or not possible to
> get the message back to userspace because there's no user-accessible context
> (eg. automounting), or because the context is buried several levels down the
> stack (eg. NFS mount doing a pathwalk).
> 
> In which case, would it make sense to attach such a facility to the
> task_struct instead?  I implemented a test of this using prctl, but a new
> syscall might be a better idea, at least for reading.
> 
>  (*) int old_setting = prctl(PR_ERRMSG_ENABLE, int setting);
> 
>      Enable (setting == 1) or disable (setting == 0) the facility.
>      Disabling the facility clears the error buffer.
> 
>  (*) int size = prctl(PR_ERRMSG_READ, char *buffer, int buf_size);
> 
>      Read back a message and discard it.  

There were discussions about this in the not so distant past, perf being
one of the areas where something like this would help a lot, lemme dig
it, yeah, there is even a short LWN article describing it and with links
to the lkml posts:

  https://lwn.net/Articles/657341/

Involces prctl as yours, etc, etc.

What we do now in tools/perf/ with what we do have now is to have
strerrno like messages for each class and method (well, we have for some
of them), like:

  int perf_evsel__open_strerror(struct perf_evsel *evsel,
                                struct target *target,
                                int err, char *msg, size_t size);

where we have a switch to see, from syscall errno return and intended
target (CPU, system wide, a specific thread, cgroups, etc), who is
asking this (user, root, etc) and lots of other tunables, how to best
translate this to the user, formatting it in a string allows us to show
it in whatever GUI is in use.

- Arnaldo
 
> 
> Anyway, some questions:
> 
>  (1) Is this something that would be of interest on a more global scale?
> 
>      Or should I just stick with stashing it in the fs_context structure and
>      find someway to route around the pathwalk in nfs mount?
> 
>      Or is this totally a bad idea and only dmesg should ever be used?
> 
> If it is of interest globally:
> 
>  (2) How big should I make each task's message buffer?  My current
>      implementation allows each task to hold a single message if enabled.
> 
>  (3) Should I allow warnings in addition to errors?
> 
>  (4) Should I allow wait() and co. to try and retrieve errors from zombies?
> 
>  (5) Should execve() disable the facility?
> 
>  (6) Could all the messages be static (not kmalloc'd) and cleared/redacted by
>      rmmod?  This would potentially prevent the use of formatted messages.
> 
> David
> _______________________________________________
> Ksummit-discuss mailing list
> Ksummit-discuss@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/ksummit-discuss

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary error info back to userspace
  2017-07-12 14:33 ` Arnaldo Carvalho de Melo
@ 2017-07-12 14:44   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-07-12 14:44 UTC (permalink / raw)
  To: David Howells; +Cc: ksummit-discuss

Em Wed, Jul 12, 2017 at 11:33:21AM -0300, Arnaldo Carvalho de Melo escreveu:
> What we do now in tools/perf/ with what we do have now is to have
> strerrno like messages for each class and method (well, we have for some
> of them), like:
> 
>   int perf_evsel__open_strerror(struct perf_evsel *evsel,
>                                 struct target *target,
>                                 int err, char *msg, size_t size);
> 
> where we have a switch to see, from syscall errno return and intended
> target (CPU, system wide, a specific thread, cgroups, etc), who is
> asking this (user, root, etc) and lots of other tunables, how to best
> translate this to the user, formatting it in a string allows us to show
> it in whatever GUI is in use.

To get this clearer in terms of actual usage, here is a (simplified) snippet
for 'perf top':

  try_again:

  	if (perf_evsel__open(event, cpus, threads) < 0) {
		if (perf_evsel__fallback(event, errno, msg, sizeof(msg))) {
			if (verbose > 0)
				ui__warning("%s\n", msg);
				goto try_again;
                        }

		perf_evsel__open_strerror(event, target, errno, msg, sizeof(msg));
		ui__error("%s\n", msg);
		goto out_err;
	}

- Arnaldo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary error info back to userspace
  2017-07-12 12:43 [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary error info back to userspace David Howells
  2017-07-12 14:33 ` Arnaldo Carvalho de Melo
@ 2017-07-12 14:57 ` David Howells
  2017-07-12 15:21   ` Stephen Hemminger
  1 sibling, 1 reply; 11+ messages in thread
From: David Howells @ 2017-07-12 14:57 UTC (permalink / raw)
  To: ksummit-discuss

David Howells <dhowells@redhat.com> wrote:

> In which case, would it make sense to attach such a facility to the
> task_struct instead?  I implemented a test of this using prctl, but a new
> syscall might be a better idea, at least for reading.
> 
>  (*) int old_setting = prctl(PR_ERRMSG_ENABLE, int setting);
> 
>      Enable (setting == 1) or disable (setting == 0) the facility.
>      Disabling the facility clears the error buffer.
> 
>  (*) int size = prctl(PR_ERRMSG_READ, char *buffer, int buf_size);
> 
>      Read back a message and discard it.  

I forgot to add that I've kept the in-kernel interface I have for this very
simple for the moment:

	void errorf(const char *fmt, ...);
	int invalf(const char *fmt, ...);

where these functions take printf-style arguments and where invalf() is the
same as errorf(), but returns -EINVAL for convenience.  To take an example
from NFS:

-	if (auth_info->flavor_len + 1 >= max_flavor_len) {
-		dfprintk(MOUNT, "NFS: too many sec= flavors\n");
-		return -EINVAL;
-	}
+	if (auth_info->flavor_len + 1 >= max_flavor_len)
+		return invalf("NFS: too many sec= flavors");

David

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary error info back to userspace
  2017-07-12 14:57 ` David Howells
@ 2017-07-12 15:21   ` Stephen Hemminger
  2017-07-12 16:19     ` Linus Torvalds
  2017-07-21 13:41     ` David Howells
  0 siblings, 2 replies; 11+ messages in thread
From: Stephen Hemminger @ 2017-07-12 15:21 UTC (permalink / raw)
  To: David Howells; +Cc: ksummit-discuss

On Wed, 12 Jul 2017 15:57:56 +0100
David Howells <dhowells@redhat.com> wrote:

> David Howells <dhowells@redhat.com> wrote:
> 
> > In which case, would it make sense to attach such a facility to the
> > task_struct instead?  I implemented a test of this using prctl, but a new
> > syscall might be a better idea, at least for reading.
> > 
> >  (*) int old_setting = prctl(PR_ERRMSG_ENABLE, int setting);
> > 
> >      Enable (setting == 1) or disable (setting == 0) the facility.
> >      Disabling the facility clears the error buffer.
> > 
> >  (*) int size = prctl(PR_ERRMSG_READ, char *buffer, int buf_size);
> > 
> >      Read back a message and discard it.    
> 
> I forgot to add that I've kept the in-kernel interface I have for this very
> simple for the moment:
> 
> 	void errorf(const char *fmt, ...);
> 	int invalf(const char *fmt, ...);
> 
> where these functions take printf-style arguments and where invalf() is the
> same as errorf(), but returns -EINVAL for convenience.  To take an example
> from NFS:
> 
> -	if (auth_info->flavor_len + 1 >= max_flavor_len) {
> -		dfprintk(MOUNT, "NFS: too many sec= flavors\n");
> -		return -EINVAL;
> -	}
> +	if (auth_info->flavor_len + 1 >= max_flavor_len)
> +		return invalf("NFS: too many sec= flavors");

Netlink has recently got extended error reporting, still not used widely
and library support is lacking in most places.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary error info back to userspace
  2017-07-12 15:21   ` Stephen Hemminger
@ 2017-07-12 16:19     ` Linus Torvalds
  2017-07-12 16:35       ` Stephen Hemminger
  2017-07-19 13:02       ` Steven Rostedt
  2017-07-21 13:41     ` David Howells
  1 sibling, 2 replies; 11+ messages in thread
From: Linus Torvalds @ 2017-07-12 16:19 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: ksummit

On Wed, Jul 12, 2017 at 8:21 AM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> Netlink has recently got extended error reporting, still not used widely
> and library support is lacking in most places.

Yeah, and that "not widely supported and library support is lacking"
is always going to be an issue with anything like that.

Along with internationalization, which is a whole nasty set of issues
in itself with error messages.

It's not going to happen, in other words. The problems are basically
insurmountable, and the thing it fixes will always be some special
case that doesn't much matter.

Every time it comes up it is because some developer found one case
that they were hunting down and it annoyed them, and the developer
went "if only it had included more information and it would have been
obvious".

But every time it comes up people ignore this basic issue:

     [torvalds@i7 linux]$ git grep -e '-E[A-Z]\{4\}' | wc -l
     182523

Give it up. It's really is a horrible idea for so many reasons.

                     Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary error info back to userspace
  2017-07-12 16:19     ` Linus Torvalds
@ 2017-07-12 16:35       ` Stephen Hemminger
  2017-07-19 13:02       ` Steven Rostedt
  1 sibling, 0 replies; 11+ messages in thread
From: Stephen Hemminger @ 2017-07-12 16:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: ksummit

On Wed, 12 Jul 2017 09:19:55 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Wed, Jul 12, 2017 at 8:21 AM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > Netlink has recently got extended error reporting, still not used widely
> > and library support is lacking in most places.  
> 
> Yeah, and that "not widely supported and library support is lacking"
> is always going to be an issue with anything like that.
> 
> Along with internationalization, which is a whole nasty set of issues
> in itself with error messages.
> 
> It's not going to happen, in other words. The problems are basically
> insurmountable, and the thing it fixes will always be some special
> case that doesn't much matter.
> 
> Every time it comes up it is because some developer found one case
> that they were hunting down and it annoyed them, and the developer
> went "if only it had included more information and it would have been
> obvious".
> 
> But every time it comes up people ignore this basic issue:
> 
>      [torvalds@i7 linux]$ git grep -e '-E[A-Z]\{4\}' | wc -l
>      182523
> 
> 
> Give it up. It's really is a horrible idea for so many reasons.

For netlink, it isn't so bad. 80% of the usage is in iproute2 and
therefore getting tool support for the usual cases isn't too hard.

I fear kernel developers think at too low a level. They think if glibc
and/or 1st level command can handle an extension, their work is done.
But in the modern world, there are many scripts and layers above that.
For the networking case, the worst case examples are things where configuration
is done in stuff like some layer on top of Openstack, in python code
which is scripting ip commands, which is talking to the kernel. Good luck
on trying to get any meaningful error handling out of that dog pile.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary error info back to userspace
  2017-07-12 16:19     ` Linus Torvalds
  2017-07-12 16:35       ` Stephen Hemminger
@ 2017-07-19 13:02       ` Steven Rostedt
  2017-07-24  7:55         ` Miklos Szeredi
  2017-07-24  8:25         ` David Howells
  1 sibling, 2 replies; 11+ messages in thread
From: Steven Rostedt @ 2017-07-19 13:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: ksummit

On Wed, 12 Jul 2017 09:19:55 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Wed, Jul 12, 2017 at 8:21 AM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > Netlink has recently got extended error reporting, still not used widely
> > and library support is lacking in most places.  
> 
> Yeah, and that "not widely supported and library support is lacking"
> is always going to be an issue with anything like that.
> 
> Along with internationalization, which is a whole nasty set of issues
> in itself with error messages.
> 
> It's not going to happen, in other words. The problems are basically
> insurmountable, and the thing it fixes will always be some special
> case that doesn't much matter.
> 
> Every time it comes up it is because some developer found one case
> that they were hunting down and it annoyed them, and the developer
> went "if only it had included more information and it would have been
> obvious".
> 
> But every time it comes up people ignore this basic issue:
> 
>      [torvalds@i7 linux]$ git grep -e '-E[A-Z]\{4\}' | wc -l
>      182523
> 

Note a lot of those -E* are not going to user space. Some are in
comments, and some are used internally. I use them to pass back
information to other kernel only routines, as some errors are more
critical than others.

> 
> Give it up. It's really is a horrible idea for so many reasons.
> 

One reason that this has never taken off is that there is no good
infrastructure in doing it. I wouldn't tell people to give it up, but
I don't see a one size fits all. In tracing, we have ways to pass
detailed errors back to user space. But that's probably one of the
easier cases as we have defined methods to do so.

A more generic approach would require a lot more planning, and making
it simple to use both in user space and in the kernel. If it is too
complex in either place, it will be ignored.

-- Steve

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary error info back to userspace
  2017-07-12 15:21   ` Stephen Hemminger
  2017-07-12 16:19     ` Linus Torvalds
@ 2017-07-21 13:41     ` David Howells
  1 sibling, 0 replies; 11+ messages in thread
From: David Howells @ 2017-07-21 13:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: ksummit

Linus Torvalds <torvalds@linux-foundation.org> wrote:

> But every time it comes up people ignore this basic issue:
> 
>      [torvalds@i7 linux]$ git grep -e '-E[A-Z]\{4\}' | wc -l
>      182523
> 
> 
> Give it up. It's really is a horrible idea for so many reasons.

Are you okay with me making it possible to retrieve mount errors, warnings and
informational messages through fd-arbitrated-mount I'm working on?  For
example (and skipping some of the parameters for brevity):

	int fs_fd;

	static inline void e(int x)
	{
		char buf[1024];
		int i;
		if (x == -1)
			fprintf(stderr, "Mount error: %m\n");
		/* Read back any messages */
		while (i = read(fs_fd, buf), i != -1) {
			buf[i] = 0;
			fprintf(stderr, "%s\n", buf);
		}
		if (x == -1)
			exit(1);
	}

	fs_fd = fsopen("ext4");
	e(write(fs_fd, "d /dev/sda3"));
	e(write(fs_fd, "o user_xattr"));
	e(write(fs_fd, "o acl"));
	e(write(fs_fd, "o data=ordered"));
	e(write(fs_fd, "x create"));
	e(fsmount(fs_fd, AT_FDCWD, "/mnt", MS_NODEV));
	close(fs_fd);

David

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary error info back to userspace
  2017-07-19 13:02       ` Steven Rostedt
@ 2017-07-24  7:55         ` Miklos Szeredi
  2017-07-24  8:25         ` David Howells
  1 sibling, 0 replies; 11+ messages in thread
From: Miklos Szeredi @ 2017-07-24  7:55 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: ksummit

On Wed, Jul 19, 2017 at 3:02 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Wed, 12 Jul 2017 09:19:55 -0700
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>> On Wed, Jul 12, 2017 at 8:21 AM, Stephen Hemminger
>> <stephen@networkplumber.org> wrote:
>> >
>> > Netlink has recently got extended error reporting, still not used widely
>> > and library support is lacking in most places.
>>
>> Yeah, and that "not widely supported and library support is lacking"
>> is always going to be an issue with anything like that.
>>
>> Along with internationalization, which is a whole nasty set of issues
>> in itself with error messages.
>>
>> It's not going to happen, in other words. The problems are basically
>> insurmountable, and the thing it fixes will always be some special
>> case that doesn't much matter.
>>
>> Every time it comes up it is because some developer found one case
>> that they were hunting down and it annoyed them, and the developer
>> went "if only it had included more information and it would have been
>> obvious".
>>
>> But every time it comes up people ignore this basic issue:
>>
>>      [torvalds@i7 linux]$ git grep -e '-E[A-Z]\{4\}' | wc -l
>>      182523
>>
>
> Note a lot of those -E* are not going to user space. Some are in
> comments, and some are used internally. I use them to pass back
> information to other kernel only routines, as some errors are more
> critical than others.

a) it wouldn't have to be for every error

b) kernel prints detailed error in dmesg anyway, why not allow that
info to be bound to the syscall that triggered the error?

c) internationalization can be solved at the level where it matters
(NOT in the kernel)

My suggestion was to keep the kernel interface really simple, e.g.:

   return detailed_error(-EINVAL, "failure to do foo because of bar");

What are the insurmountable issues you are talking about?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary error info back to userspace
  2017-07-19 13:02       ` Steven Rostedt
  2017-07-24  7:55         ` Miklos Szeredi
@ 2017-07-24  8:25         ` David Howells
  1 sibling, 0 replies; 11+ messages in thread
From: David Howells @ 2017-07-24  8:25 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: ksummit

Miklos Szeredi <miklos@szeredi.hu> wrote:

> My suggestion was to keep the kernel interface really simple, e.g.:
> 
>    return detailed_error(-EINVAL, "failure to do foo because of bar");

That's what I was thinking of, though I'd prefix the string with a source tag,
such as "nfs", "vfs" or "dvb-core".

David

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-07-24  8:25 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-12 12:43 [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary error info back to userspace David Howells
2017-07-12 14:33 ` Arnaldo Carvalho de Melo
2017-07-12 14:44   ` Arnaldo Carvalho de Melo
2017-07-12 14:57 ` David Howells
2017-07-12 15:21   ` Stephen Hemminger
2017-07-12 16:19     ` Linus Torvalds
2017-07-12 16:35       ` Stephen Hemminger
2017-07-19 13:02       ` Steven Rostedt
2017-07-24  7:55         ` Miklos Szeredi
2017-07-24  8:25         ` David Howells
2017-07-21 13:41     ` David Howells

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.