* [patch] proc.5: tell how to parse /proc/*/stat correctly @ 2022-12-22 16:42 Alexey Dobriyan 2022-12-22 22:03 ` [oss-security] " Dominique Martinet 0 siblings, 1 reply; 21+ messages in thread From: Alexey Dobriyan @ 2022-12-22 16:42 UTC (permalink / raw) To: Alejandro Colomar, Michael Kerrisk; +Cc: linux-kernel, linux-man, oss-security /proc/*/stat can't be parsed with split() or split(" ") or split(' ') or sscanf("%d (%s) ...") or equivalents because "comm" can contain whitespace and parenthesis and is not escaped by the kernel. BTW escaping would not help with naive split() anyway. Mention strrchr(')') so people can at least stop adding new bugs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> --- man5/proc.5 | 5 +++++ 1 file changed, 5 insertions(+) --- a/man5/proc.5 +++ b/man5/proc.5 @@ -2092,6 +2092,11 @@ Strings longer than .B TASK_COMM_LEN (16) characters (including the terminating null byte) are silently truncated. This is visible whether or not the executable is swapped out. + +Note that \fIcomm\fP can contain space and closing parenthesis characters. +Parsing /proc/${pid}/stat with split() or equivalent, or scanf(3) isn't +reliable. The correct way is to locate closing parenthesis with strrchr(')') +from the end of the buffer and parse integers from there. .TP (3) \fIstate\fP \ %c One of the following characters, indicating process state: ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-22 16:42 [patch] proc.5: tell how to parse /proc/*/stat correctly Alexey Dobriyan @ 2022-12-22 22:03 ` Dominique Martinet 2022-12-22 23:21 ` Solar Designer ` (2 more replies) 0 siblings, 3 replies; 21+ messages in thread From: Dominique Martinet @ 2022-12-22 22:03 UTC (permalink / raw) To: oss-security; +Cc: Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man Alexey Dobriyan wrote on Thu, Dec 22, 2022 at 07:42:53PM +0300: > --- a/man5/proc.5 > +++ b/man5/proc.5 > @@ -2092,6 +2092,11 @@ Strings longer than > .B TASK_COMM_LEN > (16) characters (including the terminating null byte) are silently truncated. > This is visible whether or not the executable is swapped out. > + > +Note that \fIcomm\fP can contain space and closing parenthesis characters. > +Parsing /proc/${pid}/stat with split() or equivalent, or scanf(3) isn't > +reliable. The correct way is to locate closing parenthesis with strrchr(')') > +from the end of the buffer and parse integers from there. That's still not enough unless new lines are escaped, which they aren't: $ echo -n 'test) 0 0 0 ' > /proc/$$/comm $ cat /proc/$$/stat 71076 (test) 0 0 0 ) S 71075 71076 71076 34840 71192 4194304 6623 6824 0 0 10 3 2 7 20 0 1 0 36396573 15208448 2888 18446744073709551615 94173281726464 94173282650929 140734972513568 0 0 0 65536 3686404 1266761467 1 0 0 17 1 0 0 0 0 0 94173282892592 94173282940880 94173287231488 140734972522071 140734972522076 140734972522076 140734972526574 0 The silver lining here is that comm length is rather small (16) so we cannot emulate full lines and a very careful process could notice that there are not enough fields after the last parenthesis... So just look for the last closing parenthesis in the next line and try again? But, really, I just don't see how this can practically be said to be parsable... -- Dominique Martinet | Asmadeus ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-22 22:03 ` [oss-security] " Dominique Martinet @ 2022-12-22 23:21 ` Solar Designer 2022-12-23 0:15 ` Dominique Martinet 2022-12-23 0:21 ` Jan Engelhardt 2022-12-28 0:44 ` Lyndon Nerenberg (VE7TFX/VE6BBM) 2 siblings, 1 reply; 21+ messages in thread From: Solar Designer @ 2022-12-22 23:21 UTC (permalink / raw) To: Dominique Martinet Cc: oss-security, Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man On Fri, Dec 23, 2022 at 07:03:17AM +0900, Dominique Martinet wrote: > Alexey Dobriyan wrote on Thu, Dec 22, 2022 at 07:42:53PM +0300: > > --- a/man5/proc.5 > > +++ b/man5/proc.5 > > @@ -2092,6 +2092,11 @@ Strings longer than > > .B TASK_COMM_LEN > > (16) characters (including the terminating null byte) are silently truncated. > > This is visible whether or not the executable is swapped out. > > + > > +Note that \fIcomm\fP can contain space and closing parenthesis characters. > > +Parsing /proc/${pid}/stat with split() or equivalent, or scanf(3) isn't > > +reliable. The correct way is to locate closing parenthesis with strrchr(')') > > +from the end of the buffer and parse integers from there. > > That's still not enough unless new lines are escaped, which they aren't: > > $ echo -n 'test) 0 0 0 > ' > /proc/$$/comm > $ cat /proc/$$/stat > 71076 (test) 0 0 0 > ) S 71075 71076 71076 34840 71192 4194304 6623 6824 0 0 10 3 2 7 20 0 1 0 36396573 15208448 2888 18446744073709551615 94173281726464 94173282650929 140734972513568 0 0 0 65536 3686404 1266761467 1 0 0 17 1 0 0 0 0 0 94173282892592 94173282940880 94173287231488 140734972522071 140734972522076 140734972522076 140734972526574 0 > > The silver lining here is that comm length is rather small (16) so we > cannot emulate full lines and a very careful process could notice that > there are not enough fields after the last parenthesis... So just look > for the last closing parenthesis in the next line and try again? No, just don't treat this file's content as a line (nor as several lines) - treat it as a string that might contain new line characters. The ps command from procps-ng seems to manage, e.g. for your test "ps c" prints: 29394 pts/3 S 0:00 test) 0 0 0? where the question mark is what it substitutes for the non-printable character (the new line character). I didn't check whether the process name it prints comes from /proc/$$/stat or /proc/$$/status, though (per strace, it reads both). > But, really, I just don't see how this can practically be said to be parsable... This format certainly makes it easier to get a parser wrong than to get it right. I agree the above man page edit is not enough, and should also mention the caveat that this shouldn't be read in nor parsed as a line. Also, the Linux kernel does have problems with new lines in the comm field elsewhere, at least in the log messages it produces: https://github.com/lkrg-org/lkrg/issues/165 Here I looked into this in context of LKRG development, but with the kernel itself also producing messages with comm in them the point of only fixing LKRG's messages is moot. Alexander P.S. While this thread goes well so far, please note that in general CC'ing other lists on postings to oss-security (or vice versa) is discouraged. With such CC's, possible follow-ups from members of those other lists can be off-topic for oss-security - e.g., they might focus on non-security technicalities. Probably not this time when only a man page is to be patched, but proposed patches to the Linux kernel often result in lengthy discussions and multiple versions of the patch. In those cases, I think it's better to have separate threads and only post summary follow-up(s) to oss-security (e.g., one message stating that a patch was proposed and linking to the thread, and another after the final version is merged). ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-22 23:21 ` Solar Designer @ 2022-12-23 0:15 ` Dominique Martinet 0 siblings, 0 replies; 21+ messages in thread From: Dominique Martinet @ 2022-12-23 0:15 UTC (permalink / raw) To: Solar Designer Cc: oss-security, Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man Solar Designer wrote on Fri, Dec 23, 2022 at 12:21:12AM +0100: > On Fri, Dec 23, 2022 at 07:03:17AM +0900, Dominique Martinet wrote: > > Alexey Dobriyan wrote on Thu, Dec 22, 2022 at 07:42:53PM +0300: > > > --- a/man5/proc.5 > > > +++ b/man5/proc.5 > > > @@ -2092,6 +2092,11 @@ Strings longer than > > > .B TASK_COMM_LEN > > > (16) characters (including the terminating null byte) are silently truncated. > > > This is visible whether or not the executable is swapped out. > > > + > > > +Note that \fIcomm\fP can contain space and closing parenthesis characters. > > > +Parsing /proc/${pid}/stat with split() or equivalent, or scanf(3) isn't > > > +reliable. The correct way is to locate closing parenthesis with strrchr(')') > > > +from the end of the buffer and parse integers from there. > > > > That's still not enough unless new lines are escaped, which they aren't: > > > > $ echo -n 'test) 0 0 0 > > ' > /proc/$$/comm > > $ cat /proc/$$/stat > > 71076 (test) 0 0 0 > > ) S 71075 71076 71076 34840 71192 4194304 6623 6824 0 0 10 3 2 7 20 0 1 0 36396573 15208448 2888 18446744073709551615 94173281726464 94173282650929 140734972513568 0 0 0 65536 3686404 1266761467 1 0 0 17 1 0 0 0 0 0 94173282892592 94173282940880 94173287231488 140734972522071 140734972522076 140734972522076 140734972526574 0 > > > > The silver lining here is that comm length is rather small (16) so we > > cannot emulate full lines and a very careful process could notice that > > there are not enough fields after the last parenthesis... So just look > > for the last closing parenthesis in the next line and try again? > > No, just don't treat this file's content as a line (nor as several > lines) - treat it as a string that might contain new line characters. Ah, this came just after the /proc/net/unix discussion in another thread[1] pointing to [2] with one line per entry, and I was still in that mode. For /proc/pid/stat with a single entry I agree treating it as a buffer and looking for the last closing parenthesis should be correct as per the man page suggestion -- sorry for the noise. [1] https://www.openwall.com/lists/oss-security/2022/12/21/8 [2] https://lore.kernel.org/all/8a87957e-4d33-9351-ae74-243441cb03cd@opteya.com/ -- Dominique Martinet | Asmadeus ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-22 22:03 ` [oss-security] " Dominique Martinet 2022-12-22 23:21 ` Solar Designer @ 2022-12-23 0:21 ` Jan Engelhardt 2022-12-28 0:44 ` Lyndon Nerenberg (VE7TFX/VE6BBM) 2 siblings, 0 replies; 21+ messages in thread From: Jan Engelhardt @ 2022-12-23 0:21 UTC (permalink / raw) To: oss-security; +Cc: Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man On Thursday 2022-12-22 23:03, Dominique Martinet wrote: >> + >> +Note that \fIcomm\fP can contain space and closing parenthesis characters. >> +Parsing /proc/${pid}/stat with split() or equivalent, or scanf(3) isn't >> +reliable. The correct way is to locate closing parenthesis with strrchr(')') >> +from the end of the buffer and parse integers from there. > >That's still not enough unless new lines are escaped, which they aren't: strrchr does not concern itself with "lines". If your input buffer contains the complete content of /proc/X/stat (and not just a "line" thereof), the strrchr approach appears quite workable. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-22 22:03 ` [oss-security] " Dominique Martinet 2022-12-22 23:21 ` Solar Designer 2022-12-23 0:21 ` Jan Engelhardt @ 2022-12-28 0:44 ` Lyndon Nerenberg (VE7TFX/VE6BBM) 2022-12-28 1:50 ` Tavis Ormandy 2022-12-28 15:24 ` Shawn Webb 2 siblings, 2 replies; 21+ messages in thread From: Lyndon Nerenberg (VE7TFX/VE6BBM) @ 2022-12-28 0:44 UTC (permalink / raw) To: oss-security; +Cc: Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man Dominique Martinet writes: > But, really, I just don't see how this can practically be said to be parsable... In its current form it never will be. The solution is to place this variable-length field last. Then you can "cut -d ' ' -f 51-" to get the command+args part (assuming I counted all those fields correctly ...) Of course, this breaks backwards compatability. --lyndon ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-28 0:44 ` Lyndon Nerenberg (VE7TFX/VE6BBM) @ 2022-12-28 1:50 ` Tavis Ormandy 2022-12-30 20:15 ` Jakub Wilk 2022-12-28 15:24 ` Shawn Webb 1 sibling, 1 reply; 21+ messages in thread From: Tavis Ormandy @ 2022-12-28 1:50 UTC (permalink / raw) To: linux-kernel; +Cc: oss-security, linux-man On 2022-12-28, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote: > Dominique Martinet writes: > >> But, really, I just don't see how this can practically be said to be parsable... > > In its current form it never will be. The solution is to place > this variable-length field last. Then you can "cut -d ' ' -f 51-" > to get the command+args part (assuming I counted all those fields > correctly ...) > > Of course, this breaks backwards compatability. > I think that cut command doesn't handle newlines, you probably need cut -z, which looks a bit hacky to me. There already is 'ps -q $$ -o comm=', of course, and that works fine - as well as libprocps. I don't know, I think just adding the strrchr note seems acceptable. Tavis. -- _o) $ lynx lock.cmpxchg8b.com /\\ _o) _o) $ finger taviso@sdf.org _\_V _( ) _( ) @taviso ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-28 1:50 ` Tavis Ormandy @ 2022-12-30 20:15 ` Jakub Wilk 0 siblings, 0 replies; 21+ messages in thread From: Jakub Wilk @ 2022-12-30 20:15 UTC (permalink / raw) To: oss-security; +Cc: linux-man, linux-kernel * Tavis Ormandy <taviso@gmail.com>, 2022-12-28 01:50: >>>But, really, I just don't see how this can practically be said to be >>>parsable... >> >>In its current form it never will be. The solution is to place this >>variable-length field last. Then you can "cut -d ' ' -f 51-" to get >>the command+args part (assuming I counted all those fields correctly >>...) >> >>Of course, this breaks backwards compatability. > >I think that cut command doesn't handle newlines, Indeed. >There already is 'ps -q $$ -o >comm=' FWIW, "ps ... -o comm=" doesn't just print the raw comm value: it replaces non-printable chars with punctuation characters, and it may append " <defunct>" if the process is a zombie. The easiest way to get unmangled comm is to read it from /proc/$PID/comm, then strip the trailing newline. (But I suspect most /proc/*/stat parsers don't care about the comm field at all; they just want to skip over it to get their hands on the following fields.) -- Jakub Wilk ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-28 0:44 ` Lyndon Nerenberg (VE7TFX/VE6BBM) 2022-12-28 1:50 ` Tavis Ormandy @ 2022-12-28 15:24 ` Shawn Webb 2022-12-28 15:31 ` Shawn Webb ` (2 more replies) 1 sibling, 3 replies; 21+ messages in thread From: Shawn Webb @ 2022-12-28 15:24 UTC (permalink / raw) To: oss-security; +Cc: Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man [-- Attachment #1: Type: text/plain, Size: 1013 bytes --] On Tue, Dec 27, 2022 at 04:44:49PM -0800, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote: > Dominique Martinet writes: > > > But, really, I just don't see how this can practically be said to be parsable... > > In its current form it never will be. The solution is to place > this variable-length field last. Then you can "cut -d ' ' -f 51-" > to get the command+args part (assuming I counted all those fields > correctly ...) > > Of course, this breaks backwards compatability. It would also break forwards compatibility in the case new fields needed to be added. The only solution would be a libxo-style feature wherein a machine-parseable format is exposed by virtue of a file extension. Examples: 1. /proc/pid/stats.json 2. /proc/pid/stats.xml 3. /proc/pid/stats.yaml_shouldnt_be_a_thing Thanks, -- Shawn Webb Cofounder / Security Engineer HardenedBSD https://git.hardenedbsd.org/hardenedbsd/pubkeys/-/raw/master/Shawn_Webb/03A4CBEBB82EA5A67D9F3853FF2E67A277F8E1FA.pub.asc [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-28 15:24 ` Shawn Webb @ 2022-12-28 15:31 ` Shawn Webb 2022-12-28 16:47 ` Demi Marie Obenour 2022-12-31 16:31 ` David Laight 2 siblings, 0 replies; 21+ messages in thread From: Shawn Webb @ 2022-12-28 15:31 UTC (permalink / raw) To: oss-security; +Cc: Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man [-- Attachment #1: Type: text/plain, Size: 1448 bytes --] On Wed, Dec 28, 2022 at 10:24:58AM -0500, Shawn Webb wrote: > On Tue, Dec 27, 2022 at 04:44:49PM -0800, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote: > > Dominique Martinet writes: > > > > > But, really, I just don't see how this can practically be said to be parsable... > > > > In its current form it never will be. The solution is to place > > this variable-length field last. Then you can "cut -d ' ' -f 51-" > > to get the command+args part (assuming I counted all those fields > > correctly ...) > > > > Of course, this breaks backwards compatability. > > It would also break forwards compatibility in the case new fields > needed to be added. > > The only solution would be a libxo-style feature wherein a > machine-parseable format is exposed by virtue of a file extension. > > Examples: > > 1. /proc/pid/stats.json > 2. /proc/pid/stats.xml > 3. /proc/pid/stats.yaml_shouldnt_be_a_thing To expand upon this idea, lets define an example json file: { "schemaver": "20221228001", "name": "cat", "state": { "raw": "R", "intval": 1, "Pretty": "(Running)", }, "tgid": 5452, "pid": 5452, "ppid": 743, "uid": { "real": 501, "effective": 501, "saved_set": 501, "fs": 501 } } And so on. -- Shawn Webb Cofounder / Security Engineer HardenedBSD https://git.hardenedbsd.org/hardenedbsd/pubkeys/-/raw/master/Shawn_Webb/03A4CBEBB82EA5A67D9F3853FF2E67A277F8E1FA.pub.asc [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-28 15:24 ` Shawn Webb 2022-12-28 15:31 ` Shawn Webb @ 2022-12-28 16:47 ` Demi Marie Obenour 2022-12-28 17:09 ` Jan Engelhardt 2022-12-28 17:25 ` Shawn Webb 2022-12-31 16:31 ` David Laight 2 siblings, 2 replies; 21+ messages in thread From: Demi Marie Obenour @ 2022-12-28 16:47 UTC (permalink / raw) To: oss-security; +Cc: Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man [-- Attachment #1: Type: text/plain, Size: 1060 bytes --] On Wed, Dec 28, 2022 at 10:24:58AM -0500, Shawn Webb wrote: > On Tue, Dec 27, 2022 at 04:44:49PM -0800, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote: > > Dominique Martinet writes: > > > > > But, really, I just don't see how this can practically be said to be parsable... > > > > In its current form it never will be. The solution is to place > > this variable-length field last. Then you can "cut -d ' ' -f 51-" > > to get the command+args part (assuming I counted all those fields > > correctly ...) > > > > Of course, this breaks backwards compatability. > > It would also break forwards compatibility in the case new fields > needed to be added. > > The only solution would be a libxo-style feature wherein a > machine-parseable format is exposed by virtue of a file extension. > > Examples: > > 1. /proc/pid/stats.json > 2. /proc/pid/stats.xml > 3. /proc/pid/stats.yaml_shouldnt_be_a_thing A binary format would be even better. No risk of ambiguity. -- Sincerely, Demi Marie Obenour (she/her/hers) Invisible Things Lab [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-28 16:47 ` Demi Marie Obenour @ 2022-12-28 17:09 ` Jan Engelhardt 2022-12-28 17:25 ` Shawn Webb 1 sibling, 0 replies; 21+ messages in thread From: Jan Engelhardt @ 2022-12-28 17:09 UTC (permalink / raw) To: oss-security; +Cc: Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man On Wednesday 2022-12-28 17:47, Demi Marie Obenour wrote: >> Examples: >> >> 1. /proc/pid/stats.json >> 2. /proc/pid/stats.xml >> 3. /proc/pid/stats.yaml_shouldnt_be_a_thing > >A binary format would be even better. No risk of ambiguity. So like EBML? ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-28 16:47 ` Demi Marie Obenour 2022-12-28 17:09 ` Jan Engelhardt @ 2022-12-28 17:25 ` Shawn Webb 2022-12-28 18:02 ` Demi Marie Obenour 1 sibling, 1 reply; 21+ messages in thread From: Shawn Webb @ 2022-12-28 17:25 UTC (permalink / raw) To: oss-security; +Cc: Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man [-- Attachment #1: Type: text/plain, Size: 1440 bytes --] On Wed, Dec 28, 2022 at 11:47:25AM -0500, Demi Marie Obenour wrote: > On Wed, Dec 28, 2022 at 10:24:58AM -0500, Shawn Webb wrote: > > On Tue, Dec 27, 2022 at 04:44:49PM -0800, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote: > > > Dominique Martinet writes: > > > > > > > But, really, I just don't see how this can practically be said to be parsable... > > > > > > In its current form it never will be. The solution is to place > > > this variable-length field last. Then you can "cut -d ' ' -f 51-" > > > to get the command+args part (assuming I counted all those fields > > > correctly ...) > > > > > > Of course, this breaks backwards compatability. > > > > It would also break forwards compatibility in the case new fields > > needed to be added. > > > > The only solution would be a libxo-style feature wherein a > > machine-parseable format is exposed by virtue of a file extension. > > > > Examples: > > > > 1. /proc/pid/stats.json > > 2. /proc/pid/stats.xml > > 3. /proc/pid/stats.yaml_shouldnt_be_a_thing > > A binary format would be even better. No risk of ambiguity. I think the argument I'm trying to make is to be flexible in implementation, allowing for future needs and wants--that is "future proofing". -- Shawn Webb Cofounder / Security Engineer HardenedBSD https://git.hardenedbsd.org/hardenedbsd/pubkeys/-/raw/master/Shawn_Webb/03A4CBEBB82EA5A67D9F3853FF2E67A277F8E1FA.pub.asc [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-28 17:25 ` Shawn Webb @ 2022-12-28 18:02 ` Demi Marie Obenour 2022-12-28 18:36 ` John Helmert III ` (2 more replies) 0 siblings, 3 replies; 21+ messages in thread From: Demi Marie Obenour @ 2022-12-28 18:02 UTC (permalink / raw) To: oss-security; +Cc: Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man [-- Attachment #1: Type: text/plain, Size: 1588 bytes --] On Wed, Dec 28, 2022 at 12:25:17PM -0500, Shawn Webb wrote: > On Wed, Dec 28, 2022 at 11:47:25AM -0500, Demi Marie Obenour wrote: > > On Wed, Dec 28, 2022 at 10:24:58AM -0500, Shawn Webb wrote: > > > On Tue, Dec 27, 2022 at 04:44:49PM -0800, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote: > > > > Dominique Martinet writes: > > > > > > > > > But, really, I just don't see how this can practically be said to be parsable... > > > > > > > > In its current form it never will be. The solution is to place > > > > this variable-length field last. Then you can "cut -d ' ' -f 51-" > > > > to get the command+args part (assuming I counted all those fields > > > > correctly ...) > > > > > > > > Of course, this breaks backwards compatability. > > > > > > It would also break forwards compatibility in the case new fields > > > needed to be added. > > > > > > The only solution would be a libxo-style feature wherein a > > > machine-parseable format is exposed by virtue of a file extension. > > > > > > Examples: > > > > > > 1. /proc/pid/stats.json > > > 2. /proc/pid/stats.xml > > > 3. /proc/pid/stats.yaml_shouldnt_be_a_thing > > > > A binary format would be even better. No risk of ambiguity. > > I think the argument I'm trying to make is to be flexible in > implementation, allowing for future needs and wants--that is "future > proofing". Linux should not have an XML, JSON, or YAML serializer. Linux already does way too much; let’s not add one more thing to the list. -- Sincerely, Demi Marie Obenour (she/her/hers) Invisible Things Lab [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-28 18:02 ` Demi Marie Obenour @ 2022-12-28 18:36 ` John Helmert III 2022-12-28 19:24 ` Shawn Webb 2022-12-28 22:14 ` Theodore Ts'o 2 siblings, 0 replies; 21+ messages in thread From: John Helmert III @ 2022-12-28 18:36 UTC (permalink / raw) To: oss-security; +Cc: Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man [-- Attachment #1: Type: text/plain, Size: 1804 bytes --] On Wed, Dec 28, 2022 at 01:02:35PM -0500, Demi Marie Obenour wrote: > On Wed, Dec 28, 2022 at 12:25:17PM -0500, Shawn Webb wrote: > > On Wed, Dec 28, 2022 at 11:47:25AM -0500, Demi Marie Obenour wrote: > > > On Wed, Dec 28, 2022 at 10:24:58AM -0500, Shawn Webb wrote: > > > > On Tue, Dec 27, 2022 at 04:44:49PM -0800, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote: > > > > > Dominique Martinet writes: > > > > > > > > > > > But, really, I just don't see how this can practically be said to be parsable... > > > > > > > > > > In its current form it never will be. The solution is to place > > > > > this variable-length field last. Then you can "cut -d ' ' -f 51-" > > > > > to get the command+args part (assuming I counted all those fields > > > > > correctly ...) > > > > > > > > > > Of course, this breaks backwards compatability. > > > > > > > > It would also break forwards compatibility in the case new fields > > > > needed to be added. > > > > > > > > The only solution would be a libxo-style feature wherein a > > > > machine-parseable format is exposed by virtue of a file extension. > > > > > > > > Examples: > > > > > > > > 1. /proc/pid/stats.json > > > > 2. /proc/pid/stats.xml > > > > 3. /proc/pid/stats.yaml_shouldnt_be_a_thing > > > > > > A binary format would be even better. No risk of ambiguity. > > > > I think the argument I'm trying to make is to be flexible in > > implementation, allowing for future needs and wants--that is "future > > proofing". > > Linux should not have an XML, JSON, or YAML serializer. Linux already > does way too much; let’s not add one more thing to the list. Handling a new binary format is not 'one more thing' added? > -- > Sincerely, > Demi Marie Obenour (she/her/hers) > Invisible Things Lab [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-28 18:02 ` Demi Marie Obenour 2022-12-28 18:36 ` John Helmert III @ 2022-12-28 19:24 ` Shawn Webb 2022-12-28 19:57 ` Alejandro Colomar 2022-12-28 22:14 ` Theodore Ts'o 2 siblings, 1 reply; 21+ messages in thread From: Shawn Webb @ 2022-12-28 19:24 UTC (permalink / raw) To: oss-security; +Cc: Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man [-- Attachment #1: Type: text/plain, Size: 2291 bytes --] On Wed, Dec 28, 2022 at 01:02:35PM -0500, Demi Marie Obenour wrote: > On Wed, Dec 28, 2022 at 12:25:17PM -0500, Shawn Webb wrote: > > On Wed, Dec 28, 2022 at 11:47:25AM -0500, Demi Marie Obenour wrote: > > > On Wed, Dec 28, 2022 at 10:24:58AM -0500, Shawn Webb wrote: > > > > On Tue, Dec 27, 2022 at 04:44:49PM -0800, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote: > > > > > Dominique Martinet writes: > > > > > > > > > > > But, really, I just don't see how this can practically be said to be parsable... > > > > > > > > > > In its current form it never will be. The solution is to place > > > > > this variable-length field last. Then you can "cut -d ' ' -f 51-" > > > > > to get the command+args part (assuming I counted all those fields > > > > > correctly ...) > > > > > > > > > > Of course, this breaks backwards compatability. > > > > > > > > It would also break forwards compatibility in the case new fields > > > > needed to be added. > > > > > > > > The only solution would be a libxo-style feature wherein a > > > > machine-parseable format is exposed by virtue of a file extension. > > > > > > > > Examples: > > > > > > > > 1. /proc/pid/stats.json > > > > 2. /proc/pid/stats.xml > > > > 3. /proc/pid/stats.yaml_shouldnt_be_a_thing > > > > > > A binary format would be even better. No risk of ambiguity. > > > > I think the argument I'm trying to make is to be flexible in > > implementation, allowing for future needs and wants--that is "future > > proofing". > > Linux should not have an XML, JSON, or YAML serializer. Linux already > does way too much; let’s not add one more thing to the list. Somewhat agreed. I think formats like JSON provide a good balance between machine parseable and human readable. As I described earlier, though, when it comes to concepts like procfs and sysfs, I have a bias towards abandoning them in favor of sysctl. If sysctl nodes were to be used, no new serialization formats would need to be implemented--and developers would also use a safter method of system and process inspection and manipulation. -- Shawn Webb Cofounder / Security Engineer HardenedBSD https://git.hardenedbsd.org/hardenedbsd/pubkeys/-/raw/master/Shawn_Webb/03A4CBEBB82EA5A67D9F3853FF2E67A277F8E1FA.pub.asc [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-28 19:24 ` Shawn Webb @ 2022-12-28 19:57 ` Alejandro Colomar 0 siblings, 0 replies; 21+ messages in thread From: Alejandro Colomar @ 2022-12-28 19:57 UTC (permalink / raw) To: Shawn Webb, oss-security, John Helmert III, Demi Marie Obenour, Jan Engelhardt, Lyndon Nerenberg (VE7TFX/VE6BBM) Cc: Michael Kerrisk, linux-kernel, linux-man [-- Attachment #1.1: Type: text/plain, Size: 2760 bytes --] Hi all, On 12/28/22 20:24, Shawn Webb wrote: > On Wed, Dec 28, 2022 at 01:02:35PM -0500, Demi Marie Obenour wrote: >> On Wed, Dec 28, 2022 at 12:25:17PM -0500, Shawn Webb wrote: >>> On Wed, Dec 28, 2022 at 11:47:25AM -0500, Demi Marie Obenour wrote: >>>> On Wed, Dec 28, 2022 at 10:24:58AM -0500, Shawn Webb wrote: >>>>> On Tue, Dec 27, 2022 at 04:44:49PM -0800, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote: >>>>>> Dominique Martinet writes: >>>>>> >>>>>>> But, really, I just don't see how this can practically be said to be parsable... >>>>>> >>>>>> In its current form it never will be. The solution is to place >>>>>> this variable-length field last. Then you can "cut -d ' ' -f 51-" >>>>>> to get the command+args part (assuming I counted all those fields >>>>>> correctly ...) >>>>>> >>>>>> Of course, this breaks backwards compatability. >>>>> >>>>> It would also break forwards compatibility in the case new fields >>>>> needed to be added. >>>>> >>>>> The only solution would be a libxo-style feature wherein a >>>>> machine-parseable format is exposed by virtue of a file extension. >>>>> >>>>> Examples: >>>>> >>>>> 1. /proc/pid/stats.json >>>>> 2. /proc/pid/stats.xml >>>>> 3. /proc/pid/stats.yaml_shouldnt_be_a_thing >>>> >>>> A binary format would be even better. No risk of ambiguity. >>> >>> I think the argument I'm trying to make is to be flexible in >>> implementation, allowing for future needs and wants--that is "future >>> proofing". >> >> Linux should not have an XML, JSON, or YAML serializer. Linux already >> does way too much; let’s not add one more thing to the list. > > Somewhat agreed. I think formats like JSON provide a good balance > between machine parseable and human readable. > a > As I described earlier, though, when it comes to concepts like procfs > and sysfs, I have a bias towards abandoning them in favor of sysctl. > If sysctl nodes were to be used, no new serialization formats would > need to be implemented--and developers would also use a safter method > of system and process inspection and manipulation. > Just a comment as someone who is reading without much understanding of the contents of /prod/pid/stat: If organization of the data in the file is a problem, and the format starts to matter, maybe it's a hint that there are too many different contents, and could be split into different files, each one with its own formatting rules. I'll suggest that maybe a set of files, maybe contained in a common directory stats.d, is what you're looking for? Binary format is not of my preference, since most user-space tools work with the standard interface, that is, text. Cheers, Alex -- <http://www.alejandro-colomar.es/> [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-28 18:02 ` Demi Marie Obenour 2022-12-28 18:36 ` John Helmert III 2022-12-28 19:24 ` Shawn Webb @ 2022-12-28 22:14 ` Theodore Ts'o 2022-12-29 0:33 ` Demi Marie Obenour 2 siblings, 1 reply; 21+ messages in thread From: Theodore Ts'o @ 2022-12-28 22:14 UTC (permalink / raw) To: Demi Marie Obenour Cc: oss-security, Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man On Wed, Dec 28, 2022 at 01:02:35PM -0500, Demi Marie Obenour wrote: > > I think the argument I'm trying to make is to be flexible in > > implementation, allowing for future needs and wants--that is "future > > proofing". > > Linux should not have an XML, JSON, or YAML serializer. Linux already > does way too much; let’s not add one more thing to the list. There's always Protobufs[1]! :-) And all of these are better than ASN.1, for which Google already has a limited parser (for x.509 certificates). :-) :-) :-) - Ted ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-28 22:14 ` Theodore Ts'o @ 2022-12-29 0:33 ` Demi Marie Obenour 0 siblings, 0 replies; 21+ messages in thread From: Demi Marie Obenour @ 2022-12-29 0:33 UTC (permalink / raw) To: Theodore Ts'o Cc: oss-security, Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man [-- Attachment #1: Type: text/plain, Size: 764 bytes --] On Wed, Dec 28, 2022 at 05:14:42PM -0500, Theodore Ts'o wrote: > On Wed, Dec 28, 2022 at 01:02:35PM -0500, Demi Marie Obenour wrote: > > > I think the argument I'm trying to make is to be flexible in > > > implementation, allowing for future needs and wants--that is "future > > > proofing". > > > > Linux should not have an XML, JSON, or YAML serializer. Linux already > > does way too much; let’s not add one more thing to the list. > > There's always Protobufs[1]! :-) And all of these are better than > ASN.1, for which Google already has a limited parser (for x.509 > certificates). :-) :-) :-) > > - Ted Cap’n Proto is better than Protobufs :-) -- Sincerely, Demi Marie Obenour (she/her/hers) Invisible Things Lab [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-28 15:24 ` Shawn Webb 2022-12-28 15:31 ` Shawn Webb 2022-12-28 16:47 ` Demi Marie Obenour @ 2022-12-31 16:31 ` David Laight 2022-12-31 17:27 ` Solar Designer 2 siblings, 1 reply; 21+ messages in thread From: David Laight @ 2022-12-31 16:31 UTC (permalink / raw) To: 'Shawn Webb', oss-security Cc: Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man From: Shawn Webb > Sent: 28 December 2022 15:25 > > On Tue, Dec 27, 2022 at 04:44:49PM -0800, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote: > > Dominique Martinet writes: > > > > > But, really, I just don't see how this can practically be said to be parsable... > > > > In its current form it never will be. The solution is to place > > this variable-length field last. Then you can "cut -d ' ' -f 51-" > > to get the command+args part (assuming I counted all those fields > > correctly ...) > > > > Of course, this breaks backwards compatability. > > It would also break forwards compatibility in the case new fields > needed to be added. > > The only solution would be a libxo-style feature wherein a > machine-parseable format is exposed by virtue of a file extension. > > Examples: > > 1. /proc/pid/stats.json > 2. /proc/pid/stats.xml > 3. /proc/pid/stats.yaml_shouldnt_be_a_thing None of those are of any real use if you are trying to parse the data in something like a shell script. Multiple lines formatted as "tag:value" are probably the best bet. Provided something sane is done with embedded \n (and maybe \r). David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly 2022-12-31 16:31 ` David Laight @ 2022-12-31 17:27 ` Solar Designer 0 siblings, 0 replies; 21+ messages in thread From: Solar Designer @ 2022-12-31 17:27 UTC (permalink / raw) To: David Laight Cc: 'Shawn Webb', oss-security, Alejandro Colomar, Michael Kerrisk, linux-kernel, linux-man Hi all, Let's wind this oss-security thread down as it relates to brainstorming and commenting on totally new designs - no more of that, please. Many things were said, but realistically the interface isn't _that_ broken (this can be parsed correctly, and procps-ng manages to) and is (hopefully) not going to change much (in my opinion, and I know I'm not alone in this, most of the proposals would make things worse overall). Somewhat realistically, one possible change is replacing the most risky characters, such as braces and anything <= ASCII 32, perhaps with '?' to match what procps-ng is doing. Perhaps do this either on all updates of "comm" or in all places where "comm" is reported to userspace (including procfs and kernel messages, by calling a common function). "comm" isn't the full process name anyway - it's often truncated - so it can reasonably be made safer in other ways as well. As an option, the replacing of whitespace (ASCII 32) and braces could be limited to the "stat" file, but the control characters are (even more) problematic with other interfaces where "comm" is exposed, so replacing them should probably be global. Happy New Year! Alexander ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2022-12-31 17:28 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-12-22 16:42 [patch] proc.5: tell how to parse /proc/*/stat correctly Alexey Dobriyan 2022-12-22 22:03 ` [oss-security] " Dominique Martinet 2022-12-22 23:21 ` Solar Designer 2022-12-23 0:15 ` Dominique Martinet 2022-12-23 0:21 ` Jan Engelhardt 2022-12-28 0:44 ` Lyndon Nerenberg (VE7TFX/VE6BBM) 2022-12-28 1:50 ` Tavis Ormandy 2022-12-30 20:15 ` Jakub Wilk 2022-12-28 15:24 ` Shawn Webb 2022-12-28 15:31 ` Shawn Webb 2022-12-28 16:47 ` Demi Marie Obenour 2022-12-28 17:09 ` Jan Engelhardt 2022-12-28 17:25 ` Shawn Webb 2022-12-28 18:02 ` Demi Marie Obenour 2022-12-28 18:36 ` John Helmert III 2022-12-28 19:24 ` Shawn Webb 2022-12-28 19:57 ` Alejandro Colomar 2022-12-28 22:14 ` Theodore Ts'o 2022-12-29 0:33 ` Demi Marie Obenour 2022-12-31 16:31 ` David Laight 2022-12-31 17:27 ` Solar Designer
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).