linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] gen_compile_commands.py: fix path resolve with symlinks in it
@ 2023-12-04 10:41 Jialu Xu
  2023-12-04 16:59 ` Nathan Chancellor
  0 siblings, 1 reply; 13+ messages in thread
From: Jialu Xu @ 2023-12-04 10:41 UTC (permalink / raw)
  To: nathan, ndesaulniers, morbo, justinstitt; +Cc: llvm, linux-kernel, Jialu Xu

When symbolic links are involved in the path, os.path.abspath might not
resolve the symlinks and instead return the absolute path with the
symlinks intact.

use pathlib.Path resolve() instead of os.path.abspath()

Signed-off-by: Jialu Xu <xujialu@vimux.org>
---
 scripts/clang-tools/gen_compile_commands.py | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/scripts/clang-tools/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
index 180952fb91c1b..0a6c0996b4a8f 100755
--- a/scripts/clang-tools/gen_compile_commands.py
+++ b/scripts/clang-tools/gen_compile_commands.py
@@ -11,6 +11,7 @@ import argparse
 import json
 import logging
 import os
+from pathlib import Path
 import re
 import subprocess
 import sys
@@ -172,8 +173,8 @@ def process_line(root_directory, command_prefix, file_path):
     # by Make, so this code replaces the escaped version with '#'.
     prefix = command_prefix.replace('\#', '#').replace('$(pound)', '#')
 
-    # Use os.path.abspath() to normalize the path resolving '.' and '..' .
-    abs_path = os.path.abspath(os.path.join(root_directory, file_path))
+    # Make the path absolute, resolving all symlinks on the way and also normalizing it.
+    abs_path = str(Path(os.path.join(root_directory, file_path)).resolve())
     if not os.path.exists(abs_path):
         raise ValueError('File %s not found' % abs_path)
     return {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] gen_compile_commands.py: fix path resolve with symlinks in it
  2023-12-04 10:41 [PATCH] gen_compile_commands.py: fix path resolve with symlinks in it Jialu Xu
@ 2023-12-04 16:59 ` Nathan Chancellor
  2023-12-05  2:15   ` Jialu Xu
  2023-12-05  2:15   ` [PATCH v2] gen_compile_commands.py: fix path resolve with symlinks in it Jialu Xu
  0 siblings, 2 replies; 13+ messages in thread
From: Nathan Chancellor @ 2023-12-04 16:59 UTC (permalink / raw)
  To: Jialu Xu; +Cc: ndesaulniers, morbo, justinstitt, llvm, linux-kernel

On Mon, Dec 04, 2023 at 06:41:42PM +0800, Jialu Xu wrote:
> When symbolic links are involved in the path, os.path.abspath might not
> resolve the symlinks and instead return the absolute path with the
> symlinks intact.

Can you provide an example or more detailed description of how this
behavior is currently broken? I can't really understand how having
symlinks in the path after normalization would break anything but I am
potentially missing something :)

> use pathlib.Path resolve() instead of os.path.abspath()
> 
> Signed-off-by: Jialu Xu <xujialu@vimux.org>
> ---
>  scripts/clang-tools/gen_compile_commands.py | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/scripts/clang-tools/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
> index 180952fb91c1b..0a6c0996b4a8f 100755
> --- a/scripts/clang-tools/gen_compile_commands.py
> +++ b/scripts/clang-tools/gen_compile_commands.py
> @@ -11,6 +11,7 @@ import argparse
>  import json
>  import logging
>  import os
> +from pathlib import Path
>  import re
>  import subprocess
>  import sys
> @@ -172,8 +173,8 @@ def process_line(root_directory, command_prefix, file_path):
>      # by Make, so this code replaces the escaped version with '#'.
>      prefix = command_prefix.replace('\#', '#').replace('$(pound)', '#')
>  
> -    # Use os.path.abspath() to normalize the path resolving '.' and '..' .
> -    abs_path = os.path.abspath(os.path.join(root_directory, file_path))
> +    # Make the path absolute, resolving all symlinks on the way and also normalizing it.
> +    abs_path = str(Path(os.path.join(root_directory, file_path)).resolve())

I think this can be more simply:

    abs_path = str(Path(root_directory, file_path).resolve())

I think there should be a comment around why we are creating a Path
object then creating a string from it, rather than using the Path object
directly, namely that PosixPath is not JSON serializable.

>      if not os.path.exists(abs_path):
>          raise ValueError('File %s not found' % abs_path)
>      return {
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* (no subject)
  2023-12-04 16:59 ` Nathan Chancellor
@ 2023-12-05  2:15   ` Jialu Xu
  2023-12-05  2:15   ` [PATCH v2] gen_compile_commands.py: fix path resolve with symlinks in it Jialu Xu
  1 sibling, 0 replies; 13+ messages in thread
From: Jialu Xu @ 2023-12-05  2:15 UTC (permalink / raw)
  To: nathan, ndesaulniers, morbo, justinstitt; +Cc: llvm, linux-kernel

>On Mon, Dec 04, 2023 at 06:41:42PM +0800, Jialu Xu wrote:
>> When symbolic links are involved in the path, os.path.abspath might not
>> resolve the symlinks and instead return the absolute path with the
>> symlinks intact.
>
>Can you provide an example or more detailed description of how this
>behavior is currently broken? I can't really understand how having
>symlinks in the path after normalization would break anything but I am
>potentially missing something :)

Glad to show my situation:

Say "drivers/hdf/" has some symlinks:

    # ls -l drivers/hdf/
    total 364
    drwxrwxr-x 2 ...   4096 ... evdev
    lrwxrwxrwx 1 ...     44 ... framework -> ../../../../../../drivers/hdf_core/framework
    -rw-rw-r-- 1 ... 359010 ... hdf_macro_test.h
    lrwxrwxrwx 1 ...     55 ... inner_api -> ../../../../../../drivers/hdf_core/interfaces/inner_api
    lrwxrwxrwx 1 ...     53 ... khdf -> ../../../../../../drivers/hdf_core/adapter/khdf/linux
    -rw-r--r-- 1 ...     74 ... Makefile
    drwxrwxr-x 3 ...   4096 ... wifi

One .cmd file records that:

    # head -1 ./framework/core/manager/src/.devmgr_service.o.cmd 
    cmd_drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.o := ... \
    /path/to/out/drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.c

os.path.abspath returns "/path/to/out/framework/core/manager/src/devmgr_service.c", not correct:

    # ./scripts/clang-tools/gen_compile_commands.py
    INFO: Could not add line from ./framework/core/manager/src/.devmgr_service.o.cmd: File \
        /path/to/out/framework/core/manager/src/devmgr_service.c not found

Use pathlib.Path resolve() instead, got the correct path

    # cat compile_commands.json
    ...
    {
      "command": ...
      "directory": ...
      "file": "/path/to/blabla/drivers/hdf_core/framework/core/manager/src/devmgr_service.c"
    },
    ...

>> -    # Use os.path.abspath() to normalize the path resolving '.' and '..' .
>> -    abs_path = os.path.abspath(os.path.join(root_directory, file_path))
>> +    # Make the path absolute, resolving all symlinks on the way and also normalizing it.
>> +    abs_path = str(Path(os.path.join(root_directory, file_path)).resolve())
>
>I think this can be more simply:
>
>    abs_path = str(Path(root_directory, file_path).resolve())
>
>I think there should be a comment around why we are creating a Path
>object then creating a string from it, rather than using the Path object
>directly, namely that PosixPath is not JSON serializable.

Nice, update as yours, thanks.

>>      if not os.path.exists(abs_path):
>>          raise ValueError('File %s not found' % abs_path)
>>      return {
>> -- 
>> 2.39.2
>>  


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2] gen_compile_commands.py: fix path resolve with symlinks in it
  2023-12-04 16:59 ` Nathan Chancellor
  2023-12-05  2:15   ` Jialu Xu
@ 2023-12-05  2:15   ` Jialu Xu
  2023-12-05 16:56     ` Nathan Chancellor
  1 sibling, 1 reply; 13+ messages in thread
From: Jialu Xu @ 2023-12-05  2:15 UTC (permalink / raw)
  To: nathan, ndesaulniers, morbo, justinstitt; +Cc: llvm, linux-kernel, Jialu Xu

When symbolic links are involved in the path, os.path.abspath might not
resolve the symlinks and instead return the absolute path with the
symlinks intact.

Use pathlib.Path resolve() instead of os.path.abspath()

Signed-off-by: Jialu Xu <xujialu@vimux.org>
---
 scripts/clang-tools/gen_compile_commands.py | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/scripts/clang-tools/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
index 180952fb91c1b..99e28b7152c19 100755
--- a/scripts/clang-tools/gen_compile_commands.py
+++ b/scripts/clang-tools/gen_compile_commands.py
@@ -11,6 +11,7 @@ import argparse
 import json
 import logging
 import os
+from pathlib import Path
 import re
 import subprocess
 import sys
@@ -172,8 +173,9 @@ def process_line(root_directory, command_prefix, file_path):
     # by Make, so this code replaces the escaped version with '#'.
     prefix = command_prefix.replace('\#', '#').replace('$(pound)', '#')
 
-    # Use os.path.abspath() to normalize the path resolving '.' and '..' .
-    abs_path = os.path.abspath(os.path.join(root_directory, file_path))
+    # Make the path absolute, resolving all symlinks on the way and also normalizing it.
+    # Convert Path object to a string because 'PosixPath' is not JSON serializable.
+    abs_path = str(Path(root_directory, file_path).resolve())
     if not os.path.exists(abs_path):
         raise ValueError('File %s not found' % abs_path)
     return {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] gen_compile_commands.py: fix path resolve with symlinks in it
  2023-12-05  2:15   ` [PATCH v2] gen_compile_commands.py: fix path resolve with symlinks in it Jialu Xu
@ 2023-12-05 16:56     ` Nathan Chancellor
  2023-12-06  1:20       ` Jialu Xu
                         ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Nathan Chancellor @ 2023-12-05 16:56 UTC (permalink / raw)
  To: Jialu Xu; +Cc: ndesaulniers, morbo, justinstitt, llvm, linux-kernel

Hi Jialu,

On Tue, Dec 05, 2023 at 10:15:26AM +0800, Jialu Xu wrote:
> When symbolic links are involved in the path, os.path.abspath might not
> resolve the symlinks and instead return the absolute path with the
> symlinks intact.
> 
> Use pathlib.Path resolve() instead of os.path.abspath()
> 
> Signed-off-by: Jialu Xu <xujialu@vimux.org>

Thanks for the clarification in your previous message [1], I suppose
that makes sense as to why nobody has reported this to us because that
is a rather odd situation that the upstream kernel would not experience.

I think that some of those details should be in the commit message,
along with a short example like you provided, so that we know exactly
what the situation was and how this patch resolves it.

Perhaps something like (please feel free to correct or reword as you
feel necessary):

"When a path contains relative symbolic links, os.path.abspath() might
not follow the symlinks and instead return the absolute path with just
the relative paths resolved, resulting in an incorrect path.

<broken example>

Use pathlib.Path.resolve(), which resolves the symlinks and normalizes
the paths correctly.

<working example>"

The actual fix seems fine to me. Feel free to add

  Reviewed-by: Nathan Chancellor <nathan@kernel.org>

to the subsequent submission and please include both

  Masahiro Yamada <masahiroy@kernel.org>
  linux-kbuild@vger.kernel.org

on it in addition to the people you have here, as he is the one who
actually applies gen_compile_commands.py changes (I am going to send a
MAINTAINERS change for this).

[1]: https://lore.kernel.org/20231205021523.4152128-1-xujialu@vimux.org/

Cheers,
Nathan

> ---
>  scripts/clang-tools/gen_compile_commands.py | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/scripts/clang-tools/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
> index 180952fb91c1b..99e28b7152c19 100755
> --- a/scripts/clang-tools/gen_compile_commands.py
> +++ b/scripts/clang-tools/gen_compile_commands.py
> @@ -11,6 +11,7 @@ import argparse
>  import json
>  import logging
>  import os
> +from pathlib import Path
>  import re
>  import subprocess
>  import sys
> @@ -172,8 +173,9 @@ def process_line(root_directory, command_prefix, file_path):
>      # by Make, so this code replaces the escaped version with '#'.
>      prefix = command_prefix.replace('\#', '#').replace('$(pound)', '#')
>  
> -    # Use os.path.abspath() to normalize the path resolving '.' and '..' .
> -    abs_path = os.path.abspath(os.path.join(root_directory, file_path))
> +    # Make the path absolute, resolving all symlinks on the way and also normalizing it.
> +    # Convert Path object to a string because 'PosixPath' is not JSON serializable.
> +    abs_path = str(Path(root_directory, file_path).resolve())
>      if not os.path.exists(abs_path):
>          raise ValueError('File %s not found' % abs_path)
>      return {
> -- 
> 2.39.2
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] gen_compile_commands.py: fix path resolve with symlinks in it
  2023-12-05 16:56     ` Nathan Chancellor
@ 2023-12-06  1:20       ` Jialu Xu
  2023-12-06  1:20       ` [PATCH v3] " Jialu Xu
  2023-12-06  1:24       ` [PATCH v4] " Jialu Xu
  2 siblings, 0 replies; 13+ messages in thread
From: Jialu Xu @ 2023-12-06  1:20 UTC (permalink / raw)
  To: nathan
  Cc: justinstitt, linux-kbuild, linux-kernel, llvm, masahiroy, morbo,
	ndesaulniers, xujialu

Hi Nathan,

>On Tue, Dec 05, 2023 at 10:15:26AM +0800, Jialu Xu wrote:
>> When symbolic links are involved in the path, os.path.abspath might not
>> resolve the symlinks and instead return the absolute path with the
>> symlinks intact.
>> 
>> Use pathlib.Path resolve() instead of os.path.abspath()
>> 
>> Signed-off-by: Jialu Xu <xujialu@vimux.org>
>
>Thanks for the clarification in your previous message [1], I suppose
>that makes sense as to why nobody has reported this to us because that
>is a rather odd situation that the upstream kernel would not experience.
>
>I think that some of those details should be in the commit message,
>along with a short example like you provided, so that we know exactly
>what the situation was and how this patch resolves it.
>
>Perhaps something like (please feel free to correct or reword as you
>feel necessary):
>
>"When a path contains relative symbolic links, os.path.abspath() might
>not follow the symlinks and instead return the absolute path with just
>the relative paths resolved, resulting in an incorrect path.
>
><broken example>
>
>Use pathlib.Path.resolve(), which resolves the symlinks and normalizes
>the paths correctly.
>
><working example>"
>
>The actual fix seems fine to me. Feel free to add
>
>  Reviewed-by: Nathan Chancellor <nathan@kernel.org>
>
>to the subsequent submission and please include both
>
>  Masahiro Yamada <masahiroy@kernel.org>
>  linux-kbuild@vger.kernel.org
>
>on it in addition to the people you have here, as he is the one who
>actually applies gen_compile_commands.py changes (I am going to send a
>MAINTAINERS change for this).
>
>[1]: https://lore.kernel.org/20231205021523.4152128-1-xujialu@vimux.org/
>

Thanks for the very detailed help!

Patch update as v3.

Cheers,
Jialu



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3] gen_compile_commands.py: fix path resolve with symlinks in it
  2023-12-05 16:56     ` Nathan Chancellor
  2023-12-06  1:20       ` Jialu Xu
@ 2023-12-06  1:20       ` Jialu Xu
  2023-12-06  1:24       ` [PATCH v4] " Jialu Xu
  2 siblings, 0 replies; 13+ messages in thread
From: Jialu Xu @ 2023-12-06  1:20 UTC (permalink / raw)
  To: nathan
  Cc: justinstitt, linux-kbuild, linux-kernel, llvm, masahiroy, morbo,
	ndesaulniers, xujialu

When a path contains relative symbolic links, os.path.abspath() might
not follow the symlinks and instead return the absolute path with just
the relative paths resolved, resulting in an incorrect path.

1. Say "drivers/hdf/" has some symlinks:

    # ls -l drivers/hdf/
    total 364
    drwxrwxr-x 2 ...   4096 ... evdev
    lrwxrwxrwx 1 ...     44 ... framework -> ../../../../../../drivers/hdf_core/framework
    -rw-rw-r-- 1 ... 359010 ... hdf_macro_test.h
    lrwxrwxrwx 1 ...     55 ... inner_api -> ../../../../../../drivers/hdf_core/interfaces/inner_api
    lrwxrwxrwx 1 ...     53 ... khdf -> ../../../../../../drivers/hdf_core/adapter/khdf/linux
    -rw-r--r-- 1 ...     74 ... Makefile
    drwxrwxr-x 3 ...   4096 ... wifi

2. One .cmd file records that:

    # head -1 ./framework/core/manager/src/.devmgr_service.o.cmd
    cmd_drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.o := ... \
    /path/to/out/drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.c

3. os.path.abspath returns "/path/to/out/framework/core/manager/src/devmgr_service.c", not correct:

    # ./scripts/clang-tools/gen_compile_commands.py
    INFO: Could not add line from ./framework/core/manager/src/.devmgr_service.o.cmd: File \
        /path/to/out/framework/core/manager/src/devmgr_service.c not found

Use pathlib.Path.resolve(), which resolves the symlinks and normalizes
the paths correctly.

    # cat compile_commands.json
    ...
    {
      "command": ...
      "directory": ...
      "file": "/path/to/blabla/drivers/hdf_core/framework/core/manager/src/devmgr_service.c"
    },
    ...

Reviewed-by: Nathan Chancellor <nathan@kernel.org>
---
 scripts/clang-tools/gen_compile_commands.py | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/scripts/clang-tools/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
index 180952fb91c1b..99e28b7152c19 100755
--- a/scripts/clang-tools/gen_compile_commands.py
+++ b/scripts/clang-tools/gen_compile_commands.py
@@ -11,6 +11,7 @@ import argparse
 import json
 import logging
 import os
+from pathlib import Path
 import re
 import subprocess
 import sys
@@ -172,8 +173,9 @@ def process_line(root_directory, command_prefix, file_path):
     # by Make, so this code replaces the escaped version with '#'.
     prefix = command_prefix.replace('\#', '#').replace('$(pound)', '#')
 
-    # Use os.path.abspath() to normalize the path resolving '.' and '..' .
-    abs_path = os.path.abspath(os.path.join(root_directory, file_path))
+    # Make the path absolute, resolving all symlinks on the way and also normalizing it.
+    # Convert Path object to a string because 'PosixPath' is not JSON serializable.
+    abs_path = str(Path(root_directory, file_path).resolve())
     if not os.path.exists(abs_path):
         raise ValueError('File %s not found' % abs_path)
     return {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4] gen_compile_commands.py: fix path resolve with symlinks in it
  2023-12-05 16:56     ` Nathan Chancellor
  2023-12-06  1:20       ` Jialu Xu
  2023-12-06  1:20       ` [PATCH v3] " Jialu Xu
@ 2023-12-06  1:24       ` Jialu Xu
  2023-12-07 22:54         ` Justin Stitt
  2023-12-10  5:52         ` Masahiro Yamada
  2 siblings, 2 replies; 13+ messages in thread
From: Jialu Xu @ 2023-12-06  1:24 UTC (permalink / raw)
  To: nathan
  Cc: justinstitt, linux-kbuild, linux-kernel, llvm, masahiroy, morbo,
	ndesaulniers, xujialu

When a path contains relative symbolic links, os.path.abspath() might
not follow the symlinks and instead return the absolute path with just
the relative paths resolved, resulting in an incorrect path.

1. Say "drivers/hdf/" has some symlinks:

    # ls -l drivers/hdf/
    total 364
    drwxrwxr-x 2 ...   4096 ... evdev
    lrwxrwxrwx 1 ...     44 ... framework -> ../../../../../../drivers/hdf_core/framework
    -rw-rw-r-- 1 ... 359010 ... hdf_macro_test.h
    lrwxrwxrwx 1 ...     55 ... inner_api -> ../../../../../../drivers/hdf_core/interfaces/inner_api
    lrwxrwxrwx 1 ...     53 ... khdf -> ../../../../../../drivers/hdf_core/adapter/khdf/linux
    -rw-r--r-- 1 ...     74 ... Makefile
    drwxrwxr-x 3 ...   4096 ... wifi

2. One .cmd file records that:

    # head -1 ./framework/core/manager/src/.devmgr_service.o.cmd
    cmd_drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.o := ... \
    /path/to/out/drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.c

3. os.path.abspath returns "/path/to/out/framework/core/manager/src/devmgr_service.c", not correct:

    # ./scripts/clang-tools/gen_compile_commands.py
    INFO: Could not add line from ./framework/core/manager/src/.devmgr_service.o.cmd: File \
        /path/to/out/framework/core/manager/src/devmgr_service.c not found

Use pathlib.Path.resolve(), which resolves the symlinks and normalizes
the paths correctly.

    # cat compile_commands.json
    ...
    {
      "command": ...
      "directory": ...
      "file": "/path/to/blabla/drivers/hdf_core/framework/core/manager/src/devmgr_service.c"
    },
    ...

Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Jialu Xu <xujialu@vimux.org>
---
 scripts/clang-tools/gen_compile_commands.py | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/scripts/clang-tools/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
index 180952fb91c1b..99e28b7152c19 100755
--- a/scripts/clang-tools/gen_compile_commands.py
+++ b/scripts/clang-tools/gen_compile_commands.py
@@ -11,6 +11,7 @@ import argparse
 import json
 import logging
 import os
+from pathlib import Path
 import re
 import subprocess
 import sys
@@ -172,8 +173,9 @@ def process_line(root_directory, command_prefix, file_path):
     # by Make, so this code replaces the escaped version with '#'.
     prefix = command_prefix.replace('\#', '#').replace('$(pound)', '#')
 
-    # Use os.path.abspath() to normalize the path resolving '.' and '..' .
-    abs_path = os.path.abspath(os.path.join(root_directory, file_path))
+    # Make the path absolute, resolving all symlinks on the way and also normalizing it.
+    # Convert Path object to a string because 'PosixPath' is not JSON serializable.
+    abs_path = str(Path(root_directory, file_path).resolve())
     if not os.path.exists(abs_path):
         raise ValueError('File %s not found' % abs_path)
     return {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v4] gen_compile_commands.py: fix path resolve with symlinks in it
  2023-12-06  1:24       ` [PATCH v4] " Jialu Xu
@ 2023-12-07 22:54         ` Justin Stitt
  2023-12-10  5:52         ` Masahiro Yamada
  1 sibling, 0 replies; 13+ messages in thread
From: Justin Stitt @ 2023-12-07 22:54 UTC (permalink / raw)
  To: Jialu Xu
  Cc: nathan, linux-kbuild, linux-kernel, llvm, masahiroy, morbo, ndesaulniers

Hi,

On Wed, Dec 06, 2023 at 09:24:42AM +0800, Jialu Xu wrote:
> When a path contains relative symbolic links, os.path.abspath() might
> not follow the symlinks and instead return the absolute path with just
> the relative paths resolved, resulting in an incorrect path.
>
> 1. Say "drivers/hdf/" has some symlinks:
>
>     # ls -l drivers/hdf/
>     total 364
>     drwxrwxr-x 2 ...   4096 ... evdev
>     lrwxrwxrwx 1 ...     44 ... framework -> ../../../../../../drivers/hdf_core/framework
>     -rw-rw-r-- 1 ... 359010 ... hdf_macro_test.h
>     lrwxrwxrwx 1 ...     55 ... inner_api -> ../../../../../../drivers/hdf_core/interfaces/inner_api
>     lrwxrwxrwx 1 ...     53 ... khdf -> ../../../../../../drivers/hdf_core/adapter/khdf/linux
>     -rw-r--r-- 1 ...     74 ... Makefile
>     drwxrwxr-x 3 ...   4096 ... wifi
>
> 2. One .cmd file records that:
>
>     # head -1 ./framework/core/manager/src/.devmgr_service.o.cmd
>     cmd_drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.o := ... \
>     /path/to/out/drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.c
>
> 3. os.path.abspath returns "/path/to/out/framework/core/manager/src/devmgr_service.c", not correct:
>
>     # ./scripts/clang-tools/gen_compile_commands.py
>     INFO: Could not add line from ./framework/core/manager/src/.devmgr_service.o.cmd: File \
>         /path/to/out/framework/core/manager/src/devmgr_service.c not found
>
> Use pathlib.Path.resolve(), which resolves the symlinks and normalizes
> the paths correctly.
>
>     # cat compile_commands.json
>     ...
>     {
>       "command": ...
>       "directory": ...
>       "file": "/path/to/blabla/drivers/hdf_core/framework/core/manager/src/devmgr_service.c"
>     },
>     ...
>
> Reviewed-by: Nathan Chancellor <nathan@kernel.org>
> Signed-off-by: Jialu Xu <xujialu@vimux.org>
> ---

This looks good to me. I prefer using pathlib over anything in os
module, even if the behavior was the same. In this case, the pathlib
behavior is better -- which is a bonus.

Reviewed-by: Justin Stitt <justinstitt@google.com>
>  scripts/clang-tools/gen_compile_commands.py | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/scripts/clang-tools/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
> index 180952fb91c1b..99e28b7152c19 100755
> --- a/scripts/clang-tools/gen_compile_commands.py
> +++ b/scripts/clang-tools/gen_compile_commands.py
> @@ -11,6 +11,7 @@ import argparse
>  import json
>  import logging
>  import os
> +from pathlib import Path
>  import re
>  import subprocess
>  import sys
> @@ -172,8 +173,9 @@ def process_line(root_directory, command_prefix, file_path):
>      # by Make, so this code replaces the escaped version with '#'.
>      prefix = command_prefix.replace('\#', '#').replace('$(pound)', '#')
>
> -    # Use os.path.abspath() to normalize the path resolving '.' and '..' .
> -    abs_path = os.path.abspath(os.path.join(root_directory, file_path))
> +    # Make the path absolute, resolving all symlinks on the way and also normalizing it.
> +    # Convert Path object to a string because 'PosixPath' is not JSON serializable.
> +    abs_path = str(Path(root_directory, file_path).resolve())
>      if not os.path.exists(abs_path):
>          raise ValueError('File %s not found' % abs_path)
>      return {
> --
> 2.39.2
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4] gen_compile_commands.py: fix path resolve with symlinks in it
  2023-12-06  1:24       ` [PATCH v4] " Jialu Xu
  2023-12-07 22:54         ` Justin Stitt
@ 2023-12-10  5:52         ` Masahiro Yamada
  2023-12-10  7:05           ` Jialu Xu
  1 sibling, 1 reply; 13+ messages in thread
From: Masahiro Yamada @ 2023-12-10  5:52 UTC (permalink / raw)
  To: Jialu Xu
  Cc: nathan, justinstitt, linux-kbuild, linux-kernel, llvm, morbo,
	ndesaulniers

On Wed, Dec 6, 2023 at 10:26 AM Jialu Xu <xujialu@vimux.org> wrote:
>
> When a path contains relative symbolic links, os.path.abspath() might
> not follow the symlinks and instead return the absolute path with just
> the relative paths resolved, resulting in an incorrect path.
>
> 1. Say "drivers/hdf/" has some symlinks:
>
>     # ls -l drivers/hdf/
>     total 364
>     drwxrwxr-x 2 ...   4096 ... evdev
>     lrwxrwxrwx 1 ...     44 ... framework -> ../../../../../../drivers/hdf_core/framework
>     -rw-rw-r-- 1 ... 359010 ... hdf_macro_test.h
>     lrwxrwxrwx 1 ...     55 ... inner_api -> ../../../../../../drivers/hdf_core/interfaces/inner_api
>     lrwxrwxrwx 1 ...     53 ... khdf -> ../../../../../../drivers/hdf_core/adapter/khdf/linux
>     -rw-r--r-- 1 ...     74 ... Makefile
>     drwxrwxr-x 3 ...   4096 ... wifi
>
> 2. One .cmd file records that:
>
>     # head -1 ./framework/core/manager/src/.devmgr_service.o.cmd
>     cmd_drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.o := ... \
>     /path/to/out/drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.c
>
> 3. os.path.abspath returns "/path/to/out/framework/core/manager/src/devmgr_service.c", not correct:
>
>     # ./scripts/clang-tools/gen_compile_commands.py
>     INFO: Could not add line from ./framework/core/manager/src/.devmgr_service.o.cmd: File \
>         /path/to/out/framework/core/manager/src/devmgr_service.c not found
>
> Use pathlib.Path.resolve(), which resolves the symlinks and normalizes
> the paths correctly.
>
>     # cat compile_commands.json
>     ...
>     {
>       "command": ...
>       "directory": ...
>       "file": "/path/to/blabla/drivers/hdf_core/framework/core/manager/src/devmgr_service.c"
>     },
>     ...
>
> Reviewed-by: Nathan Chancellor <nathan@kernel.org>
> Signed-off-by: Jialu Xu <xujialu@vimux.org>
> ---
>  scripts/clang-tools/gen_compile_commands.py | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/scripts/clang-tools/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
> index 180952fb91c1b..99e28b7152c19 100755
> --- a/scripts/clang-tools/gen_compile_commands.py
> +++ b/scripts/clang-tools/gen_compile_commands.py
> @@ -11,6 +11,7 @@ import argparse
>  import json
>  import logging
>  import os
> +from pathlib import Path
>  import re
>  import subprocess
>  import sys
> @@ -172,8 +173,9 @@ def process_line(root_directory, command_prefix, file_path):
>      # by Make, so this code replaces the escaped version with '#'.
>      prefix = command_prefix.replace('\#', '#').replace('$(pound)', '#')
>
> -    # Use os.path.abspath() to normalize the path resolving '.' and '..' .
> -    abs_path = os.path.abspath(os.path.join(root_directory, file_path))
> +    # Make the path absolute, resolving all symlinks on the way and also normalizing it.
> +    # Convert Path object to a string because 'PosixPath' is not JSON serializable.
> +    abs_path = str(Path(root_directory, file_path).resolve())
>      if not os.path.exists(abs_path):
>          raise ValueError('File %s not found' % abs_path)
>      return {





Is there any reason why you didn't simply replace
os.path.abspath() with os.path.realpath() ?



This patch uses pathlib.Path() just in one place,
leaving many call-sites of os.path.*() functions.

If it is just a matter of your preference,
you need to convert os.path.*() for consistency
(as a follow-up patch).





I see one more os.path.abspath()



    return (args.log_level,
            os.path.abspath(args.directory),
            args.output,
            args.ar,
            args.paths if len(args.paths) > 0 else [args.directory])




Does it cause a similar issue for the 'directory' field
with symbolic link jungles?







--
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4] gen_compile_commands.py: fix path resolve with symlinks in it
  2023-12-10  5:52         ` Masahiro Yamada
@ 2023-12-10  7:05           ` Jialu Xu
  2023-12-10  7:05             ` [PATCH v5] " Jialu Xu
  0 siblings, 1 reply; 13+ messages in thread
From: Jialu Xu @ 2023-12-10  7:05 UTC (permalink / raw)
  To: masahiroy
  Cc: justinstitt, linux-kbuild, linux-kernel, llvm, morbo, nathan,
	ndesaulniers, xujialu

>Is there any reason why you didn't simply replace
>os.path.abspath() with os.path.realpath() ?

I have tried it before, but obviously, I made a mistake.

>This patch uses pathlib.Path() just in one place,
>leaving many call-sites of os.path.*() functions.
>
>If it is just a matter of your preference,
>you need to convert os.path.*() for consistency
>(as a follow-up patch).

Keep os.path.* as os.path.realpath() works.

>I see one more os.path.abspath()
>
>    return (args.log_level,
>            os.path.abspath(args.directory),
>            args.output,
>            args.ar,
>            args.paths if len(args.paths) > 0 else [args.directory])
>
>Does it cause a similar issue for the 'directory' field
>with symbolic link jungles?

Yes, also fixed.


--
Best Regards
Jialu Xu



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v5] gen_compile_commands.py: fix path resolve with symlinks in it
  2023-12-10  7:05           ` Jialu Xu
@ 2023-12-10  7:05             ` Jialu Xu
  2023-12-11 17:53               ` Masahiro Yamada
  0 siblings, 1 reply; 13+ messages in thread
From: Jialu Xu @ 2023-12-10  7:05 UTC (permalink / raw)
  To: masahiroy
  Cc: justinstitt, linux-kbuild, linux-kernel, llvm, morbo, nathan,
	ndesaulniers, xujialu

When a path contains relative symbolic links, os.path.abspath() might
not follow the symlinks and instead return the absolute path with just
the relative paths resolved, resulting in an incorrect path.

1. Say "drivers/hdf/" has some symlinks:

    # ls -l drivers/hdf/
    total 364
    drwxrwxr-x 2 ...   4096 ... evdev
    lrwxrwxrwx 1 ...     44 ... framework -> ../../../../../../drivers/hdf_core/framework
    -rw-rw-r-- 1 ... 359010 ... hdf_macro_test.h
    lrwxrwxrwx 1 ...     55 ... inner_api -> ../../../../../../drivers/hdf_core/interfaces/inner_api
    lrwxrwxrwx 1 ...     53 ... khdf -> ../../../../../../drivers/hdf_core/adapter/khdf/linux
    -rw-r--r-- 1 ...     74 ... Makefile
    drwxrwxr-x 3 ...   4096 ... wifi

2. One .cmd file records that:

    # head -1 ./framework/core/manager/src/.devmgr_service.o.cmd
    cmd_drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.o := ... \
    /path/to/src/drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.c

3. os.path.abspath returns "/path/to/src/framework/core/manager/src/devmgr_service.c", not correct:

    # ./scripts/clang-tools/gen_compile_commands.py
    INFO: Could not add line from ./framework/core/manager/src/.devmgr_service.o.cmd: File \
        /path/to/src/framework/core/manager/src/devmgr_service.c not found

Use os.path.realpath(), which resolves the symlinks and normalizes the paths correctly.

    # cat compile_commands.json
    ...
    {
      "command": ...
      "directory": ...
      "file": "/path/to/bla/drivers/hdf_core/framework/core/manager/src/devmgr_service.c"
    },
    ...

Also fix it in parse_arguments().

Signed-off-by: Jialu Xu <xujialu@vimux.org>
---
 scripts/clang-tools/gen_compile_commands.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/scripts/clang-tools/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
index 180952fb91c1b..5dea4479240bc 100755
--- a/scripts/clang-tools/gen_compile_commands.py
+++ b/scripts/clang-tools/gen_compile_commands.py
@@ -64,7 +64,7 @@ def parse_arguments():
     args = parser.parse_args()
 
     return (args.log_level,
-            os.path.abspath(args.directory),
+            os.path.realpath(args.directory),
             args.output,
             args.ar,
             args.paths if len(args.paths) > 0 else [args.directory])
@@ -172,8 +172,8 @@ def process_line(root_directory, command_prefix, file_path):
     # by Make, so this code replaces the escaped version with '#'.
     prefix = command_prefix.replace('\#', '#').replace('$(pound)', '#')
 
-    # Use os.path.abspath() to normalize the path resolving '.' and '..' .
-    abs_path = os.path.abspath(os.path.join(root_directory, file_path))
+    # Return the canonical path, eliminating any symbolic links encountered in the path.
+    abs_path = os.path.realpath(os.path.join(root_directory, file_path))
     if not os.path.exists(abs_path):
         raise ValueError('File %s not found' % abs_path)
     return {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v5] gen_compile_commands.py: fix path resolve with symlinks in it
  2023-12-10  7:05             ` [PATCH v5] " Jialu Xu
@ 2023-12-11 17:53               ` Masahiro Yamada
  0 siblings, 0 replies; 13+ messages in thread
From: Masahiro Yamada @ 2023-12-11 17:53 UTC (permalink / raw)
  To: Jialu Xu
  Cc: justinstitt, linux-kbuild, linux-kernel, llvm, morbo, nathan,
	ndesaulniers

On Sun, Dec 10, 2023 at 4:06 PM Jialu Xu <xujialu@vimux.org> wrote:
>
> When a path contains relative symbolic links, os.path.abspath() might
> not follow the symlinks and instead return the absolute path with just
> the relative paths resolved, resulting in an incorrect path.
>
> 1. Say "drivers/hdf/" has some symlinks:
>
>     # ls -l drivers/hdf/
>     total 364
>     drwxrwxr-x 2 ...   4096 ... evdev
>     lrwxrwxrwx 1 ...     44 ... framework -> ../../../../../../drivers/hdf_core/framework
>     -rw-rw-r-- 1 ... 359010 ... hdf_macro_test.h
>     lrwxrwxrwx 1 ...     55 ... inner_api -> ../../../../../../drivers/hdf_core/interfaces/inner_api
>     lrwxrwxrwx 1 ...     53 ... khdf -> ../../../../../../drivers/hdf_core/adapter/khdf/linux
>     -rw-r--r-- 1 ...     74 ... Makefile
>     drwxrwxr-x 3 ...   4096 ... wifi
>
> 2. One .cmd file records that:
>
>     # head -1 ./framework/core/manager/src/.devmgr_service.o.cmd
>     cmd_drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.o := ... \
>     /path/to/src/drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.c
>
> 3. os.path.abspath returns "/path/to/src/framework/core/manager/src/devmgr_service.c", not correct:
>
>     # ./scripts/clang-tools/gen_compile_commands.py
>     INFO: Could not add line from ./framework/core/manager/src/.devmgr_service.o.cmd: File \
>         /path/to/src/framework/core/manager/src/devmgr_service.c not found
>
> Use os.path.realpath(), which resolves the symlinks and normalizes the paths correctly.
>
>     # cat compile_commands.json
>     ...
>     {
>       "command": ...
>       "directory": ...
>       "file": "/path/to/bla/drivers/hdf_core/framework/core/manager/src/devmgr_service.c"
>     },
>     ...
>
> Also fix it in parse_arguments().
>
> Signed-off-by: Jialu Xu <xujialu@vimux.org>



Applied to linux-kbuild. Thanks.




-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-12-11 17:55 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-04 10:41 [PATCH] gen_compile_commands.py: fix path resolve with symlinks in it Jialu Xu
2023-12-04 16:59 ` Nathan Chancellor
2023-12-05  2:15   ` Jialu Xu
2023-12-05  2:15   ` [PATCH v2] gen_compile_commands.py: fix path resolve with symlinks in it Jialu Xu
2023-12-05 16:56     ` Nathan Chancellor
2023-12-06  1:20       ` Jialu Xu
2023-12-06  1:20       ` [PATCH v3] " Jialu Xu
2023-12-06  1:24       ` [PATCH v4] " Jialu Xu
2023-12-07 22:54         ` Justin Stitt
2023-12-10  5:52         ` Masahiro Yamada
2023-12-10  7:05           ` Jialu Xu
2023-12-10  7:05             ` [PATCH v5] " Jialu Xu
2023-12-11 17:53               ` Masahiro Yamada

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).