All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC 0/8] Introduce an extensible static analyzer
@ 2022-07-02 11:33 Alberto Faria
  2022-07-02 11:33 ` [RFC 1/8] Add " Alberto Faria
                   ` (10 more replies)
  0 siblings, 11 replies; 27+ messages in thread
From: Alberto Faria @ 2022-07-02 11:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven,
	Alberto Faria

This series introduces a static analyzer for QEMU. It consists of a
single static-analyzer.py script that relies on libclang's Python
bindings, and provides a common framework on which arbitrary static
analysis checks can be developed and run against QEMU's code base.

Summary of the series:

  - Patch 1 adds the base static analyzer, along with a simple check
    that finds static functions whose return value is never used, and
    patch 2 fixes some occurrences of this.

  - Patch 3 adds a check to ensure that non-coroutine_fn functions don't
    perform direct calls to coroutine_fn functions, and patch 4 fixes
    some violations of this rule.

  - Patch 5 adds a check to enforce coroutine_fn restrictions on
    function pointers, namely around assignment and indirect calls, and
    patch 6 fixes some problems it detects. (Implementing this check
    properly is complicated, since AFAICT annotation attributes cannot
    be applied directly to types. This part still needs a lot of work.)

  - Patch 7 introduces a no_coroutine_fn marker for functions that
    should not be called from coroutines, makes generated_co_wrapper
    evaluate to no_coroutine_fn, and adds a check enforcing this rule.
    Patch 8 fixes some violations that it finds.

The current primary motivation for this work is enforcing rules around
block layer coroutines, which is why most of the series focuses on that.
However, the static analyzer is intended to be sufficiently generic to
satisfy other present and future QEMU static analysis needs.

This is very early work-in-progress, and a lot is missing. One notable
omission is build system integration, including keeping track of which
translation units have been modified and need re-analyzing.

Performance is bad, but there is a lot of potential for optimization,
such as avoiding redundant AST traversals. Switching to C libclang is
also a possibility, although Python makes it easy to quickly prototype
new checks, which should encourage adoption and contributions.

The script takes a path to the build directory, and any number of paths
to directories or files to analyze. Example run on a 12-thread laptop:

    $ time ./static-analyzer.py build block
    block/commit.c:525:15: non-coroutine_fn function calls coroutine_fn
    block/nbd.c:206:5: non-coroutine_fn function calls coroutine_fn
    [...]
    block/ssh.c:1167:13: non-coroutine_fn function calls coroutine_fn
    block/nfs.c:229:27: non-coroutine_fn function calls coroutine_fn
    Analyzed 79 translation units.

    real    0m45.277s
    user    7m55.496s
    sys     0m1.445s

You will need libclang's Python bindings to run this. On Fedora, `dnf
install python3-clang` should suffice.

Alberto Faria (8):
  Add an extensible static analyzer
  Drop some unused static function return values
  static-analyzer: Enforce coroutine_fn restrictions for direct calls
  Fix some direct calls from non-coroutine_fn to coroutine_fn
  static-analyzer: Enforce coroutine_fn restrictions on function
    pointers
  Fix some coroutine_fn indirect calls and pointer assignments
  block: Add no_coroutine_fn marker
  Avoid calls from coroutine_fn to no_coroutine_fn

 block/block-backend.c            |  15 +-
 block/copy-before-write.c        |   3 +-
 block/dirty-bitmap.c             |   6 +-
 block/file-posix.c               |   6 +-
 block/io.c                       |  34 +-
 block/iscsi.c                    |   3 +-
 block/parallels.c                |   4 +-
 block/qcow2-bitmap.c             |   6 +-
 block/qcow2-refcount.c           |   2 +-
 block/qcow2.h                    |  14 +-
 block/qed-table.c                |   2 +-
 block/qed.c                      |   8 +-
 block/quorum.c                   |   5 +-
 block/vmdk.c                     |   4 +-
 block/vpc.c                      |   9 +-
 block/vvfat.c                    |  11 +-
 include/block/block-common.h     |   2 +-
 include/block/block-io.h         |   7 +-
 include/block/block_int-common.h |  12 +-
 include/qemu/coroutine.h         |  25 +
 static-analyzer.py               | 890 +++++++++++++++++++++++++++++++
 21 files changed, 987 insertions(+), 81 deletions(-)
 create mode 100755 static-analyzer.py

-- 
2.36.1



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [RFC 1/8] Add an extensible static analyzer
  2022-07-02 11:33 [RFC 0/8] Introduce an extensible static analyzer Alberto Faria
@ 2022-07-02 11:33 ` Alberto Faria
  2022-07-02 11:33 ` [RFC 2/8] Drop some unused static function return values Alberto Faria
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 27+ messages in thread
From: Alberto Faria @ 2022-07-02 11:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven,
	Alberto Faria

Add a static-analyzer.py script that uses libclang's Python bindings to
provide a common framework on which arbitrary static analysis checks can
be developed and run against QEMU's code base.

As an example, a simple check is included that verifies that the return
value of static, non-void functions is used by at least one caller.

Signed-off-by: Alberto Faria <afaria@redhat.com>
---
 static-analyzer.py | 509 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 509 insertions(+)
 create mode 100755 static-analyzer.py

diff --git a/static-analyzer.py b/static-analyzer.py
new file mode 100755
index 0000000000..010cc92212
--- /dev/null
+++ b/static-analyzer.py
@@ -0,0 +1,509 @@
+#!/usr/bin/env python3
+# ---------------------------------------------------------------------------- #
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+import json
+import os
+import os.path
+import subprocess
+import sys
+import re
+from argparse import ArgumentParser, Namespace, RawDescriptionHelpFormatter
+from multiprocessing import Pool
+from pathlib import Path
+from typing import (
+    Any,
+    Callable,
+    Dict,
+    Iterable,
+    List,
+    NoReturn,
+    Optional,
+    Mapping,
+    Sequence,
+    Tuple,
+)
+
+# ---------------------------------------------------------------------------- #
+
+from clang.cindex import (  # type: ignore
+    Cursor,
+    CursorKind,
+    Diagnostic,
+    StorageClass,
+    TranslationUnit,
+    TranslationUnitLoadError,
+    TypeKind,
+)
+
+Cursor.__hash__ = lambda self: self.hash  # so `Cursor`s can be dict keys
+
+# ---------------------------------------------------------------------------- #
+# Usage
+
+
+def parse_args() -> Namespace:
+
+    available_checks = "\n".join(f"  {name}" for (name, _) in CHECKS)
+
+    parser = ArgumentParser(
+        formatter_class=RawDescriptionHelpFormatter,
+        epilog=f"""
+available checks:
+{available_checks}
+
+exit codes:
+  0  No problems found.
+  1  Analyzer failure.
+  2  Bad usage.
+  3  Problems found in a translation unit.
+""",
+    )
+
+    parser.add_argument("build_dir", type=Path)
+
+    parser.add_argument(
+        "translation_units",
+        type=Path,
+        nargs="*",
+        help=(
+            "Analyze only translation units whose root source file matches or"
+            " is under one of the given paths."
+        ),
+    )
+
+    parser.add_argument(
+        "-c",
+        "--check",
+        metavar="CHECK",
+        dest="check_names",
+        choices=[name for (name, _) in CHECKS],
+        action="append",
+        help=(
+            "Enable the given check. Can be given multiple times. If not given,"
+            " all checks are enabled."
+        ),
+    )
+
+    parser.add_argument(
+        "-j",
+        "--jobs",
+        dest="threads",
+        type=int,
+        help=(
+            "Number of threads to employ. Defaults to the number of logical"
+            " processors."
+        ),
+    )
+
+    parser.add_argument(
+        "--profile",
+        action="store_true",
+        help="Profile execution. Forces single-threaded execution.",
+    )
+
+    return parser.parse_args()
+
+
+# ---------------------------------------------------------------------------- #
+# Main
+
+
+def main() -> NoReturn:
+
+    args = parse_args()
+
+    compile_commands = load_compilation_database(args)
+    contexts = get_translation_unit_contexts(args, compile_commands)
+
+    analyze_translation_units(args, contexts)
+
+
+def load_compilation_database(args: Namespace) -> Sequence[Mapping[str, str]]:
+
+    # clang.cindex.CompilationDatabase.getCompileCommands() apparently produces
+    # entries for files not listed in compile_commands.json in a best-effort
+    # manner, which we don't want, so we parse the JSON ourselves instead.
+
+    path = args.build_dir / "compile_commands.json"
+
+    try:
+        with path.open("r") as f:
+            return json.load(f)
+    except FileNotFoundError:
+        fatal(f"{path} does not exist")
+
+
+def get_translation_unit_contexts(
+    args: Namespace, compile_commands: Iterable[Mapping[str, str]]
+) -> Sequence[TranslationUnitContext]:
+
+    system_include_paths = get_clang_system_include_paths()
+    check_names = args.check_names or [name for (name, _) in CHECKS]
+
+    contexts = (
+        TranslationUnitContext(
+            absolute_path=str(Path(cmd["directory"], cmd["file"]).resolve()),
+            compilation_working_dir=cmd["directory"],
+            compilation_command=cmd["command"],
+            system_include_paths=system_include_paths,
+            check_names=check_names,
+        )
+        for cmd in compile_commands
+    )
+
+    if args.translation_units:
+
+        allowed_prefixes = [
+            # ensure path exists and is slash-terminated (even if it is a file)
+            os.path.join(path.resolve(strict=True), "")
+            for path in args.translation_units
+        ]
+
+        contexts = (
+            ctx
+            for ctx in contexts
+            if any(
+                (ctx.absolute_path + "/").startswith(prefix)
+                for prefix in allowed_prefixes
+            )
+        )
+
+    context_list = list(contexts)
+
+    if not context_list:
+        fatal("No translation units to analyze")
+
+    return context_list
+
+
+def get_clang_system_include_paths() -> Sequence[str]:
+
+    # libclang does not automatically include clang's standard system include
+    # paths, so we ask clang what they are and include them ourselves.
+
+    # TODO: Is there a less hacky way to do this?
+
+    result = subprocess.run(
+        ["clang", "-E", "-", "-v"],
+        stdin=subprocess.DEVNULL,
+        stdout=subprocess.DEVNULL,
+        stderr=subprocess.PIPE,
+        universal_newlines=True,  # decode stdout/stderr using default encoding
+        check=True,
+    )
+
+    # Module `re` does not support repeated group captures.
+    pattern = (
+        r"#include <...> search starts here:\n"
+        r"((?: \S*\n)+)"
+        r"End of search list."
+    )
+
+    match = re.search(pattern, result.stderr, re.MULTILINE)
+    assert match is not None
+
+    return [line[1:] for line in match.group(1).splitlines()]
+
+
+def fatal(message: str) -> NoReturn:
+    print(f"\033[0;31mERROR: {message}\033[0m")
+    sys.exit(1)
+
+
+# ---------------------------------------------------------------------------- #
+# Analysis
+
+
+@dataclass
+class TranslationUnitContext:
+    absolute_path: str
+    compilation_working_dir: str
+    compilation_command: str
+    system_include_paths: Sequence[str]
+    check_names: Sequence[str]
+
+
+def analyze_translation_units(
+    args: Namespace, contexts: Sequence[TranslationUnitContext]
+) -> NoReturn:
+
+    results: List[bool]
+
+    if not args.profile:
+
+        with Pool(processes=args.threads) as pool:
+            results = pool.map(analyze_translation_unit, contexts)
+
+    else:
+
+        import cProfile
+        import pstats
+
+        profile = cProfile.Profile()
+
+        try:
+            results = profile.runcall(
+                lambda: list(map(analyze_translation_unit, contexts))
+            )
+        finally:
+            stats = pstats.Stats(profile, stream=sys.stderr)
+            stats.strip_dirs()
+            stats.sort_stats("tottime")
+            stats.print_stats()
+
+    print(
+        f"\033[0;34mAnalyzed {len(contexts)}"
+        f" translation unit{'' if len(contexts) == 1 else 's'}.\033[0m"
+    )
+
+    sys.exit(0 if all(results) else 3)
+
+
+def analyze_translation_unit(context: TranslationUnitContext) -> bool:
+
+    # relative to script's original working directory
+    relative_path = os.path.relpath(context.absolute_path)
+
+    # load translation unit
+
+    command = context.compilation_command.split()
+
+    adjusted_command = [
+        # keep the original compilation command name
+        command[0],
+        # ignore unknown GCC warning options
+        "-Wno-unknown-warning-option",
+        # add clang system include paths
+        *(
+            arg
+            for path in context.system_include_paths
+            for arg in ("-isystem", path)
+        ),
+        # keep all other arguments but the last, which is the file name
+        *command[1:-1],
+        # replace relative path to get absolute location information
+        context.absolute_path,
+    ]
+
+    original_cwd = os.getcwd()
+    os.chdir(context.compilation_working_dir)  # for options like -I to work
+
+    try:
+        tu = TranslationUnit.from_source(filename=None, args=adjusted_command)
+    except TranslationUnitLoadError as e:
+        raise RuntimeError(f"Failed to load {relative_path}") from e
+
+    os.chdir(original_cwd)  # to have proper relative paths in messages
+
+    # check for fatal diagnostics
+
+    found_problems = False
+
+    for diag in tu.diagnostics:
+        # consider only Fatal diagnostics, like missing includes
+        if diag.severity >= Diagnostic.Fatal:
+            found_problems = True
+            location = format_location(diag, default=relative_path)
+            print(
+                f"\033[0;33m{location}: {diag.spelling}; this may lead to false"
+                f" positives and negatives\033[0m"
+            )
+
+    # analyze translation unit
+
+    def log(node: Cursor, message: str) -> None:
+        nonlocal found_problems
+        found_problems = True
+        print(f"{format_location(node)}: {message}")
+
+    try:
+        for (name, checker) in CHECKS:
+            if name in context.check_names:
+                checker(tu, context.absolute_path, log)
+    except Exception as e:
+        raise RuntimeError(f"Error analyzing {relative_path}") from e
+
+    return not found_problems
+
+
+# obj must have a location field/property that is a `SourceLocation`.
+def format_location(obj: Any, *, default: str = "(none)") -> str:
+
+    location = obj.location
+
+    if location.file is None:
+        return default
+    else:
+        abs_path = Path(location.file.name).resolve()
+        rel_path = os.path.relpath(abs_path)
+        return f"{rel_path}:{location.line}:{location.column}"
+
+
+# ---------------------------------------------------------------------------- #
+# Checks
+
+Checker = Callable[[TranslationUnit, str, Callable[[Cursor, str], None]], None]
+
+CHECKS: List[Tuple[str, Checker]] = []
+
+
+def check(name: str) -> Callable[[Checker], Checker]:
+    def decorator(checker: Checker) -> Checker:
+        CHECKS.append((name, checker))
+        return checker
+
+    return decorator
+
+
+@check("return-value-never-used")
+def check_return_value_never_used(
+    translation_unit: TranslationUnit,
+    translation_unit_path: str,
+    log: Callable[[Cursor, str], None],
+) -> None:
+    """
+    Report static functions with a non-void return value that no caller ever
+    uses.
+
+    This check is best effort, but should never report false positives (positive
+    being error).
+    """
+
+    def function_occurrence_might_use_return_value(
+        ancestors: Sequence[Cursor], node: Cursor
+    ) -> bool:
+
+        if ancestors[-1].kind.is_statement():
+
+            return False
+
+        elif (
+            ancestors[-1].kind == CursorKind.CALL_EXPR
+            and ancestors[-1].referenced == node.referenced
+        ):
+
+            if not ancestors[-2].kind.is_statement():
+                return True
+
+            if ancestors[-2].kind in [
+                CursorKind.IF_STMT,
+                CursorKind.SWITCH_STMT,
+                CursorKind.WHILE_STMT,
+                CursorKind.DO_STMT,
+                CursorKind.RETURN_STMT,
+            ]:
+                return True
+
+            if ancestors[-2].kind == CursorKind.FOR_STMT:
+                [_init, cond, _adv] = ancestors[-2].get_children()
+                if ancestors[-1] == cond:
+                    return True
+
+            return False
+
+        else:
+
+            # might be doing something with a pointer to the function
+            return True
+
+    # Maps canonical function `Cursor`s to whether we found a place that maybe
+    # uses their return value. Only includes static functions that don't return
+    # void and belong to the translation unit's root file (i.e, were not brought
+    # in by an #include).
+    funcs: Dict[Cursor, bool] = {}
+
+    for [*ancestors, node] in all_paths(translation_unit.cursor):
+
+        if (
+            node.kind == CursorKind.FUNCTION_DECL
+            and node.storage_class == StorageClass.STATIC
+            and node.location.file.name == translation_unit_path
+            and node.type.get_result().get_canonical().kind != TypeKind.VOID
+        ):
+            funcs.setdefault(node.canonical, False)
+
+        if (
+            node.kind == CursorKind.DECL_REF_EXPR
+            and node.referenced.kind == CursorKind.FUNCTION_DECL
+            and node.referenced.canonical in funcs
+            and function_occurrence_might_use_return_value(ancestors, node)
+        ):
+            funcs[node.referenced.canonical] = True
+
+    # ---
+
+    for (cursor, return_value_maybe_used) in funcs.items():
+        if not return_value_maybe_used:
+            log(cursor, f"{cursor.spelling}() return value is never used")
+
+
+# ---------------------------------------------------------------------------- #
+# Traversal
+
+# Hides nodes of kind UNEXPOSED_EXPR.
+def all_paths(root: Cursor) -> Iterable[Sequence[Cursor]]:
+
+    path = []
+
+    def aux(node: Cursor) -> Iterable[Sequence[Cursor]]:
+        nonlocal path
+
+        if node.kind != CursorKind.UNEXPOSED_EXPR:
+            path.append(node)
+            yield path
+
+        for child in node.get_children():
+            yield from aux(child)
+
+        if node.kind != CursorKind.UNEXPOSED_EXPR:
+            path.pop()
+
+    return aux(root)
+
+
+# ---------------------------------------------------------------------------- #
+# Utilities handy for development
+
+
+def print_node(node: Cursor) -> None:
+
+    print(f"{format_location(node)}: kind = {node.kind.name}", end="")
+
+    if node.spelling:
+        print(f", name = {node.spelling}", end="")
+
+    if node.type is not None:
+        print(f", type = {node.type.get_canonical().spelling}", end="")
+
+    if node.referenced is not None:
+        print(f", referenced = {node.referenced.spelling}", end="")
+
+    print()
+
+
+def print_tree(
+    node: Cursor, *, max_depth: Optional[int] = None, indentation_level: int = 0
+) -> None:
+
+    if max_depth is None or max_depth >= 0:
+
+        print("  " * indentation_level, end="")
+        print_node(node)
+
+        for child in node.get_children():
+            print_tree(
+                child,
+                max_depth=None if max_depth is None else max_depth - 1,
+                indentation_level=indentation_level + 1,
+            )
+
+
+# ---------------------------------------------------------------------------- #
+
+if __name__ == "__main__":
+    main()
+
+# ---------------------------------------------------------------------------- #
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC 2/8] Drop some unused static function return values
  2022-07-02 11:33 [RFC 0/8] Introduce an extensible static analyzer Alberto Faria
  2022-07-02 11:33 ` [RFC 1/8] Add " Alberto Faria
@ 2022-07-02 11:33 ` Alberto Faria
  2022-07-02 11:33 ` [RFC 3/8] static-analyzer: Enforce coroutine_fn restrictions for direct calls Alberto Faria
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 27+ messages in thread
From: Alberto Faria @ 2022-07-02 11:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven,
	Alberto Faria

Make some non-void static functions whose return values are ignored by
all callers return void instead.

These functions were found by the shiny new static-analyzer.py. Only a
few of the reported cases were fixed.

Signed-off-by: Alberto Faria <afaria@redhat.com>
---
 block/file-posix.c   |  6 +-----
 block/io.c           | 24 +++++++++++-------------
 block/qcow2-bitmap.c |  6 ++----
 block/quorum.c       |  5 +----
 block/vpc.c          |  9 +++------
 block/vvfat.c        | 11 +++++------
 6 files changed, 23 insertions(+), 38 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 48cd096624..a4641da7f9 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1895,7 +1895,7 @@ static int handle_aiocb_discard(void *opaque)
  * Returns: 0 on success, -errno on failure. Since this is an optimization,
  * caller may ignore failures.
  */
-static int allocate_first_block(int fd, size_t max_size)
+static void allocate_first_block(int fd, size_t max_size)
 {
     size_t write_size = (max_size < MAX_BLOCKSIZE)
         ? BDRV_SECTOR_SIZE
@@ -1903,7 +1903,6 @@ static int allocate_first_block(int fd, size_t max_size)
     size_t max_align = MAX(MAX_BLOCKSIZE, qemu_real_host_page_size());
     void *buf;
     ssize_t n;
-    int ret;
 
     buf = qemu_memalign(max_align, write_size);
     memset(buf, 0, write_size);
@@ -1912,10 +1911,7 @@ static int allocate_first_block(int fd, size_t max_size)
         n = pwrite(fd, buf, write_size, 0);
     } while (n == -1 && errno == EINTR);
 
-    ret = (n == -1) ? -errno : 0;
-
     qemu_vfree(buf);
-    return ret;
 }
 
 static int handle_aiocb_truncate(void *opaque)
diff --git a/block/io.c b/block/io.c
index 1e9bf09a49..bbfe94503b 100644
--- a/block/io.c
+++ b/block/io.c
@@ -934,20 +934,18 @@ void bdrv_dec_in_flight(BlockDriverState *bs)
     bdrv_wakeup(bs);
 }
 
-static bool coroutine_fn bdrv_wait_serialising_requests(BdrvTrackedRequest *self)
+static void coroutine_fn
+bdrv_wait_serialising_requests(BdrvTrackedRequest *self)
 {
     BlockDriverState *bs = self->bs;
-    bool waited = false;
 
     if (!qatomic_read(&bs->serialising_in_flight)) {
-        return false;
+        return;
     }
 
     qemu_co_mutex_lock(&bs->reqs_lock);
-    waited = bdrv_wait_serialising_requests_locked(self);
+    bdrv_wait_serialising_requests_locked(self);
     qemu_co_mutex_unlock(&bs->reqs_lock);
-
-    return waited;
 }
 
 bool coroutine_fn bdrv_make_request_serialising(BdrvTrackedRequest *req,
@@ -1689,10 +1687,10 @@ static bool bdrv_init_padding(BlockDriverState *bs,
     return true;
 }
 
-static int bdrv_padding_rmw_read(BdrvChild *child,
-                                 BdrvTrackedRequest *req,
-                                 BdrvRequestPadding *pad,
-                                 bool zero_middle)
+static void bdrv_padding_rmw_read(BdrvChild *child,
+                                  BdrvTrackedRequest *req,
+                                  BdrvRequestPadding *pad,
+                                  bool zero_middle)
 {
     QEMUIOVector local_qiov;
     BlockDriverState *bs = child->bs;
@@ -1715,7 +1713,7 @@ static int bdrv_padding_rmw_read(BdrvChild *child,
         ret = bdrv_aligned_preadv(child, req, req->overlap_offset, bytes,
                                   align, &local_qiov, 0, 0);
         if (ret < 0) {
-            return ret;
+            return;
         }
         if (pad->head) {
             bdrv_debug_event(bs, BLKDBG_PWRITEV_RMW_AFTER_HEAD);
@@ -1738,7 +1736,7 @@ static int bdrv_padding_rmw_read(BdrvChild *child,
                 req->overlap_offset + req->overlap_bytes - align,
                 align, align, &local_qiov, 0, 0);
         if (ret < 0) {
-            return ret;
+            return;
         }
         bdrv_debug_event(bs, BLKDBG_PWRITEV_RMW_AFTER_TAIL);
     }
@@ -1748,7 +1746,7 @@ zero_mem:
         memset(pad->buf + pad->head, 0, pad->buf_len - pad->head - pad->tail);
     }
 
-    return 0;
+    return;
 }
 
 static void bdrv_padding_destroy(BdrvRequestPadding *pad)
diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index 8fb4731551..a4e9fe73d4 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -257,14 +257,14 @@ fail:
     return ret;
 }
 
-static int free_bitmap_clusters(BlockDriverState *bs, Qcow2BitmapTable *tb)
+static void free_bitmap_clusters(BlockDriverState *bs, Qcow2BitmapTable *tb)
 {
     int ret;
     uint64_t *bitmap_table;
 
     ret = bitmap_table_load(bs, tb, &bitmap_table);
     if (ret < 0) {
-        return ret;
+        return;
     }
 
     clear_bitmap_table(bs, bitmap_table, tb->size);
@@ -274,8 +274,6 @@ static int free_bitmap_clusters(BlockDriverState *bs, Qcow2BitmapTable *tb)
 
     tb->offset = 0;
     tb->size = 0;
-
-    return 0;
 }
 
 /* load_bitmap_data
diff --git a/block/quorum.c b/block/quorum.c
index f33f30d36b..9c0fbd79be 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -293,7 +293,7 @@ static void quorum_rewrite_entry(void *opaque)
     }
 }
 
-static bool quorum_rewrite_bad_versions(QuorumAIOCB *acb,
+static void quorum_rewrite_bad_versions(QuorumAIOCB *acb,
                                         QuorumVoteValue *value)
 {
     QuorumVoteVersion *version;
@@ -331,9 +331,6 @@ static bool quorum_rewrite_bad_versions(QuorumAIOCB *acb,
             qemu_coroutine_enter(co);
         }
     }
-
-    /* return true if any rewrite is done else false */
-    return count;
 }
 
 static void quorum_count_vote(QuorumVotes *votes,
diff --git a/block/vpc.c b/block/vpc.c
index 4d8f16e199..bd705cffb0 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -777,11 +777,10 @@ static int coroutine_fn vpc_co_block_status(BlockDriverState *bs,
  * Note that the geometry doesn't always exactly match total_sectors but
  * may round it down.
  *
- * Returns 0 on success, -EFBIG if the size is larger than 2040 GiB. Override
- * the hardware EIDE and ATA-2 limit of 16 heads (max disk size of 127 GB)
- * and instead allow up to 255 heads.
+ * Override the hardware EIDE and ATA-2 limit of 16 heads (max disk size of 127
+ * GB) and instead allow up to 255 heads.
  */
-static int calculate_geometry(int64_t total_sectors, uint16_t *cyls,
+static void calculate_geometry(int64_t total_sectors, uint16_t *cyls,
     uint8_t *heads, uint8_t *secs_per_cyl)
 {
     uint32_t cyls_times_heads;
@@ -815,8 +814,6 @@ static int calculate_geometry(int64_t total_sectors, uint16_t *cyls,
     }
 
     *cyls = cyls_times_heads / *heads;
-
-    return 0;
 }
 
 static int create_dynamic_disk(BlockBackend *blk, VHDFooter *footer,
diff --git a/block/vvfat.c b/block/vvfat.c
index b2b58d93b8..fba7581427 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -154,9 +154,9 @@ static inline int array_remove_slice(array_t* array,int index, int count)
     return 0;
 }
 
-static int array_remove(array_t* array,int index)
+static void array_remove(array_t* array,int index)
 {
-    return array_remove_slice(array, index, 1);
+    array_remove_slice(array, index, 1);
 }
 
 /* return the index for a given member */
@@ -2967,13 +2967,12 @@ DLOG(checkpoint());
     return 0;
 }
 
-static int try_commit(BDRVVVFATState* s)
+static void try_commit(BDRVVVFATState* s)
 {
     vvfat_close_current_file(s);
 DLOG(checkpoint());
-    if(!is_consistent(s))
-        return -1;
-    return do_commit(s);
+    if (is_consistent(s))
+        do_commit(s);
 }
 
 static int vvfat_write(BlockDriverState *bs, int64_t sector_num,
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC 3/8] static-analyzer: Enforce coroutine_fn restrictions for direct calls
  2022-07-02 11:33 [RFC 0/8] Introduce an extensible static analyzer Alberto Faria
  2022-07-02 11:33 ` [RFC 1/8] Add " Alberto Faria
  2022-07-02 11:33 ` [RFC 2/8] Drop some unused static function return values Alberto Faria
@ 2022-07-02 11:33 ` Alberto Faria
  2022-07-02 11:33 ` [RFC 4/8] Fix some direct calls from non-coroutine_fn to coroutine_fn Alberto Faria
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 27+ messages in thread
From: Alberto Faria @ 2022-07-02 11:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven,
	Alberto Faria

Add a static-analyzer.py check ensuring that non-coroutine_fn functions
don't perform direct calls to coroutine_fn functions.

For the few cases where this must happen, introduce an
__allow_coroutine_fn_call() macro that wraps offending calls and
overrides the static analyzer.

Signed-off-by: Alberto Faria <afaria@redhat.com>
---
 include/qemu/coroutine.h |  13 +++
 static-analyzer.py       | 207 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 220 insertions(+)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 08c5bb3c76..40a4037525 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -42,7 +42,20 @@
  *       ....
  *   }
  */
+#ifdef __clang__
+#define coroutine_fn __attribute__((__annotate__("coroutine_fn")))
+#else
 #define coroutine_fn
+#endif
+
+/**
+ * This can wrap a call to a coroutine_fn from a non-coroutine_fn function and
+ * suppress the static analyzer's complaints.
+ *
+ * You don't want to use this.
+ */
+#define __allow_coroutine_fn_call(call) \
+    ((void)"__allow_coroutine_fn_call", call)
 
 typedef struct Coroutine Coroutine;
 
diff --git a/static-analyzer.py b/static-analyzer.py
index 010cc92212..8abc552b82 100755
--- a/static-analyzer.py
+++ b/static-analyzer.py
@@ -440,6 +440,81 @@ def function_occurrence_might_use_return_value(
             log(cursor, f"{cursor.spelling}() return value is never used")
 
 
+@check("coroutine-annotation-validity")
+def check_coroutine_annotation_validity(
+    translation_unit: TranslationUnit,
+    translation_unit_path: str,
+    log: Callable[[Cursor, str], None],
+) -> None:
+
+    for [*ancestors, node] in all_paths(translation_unit.cursor):
+
+        # validate annotation usage
+
+        if is_annotation(node, "coroutine_fn") and (
+            ancestors[-1] is None
+            or not is_valid_coroutine_fn_usage(ancestors[-1])
+        ):
+            log(node, "invalid coroutine_fn usage")
+
+        if is_comma_wrapper(
+            node, "__allow_coroutine_fn_call"
+        ) and not is_valid_allow_coroutine_fn_call_usage(node):
+            log(node, "invalid __allow_coroutine_fn_call usage")
+
+        # reject re-declarations with inconsistent annotations
+
+        if node.kind == CursorKind.FUNCTION_DECL:
+
+            def log_annotation_disagreement(annotation: str) -> None:
+                log(
+                    node,
+                    f"{annotation} annotation disagreement with"
+                    f" {format_location(node.canonical)}",
+                )
+
+            if is_coroutine_fn(node) != is_coroutine_fn(node.canonical):
+                log_annotation_disagreement("coroutine_fn")
+
+
+@check("coroutine-calls")
+def check_coroutine_calls(
+    translation_unit: TranslationUnit,
+    translation_unit_path: str,
+    log: Callable[[Cursor, str], None],
+) -> None:
+
+    for caller in subtrees_matching(
+        translation_unit.cursor,
+        lambda n: n.kind == CursorKind.FUNCTION_DECL and n.is_definition(),
+    ):
+
+        caller_is_coroutine = is_coroutine_fn(caller)
+
+        for [*_, call_parent, call] in filter(
+            lambda path: path[-1].kind == CursorKind.CALL_EXPR,
+            all_paths(caller),
+        ):
+
+            # We can get some "calls" that are actually things like top-level
+            # macro invocations.
+            if call.referenced is None:
+                continue
+
+            callee = call.referenced.canonical
+
+            # reject calls from non-coroutine_fn to coroutine_fn
+
+            if (
+                not caller_is_coroutine
+                and is_coroutine_fn(callee)
+                and not is_comma_wrapper(
+                    call_parent, "__allow_coroutine_fn_call"
+                )
+            ):
+                log(call, "non-coroutine_fn function calls coroutine_fn")
+
+
 # ---------------------------------------------------------------------------- #
 # Traversal
 
@@ -464,6 +539,138 @@ def aux(node: Cursor) -> Iterable[Sequence[Cursor]]:
     return aux(root)
 
 
+# Doesn't traverse yielded subtrees.
+def subtrees_matching(
+    root: Cursor, predicate: Callable[[Cursor], bool]
+) -> Iterable[Cursor]:
+
+    if predicate(root):
+        yield root
+    else:
+        for child in root.get_children():
+            yield from subtrees_matching(child, predicate)
+
+
+# ---------------------------------------------------------------------------- #
+# Node predicates
+
+
+def is_valid_coroutine_fn_usage(parent: Cursor) -> bool:
+    """
+    Check if an occurrence of `coroutine_fn` represented by a node with parent
+    `parent` appears at a valid point in the AST. This is the case if `parent`
+    is:
+
+      - A function declaration/definition, OR
+      - A field/variable/parameter declaration with a function pointer type, OR
+      - A typedef of a function type or function pointer type.
+    """
+
+    if parent.kind == CursorKind.FUNCTION_DECL:
+        return True
+
+    canonical_type = parent.type.get_canonical()
+
+    def parent_type_is_function() -> bool:
+        return canonical_type.kind == TypeKind.FUNCTIONPROTO
+
+    def parent_type_is_function_pointer() -> bool:
+        return (
+            canonical_type.kind == TypeKind.POINTER
+            and canonical_type.get_pointee().kind == TypeKind.FUNCTIONPROTO
+        )
+
+    if parent.kind in [
+        CursorKind.FIELD_DECL,
+        CursorKind.VAR_DECL,
+        CursorKind.PARM_DECL,
+    ]:
+        return parent_type_is_function_pointer()
+
+    if parent.kind == CursorKind.TYPEDEF_DECL:
+        return parent_type_is_function() or parent_type_is_function_pointer()
+
+    return False
+
+
+def is_valid_allow_coroutine_fn_call_usage(node: Cursor) -> bool:
+    """
+    Check if an occurrence of `__allow_coroutine_fn_call()` represented by node
+    `node` appears at a valid point in the AST. This is the case if its right
+    operand is a call to:
+
+      - A function declared with the `coroutine_fn` annotation.
+
+    TODO: Ensure that `__allow_coroutine_fn_call()` is in the body of a
+    non-`coroutine_fn` function.
+    """
+
+    [_, call] = node.get_children()
+
+    if call.kind != CursorKind.CALL_EXPR:
+        return False
+
+    return is_coroutine_fn(call.referenced)
+
+
+def is_coroutine_fn(node: Cursor) -> bool:
+    """
+    Check whether the given `node` should be considered to be `coroutine_fn`.
+
+    This assumes valid usage of `coroutine_fn`.
+    """
+
+    while node.kind in [CursorKind.PAREN_EXPR, CursorKind.UNEXPOSED_EXPR]:
+        children = list(node.get_children())
+        if len(children) == 1:
+            node = children[0]
+        else:
+            break
+
+    return node.kind == CursorKind.FUNCTION_DECL and is_annotated_with(
+        node, "coroutine_fn"
+    )
+
+
+def is_annotated_with(node: Cursor, annotation: str) -> bool:
+    return any(is_annotation(c, annotation) for c in node.get_children())
+
+
+def is_annotation(node: Cursor, annotation: str) -> bool:
+    return node.kind == CursorKind.ANNOTATE_ATTR and node.spelling == annotation
+
+
+# A "comma wrapper" is the pattern `((void)string_literal, expr)`. The `expr` is
+# said to be "comma wrapped".
+def is_comma_wrapper(node: Cursor, literal: str) -> bool:
+
+    # TODO: Do we need to check that the operator is `,`? Is there another
+    # operator that can combine void and an expr?
+
+    if node.kind != CursorKind.BINARY_OPERATOR:
+        return False
+
+    [left, _right] = node.get_children()
+
+    if (
+        left.kind != CursorKind.CSTYLE_CAST_EXPR
+        or left.type.kind != TypeKind.VOID
+    ):
+        return False
+
+    [unexposed_expr] = left.get_children()
+
+    if unexposed_expr.kind != CursorKind.UNEXPOSED_EXPR:
+        return False
+
+    [string_literal] = unexposed_expr.get_children()
+
+    return (
+        string_literal.kind == CursorKind.STRING_LITERAL
+        and string_literal.spelling == f'"{literal}"'
+    )
+
+
 # ---------------------------------------------------------------------------- #
 # Utilities handy for development
 
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC 4/8] Fix some direct calls from non-coroutine_fn to coroutine_fn
  2022-07-02 11:33 [RFC 0/8] Introduce an extensible static analyzer Alberto Faria
                   ` (2 preceding siblings ...)
  2022-07-02 11:33 ` [RFC 3/8] static-analyzer: Enforce coroutine_fn restrictions for direct calls Alberto Faria
@ 2022-07-02 11:33 ` Alberto Faria
  2022-07-02 14:13   ` Paolo Bonzini
  2022-07-02 11:33 ` [RFC 5/8] static-analyzer: Enforce coroutine_fn restrictions on function pointers Alberto Faria
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Alberto Faria @ 2022-07-02 11:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven,
	Alberto Faria

These problems were found by static-analyzer.py. Only a few of the
reported cases were fixed.

Signed-off-by: Alberto Faria <afaria@redhat.com>
---
 block/block-backend.c     | 13 ++++++++-----
 block/copy-before-write.c |  3 ++-
 block/dirty-bitmap.c      |  6 ++++--
 block/iscsi.c             |  3 ++-
 block/qcow2.h             | 14 +++++++-------
 block/qed.c               |  6 +++---
 include/block/block-io.h  |  7 ++++---
 7 files changed, 30 insertions(+), 22 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index f425b00793..5f2a912a59 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1537,8 +1537,9 @@ static void blk_aio_read_entry(void *opaque)
     QEMUIOVector *qiov = rwco->iobuf;
 
     assert(qiov->size == acb->bytes);
-    rwco->ret = blk_co_do_preadv(rwco->blk, rwco->offset, acb->bytes,
-                                 qiov, rwco->flags);
+    rwco->ret = __allow_coroutine_fn_call(
+        blk_co_do_preadv(rwco->blk, rwco->offset, acb->bytes, qiov,
+                         rwco->flags));
     blk_aio_complete(acb);
 }
 
@@ -1682,7 +1683,8 @@ static void blk_aio_ioctl_entry(void *opaque)
     BlkAioEmAIOCB *acb = opaque;
     BlkRwCo *rwco = &acb->rwco;
 
-    rwco->ret = blk_co_do_ioctl(rwco->blk, rwco->offset, rwco->iobuf);
+    rwco->ret = __allow_coroutine_fn_call(
+        blk_co_do_ioctl(rwco->blk, rwco->offset, rwco->iobuf));
 
     blk_aio_complete(acb);
 }
@@ -1716,7 +1718,8 @@ static void blk_aio_pdiscard_entry(void *opaque)
     BlkAioEmAIOCB *acb = opaque;
     BlkRwCo *rwco = &acb->rwco;
 
-    rwco->ret = blk_co_do_pdiscard(rwco->blk, rwco->offset, acb->bytes);
+    rwco->ret = __allow_coroutine_fn_call(
+        blk_co_do_pdiscard(rwco->blk, rwco->offset, acb->bytes));
     blk_aio_complete(acb);
 }
 
@@ -1772,7 +1775,7 @@ static void blk_aio_flush_entry(void *opaque)
     BlkAioEmAIOCB *acb = opaque;
     BlkRwCo *rwco = &acb->rwco;
 
-    rwco->ret = blk_co_do_flush(rwco->blk);
+    rwco->ret = __allow_coroutine_fn_call(blk_co_do_flush(rwco->blk));
     blk_aio_complete(acb);
 }
 
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index c24b8dd117..5096abbc08 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -240,7 +240,8 @@ static BlockReq *cbw_snapshot_read_lock(BlockDriverState *bs,
     return req;
 }
 
-static void cbw_snapshot_read_unlock(BlockDriverState *bs, BlockReq *req)
+static coroutine_fn void
+cbw_snapshot_read_unlock(BlockDriverState *bs, BlockReq *req)
 {
     BDRVCopyBeforeWriteState *s = bs->opaque;
 
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index bf3dc0512a..ccf46c0b1f 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -419,7 +419,8 @@ int bdrv_remove_persistent_dirty_bitmap(BlockDriverState *bs, const char *name,
                                         Error **errp)
 {
     if (qemu_in_coroutine()) {
-        return bdrv_co_remove_persistent_dirty_bitmap(bs, name, errp);
+        return __allow_coroutine_fn_call(
+            bdrv_co_remove_persistent_dirty_bitmap(bs, name, errp));
     } else {
         Coroutine *co;
         BdrvRemovePersistentDirtyBitmapCo s = {
@@ -495,7 +496,8 @@ bool bdrv_can_store_new_dirty_bitmap(BlockDriverState *bs, const char *name,
 {
     IO_CODE();
     if (qemu_in_coroutine()) {
-        return bdrv_co_can_store_new_dirty_bitmap(bs, name, granularity, errp);
+        return __allow_coroutine_fn_call(
+            bdrv_co_can_store_new_dirty_bitmap(bs, name, granularity, errp));
     } else {
         Coroutine *co;
         BdrvCanStoreNewDirtyBitmapCo s = {
diff --git a/block/iscsi.c b/block/iscsi.c
index d707d0b354..967438b4bd 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -290,7 +290,8 @@ iscsi_co_generic_cb(struct iscsi_context *iscsi, int status,
     }
 }
 
-static void iscsi_co_init_iscsitask(IscsiLun *iscsilun, struct IscsiTask *iTask)
+static void coroutine_fn
+iscsi_co_init_iscsitask(IscsiLun *iscsilun, struct IscsiTask *iTask)
 {
     *iTask = (struct IscsiTask) {
         .co         = qemu_coroutine_self(),
diff --git a/block/qcow2.h b/block/qcow2.h
index ba436a8d0d..e68d127d8e 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -990,13 +990,13 @@ int qcow2_truncate_bitmaps_check(BlockDriverState *bs, Error **errp);
 bool qcow2_store_persistent_dirty_bitmaps(BlockDriverState *bs,
                                           bool release_stored, Error **errp);
 int qcow2_reopen_bitmaps_ro(BlockDriverState *bs, Error **errp);
-bool qcow2_co_can_store_new_dirty_bitmap(BlockDriverState *bs,
-                                         const char *name,
-                                         uint32_t granularity,
-                                         Error **errp);
-int qcow2_co_remove_persistent_dirty_bitmap(BlockDriverState *bs,
-                                            const char *name,
-                                            Error **errp);
+bool coroutine_fn qcow2_co_can_store_new_dirty_bitmap(BlockDriverState *bs,
+                                                      const char *name,
+                                                      uint32_t granularity,
+                                                      Error **errp);
+int coroutine_fn qcow2_co_remove_persistent_dirty_bitmap(BlockDriverState *bs,
+                                                         const char *name,
+                                                         Error **errp);
 bool qcow2_supports_persistent_dirty_bitmap(BlockDriverState *bs);
 uint64_t qcow2_get_persistent_dirty_bitmap_size(BlockDriverState *bs,
                                                 uint32_t cluster_size);
diff --git a/block/qed.c b/block/qed.c
index f34d9a3ac1..96f4cda83f 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -259,7 +259,7 @@ static CachedL2Table *qed_new_l2_table(BDRVQEDState *s)
     return l2_table;
 }
 
-static bool qed_plug_allocating_write_reqs(BDRVQEDState *s)
+static bool coroutine_fn qed_plug_allocating_write_reqs(BDRVQEDState *s)
 {
     qemu_co_mutex_lock(&s->table_lock);
 
@@ -278,7 +278,7 @@ static bool qed_plug_allocating_write_reqs(BDRVQEDState *s)
     return true;
 }
 
-static void qed_unplug_allocating_write_reqs(BDRVQEDState *s)
+static void coroutine_fn qed_unplug_allocating_write_reqs(BDRVQEDState *s)
 {
     qemu_co_mutex_lock(&s->table_lock);
     assert(s->allocating_write_reqs_plugged);
@@ -568,7 +568,7 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
 
     bdrv_qed_init_state(bs);
     if (qemu_in_coroutine()) {
-        bdrv_qed_open_entry(&qoc);
+        __allow_coroutine_fn_call(bdrv_qed_open_entry(&qoc));
     } else {
         assert(qemu_get_current_aio_context() == qemu_get_aio_context());
         qemu_coroutine_enter(qemu_coroutine_create(bdrv_qed_open_entry, &qoc));
diff --git a/include/block/block-io.h b/include/block/block-io.h
index 053a27141a..cefe3494fe 100644
--- a/include/block/block-io.h
+++ b/include/block/block-io.h
@@ -80,7 +80,8 @@ int bdrv_co_ioctl(BlockDriverState *bs, int req, void *buf);
 /* Ensure contents are flushed to disk.  */
 int coroutine_fn bdrv_co_flush(BlockDriverState *bs);
 
-int bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes);
+int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset,
+                                  int64_t bytes);
 bool bdrv_can_write_zeroes_with_unmap(BlockDriverState *bs);
 int bdrv_block_status(BlockDriverState *bs, int64_t offset,
                       int64_t bytes, int64_t *pnum, int64_t *map,
@@ -88,8 +89,8 @@ int bdrv_block_status(BlockDriverState *bs, int64_t offset,
 int bdrv_block_status_above(BlockDriverState *bs, BlockDriverState *base,
                             int64_t offset, int64_t bytes, int64_t *pnum,
                             int64_t *map, BlockDriverState **file);
-int bdrv_is_allocated(BlockDriverState *bs, int64_t offset, int64_t bytes,
-                      int64_t *pnum);
+int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
+                                   int64_t bytes, int64_t *pnum);
 int bdrv_is_allocated_above(BlockDriverState *top, BlockDriverState *base,
                             bool include_base, int64_t offset, int64_t bytes,
                             int64_t *pnum);
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC 5/8] static-analyzer: Enforce coroutine_fn restrictions on function pointers
  2022-07-02 11:33 [RFC 0/8] Introduce an extensible static analyzer Alberto Faria
                   ` (3 preceding siblings ...)
  2022-07-02 11:33 ` [RFC 4/8] Fix some direct calls from non-coroutine_fn to coroutine_fn Alberto Faria
@ 2022-07-02 11:33 ` Alberto Faria
  2022-07-04 14:16   ` Víctor Colombo
  2022-07-02 11:33 ` [RFC 6/8] Fix some coroutine_fn indirect calls and pointer assignments Alberto Faria
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Alberto Faria @ 2022-07-02 11:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven,
	Alberto Faria

Extend static-analyzer.py to enforce coroutine_fn restrictions on
function pointer operations.

Invalid operations include assigning a coroutine_fn value to a
non-coroutine_fn function pointer, and invoking a coroutine_fn function
pointer from a non-coroutine_fn function.

Signed-off-by: Alberto Faria <afaria@redhat.com>
---
 static-analyzer.py | 147 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 143 insertions(+), 4 deletions(-)

diff --git a/static-analyzer.py b/static-analyzer.py
index 8abc552b82..485d054052 100755
--- a/static-analyzer.py
+++ b/static-analyzer.py
@@ -36,6 +36,8 @@
     TranslationUnit,
     TranslationUnitLoadError,
     TypeKind,
+    SourceRange,
+    TokenGroup,
 )
 
 Cursor.__hash__ = lambda self: self.hash  # so `Cursor`s can be dict keys
@@ -515,6 +517,90 @@ def check_coroutine_calls(
                 log(call, "non-coroutine_fn function calls coroutine_fn")
 
 
+@check("coroutine-pointers")
+def check_coroutine_pointers(
+    translation_unit: TranslationUnit,
+    translation_unit_path: str,
+    log: Callable[[Cursor, str], None],
+) -> None:
+
+    # What we would really like is to associate annotation attributes with types
+    # directly, but that doesn't seem possible. Instead, we have to look at the
+    # relevant variable/field/parameter declarations, and follow typedefs.
+
+    # This doesn't check all possible ways of assigning to a coroutine_fn
+    # field/variable/parameter. That would probably be too much work.
+
+    # TODO: Check struct/union/array initialization.
+    # TODO: Check assignment to struct/union/array fields.
+
+    for [*_, node] in all_paths(translation_unit.cursor):
+
+        # check initialization of variables using coroutine_fn values
+
+        if node.kind == CursorKind.VAR_DECL:
+
+            children = [
+                c
+                for c in node.get_children()
+                if c.kind
+                not in [
+                    CursorKind.ANNOTATE_ATTR,
+                    CursorKind.INIT_LIST_EXPR,
+                    CursorKind.TYPE_REF,
+                    CursorKind.UNEXPOSED_ATTR,
+                ]
+            ]
+
+            if (
+                len(children) == 1
+                and not is_coroutine_fn(node)
+                and is_coroutine_fn(children[0])
+            ):
+                log(node, "assigning coroutine_fn to non-coroutine_fn")
+
+        # check initialization of fields using coroutine_fn values
+
+        # TODO: This only checks designator initializers.
+
+        if node.kind == CursorKind.INIT_LIST_EXPR:
+
+            for initializer in filter(
+                lambda n: n.kind == CursorKind.UNEXPOSED_EXPR,
+                node.get_children(),
+            ):
+
+                children = list(initializer.get_children())
+
+                if (
+                    len(children) == 2
+                    and children[0].kind == CursorKind.MEMBER_REF
+                    and not is_coroutine_fn(children[0].referenced)
+                    and is_coroutine_fn(children[1])
+                ):
+                    log(
+                        initializer,
+                        "assigning coroutine_fn to non-coroutine_fn",
+                    )
+
+        # check assignments of coroutine_fn values to variables or fields
+
+        if is_binary_operator(node, "="):
+
+            [left, right] = node.get_children()
+
+            if (
+                left.kind
+                in [
+                    CursorKind.DECL_REF_EXPR,
+                    CursorKind.MEMBER_REF_EXPR,
+                ]
+                and not is_coroutine_fn(left.referenced)
+                and is_coroutine_fn(right)
+            ):
+                log(node, "assigning coroutine_fn to non-coroutine_fn")
+
+
 # ---------------------------------------------------------------------------- #
 # Traversal
 
@@ -555,6 +641,31 @@ def subtrees_matching(
 # Node predicates
 
 
+def is_binary_operator(node: Cursor, operator: str) -> bool:
+    return (
+        node.kind == CursorKind.BINARY_OPERATOR
+        and get_binary_operator_spelling(node) == operator
+    )
+
+
+def get_binary_operator_spelling(node: Cursor) -> Optional[str]:
+
+    assert node.kind == CursorKind.BINARY_OPERATOR
+
+    [left, right] = node.get_children()
+
+    op_range = SourceRange.from_locations(left.extent.end, right.extent.start)
+
+    tokens = list(TokenGroup.get_tokens(node.translation_unit, op_range))
+    if not tokens:
+        # Can occur when left and right children extents overlap due to
+        # misparsing.
+        return None
+
+    [op_token, *_] = tokens
+    return op_token.spelling
+
+
 def is_valid_coroutine_fn_usage(parent: Cursor) -> bool:
     """
     Check if an occurrence of `coroutine_fn` represented by a node with parent
@@ -599,7 +710,13 @@ def is_valid_allow_coroutine_fn_call_usage(node: Cursor) -> bool:
     `node` appears at a valid point in the AST. This is the case if its right
     operand is a call to:
 
-      - A function declared with the `coroutine_fn` annotation.
+      - A function declared with the `coroutine_fn` annotation, OR
+      - A field/variable/parameter whose declaration has the `coroutine_fn`
+        annotation, and of function pointer type, OR
+      - [TODO] A field/variable/parameter of a typedef function pointer type,
+        and the typedef has the `coroutine_fn` annotation, OR
+      - [TODO] A field/variable/parameter of a pointer to typedef function type,
+        and the typedef has the `coroutine_fn` annotation.
 
     TODO: Ensure that `__allow_coroutine_fn_call()` is in the body of a
     non-`coroutine_fn` function.
@@ -627,9 +744,31 @@ def is_coroutine_fn(node: Cursor) -> bool:
         else:
             break
 
-    return node.kind == CursorKind.FUNCTION_DECL and is_annotated_with(
-        node, "coroutine_fn"
-    )
+    if node.kind in [CursorKind.DECL_REF_EXPR, CursorKind.MEMBER_REF_EXPR]:
+        node = node.referenced
+
+    # ---
+
+    if node.kind == CursorKind.FUNCTION_DECL:
+        return is_annotated_with(node, "coroutine_fn")
+
+    if node.kind in [
+        CursorKind.FIELD_DECL,
+        CursorKind.VAR_DECL,
+        CursorKind.PARM_DECL,
+    ]:
+
+        if is_annotated_with(node, "coroutine_fn"):
+            return True
+
+        # TODO: If type is typedef or pointer to typedef, follow typedef.
+
+        return False
+
+    if node.kind == CursorKind.TYPEDEF_DECL:
+        return is_annotated_with(node, "coroutine_fn")
+
+    return False
 
 
 def is_annotated_with(node: Cursor, annotation: str) -> bool:
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC 6/8] Fix some coroutine_fn indirect calls and pointer assignments
  2022-07-02 11:33 [RFC 0/8] Introduce an extensible static analyzer Alberto Faria
                   ` (4 preceding siblings ...)
  2022-07-02 11:33 ` [RFC 5/8] static-analyzer: Enforce coroutine_fn restrictions on function pointers Alberto Faria
@ 2022-07-02 11:33 ` Alberto Faria
  2022-07-02 11:33 ` [RFC 7/8] block: Add no_coroutine_fn marker Alberto Faria
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 27+ messages in thread
From: Alberto Faria @ 2022-07-02 11:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven,
	Alberto Faria

These problems were found by static-analyzer.py. Only a few of the
reported cases were fixed.

Signed-off-by: Alberto Faria <afaria@redhat.com>
---
 include/block/block_int-common.h | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 8947abab76..16c45d1262 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -731,13 +731,11 @@ struct BlockDriver {
     void coroutine_fn (*bdrv_co_drain_end)(BlockDriverState *bs);
 
     bool (*bdrv_supports_persistent_dirty_bitmap)(BlockDriverState *bs);
-    bool (*bdrv_co_can_store_new_dirty_bitmap)(BlockDriverState *bs,
-                                               const char *name,
-                                               uint32_t granularity,
-                                               Error **errp);
-    int (*bdrv_co_remove_persistent_dirty_bitmap)(BlockDriverState *bs,
-                                                  const char *name,
-                                                  Error **errp);
+    bool coroutine_fn (*bdrv_co_can_store_new_dirty_bitmap)(
+        BlockDriverState *bs, const char *name, uint32_t granularity,
+        Error **errp);
+    int coroutine_fn (*bdrv_co_remove_persistent_dirty_bitmap)(
+        BlockDriverState *bs, const char *name, Error **errp);
 };
 
 static inline bool block_driver_can_compress(BlockDriver *drv)
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC 7/8] block: Add no_coroutine_fn marker
  2022-07-02 11:33 [RFC 0/8] Introduce an extensible static analyzer Alberto Faria
                   ` (5 preceding siblings ...)
  2022-07-02 11:33 ` [RFC 6/8] Fix some coroutine_fn indirect calls and pointer assignments Alberto Faria
@ 2022-07-02 11:33 ` Alberto Faria
  2022-07-02 11:33 ` [RFC 8/8] Avoid calls from coroutine_fn to no_coroutine_fn Alberto Faria
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 27+ messages in thread
From: Alberto Faria @ 2022-07-02 11:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven,
	Alberto Faria

When applied to a function, it advertises that it should not be called
from coroutine_fn functions.

Make generated_co_wrapper evaluate to no_coroutine_fn, as coroutine_fn
functions should instead directly call the coroutine_fn that backs the
generated_co_wrapper.

Extend static-analyzer.py's "coroutine-annotation-validity" and
"coroutine-calls" checks to enforce no_coroutine_fn rules.

Signed-off-by: Alberto Faria <afaria@redhat.com>
---
 include/block/block-common.h |  2 +-
 include/qemu/coroutine.h     | 12 ++++++++++++
 static-analyzer.py           | 35 +++++++++++++++++++++++++++++++++++
 3 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/include/block/block-common.h b/include/block/block-common.h
index fdb7306e78..4b60c8c6f2 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -42,7 +42,7 @@
  *
  * Read more in docs/devel/block-coroutine-wrapper.rst
  */
-#define generated_co_wrapper
+#define generated_co_wrapper no_coroutine_fn
 
 /* block.c */
 typedef struct BlockDriver BlockDriver;
diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 40a4037525..11eea8cc79 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -48,6 +48,18 @@
 #define coroutine_fn
 #endif
 
+/**
+ * Mark a function that should never be called from a coroutine context
+ *
+ * This typically means that there is an analogous, coroutine_fn function that
+ * should be used instead.
+ */
+#ifdef __clang__
+#define no_coroutine_fn __attribute__((__annotate__("no_coroutine_fn")))
+#else
+#define no_coroutine_fn
+#endif
+
 /**
  * This can wrap a call to a coroutine_fn from a non-coroutine_fn function and
  * suppress the static analyzer's complaints.
diff --git a/static-analyzer.py b/static-analyzer.py
index 485d054052..ad7ee1ccf5 100755
--- a/static-analyzer.py
+++ b/static-analyzer.py
@@ -459,6 +459,12 @@ def check_coroutine_annotation_validity(
         ):
             log(node, "invalid coroutine_fn usage")
 
+        if is_annotation(node, "no_coroutine_fn") and (
+            ancestors[-1] is None
+            or not is_valid_no_coroutine_fn_usage(ancestors[-1])
+        ):
+            log(node, "invalid no_coroutine_fn usage")
+
         if is_comma_wrapper(
             node, "__allow_coroutine_fn_call"
         ) and not is_valid_allow_coroutine_fn_call_usage(node):
@@ -478,6 +484,9 @@ def log_annotation_disagreement(annotation: str) -> None:
             if is_coroutine_fn(node) != is_coroutine_fn(node.canonical):
                 log_annotation_disagreement("coroutine_fn")
 
+            if is_no_coroutine_fn(node) != is_no_coroutine_fn(node.canonical):
+                log_annotation_disagreement("no_coroutine_fn")
+
 
 @check("coroutine-calls")
 def check_coroutine_calls(
@@ -516,6 +525,11 @@ def check_coroutine_calls(
             ):
                 log(call, "non-coroutine_fn function calls coroutine_fn")
 
+            # reject calls from coroutine_fn to no_coroutine_fn
+
+            if caller_is_coroutine and is_no_coroutine_fn(callee):
+                log(call, f"coroutine_fn calls no_coroutine_fn function")
+
 
 @check("coroutine-pointers")
 def check_coroutine_pointers(
@@ -704,6 +718,16 @@ def parent_type_is_function_pointer() -> bool:
     return False
 
 
+def is_valid_no_coroutine_fn_usage(parent: Cursor) -> bool:
+    """
+    Checks if an occurrence of `no_coroutine_fn` represented by a node with
+    parent `parent` appears at a valid point in the AST. This is the case if the
+    parent is a function declaration/definition.
+    """
+
+    return parent.kind == CursorKind.FUNCTION_DECL
+
+
 def is_valid_allow_coroutine_fn_call_usage(node: Cursor) -> bool:
     """
     Check if an occurrence of `__allow_coroutine_fn_call()` represented by node
@@ -771,6 +795,17 @@ def is_coroutine_fn(node: Cursor) -> bool:
     return False
 
 
+def is_no_coroutine_fn(node: Cursor) -> bool:
+    """
+    Checks whether the given `node` should be considered to be
+    `no_coroutine_fn`.
+
+    This assumes valid usage of `no_coroutine_fn`.
+    """
+
+    return is_annotated_with(node, "no_coroutine_fn")
+
+
 def is_annotated_with(node: Cursor, annotation: str) -> bool:
     return any(is_annotation(c, annotation) for c in node.get_children())
 
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC 8/8] Avoid calls from coroutine_fn to no_coroutine_fn
  2022-07-02 11:33 [RFC 0/8] Introduce an extensible static analyzer Alberto Faria
                   ` (6 preceding siblings ...)
  2022-07-02 11:33 ` [RFC 7/8] block: Add no_coroutine_fn marker Alberto Faria
@ 2022-07-02 11:33 ` Alberto Faria
  2022-07-02 14:17 ` [RFC 0/8] Introduce an extensible static analyzer Paolo Bonzini
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 27+ messages in thread
From: Alberto Faria @ 2022-07-02 11:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven,
	Alberto Faria

These calls were found by static-analyzer.py.

Signed-off-by: Alberto Faria <afaria@redhat.com>
---
 block/block-backend.c  |  2 +-
 block/io.c             | 10 +++++-----
 block/parallels.c      |  4 ++--
 block/qcow2-refcount.c |  2 +-
 block/qed-table.c      |  2 +-
 block/qed.c            |  2 +-
 block/vmdk.c           |  4 ++--
 7 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 5f2a912a59..8fa48576cd 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1395,7 +1395,7 @@ static int coroutine_fn blk_pwritev_part(BlockBackend *blk, int64_t offset,
     int ret;
 
     blk_inc_in_flight(blk);
-    ret = blk_do_pwritev_part(blk, offset, bytes, qiov, qiov_offset, flags);
+    ret = blk_co_do_pwritev_part(blk, offset, bytes, qiov, qiov_offset, flags);
     blk_dec_in_flight(blk);
 
     return ret;
diff --git a/block/io.c b/block/io.c
index bbfe94503b..832bccd31e 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2781,8 +2781,8 @@ int coroutine_fn bdrv_co_is_zero_fast(BlockDriverState *bs, int64_t offset,
         return 1;
     }
 
-    ret = bdrv_common_block_status_above(bs, NULL, false, false, offset,
-                                         bytes, &pnum, NULL, NULL, NULL);
+    ret = bdrv_co_common_block_status_above(bs, NULL, false, false, offset,
+                                            bytes, &pnum, NULL, NULL, NULL);
 
     if (ret < 0) {
         return ret;
@@ -2798,9 +2798,9 @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
     int64_t dummy;
     IO_CODE();
 
-    ret = bdrv_common_block_status_above(bs, bs, true, false, offset,
-                                         bytes, pnum ? pnum : &dummy, NULL,
-                                         NULL, NULL);
+    ret = bdrv_co_common_block_status_above(bs, bs, true, false, offset,
+                                            bytes, pnum ? pnum : &dummy, NULL,
+                                            NULL, NULL);
     if (ret < 0) {
         return ret;
     }
diff --git a/block/parallels.c b/block/parallels.c
index 8879b7027a..22c59a1ee2 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -503,8 +503,8 @@ static int coroutine_fn parallels_co_check(BlockDriverState *bs,
              * In order to really repair the image, we must shrink it.
              * That means we have to pass exact=true.
              */
-            ret = bdrv_truncate(bs->file, res->image_end_offset, true,
-                                PREALLOC_MODE_OFF, 0, &local_err);
+            ret = bdrv_co_truncate(bs->file, res->image_end_offset, true,
+                                   PREALLOC_MODE_OFF, 0, &local_err);
             if (ret < 0) {
                 error_report_err(local_err);
                 res->check_errors++;
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index ed0ecfaa89..e30fd38e14 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1233,7 +1233,7 @@ int coroutine_fn qcow2_flush_caches(BlockDriverState *bs)
         return ret;
     }
 
-    return bdrv_flush(bs->file->bs);
+    return bdrv_co_flush(bs->file->bs);
 }
 
 /*********************************************************/
diff --git a/block/qed-table.c b/block/qed-table.c
index 1cc844b1a5..aa203f2627 100644
--- a/block/qed-table.c
+++ b/block/qed-table.c
@@ -100,7 +100,7 @@ static int coroutine_fn qed_write_table(BDRVQEDState *s, uint64_t offset,
     }
 
     if (flush) {
-        ret = bdrv_flush(s->bs);
+        ret = bdrv_co_flush(s->bs);
         if (ret < 0) {
             goto out;
         }
diff --git a/block/qed.c b/block/qed.c
index 96f4cda83f..dc9f065c02 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -490,7 +490,7 @@ static int coroutine_fn bdrv_qed_do_open(BlockDriverState *bs, QDict *options,
         }
 
         /* From here on only known autoclear feature bits are valid */
-        bdrv_flush(bs->file->bs);
+        bdrv_co_flush(bs->file->bs);
     }
 
     s->l1_table = qed_alloc_table(s);
diff --git a/block/vmdk.c b/block/vmdk.c
index 38e5ab3806..5c94a2f27c 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -2150,8 +2150,8 @@ vmdk_co_pwritev_compressed(BlockDriverState *bs, int64_t offset, int64_t bytes,
                 return length;
             }
             length = QEMU_ALIGN_UP(length, BDRV_SECTOR_SIZE);
-            ret = bdrv_truncate(s->extents[i].file, length, false,
-                                PREALLOC_MODE_OFF, 0, NULL);
+            ret = bdrv_co_truncate(s->extents[i].file, length, false,
+                                   PREALLOC_MODE_OFF, 0, NULL);
             if (ret < 0) {
                 return ret;
             }
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [RFC 4/8] Fix some direct calls from non-coroutine_fn to coroutine_fn
  2022-07-02 11:33 ` [RFC 4/8] Fix some direct calls from non-coroutine_fn to coroutine_fn Alberto Faria
@ 2022-07-02 14:13   ` Paolo Bonzini
  2022-07-03 22:20     ` Alberto Faria
  0 siblings, 1 reply; 27+ messages in thread
From: Paolo Bonzini @ 2022-07-02 14:13 UTC (permalink / raw)
  To: Alberto Faria, qemu-devel
  Cc: qemu-block, Denis V. Lunev, Emanuele Giuseppe Esposito,
	Stefan Hajnoczi, Ronnie Sahlberg, Hanna Reitz,
	Stefano Garzarella, Kevin Wolf, Peter Xu, Alberto Garcia,
	John Snow, Eric Blake, Fam Zheng, Markus Armbruster,
	Vladimir Sementsov-Ogievskiy, Peter Lieven

On 7/2/22 13:33, Alberto Faria wrote:
> @@ -1537,8 +1537,9 @@ static void blk_aio_read_entry(void *opaque)
>       QEMUIOVector *qiov = rwco->iobuf;
>   
>       assert(qiov->size == acb->bytes);
> -    rwco->ret = blk_co_do_preadv(rwco->blk, rwco->offset, acb->bytes,
> -                                 qiov, rwco->flags);
> +    rwco->ret = __allow_coroutine_fn_call(
> +        blk_co_do_preadv(rwco->blk, rwco->offset, acb->bytes, qiov,
> +                         rwco->flags));
>       blk_aio_complete(acb);
>   }
>   
> @@ -1682,7 +1683,8 @@ static void blk_aio_ioctl_entry(void *opaque)
>       BlkAioEmAIOCB *acb = opaque;
>       BlkRwCo *rwco = &acb->rwco;
>   
> -    rwco->ret = blk_co_do_ioctl(rwco->blk, rwco->offset, rwco->iobuf);
> +    rwco->ret = __allow_coroutine_fn_call(
> +        blk_co_do_ioctl(rwco->blk, rwco->offset, rwco->iobuf));
>   
>       blk_aio_complete(acb);
>   }
> @@ -1716,7 +1718,8 @@ static void blk_aio_pdiscard_entry(void *opaque)
>       BlkAioEmAIOCB *acb = opaque;
>       BlkRwCo *rwco = &acb->rwco;
>   
> -    rwco->ret = blk_co_do_pdiscard(rwco->blk, rwco->offset, acb->bytes);
> +    rwco->ret = __allow_coroutine_fn_call(
> +        blk_co_do_pdiscard(rwco->blk, rwco->offset, acb->bytes));
>       blk_aio_complete(acb);
>   }
>   
> @@ -1772,7 +1775,7 @@ static void blk_aio_flush_entry(void *opaque)
>       BlkAioEmAIOCB *acb = opaque;
>       BlkRwCo *rwco = &acb->rwco;
>   
> -    rwco->ret = blk_co_do_flush(rwco->blk);
> +    rwco->ret = __allow_coroutine_fn_call(blk_co_do_flush(rwco->blk));
>       blk_aio_complete(acb);
>   }
>   

These functions should be coroutine_fn (all coroutine entry points 
should be).

Paolo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 0/8] Introduce an extensible static analyzer
  2022-07-02 11:33 [RFC 0/8] Introduce an extensible static analyzer Alberto Faria
                   ` (7 preceding siblings ...)
  2022-07-02 11:33 ` [RFC 8/8] Avoid calls from coroutine_fn to no_coroutine_fn Alberto Faria
@ 2022-07-02 14:17 ` Paolo Bonzini
  2022-07-04 16:28 ` Daniel P. Berrangé
  2022-07-05 16:13 ` Daniel P. Berrangé
  10 siblings, 0 replies; 27+ messages in thread
From: Paolo Bonzini @ 2022-07-02 14:17 UTC (permalink / raw)
  To: Alberto Faria, qemu-devel
  Cc: qemu-block, Denis V. Lunev, Emanuele Giuseppe Esposito,
	Stefan Hajnoczi, Ronnie Sahlberg, Hanna Reitz,
	Stefano Garzarella, Kevin Wolf, Peter Xu, Alberto Garcia,
	John Snow, Eric Blake, Fam Zheng, Markus Armbruster,
	Vladimir Sementsov-Ogievskiy, Peter Lieven

On 7/2/22 13:33, Alberto Faria wrote:
> The current primary motivation for this work is enforcing rules around
> block layer coroutines, which is why most of the series focuses on that.
> However, the static analyzer is intended to be sufficiently generic to
> satisfy other present and future QEMU static analysis needs.
> 
> This is very early work-in-progress, and a lot is missing. One notable
> omission is build system integration, including keeping track of which
> translation units have been modified and need re-analyzing.
> 
> Performance is bad, but there is a lot of potential for optimization,
> such as avoiding redundant AST traversals. Switching to C libclang is
> also a possibility, although Python makes it easy to quickly prototype
> new checks, which should encourage adoption and contributions.
> 
> The script takes a path to the build directory, and any number of paths
> to directories or files to analyze. Example run on a 12-thread laptop:

Thanks, this is very useful!  Unfortunately there's quite a lot of fixes 
to go in (including your generated_co_wrapper cleanup series and 
mine[1]) before this can be enabled, but I think this is the way to go 
to 1) ease maintainability of coroutine code 2) move towards a model 
where there are no mixed coroutine/non-coroutine functions.

I'll review it when I'm back, since a phone screen is not the best 
environment to do that. :)

Paolo

[1] https://patchew.org/QEMU/20220509103019.215041-1-pbonzini@redhat.com/


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 4/8] Fix some direct calls from non-coroutine_fn to coroutine_fn
  2022-07-02 14:13   ` Paolo Bonzini
@ 2022-07-03 22:20     ` Alberto Faria
  0 siblings, 0 replies; 27+ messages in thread
From: Alberto Faria @ 2022-07-03 22:20 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven

On Sat, Jul 2, 2022 at 3:13 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> These functions should be coroutine_fn (all coroutine entry points
> should be).

Thanks, I see now that you fixed this in [1].

Alberto

[1] https://patchew.org/QEMU/20220509103019.215041-1-pbonzini@redhat.com/



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 5/8] static-analyzer: Enforce coroutine_fn restrictions on function pointers
  2022-07-02 11:33 ` [RFC 5/8] static-analyzer: Enforce coroutine_fn restrictions on function pointers Alberto Faria
@ 2022-07-04 14:16   ` Víctor Colombo
  2022-07-04 16:57     ` Alberto Faria
  0 siblings, 1 reply; 27+ messages in thread
From: Víctor Colombo @ 2022-07-04 14:16 UTC (permalink / raw)
  To: Alberto Faria, qemu-devel
  Cc: Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven

On 02/07/2022 08:33, Alberto Faria wrote:

Alberto, hello. I was testing this patch as follows:

./static-analyzer.py build target/ppc/mmu-hash64.c

<snip>

> @@ -627,9 +744,31 @@ def is_coroutine_fn(node: Cursor) -> bool:
>           else:
>               break
> 
> -    return node.kind == CursorKind.FUNCTION_DECL and is_annotated_with(
> -        node, "coroutine_fn"
> -    )
> +    if node.kind in [CursorKind.DECL_REF_EXPR, CursorKind.MEMBER_REF_EXPR]:
> +        node = node.referenced
> +
> +    # ---
> +
> +    if node.kind == CursorKind.FUNCTION_DECL:


And I receive an exception on the line above saying that node is of type
NoneType. Seems that `node = node.referenced` is setting `node` to None
in this case.

I was unable to understand the root cause of it. Is this an incorrect
usage of the tool from my part? Full error message below

Traceback (most recent call last):
   File "./static-analyzer.py", line 327, in analyze_translation_unit
     checker(tu, context.absolute_path, log)
   File "./static-analyzer.py", line 613, in check_coroutine_pointers
     and is_coroutine_fn(right)
   File "./static-analyzer.py", line 781, in is_coroutine_fn
     if node.kind == CursorKind.FUNCTION_DECL:
AttributeError: 'NoneType' object has no attribute 'kind'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
   File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
     result = (True, func(*args, **kwds))
   File "/usr/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
     return list(map(*args))
   File "./static-analyzer.py", line 329, in analyze_translation_unit
     raise RuntimeError(f"Error analyzing {relative_path}") from e
RuntimeError: Error analyzing target/ppc/mmu-hash64.c
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
   File "./static-analyzer.py", line 893, in <module>
     main()
   File "./static-analyzer.py", line 123, in main
     analyze_translation_units(args, contexts)
   File "./static-analyzer.py", line 240, in analyze_translation_units
     results = pool.map(analyze_translation_unit, contexts)
   File "/usr/lib/python3.8/multiprocessing/pool.py", line 364, in map
     return self._map_async(func, iterable, mapstar, chunksize).get()
   File "/usr/lib/python3.8/multiprocessing/pool.py", line 771, in get
     raise self._value
RuntimeError: Error analyzing target/ppc/mmu-hash64.c

 > +        return is_annotated_with(node, "coroutine_fn")
> +
> +    if node.kind in [
> +        CursorKind.FIELD_DECL,
> +        CursorKind.VAR_DECL,
> +        CursorKind.PARM_DECL,
> +    ]:
> +
> +        if is_annotated_with(node, "coroutine_fn"):
> +            return True
> +
> +        # TODO: If type is typedef or pointer to typedef, follow typedef.
> +
> +        return False
> +
> +    if node.kind == CursorKind.TYPEDEF_DECL:
> +        return is_annotated_with(node, "coroutine_fn")
> +
> +    return False
Best regards,

-- 
Víctor Cora Colombo
Instituto de Pesquisas ELDORADO
Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 0/8] Introduce an extensible static analyzer
  2022-07-02 11:33 [RFC 0/8] Introduce an extensible static analyzer Alberto Faria
                   ` (8 preceding siblings ...)
  2022-07-02 14:17 ` [RFC 0/8] Introduce an extensible static analyzer Paolo Bonzini
@ 2022-07-04 16:28 ` Daniel P. Berrangé
  2022-07-04 19:30   ` Alberto Faria
  2022-07-05 16:13 ` Daniel P. Berrangé
  10 siblings, 1 reply; 27+ messages in thread
From: Daniel P. Berrangé @ 2022-07-04 16:28 UTC (permalink / raw)
  To: Alberto Faria
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven

On Sat, Jul 02, 2022 at 12:33:23PM +0100, Alberto Faria wrote:
> This series introduces a static analyzer for QEMU. It consists of a
> single static-analyzer.py script that relies on libclang's Python
> bindings, and provides a common framework on which arbitrary static
> analysis checks can be developed and run against QEMU's code base.
> 
> Summary of the series:
> 
>   - Patch 1 adds the base static analyzer, along with a simple check
>     that finds static functions whose return value is never used, and
>     patch 2 fixes some occurrences of this.
> 
>   - Patch 3 adds a check to ensure that non-coroutine_fn functions don't
>     perform direct calls to coroutine_fn functions, and patch 4 fixes
>     some violations of this rule.
> 
>   - Patch 5 adds a check to enforce coroutine_fn restrictions on
>     function pointers, namely around assignment and indirect calls, and
>     patch 6 fixes some problems it detects. (Implementing this check
>     properly is complicated, since AFAICT annotation attributes cannot
>     be applied directly to types. This part still needs a lot of work.)
> 
>   - Patch 7 introduces a no_coroutine_fn marker for functions that
>     should not be called from coroutines, makes generated_co_wrapper
>     evaluate to no_coroutine_fn, and adds a check enforcing this rule.
>     Patch 8 fixes some violations that it finds.
> 
> The current primary motivation for this work is enforcing rules around
> block layer coroutines, which is why most of the series focuses on that.
> However, the static analyzer is intended to be sufficiently generic to
> satisfy other present and future QEMU static analysis needs.
> 
> This is very early work-in-progress, and a lot is missing. One notable
> omission is build system integration, including keeping track of which
> translation units have been modified and need re-analyzing.
> 
> Performance is bad, but there is a lot of potential for optimization,
> such as avoiding redundant AST traversals. Switching to C libclang is
> also a possibility, although Python makes it easy to quickly prototype
> new checks, which should encourage adoption and contributions.

Have you done any measurement see how much of the overhead is from
the checks you implemented, vs how much is inherantly forced on us
by libclang ? ie what does it look like if you just load the libclang
framework and run it actross all source files, without doing any
checks in python.

If i run 'clang-tidy' across the entire source tree, it takes 3 minutes
on my machine, but there's overhead of repeatedly starting the process
in there.

I think anything that actually fully parses the source files is going
to have a decent sized unavoidable overhead, when run across the whole
source tree.

Still having a properly parsed abstract source tree is an inherantly
useful thing. for certain types of static analysis check. Some things
get unreliable real fast if you try to anlyse using regex matches and
similar approaches that are the common alternative. So libclang is
understandably appealing in this respect.

> The script takes a path to the build directory, and any number of paths
> to directories or files to analyze. Example run on a 12-thread laptop:
> 
>     $ time ./static-analyzer.py build block
>     block/commit.c:525:15: non-coroutine_fn function calls coroutine_fn
>     block/nbd.c:206:5: non-coroutine_fn function calls coroutine_fn
>     [...]
>     block/ssh.c:1167:13: non-coroutine_fn function calls coroutine_fn
>     block/nfs.c:229:27: non-coroutine_fn function calls coroutine_fn
>     Analyzed 79 translation units.
> 
>     real    0m45.277s
>     user    7m55.496s
>     sys     0m1.445s

Ouch, that is incredibly slow :-(

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 5/8] static-analyzer: Enforce coroutine_fn restrictions on function pointers
  2022-07-04 14:16   ` Víctor Colombo
@ 2022-07-04 16:57     ` Alberto Faria
  2022-07-04 17:46       ` Víctor Colombo
  0 siblings, 1 reply; 27+ messages in thread
From: Alberto Faria @ 2022-07-04 16:57 UTC (permalink / raw)
  To: Víctor Colombo
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven

Hi Víctor,

On Mon, Jul 4, 2022 at 3:18 PM Víctor Colombo
<victor.colombo@eldorado.org.br> wrote:
> And I receive an exception on the line above saying that node is of type
> NoneType. Seems that `node = node.referenced` is setting `node` to None
> in this case.
>
> I was unable to understand the root cause of it. Is this an incorrect
> usage of the tool from my part? Full error message below

Unfortunately there seem to be a lot of corner cases that libclang can
throw at us. I hadn't come across this one before. I expected that
DECL_REF_EXPR/MEMBER_REF_EXPR would always reference something.

This may be due to some build error -- libclang tries to continue
processing a translation unit by dropping subtrees or nodes that have
problems. Is there a "too many errors emitted, stopping now; this may
lead to false positives and negatives" line at the top of the script's
output?

Thanks,
Alberto



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 5/8] static-analyzer: Enforce coroutine_fn restrictions on function pointers
  2022-07-04 16:57     ` Alberto Faria
@ 2022-07-04 17:46       ` Víctor Colombo
  2022-07-04 18:04         ` Alberto Faria
  0 siblings, 1 reply; 27+ messages in thread
From: Víctor Colombo @ 2022-07-04 17:46 UTC (permalink / raw)
  To: Alberto Faria
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven

On 04/07/2022 13:57, Alberto Faria wrote:
> Hi Víctor,
> 
> On Mon, Jul 4, 2022 at 3:18 PM Víctor Colombo
> <victor.colombo@eldorado.org.br> wrote:
>> And I receive an exception on the line above saying that node is of type
>> NoneType. Seems that `node = node.referenced` is setting `node` to None
>> in this case.
>>
>> I was unable to understand the root cause of it. Is this an incorrect
>> usage of the tool from my part? Full error message below
> 
> Unfortunately there seem to be a lot of corner cases that libclang can
> throw at us. I hadn't come across this one before. I expected that
> DECL_REF_EXPR/MEMBER_REF_EXPR would always reference something.
> 
> This may be due to some build error -- libclang tries to continue
> processing a translation unit by dropping subtrees or nodes that have
> problems. Is there a "too many errors emitted, stopping now; this may
> lead to false positives and negatives" line at the top of the script's
> output?
> 

Yes, this line is present at the beginning of the output
Is this caused by problems with the code being analyzed or is it because
libclang is getting confused with something that is outside of our
control?

-- 
Víctor Cora Colombo
Instituto de Pesquisas ELDORADO
Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 5/8] static-analyzer: Enforce coroutine_fn restrictions on function pointers
  2022-07-04 17:46       ` Víctor Colombo
@ 2022-07-04 18:04         ` Alberto Faria
  2022-07-04 19:06           ` Víctor Colombo
  0 siblings, 1 reply; 27+ messages in thread
From: Alberto Faria @ 2022-07-04 18:04 UTC (permalink / raw)
  To: Víctor Colombo
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven

On Mon, Jul 4, 2022 at 6:46 PM Víctor Colombo
<victor.colombo@eldorado.org.br> wrote:
> Yes, this line is present at the beginning of the output
> Is this caused by problems with the code being analyzed or is it because
> libclang is getting confused with something that is outside of our
> control?

I think I found the problem: the commands in the compilation database
weren't being parsed properly. I switched to shlex.split() and it
seems to be working now. The WIP v2 is available at [1], if you want
to give it a try.

Thanks for reporting this!

Alberto

[1] https://gitlab.com/albertofaria/qemu/-/tree/static-analysis



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 5/8] static-analyzer: Enforce coroutine_fn restrictions on function pointers
  2022-07-04 18:04         ` Alberto Faria
@ 2022-07-04 19:06           ` Víctor Colombo
  0 siblings, 0 replies; 27+ messages in thread
From: Víctor Colombo @ 2022-07-04 19:06 UTC (permalink / raw)
  To: Alberto Faria
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven

On 04/07/2022 15:04, Alberto Faria wrote:
> On Mon, Jul 4, 2022 at 6:46 PM Víctor Colombo
> <victor.colombo@eldorado.org.br> wrote:
>> Yes, this line is present at the beginning of the output
>> Is this caused by problems with the code being analyzed or is it because
>> libclang is getting confused with something that is outside of our
>> control?
> 
> I think I found the problem: the commands in the compilation database
> weren't being parsed properly. I switched to shlex.split() and it
> seems to be working now. The WIP v2 is available at [1], if you want
> to give it a try.
> 
> Thanks for reporting this!
> 
> Alberto
> 
> [1] https://gitlab.com/albertofaria/qemu/-/tree/static-analysis
> 

I tested the version from the WIP v2 and seems to be working now.
Thanks!

-- 
Víctor Cora Colombo
Instituto de Pesquisas ELDORADO
Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 0/8] Introduce an extensible static analyzer
  2022-07-04 16:28 ` Daniel P. Berrangé
@ 2022-07-04 19:30   ` Alberto Faria
  2022-07-05  7:16     ` Daniel P. Berrangé
  0 siblings, 1 reply; 27+ messages in thread
From: Alberto Faria @ 2022-07-04 19:30 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven

On Mon, Jul 4, 2022 at 5:28 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> Have you done any measurement see how much of the overhead is from
> the checks you implemented, vs how much is inherantly forced on us
> by libclang ? ie what does it look like if you just load the libclang
> framework and run it actross all source files, without doing any
> checks in python.

Running the script with all checks disabled, i.e., doing nothing after
TranslationUnit.from_source():

    $ time ./static-analyzer.py build
    [...]
    Analyzed 8775 translation units in 274.0 seconds.

    real    4m34.537s
    user    49m32.555s
    sys     1m18.731s

    $ time ./static-analyzer.py build block util
    Analyzed 162 translation units in 4.2 seconds.

    real    0m4.804s
    user    0m40.389s
    sys     0m1.690s

This is still with 12 threads on a 12-hardware thread laptop, but
scalability is near perfect. (The time reported by the script doesn't
include loading and inspection of the compilation database.)

So, not great. What's more, TranslationUnit.from_source() delegates
all work to clang_parseTranslationUnit(), so I suspect C libclang
wouldn't do much better.

And with all checks enabled:

    $ time ./static-analyzer.py build block util
    [...]
    Analyzed 162 translation units in 86.4 seconds.

    real    1m26.999s
    user    14m51.163s
    sys     0m2.205s

Yikes. Also not great at all, although the current implementation does
many inefficient things, like redundant AST traversals. Cutting
through some of clang.cindex's abstractions should also help, e.g.,
using libclang's visitor API properly instead of calling
clang_visitChildren() for every get_children().

Perhaps we should set a target for how slow we can tolerate this thing
to be, as a percentage of total build time, and determine if the
libclang approach is viable. I'm thinking maybe 10%?

> If i run 'clang-tidy' across the entire source tree, it takes 3 minutes
> on my machine, but there's overhead of repeatedly starting the process
> in there.

Is that parallelized in some way? It seems strange that clang-tidy
would be so much faster than libclang.

> I think anything that actually fully parses the source files is going
> to have a decent sized unavoidable overhead, when run across the whole
> source tree.
>
> Still having a properly parsed abstract source tree is an inherantly
> useful thing. for certain types of static analysis check. Some things
> get unreliable real fast if you try to anlyse using regex matches and
> similar approaches that are the common alternative. So libclang is
> understandably appealing in this respect.
>
> > The script takes a path to the build directory, and any number of paths
> > to directories or files to analyze. Example run on a 12-thread laptop:
> >
> >     $ time ./static-analyzer.py build block
> >     block/commit.c:525:15: non-coroutine_fn function calls coroutine_fn
> >     block/nbd.c:206:5: non-coroutine_fn function calls coroutine_fn
> >     [...]
> >     block/ssh.c:1167:13: non-coroutine_fn function calls coroutine_fn
> >     block/nfs.c:229:27: non-coroutine_fn function calls coroutine_fn
> >     Analyzed 79 translation units.
> >
> >     real    0m45.277s
> >     user    7m55.496s
> >     sys     0m1.445s
>
> Ouch, that is incredibly slow :-(

Yup :-(

Alberto



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 0/8] Introduce an extensible static analyzer
  2022-07-04 19:30   ` Alberto Faria
@ 2022-07-05  7:16     ` Daniel P. Berrangé
  2022-07-05 11:28       ` Alberto Faria
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel P. Berrangé @ 2022-07-05  7:16 UTC (permalink / raw)
  To: Alberto Faria
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven

On Mon, Jul 04, 2022 at 08:30:08PM +0100, Alberto Faria wrote:
> On Mon, Jul 4, 2022 at 5:28 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > Have you done any measurement see how much of the overhead is from
> > the checks you implemented, vs how much is inherantly forced on us
> > by libclang ? ie what does it look like if you just load the libclang
> > framework and run it actross all source files, without doing any
> > checks in python.
> 
> Running the script with all checks disabled, i.e., doing nothing after
> TranslationUnit.from_source():
> 
>     $ time ./static-analyzer.py build
>     [...]
>     Analyzed 8775 translation units in 274.0 seconds.
> 
>     real    4m34.537s
>     user    49m32.555s
>     sys     1m18.731s
> 
>     $ time ./static-analyzer.py build block util
>     Analyzed 162 translation units in 4.2 seconds.
> 
>     real    0m4.804s
>     user    0m40.389s
>     sys     0m1.690s
> 
> This is still with 12 threads on a 12-hardware thread laptop, but
> scalability is near perfect. (The time reported by the script doesn't
> include loading and inspection of the compilation database.)
> 
> So, not great. What's more, TranslationUnit.from_source() delegates
> all work to clang_parseTranslationUnit(), so I suspect C libclang
> wouldn't do much better.
> 
> And with all checks enabled:
> 
>     $ time ./static-analyzer.py build block util
>     [...]
>     Analyzed 162 translation units in 86.4 seconds.
> 
>     real    1m26.999s
>     user    14m51.163s
>     sys     0m2.205s
> 
> Yikes. Also not great at all, although the current implementation does
> many inefficient things, like redundant AST traversals. Cutting
> through some of clang.cindex's abstractions should also help, e.g.,
> using libclang's visitor API properly instead of calling
> clang_visitChildren() for every get_children().
> 
> Perhaps we should set a target for how slow we can tolerate this thing
> to be, as a percentage of total build time, and determine if the
> libclang approach is viable. I'm thinking maybe 10%?
> 
> > If i run 'clang-tidy' across the entire source tree, it takes 3 minutes
> > on my machine, but there's overhead of repeatedly starting the process
> > in there.
> 
> Is that parallelized in some way? It seems strange that clang-tidy
> would be so much faster than libclang.

No, that was me doing a dumb

     for i in `git ls-tree --name-only -r HEAD:`
     do
        clang-tidy $i 1>/dev/null 2>&1
     done

so in fact it was parsing all source files, not just .c files (and
likely whining about non-C files.  This was on my laptop with 6
cores / 2 SMT

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 0/8] Introduce an extensible static analyzer
  2022-07-05  7:16     ` Daniel P. Berrangé
@ 2022-07-05 11:28       ` Alberto Faria
  2022-07-05 16:12         ` Daniel P. Berrangé
  0 siblings, 1 reply; 27+ messages in thread
From: Alberto Faria @ 2022-07-05 11:28 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven

On Tue, Jul 5, 2022 at 8:16 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
>      for i in `git ls-tree --name-only -r HEAD:`
>      do
>         clang-tidy $i 1>/dev/null 2>&1
>      done

All of those invocations are probably failing quickly due to missing
includes and other problems, since the location of the compilation
database and some other arguments haven't been specified.

Accounting for those problems (and enabling just one random C++ check):

    $ time clang-tidy -p build \
        --extra-arg-before=-Wno-unknown-warning-option \
        --extra-arg='-isystem [...]' \
        --checks='-*,clang-analyzer-cplusplus.Move' \
        $( find block -name '*.c' )
    [...]

    real    3m0.260s
    user    2m58.041s
    sys     0m1.467s

Single-threaded static-analyzer.py without any checks:

    $ time ./static-analyzer.py build block -j 1
    Analyzed 79 translation units in 16.0 seconds.

    real    0m16.665s
    user    0m15.967s
    sys     0m0.604s

And with just the 'return-value-never-used' check enabled for a
somewhat fairer comparison:

    $ time ./static-analyzer.py build block -j 1 \
        -c return-value-never-used
    Analyzed 79 translation units in 61.5 seconds.

    real    1m2.080s
    user    1m1.372s
    sys     0m0.513s

Which is good news!

Alberto



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 0/8] Introduce an extensible static analyzer
  2022-07-05 11:28       ` Alberto Faria
@ 2022-07-05 16:12         ` Daniel P. Berrangé
  2022-07-06  9:54           ` Alberto Faria
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel P. Berrangé @ 2022-07-05 16:12 UTC (permalink / raw)
  To: Alberto Faria
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven

On Tue, Jul 05, 2022 at 12:28:55PM +0100, Alberto Faria wrote:
> On Tue, Jul 5, 2022 at 8:16 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> >      for i in `git ls-tree --name-only -r HEAD:`
> >      do
> >         clang-tidy $i 1>/dev/null 2>&1
> >      done
> 
> All of those invocations are probably failing quickly due to missing
> includes and other problems, since the location of the compilation
> database and some other arguments haven't been specified.

Opps yes, I was waaaay too minimalist in testing that.

> 
> Accounting for those problems (and enabling just one random C++ check):
> 
>     $ time clang-tidy -p build \
>         --extra-arg-before=-Wno-unknown-warning-option \
>         --extra-arg='-isystem [...]' \
>         --checks='-*,clang-analyzer-cplusplus.Move' \
>         $( find block -name '*.c' )
>     [...]
> 
>     real    3m0.260s
>     user    2m58.041s
>     sys     0m1.467s

Only analysing the block tree, but if we consider a static analysis
framework is desirable to use for whole of qemu.git, lets see the
numbers for everything.

What follows was done on  my P1 Gen2 thinkpad with 6 core / 12 threads,
where I use 'make -j 12' normally.

First as a benchmark, lets see a full compile of whole of QEMU (with
GCC) on Fedora 36 x86_64

    => 14 minutes


I find this waaaaay too slow though, so I typically configure QEMU with
--target-list=x86_64-softmmu since that suffices 90% of the time.

   => 2 minutes


A 'make check' on this x86_64-only build takes another 2 minutes.


Now, a static analysis baseline across the whole tree with default
tests enabled

 $ clang-tidy --quiet -p build $(git ls-tree -r --name-only HEAD: | grep '\.c$')

  => 45 minutes

wow, wasn't expecting it to be that slow !

Lets restrict to just the block/ dir

 $ clang-tidy --quiet -p build $(find block -name '*.c')

  => 4 minutes

And further restrict to just 1 simple check

 $ clang-tidy --quiet   --checks='-*,clang-analyzer-cplusplus.Move'  -p build $(find block -name '*.c')
  => 2 minutes 30


So extrapolated just that single check would probably work out
at 30 mins for the whole tree.

Overall this isn't cheap, and in the same order of magnitude
as a full compile. I guess this shouldn't be that surprising
really.



> Single-threaded static-analyzer.py without any checks:
> 
>     $ time ./static-analyzer.py build block -j 1
>     Analyzed 79 translation units in 16.0 seconds.
> 
>     real    0m16.665s
>     user    0m15.967s
>     sys     0m0.604s
> 
> And with just the 'return-value-never-used' check enabled for a
> somewhat fairer comparison:
> 
>     $ time ./static-analyzer.py build block -j 1 \
>         -c return-value-never-used
>     Analyzed 79 translation units in 61.5 seconds.
> 
>     real    1m2.080s
>     user    1m1.372s
>     sys     0m0.513s
> 
> Which is good news!

On my machine, a whole tree analysis allowing parallel execution
(iiuc no -j arg means use all cores):

  ./static-analyzer.py build  $(git ls-tree -r --name-only HEAD: | grep '\.c$'

   => 13 minutes

Or just the block layer

  ./static-analyzer.py build  $(find block -name '*.c')

   => 45 seconds


So your script is faster than clang-tidy, which suggests python probably
isn't the major dominating factor in speed, at least at this point in
time.


Still, a full tree analysis time of 13 minutes, compared to  my normal
'make' time of 2 minutes is an order of magnitude.


One thing that I noticed when doing this is that we can only really
do static analysis of files that we can actually compile on the host.
IOW, if on a Linux host, we don't get static analysis of code that
is targetted at FreeBSD / Windows platforms. Obvious really, since
libclang has to do a full parse and this will lack header files
for those platforms. That's just the tradeoff you have to accept
when using a compiler for static analysis, vs a tool that does
"dumb" string based regex matching to detect mistakes. Both kinds
of tools likely have their place for different tasks.


Overall I think a libclang based analysis tool will be useful, but
I can't see us enabling it as a standard part of 'make check'
given the time penalty.


Feels like something that'll have to be opt-in to a large degree
for regular contributors. In terms of gating CI though, it is less
of an issue, since we massively parallelize jobs. As long as we
have a dedicated build job just for running this static analysis
check in isolation, and NOT as 'make check' in all existing jobs,
it can happen in parallel with all the other build jobs, and we
won't notice the speed.

In summary, I think this approach is viable despite the speed
penalty provided we dont wire it into 'make check' by default.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 0/8] Introduce an extensible static analyzer
  2022-07-02 11:33 [RFC 0/8] Introduce an extensible static analyzer Alberto Faria
                   ` (9 preceding siblings ...)
  2022-07-04 16:28 ` Daniel P. Berrangé
@ 2022-07-05 16:13 ` Daniel P. Berrangé
  2022-07-06  9:56   ` Alberto Faria
  10 siblings, 1 reply; 27+ messages in thread
From: Daniel P. Berrangé @ 2022-07-05 16:13 UTC (permalink / raw)
  To: Alberto Faria
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven

On Sat, Jul 02, 2022 at 12:33:23PM +0100, Alberto Faria wrote:
> This series introduces a static analyzer for QEMU. It consists of a
> single static-analyzer.py script that relies on libclang's Python
> bindings, and provides a common framework on which arbitrary static
> analysis checks can be developed and run against QEMU's code base.
> 
> Summary of the series:
> 
>   - Patch 1 adds the base static analyzer, along with a simple check
>     that finds static functions whose return value is never used, and
>     patch 2 fixes some occurrences of this.
> 
>   - Patch 3 adds a check to ensure that non-coroutine_fn functions don't
>     perform direct calls to coroutine_fn functions, and patch 4 fixes
>     some violations of this rule.
> 
>   - Patch 5 adds a check to enforce coroutine_fn restrictions on
>     function pointers, namely around assignment and indirect calls, and
>     patch 6 fixes some problems it detects. (Implementing this check
>     properly is complicated, since AFAICT annotation attributes cannot
>     be applied directly to types. This part still needs a lot of work.)
> 
>   - Patch 7 introduces a no_coroutine_fn marker for functions that
>     should not be called from coroutines, makes generated_co_wrapper
>     evaluate to no_coroutine_fn, and adds a check enforcing this rule.
>     Patch 8 fixes some violations that it finds.


FWIW, after applying this series 'make check' throws lots of failures
and hangs for me in the block I/O tests, so something appears not quite
correct here. I didn't bother to investigate/debug since you marked this
as just an RFC

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 0/8] Introduce an extensible static analyzer
  2022-07-05 16:12         ` Daniel P. Berrangé
@ 2022-07-06  9:54           ` Alberto Faria
  2022-07-06 10:15             ` Daniel P. Berrangé
  0 siblings, 1 reply; 27+ messages in thread
From: Alberto Faria @ 2022-07-06  9:54 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven

On Tue, Jul 5, 2022 at 5:12 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Tue, Jul 05, 2022 at 12:28:55PM +0100, Alberto Faria wrote:
> > On Tue, Jul 5, 2022 at 8:16 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >      for i in `git ls-tree --name-only -r HEAD:`
> > >      do
> > >         clang-tidy $i 1>/dev/null 2>&1
> > >      done
> >
> > All of those invocations are probably failing quickly due to missing
> > includes and other problems, since the location of the compilation
> > database and some other arguments haven't been specified.
>
> Opps yes, I was waaaay too minimalist in testing that.
>
> >
> > Accounting for those problems (and enabling just one random C++ check):
> >
> >     $ time clang-tidy -p build \
> >         --extra-arg-before=-Wno-unknown-warning-option \
> >         --extra-arg='-isystem [...]' \
> >         --checks='-*,clang-analyzer-cplusplus.Move' \
> >         $( find block -name '*.c' )
> >     [...]
> >
> >     real    3m0.260s
> >     user    2m58.041s
> >     sys     0m1.467s
>
> Only analysing the block tree, but if we consider a static analysis
> framework is desirable to use for whole of qemu.git, lets see the
> numbers for everything.
>
> What follows was done on  my P1 Gen2 thinkpad with 6 core / 12 threads,
> where I use 'make -j 12' normally.
>
> First as a benchmark, lets see a full compile of whole of QEMU (with
> GCC) on Fedora 36 x86_64
>
>     => 14 minutes
>
>
> I find this waaaaay too slow though, so I typically configure QEMU with
> --target-list=x86_64-softmmu since that suffices 90% of the time.
>
>    => 2 minutes
>
>
> A 'make check' on this x86_64-only build takes another 2 minutes.
>
>
> Now, a static analysis baseline across the whole tree with default
> tests enabled
>
>  $ clang-tidy --quiet -p build $(git ls-tree -r --name-only HEAD: | grep '\.c$')
>
>   => 45 minutes
>
> wow, wasn't expecting it to be that slow !
>
> Lets restrict to just the block/ dir
>
>  $ clang-tidy --quiet -p build $(find block -name '*.c')
>
>   => 4 minutes
>
> And further restrict to just 1 simple check
>
>  $ clang-tidy --quiet   --checks='-*,clang-analyzer-cplusplus.Move'  -p build $(find block -name '*.c')
>   => 2 minutes 30
>
>
> So extrapolated just that single check would probably work out
> at 30 mins for the whole tree.
>
> Overall this isn't cheap, and in the same order of magnitude
> as a full compile. I guess this shouldn't be that surprising
> really.
>
>
>
> > Single-threaded static-analyzer.py without any checks:
> >
> >     $ time ./static-analyzer.py build block -j 1
> >     Analyzed 79 translation units in 16.0 seconds.
> >
> >     real    0m16.665s
> >     user    0m15.967s
> >     sys     0m0.604s
> >
> > And with just the 'return-value-never-used' check enabled for a
> > somewhat fairer comparison:
> >
> >     $ time ./static-analyzer.py build block -j 1 \
> >         -c return-value-never-used
> >     Analyzed 79 translation units in 61.5 seconds.
> >
> >     real    1m2.080s
> >     user    1m1.372s
> >     sys     0m0.513s
> >
> > Which is good news!

(Well, good news for the Python libclang approach vs others like
clang-tidy plugins; bad news in absolute terms.)

>
> On my machine, a whole tree analysis allowing parallel execution
> (iiuc no -j arg means use all cores):
>
>   ./static-analyzer.py build  $(git ls-tree -r --name-only HEAD: | grep '\.c$'
>
>    => 13 minutes
>
> Or just the block layer
>
>   ./static-analyzer.py build  $(find block -name '*.c')
>
>    => 45 seconds
>
>
> So your script is faster than clang-tidy, which suggests python probably
> isn't the major dominating factor in speed, at least at this point in
> time.
>
>
> Still, a full tree analysis time of 13 minutes, compared to  my normal
> 'make' time of 2 minutes is an order of magnitude.

There goes my 10% overhead target...

>
>
> One thing that I noticed when doing this is that we can only really
> do static analysis of files that we can actually compile on the host.
> IOW, if on a Linux host, we don't get static analysis of code that
> is targetted at FreeBSD / Windows platforms. Obvious really, since
> libclang has to do a full parse and this will lack header files
> for those platforms. That's just the tradeoff you have to accept
> when using a compiler for static analysis, vs a tool that does
> "dumb" string based regex matching to detect mistakes. Both kinds
> of tools likely have their place for different tasks.

Right, I don't think there's anything reasonable we can do about this
limitation.

>
>
> Overall I think a libclang based analysis tool will be useful, but
> I can't see us enabling it as a standard part of 'make check'
> given the time penalty.
>
>
> Feels like something that'll have to be opt-in to a large degree
> for regular contributors. In terms of gating CI though, it is less
> of an issue, since we massively parallelize jobs. As long as we
> have a dedicated build job just for running this static analysis
> check in isolation, and NOT as 'make check' in all existing jobs,
> it can happen in parallel with all the other build jobs, and we
> won't notice the speed.
>
> In summary, I think this approach is viable despite the speed
> penalty provided we dont wire it into 'make check' by default.

Agreed. Thanks for gathering these numbers.

Making the script use build dependency information, to avoid
re-analyzing translation units that weren't modified since the last
analysis, should make it fast enough to be usable iteratively during
development. Header precompilation could also be worth looking into.
Doing that + running a full analysis in CI should be good enough.

Alberto

>
> With regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 0/8] Introduce an extensible static analyzer
  2022-07-05 16:13 ` Daniel P. Berrangé
@ 2022-07-06  9:56   ` Alberto Faria
  0 siblings, 0 replies; 27+ messages in thread
From: Alberto Faria @ 2022-07-06  9:56 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven

On Tue, Jul 5, 2022 at 5:13 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> FWIW, after applying this series 'make check' throws lots of failures
> and hangs for me in the block I/O tests, so something appears not quite
> correct here. I didn't bother to investigate/debug since you marked this
> as just an RFC

Thanks, it appears some coroutine_fn functions are being called from
non-coroutine context, so some call conversions from bdrv_... to
bdrv_co_... introduce problems. These changes are only intended as
examples of using the tool for the time being.

Alberto



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 0/8] Introduce an extensible static analyzer
  2022-07-06  9:54           ` Alberto Faria
@ 2022-07-06 10:15             ` Daniel P. Berrangé
  2022-07-08 17:18               ` Alberto Faria
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel P. Berrangé @ 2022-07-06 10:15 UTC (permalink / raw)
  To: Alberto Faria
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven

On Wed, Jul 06, 2022 at 10:54:51AM +0100, Alberto Faria wrote:
> On Tue, Jul 5, 2022 at 5:12 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > On Tue, Jul 05, 2022 at 12:28:55PM +0100, Alberto Faria wrote:
> > > On Tue, Jul 5, 2022 at 8:16 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > Overall I think a libclang based analysis tool will be useful, but
> > I can't see us enabling it as a standard part of 'make check'
> > given the time penalty.
> >
> >
> > Feels like something that'll have to be opt-in to a large degree
> > for regular contributors. In terms of gating CI though, it is less
> > of an issue, since we massively parallelize jobs. As long as we
> > have a dedicated build job just for running this static analysis
> > check in isolation, and NOT as 'make check' in all existing jobs,
> > it can happen in parallel with all the other build jobs, and we
> > won't notice the speed.
> >
> > In summary, I think this approach is viable despite the speed
> > penalty provided we dont wire it into 'make check' by default.
> 
> Agreed. Thanks for gathering these numbers.
> 
> Making the script use build dependency information, to avoid
> re-analyzing translation units that weren't modified since the last
> analysis, should make it fast enough to be usable iteratively during
> development. Header precompilation could also be worth looking into.
> Doing that + running a full analysis in CI should be good enough.

For clang-tidy, I've been trying it out integrated into emacs
via eglot and clangd. This means I get clang-tidy errors reported
interactively as I write code, so wouldn't need to run a full
tree analysis. Unfortunately, unless I'm missing something, there's
no way to extend clangd to plugin extra checks.  So it would need
to re-implement something equivalent to clangd for our custom checks,
and then integrate that into eglot (or equiv for other editors).


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 0/8] Introduce an extensible static analyzer
  2022-07-06 10:15             ` Daniel P. Berrangé
@ 2022-07-08 17:18               ` Alberto Faria
  0 siblings, 0 replies; 27+ messages in thread
From: Alberto Faria @ 2022-07-08 17:18 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: qemu-devel, Paolo Bonzini, qemu-block, Denis V. Lunev,
	Emanuele Giuseppe Esposito, Stefan Hajnoczi, Ronnie Sahlberg,
	Hanna Reitz, Stefano Garzarella, Kevin Wolf, Peter Xu,
	Alberto Garcia, John Snow, Eric Blake, Fam Zheng,
	Markus Armbruster, Vladimir Sementsov-Ogievskiy, Peter Lieven

On Wed, Jul 6, 2022 at 11:15 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> For clang-tidy, I've been trying it out integrated into emacs
> via eglot and clangd. This means I get clang-tidy errors reported
> interactively as I write code, so wouldn't need to run a full
> tree analysis. Unfortunately, unless I'm missing something, there's
> no way to extend clangd to plugin extra checks.  So it would need
> to re-implement something equivalent to clangd for our custom checks,
> and then integrate that into eglot (or equiv for other editors).

That would be very handy. Still, running the script on the whole tree
would be necessary to ensure that changes to headers don't break
translation units that are not open in the editor.

Alberto



^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2022-07-08 17:25 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-02 11:33 [RFC 0/8] Introduce an extensible static analyzer Alberto Faria
2022-07-02 11:33 ` [RFC 1/8] Add " Alberto Faria
2022-07-02 11:33 ` [RFC 2/8] Drop some unused static function return values Alberto Faria
2022-07-02 11:33 ` [RFC 3/8] static-analyzer: Enforce coroutine_fn restrictions for direct calls Alberto Faria
2022-07-02 11:33 ` [RFC 4/8] Fix some direct calls from non-coroutine_fn to coroutine_fn Alberto Faria
2022-07-02 14:13   ` Paolo Bonzini
2022-07-03 22:20     ` Alberto Faria
2022-07-02 11:33 ` [RFC 5/8] static-analyzer: Enforce coroutine_fn restrictions on function pointers Alberto Faria
2022-07-04 14:16   ` Víctor Colombo
2022-07-04 16:57     ` Alberto Faria
2022-07-04 17:46       ` Víctor Colombo
2022-07-04 18:04         ` Alberto Faria
2022-07-04 19:06           ` Víctor Colombo
2022-07-02 11:33 ` [RFC 6/8] Fix some coroutine_fn indirect calls and pointer assignments Alberto Faria
2022-07-02 11:33 ` [RFC 7/8] block: Add no_coroutine_fn marker Alberto Faria
2022-07-02 11:33 ` [RFC 8/8] Avoid calls from coroutine_fn to no_coroutine_fn Alberto Faria
2022-07-02 14:17 ` [RFC 0/8] Introduce an extensible static analyzer Paolo Bonzini
2022-07-04 16:28 ` Daniel P. Berrangé
2022-07-04 19:30   ` Alberto Faria
2022-07-05  7:16     ` Daniel P. Berrangé
2022-07-05 11:28       ` Alberto Faria
2022-07-05 16:12         ` Daniel P. Berrangé
2022-07-06  9:54           ` Alberto Faria
2022-07-06 10:15             ` Daniel P. Berrangé
2022-07-08 17:18               ` Alberto Faria
2022-07-05 16:13 ` Daniel P. Berrangé
2022-07-06  9:56   ` Alberto Faria

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.