From: Richard Palethorpe <rpalethorpe@suse.de>
To: Cyril Hrubis <chrubis@suse.cz>
Cc: ltp@lists.linux.it
Subject: Re: [LTP] [PATCH 0/7] docparse improvements
Date: Thu, 28 Oct 2021 09:11:09 +0100
Message-ID: <87pmrppj9u.fsf@suse.de>
In-Reply-To: <YXlYwi7+VUIitM7H@yuki>

Hello,

Cyril Hrubis <chrubis@suse.cz> writes:

> Hi!
>> It's unfortunate that, before starting this effort and the checker,
>> we didn't know about tree-sitter (although Sparse may still be the best
>> choice for the checker).
>> 
>> Tree-sitter can parse C into an AST and can easily be vendored into LTP:
>> https://tree-sitter.github.io/tree-sitter/using-parsers#building-the-library
>> 
>> Then we just need to work on the level of the AST. It also has a query
>> language. This should allow the initial matching to be done on a high
>> level.
>
> The only worry that I have about this would be speed. Currently the code
> I wrote takes a few seconds to process thousands of C files in LTP; that
> is because we take a lot of shortcuts and ignore all the stuff we do not
> need. A full parser that builds an AST would be orders of magnitude
> slower, so before we attempt to use it, it should be benchmarked properly
> to see if it's fast enough.

It's incredibly fast; it has no trouble parsing the entire kernel.

Weggli uses tree-sitter:

https://github.com/googleprojectzero/weggli

rich@g78 ~/q/ltp (master)> time weggli '_ verify_alarm(_) { exit(0); }' .
/home/rich/qa/ltp/./testcases/kernel/syscalls/alarm/alarm03.c:21
static void verify_alarm(void)
{
	pid_t pid;

	TEST(alarm(100));

..
		} else {
			tst_res(TPASS,
				"alarm(100), fork, alarm(0) child's "
				"alarm returned %ld", TST_RET);
		}
		exit(0);
	}

	TEST(alarm(0));
	if (TST_RET != 100) {
		tst_res(TFAIL,
..
}
/home/rich/qa/ltp/./testcases/kernel/syscalls/alarm/alarm07.c:20
static void verify_alarm(void)
{
	pid_t pid;
	alarm_cnt = 0;

	TEST(alarm(1));
..
			tst_res(TPASS, "alarm() request cleared in child");
		} else {
			tst_res(TFAIL, "alarm() request not cleared in "
				"child; alarms received:%d", alarm_cnt);
		}
		exit(0);
	}

	if (alarm_cnt != 1)
		tst_res(TFAIL, "Sigalarms in parent %i, expected 1", alarm_cnt);
	else
..
}

________________________________________________________
Executed in   49.35 millis    fish           external
   usr time  110.88 millis    0.00 millis  110.88 millis
   sys time   87.44 millis    1.20 millis   86.24 millis
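
For a steadier number than a single `time` run like the one above,
something along these lines could be used to compare candidate parsers
on the same input buffer. This is only a sketch under assumptions:
`count_open_braces()` is a hypothetical stand-in for the parser being
measured, `best_cpu_ms()` and `ticks_to_ms()` are names made up for
illustration, and nothing here exists in LTP or tree-sitter. It measures
CPU time with standard C `clock()`, not the wall-clock time that `time`
reports.

```c
#include <stddef.h>
#include <time.h>

/* Signature assumed for any parser entry point we want to measure. */
typedef size_t (*parse_fn)(const char *buf, size_t len);

/*
 * Hypothetical stand-in for a real parser: just counts '{' characters.
 * Replace with a wrapper around the parser under test.
 */
static size_t count_open_braces(const char *buf, size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++) {
		if (buf[i] == '{')
			n++;
	}

	return n;
}

/* Convert a clock() interval to milliseconds of CPU time. */
static double ticks_to_ms(clock_t start, clock_t end)
{
	return (double)(end - start) * 1000.0 / CLOCKS_PER_SEC;
}

/*
 * Run fn over buf `runs` times and return the lowest CPU time in ms.
 * The last return value of fn is stored in *result so the call is not
 * optimised away. Returns -1.0 if runs is not positive.
 */
static double best_cpu_ms(parse_fn fn, const char *buf, size_t len,
			  int runs, size_t *result)
{
	double best = -1.0;
	int i;

	for (i = 0; i < runs; i++) {
		clock_t start = clock();
		*result = fn(buf, len);
		clock_t end = clock();
		double ms = ticks_to_ms(start, end);

		if (best < 0.0 || ms < best)
			best = ms;
	}

	return best;
}
```

Passing a tree-sitter or docparse entry point with the same signature in
place of the stand-in would let the two be compared on identical file
contents; taking the best of several runs reduces noise from caches and
scheduling.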

>
>> If we continue down the path of hand parsing C, then it will most likely
>> result in constant tweaks and additions.
>
> Well, I would say that this patchset is the last addition for the parser;
> if we ever need anything more complex we should really switch to
> something else. On the other hand, I do not think that we will ever need
> more complexity in the parser than this, as long as we keep things
> sane.

This closes the door on a lot of options for no upside, AFAICT. We have
two tools (Sparse and tree-sitter) that can be (or have been) vendored
and will parse a large subset of C. Sparse goes a step further, allowing
control flow analysis. The usual reasons for reinventing the wheel are
not present.

-- 
Thank you,
Richard.

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp
