From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.3 required=3.0 tests=DKIM_ADSP_CUSTOM_MED, DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D9D78C432C0 for ; Wed, 4 Dec 2019 00:22:07 +0000 (UTC) Received: from isis.lip6.fr (isis.lip6.fr [132.227.60.2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0525B206E4 for ; Wed, 4 Dec 2019 00:22:06 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Yo4r9L1v" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0525B206E4 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=cocci-bounces@systeme.lip6.fr Received: from systeme.lip6.fr (systeme.lip6.fr [132.227.104.7]) by isis.lip6.fr (8.15.2/8.15.2) with ESMTP id xB40Lf7V002900; Wed, 4 Dec 2019 01:21:41 +0100 (CET) Received: from systeme.lip6.fr (systeme.lip6.fr [127.0.0.1]) by systeme.lip6.fr (Postfix) with ESMTP id 0444777D7; Wed, 4 Dec 2019 01:21:41 +0100 (CET) Received: from isis.lip6.fr (isis.lip6.fr [132.227.60.2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by systeme.lip6.fr (Postfix) with ESMTPS id 00B0D4386 for ; Wed, 4 Dec 2019 01:21:38 +0100 (CET) Received: from mail-wr1-x431.google.com (mail-wr1-x431.google.com [IPv6:2a00:1450:4864:20:0:0:0:431] (may be forged)) by isis.lip6.fr (8.15.2/8.15.2) 
with ESMTP id xB40LbRw004843 for ; Wed, 4 Dec 2019 01:21:37 +0100 (CET) Received: by mail-wr1-x431.google.com with SMTP id q10so6291721wrm.11 for ; Tue, 03 Dec 2019 16:21:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=azjRNxPeF1Gh89sSJbtticYsm09tsWDZ6zjFIUp9WVI=; b=Yo4r9L1vd+XPKJgBKhdtLPdO+C2Xpk5NxdiL5rI/W6eeQnGUI3qXIxSyFfzjmA7Uzi DmUfqjgMOT0lFAlxZYVIh0YuBM5ETVwSYgf5CISubPTR0a3915lYtPw1Z6mDpcUI39lH YfF5XFmiOs5ysjkenZ7yzOvFDRyl6mvD3nSlOrYwL+46PiyzMhwfVZOEIl/rhEU9ht94 vRtmNCY7fWuwtixTRw8sjwzIXoHFvOqN5Tv4od8g3OX4eqcrGe+b8kbIkGZwukrJI49t O/nSjS2JFT65E4sZKslW3imio4ujfR/iYCEpLcScnXlVziULzqUmQdciuy8M2QIUjsXG bIaw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=azjRNxPeF1Gh89sSJbtticYsm09tsWDZ6zjFIUp9WVI=; b=LGrO8U7+yyVRdMl20dVTrhfDYfTV4ZQKZ8M5WD6BH8SZdLiRsFaAunmRXFKwvGyjF0 CTqkLvmHbwlKe2h/6k7Mt5KQH8GFCcH9/xMFF3OOr6l3Yz4inOvALpG5dvScMWNa/G7r TaAQzqGt7zMrViTkpz+RBC5AHykE7uAkd4DwoEfEdwAQ0AwRvN8MsIMU0W3jgYUZ3GKT dFj5/zDY79qGl4DDlzb/e5obWgVmPzMLYAotbjeEZIpC3Ts2VOXpYQkfqZzdS1/4T0Rj sjsF4uP8NaNEngt9vqWH7uv574/g42/YJum05mmHZjLeFTPQkNQfLEs6sHA55nPJndgP tsaA== X-Gm-Message-State: APjAAAX7L5ME6S3SeNPyxwxAXCV8lM09SxpJ0f6bk39b62puOPjXDbsj 7vEgLfyrya1/F+BxbtE+68JUQL2U84YdUhxfGuY= X-Google-Smtp-Source: APXvYqxbk0HJLxCPIZY3d8HM69c3AIdf4wFDxiSfMLpcAfO0w5yD16TzOxyP60x12VqRENtLZsK4j3FjMWufTnWAyuY= X-Received: by 2002:adf:da52:: with SMTP id r18mr763707wrl.167.1575418896912; Tue, 03 Dec 2019 16:21:36 -0800 (PST) MIME-Version: 1.0 References: <02fa7455-e76e-7d7d-0d64-41b2803a8025@web.de> <0c03f84d-a05b-2811-96aa-6f82541fb8a3@web.de> <1865799483.10870980.1575350298758.JavaMail.zimbra@inria.fr> In-Reply-To: From: Strace Labs Date: Tue, 3 Dec 2019 22:21:01 -0200 Message-ID: To: Julia Lawall X-Greylist: Sender IP whitelisted, Sender 
e-mail whitelisted, not delayed by milter-greylist-4.4.3 (isis.lip6.fr [132.227.60.2]); Wed, 04 Dec 2019 01:21:41 +0100 (CET) X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.4.3 (isis.lip6.fr [IPv6:2001:660:3302:283c:0:0:0:2]); Wed, 04 Dec 2019 01:21:37 +0100 (CET) X-Scanned-By: MIMEDefang 2.78 on 132.227.60.2 X-Scanned-By: MIMEDefang 2.78 Cc: cocci@systeme.lip6.fr Subject: Re: [Cocci] Changing format string usage with SmPL? X-BeenThere: cocci@systeme.lip6.fr X-Mailman-Version: 2.1.13 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: multipart/mixed; boundary="===============1989351233==" Sender: cocci-bounces@systeme.lip6.fr Errors-To: cocci-bounces@systeme.lip6.fr --===============1989351233== Content-Type: multipart/alternative; boundary="000000000000b914140598d5cbb7" --000000000000b914140598d5cbb7 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
After some research, I wrote a Python function called fmt_replace_by_pos() that replaces the format specifiers (%fmt) at given index positions.

1. Input test

int foo() {
   int id;
   struct mydata h1, *h2, s1, *s2;
   char *city = "Hello";
   my_printf("test: char*=%s mydata=%s int=%d mydata*=%s (*mydata)=%s", city, h1.name, id, s2->name, (*h2)->name);
}

2. My *.cocci (~> https://pastebin.com/a6Tfav4x )

@initialize:python@
@@
import re

def fmt_parser(_st):
    # a bunch of (non-% or %%), and then (% followed by non-%).
    REG = re.compile('([^%]|%%)*(%[^%])')
    retval = {}
    pos = 0  # where to start searching for next time
    i = 0
    while True:
        match = REG.match(_st, pos)
        if match is None:
            return retval
        fmt = match.group(2)
        pos = match.end()
        idx = pos - len(fmt)
        retval[i] = { 'idx': idx, 'fmt': fmt }
        i += 1

def fmt_replace_by_pos(_str, _idx, _fmt):
    try:
        fmts = fmt_parser(_str)
        new = _str

        # -1 means "replace every specifier"
        if _idx == -1:
            _idx = [item for item in range(0, len(fmts))]

        # Replace from right to left so that the offsets of the earlier
        # specifiers stay valid even when len(_fmt) differs from len(fmt).
        for _i in sorted(_idx, reverse=True):
            f = fmts[_i]
            idx = f['idx']
            fmt = f['fmt']
            fmt_l = len(fmt)
            new = new[:idx] + _fmt + new[idx + fmt_l:]

        return new
    except Exception as e:
        print("** ERROR: Something wrong in fmt_replace_by_pos():\n {}\n".format(str(e)))

@r1@
format list fl;
identifier fn;
expression list e;
position p;
@@

fn("%@fl@", e@p)

@script:python s1@
fl << r1.fl;
fn << r1.fn;
e << r1.e;
p << r1.p;
new_fmt;
to_e;
@@
// Replace the %fmt at the given positions (positions are currently hardcoded)
new_fmt = fmt_replace_by_pos(coccinelle.fl, { 1, 3, 4 }, "%m")
coccinelle.new_fmt = cocci.make_expr("\"{}\"".format(new_fmt))

@main depends on s1 && r1@
format list r1.fl;
expression s1.new_fmt;
identifier r1.fn;
expression list r1.e;
expression list s1.to_e;
position r1.p;
//struct mydata SMD;
//struct mydata* SMDP;
@@

 fn(
-"%@fl@"
+new_fmt
,
e@p
 );
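
The parsing step can be tried on its own, outside Coccinelle. The following is a minimal sketch, assuming plain Python 3 and the same regular expression as fmt_parser; the name find_specifiers is hypothetical. Like the original, it captures only the single character after %, so a specifier such as %ld would be reported as %l.

```python
import re

# Same pattern as in fmt_parser: a run of (non-% or %%), then (% followed by non-%).
SPEC = re.compile(r'([^%]|%%)*(%[^%])')

def find_specifiers(fmt_string):
    """Return a list of (offset, specifier) pairs in order of appearance."""
    found = []
    pos = 0  # where the next search starts
    while True:
        match = SPEC.match(fmt_string, pos)
        if match is None:
            return found
        # start(2) is the offset of the '%' that opens the specifier
        found.append((match.start(2), match.group(2)))
        pos = match.end()

# The format string from the input test
sample = 'test: char*=%s mydata=%s int=%d mydata*=%s (*mydata)=%s'
for offset, spec in find_specifiers(sample):
    print(offset, spec)
```

On the format string from the input test this yields five entries, one per argument, which is what the position-based replacement relies on.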

3. Execution

# spatch --sp-file fix-format.cocci sample.c
init_defs_builtins: /usr/local/Cellar/coccinelle/1.0.9/bin/../lib/coccinelle/standard.h
warning: main: inherited metavariable to_e not used in the -, +, or context code
HANDLING: sample.c
diff =
--- sample.c
+++ /tmp/cocci-output-17883-e8cce6-sample.c
@@ -4,7 +4,8 @@ int foo() {
   struct mydata h1, *h2, s1, *s2;
   char *city = "Hello";

- my_printf("test: char*=%s mydata=%s int=%d mydata*=%s (*mydata)=%s", city, h1.name, id, s2->name, (*h2)->name);
+ my_printf("test: char*=%s mydata=%m int=%d mydata*=%m (*mydata)=%m",
+  city, h1.name, id, s2->name, (*h2)->name);
 }

#
So I can now find each %fmt and replace it with whatever I want. The next step is to choose which ones to replace based on the expression list, and I am currently struggling with that.
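
One possible way to approach that remaining step, sketched in plain Python 3 outside Coccinelle. The helper names split_args and plan_rewrite are hypothetical. The sketch assumes that the expression list is available as one comma-separated string, that the i-th argument corresponds to the i-th format specifier, and that a purely textual check for a trailing ->name or .name is good enough; it knows nothing about struct mydata, so it is not a substitute for typed metavariables.

```python
import re

# Hypothetical check: "<base>->name" or "<base>.name" as the whole argument.
NAME_ACCESS = re.compile(r'^(?P<base>.+?)(?P<op>->|\.)name$')

def split_args(arg_text):
    """Split a C argument list on top-level commas (naive: ignores string literals)."""
    args, depth, current = [], 0, []
    for ch in arg_text:
        if ch in '([{':
            depth += 1
        elif ch in ')]}':
            depth -= 1
        if ch == ',' and depth == 0:
            args.append(''.join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = ''.join(current).strip()
    if tail:
        args.append(tail)
    return args

def plan_rewrite(arg_text):
    """Return (indices of arguments that should switch to %m, rewritten arguments)."""
    indices, new_args = [], []
    for i, arg in enumerate(split_args(arg_text)):
        match = NAME_ACCESS.match(arg)
        if match is None:
            new_args.append(arg)
            continue
        indices.append(i)
        base = match.group('base')
        # p->name becomes p; s.name becomes &s
        new_args.append(base if match.group('op') == '->' else '&' + base)
    return indices, new_args

indices, new_args = plan_rewrite('city, h1.name, id, s2->name, (*h2)->name')
print(indices)
print(', '.join(new_args))
```

For the argument list of the input test, the computed indices are the same positions that are hardcoded in the script:python rule above, so they could be passed to fmt_replace_by_pos() instead of the literal set.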







On Tue, Dec 3, 2019 at 3:28 PM Strace Labs <stracelabs@gmail.com> wrote:

> Unfortunately, it doesn't work, but I am working on some solutions using
> Python.
>
> So, once we have something like:
>
> ...
> @r1@
> format list fl;
> identifier fn;
> expression list e;
> position p;
> @@
>
> fn("%@fl@", e@p)
> ....
>
> Then I could handle the format list using make_expr() as well. But is it
> possible to rename/handle the expression list?


> On Tue, Dec 3, 2019 at 3:18 AM Julia Lawall <julia.lawall@inria.fr> wrote:
>
>> From: "Strace Labs" <stracelabs@gmail.com>
>> To: "Markus Elfring" <Markus.Elfring@web.de>
>> Cc: "Julia Lawall" <julia.lawall@inria.fr>, cocci@systeme.lip6.fr
>> Sent: Tuesday, 3 December 2019 11:30:14
>> Subject: Re: [Cocci] Changing format string usage with SmPL?
>>
>> On Sun, Dec 1, 2019 at 6:00 AM Markus Elfring <Markus.Elfring@web.de> wrote:
>>
>>> > Basically, I intend to replace all "%s" called with "mydata->name" by
>>> > "%m" with "mydata" or "&mydata"
>>>
>>> How far would you get the desired source code transformation based on
>>> software extensions around a search pattern like the following.
>>> ..........
>>> Which algorithm will become sufficient for your data processing needs
>>> around the usage of functions with variadic arguments because of format
>>> strings?


>> Actually, I really didn't get why you're asking about that, because we are
>> talking about X and you're asking about Y. But either way, that is not the
>> point. The point is that I am studying Coccinelle, and I am just trying to
>> figure out whether the tool can detect "%s" called with "mydata->name",
>> replace it by "%m", and remove the "->name".
>>
>> E.g., if we have:
>>
>> int foo() {
>>   int id;
>>   struct mydata h1, *h2, s1, *s2;
>>   char *city = "Hello";
>>   my_printf("%s", s2->name);
>>   my_printf("hi hi %s gggg", h1.name);
>>   my_printf("1234 %d %s @ %d %s | %s -> city=%s", id, s1.name, 12, (*h2).name, h2->name, city);
>>   my_printf("aaaa %s hhhhh", h2->name);
>>   my_printf("%s", city);
>> }

>> Then replace it by:
>>
>> int foo() {
>>   int id;
>>   struct mydata h1, *h2, s1, *s2;
>>   char *city = "Hello";
>>   my_printf("%m", s2);
>>   my_printf("hi hi %m gggg", &h1);
>>   my_printf("1234 %d %m @ %d %m | %m -> city=%s", id, &s1, 12, h2, h2, city);
>>   my_printf("aaaa %m hhhhh", h2);
>>   my_printf("%s", city);
>> }

>> I have read the other samples and the documentation again, but I still
>> could not figure out how it should be done. By the way, thank you, Julia,
>> for suggesting the OCaml/make_expr/replace approach. (Due to something
>> wrong with the Coccinelle distributed by Homebrew on macOS, I rewrote your
>> sample in Python, and the result was the same.) But I can't just replace
>> every "%s" by "%m". As I said, the replacement should happen only if the
>> "%s" corresponds to a "mydata->name" argument.
>>
>> So I am still fighting with it. Thanks in advance.
>> OK, if you may have more than one argument to your print, then you can
>> find the offset using an expression list metavariable:
>>
>> @r@
>> expression list[n] between;
>> @@
>>
>> print(s,between,h2->name,...)
>>
>> Then you can use r.n in your python rule to figure out which %s to
>> change. Unfortunately, this will not work well if there are multiple name
>> references in the argument list, because you will be trying to change the
>> format string in multiple ways, e.g. once where between has length 2 and
>> once where between has length 4. Substantial hacks would be required to
>> deal with this.
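
To illustrate the point about offsets: the length n of between is the number of arguments that precede the matched one, so, assuming one argument per format specifier, it is also the 0-based index of the %s to change. A minimal sketch in plain Python 3 follows; the name replace_specifiers is hypothetical, and how the n values of several matches would be collected inside Coccinelle is left open. Applying all collected offsets in one pass avoids producing several conflicting versions of the same format string.

```python
import re

# Either an escaped percent sign (%%) or a specifier (% followed by one non-% character).
TOKEN = re.compile(r'%%|%[^%]')

def replace_specifiers(fmt_string, offsets, new_spec):
    """Replace the specifiers whose 0-based index is in offsets, in a single pass."""
    wanted = set(offsets)
    counter = -1  # index of the most recently seen specifier

    def swap(match):
        nonlocal counter
        if match.group(0) == '%%':
            return match.group(0)  # escaped percent sign, not a specifier
        counter += 1
        return new_spec if counter in wanted else match.group(0)

    return TOKEN.sub(swap, fmt_string)

fmt = '1234 %d %s @ %d %s | %s -> city=%s'
# One match only (between has length 4)
print(replace_specifiers(fmt, [4], '%m'))
# All matches at once (between has length 1, 3 and 4)
print(replace_specifiers(fmt, [1, 3, 4], '%m'))
```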

>> It would be nice if you could do
>>
>> @r@
>> expression list[bn] between;
>> expression list[an] after;
>> position p;
>> @@
>> print@p(s,between,name,after)
>>
>> @@
>> format list[r.bn] f1;
>> format list[r.an] f2;
>> position r.p;
>> @@
>> print@p(
>> -   "%@f1@%s%@f2@"
>> +   "%@f1@%m%@f2@"
>> , l)
>>
>> I don't know if that would work, though.
>>
>> julia

>>> Regards,
>>> Markus

--000000000000b914140598d5cbb7-- --===============1989351233== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Cocci mailing list Cocci@systeme.lip6.fr https://systeme.lip6.fr/mailman/listinfo/cocci --===============1989351233==--