cocci.inria.fr archive mirror
From: Strace Labs <stracelabs@gmail.com>
To: Julia Lawall <julia.lawall@inria.fr>
Cc: cocci@systeme.lip6.fr
Subject: Re: [Cocci] Changing format string usage with SmPL?
Date: Tue, 3 Dec 2019 22:21:01 -0200
Message-ID: <CABvP5W0QkSgJRZRL4xu-DdtQ0RKkQuR-5wVn2QhvjUZCZVooUA@mail.gmail.com>
In-Reply-To: <CABvP5W2+fUip+jEAO-G+ZyUPJhx5iCHcTRxkiYsiok_a3zTuRw@mail.gmail.com>


After some research, I wrote a Python function called
fmt_replace_by_pos() that replaces the %fmt specifiers at the given index
positions.

*1. Input test*

int foo() {
   int id;
   struct mydata h1, *h2, s1, *s2;
   char *city = "Hello";
   my_printf("test: char*=%s mydata=%s int=%d mydata*=%s (*mydata)=%s", city, h1.name, id, s2->name, (*h2)->name);
}

*2. My .cocci file* (~> https://pastebin.com/a6Tfav4x )

@initialize:python@
@@
import re

def fmt_parser(_st):
    # Skip a run of (non-% or %%), then capture (% followed by one non-% char).
    REG = re.compile('([^%]|%%)*(%[^%])')
    retval = {}
    pos = 0  # where to start searching next time
    i = 0
    while True:
        match = REG.match(_st, pos)
        if match is None:
            return retval
        fmt = match.group(2)
        pos = match.end()
        idx = pos - len(fmt)
        retval[i] = { 'idx': idx, 'fmt': fmt }
        i += 1

def fmt_replace_by_pos(_str, _idx, _fmt):
    try:
        fmts  = fmt_parser(_str)
        new   = _str

        # -1 means "replace every specifier".
        if _idx == -1:
            _idx = range(len(fmts))

        # Replace from right to left so that earlier offsets stay valid
        # even when _fmt differs in length from the specifier it replaces.
        for _i in sorted(_idx, reverse=True):
            f     = fmts[_i]
            idx   = f['idx']
            fmt_l = len(f['fmt'])
            new   = new[:idx] + _fmt + new[idx + fmt_l:]

        return new
    except Exception as e:
        print("** ERROR: Something wrong in fmt_replace_by_pos():\n{}\n".format(str(e)))

@r1@
format list fl;
identifier fn;
expression list e;
position p;
@@

fn("%@fl@", e@p)

@script:python s1@
fl << r1.fl;
fn << r1.fn;
e << r1.e;
p << r1.p;
new_fmt;
to_e;
@@
// Replace the %fmt at the given positions (positions are currently hardcoded)
new_fmt = fmt_replace_by_pos(coccinelle.fl, { 1, 3, 4 }, "%m")
coccinelle.new_fmt = cocci.make_expr("\"{}\"".format(new_fmt))

@main depends on s1 && r1@
format list r1.fl;
expression s1.new_fmt;
identifier r1.fn;
expression list r1.e;
expression list s1.to_e;
position r1.p;
//struct mydata SMD;
//struct mydata* SMDP;
@@

 fn(
-"%@fl@"
+new_fmt
,
e@p
 );
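
The string handling can also be tried outside of spatch. Below is a minimal
standalone sketch, assuming the format list reaches the script as the plain
string shown in the input test. The names find_specifiers and
replace_specifiers are hypothetical and only mirror the two helpers of the
semantic patch.

```python
import re

# Same idea as in the semantic patch: skip (non-% or %%), then capture
# a '%' followed by one non-'%' character.
SPEC = re.compile(r'(?:[^%]|%%)*(%[^%])')

def find_specifiers(fmt):
    """Return a list of (offset, specifier) pairs, in order of appearance."""
    found = []
    pos = 0
    while True:
        m = SPEC.match(fmt, pos)
        if m is None:
            return found
        found.append((m.start(1), m.group(1)))
        pos = m.end()

def replace_specifiers(fmt, positions, new_spec):
    """Replace the specifiers at the given 0-based positions (hypothetical helper)."""
    specs = find_specifiers(fmt)
    # Work right to left so that earlier offsets are not shifted.
    for i in sorted(positions, reverse=True):
        offset, old = specs[i]
        fmt = fmt[:offset] + new_spec + fmt[offset + len(old):]
    return fmt

fmt = "test: char*=%s mydata=%s int=%d mydata*=%s (*mydata)=%s"
result = replace_specifiers(fmt, {1, 3, 4}, "%m")
print(result)
```

Note that the pattern captures only one character after the '%', so a
specifier such as %ld would be seen as %l. That is enough for the %s case
discussed here but would need extending for longer conversions.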

*3. Execution*

# spatch --sp-file fix-format.cocci sample.c
init_defs_builtins: /usr/local/Cellar/coccinelle/1.0.9/bin/../lib/coccinelle/standard.h
warning: main: inherited metavariable to_e not used in the -, +, or context code
HANDLING: sample.c
diff =
--- sample.c
+++ /tmp/cocci-output-17883-e8cce6-sample.c
@@ -4,7 +4,8 @@ int foo() {
  struct mydata h1, *h2, s1, *s2;
  char *city = "Hello";

- my_printf("test: char*=%s mydata=%s int=%d mydata*=%s (*mydata)=%s", city, h1.name, id, s2->name, (*h2)->name);
+ my_printf("test: char*=%s mydata=%m int=%d mydata*=%m (*mydata)=%m",
+  city, h1.name, id, s2->name, (*h2)->name);
 }

#

So I can now find a %fmt and replace it with whatever I want. What is still
missing is choosing the positions based on the *expression list*; I am
currently struggling with that.
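
One possible way to derive the positions from the arguments, instead of
hardcoding them, is sketched below. It is only an illustration under
assumptions: the arguments are taken to be available as a list of source
strings (how an expression list metavariable is exposed to a Python script
rule is not shown in this thread), and the helper name name_positions and
its suffix matching are hypothetical.

```python
import re

# Matches an argument that ends in a '.name' or '->name' member access
# (assumption: the member of interest is literally called 'name').
NAME_ACCESS = re.compile(r'^(?P<base>.+?)\s*(?P<op>->|\.)\s*name$')

def name_positions(args):
    """Return (positions, new_args) for a list of argument strings.

    positions: 0-based indices of arguments that access '.name' or '->name'.
    new_args:  the arguments with that access stripped; struct values
               (accessed with '.') get an '&' so that a pointer is passed.
    """
    positions = []
    new_args = []
    for i, arg in enumerate(args):
        arg = arg.strip()
        m = NAME_ACCESS.match(arg)
        if m is None:
            new_args.append(arg)
            continue
        positions.append(i)
        base = m.group('base')
        new_args.append(base if m.group('op') == '->' else '&' + base)
    return positions, new_args

args = ["city", "h1.name", "id", "s2->name", "(*h2)->name"]
positions, new_args = name_positions(args)
print(positions)
print(new_args)
```

The returned positions could then be passed to fmt_replace_by_pos() in place
of the hardcoded set. This only lines up when every argument corresponds to
exactly one conversion in the format string, and it does not check that the
conversion at that position is really a %s.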







On Tue, Dec 3, 2019 at 3:28 PM Strace Labs <stracelabs@gmail.com> wrote:

> Unfortunately, it doesn't work, but I am working on a solution using
> Python.
>
> So, once we have something like:
>
> ...
> @r1@
> format list fl;
> identifier fn;
> expression list e;
> position p;
> @@
>
> fn("%@fl@", e@p)
> ....
>
> Then I could handle the *format list* using *make_expr()* as well. But
> is it possible to rename/handle the *expression list*?
>
>
> On Tue, Dec 3, 2019 at 3:18 AM Julia Lawall <julia.lawall@inria.fr> wrote:
>
>> ------------------------------
>>
>> *From: *"Strace Labs" <stracelabs@gmail.com>
>> *To: *"Markus Elfring" <Markus.Elfring@web.de>
>> *Cc: *"Julia Lawall" <julia.lawall@inria.fr>, cocci@systeme.lip6.fr
>> *Sent: *Tuesday, 3 December 2019 11:30:14
>> *Subject: *Re: [Cocci] Changing format string usage with SmPL?
>>
>> On Sun, Dec 1, 2019 at 6:00 AM Markus Elfring <Markus.Elfring@web.de>
>> wrote:
>>
>>> > Basically, I intend to replace every "%s" called with "mydata->name" by
>>> "%m" with "mydata" or "&mydata"
>>>
>>> How far would you get the desired source code transformation based on
>>> software extensions around a search pattern like the following.
>>> ..........
>>> Which algorithm will become sufficient for your data processing needs
>>> around the usage of functions with variadic arguments because of format
>>> strings?
>>>
>>>
>> Actually, I really didn't get why you're asking about that, because we
>> are talking about X and you're asking about Y. Either way, that is not
>> the point. The point is that I am studying Coccinelle and I am just
>> trying to figure out whether the tool can detect a "%s" called with
>> "mydata->name", replace it with "%m", and remove the "->name".
>>
>> E.g., if we have:
>>
>> int foo() {
>>   int id;
>>   struct mydata h1, *h2, s1, *s2;
>>   char *city = "Hello";
>>   my_printf("%s", s2->name);
>>   my_printf("hi hi %s gggg", h1.name);
>>   my_printf("1234 %d %s @ %d %s | %s -> city=%s", id, s1.name,
>>             12, (*h2).name, h2->name, city);
>>   my_printf("aaaa %s hhhhh", h2->name);
>>   my_printf("%s", city);
>> }
>>
>> Then replace it with:
>>
>> int foo() {
>>   int id;
>>   struct mydata h1, *h2, s1, *s2;
>>   char *city = "Hello";
>>   my_printf("%m", s2);
>>   my_printf("hi hi %s gggg", &h1);
>>   my_printf("1234 %d %m @ %d %m | %m -> city=%s", id, s1.name,
>>             12, (*h2).name, h2->name, city);
>>   my_printf("aaaa %s hhhhh", h2);
>>   my_printf("%s", city);
>> }
>>
>> I have read the other samples and the documentation again, but I still
>> could not figure out how it should be done. By the way, thank you, Julia,
>> for the suggestion of using OCaml/make_expr/replace. Due to something
>> wrong with the Coccinelle distributed by Brew/OSX, I rewrote your
>> sample in Python, and the result was the same. But I can't just replace
>> every "%s" with "%m". As I said, it should happen only if the "%s" is
>> used with "mydata->name".
>>
>> So I am still fighting with it. Thanks in advance.
>>
>> OK, if you may have more than one argument to your print, then you can
>> find the offset using an expression list metavariable:
>>
>> @r@
>> expression list[n] between;
>> @@
>>
>> print(s,between,h2->name,...)
>>
>> Then you can use r.n in your python rule to figure out where is the %s to
>> change.  Unfortunately, this will not work well if there are multiple name
>> references in the argument list.  Because you will be trying to change the
>> format string in multiple ways, eg once where between has length 2 and once
>> where between has length 4.  Substantial hacks would be required to deal
>> with this.
>>
>> It would be nice if you could do
>>
>> @r@
>> expression list[bn] between;
>> expression list[an] after;
>> position p;
>> @@
>> print@p(s,between,name,after)
>>
>> @@
>> format list[r.bn] f1;
>> format list[r.an] f2;
>> position r.p;
>> @@
>> print@p(
>> -    "%@f1@%s%@f2@"
>> +   "%@f1@%m%@f2@"
>> , l)
>>
>> I don't know if that would work, though.
>>
>> julia
>>
>> Regards,
>>> Markus
>>>
>>
>>
