* dependency tee from c parser entities downto token
@ 2012-04-24  9:54 Konrad Eisele
  2012-04-25 20:10 ` [PATCH] depend.c: build up a dependency tree from c entities downto tokens: entries in the tree are: macro-depend: tree of #if nesting macro-expansions: possible macro expansion source of a token tok->macro-expansions->macro tok->macro-depend->macro c entities are linked in via [stmt|expr|sym]->start-end-token Konrad Eisele
  2012-04-30 22:58 ` dependency tee from c parser entities downto token Christopher Li
  0 siblings, 2 replies; 50+ messages in thread
From: Konrad Eisele @ 2012-04-24  9:54 UTC
  To: linux-sparse

Hi, I'd like to extend sparse so that it preserves a
dependency tree that goes from C parse entities all
the way down to single tokens. There are several places
where this can be useful:
  1. If you have:
    "#include <stdio.h> int main() {}"
    and change it to, for instance,
    "#include <stdio.h> int main() {FILE *f;}"
    what are the macros and structs that are used for this single "FILE *f;"?
    (That includes the macros of all enclosing #ifdef nesting, their dependencies
     and the dependencies of those, macros in macro arg/body substitutions, and
     macros in the C parse entities that FILE (struct _IO_FILE) depends on.)
  2. If you have a compilation with a fixed options line, calculate
    a minimal C source file, in "pre-preprocessor" form, that still compiles correctly.
    This would save compilation time by stripping away all unneeded
    macro defines and declarations.
  3. You could also use it to write tools that show the
    dependency tree of macros at a particular location of the source,
    something all C coders long for (at least I do :-).

For case (1.) I have created a dirty prototype on sourceforge
that does this:

$ git clone git://git.code.sf.net/p/decpp/code decpp
$ cd decpp
$ make
$ ./shrinkc t1.c

The output is below ("output of "shrinkc"").

Before I start to submit patches I'd like to ask whether
there is interest in such a development, or rather
whether there is interest in it if I do
the work, as maybe you would rather do it yourself. I
also ask first because otherwise I'll put time into it
and nobody will apply the patches in the end anyway.

Here is how I'd do it:

- Try not to mess with basic structs like "struct token" etc.,
   by adding a small hash-based attribute tagging for void * pointers
   so that additional information can be stored for each sparse
   entity in separate structures.
- Modify pre-process.c to not overwrite already created tokens
   but to duplicate them instead. Of course, also avoid freeing tokens.
   Also preserve the undef/define history of macros (by not reusing
   the macro symbols).
- Build up the macro dependency tree from macro expansions
   and #if statements.
- Tag all tokens with extra information: the macro dependency as well
   as the source-stream end location (to be able to reconstruct the source).
- Write a traversal that descends from translation units through the
   dependency tree down to the tokens and marks them as used. Or maybe
   down to a "used-char" bitfield in streams.
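
To make the first point more concrete, here is a minimal sketch of such
a side table, assuming entities are identified purely by their address.
All names in it (side_set, side_get, enum side_tag, SIDE_BUCKETS) are
made up for illustration and are not part of sparse; error handling and
freeing of entries are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tag values; a real implementation would define its own. */
enum side_tag { SIDE_TAG_MACRO_DEP = 1, SIDE_TAG_FLAGS = 2 };

struct side_entry {
	struct side_entry *next;	/* chain within one bucket */
	const void *key;		/* address of the tagged entity */
	enum side_tag tag;
	void *value;
};

#define SIDE_BUCKETS 4096		/* must be a power of two */
static struct side_entry *side_table[SIDE_BUCKETS];

static unsigned int side_hash(const void *key, enum side_tag tag)
{
	uintptr_t k = (uintptr_t)key >> 4;	/* drop alignment bits */

	return (unsigned int)((k ^ (k >> 16) ^ (uintptr_t)tag) & (SIDE_BUCKETS - 1));
}

/* Attach value to (key, tag), replacing an earlier value. Returns 0 on success. */
int side_set(const void *key, enum side_tag tag, void *value)
{
	unsigned int i = side_hash(key, tag);
	struct side_entry *e;

	for (e = side_table[i]; e; e = e->next) {
		if (e->key == key && e->tag == tag) {
			e->value = value;
			return 0;
		}
	}
	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	e->key = key;
	e->tag = tag;
	e->value = value;
	e->next = side_table[i];
	side_table[i] = e;
	return 0;
}

/* Return the value stored for (key, tag), or NULL if there is none. */
void *side_get(const void *key, enum side_tag tag)
{
	struct side_entry *e;

	for (e = side_table[side_hash(key, tag)]; e; e = e->next)
		if (e->key == key && e->tag == tag)
			return e->value;
	return NULL;
}
```

The point of keying on the address is that "struct token" and friends
stay untouched; the price is that tagged entities must never be moved or
reused, which is why tokens would have to be duplicated rather than
overwritten or freed.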

Upon that, useful tools (I think) could be built.
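
As a rough sketch of one such tool, assume every token is tagged with
the innermost open #if at the point where it was read. Walking the "up"
links from that tag then lists all conditionals the token depends on.
The names below (cond_push, cond_pop, cond_current, cond_depth,
cond_print) and the fixed-size pool are illustrative assumptions only,
not the interface of the prototype.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One node per opened conditional; "up" points at the enclosing one. */
struct cond_frame {
	struct cond_frame *up;
	const char *line;	/* text of the #if/#ifdef/#ifndef line */
};

#define MAX_FRAMES 64
static struct cond_frame frames[MAX_FRAMES];
static size_t nframes;
static struct cond_frame *cond_top;	/* innermost open conditional, or NULL */

/* Called on #if/#ifdef/#ifndef. Returns NULL when the pool is exhausted. */
struct cond_frame *cond_push(const char *line)
{
	struct cond_frame *f;

	if (nframes == MAX_FRAMES)
		return NULL;
	f = &frames[nframes++];
	f->up = cond_top;
	f->line = line;
	cond_top = f;
	return f;
}

/* Called on #endif. Frames are never recycled, so earlier tags stay valid. */
void cond_pop(void)
{
	if (cond_top)
		cond_top = cond_top->up;
}

/* The tag a token would receive at the current position. */
struct cond_frame *cond_current(void)
{
	return cond_top;
}

/* Number of conditionals enclosing a tagged token. */
size_t cond_depth(const struct cond_frame *f)
{
	size_t n = 0;

	for (; f; f = f->up)
		n++;
	return n;
}

/* Print the enclosing conditionals, innermost first. */
void cond_print(const struct cond_frame *f)
{
	for (; f; f = f->up)
		printf("%s\n", f->line);
}
```

A tool would call cond_current() for each token while preprocessing and
later hand the stored tag to cond_print() for the location the user asks
about. #else/#elif branches and macro expansions are not covered by this
sketch.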

-- Konrad



============= output of "shrinkc" ===============

eiselekd+~/tmp/>shrinkc t1.c

********* builtin *********


********* preprocessor *********


********* t1.c *********

int main(int argc, char **argv) {
         FILE *f;
};

********* preprocessor *********


********* /usr/include/stdio.h *********

#ifndef _STDIO_H
#if !defined __need_FILE && !defined __need___FILE
# define _STDIO_H       1
# define __need_FILE
# define __need___FILE
#endif /* Don't need FILE.  */
#if !defined __FILE_defined && defined __need_FILE
struct _IO_FILE;
typedef struct _IO_FILE FILE;
#endif /* FILE not defined.  */
#if !defined ____FILE_defined && defined __need___FILE
typedef struct _IO_FILE __FILE;
#endif /* __FILE not defined.  */
#ifdef  _STDIO_H

********* /usr/include/features.h *********


********* /usr/include/sys/cdefs.h *********


********* /usr/include/bits/wordsize.h *********


********* /usr/include/gnu/stubs.h *********


********* /usr/include/bits/wordsize.h *********


********* /usr/include/gnu/stubs-32.h *********


********* /usr/lib/gcc/i486-slackware-linux/4.2.4//include/stddef.h *********

#endif
#endif

********* /usr/include/bits/types.h *********


********* /usr/include/bits/wordsize.h *********


********* /usr/include/bits/typesizes.h *********


********* /usr/include/libio.h *********

#ifndef _IO_STDIO_H
#ifdef _G_NEED_STDARG_H
#endif
# else
struct _IO_jump_t;  struct _IO_FILE;
# else
#else
typedef void _IO_lock_t;
#endif
struct _IO_marker {
   struct _IO_marker *_next;
   struct _IO_FILE *_sbuf;
   int _pos;
};
enum __codecvt_result
{
   __codecvt_ok,
   __codecvt_partial,
   __codecvt_error,
   __codecvt_noconv
};
struct _IO_FILE {
   int _flags;           /* High-order word is _IO_MAGIC; rest is flags. */
   char* _IO_read_ptr;   /* Current read pointer */
   char* _IO_read_end;   /* End of get area. */
   char* _IO_read_base;  /* Start of putback+get area. */
   char* _IO_write_base; /* Start of put area. */
   char* _IO_write_ptr;  /* Current put pointer. */
   char* _IO_write_end;  /* End of put area. */
   char* _IO_buf_base;   /* Start of reserve area. */
   char* _IO_buf_end;    /* End of reserve area. */
   char *_IO_save_base; /* Pointer to start of non-current get area. */
   char *_IO_backup_base;  /* Pointer to first valid character of backup area */
   char *_IO_save_end; /* Pointer to end of non-current get area. */
   struct _IO_marker *_markers;
   struct _IO_FILE *_chain;
   int _fileno;
#if 0
#else
   int _flags2;
#endif
   _IO_off_t _old_offset; /* This used to be _offset but it's too small.  */
   unsigned short _cur_column;
   signed char _vtable_offset;
   char _shortbuf[1];
   _IO_lock_t *_lock;
#if defined _G_IO_IO_FILE_VERSION && _G_IO_IO_FILE_VERSION == 0x20001
   _IO_off64_t _offset;
# if defined _LIBC || defined _GLIBCPP_USE_WCHAR_T
# else
   void *__pad1;
   void *__pad2;
   void *__pad3;
   void *__pad4;
   size_t __pad5;
# endif
   int _mode;
   char _unused2[15 * sizeof (int) - 4 * sizeof (void *) - sizeof (size_t)];
#endif
};
#endif /* _IO_STDIO_H */

********* /usr/include/_G_config.h *********

#ifndef _G_config_h
#define __need_mbstate_t
typedef struct
{
   __off_t __pos;
   __mbstate_t __state;
} _G_fpos_t;
typedef struct
{
   __off64_t __pos;
   __mbstate_t __state;
} _G_fpos64_t;
#define _G_off_t        __off_t
#define _G_off64_t      __off64_t
typedef int _G_int16_t __attribute__ ((__mode__ (__HI__)));
typedef int _G_int32_t __attribute__ ((__mode__ (__SI__)));
typedef unsigned int _G_uint16_t __attribute__ ((__mode__ (__HI__)));
typedef unsigned int _G_uint32_t __attribute__ ((__mode__ (__SI__)));
#define _G_NEED_STDARG_H 1
#define _G_IO_IO_FILE_VERSION 0x20001

********* /usr/lib/gcc/i486-slackware-linux/4.2.4//include/stddef.h *********

#endif
#endif
#endif
#endif /* defined(_ANSI_H_) || defined(_MACHINE_ANSI_H_) */
#endif /* _STDDEF_H or __need_size_t.  */
#endif

********* /usr/include/wchar.h *********

#ifndef _WCHAR_H
#if defined _WCHAR_H || defined __need_wint_t || !defined __WINT_TYPE__
# define __need_wint_t
#if (defined _WCHAR_H || defined __need_mbstate_t) && !defined __mbstate_t_defined
typedef struct
{
   int __count;
   union
   {
# ifdef __WINT_TYPE__
     __WINT_TYPE__ __wch;
# else
# endif
     char __wchb[4];
   } __value;            /* Value so far.  */
} __mbstate_t;
#endif
#endif /* ISO C99 or GCC and GNU.  */
#endif /* GCC and use GNU.  */
#endif /* Use ISO C95, C99 and Unix98. */
#endif
#endif
#endif  /* _WCHAR_H defined */
#endif /* wchar.h  */

********* /usr/lib/gcc/i486-slackware-linux/4.2.4//include/stddef.h *********

#endif
#endif
#endif
#endif /* defined(_ANSI_H_) || defined(_MACHINE_ANSI_H_) */
#endif /* _STDDEF_H or __need_size_t.  */
#endif
#endif
#endif /* __WCHAR_T__ */
#endif /* __wchar_t__ */
#endif /* _STDDEF_H or __need_wchar_t.  */
#if defined (__need_wint_t)
#ifndef _WINT_T
#ifndef __WINT_TYPE__
#define __WINT_TYPE__ unsigned int
#endif
typedef __WINT_TYPE__ wint_t;
#endif
#endif
#endif /* __sys_stdtypes_h */

********* /usr/lib/gcc/i486-slackware-linux/4.2.4//include/stdarg.h *********

#ifndef _STDARG_H
#ifndef _ANSI_STDARG_H_
#ifndef __GNUC_VA_LIST
typedef __builtin_va_list __gnuc_va_list;
#endif
#else /* not __svr4__ || _SCO_DS */
#endif
#endif /* not _VA_LIST_, except on certain systems */
#endif /* not __svr4__ */
#endif /* _STDARG_H */
#endif /* not _ANSI_STDARG_H_ */
#endif /* not _STDARG_H */

********* /usr/include/bits/stdio_lim.h *********


********* /usr/include/bits/sys_errlist.h *********


* [PATCH] depend.c: build up a dependency tree from c entities downto  tokens: entries in the tree are: macro-depend: tree of #if nesting macro-expansions: possible macro expansion source of a token tok->macro-expansions->macro tok->macro-depend->macro c entities are linked in via [stmt|expr|sym]->start-end-token
  2012-04-24  9:54 dependency tee from c parser entities downto token Konrad Eisele
@ 2012-04-25 20:10 ` Konrad Eisele
  2012-04-30 22:58 ` dependency tee from c parser entities downto token Christopher Li
  1 sibling, 0 replies; 50+ messages in thread
From: Konrad Eisele @ 2012-04-25 20:10 UTC
  To: linux-sparse, sparse; +Cc: eiselekd, sam, davem, KE

From: KE <eiselekd@gmx.de>

---
 Makefile      |    6 +-
 allocate.c    |    8 ++-
 allocate.h    |   10 ++-
 attr.c        |   67 ++++++++++++
 attr.h        |   18 +++
 depend.c      |  329 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 depend.h      |   91 ++++++++++++++++
 expand.c      |    1 +
 expression.c  |   60 +++++-----
 expression.h  |    8 ++
 inline.c      |   10 ++
 lib.c         |    8 ++
 parse.c       |   73 +++++++++----
 parse.h       |    1 +
 pre-process.c |  104 ++++++++++++------
 shrink.c      |  102 ++++++++++++++++++
 sparse.c      |    3 +
 symbol.h      |    9 ++
 token.h       |   10 ++
 19 files changed, 826 insertions(+), 92 deletions(-)
 create mode 100644 attr.c
 create mode 100644 attr.h
 create mode 100644 depend.c
 create mode 100644 depend.h
 create mode 100644 shrink.c

diff --git a/Makefile b/Makefile
index 79cadb0..833ec70 100644
--- a/Makefile
+++ b/Makefile
@@ -39,7 +39,8 @@ INCLUDEDIR=$(PREFIX)/include
 PKGCONFIGDIR=$(LIBDIR)/pkgconfig
 
 PROGRAMS=test-lexing test-parsing obfuscate compile graph sparse \
-	 test-linearize example test-unssa test-dissect ctags
+	 test-linearize example test-unssa test-dissect ctags \
+	 shrink
 INST_PROGRAMS=sparse cgcc
 INST_MAN1=sparse.1 cgcc.1
 
@@ -70,7 +71,8 @@ LIB_H=    token.h parse.h lib.h symbol.h scope.h expression.h target.h \
 LIB_OBJS= target.o parse.o tokenize.o pre-process.o symbol.o lib.o scope.o \
 	  expression.o show-parse.o evaluate.o expand.o inline.o linearize.o \
 	  sort.o allocate.o compat-$(OS).o ptrlist.o \
-	  flow.o cse.o simplify.o memops.o liveness.o storage.o unssa.o dissect.o
+	  flow.o cse.o simplify.o memops.o liveness.o storage.o unssa.o dissect.o \
+	  depend.o attr.o
 
 LIB_FILE= libsparse.a
 SLIB_FILE= libsparse.so
diff --git a/allocate.c b/allocate.c
index 5cc52a9..30378d3 100644
--- a/allocate.c
+++ b/allocate.c
@@ -26,6 +26,7 @@
 #include "scope.h"
 #include "expression.h"
 #include "linearize.h"
+#include "depend.h"
 
 void protect_allocations(struct allocator_struct *desc)
 {
@@ -125,5 +126,10 @@ ALLOCATOR(entrypoint, "entrypoint");
 ALLOCATOR(instruction, "instruction");
 ALLOCATOR(multijmp, "multijmp");
 ALLOCATOR(pseudo, "pseudo");
+ALLOCATOR(macro_dep, "macro dependency");
+ALLOCATOR(macro_expansion, "macro expansion");
 
-
+void token_allocator_nofree(void) 
+{
+	token_allocator.nofree = 1;
+}
diff --git a/allocate.h b/allocate.h
index 9f1dc8c..de85320 100644
--- a/allocate.h
+++ b/allocate.h
@@ -12,6 +12,7 @@ struct allocator_struct {
 	struct allocation_blob *blobs;
 	unsigned int alignment;
 	unsigned int chunking;
+	unsigned int nofree;
 	void *freelist;
 	/* statistics */
 	unsigned int allocations, total_bytes, useful_bytes;
@@ -22,6 +23,7 @@ extern void drop_all_allocations(struct allocator_struct *desc);
 extern void *allocate(struct allocator_struct *desc, unsigned int size);
 extern void free_one_entry(struct allocator_struct *desc, void *entry);
 extern void show_allocations(struct allocator_struct *);
+extern void token_allocator_nofree(void);
 
 #define __DECLARE_ALLOCATOR(type, x)		\
 	extern type *__alloc_##x(int);		\
@@ -42,7 +44,8 @@ extern void show_allocations(struct allocator_struct *);
 	}							\
 	void __free_##x(type *entry)				\
 	{							\
-		free_one_entry(&x##_allocator, entry);		\
+		if (!x##_allocator.nofree)			\
+			free_one_entry(&x##_allocator, entry);	\
 	}							\
 	void show_##x##_alloc(void)				\
 	{							\
@@ -50,7 +53,8 @@ extern void show_allocations(struct allocator_struct *);
 	}							\
 	void clear_##x##_alloc(void)				\
 	{							\
-		drop_all_allocations(&x##_allocator);		\
+		if (!x##_allocator.nofree)			\
+			drop_all_allocations(&x##_allocator);	\
 	}							\
 	void protect_##x##_alloc(void)				\
 	{							\
@@ -77,5 +81,7 @@ DECLARE_ALLOCATOR(instruction);
 DECLARE_ALLOCATOR(multijmp);
 DECLARE_ALLOCATOR(phi);
 DECLARE_ALLOCATOR(pseudo);
+DECLARE_ALLOCATOR(macro_dep);
+DECLARE_ALLOCATOR(macro_expansion);
 
 #endif
diff --git a/attr.c b/attr.c
new file mode 100644
index 0000000..54c4c21
--- /dev/null
+++ b/attr.c
@@ -0,0 +1,67 @@
+#include <stdlib.h>
+#include <string.h>
+#include <assert.h>
+#include "allocate.h"
+#include "compat.h"
+#include "attr.h"
+
+struct hash_v {
+	struct hash_v *n;
+	void *key;
+	enum attr_tags tag;
+	void *v;
+};
+
+__DECLARE_ALLOCATOR(struct hash_v, hash_v);
+__ALLOCATOR(struct hash_v, "hash value", hash_v);
+
+#define HASH_LEN (1024*4)
+struct hash {
+	struct hash_v *f;
+} h[HASH_LEN];
+
+void init_attr(void) {
+	memset(h, 0, sizeof(h));
+}
+
+static int hash_func(void *key, enum attr_tags tag) {
+	unsigned long k = ((unsigned long)key) >> 4;
+	return ((k) ^ (k >> 16) ^ (k >> 24) ^ tag) & (HASH_LEN-1);
+}
+
+void **lookup_attr(void *key, enum attr_tags tag, int create) {
+	int i = hash_func(key, tag);
+	struct hash *hp = &h[i];
+	struct hash_v *p;
+	struct hash_v **c = &hp->f;
+	while((p = *c)) {
+		if ((p ->tag == tag)
+		    && (p ->key == key)) {
+			return &p->v;
+		}
+		c = &p->n;
+	}
+	if (create) {
+		p = __alloc_hash_v(0);
+		p->key = key;
+		p->tag = tag;
+		p->v = 0;
+		*c = p;
+		return &p->v;
+	}
+	return 0;
+}
+
+void *set_attr(void *key, enum attr_tags tag, void *data) {
+	void **p = lookup_attr(key, tag, 1);
+	assert(p);
+	*p = data;
+	return data;
+}
+
+void *get_attr(void *key, enum attr_tags tag) {
+	void **p = lookup_attr(key, tag, 0);
+	if (!p)
+		return 0;
+	return *p;
+}
diff --git a/attr.h b/attr.h
new file mode 100644
index 0000000..aa5750c
--- /dev/null
+++ b/attr.h
@@ -0,0 +1,18 @@
+#ifndef _SPARSE_ATTR_H
+#define _SPARSE_ATTR_H
+
+enum attr_tags {
+	ATTR_TAG_MACRO_DEP = 1,
+	ATTR_TAG_FLAGS = 2,
+	ATTR_TAG_MACRO_EXP = 3
+};
+
+#define SET_MACRO_DEP(a,v) set_attr(a, ATTR_TAG_MACRO_DEP, v)
+#define GET_MACRO_DEP(a) ((struct macro_dep *)get_attr(a, ATTR_TAG_MACRO_DEP))
+#define NEW_MACRO_DEP() ((struct macro_dep *)attr_data(ATTR_TAG_MACRO_DEP))
+
+extern void *set_attr(void *key, enum attr_tags tag, void *data);
+extern void *get_attr(void *key, enum attr_tags tag);
+extern void *attr_data(enum attr_tags tag);
+
+#endif
diff --git a/depend.c b/depend.c
new file mode 100644
index 0000000..f36134a
--- /dev/null
+++ b/depend.c
@@ -0,0 +1,329 @@
+/*
+ * Build macros dependency tree 
+ * Copyright (C) 2012 Konrad Eisele <konrad@gaisler.com,eiselekd@gmail.com>
+ * BSD-License
+ * Redistribution and use in source and binary forms are permitted
+ * provided that the above copyright notice and this paragraph are
+ * duplicated in all such forms and that any documentation,
+ * advertising materials, and other materials related to such
+ * distribution and use acknowledge that the software was developed
+ * by the <organization>.  The name of the
+ * University may not be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ */
+
+#include <stdlib.h>
+#include <string.h>
+#include <assert.h>
+#include "token.h"
+#include "allocate.h"
+#include "compat.h"
+#include "depend.h"
+#include "attr.h"
+#include "parse.h"
+#include "symbol.h"
+#include "expression.h"
+
+__DECLARE_ALLOCATOR(struct dep_flags, dep_flags);
+__ALLOCATOR(struct dep_flags, "dependency flags", dep_flags);
+__DECLARE_ALLOCATOR(struct tok_macro_dep, tok_macro_dep);
+__ALLOCATOR(struct tok_macro_dep, "token macro expansion relation", tok_macro_dep);
+__DECLARE_ALLOCATOR(struct macro_dep_el, macro_dep_el);
+__ALLOCATOR(struct macro_dep_el, "macro dependency element", macro_dep_el);
+
+static void * push_dep(struct depend_if *dif, struct token *tok);
+static void * else_dep(struct depend_if *dif, struct token *tok);
+static void * pop_dep(struct depend_if *dif, struct token *tok);
+static void * push_sym(struct depend_if *dif, struct symbol *sym);
+static void * on_dep(struct depend_if *dif, void *p);
+static void * off_dep(struct depend_if *dif, void *p);
+static void * tag_dep(struct depend_if *dif,struct token *tok);
+static void * set_tok(struct depend_if *dif,void *p, struct token *tok);
+static void * end_tok(struct depend_if *dif,void *p, struct token *tok);
+static void * inherit_tok(struct depend_if *dif,void *a, void *b);
+static void * set_symlist(struct depend_if *dif,void *a, struct symbol_list * l);
+static void * macro_exp(struct depend_if *dif,struct token *tok, struct tok_macro_dep *m);
+
+static void depend_symbol_list(struct symbol_list *ptrlist);
+static void depend_expression(struct expression *expr);
+static void depend_statement(struct statement *stmt);
+
+struct depend_if d = {
+	.push_dep = push_dep,
+	.else_dep = else_dep,
+	.pop_dep = pop_dep,
+	.push_sym = push_sym,
+	.on = on_dep,
+	.off = off_dep,
+	.tag_dep = tag_dep,
+	.set_tok = set_tok,
+	.end_tok = end_tok,
+	.inherit_tok = inherit_tok,
+	.set_symlist = set_symlist,
+	.macro_exp = macro_exp,
+	.dep = &d.root_dep,
+};
+
+struct macro_dep *alloc_macro_dep(struct token *tok) {
+	struct macro_dep *e;
+	e = __alloc_macro_dep(0);
+	e->ppline = tok;
+	return e;
+}
+
+static int is_root(struct depend_if *dif) {
+	return dif->dep == &dif->root_dep ? 1 : 0;
+}
+
+static void *on_dep(struct depend_if *dif, void *p) {
+	dif->dep_use = dif->dep;
+	return 0;
+}
+
+static void *off_dep(struct depend_if *dif, void *p) {
+	dif->dep_use = 0;
+	return 0;
+}
+
+static void *push_dep(struct depend_if *dif, struct token *tok) {
+	struct macro_dep *e = alloc_macro_dep(tok);
+	e->up = dif->dep;
+	dif->dep = e;
+	on_dep(dif, 0);
+	return e;
+}
+
+/* push a macro_dep not on top of stack but one below, to reference <tok> */
+static void *else_dep(struct depend_if *dif, struct token *tok) {
+	struct macro_dep *e;
+	e = alloc_macro_dep(tok);
+	e->up = dif->dep->up;
+	dif->dep->up = e;
+	return e;
+}
+
+static void *pop_dep(struct depend_if *dif, struct token *tok) {
+	struct macro_dep *r = 0;
+	if (!is_root(dif)) {
+		dif->dep = dif->dep->up;
+	}
+	return r;
+}
+
+static void *push_sym(struct depend_if *dif, struct symbol *sym) {
+	
+	struct macro_dep_el *e = 0;
+	if (dif->dep_use) {
+		e = __alloc_macro_dep_el(0);
+		e->sym = sym;
+		e->n = dif->dep_use->f;
+		dif->dep_use->f = e;
+	}
+	return e;
+}
+
+static void * tag_dep(struct depend_if *dif,struct token *tok)
+{
+	struct macro_dep *a = 0, *c, *d = GET_MACRO_DEP(tok);
+	if (!eof_token(tok) && (a = dif->dep) && !is_root(dif)) {
+		if (d) {
+			c = alloc_macro_dep(0);
+			c->up = a;
+			c->same = d;
+			a = c;
+		}
+		SET_MACRO_DEP(tok, a);
+	}
+	return a;
+}
+
+void init_dep(void) {
+	dif = &d;
+	token_allocator_nofree();
+}
+
+struct dep_flags *depend_flags(void *p, int create) {
+	struct dep_flags *f;
+	if (!(f = (struct dep_flags *)get_attr(p, ATTR_TAG_FLAGS))) {
+		if (create) {
+			f = __alloc_dep_flags(0);
+			memset(f, 0, sizeof(struct dep_flags));
+			set_attr(p, ATTR_TAG_FLAGS, f);
+		}
+	}
+	return f;
+}
+
+struct dep_flags *depend_visited(void *p) {
+	struct dep_flags *f = 0; int r = 1;
+	if (p) {
+		f = depend_flags(p,1);
+		r = f->visited;
+		f->visited = 1;
+	}
+	return r ? 0 : f;
+}
+
+static void * set_tok(struct depend_if *dif,void *p, struct token *tok)
+{
+	struct dep_flags *f = depend_flags(p,1);
+	f->tok = tok;
+	return f;
+}
+
+static void * end_tok(struct depend_if *dif,void *p, struct token *tok)
+{
+	struct dep_flags *f = depend_flags(p,1);
+	f->end = tok;
+	return f;
+}
+
+static void * inherit_tok(struct depend_if *dif,void *a, void *b)
+{
+	struct dep_flags *af = depend_flags(a,1);
+	struct dep_flags *bf = depend_flags(b,1);
+	af->tok = bf->tok;
+	af->end = bf->end;
+	return af;
+}
+
+static void * set_symlist(struct depend_if *dif,void *a, struct symbol_list * l)
+{
+	struct dep_flags *f = depend_flags(a,1);
+	f->symlist = l;
+	return f;
+}
+
+static void * macro_exp(struct depend_if *dif, struct token *tok, struct tok_macro_dep *m) {
+	struct tok_macro_dep *p = 0;
+	if (tok && m) {
+		p  = __alloc_tok_macro_dep(0);
+		*p = *m;
+		set_attr(tok, ATTR_TAG_MACRO_EXP, p);
+		return p;
+	}
+	return p;
+}
+
+void depend_statement_list(struct ptr_list *ptrlist)
+{
+	void *ptr;
+	if (!ptrlist)
+		return;
+	FOR_EACH_PTR(ptrlist, ptr) {
+		depend_statement(ptr);
+	} END_FOR_EACH_PTR(ptr);
+}
+
+void depend_symbol_list(struct symbol_list *ptrlist)
+{
+	void *ptr;
+	if (!ptrlist) 
+		return;
+	FOR_EACH_PTR(((struct ptr_list *)ptrlist), ptr) {
+		depend_symbol(ptr);
+	} END_FOR_EACH_PTR(ptr);
+}
+
+void depend_statement(struct statement *stmt)
+{
+	struct dep_flags *f;
+	if (!(f = depend_visited(stmt)))
+		return;
+	
+	switch (stmt->type) {
+	case STMT_COMPOUND:
+		depend_statement_list((struct ptr_list *)stmt->stmts);
+		break;
+	case STMT_EXPRESSION:
+		depend_expression(stmt->expression);
+		break;
+	case STMT_IF:
+		depend_expression(stmt->if_conditional);
+		depend_statement(stmt->if_true);
+		depend_statement(stmt->if_false);
+		break;
+	case STMT_ITERATOR:
+		depend_symbol(stmt->iterator_break);
+		depend_symbol(stmt->iterator_continue);
+		depend_statement(stmt->iterator_pre_statement);
+		depend_statement(stmt->iterator_statement);
+		depend_statement(stmt->iterator_post_statement);
+		break;
+	case STMT_SWITCH:
+		depend_expression(stmt->switch_expression);
+		depend_statement(stmt->switch_statement);
+		depend_symbol(stmt->switch_break);
+		depend_symbol(stmt->switch_case);
+		break;
+	case STMT_CASE:
+		depend_expression(stmt->case_expression);
+		depend_expression(stmt->case_to);
+		depend_statement(stmt->case_statement);
+		depend_symbol(stmt->case_label);
+		break;
+	case STMT_RETURN:
+		depend_expression(stmt->ret_value);
+		depend_symbol(stmt->ret_target);
+		break;
+	default:
+		break;
+	}
+}
+
+void depend_expression(struct expression *expr)
+{
+	struct dep_flags *f;
+	if (!(f = depend_visited(expr)))
+		return;
+	switch (expr->type) {
+	case EXPR_STATEMENT:
+		depend_statement(expr->statement);
+		break;
+	case EXPR_BINOP:
+	case EXPR_COMMA:
+	case EXPR_COMPARE:
+	case EXPR_LOGICAL:
+	case EXPR_ASSIGNMENT:
+		depend_expression(expr->left);
+		depend_expression(expr->right);
+		break;
+	case EXPR_CAST:
+	case EXPR_FORCE_CAST:
+	case EXPR_IMPLIED_CAST:
+		depend_symbol(expr->cast_type);
+		depend_expression(expr->cast_expression);
+		break;
+	case EXPR_PREOP:
+		depend_expression(expr->unop);
+		break;
+	default:
+		break;
+	}
+}
+
+void depend_symbol(struct symbol *sym)
+{
+	struct dep_flags *f;
+	if (!(f = depend_visited(sym)))
+		return;
+	
+	depend_symbol(sym->ctype.base_type);
+	depend_symbol_list(f->symlist);
+
+	switch (sym->namespace) {
+	case NS_PREPROCESSOR:
+		break;
+	case NS_MACRO:
+		/*depend_macro_dep(sym->dep);*/
+	default:
+		/*use_line_fromto(sym->token, sym->endtoken);*/
+		depend_symbol_list(sym->arguments);
+		depend_symbol_list(sym->symbol_list);
+		depend_statement(sym->stmt);
+		break;
+	}
+}
diff --git a/depend.h b/depend.h
new file mode 100644
index 0000000..34a2159
--- /dev/null
+++ b/depend.h
@@ -0,0 +1,91 @@
+/*
+ * Build macros dependency tree 
+ * Copyright (C) 2012 Konrad Eisele <konrad@gaisler.com>
+ * BSD-License
+ * Redistribution and use in source and binary forms are permitted
+ * provided that the above copyright notice and this paragraph are
+ * duplicated in all such forms and that any documentation,
+ * advertising materials, and other materials related to such
+ * distribution and use acknowledge that the software was developed
+ * by the <organization>.  The name of the
+ * University may not be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ */
+
+#ifndef _SPARSE_DEPEND_H
+#define _SPARSE_DEPEND_H
+
+#include "attr.h"
+#include "token.h"
+
+struct macro_dep_el {
+	struct macro_dep_el *n;
+	struct symbol *sym;
+};
+
+struct macro_dep {
+	struct macro_dep *up, *same;
+	struct token *ppline;
+	struct macro_dep_el *f;
+	unsigned int visited : 1;
+};
+
+struct dep_flags {
+	struct token *tok;
+	struct token *end;
+	struct symbol_list *symlist;
+	unsigned int visited : 1;
+};
+
+struct macro_expansion {
+	int nargs;
+	struct symbol *sym;
+	struct token *m;
+	struct arg args[0];
+};
+
+struct tok_macro_dep {
+	struct macro_expansion *m;
+	unsigned int argi;
+	unsigned int isbody : 1;
+	unsigned int visited : 1;
+};
+
+struct depend_if {
+	/* ATTR_TAG_MACRO_DEP tagging */
+	void * (*push_dep)(struct depend_if *dif, struct token *tok);
+	void * (*else_dep)(struct depend_if *dif, struct token *tok);
+	void * (*pop_dep)(struct depend_if *dif, struct token *tok);
+	void * (*push_sym)(struct depend_if *dif, struct symbol *sym);
+	void * (*on)(struct depend_if *dif, void *p);
+	void * (*off)(struct depend_if *dif, void *p);
+	/* ATTR_TAG_FLAGS tagging */
+	void * (*tag_dep)(struct depend_if *dif,struct token *tok);
+	void * (*set_tok)(struct depend_if *dif,void *p, struct token *tok);
+	void * (*end_tok)(struct depend_if *dif,void *p, struct token *tok);
+	void * (*inherit_tok)(struct depend_if *dif,void *a, void *b);
+	void * (*set_symlist)(struct depend_if *dif,void *a, struct symbol_list * l);
+	/* ATTR_TAG_MACRO_EXP tagging */
+	void * (*macro_exp)(struct depend_if *dif, struct token *tok, struct tok_macro_dep *m);
+	
+	struct macro_dep *dep, *dep_use;
+	struct macro_dep root_dep;
+};
+
+#define DEPEN() (dif != 0)
+#define DEPCALL(f,v) if (DEPEN()) dif->f(dif,v)
+#define DEP_TOK(p,tok) if (DEPEN()) dif->set_tok(dif,p,tok)
+#define DEP_INHERIT(a,b) if (DEPEN()) dif->inherit_tok(dif,a,b)
+#define DEP_END(p,tok) if (DEPEN()) dif->end_tok(dif,p,tok)
+#define DEP_SYMLIST(p,l) if (DEPEN()) dif->set_symlist(dif,p,l)
+#define DEP_MACRO_EXP(p,m,i,b) if (DEPEN()) {		\
+		struct tok_macro_dep n = { m, i, b };	\
+		dif->macro_exp(dif,p,&n);		\
+	}
+
+extern void init_dep(void);
+extern void depend_symbol(struct symbol *sym);
+#endif
diff --git a/expand.c b/expand.c
index 63a9075..cf292e1 100644
--- a/expand.c
+++ b/expand.c
@@ -859,6 +859,7 @@ static int expand_pos_expression(struct expression *expr)
 						 * zero..
 						 */
 						reuse = alloc_expression(entry->pos, EXPR_POS);
+						DEP_INHERIT (reuse, entry);
 					}
 					reuse->type = EXPR_POS;
 					reuse->ctype = entry->ctype;
diff --git a/expression.c b/expression.c
index 0ae3a60..9ae8148 100644
--- a/expression.c
+++ b/expression.c
@@ -47,7 +47,7 @@ struct token *parens_expression(struct token *token, struct expression **expr, c
 {
 	token = expect(token, '(', where);
 	if (match_op(token, '{')) {
-		struct expression *e = alloc_expression(token->pos, EXPR_STATEMENT);
+		struct expression *e = alloc_expression_tok(token, EXPR_STATEMENT);
 		struct statement *stmt = alloc_statement(token->pos, STMT_COMPOUND);
 		*expr = e;
 		e->statement = stmt;
@@ -116,7 +116,7 @@ static int convert_function(struct token *next)
 static struct token *parse_type(struct token *token, struct expression **tree)
 {
 	struct symbol *sym;
-	*tree = alloc_expression(token->pos, EXPR_TYPE);
+	*tree = alloc_expression_tok(token, EXPR_TYPE);
 	(*tree)->flags = Int_const_expr; /* sic */
 	token = typename(token, &sym, NULL);
 	if (sym->ident)
@@ -130,8 +130,8 @@ static struct token *parse_type(struct token *token, struct expression **tree)
 static struct token *builtin_types_compatible_p_expr(struct token *token,
 						     struct expression **tree)
 {
-	struct expression *expr = alloc_expression(
-		token->pos, EXPR_COMPARE);
+	struct expression *expr = alloc_expression_tok(
+		token, EXPR_COMPARE);
 	expr->flags = Int_const_expr;
 	expr->op = SPECIAL_EQUAL;
 	token = token->next;
@@ -185,7 +185,7 @@ static struct token *builtin_offsetof_expr(struct token *token,
 		default:
 			return expect(token, ')', "at end of __builtin_offset");
 		case SPECIAL_DEREFERENCE:
-			e = alloc_expression(token->pos, EXPR_OFFSETOF);
+			e = alloc_expression_tok(token, EXPR_OFFSETOF);
 			e->flags = Int_const_expr;
 			e->op = '[';
 			*p = e;
@@ -193,7 +193,7 @@ static struct token *builtin_offsetof_expr(struct token *token,
 			/* fall through */
 		case '.':
 			token = token->next;
-			e = alloc_expression(token->pos, EXPR_OFFSETOF);
+			e = alloc_expression_tok(token, EXPR_OFFSETOF);
 			e->flags = Int_const_expr;
 			e->op = '.';
 			if (token_type(token) != TOKEN_IDENT) {
@@ -205,7 +205,7 @@ static struct token *builtin_offsetof_expr(struct token *token,
 			break;
 		case '[':
 			token = token->next;
-			e = alloc_expression(token->pos, EXPR_OFFSETOF);
+			e = alloc_expression_tok(token, EXPR_OFFSETOF);
 			e->flags = Int_const_expr;
 			e->op = '[';
 			token = parse_expression(token, &e->index);
@@ -406,7 +406,7 @@ struct token *primary_expression(struct token *token, struct expression **tree)
 	switch (token_type(token)) {
 	case TOKEN_CHAR:
 	case TOKEN_WIDE_CHAR:
-		expr = alloc_expression(token->pos, EXPR_VALUE);   
+		expr = alloc_expression_tok(token, EXPR_VALUE);   
 		expr->flags = Int_const_expr;
 		expr->ctype = token_type(token) == TOKEN_CHAR ? &int_ctype : &long_ctype;
 		expr->value = (unsigned char) token->character;
@@ -414,13 +414,13 @@ struct token *primary_expression(struct token *token, struct expression **tree)
 		break;
 
 	case TOKEN_NUMBER:
-		expr = alloc_expression(token->pos, EXPR_VALUE);
+		expr = alloc_expression_tok(token, EXPR_VALUE);
 		get_number_value(expr, token); /* will see if it's an integer */
 		token = token->next;
 		break;
 
 	case TOKEN_ZERO_IDENT: {
-		expr = alloc_expression(token->pos, EXPR_SYMBOL);
+		expr = alloc_expression_tok(token, EXPR_SYMBOL);
 		expr->flags = Int_const_expr;
 		expr->ctype = &int_ctype;
 		expr->symbol = &zero_int;
@@ -445,7 +445,7 @@ struct token *primary_expression(struct token *token, struct expression **tree)
 				break;
 			}
 		} else if (sym->enum_member) {
-			expr = alloc_expression(token->pos, EXPR_VALUE);
+			expr = alloc_expression_tok(token, EXPR_VALUE);
 			*expr = *sym->initializer;
 			/* we want the right position reported, thus the copy */
 			expr->pos = token->pos;
@@ -454,7 +454,7 @@ struct token *primary_expression(struct token *token, struct expression **tree)
 			break;
 		}
 
-		expr = alloc_expression(token->pos, EXPR_SYMBOL);
+		expr = alloc_expression_tok(token, EXPR_SYMBOL);
 
 		/*
 		 * We support types as real first-class citizens, with type
@@ -475,7 +475,7 @@ struct token *primary_expression(struct token *token, struct expression **tree)
 	case TOKEN_STRING:
 	case TOKEN_WIDE_STRING: {
 	handle_string:
-		expr = alloc_expression(token->pos, EXPR_STRING);
+		expr = alloc_expression_tok(token, EXPR_STRING);
 		expr->wide = token_type(token) == TOKEN_WIDE_STRING;
 		token = string_expression(token, expr);
 		break;
@@ -483,7 +483,7 @@ struct token *primary_expression(struct token *token, struct expression **tree)
 
 	case TOKEN_SPECIAL:
 		if (token->special == '(') {
-			expr = alloc_expression(token->pos, EXPR_PREOP);
+			expr = alloc_expression_tok(token, EXPR_PREOP);
 			expr->op = '(';
 			token = parens_expression(token, &expr->unop, "in expression");
 			if (expr->unop)
@@ -491,7 +491,7 @@ struct token *primary_expression(struct token *token, struct expression **tree)
 			break;
 		}
 		if (token->special == '[' && lookup_type(token->next)) {
-			expr = alloc_expression(token->pos, EXPR_TYPE);
+			expr = alloc_expression_tok(token, EXPR_TYPE);
 			expr->flags = Int_const_expr; /* sic */
 			token = typename(token->next, &expr->symbol, NULL);
 			token = expect(token, ']', "in type expression");
@@ -534,8 +534,8 @@ static struct token *postfix_expression(struct token *token, struct expression *
 	while (expr && token_type(token) == TOKEN_SPECIAL) {
 		switch (token->special) {
 		case '[': {			/* Array dereference */
-			struct expression *deref = alloc_expression(token->pos, EXPR_PREOP);
-			struct expression *add = alloc_expression(token->pos, EXPR_BINOP);
+			struct expression *deref = alloc_expression_tok(token, EXPR_PREOP);
+			struct expression *add = alloc_expression_tok(token, EXPR_BINOP);
 
 			deref->op = '*';
 			deref->unop = add;
@@ -549,7 +549,7 @@ static struct token *postfix_expression(struct token *token, struct expression *
 		}
 		case SPECIAL_INCREMENT:		/* Post-increment */
 		case SPECIAL_DECREMENT:	{	/* Post-decrement */
-			struct expression *post = alloc_expression(token->pos, EXPR_POSTOP);
+			struct expression *post = alloc_expression_tok(token, EXPR_POSTOP);
 			post->op = token->special;
 			post->unop = expr;
 			expr = post;
@@ -558,14 +558,14 @@ static struct token *postfix_expression(struct token *token, struct expression *
 		}
 		case SPECIAL_DEREFERENCE: {	/* Structure pointer member dereference */
 			/* "x->y" is just shorthand for "(*x).y" */
-			struct expression *inner = alloc_expression(token->pos, EXPR_PREOP);
+			struct expression *inner = alloc_expression_tok(token, EXPR_PREOP);
 			inner->op = '*';
 			inner->unop = expr;
 			expr = inner;
 		}
 		/* Fall through!! */
 		case '.': {			/* Structure member dereference */
-			struct expression *deref = alloc_expression(token->pos, EXPR_DEREF);
+			struct expression *deref = alloc_expression_tok(token, EXPR_DEREF);
 			deref->op = '.';
 			deref->deref = expr;
 			token = token->next;
@@ -580,7 +580,7 @@ static struct token *postfix_expression(struct token *token, struct expression *
 		}
 
 		case '(': {			/* Function call */
-			struct expression *call = alloc_expression(token->pos, EXPR_CALL);
+			struct expression *call = alloc_expression_tok(token, EXPR_CALL);
 			call->op = '(';
 			call->fn = expr;
 			token = expression_list(token->next, &call->args);
@@ -604,7 +604,7 @@ static struct token *unary_expression(struct token *token, struct expression **t
 static struct token *type_info_expression(struct token *token,
 	struct expression **tree, int type)
 {
-	struct expression *expr = alloc_expression(token->pos, type);
+	struct expression *expr = alloc_expression_tok(token, type);
 	struct token *p;
 
 	*tree = expr;
@@ -630,7 +630,7 @@ static struct token *type_info_expression(struct token *token,
 	 * of a typed initializer expression..
 	 */
 	if (match_op(token, '{')) {
-		struct expression *cast = alloc_expression(p->pos, EXPR_CAST);
+		struct expression *cast = alloc_expression_tok(p, EXPR_CAST);
 		cast->cast_type = expr->cast_type;
 		expr->cast_type = NULL;
 		expr->cast_expression = cast;
@@ -676,7 +676,7 @@ static struct token *unary_expression(struct token *token, struct expression **t
 				*tree = NULL;
 				return next;
 			}
-			unary = alloc_expression(token->pos, EXPR_PREOP);
+			unary = alloc_expression_tok(token, EXPR_PREOP);
 			unary->op = token->special;
 			unary->unop = unop;
 			*tree = unary;
@@ -694,7 +694,7 @@ static struct token *unary_expression(struct token *token, struct expression **t
 				*tree = NULL;
 				return next;
 			}
-			unary = alloc_expression(token->pos, EXPR_PREOP);
+			unary = alloc_expression_tok(token, EXPR_PREOP);
 			unary->op = token->special;
 			unary->unop = unop;
 			unary->flags = unop->flags & Int_const_expr;
@@ -704,7 +704,7 @@ static struct token *unary_expression(struct token *token, struct expression **t
 		/* Gcc extension: &&label gives the address of a label */
 		if (match_op(token, SPECIAL_LOGICAL_AND) &&
 		    token_type(token->next) == TOKEN_IDENT) {
-			struct expression *label = alloc_expression(token->pos, EXPR_LABEL);
+			struct expression *label = alloc_expression_tok(token, EXPR_LABEL);
 			struct symbol *sym = label_symbol(token->next);
 			if (!(sym->ctype.modifiers & MOD_ADDRESSABLE)) {
 				sym->ctype.modifiers |= MOD_ADDRESSABLE;
@@ -733,7 +733,7 @@ static struct token *cast_expression(struct token *token, struct expression **tr
 	if (match_op(token, '(')) {
 		struct token *next = token->next;
 		if (lookup_type(next)) {
-			struct expression *cast = alloc_expression(next->pos, EXPR_CAST);
+			struct expression *cast = alloc_expression_tok(next, EXPR_CAST);
 			struct expression *v;
 			struct symbol *sym;
 			int is_force;
@@ -789,7 +789,7 @@ static struct token *cast_expression(struct token *token, struct expression **tr
 									\
 			if (!(compare))					\
 				goto out;				\
-			top = alloc_expression(next->pos, type);	\
+			top = alloc_expression_tok(next, type);	\
 			next = inner(next->next, &right);		\
 			if (!right) {					\
 				sparse_error(next->pos, "No right hand side of '%s'-expression", show_special(op));	\
@@ -892,7 +892,7 @@ struct token *conditional_expression(struct token *token, struct expression **tr
 {
 	token = logical_or_expression(token, tree);
 	if (*tree && match_op(token, '?')) {
-		struct expression *expr = alloc_expression(token->pos, EXPR_CONDITIONAL);
+		struct expression *expr = alloc_expression_tok(token, EXPR_CONDITIONAL);
 		expr->op = token->special;
 		expr->left = *tree;
 		*tree = expr;
@@ -925,7 +925,7 @@ struct token *assignment_expression(struct token *token, struct expression **tre
 		int i, op = token->special;
 		for (i = 0; i < ARRAY_SIZE(assignments); i++)
 			if (assignments[i] == op) {
-				struct expression * expr = alloc_expression(token->pos, EXPR_ASSIGNMENT);
+				struct expression * expr = alloc_expression_tok(token, EXPR_ASSIGNMENT);
 				expr->left = *tree;
 				expr->op = op;
 				*tree = expr;
diff --git a/expression.h b/expression.h
index 9778de8..95edf0f 100644
--- a/expression.h
+++ b/expression.h
@@ -14,6 +14,7 @@
 #include "allocate.h"
 #include "lib.h"
 #include "symbol.h"
+#include "depend.h"
 
 struct expression_list;
 
@@ -185,6 +186,13 @@ static inline struct expression *alloc_expression(struct position pos, int type)
 	return expr;
 }
 
+static inline struct expression *alloc_expression_tok(struct token *tok, int type)
+{
+	struct expression *e = alloc_expression(tok->pos, type);
+	DEP_TOK(e, tok);
+	return e;
+}
+
 static inline struct expression *alloc_const_expression(struct position pos, int value)
 {
 	struct expression *expr = __alloc_expression(0);
diff --git a/inline.c b/inline.c
index 9ed4570..5ba1747 100644
--- a/inline.c
+++ b/inline.c
@@ -20,6 +20,7 @@
 static struct expression * dup_expression(struct expression *expr)
 {
 	struct expression *dup = alloc_expression(expr->pos, expr->type);
+	DEP_INHERIT (dup, expr);
 	*dup = *expr;
 	return dup;
 }
@@ -27,6 +28,7 @@ static struct expression * dup_expression(struct expression *expr)
 static struct statement * dup_statement(struct statement *stmt)
 {
 	struct statement *dup = alloc_statement(stmt->pos, stmt->type);
+	DEP_INHERIT (dup, stmt);
 	*dup = *stmt;
 	return dup;
 }
@@ -142,6 +144,7 @@ static struct expression * copy_expression(struct expression *expr)
 			expr = dup_expression(expr);
 			expr->cast_expression = copy_expression(cast);
 			expr->cast_type = alloc_symbol(sym->pos, sym->type);
+			DEP_INHERIT (expr->cast_type, sym);
 			*expr->cast_type = *sym;
 			break;
 		}
@@ -176,6 +179,7 @@ static struct expression * copy_expression(struct expression *expr)
 	/* Statement expression */
 	case EXPR_STATEMENT: {
 		struct statement *stmt = alloc_statement(expr->pos, STMT_COMPOUND);
+		DEP_INHERIT (stmt, expr);
 		copy_statement(expr->statement, stmt);
 		expr = dup_expression(expr);
 		expr->statement = stmt;
@@ -347,6 +351,7 @@ static struct statement *copy_one_statement(struct statement *stmt)
 	}
 	case STMT_COMPOUND: {
 		struct statement *new = alloc_statement(stmt->pos, STMT_COMPOUND);
+		DEP_INHERIT (new, stmt);
 		copy_statement(stmt, new);
 		stmt = new;
 		break;
@@ -469,6 +474,7 @@ static struct symbol *create_copy_symbol(struct symbol *orig)
 	struct symbol *sym = orig;
 	if (orig) {
 		sym = alloc_symbol(orig->pos, orig->type);
+		DEP_INHERIT (sym, orig);
 		*sym = *orig;
 		sym->bb_target = NULL;
 		sym->pseudo = NULL;
@@ -499,6 +505,7 @@ int inline_function(struct expression *expr, struct symbol *sym)
 	struct symbol_list *name_list, *arg_decl;
 	struct symbol *name;
 	struct expression *arg;
+	DEP_INHERIT (stmt, expr);
 
 	if (!fn->inline_stmt) {
 		sparse_error(fn->pos, "marked inline, but without a definition");
@@ -521,6 +528,7 @@ int inline_function(struct expression *expr, struct symbol *sym)
 	PREPARE_PTR_LIST(name_list, name);
 	FOR_EACH_PTR(arg_list, arg) {
 		struct symbol *a = alloc_symbol(arg->pos, SYM_NODE);
+		DEP_INHERIT (a, arg);
 
 		a->ctype.base_type = arg->ctype;
 		if (name) {
@@ -539,6 +547,7 @@ int inline_function(struct expression *expr, struct symbol *sym)
 
 	if (arg_decl) {
 		struct statement *decl = alloc_statement(expr->pos, STMT_DECLARATION);
+		DEP_INHERIT (decl, expr);
 		decl->declaration = arg_decl;
 		stmt->args = decl;
 	}
@@ -563,6 +572,7 @@ void uninline(struct symbol *sym)
 		p->replace = p;
 	} END_FOR_EACH_PTR(p);
 	fn->stmt = alloc_statement(fn->pos, STMT_COMPOUND);
+	DEP_INHERIT (fn->stmt, fn);
 	copy_statement(fn->inline_stmt, fn->stmt);
 	unset_replace_list(sym->symbol_list);
 	unset_replace_list(arg_list);
diff --git a/lib.c b/lib.c
index 396e9f1..d835870 100644
--- a/lib.c
+++ b/lib.c
@@ -213,6 +213,8 @@ int Wundef = 0;
 int Wuninitialized = 1;
 int Wdeclarationafterstatement = -1;
 
+int fnobuiltin = 0;
+
 int dbg_entry = 0;
 int dbg_dead = 0;
 
@@ -553,6 +555,9 @@ static char **handle_switch_f(char *arg, char **next)
 		arg += 3;
 	}
 	/* handle switch here.. */
+	if (!strncmp(arg, "builtin", 7)) {
+		fnobuiltin = 1;
+	}
 	return next;
 }
 
@@ -668,6 +673,9 @@ static char **handle_switch(char *arg, char **next)
 
 void declare_builtin_functions(void)
 {
+	if (fnobuiltin)
+		return;
+
 	/* Gaah. gcc knows tons of builtin <string.h> functions */
 	add_pre_buffer("extern void *__builtin_memcpy(void *, const void *, __SIZE_TYPE__);\n");
 	add_pre_buffer("extern void *__builtin_mempcpy(void *, const void *, __SIZE_TYPE__);\n");
diff --git a/parse.c b/parse.c
index bd42180..f7bfc1c 100644
--- a/parse.c
+++ b/parse.c
@@ -652,9 +652,9 @@ static void apply_modifiers(struct position pos, struct decl_state *ctx)
 	
 }
 
-static struct symbol * alloc_indirect_symbol(struct position pos, struct ctype *ctype, int type)
+static struct symbol * alloc_indirect_symbol(struct token *tok, struct ctype *ctype, int type)
 {
-	struct symbol *sym = alloc_symbol(pos, type);
+	struct symbol *sym = alloc_symbol_tok(tok, type);
 
 	sym->ctype.base_type = ctype->base_type;
 	sym->ctype.modifiers = ctype->modifiers;
@@ -673,7 +673,7 @@ struct symbol *label_symbol(struct token *token)
 {
 	struct symbol *sym = lookup_symbol(token->ident, NS_LABEL);
 	if (!sym) {
-		sym = alloc_symbol(token->pos, SYM_LABEL);
+		sym = alloc_symbol_tok(token, SYM_LABEL);
 		bind_symbol(sym, token->ident, NS_LABEL);
 		fn_local_symbol(sym);
 	}
@@ -695,7 +695,7 @@ static struct token *struct_union_enum_specifier(enum type type,
 		     (match_op(token->next,';') || match_op(token->next,'{')))) {
 			// Either a new symbol, or else an out-of-scope
 			// symbol being redefined.
-			sym = alloc_symbol(token->pos, type);
+			sym = alloc_symbol_tok(token, type);
 			bind_symbol(sym, token->ident, NS_STRUCT);
 		}
 		if (sym->type != type)
@@ -716,6 +716,7 @@ static struct token *struct_union_enum_specifier(enum type type,
 			// Mark the structure as needing re-examination
 			sym->examined = 0;
 			sym->endpos = token->pos;
+			DEP_END (sym, token);
 		}
 		return token;
 	}
@@ -727,11 +728,12 @@ static struct token *struct_union_enum_specifier(enum type type,
 		return token;
 	}
 
-	sym = alloc_symbol(token->pos, type);
+	sym = alloc_symbol_tok(token, type);
 	token = parse(token->next, sym);
 	ctx->ctype.base_type = sym;
 	token =  expect(token, '}', "at end of specifier");
 	sym->endpos = token->pos;
+	DEP_END (sym, token);
 
 	return token;
 }
@@ -875,7 +877,7 @@ static struct token *parse_enum_declaration(struct token *token, struct symbol *
 			expr->ctype = ctype;
 		}
 
-		sym = alloc_symbol(token->pos, SYM_NODE);
+		sym = alloc_symbol_tok(token, SYM_NODE);
 		bind_symbol(sym, token->ident, NS_SYMBOL);
 		sym->ctype.modifiers &= ~MOD_ADDRESSABLE;
 		sym->initializer = expr;
@@ -926,6 +928,7 @@ static struct token *parse_enum_declaration(struct token *token, struct symbol *
 		token = next;
 
 		sym->endpos = token->pos;
+		DEP_END (sym, token);
 
 		if (!match_op(token, ','))
 			break;
@@ -988,10 +991,11 @@ static struct token *typeof_specifier(struct token *token, struct decl_state *ct
 		ctx->ctype.base_type = sym->ctype.base_type;
 		apply_ctype(token->pos, &sym->ctype, &ctx->ctype);
 	} else {
-		struct symbol *typeof_sym = alloc_symbol(token->pos, SYM_TYPEOF);
+		struct symbol *typeof_sym = alloc_symbol_tok(token, SYM_TYPEOF);
 		token = parse_expression(token->next, &typeof_sym->initializer);
 
 		typeof_sym->endpos = token->pos;
+		DEP_END (typeof_sym, token);
 		if (!typeof_sym->initializer) {
 			sparse_error(token->pos, "expected expression after the '(' token");
 			typeof_sym = &bad_ctype;
@@ -1440,6 +1444,8 @@ static struct token *declaration_specifiers(struct token *token, struct decl_sta
 						 NS_TYPEDEF | NS_SYMBOL);
 		if (!s || !(s->namespace & NS_TYPEDEF))
 			break;
+		if (DEPEN())
+			add_symbol(&ctx->dep, s);
 		if (s->type != SYM_KEYWORD) {
 			if (seen & Set_Any)
 				break;
@@ -1493,7 +1499,7 @@ static struct token *declaration_specifiers(struct token *token, struct decl_sta
 			sparse_error(token->pos, "invalid modifier");
 			return token;
 		}
-		type = alloc_symbol(token->pos, SYM_BASETYPE);
+		type = alloc_symbol_tok(token, SYM_BASETYPE);
 		*type = *ctx->ctype.base_type;
 		type->ctype.modifiers &= ~MOD_SPECIFIER;
 		type->ctype.base_type = ctx->ctype.base_type;
@@ -1679,7 +1685,7 @@ static struct token *direct_declarator(struct token *token, struct decl_state *c
 	if (match_op(token, '(')) {
 		enum kind kind = which_func(token, p, ctx->prefer_abstract);
 		struct symbol *fn;
-		fn = alloc_indirect_symbol(token->pos, ctype, SYM_FN);
+		fn = alloc_indirect_symbol(token, ctype, SYM_FN);
 		token = token->next;
 		if (kind == K_R)
 			token = identifier_list(token, fn);
@@ -1687,15 +1693,17 @@ static struct token *direct_declarator(struct token *token, struct decl_state *c
 			token = parameter_type_list(token, fn);
 		token = expect(token, ')', "in function declarator");
 		fn->endpos = token->pos;
+		DEP_END (fn, token);
 		return token;
 	}
 
 	while (match_op(token, '[')) {
 		struct symbol *array;
-		array = alloc_indirect_symbol(token->pos, ctype, SYM_ARRAY);
+		array = alloc_indirect_symbol(token, ctype, SYM_ARRAY);
 		token = abstract_array_declarator(token->next, array);
 		token = expect(token, ']', "in abstract_array_declarator");
 		array->endpos = token->pos;
+		DEP_END (array, token);
 		ctype = &array->ctype;
 	}
 	return token;
@@ -1704,7 +1712,7 @@ static struct token *direct_declarator(struct token *token, struct decl_state *c
 static struct token *pointer(struct token *token, struct decl_state *ctx)
 {
 	while (match_op(token,'*')) {
-		struct symbol *ptr = alloc_symbol(token->pos, SYM_PTR);
+		struct symbol *ptr = alloc_symbol_tok(token, SYM_PTR);
 		ptr->ctype.modifiers = ctx->ctype.modifiers;
 		ptr->ctype.base_type = ctx->ctype.base_type;
 		ptr->ctype.as = ctx->ctype.as;
@@ -1717,6 +1725,8 @@ static struct token *pointer(struct token *token, struct decl_state *ctx)
 
 		token = handle_qualifiers(token->next, ctx);
 		ctx->ctype.base_type->endpos = token->pos;
+		DEP_END (ctx->ctype.base_type, token);
+
 	}
 	return token;
 }
@@ -1741,7 +1751,7 @@ static struct token *handle_bitfield(struct token *token, struct decl_state *ctx
 		return conditional_expression(token->next, &expr);
 	}
 
-	bitfield = alloc_indirect_symbol(token->pos, ctype, SYM_BITFIELD);
+	bitfield = alloc_indirect_symbol(token, ctype, SYM_BITFIELD);
 	token = conditional_expression(token->next, &expr);
 	width = const_expression_value(expr);
 	bitfield->bit_size = width;
@@ -1772,6 +1782,7 @@ static struct token *handle_bitfield(struct token *token, struct decl_state *ctx
 	}
 	bitfield->bit_size = width;
 	bitfield->endpos = token->pos;
+	DEP_END (bitfield, token);
 	return token;
 }
 
@@ -1785,7 +1796,7 @@ static struct token *declaration_list(struct token *token, struct symbol_list **
 	mod = storage_modifiers(&ctx);
 	saved = ctx.ctype;
 	for (;;) {
-		struct symbol *decl = alloc_symbol(token->pos, SYM_NODE);
+		struct symbol *decl = alloc_symbol_tok(token, SYM_NODE);
 		ctx.ident = &decl->ident;
 
 		token = declarator(token, &ctx);
@@ -1798,6 +1809,7 @@ static struct token *declaration_list(struct token *token, struct symbol_list **
 		decl->ctype = ctx.ctype;
 		decl->ctype.modifiers |= mod;
 		decl->endpos = token->pos;
+		DEP_END (decl, token);
 		add_symbol(list, decl);
 		if (!match_op(token, ','))
 			break;
@@ -1833,6 +1845,7 @@ static struct token *parameter_declaration(struct token *token, struct symbol *s
 	sym->ctype = ctx.ctype;
 	sym->ctype.modifiers |= storage_modifiers(&ctx);
 	sym->endpos = token->pos;
+	DEP_END (sym, token);
 	return token;
 }
 
@@ -1840,13 +1853,14 @@ struct token *typename(struct token *token, struct symbol **p, int *forced)
 {
 	struct decl_state ctx = {.prefer_abstract = 1};
 	int class;
-	struct symbol *sym = alloc_symbol(token->pos, SYM_NODE);
+	struct symbol *sym = alloc_symbol_tok(token, SYM_NODE);
 	*p = sym;
 	token = declaration_specifiers(token, &ctx);
 	token = declarator(token, &ctx);
 	apply_modifiers(token->pos, &ctx);
 	sym->ctype = ctx.ctype;
 	sym->endpos = token->pos;
+	DEP_END (sym, token);
 	class = ctx.storage_class;
 	if (forced) {
 		*forced = 0;
@@ -1965,6 +1979,7 @@ static struct statement *make_statement(struct expression *expr)
 	if (!expr)
 		return NULL;
 	stmt = alloc_statement(expr->pos, STMT_EXPRESSION);
+	DEP_INHERIT (stmt, expr);
 	stmt->expression = expr;
 	return stmt;
 }
@@ -1984,8 +1999,10 @@ static void start_iterator(struct statement *stmt)
 
 	start_symbol_scope();
 	cont = alloc_symbol(stmt->pos, SYM_NODE);
+	DEP_INHERIT (cont, stmt);
 	bind_symbol(cont, &continue_ident, NS_ITERATOR);
 	brk = alloc_symbol(stmt->pos, SYM_NODE);
+	DEP_INHERIT (brk, stmt);
 	bind_symbol(brk, &break_ident, NS_ITERATOR);
 
 	stmt->type = STMT_ITERATOR;
@@ -2004,9 +2021,11 @@ static struct statement *start_function(struct symbol *sym)
 {
 	struct symbol *ret;
 	struct statement *stmt = alloc_statement(sym->pos, STMT_COMPOUND);
+	DEP_INHERIT (stmt, sym);
 
 	start_function_scope();
 	ret = alloc_symbol(sym->pos, SYM_NODE);
+	DEP_INHERIT (ret, sym);
 	ret->ctype = sym->ctype.base_type->ctype;
 	ret->ctype.modifiers &= ~(MOD_STORAGE | MOD_CONST | MOD_VOLATILE | MOD_TLS | MOD_INLINE | MOD_ADDRESSABLE | MOD_NOCAST | MOD_NODEREF | MOD_ACCESSED | MOD_TOPLEVEL);
 	ret->ctype.modifiers |= (MOD_AUTO | MOD_REGISTER);
@@ -2044,9 +2063,11 @@ static void start_switch(struct statement *stmt)
 
 	start_symbol_scope();
 	brk = alloc_symbol(stmt->pos, SYM_NODE);
+	DEP_INHERIT (brk, stmt);
 	bind_symbol(brk, &break_ident, NS_ITERATOR);
 
 	switch_case = alloc_symbol(stmt->pos, SYM_NODE);
+	DEP_INHERIT (switch_case, stmt);
 	bind_symbol(switch_case, &case_ident, NS_ITERATOR);
 	switch_case->stmt = stmt;
 
@@ -2076,6 +2097,7 @@ static void add_case_statement(struct statement *stmt)
 		return;
 	}
 	sym = alloc_symbol(stmt->pos, SYM_NODE);
+	DEP_INHERIT (sym, stmt);
 	add_symbol(&target->symbol_list, sym);
 	sym->stmt = stmt;
 	stmt->case_label = sym;
@@ -2266,6 +2288,7 @@ static struct token *parse_range_statement(struct token *token, struct statement
 static struct token *statement(struct token *token, struct statement **tree)
 {
 	struct statement *stmt = alloc_statement(token->pos, STMT_NONE);
+	DEP_TOK (stmt, token);
 
 	*tree = stmt;
 	if (token_type(token) == TOKEN_IDENT) {
@@ -2298,7 +2321,7 @@ static struct token *statement(struct token *token, struct statement **tree)
 static struct token *label_statement(struct token *token)
 {
 	while (token_type(token) == TOKEN_IDENT) {
-		struct symbol *sym = alloc_symbol(token->pos, SYM_LABEL);
+		struct symbol *sym = alloc_symbol_tok(token, SYM_LABEL);
 		/* it's block-scope, but we want label namespace */
 		bind_symbol(sym, token->ident, NS_SYMBOL);
 		sym->namespace = NS_LABEL;
@@ -2329,6 +2352,7 @@ static struct token * statement_list(struct token *token, struct statement_list
 				seen_statement = 0;
 			}
 			stmt = alloc_statement(token->pos, STMT_DECLARATION);
+			DEP_TOK (stmt, token);
 			token = external_declaration(token, &stmt->declaration);
 		} else {
 			seen_statement = Wdeclarationafterstatement;
@@ -2343,10 +2367,11 @@ static struct token *identifier_list(struct token *token, struct symbol *fn)
 {
 	struct symbol_list **list = &fn->arguments;
 	for (;;) {
-		struct symbol *sym = alloc_symbol(token->pos, SYM_NODE);
+		struct symbol *sym = alloc_symbol_tok(token, SYM_NODE);
 		sym->ident = token->ident;
 		token = token->next;
 		sym->endpos = token->pos;
+		DEP_END (sym, token);
 		sym->ctype.base_type = &incomplete_ctype;
 		add_symbol(list, sym);
 		if (!match_op(token, ',') ||
@@ -2371,7 +2396,7 @@ static struct token *parameter_type_list(struct token *token, struct symbol *fn)
 			break;
 		}
 
-		sym = alloc_symbol(token->pos, SYM_NODE);
+		sym = alloc_symbol_tok(token, SYM_NODE);
 		token = parameter_declaration(token, sym);
 		if (sym->ctype.base_type == &void_ctype) {
 			/* Special case: (void) */
@@ -2642,12 +2667,13 @@ static struct token *parse_k_r_arguments(struct token *token, struct symbol *dec
 
 static struct token *toplevel_asm_declaration(struct token *token, struct symbol_list **list)
 {
-	struct symbol *anon = alloc_symbol(token->pos, SYM_NODE);
-	struct symbol *fn = alloc_symbol(token->pos, SYM_FN);
+	struct symbol *anon = alloc_symbol_tok(token, SYM_NODE);
+	struct symbol *fn = alloc_symbol_tok(token, SYM_FN);
 	struct statement *stmt;
 
 	anon->ctype.base_type = fn;
 	stmt = alloc_statement(token->pos, STMT_NONE);
+	DEP_TOK (stmt, token);
 	fn->stmt = stmt;
 
 	token = parse_asm_statement(token, stmt);
@@ -2660,7 +2686,7 @@ struct token *external_declaration(struct token *token, struct symbol_list **lis
 {
 	struct ident *ident = NULL;
 	struct symbol *decl;
-	struct decl_state ctx = { .ident = &ident };
+	struct decl_state ctx = { .ident = &ident, .dep = 0 };
 	struct ctype saved;
 	struct symbol *base_type;
 	unsigned long mod;
@@ -2676,7 +2702,7 @@ struct token *external_declaration(struct token *token, struct symbol_list **lis
 	/* Parse declaration-specifiers, if any */
 	token = declaration_specifiers(token, &ctx);
 	mod = storage_modifiers(&ctx);
-	decl = alloc_symbol(token->pos, SYM_NODE);
+	decl = alloc_symbol_tok(token, SYM_NODE);
 	/* Just a type declaration? */
 	if (match_op(token, ';')) {
 		apply_modifiers(token->pos, &ctx);
@@ -2691,6 +2717,8 @@ struct token *external_declaration(struct token *token, struct symbol_list **lis
 	decl->ctype = ctx.ctype;
 	decl->ctype.modifiers |= mod;
 	decl->endpos = token->pos;
+	DEP_END (decl, token);
+	DEP_SYMLIST (decl, ctx.dep);
 
 	/* Just a type declaration? */
 	if (!ident) {
@@ -2755,7 +2783,7 @@ struct token *external_declaration(struct token *token, struct symbol_list **lis
 
 		token = token->next;
 		ident = NULL;
-		decl = alloc_symbol(token->pos, SYM_NODE);
+		decl = alloc_symbol_tok(token, SYM_NODE);
 		ctx.ctype = saved;
 		token = handle_attributes(token, &ctx, KW_ATTRIBUTE);
 		token = declarator(token, &ctx);
@@ -2764,6 +2792,7 @@ struct token *external_declaration(struct token *token, struct symbol_list **lis
 		decl->ctype = ctx.ctype;
 		decl->ctype.modifiers |= mod;
 		decl->endpos = token->pos;
+		DEP_END (decl, token);
 		if (!ident) {
 			sparse_error(token->pos, "expected identifier name in type definition");
 			return token;
diff --git a/parse.h b/parse.h
index b26bd03..9ff6639 100644
--- a/parse.h
+++ b/parse.h
@@ -10,6 +10,7 @@
  */
 
 #include "symbol.h"
+#include "depend.h"
 
 enum statement_type {
 	STMT_NONE,
diff --git a/pre-process.c b/pre-process.c
index 8a16f8b..3dbbb11 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -27,7 +27,10 @@
 #include "symbol.h"
 #include "expression.h"
 #include "scope.h"
+#include "depend.h"
 
+/* interface to depend.c, if set */
+struct depend_if *dif;
 static int false_nesting = 0;
 
 #define INCLUDEPATHS 300
@@ -111,6 +114,7 @@ static struct symbol *lookup_macro(struct ident *ident)
 	struct symbol *sym = lookup_symbol(ident, NS_MACRO | NS_UNDEF);
 	if (sym && sym->namespace != NS_MACRO)
 		sym = NULL;
+	DEPCALL(push_sym, sym);
 	return sym;
 }
 
@@ -244,15 +248,6 @@ static struct token *collect_arg(struct token *prev, int vararg, struct position
  * We store arglist as <counter> [arg1] <number of uses for arg1> ... eof
  */
 
-struct arg {
-	struct token *arg;
-	struct token *expanded;
-	struct token *str;
-	int n_normal;
-	int n_quoted;
-	int n_str;
-};
-
 static int collect_arguments(struct token *start, struct token *arglist, struct arg *args, struct token *what)
 {
 	int wanted = arglist->count.normal;
@@ -325,15 +320,22 @@ out:
 	return 0;
 }
 
-static struct token *dup_list(struct token *list)
+static struct token *dup_one(struct token *list)
+{
+	struct token *newtok = __alloc_token(0);
+	*newtok = *list;
+	return newtok;
+}
+
+static struct token *dup_list(struct token *list, struct macro_expansion *me, int argi)
 {
 	struct token *res = NULL;
 	struct token **p = &res;
 
 	while (!eof_token(list)) {
-		struct token *newtok = __alloc_token(0);
-		*newtok = *list;
+		struct token *newtok = dup_one(list);
 		*p = newtok;
+		DEP_MACRO_EXP(newtok, me, argi, 0);
 		p = &newtok->next;
 		list = list->next;
 	}
@@ -356,7 +358,7 @@ static struct token *stringify(struct token *arg)
 	return token;
 }
 
-static void expand_arguments(int count, struct arg *args)
+static void expand_arguments(int count, struct arg *args, struct macro_expansion *me)
 {
 	int i;
 	for (i = 0; i < count; i++) {
@@ -372,7 +374,7 @@ static void expand_arguments(int count, struct arg *args)
 			} else if (eof_token(arg)) {
 				args[i].expanded = arg;
 			} else {
-				args[i].expanded = dup_list(arg);
+				args[i].expanded = dup_list(arg, me, i);
 			}
 			expand_list(&args[i].expanded);
 		}
@@ -472,7 +474,7 @@ static int merge(struct token *left, struct token *right)
 	return 0;
 }
 
-static struct token *dup_token(struct token *token, struct position *streampos, struct position *pos)
+static struct token *dup_token(struct token *token, struct position *streampos, struct position *pos, struct macro_expansion *me)
 {
 	struct token *alloc = alloc_token(streampos);
 	token_type(alloc) = token_type(token);
@@ -480,16 +482,17 @@ static struct token *dup_token(struct token *token, struct position *streampos,
 	alloc->pos.whitespace = pos->whitespace;
 	alloc->number = token->number;
 	alloc->pos.noexpand = token->pos.noexpand;
-	return alloc;	
+	DEP_MACRO_EXP(alloc, me, 0, 1);
+	return alloc;
 }
 
-static struct token **copy(struct token **where, struct token *list, int *count)
+static struct token **copy(struct token **where, struct token *list, int *count, struct macro_expansion *me)
 {
 	int need_copy = --*count;
 	while (!eof_token(list)) {
 		struct token *token;
 		if (need_copy)
-			token = dup_token(list, &list->pos, &list->pos);
+			token = dup_token(list, &list->pos, &list->pos, me);
 		else
 			token = list;
 		if (token_type(token) == TOKEN_IDENT && token->ident->tainted)
@@ -502,7 +505,7 @@ static struct token **copy(struct token **where, struct token *list, int *count)
 	return where;
 }
 
-static struct token **substitute(struct token **list, struct token *body, struct arg *args)
+static struct token **substitute(struct token **list, struct token *body, struct arg *args, struct macro_expansion *me)
 {
 	struct token *token = *list;
 	struct position *base_pos = &token->pos;
@@ -526,7 +529,7 @@ static struct token **substitute(struct token **list, struct token *body, struct
 			 */
 			if (!args[body->next->argnum].arg)
 				continue;
-			added = dup_token(body, base_pos, pos);
+			added = dup_token(body, base_pos, pos, me);
 			token_type(added) = TOKEN_SPECIAL;
 			tail = &added->next;
 			break;
@@ -556,7 +559,7 @@ static struct token **substitute(struct token **list, struct token *body, struct
 				continue;
 			}
 		copy_arg:
-			tail = copy(&added, arg, count);
+			tail = copy(&added, arg, count, me);
 			added->pos.newline = pos->newline;
 			added->pos.whitespace = pos->whitespace;
 			break;
@@ -569,14 +572,14 @@ static struct token **substitute(struct token **list, struct token *body, struct
 			continue;
 
 		case TOKEN_IDENT:
-			added = dup_token(body, base_pos, pos);
+			added = dup_token(body, base_pos, pos, me);
 			if (added->ident->tainted)
 				added->pos.noexpand = 1;
 			tail = &added->next;
 			break;
 
 		default:
-			added = dup_token(body, base_pos, pos);
+			added = dup_token(body, base_pos, pos, me);
 			tail = &added->next;
 			break;
 		}
@@ -606,7 +609,16 @@ static int expand(struct token **list, struct symbol *sym)
 	struct ident *expanding = token->ident;
 	struct token **tail;
 	int nargs = sym->arglist ? sym->arglist->count.normal : 0;
-	struct arg args[nargs];
+	struct arg _args[nargs];
+	struct arg *args = _args;
+	struct macro_expansion *me = 0;
+
+	if (DEPEN()) {
+		me = __alloc_macro_expansion(sizeof(struct arg) * nargs);
+		me->m = token;
+		me->sym = sym;
+		args = me->args;
+	}
 
 	if (expanding->tainted) {
 		token->pos.noexpand = 1;
@@ -618,13 +630,13 @@ static int expand(struct token **list, struct symbol *sym)
 			return 1;
 		if (!collect_arguments(token->next, sym->arglist, args, token))
 			return 1;
-		expand_arguments(nargs, args);
+		expand_arguments(nargs, args, me);
 	}
 
 	expanding->tainted = 1;
 
 	last = token->next;
-	tail = substitute(list, sym->expansion, args);
+	tail = substitute(list, sym->expansion, args, me);
 	*tail = last;
 
 	return 0;
@@ -907,11 +919,18 @@ static inline void set_arg_count(struct token *token)
 	token->count.str = token->count.vararg = 0;
 }
 
-static struct token *parse_arguments(struct token *list)
+#define DUP(l) (DEPEN() ? dup_one(l) : (l))
+#define DUP_NEXT(l) (DEPEN() ? ((l)->next = dup_one((l)->next)) : (l)->next)
+
+static struct token *parse_arguments(struct token **arglist)
 {
-	struct token *arg = list->next, *next = list;
+	struct token *list = DUP(*arglist);
+	struct token *arg = DUP_NEXT(list), *next = list;
 	struct argcount *count = &list->count;
 
+	if (DEPEN())
+		*arglist = list;
+
 	set_arg_count(list);
 
 	if (match_op(arg, ')')) {
@@ -925,11 +944,11 @@ static struct token *parse_arguments(struct token *list)
 			goto Eva_args;
 		if (!++count->normal)
 			goto Eargs;
-		next = arg->next;
+		next = DUP_NEXT(arg);
 
 		if (match_op(next, ',')) {
 			set_arg_count(next);
-			arg = next->next;
+			arg = DUP_NEXT(next);
 			continue;
 		}
 
@@ -964,7 +983,7 @@ static struct token *parse_arguments(struct token *list)
 	}
 
 	if (match_op(arg, SPECIAL_ELLIPSIS)) {
-		next = arg->next;
+		next = DUP_NEXT(arg);
 		token_type(arg) = TOKEN_IDENT;
 		arg->ident = &__VA_ARGS___ident;
 		if (!match_op(next, ')'))
@@ -1125,7 +1144,7 @@ static int do_handle_define(struct stream *stream, struct token **line, struct t
 	if (!expansion->pos.whitespace) {
 		if (match_op(expansion, '(')) {
 			arglist = expansion;
-			expansion = parse_arguments(expansion);
+			expansion = parse_arguments(&arglist);
 			if (!expansion)
 				return 1;
 		} else if (!eof_token(expansion)) {
@@ -1299,6 +1318,8 @@ static int expression_value(struct token **where)
 	long long value;
 	int state = 0;
 
+	DEPCALL(on, *list);
+	
 	while (!eof_token(p = scan_next(list))) {
 		switch (state) {
 		case 0:
@@ -1352,8 +1373,10 @@ static int expression_value(struct token **where)
 static int handle_if(struct stream *stream, struct token **line, struct token *token)
 {
 	int value = 0;
-	if (!false_nesting)
+	if (!false_nesting) {
+		DEPCALL(push_dep,token);
 		value = expression_value(&token->next);
+	}
 
 	dirty_stream(stream);
 	return preprocessor_if(stream, token, value);
@@ -1378,6 +1401,8 @@ static int handle_elif(struct stream * stream, struct token **line, struct token
 		return 1;
 	}
 
+	DEPCALL(else_dep,token);
+	
 	dirty_stream(stream);
 	if (token_type(top_if) != TOKEN_IF)
 		return 1;
@@ -1407,6 +1432,9 @@ static int handle_else(struct stream *stream, struct token **line, struct token
 		nesting_error(stream);
 		sparse_error(token->pos, "#else after #else");
 	}
+
+	DEPCALL(else_dep,token);
+	
 	if (false_nesting) {
 		if (token_type(top_if) == TOKEN_IF)
 			false_nesting = 0;
@@ -1428,6 +1456,9 @@ static int handle_endif(struct stream *stream, struct token **line, struct token
 	}
 	if (false_nesting)
 		false_nesting--;
+
+	DEPCALL(pop_dep,token);
+	
 	stream->top_if = top_if->next;
 	__free_token(top_if);
 	return 1;
@@ -1740,7 +1771,7 @@ static void handle_preprocessor_line(struct stream *stream, struct token **line,
 	int is_normal = 1;
 
 	if (eof_token(token))
-		return;
+		goto ret;
 
 	if (token_type(token) == TOKEN_IDENT) {
 		struct symbol *sym = lookup_symbol(token->ident, NS_PREPROCESSOR);
@@ -1762,10 +1793,12 @@ static void handle_preprocessor_line(struct stream *stream, struct token **line,
 			goto out;
 	}
 	if (!handler(stream, line, token))	/* all set */
-		return;
+		goto ret;
 
 out:
 	free_preprocessor_line(token);
+ret:
+	DEPCALL(off, token);
 }
 
 static void preprocessor_line(struct stream *stream, struct token **line)
@@ -1817,6 +1850,7 @@ static void do_preprocess(struct token **list)
 
 		default:
 			dirty_stream(stream);
+			DEPCALL(tag_dep, next);
 			if (false_nesting) {
 				*list = next->next;
 				__free_token(next);
diff --git a/shrink.c b/shrink.c
new file mode 100644
index 0000000..b7a4404
--- /dev/null
+++ b/shrink.c
@@ -0,0 +1,102 @@
+/*
+ * Build macros dependency tree 
+ * Copyright (C) 2012 Konrad Eisele <konrad@gaisler.com,eiselekd@gmail.com>
+ * BSD-License
+ * Redistribution and use in source and binary forms are permitted
+ * provided that the above copyright notice and this paragraph are
+ * duplicated in all such forms and that any documentation,
+ * advertising materials, and other materials related to such
+ * distribution and use acknowledge that the software was developed
+ * by the <organization>.  The name of the
+ * University may not be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ */
+
+#include <stdarg.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <ctype.h>
+#include <unistd.h>
+#include <fcntl.h>
+
+#include "lib.h"
+#include "allocate.h"
+#include "token.h"
+#include "parse.h"
+#include "symbol.h"
+#include "expression.h"
+#include "depend.h"
+
+void  stream_print_line(FILE *io, struct stream *stream, int j);
+
+static void 
+expand_symbols(struct symbol_list *list)
+{
+	struct symbol *sym;
+	FOR_EACH_PTR(list, sym) {
+		expand_symbol(sym);
+		depend_symbol(sym);
+	} END_FOR_EACH_PTR(sym);
+}
+
+int 
+main(int argc, char **argv)
+{
+	struct string_list *filelist = NULL; int i;
+	char *file; struct symbol_list *all_syms = 0;
+	
+	init_dep();
+	
+	expand_symbols(sparse_initialize(argc, argv, &filelist));
+	FOR_EACH_PTR_NOTAG(filelist, file) {
+		struct symbol_list *syms = sparse(file);
+		expand_symbols(syms);
+		concat_symbol_list(syms, &all_syms);
+	} END_FOR_EACH_PTR_NOTAG(file);
+	
+	for (i = 0; i < input_stream_nr; i++) {
+		/* struct stream *c; int j; */
+		/* c = input_streams + i; */
+		/* if (c->n && c->b) { */
+		/* 	for (j =  0; j < c->linenr; j++) { */
+		/* 		struct stream_line *l = c->n[j]; */
+		/* 		if (l->used) { */
+		/* 			stream_print_line(stdout, c, j); */
+		/* 		} */
+		/* 	} */
+		/* } */
+
+	}
+
+	return 0;
+}
+
+/* void  */
+/* stream_print_line(FILE *io, struct stream *stream, int j) */
+/* { */
+/* 	int k; */
+/* 	if (stream->b && j < stream->linenr) { */
+/* 		struct stream_line *l = stream->n[j]; */
+/* 		int f, t, len; */
+/* 		f = t = l->off; */
+/* 		if (j+1 < stream->linenr &&  */
+/* 		    stream->n[j+1] && */
+/* 		    stream->n[j+1]->off > t) { */
+/* 			t = stream->n[j+1]->off; */
+/* 		} */
+/* 		if (f > stream->bz) */
+/* 			f = stream->bz; */
+/* 		if (t > stream->bz) */
+/* 			t = stream->bz; */
+/* 		len = t - f; */
+/* 		/\*fprintf(io, "%s:%04d:",stream->name,j);*\/ */
+/* 		for (k = 0; k < len; k++) { */
+/* 			fprintf(io, "%c", stream->b[f + k]); */
+/* 		} */
+/* 		/\*fprintf(io, "\n");*\/ */
+/* 	} */
+/* } */
diff --git a/sparse.c b/sparse.c
index 67b7d9e..ffdd8d4 100644
--- a/sparse.c
+++ b/sparse.c
@@ -23,6 +23,7 @@
 #include "symbol.h"
 #include "expression.h"
 #include "linearize.h"
+#include "depend.h"
 
 static int context_increase(struct basic_block *bb, int entry)
 {
@@ -278,6 +279,8 @@ int main(int argc, char **argv)
 	struct string_list *filelist = NULL;
 	char *file;
 
+	init_dep();
+	
 	// Expand, linearize and show it.
 	check_symbols(sparse_initialize(argc, argv, &filelist));
 	FOR_EACH_PTR_NOTAG(filelist, file) {
diff --git a/symbol.h b/symbol.h
index 1e74579..e7dcb82 100644
--- a/symbol.h
+++ b/symbol.h
@@ -11,6 +11,7 @@
 
 #include "token.h"
 #include "target.h"
+#include "depend.h"
 
 /*
  * An identifier with semantic meaning is a "symbol".
@@ -94,6 +95,7 @@ struct decl_state {
 	struct ident **ident;
 	struct symbol_op *mode;
 	unsigned char prefer_abstract, is_inline, storage_class, is_tls;
+	struct symbol_list *dep;
 };
 
 struct symbol_op {
@@ -294,6 +296,13 @@ extern void debug_symbol(struct symbol *);
 extern void merge_type(struct symbol *sym, struct symbol *base_type);
 extern void check_declaration(struct symbol *sym);
 
+static inline struct symbol *alloc_symbol_tok(struct token *tok, int type)
+{
+	struct symbol *e = alloc_symbol (tok->pos, type);
+	DEP_TOK(e, tok);
+	return e;
+}
+
 static inline struct symbol *get_base_type(const struct symbol *sym)
 {
 	return examine_symbol_type(sym->ctype.base_type);
diff --git a/token.h b/token.h
index cd29233..00518e5 100644
--- a/token.h
+++ b/token.h
@@ -152,6 +152,15 @@ struct argcount {
 	unsigned vararg:1;
 };
 
+struct arg {
+	struct token *arg;
+	struct token *expanded;
+	struct token *str;
+	int n_normal;
+	int n_quoted;
+	int n_str;
+};
+
 /*
  * This is a very common data structure, it should be kept
  * as small as humanly possible. Big (rare) types go as
@@ -203,6 +212,7 @@ extern struct token * tokenize_buffer(void *, unsigned long, struct token **);
 
 extern void show_identifier_stats(void);
 extern struct token *preprocess(struct token *);
+extern struct depend_if *dif;
 
 static inline int match_op(struct token *token, int op)
 {
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-04-24  9:54 dependency tee from c parser entities downto token Konrad Eisele
  2012-04-25 20:10 ` [PATCH] depend.c: build up a dependency tree from c entities downto tokens: entries in the tree are: macro-depend: tree of #if nesting macro-expansions: possible macro expansion source of a token tok->macro-expansions->macro tok->macro-depend->macro c entities are linked in via [stmt|expr|sym]->start-end-token Konrad Eisele
@ 2012-04-30 22:58 ` Christopher Li
  2012-05-02  7:27   ` Konrad Eisele
  1 sibling, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-04-30 22:58 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: linux-sparse

On Tue, Apr 24, 2012 at 2:54 AM, Konrad Eisele <konrad@gaisler.com> wrote:
> Hi, I'd like to extend sparse so that I can preserve a
> dependency tree that goes from c parse entities all
> the way down to single tokens. There are several places
> this can be useful:

Sorry for the delay. I took some time to think about this problem.

May I ask some high-level questions? I am trying to understand why
this token-level dependency is useful. One alternative is to run a
pre-processor stage on the source file first. Then you have one
big pre-processed source. The current sparse can run on that pre-processed
source file and get all the symbol dependencies.

Because each symbol has "pos" and "endpos" members, you only need
to recursively walk the ctype of the symbols to get the shrunk version
of the source code. The drawback is that all the macros have been
expanded, so you will not be able to see the macro names etc.
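
A minimal sketch of that recursive walk, under stated assumptions: "struct sym"
is a simplified stand-in and not sparse's real "struct symbol", line numbers
stand in for "pos" and "endpos", and mark_used() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPS 4

/* Simplified stand-in for a symbol: a source range plus the symbols
 * its type refers to. Not sparse's real struct symbol. */
struct sym {
	int first_line, last_line;	/* stand-ins for "pos" and "endpos" */
	struct sym *deps[MAX_DEPS];	/* symbols reached through the ctype */
	int nr_deps;
	int visited;			/* guards against cyclic types */
};

/* Mark every line covered by s and by everything s depends on. */
void mark_used(struct sym *s, unsigned char *used, int nr_lines)
{
	assert(s && used);
	if (s->visited)
		return;
	s->visited = 1;
	assert(s->first_line >= 0 && s->last_line < nr_lines);
	for (int l = s->first_line; l <= s->last_line; l++)
		used[l] = 1;
	for (int i = 0; i < s->nr_deps; i++)
		mark_used(s->deps[i], used, nr_lines);
}
```

Printing only the lines whose "used" flag is set would then give the shrunk
source. The "visited" flag matters because types such as struct _IO_FILE
refer to themselves through pointers.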

Also, if you change the source file, the shrinking process has to be redone
because it might use new macros. So shrinking + compiling is not
necessarily a saving compared to compiling directly without shrinking.
I figure you might have a slightly different usage model in mind to justify
the shrinking. Please help me understand how you intend to use it.

I think that if the source file doesn't need pre-processing, sparse can do
symbol dependency already. What you really want is to be able to trace back
through the pre-processing stage: which macros have been expanded and what
the original form of the expanded value was.

Chris

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-04-30 22:58 ` dependency tee from c parser entities downto token Christopher Li
@ 2012-05-02  7:27   ` Konrad Eisele
  2012-05-03 23:52     ` Christopher Li
  0 siblings, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-02  7:27 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, linux-sparse

2012/5/1 Christopher Li <sparse@chrisli.org>:
> On Tue, Apr 24, 2012 at 2:54 AM, Konrad Eisele <konrad@gaisler.com> wrote:
>> Hi, I'd like to extend sparse so that I can preserve a
>> dependency tree that goes from c parse entities all
>> the way down to single tokens. There are several places
>> this can be useful:
>
> Sorry for the delay. I take a some time to think about this problem.
>
> May I ask some high level questions? I am trying to understand why
> this this token level dependency is useful. One alternative is that,
> you do a per-processor stage on the source file. Then you have one
> big post processed source. The current sparse can run on that processed
> source file and get all the symbol dependency.
>
> Because each symbol has "pos" and "endpos" member. You only need
> to recursively walk the ctype of the symbols you can get the shrink version
> of the source code. The draw back with that is, all the macro has been
> expended, you will not able to see the macro names etc.

But this is the point. All macros are expanded. They disappear completely in
the pre-processing step. There is no tool at all right now that
preserves the macro dependency. Even IDEs like Eclipse have tools to show
macro expansions, but there is no tool that shows you in a simple
way that, e.g., in:
#define a
#ifdef a
#define b 1
#endif
"#define b 1" depends on "#define a". Even the "#ifdef a" is
lost and you have to deduce it yourself from the #line comments.
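
A minimal sketch of the bookkeeping such a tool would need, assuming a stack
of open conditionals. The names cond_push(), cond_pop(), define_record() and
define_depends_on() are illustrative only; they are not sparse API and not
the push_dep/pop_dep functions of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 32
#define MAX_DEFS  64

/* One recorded #define and the macro tested by its enclosing #ifdef. */
struct def_record {
	const char *name;	/* macro being defined */
	const char *depends_on;	/* innermost enclosing #ifdef macro, or NULL */
};

static const char *cond_stack[MAX_DEPTH];
static int cond_depth;
static struct def_record defs[MAX_DEFS];
static int nr_defs;

/* Called on "#ifdef macro". */
static void cond_push(const char *macro)
{
	assert(cond_depth < MAX_DEPTH);
	cond_stack[cond_depth++] = macro;
}

/* Called on "#endif". */
static void cond_pop(void)
{
	assert(cond_depth > 0);
	cond_depth--;
}

/* Called on "#define name": remember which conditional it sits under. */
static void define_record(const char *name)
{
	assert(nr_defs < MAX_DEFS);
	defs[nr_defs].name = name;
	defs[nr_defs].depends_on = cond_depth ? cond_stack[cond_depth - 1] : NULL;
	nr_defs++;
}

/* Return the macro that "name" depends on, or NULL if there is none. */
static const char *define_depends_on(const char *name)
{
	for (int i = 0; i < nr_defs; i++)
		if (!strcmp(defs[i].name, name))
			return defs[i].depends_on;
	return NULL;
}
```

Feeding the four-line example through these calls records that "b" depends
on "a" while "a" depends on nothing. A real implementation would also have
to handle #else, #elif and full #if expressions.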

I think not very many people really know how (not where) e.g. "size_t" gets
defined. There are too many "#ifdef" cases around, too many
#include.



>
> Also, if you change the source file, the shrinking process has to be redone
> because it might use new macros. So the shrinking + compiling is not
> necessary a saving compare to directly compiling without the shrinking.
> I figure you might have a slightly different usage module in mind to justify
> the shrinking. Please help me understand how you intend to use it.

Yes, that is true. Still, for a fixed configuration you can achieve
a speedup in most cases. Also you would have a tool to really benchmark
source that doesn't draw in too many dependencies.

>
> I think if the source file doesn't need pre-processing. Sparse can do symbol
> dependency already. What you really want is actually being able to back
> trace in the pre-processing stage, what macro has been expand and what
> is the original form on the expanded value.

The information for this is in "struct macro_expansion" of the patch, and one can
build a trace of the macro expansion around it. I've done that with a patch for
gcc and could do it for sparse also. An example is at:
http://cfw.sourceforge.net/htmltag/init_32.c.pinfo.html
source at: http://cfw.sourceforge.net/htmltags.html
E.g. go in init_32.c.pinfo.html to mark_rodata_ro() and click on
virt_to_page(). Help is in the left panel.

-- Konrad


>
> Chris
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-02  7:27   ` Konrad Eisele
@ 2012-05-03 23:52     ` Christopher Li
  2012-05-04  7:33       ` Konrad Eisele
  0 siblings, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-05-03 23:52 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, linux-sparse

On Wed, May 2, 2012 at 12:27 AM, Konrad Eisele <eiselekd@gmail.com> wrote:
> But this is the point. All macros are expanded. They disappear completely in
> the pre-processing step. There is no tool at all right now that
> preserves the macro-dependency. Even in IDEs like Eclipse you have tools to show
> macros expansions, but you have no tool to show you in a simple
> way for i.e.:
> #define a
> #ifdef a
> #define b 1
> #endif
> that "#define b 1" is dependent of "#define a" even the "#ifdef a" is
> lost and you have to deduce it yourself from the #line comments.

Sure, so the use case is mostly for people to understand how
the macros get expanded.

> The information for this is in "struct macro_expansion" of the patch and one can
> build aroung it a trace of the macro expansion. I've done that with a patch for
> gcc and could do it for sparse also. A example is at:
> http://cfw.sourceforge.net/htmltag/init_32.c.pinfo.html
> source at : http://cfw.sourceforge.net/htmltags.html
> i.e. go in init_32.c.pinfo.html to mark_rodata_ro() and click on
> virt_to_page (). A help in in the left pannel.

I agree this is useful. However, I feel the original patch is a bit invasive.
How about doing it step by step? Make a small patch that allows registering
callbacks for when a macro expands. That way the application using sparse
gets notified of the pre-processor macro expansions. I can take a look at how to
implement this small patch as well.
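
A minimal sketch of what such a callback registration could look like. All
names here (struct expand_event, register_expand_hook(), notify_expand(),
count_expansions()) are hypothetical; no such interface exists in sparse, and
a real hook would more likely pass tokens and symbols than a bare name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event handed to the hook on each macro expansion. */
struct expand_event {
	const char *macro;	/* name of the macro being expanded */
	int line;		/* line of the use site */
};

typedef void (*expand_hook_t)(const struct expand_event *ev, void *data);

static expand_hook_t expand_hook;
static void *expand_hook_data;

/* Install a callback; passing NULL disables notification again. */
void register_expand_hook(expand_hook_t fn, void *data)
{
	expand_hook = fn;
	expand_hook_data = data;
}

/* The pre-processor would call this wherever it expands a macro. */
void notify_expand(const char *macro, int line)
{
	struct expand_event ev = { macro, line };

	if (expand_hook)
		expand_hook(&ev, expand_hook_data);
}

/* Example client: count how many expansions were reported. */
void count_expansions(const struct expand_event *ev, void *data)
{
	assert(ev && data);
	++*(int *)data;
}
```

With no hook registered, notify_expand() does nothing, so back ends that do
not care about macro history would pay only for one pointer test.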

If needed, we can add an option so that the pre-processor doesn't free the tokens.
In this case there are very few changes on the caller side. I feel it is cleaner
than changing the free function's behavior.

Chris

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-03 23:52     ` Christopher Li
@ 2012-05-04  7:33       ` Konrad Eisele
  2012-05-04  9:25         ` Christopher Li
  0 siblings, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-04  7:33 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, linux-sparse

Christopher Li wrote:
> On Wed, May 2, 2012 at 12:27 AM, Konrad Eisele<eiselekd@gmail.com>  wrote:
>> But this is the point. All macros are expanded. They disappear completely in
>> the pre-processing step. There is no tool at all right now that
>> preserves the macro-dependency. Even in IDEs like Eclipse you have tools to show
>> macros expansions, but you have no tool to show you in a simple
>> way for i.e.:
>> #define a
>> #ifdef a
>> #define b 1
>> #endif
>> that "#define b 1" is dependent of "#define a" even the "#ifdef a" is
>> lost and you have to deduce it yourself from the #line comments.
>
> Sure, so the usage case is mostly for people to understand how
> the macro get expanded.

Yes, and in a human-readable way, that is, the output is from
before the pre-processing step. Otherwise it is meaningless.

>
>> The information for this is in "struct macro_expansion" of the patch and one can
>> build aroung it a trace of the macro expansion. I've done that with a patch for
>> gcc and could do it for sparse also. A example is at:
>> http://cfw.sourceforge.net/htmltag/init_32.c.pinfo.html
>> source at : http://cfw.sourceforge.net/htmltags.html
>> i.e. go in init_32.c.pinfo.html to mark_rodata_ro() and click on
>> virt_to_page (). A help in in the left pannel.
>
> I agree this is useful. However I feel the original patch is a bit invasive.
> How about do it in a step by step way. Make a small patch to allow register
> call backs when the macro expands. That way the application using sparse
> get notify of the pre-processor macro expands. I can take a look at how to
> implement this small patch as well.
>
> If needed, we can make one option that pre-processor don't free the tokens.
> In this case, very few changes in the caller side. I feel it cleaner
> than changing
> the free function behavior.

The minimum change to the original code base that I see would be:
  - "struct position <pos>" is removed in all structures except "struct token" and
    replaced with "struct token *<tok>". When <pos> is needed, <tok>->pos is
    used instead. This implies that tokens are not freed by default.
  - struct statement, struct expression and struct token
    all get an extra "void *custom" pointer. This pointer can be used by
    tools to save their own data.
I think this is not too far off: 4 bytes more for each struct and "<tok>->pos"
instead of "pos"; that is not a massive change. The <tok>->pos change would be
a boilerplate refactoring.
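
A minimal sketch of the two proposed changes, using simplified stand-in
structures. The layouts below are assumptions made for illustration and are
not sparse's real definitions; expr_pos() is a hypothetical accessor.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in; sparse's real struct position is packed differently. */
struct position {
	unsigned int line, col;
};

struct token {
	struct position pos;
	struct token *next;
	void *custom;		/* proposed: private data for tools */
};

struct expression {
	struct token *tok;	/* proposed: replaces "struct position pos" */
	void *custom;		/* proposed: private data for tools */
};

/* Every former use of "expr->pos" would become "expr->tok->pos". */
static inline struct position expr_pos(const struct expression *e)
{
	assert(e && e->tok);
	return e->tok->pos;
}
```

Because the expression only points at its token, the token has to stay
allocated for as long as the expression is in use.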

Should I implement a patch for that as a first step?

After that I can build upon that.

-- Konrad

PS: I post again the link for my "shrinkc" app. I think
it is very impressive. I never had an overview of stdlib.h
and could see for the first time the dependencies for e.g.
FILE *f. OK, the output is still raw, "#include" lines are missing,
but still you can really see what is going on. And frankly,
isn't it strange that all C programmers use stdlib.h without
really knowing what is going on? Maybe I am the only one that
has no clue though...
$git clone git://git.code.sf.net/p/decpp/code decpp
$cd decpp
$make
$./shrinkc t1.c



============= output of "shrinkc" ===============

eiselekd+~/tmp/>shrinkc t1.c

********* builtin *********


********* preprocessor *********


********* t1.c *********

int main(int argc, char **argv) {
         FILE *f;
};

********* preprocessor *********


********* /usr/include/stdio.h *********

#ifndef _STDIO_H
#if !defined __need_FILE && !defined __need___FILE
# define _STDIO_H       1
# define __need_FILE
# define __need___FILE
#endif /* Don't need FILE.  */
#if !defined __FILE_defined && defined __need_FILE
struct _IO_FILE;
typedef struct _IO_FILE FILE;
#endif /* FILE not defined.  */
#if !defined ____FILE_defined && defined __need___FILE
typedef struct _IO_FILE __FILE;
#endif /* __FILE not defined.  */
#ifdef  _STDIO_H

********* /usr/include/features.h *********


********* /usr/include/sys/cdefs.h *********


********* /usr/include/bits/wordsize.h *********


********* /usr/include/gnu/stubs.h *********


********* /usr/include/bits/wordsize.h *********


********* /usr/include/gnu/stubs-32.h *********


********* /usr/lib/gcc/i486-slackware-linux/4.2.4//include/stddef.h *********

#endif
#endif

********* /usr/include/bits/types.h *********


********* /usr/include/bits/wordsize.h *********


********* /usr/include/bits/typesizes.h *********


********* /usr/include/libio.h *********

#ifndef _IO_STDIO_H
#ifdef _G_NEED_STDARG_H
#endif
# else
struct _IO_jump_t;  struct _IO_FILE;
# else
#else
typedef void _IO_lock_t;
#endif
struct _IO_marker {
   struct _IO_marker *_next;
   struct _IO_FILE *_sbuf;
   int _pos;
};
enum __codecvt_result
{
   __codecvt_ok,
   __codecvt_partial,
   __codecvt_error,
   __codecvt_noconv
};
struct _IO_FILE {
   int _flags;           /* High-order word is _IO_MAGIC; rest is flags. */
   char* _IO_read_ptr;   /* Current read pointer */
   char* _IO_read_end;   /* End of get area. */
   char* _IO_read_base;  /* Start of putback+get area. */
   char* _IO_write_base; /* Start of put area. */
   char* _IO_write_ptr;  /* Current put pointer. */
   char* _IO_write_end;  /* End of put area. */
   char* _IO_buf_base;   /* Start of reserve area. */
   char* _IO_buf_end;    /* End of reserve area. */
   char *_IO_save_base; /* Pointer to start of non-current get area. */
   char *_IO_backup_base;  /* Pointer to first valid character of backup area */
   char *_IO_save_end; /* Pointer to end of non-current get area. */
   struct _IO_marker *_markers;
   struct _IO_FILE *_chain;
   int _fileno;
#if 0
#else
   int _flags2;
#endif
   _IO_off_t _old_offset; /* This used to be _offset but it's too small.  */
   unsigned short _cur_column;
   signed char _vtable_offset;
   char _shortbuf[1];
   _IO_lock_t *_lock;
#if defined _G_IO_IO_FILE_VERSION && _G_IO_IO_FILE_VERSION == 0x20001
   _IO_off64_t _offset;
# if defined _LIBC || defined _GLIBCPP_USE_WCHAR_T
# else
   void *__pad1;
   void *__pad2;
   void *__pad3;
   void *__pad4;
   size_t __pad5;
# endif
   int _mode;
   char _unused2[15 * sizeof (int) - 4 * sizeof (void *) - sizeof (size_t)];
#endif
};
#endif /* _IO_STDIO_H */

********* /usr/include/_G_config.h *********

#ifndef _G_config_h
#define __need_mbstate_t
typedef struct
{
   __off_t __pos;
   __mbstate_t __state;
} _G_fpos_t;
typedef struct
{
   __off64_t __pos;
   __mbstate_t __state;
} _G_fpos64_t;
#define _G_off_t        __off_t
#define _G_off64_t      __off64_t
typedef int _G_int16_t __attribute__ ((__mode__ (__HI__)));
typedef int _G_int32_t __attribute__ ((__mode__ (__SI__)));
typedef unsigned int _G_uint16_t __attribute__ ((__mode__ (__HI__)));
typedef unsigned int _G_uint32_t __attribute__ ((__mode__ (__SI__)));
#define _G_NEED_STDARG_H 1
#define _G_IO_IO_FILE_VERSION 0x20001

********* /usr/lib/gcc/i486-slackware-linux/4.2.4//include/stddef.h *********

#endif
#endif
#endif
#endif /* defined(_ANSI_H_) || defined(_MACHINE_ANSI_H_) */
#endif /* _STDDEF_H or __need_size_t.  */
#endif

********* /usr/include/wchar.h *********

#ifndef _WCHAR_H
#if defined _WCHAR_H || defined __need_wint_t || !defined __WINT_TYPE__
# define __need_wint_t
#if (defined _WCHAR_H || defined __need_mbstate_t) && !defined __mbstate_t_defined
typedef struct
{
   int __count;
   union
   {
# ifdef __WINT_TYPE__
     __WINT_TYPE__ __wch;
# else
# endif
     char __wchb[4];
   } __value;            /* Value so far.  */
} __mbstate_t;
#endif
#endif /* ISO C99 or GCC and GNU.  */
#endif /* GCC and use GNU.  */
#endif /* Use ISO C95, C99 and Unix98. */
#endif
#endif
#endif  /* _WCHAR_H defined */
#endif /* wchar.h  */

********* /usr/lib/gcc/i486-slackware-linux/4.2.4//include/stddef.h *********

#endif
#endif
#endif
#endif /* defined(_ANSI_H_) || defined(_MACHINE_ANSI_H_) */
#endif /* _STDDEF_H or __need_size_t.  */
#endif
#endif
#endif /* __WCHAR_T__ */
#endif /* __wchar_t__ */
#endif /* _STDDEF_H or __need_wchar_t.  */
#if defined (__need_wint_t)
#ifndef _WINT_T
#ifndef __WINT_TYPE__
#define __WINT_TYPE__ unsigned int
#endif
typedef __WINT_TYPE__ wint_t;
#endif
#endif
#endif /* __sys_stdtypes_h */

********* /usr/lib/gcc/i486-slackware-linux/4.2.4//include/stdarg.h *********

#ifndef _STDARG_H
#ifndef _ANSI_STDARG_H_
#ifndef __GNUC_VA_LIST
typedef __builtin_va_list __gnuc_va_list;
#endif
#else /* not __svr4__ || _SCO_DS */
#endif
#endif /* not _VA_LIST_, except on certain systems */
#endif /* not __svr4__ */
#endif /* _STDARG_H */
#endif /* not _ANSI_STDARG_H_ */
#endif /* not _STDARG_H */

********* /usr/include/bits/stdio_lim.h *********


********* /usr/include/bits/sys_errlist.h *********


>
> Chris


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-04  7:33       ` Konrad Eisele
@ 2012-05-04  9:25         ` Christopher Li
  2012-05-04 10:36           ` Konrad Eisele
  0 siblings, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-05-04  9:25 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, linux-sparse

On Fri, May 4, 2012 at 12:33 AM, Konrad Eisele <konrad@gaisler.com> wrote:
>> I agree this is useful. However I feel the original patch is a bit
>> invasive.
>> How about do it in a step by step way. Make a small patch to allow
>> register
>> call backs when the macro expands. That way the application using sparse
>> get notify of the pre-processor macro expands. I can take a look at how to
>> implement this small patch as well.
>>
>> If needed, we can make one option that pre-processor don't free the
>> tokens.
>> In this case, very few changes in the caller side. I feel it cleaner
>> than changing
>> the free function behavior.
>
>
> The minimum original-code-base change that I see would be:
>  - "struct position <pos>" is removed in all structures except "struct
> token" and
>   replaced with "struct token *<tok>". When <pos> is needed <tok>->pos is
>   used instead. This implies that tokens are not freed by default.

No, that would be a no-go. I don't mind allowing the dependency program
to have the option to keep the tokens around. However, for other C back
ends, the tokens serve no purpose. They just sit there wasting memory. So
I don't want the tokens to become mandatory.

>  - struct statement *, struct expression *, struct *token
>   all get a extra "void *custom" pointer. This pointer can be used for
>   tools to save their own data.
> I think this is not too far off: 4 bytes more for each struct and
>  "<tok>->pos"
> instead of "pos", that is not a massive change. the <tok>->pos would be
> a boilerplate refactoring.

It is still too invasive. I don't want to keep <tok>->pos in the statements
and expressions.

Instead, how about using the macro_expand hook I proposed in my previous
email? Make the dependency program two steps. The first step is pure
pre-processing: the dependency program uses the macro_expand hook
to keep track of macro expansion details. The end result is the pre-processed
file, and the program remembers how that file maps onto the original file using the
macro expansion history.

The second step is just parsing the pre-processed file, using
the macro expansion history to map positions back to the original file.
In this way, you can do your dependency analysis with minimal
impact on sparse internals. The macro_expand hook can be used to
do other useful stuff as well. Will that address your need?
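
A minimal sketch of the second step's lookup, assuming the first step has
saved one record per expansion. The record layout and the map_back() name are
hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One entry of the (hypothetical) macro expansion history. */
struct expand_record {
	int out_first, out_last;	/* inclusive line range in the pre-processed file */
	const char *file;		/* original file of the macro use */
	int line;			/* original line of the macro use */
	const char *macro;		/* macro that produced the range */
};

/* Find the expansion that produced a pre-processed line, or NULL if the
 * line did not come from a macro expansion. */
const struct expand_record *map_back(const struct expand_record *hist,
				     size_t n, int out_line)
{
	assert(hist || n == 0);
	for (size_t i = 0; i < n; i++)
		if (out_line >= hist[i].out_first && out_line <= hist[i].out_last)
			return &hist[i];
	return NULL;
}
```

A linear scan keeps the sketch short; a history sorted by out_first could be
searched with bsearch() instead.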

BTW, have you taken a look at the ctags program that comes with sparse?
It can find symbols created by macro expansion as well.

Chris

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-04  9:25         ` Christopher Li
@ 2012-05-04 10:36           ` Konrad Eisele
  2012-05-04 12:36             ` Konrad Eisele
  2012-05-04 18:02             ` Christopher Li
  0 siblings, 2 replies; 50+ messages in thread
From: Konrad Eisele @ 2012-05-04 10:36 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, linux-sparse

Christopher Li wrote:
> On Fri, May 4, 2012 at 12:33 AM, Konrad Eisele<konrad@gaisler.com>  wrote:
>>> I agree this is useful. However I feel the original patch is a bit
>>> invasive.
>>> How about do it in a step by step way. Make a small patch to allow
>>> register
>>> call backs when the macro expands. That way the application using sparse
>>> get notify of the pre-processor macro expands. I can take a look at how to
>>> implement this small patch as well.
>>>
>>> If needed, we can make one option that pre-processor don't free the
>>> tokens.
>>> In this case, very few changes in the caller side. I feel it cleaner
>>> than changing
>>> the free function behavior.
>>
>>
>> The minimum original-code-base change that I see would be:
>>   - "struct position<pos>" is removed in all structures except "struct
>> token" and
>>    replaced with "struct token *<tok>". When<pos>  is needed<tok>->pos is
>>    used instead. This implies that tokens are not freed by default.
>
> No, that would be a no go. I don't mind allow dependence program
> have the option to keep the token around. However, for other C back
> ends, the token serve no purpose. It just sit there waste memory. So
> I don't want the token become mandatory.

If it is a no-go it is a no-go.
If there is no need for the tool I proposed, there is no need. :-)
Still I try: Tokens don't sit around, they are released when
the program finishes. Treating the preprocessing stage
as if it didn't exist doesn't reflect the way most people use
a compiler. They always use the preprocessor even if
there might be the possibility to use the compiler with only
a preprocessed file. Therefore tokens should sit around.

>
>>   - struct statement *, struct expression *, struct *token
>>    all get a extra "void *custom" pointer. This pointer can be used for
>>    tools to save their own data.
>> I think this is not too far off: 4 bytes more for each struct and
>>   "<tok>->pos"
>> instead of "pos", that is not a massive change. the<tok>->pos would be
>> a boilerplate refactoring.
>
> It is still too invasive. I don't want to keep<tok>->pos in the statements
> and expression.

If this is invasive, anything less than this would mean no change at
all.

>
> Instead, how about using the macro_expand I purposed in previous
> email. Make dependence program a two step. The first step is pure
> pre-processing, the dependants program using the macro_expand hook
> to keep track of macro expand details. The end result is the pre-processed
> file and the program remember how the file map into original file using the
> macro expand history.
>
> The the second step is just parsing on the pre-processed file. Using
> the macro expand history to map the position back to the original file.
> In this way, you can do your dependency analyse with minimal
> impact to sparse internals. The macro_expand hook can use to
> do other useful stuff as well. Will that address your need?

That's not what I want, but rather what you want. If you
want a macro expansion history, it would be faster, easier, and simpler
if you hacked it yourself. I don't want a macro expansion history;
I already have my tool htmltag for that. I want a macro dependency tree.
With only a macro_expand hook and only file-scope <pos>, it is not
possible.
And: by the time I came up with something that fits your requirements,
months would be gone. It seems that you know exactly how
it should be done, but there is no way for me to know what
you think a noninvasive solution would look like. The communication
takes too long.

If there is no need for the tool I proposed, there is no need.
At least I tried :-)

-- Thanks Konrad

>
> BTW, have you take a look at the ctags program come with sparse?
> It can find symbols created by macro expansion as well.
>
> Chris
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-04 10:36           ` Konrad Eisele
@ 2012-05-04 12:36             ` Konrad Eisele
  2012-05-04 15:30               ` Josh Triplett
  2012-05-04 18:02             ` Christopher Li
  1 sibling, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-04 12:36 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Christopher Li, linux-sparse

>>
>> No, that would be a no go. I don't mind allow dependence program
>> have the option to keep the token around. However, for other C back
>> ends, the token serve no purpose. It just sit there waste memory. So
>> I don't want the token become mandatory.
>

OK, one more try:
   Your question is: why is it meaningful to
   keep tokens around even in the parse stage?
I tried to come up with a meaningful example.

Take the 2 files b.c and a.h.

vvvvvv b.c vvvvv
#define d1
#include "a.h"
struct s0 { int x; };
int main(int a, char **b) {
   struct s0 v;
   d2(m);
};
^^^^^^ b.c ^^^^^^

vvvvvv a.h vvvvv
#ifdef d2
#define m v
#else
#define m n
#endif

#ifdef d1
#define d2(a) while(a.x) { }
#endif
^^^^^^ a.h ^^^^^^

Now use sparse and you get:
$./sparse b.c
b.c:6:3: error: cannot dereference this type

The error was that you forgot in b.c:
+#define d2
  #define d1
  ...

With a dependency tree, what you can print out is:

$./sparse b.c
b.c:6:3: error: cannot dereference this type
  +macro expansion of d2 defined in a.h:8
    + defined because of #ifdef d1 in a.h:7
     + dependent of d1 defined at b.c:1
  +> argument 0 expansion at b.c:6
    + macro expansion m defined in a.h:4
      + defined because of else of #ifdef d2
        + dependent of d2 (not defined)

Or you can print it out in human-readable form:

$./sparse b.c
b.c:6:3: error: cannot dereference this type
#define d1
#include "a.h"
  #ifdef d2
  #else
  #define m n
  #endif
  #ifdef d1
  #define d2(a) while(a.x) { }
  #endif

I've improvised a bit: the "human readable" output
is the whole of a.h and b.c because there is
no non-dependent part to strip...

Now tell me that this is not useful, and this will be the
last post on this subject. Don't you think that gcc's
...
  In file included from ./b.c:3:0:
  In file included from ./x.c:13:0:
...
is useful? With the macro dependencies saved, you can
print out all of it.

-- Konrad





^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-04 12:36             ` Konrad Eisele
@ 2012-05-04 15:30               ` Josh Triplett
  2012-05-04 20:53                 ` Konrad Eisele
  0 siblings, 1 reply; 50+ messages in thread
From: Josh Triplett @ 2012-05-04 15:30 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, Christopher Li, linux-sparse

On Fri, May 04, 2012 at 02:36:40PM +0200, Konrad Eisele wrote:
> Take the 2 files b.c and a.h.
> 
> vvvvvv b.c vvvvv
> #define d1
> #include "a.h"
> struct s0 { int x; };
> int main(int a, char **b) {
>   struct s0 v;
>   d2(m);
> };
> ^^^^^^ b.c ^^^^^^
> 
> vvvvvv a.h vvvvv
> #ifdef d2
> #define m v
> #else
> #define m n
> #endif
> 
> #ifdef d1
> #define d2(a) while(a.x) { }
> #endif
> ^^^^^^ a.h ^^^^^^
> 
> Now use sparse and you get:
> $./sparse b.c
> b.c:6:3: error: cannot dereference this type
> 
> The error was that you forgot in b.c:
> +#define d2
>  #define d1
>  ...
> 
> When you have a dependency tree what you can printout is:
> 
> $./sparse b.c
> b.c:6:3: error: cannot dereference this type
>  +macro expansion of d2 defined in a.h:8
>    + defined because of #ifdef d1 in a.h:7
>     + dependent of d1 defined at b.c:1
>  +> argument 0 expansion at b.c:6
>    + macro expansion m defined in a.h:4
>      + defined because of else of #ifdef d2
>        + dependend of d2 (not defined)

That looks wildly useful to me.  I'd love to see that information
available to Sparse somehow, as long as it doesn't significantly impact
the performance of the common case (namely, running sparse on code that
has no warnings or errors).

One idea: could you check the impact of your patch on a Linux kernel
build (with defconfig)?  Try building the kernel with sparse (make C=2),
with and without your patch, and measure the total time.  If your patch
has negligible impact on build time, and it doesn't require changing
every other line of Sparse due to interface changes, it should prove
reasonable.

The other key point: much like Linux, Sparse doesn't normally accept
patches that add a new interface without a patch adding the
corresponding code that uses that interface.  Having an implementation
helps ensure that the design of an interface fits its intended purpose.
For instance, if you could create a simple example of the kind of output
you showed above (even just saying in a warning message "expanded from
macro foo"), perhaps modeled after LLVM's clang error messages, and
include that in a second patch depending on the first, then that
two-patch sequence would have a much better chance of getting in.

Hope that helps,
Josh Triplett

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-04 10:36           ` Konrad Eisele
  2012-05-04 12:36             ` Konrad Eisele
@ 2012-05-04 18:02             ` Christopher Li
  2012-05-04 21:46               ` Konrad Eisele
  1 sibling, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-05-04 18:02 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, linux-sparse

On Fri, May 4, 2012 at 3:36 AM, Konrad Eisele <konrad@gaisler.com> wrote:
> If it is a no-go it is a no-go.
> If there is no need for the tool i proposed, there is no need. :-)

I think you missed my point. These are two separate things. I have
already confirmed that your macro dependency analysis is useful. I want
sparse to support it.

My suggestion is merely about how to support it. You propose
embedding the token inside the AST. I propose allowing a macro_expand
callback hook.

From my point of view, I can see using the macro_expand
callback hook to accomplish the same macro dependency
analysis without significant impact on the sparse internals.

If you think the macro_expand hook is not good enough,
please let me know where it is insufficient.

> Still I try: Tokens dont sit around, they are released when
> the program finishes. Treating the preprocessing stage
> like nonexisting doesnt reflect the way most people use
> a compiler. They always use the preprocessor even if
> there might be the possibility to use the compiler with only
> a preprocessed file. Therefore tokens should sit around.

Yes, tokens should stay around for your macro dependency
analysis. But I would like that to be an option rather than hard-coding
the token into the AST. Sparse is a library, and several
programs use it.

I see a way to let you do what you want to do with the
macro dependency while not impacting other programs. Why
not give it a try? The point is, I don't see why it is necessary
to force everyone to accept expr->tok->pos. It is strictly
worse for programs that don't care about the macro expansion
dependency. As long as you can accomplish the same
dependency analysis, why do you care whether it uses the
"embed token" approach rather than the macro_expand hook?

>> It is still too invasive. I don't want to keep<tok>->pos in the statements
>> and expression.
>
>
> If this is invasive a little less than this would mean no change at
> all.

Yes, it would be no change at all from the AST point of view if
we use the macro_expand hook. You just need to maintain
a hash table mapping old <pos> to new <pos>, with the
additional dependency information. You don't even need to
generate the pre-processed file explicitly. I am only using that as
a way of thinking through how to get there.

>> The the second step is just parsing on the pre-processed file. Using
>> the macro expand history to map the position back to the original file.
>> In this way, you can do your dependency analyse with minimal
>> impact to sparse internals. The macro_expand hook can use to
>> do other useful stuff as well. Will that address your need?
>
>
> Thats not what I want, but rather what you want. If you
> want a macro expand history, it would be faster, easier simpler
> if you would hack it yourself, I dont want a macro expand,
> i have my tool htmltag for that already. I want a macro dependency tree.
> With only macro_expand hook and only file-scope <pos> it is not
> possible.

Nope, it is possible; that is what I am proposing. Sorry, my previous
explanation was very high level. I haven't explained the implementation
details of every stage.

So the first patch would add the macro_expand hook to sparse.
After the pre-processor expands a macro, it will call the macro_expand
hook if the user has registered one (that is, if the hook is not NULL).

The macro_expand hook will receive:
- the macro before the expansion,
- the args for the macro,
- the replacement tokens after the expansion.

This will give your macro dependency program a chance to
examine and manipulate the tokens before they get inserted back
into the original token list.


Here is how your macro dependency program can use the
macro_expand hook.

The program should create an internal stream called "<pre-processor>".
The content of the file is just the result of the macro expansions, one
macro per line, in the order they are expanded. You can use
pos->line as an index that tells you which macro expansion it is.
Notice that you don't need to actually write the stream out to disk.

Then, inside the macro_expand hook that receives the macro
expansion callback:

There will be an array of data structures keeping track of the
macro expansions. The first macro expansion is the first element
of the array. Let's call this data structure "struct macro_deps".

"struct macro_deps" will keep track of the original
macro before the expansion and the list of tokens it depends on.
That is your dependency information.

The hook will allocate one "struct macro_deps", fill it out, and append
it to the end of the array.

Before your macro_expand hook returns, it walks the replacement
tokens. For each "token->pos" in the replacement tokens, it will
replace the stream number with that of "<pre-processor>", and the line
number with the index of the "struct macro_deps" in the array. Before the
replacement, if the original stream is already "<pre-processor>",
that means you are expanding the result of another macro expansion.
Use the old pos->line to look up the inner macro expansion and add
the inner macro's dependency list to the current macro's dependency list.

Then, after the pre-processor stage, all the tokens from macro
expansions will look as if they came from the "<pre-processor>"
file, and the line number can be used as an index into the array to find
the details of that macro expansion.

Will that work for your dependency analysis? I notice that it is not 100%
the same as your dependency tree, but with the history intact you
should be able to work that out.
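
A minimal sketch of the scheme described above, under loud assumptions:
the struct layouts are simplified stand-ins rather than sparse's real
definitions, and macro_expand_hook(), PP_STREAM and MAX_DEPS are
hypothetical names invented for illustration. It records the inner
expansion's own index as well as its dependency list, and it does not
remove duplicate entries.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for sparse's types; not the real definitions. */
struct position { int stream; int line; };
struct token { struct position pos; struct token *next; const char *text; };

#define PP_STREAM 1000	/* assumed id of the "<pre-processor>" stream */
#define MAX_DEPS  64	/* arbitrary cap to keep the sketch short */

struct macro_deps {
	const char *macro;	/* name of the macro that was expanded */
	int deps[MAX_DEPS];	/* indices of earlier expansions it depends on */
	int ndeps;
};

struct macro_deps *expansions;	/* one entry per expansion, in order */
int nexpansions, capacity;

/* Hypothetical hook: called after each expansion with the replacement list.
 * Returns the index of the new entry, or -1 if allocation fails. */
int macro_expand_hook(const char *macro, struct token *replacement)
{
	struct macro_deps *d;
	struct token *t;
	int idx, i;

	if (nexpansions == capacity) {
		int ncap = capacity ? 2 * capacity : 16;
		struct macro_deps *n = realloc(expansions, ncap * sizeof(*n));
		if (!n)
			return -1;
		expansions = n;
		capacity = ncap;
	}
	idx = nexpansions++;
	d = &expansions[idx];
	d->macro = macro;
	d->ndeps = 0;

	for (t = replacement; t; t = t->next) {
		if (t->pos.stream == PP_STREAM) {
			/* Token came out of an earlier expansion: inherit it. */
			const struct macro_deps *inner = &expansions[t->pos.line];
			if (d->ndeps < MAX_DEPS)
				d->deps[d->ndeps++] = t->pos.line;
			for (i = 0; i < inner->ndeps && d->ndeps < MAX_DEPS; i++)
				d->deps[d->ndeps++] = inner->deps[i];
		}
		/* Re-home the token: line now indexes the expansions array. */
		t->pos.stream = PP_STREAM;
		t->pos.line = idx;
	}
	return idx;
}

/* Mimics d2(m) from the thread: m is expanded first, then d2. */
int demo_expand(void)
{
	struct token n = { { 1, 4 }, NULL, "n" };	/* body of "#define m n" */
	struct token w = { { 1, 8 }, &n, "while" };	/* start of d2's body */
	int m_idx = macro_expand_hook("m", &n);
	int d2_idx = macro_expand_hook("d2", &w);

	return m_idx == 0 && d2_idx == 1 &&
	       expansions[1].ndeps == 1 && expansions[1].deps[0] == 0 &&
	       n.pos.stream == PP_STREAM && n.pos.line == 1;
}
```

With the two calls in demo_expand(), the entry for d2 ends up listing
expansion 0 (the expansion of m) as a dependency, which is the kind of
link the dependency tree needs.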

> And: until I would have come up with something that would fit your
> requirements
> months would be gone. It seems that you know exactly how
> it should  be done, there is no way for me to know how
> you think a noninvasive solution would look like. The communication
> takes too long.

So here it is. I have already given you the details of the implementation.
Of course, the first step, the macro_expand hook, is much smaller in
scope. Please let me know whether that works or not.

>
> If there is no need for the tool i proposed, there is no need.
> At least I tried :-)

I have already confirmed that it is useful. The question is just how to
implement it.

Chris

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-04 15:30               ` Josh Triplett
@ 2012-05-04 20:53                 ` Konrad Eisele
  2012-05-04 22:30                   ` Christopher Li
  0 siblings, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-04 20:53 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Konrad Eisele, Christopher Li, linux-sparse

On 05/04/2012 05:30 PM, Josh Triplett wrote:
> On Fri, May 04, 2012 at 02:36:40PM +0200, Konrad Eisele wrote:
>> Take the 2 files b.c and a.h.
>>
>> vvvvvv b.c vvvvv
>> #define d1
>> #include "a.h"
>> struct s0 { int x; };
>> int main(int a, char **b) {
>>    struct s0 v;
>>    d2(m);
>> };
>> ^^^^^^ b.c ^^^^^^
>>
>> vvvvvv a.h vvvvv
>> #ifdef d2
>> #define m v
>> #else
>> #define m n
>> #endif
>>
>> #ifdef d1
>> #define d2(a) while(a.x) { }
>> #endif
>> ^^^^^^ a.h ^^^^^^
>>
>> Now use sparse and you get:
>> $./sparse b.c
>> b.c:6:3: error: cannot dereference this type
>>
>> The error was that you forgot in b.c:
>> +#define d2
>>   #define d1
>>   ...
>>
>> When you have a dependency tree what you can printout is:
>>
>> $./sparse b.c
>> b.c:6:3: error: cannot dereference this type
>>   +macro expansion of d2 defined in a.h:8
>>     + defined because of #ifdef d1 in a.h:7
>>      + dependent of d1 defined at b.c:1
>>   +>  argument 0 expansion at b.c:6
>>     + macro expansion m defined in a.h:4
>>       + defined because of else of #ifdef d2
>>         + dependend of d2 (not defined)
>
> That looks wildly useful to me.  I'd love to see that information
> available to Sparse somehow, as long as it doesn't significantly impact
> the performance of the common case (namely, running sparse on code that
> has no warnings or errors).
>
> One idea: could you check the impact of your patch on a Linux kernel
> build (with defconfig)?  Try building the kernel with sparse (make C=2),
> with and without your patch, and measure the total time.  If your patch
> has negligible impact on build time, and it doesn't require changing
> every other line of Sparse due to interface changes, it should prove
> reasonable.

make C=2:

original sparse:
real    17m54.997s
user    15m25.181s
sys     2m11.281s

decpp-sparse from "git clone git://git.code.sf.net/p/decpp/code decpp "
real    18m29.748s
user    16m18.155s
sys     2m13.221s

But decpp was not written with common-case performance in mind.
The 2 runs probably depend on other factors too.
I can't imagine that 4 extra bytes for each token would have a big impact
if I implemented it that way (it is not done that way in decpp).
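
For reference, the relative slowdown between the two runs above can be
worked out with a few lines of C. This is only a sketch: the helper names
are made up, and the inputs are the single, unrepeated timings quoted
above, so the result is indicative at best. It comes to roughly 3% in
wall-clock time and a little under 6% in user time.

```c
#include <assert.h>

/* Convert the "XmY.YYYs" format printed by time into seconds. */
double to_seconds(int minutes, double seconds)
{
	return minutes * 60.0 + seconds;
}

/* Relative slowdown of `patched` versus `baseline`, in percent. */
double overhead_percent(double baseline, double patched)
{
	return (patched - baseline) / baseline * 100.0;
}

/* Wall-clock ("real") overhead of decpp-sparse over original sparse. */
double real_overhead(void)
{
	return overhead_percent(to_seconds(17, 54.997), to_seconds(18, 29.748));
}

/* CPU ("user") overhead of decpp-sparse over original sparse. */
double user_overhead(void)
{
	return overhead_percent(to_seconds(15, 25.181), to_seconds(16, 18.155));
}
```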

>
> The other key point: much like Linux, Sparse doesn't normally accept
> patches that add a new interface without a patch adding the
> corresponding code that uses that interface.  Having an implementation
> helps ensure that the design of an interface fits its intended purpose.
> For instance, if you could create a simple example of the kind of output
> you showed above (even just saying in a warning message "expanded from
> macro foo"), perhaps modeled after LLVM's clang error messages, and
> include that in a second patch depending on the first, then that
> two-patch sequence would have a much better chance of getting in.

I understand. Actually, the demonstration code is at
git://git.code.sf.net/p/decpp/code ; after cloning, do a
$make
$./shrinkc t1.c
That is kind of the goal.
And it does require some internal structure
changes. You don't get this kind of functionality
for free. You have to be invasive; isn't that
obvious? And in my view, it
may come with a penalty. The preprocessing stage
is not something that should be neglected all
the time as if it did not exist. You struggle
with macros half of the time you program.


-- Konrad


>
> Hope that helps,
> Josh Triplett
>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-04 18:02             ` Christopher Li
@ 2012-05-04 21:46               ` Konrad Eisele
  2012-05-04 21:56                 ` Konrad Eisele
  2012-05-04 23:05                 ` Christopher Li
  0 siblings, 2 replies; 50+ messages in thread
From: Konrad Eisele @ 2012-05-04 21:46 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, linux-sparse

>
> I think you miss my point.  It is two separate thing. I already
> confirm your macro dependency is useful. I want sparse
> to support it.
>

Nice to hear this.
When I talk about macro dependency I mean not only the
macro expansion trace. I mean:
  1. The #if (and #include) nestings (with dependencies
     pointing to the macros used in the preprocessor line)
  2. The macro expansion trace
  3. The connection of 1+2 into the AST.
Your macro_expand() hook addresses (2) only, and I can't
see how all the extra context for each token can be saved
in that scheme.
In my patch I have modeled (2) using 2 structs:
struct macro_expansion {
	int nargs;
	struct symbol *sym;
	struct token *m;
	struct arg args[0];
};
struct tok_macro_dep {
	struct macro_expansion *m;
	unsigned int argi;
	unsigned int isbody : 1;
	unsigned int visited : 1;
};
Each token from a macro expansion gets tagged with
tok_macro_dep. If it is a macro argument, <argi> holds the
index; if it is from the macro body, <isbody> is 1.
Now, I haven't yet thought about special cases like
token concatenation; even more data is needed to
model those. Also, when a macro argument is again used as a
macro argument inside the body expansion, I kind of
lose the chain: I would also need a "token *dup_of" pointer
to point to the original token that the token is a copy
of (when arguments are created...) etc.
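
A rough sketch of how such a "dup_of" chain could be followed. The token
struct here is a simplified stand-in, the "dep" and "dup_of" fields are
assumed additions (they are not in sparse, and dup_of is not in the patch
yet), and origin_of() and copy_depth() are hypothetical helpers:

```c
#include <assert.h>
#include <stddef.h>

struct macro_expansion;		/* as in the patch; left opaque here */

struct tok_macro_dep {
	struct macro_expansion *m;
	unsigned int argi;
	unsigned int isbody : 1;
	unsigned int visited : 1;
};

/* Simplified token: "dep" and "dup_of" are the assumed extra fields. */
struct token {
	const char *text;
	struct tok_macro_dep *dep;	/* NULL if the token came straight from a file */
	struct token *dup_of;		/* token this one was copied from, or NULL */
};

/* Follow dup_of links back to the token that was written in the source. */
struct token *origin_of(struct token *tok)
{
	while (tok && tok->dup_of)
		tok = tok->dup_of;
	return tok;
}

/* Number of copies made between tok and its origin. */
int copy_depth(const struct token *tok)
{
	int depth = 0;

	for (; tok && tok->dup_of; tok = tok->dup_of)
		depth++;
	return depth;
}

/* An argument token that is copied once, then passed on and copied again. */
int demo_chain(void)
{
	struct tok_macro_dep as_arg = { NULL, 0, 0, 0 };	/* argument 0 */
	struct token src = { "m", NULL, NULL };			/* written in b.c */
	struct token arg = { "m", &as_arg, &src };		/* first copy */
	struct token again = { "m", &as_arg, &arg };		/* copy of the copy */

	return origin_of(&again) == &src && copy_depth(&again) == 2;
}
```

The point is only that with a back pointer on every copy, the original
source token stays reachable however many times an argument is passed on.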

I have read your macro_expand() hook idea. However,
if I understand it right, you want to reuse position.stream and
position.line as a kind of pointer (to save the extra 4 bytes).
(Your goal is to minimize codebase changes, but I wonder
whether you don't change the semantics of struct position and then
need to change the code that uses struct position anyway...)
Maybe it is possible like this... I doubt it: where should
all the extra context that each token has be saved and
extracted from using that scheme?

Maybe it is possible, but I don't want to have saving 4 bytes
as a design goal (I'd use the void *custom scheme to
save all my extra data, including the pointers to the tokens that
"sit around") and adjust everything else to
that. The consequence is that the code complexity would
grow on the other end.

Here is my compromise then:
Keep the original "pos". But still grant me, for
each struct, a "void *custom" pointer that I can use
to store extra data, i.e. a pointer to the token.

-- Konrad

> My suggestion is merely how to support it. You purpose
> embed the token inside AST. I purpose allow a macro_expand
> call back hook.
>
> From my point of view, I can see using the macro_expand
> call back hook to accomplish the same macro dependency
> analyse, without significant impact the sparse internals.
>
> If you think the macro_expend hook is not good enough,
> please let me know where it is not sufficient.
>
>> Still I try: Tokens dont sit around, they are released when
>> the program finishes. Treating the preprocessing stage
>> like nonexisting doesnt reflect the way most people use
>> a compiler. They always use the preprocessor even if
>> there might be the possibility to use the compiler with only
>> a preprocessed file. Therefore tokens should sit around.
>
> Yes token should sit around for your macro dependency
> analyse. But I like it to be an option rather hard code the
> token in the to the AST. Sparse is a library, there are several
> program use it.
>
> I see a way to allow your do want you want to do on the
> macro dependency while not impact other program. Why
> not give it a try? The point is, I don't see it is necessary
> to force every one accept the expr->tok->pos. It is straightly
> worse for program that don't care about the macro expand
> dependency. As long as you can accomplish the same
> dependency analyse, why do you care it is using the
> "embed token" approach rather than macro_expand hook?
>
>>> It is still too invasive. I don't want to keep<tok>->pos in the statements
>>> and expression.
>>
>>
>> If this is invasive a little less than this would mean no change at
>> all.
>
> Yes, it would be no change at all from the AST point of view if
> we use the macro_expand hook. You just need to maintain
> a hash table from old<pos>  to new<pos>  mapping with the
> additional dependency information. You don't even need to
> generate the pre-processed file explicitly. I am using that as
> the thinking process how to get there.
>
>>> The the second step is just parsing on the pre-processed file. Using
>>> the macro expand history to map the position back to the original file.
>>> In this way, you can do your dependency analyse with minimal
>>> impact to sparse internals. The macro_expand hook can use to
>>> do other useful stuff as well. Will that address your need?
>>
>>
>> Thats not what I want, but rather what you want. If you
>> want a macro expand history, it would be faster, easier simpler
>> if you would hack it yourself, I dont want a macro expand,
>> i have my tool htmltag for that already. I want a macro dependency tree.
>> With only macro_expand hook and only file-scope<pos>  it is not
>> possible.
>
> Nope, it is possible, that is what I am purposing. Sorry I previous
> explain has been very high level, I haven't explain in the implementation
> detail of every stage.
>
> So the first patch would be adding the macro_expand hook into sparse.
> After a pre-processor macro expend, it will call the the macro_expand
> hook if the user register one. (the hook is not NULL).
>
> In the macro_expand hook, it will receive:
> - macro before the expand,
> - args for the macro
> - replacement tokens after the expand.
>
> This will give your macro dependency program a chance to
> exam and manipulate the token before it get insert back
> to original token list.
>
>
> Here is how your macro dependency program can use the
> macro_expand hook.
>
> The program should create a internal stream call "<pre-processor>".
> The content of the file is just the result of macro expand. One
> macro at a line, the the order they are expanded. You can use the
> pos->line to index when macro expand it is. Notice that you don't
> need to actually write out the stream into disk.
>
> Then, inside the macro_expand hook that receive the macro
> expand call back.
>
> There will be an array of data structure keep track of the
> macro expand. The first macro expand is on the first element
> of the array. Let's call this data structure "struct macro_deps".
>
> Inside "struct macro_deps", it will keep track of the original
> macro before the expand. The list of the tokens it depends on.
> That is your dependency information.
>
> It will allocate one "struct macro_deps" and fill it out, append
> to the end of the array.
>
> Before you macro_expand hook return, it walk the replacement
> token. For each "token->pos" in the replacement token, it will
> replace the stream number to to "pre-processor", and line number
> to the index of the "struct macro_deps" in the array. Before the
> replacement, if the original stream is already "<pre-processor>",
> that means you are expanding the result from another macro expand.
> Using the old pos->line to look up the inner macro expand, add
> inner macro's dependency list into the current macro dependency list.
>
> Then after the pre-processor stage. All the token from macro
> expand will look as if they are expand from the "pre-processor"
> file, line number can be use as index to lookup the array to find
> out the detail of this macro expand.
>
> Will that work for your dependency file. I notice that it not 100%
> the same with your dependency, but with the intact history. You
> should able to find that out.
>
>> And: until I would have come up with something that would fit your
>> requirements
>> months would be gone. It seems that you know exactly how
>> it should  be done, there is no way for me to know how
>> you think a noninvasive solution would look like. The communication
>> takes too long.
>
> So here it is. I already give you the details of the implementation.
> Of course, the first step for macro_expand hook is much smaller
> scope. Please let me know that works or not.
>
>>
>> If there is no need for the tool i proposed, there is no need.
>> At least I tried :-)
>
> I already confirm that is useful. Just how to implement it.
>
> Chris
>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-04 21:46               ` Konrad Eisele
@ 2012-05-04 21:56                 ` Konrad Eisele
  2012-05-04 23:05                 ` Christopher Li
  1 sibling, 0 replies; 50+ messages in thread
From: Konrad Eisele @ 2012-05-04 21:56 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, linux-sparse

Append:

> 1. The #if (and #include) nestings (with dependencies
> pointing to the macros used in the proprocessor line)
> 2. The macro expansion trace
> 3. The connection 1+2 into the AST.

Should be:

1. The #if (and #include) nestings (with dependencies
    pointing to the macros used in the preprocessor line)
2. The macro expansion trace (with dependencies
    pointing to the macros used in the expansion)
3. The connection of 1+2 into the AST.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-04 20:53                 ` Konrad Eisele
@ 2012-05-04 22:30                   ` Christopher Li
  2012-05-05  0:32                     ` Josh Triplett
  2012-05-05  8:56                     ` Konrad Eisele
  0 siblings, 2 replies; 50+ messages in thread
From: Christopher Li @ 2012-05-04 22:30 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Josh Triplett, Konrad Eisele, linux-sparse

On Fri, May 4, 2012 at 1:53 PM, Konrad Eisele <eiselekd@gmail.com> wrote:
> make C=2:
>
> original sparse:
> real    17m54.997s
> user    15m25.181s
> sys     2m11.281s
>
> decpp-sparse from "git clone git://git.code.sf.net/p/decpp/code decpp "
> real    18m29.748s
> user    16m18.155s
> sys     2m13.221s
>
> But decpp is not written with performance in common cases in mind.
> The 2 runs probably also depend on other factors too.
> I cant think that 4 bytes extra for each token can have a big impact,
> if I would implement it that way (it is not in decpp).

The deal breaker is not being able to free the token list when other
programs using sparse don't need it. I believe that I have an alternative
approach that allows you to do what you want while keeping the impact on
the sparse internals small. In my mind, that is strictly better than your
current patch. I can share more details if you find a missing link.

> I understand. Actually the code to demonstrate is
> git://git.code.sf.net/p/decpp/code , then do a
> $make
> $./shrinkc t1.c
> That is kind of the goal.
> And - it does require some  internal structure
> change. You dont get this kind of functionality
> for free. You have to be invasive, isnt this
> something that is obvious?. And in my view, it
> can come with penalty. The preprocessing stage

That doesn't mean that we should pay any penalty
for it, especially since I believe there is a better alternative
approach that allows you to do what you want and keeps
the API clean. You need to be a little patient here
to understand my alternative suggestions. I did spend
some time coming up with an approach that makes
it work for you and keeps me happy about the internals.

You dismissed my suggestion too eagerly without
considering how to make it work. You are pretty much saying,
"Nah, this is not going to work, I am calling in the big hammers."
You should at least consider it and point out where you think
it doesn't work, so that I can provide additional details on how
to make it work. I did notice that I left some details out of
my suggestions, mostly due to the time constraints of writing them up.

Chris

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-04 21:46               ` Konrad Eisele
  2012-05-04 21:56                 ` Konrad Eisele
@ 2012-05-04 23:05                 ` Christopher Li
  2012-05-05  8:54                   ` Konrad Eisele
  1 sibling, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-05-04 23:05 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, linux-sparse

On Fri, May 4, 2012 at 2:46 PM, Konrad Eisele <eiselekd@gmail.com> wrote:
>
> Nice to hear this.
> When I talk about macro dependency I mean not only the
> macro expansion trace. I mean:
>  1. The #if (and #include) nestings (with dependencies
>    pointing to the macros used in the proprocessor line)
>  2. The macro expansion trace
>  3. The connection 1+2 into the AST.
> Your macro_expand() hook addresses (2) only, but I cant
> see how all the extra context for each token can be saved
> in that sheme.

That is much better. There are two separate problems here.
One is keeping track of all the macro expansion history so you can
trace a token back to its original form. I believe my
description of the macro_expand hook should take care of that.

Now, how to connect the AST with that information is a
very good question. Notice the symbol->aux pointer? That is
the place to attach extra context or back-end related data
to symbols.

Each symbol has "pos" and "endpos". If the symbol
is expanded from a macro, then using the previous scheme, the pos
should point to a line in the "<pre-processor>" stream.

However, if the macro expansion happens between "pos" and
"endpos", you will not be able to access the token that contains
the macro expansion "pos" easily.

For that, we could, just thinking out loud, add a parser
hook for declarations that fires when a symbol has finished being built.
That would be a very small and straightforward change.
If the hook is not NULL, the callback function will be called
with the symbol that just got defined, and the start and end
tokens of that symbol.

So your dependency program just needs to register the
symbol parsing hook. Inside the callback function, walk
the tokens from start to end. Look up the macro expansion information
as needed. Build up the dependency struct and store it in
symbol->aux.

BTW, unrelated to this patch, I can see that other programs might
be able to use the same parser hook to perform source code
transformations as well.

Make sense? In this way, you don't even need the hash
table to attach a context to the token. You can get it directly
from symbol->aux.
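
A minimal sketch of what such a symbol hook could look like. Everything
here is an assumption made for illustration: the structs are cut-down
stand-ins (only the aux field mirrors sparse's symbol->aux), and
symbol_hook, symbol_done(), record_symbol_deps(), struct sym_deps and
PP_STREAM are hypothetical names, not existing sparse API.

```c
#include <assert.h>
#include <stdlib.h>

#define PP_STREAM 1000	/* assumed id of the "<pre-processor>" stream */

/* Cut-down stand-ins for sparse's types; not the real definitions. */
struct position { int stream; int line; };
struct token { struct position pos; struct token *next; };
struct symbol { const char *name; void *aux; };

/* What the tool hangs off symbol->aux: indices into its expansion array. */
struct sym_deps { int *expansions; int count; };

/* Hypothetical parser hook; NULL means nobody registered one. */
typedef void (*symbol_hook_t)(struct symbol *, struct token *, struct token *);
symbol_hook_t symbol_hook;

/* Walk [start, end] and record every expansion the tokens came from. */
void record_symbol_deps(struct symbol *sym, struct token *start,
			struct token *end)
{
	struct sym_deps *d = calloc(1, sizeof(*d));
	struct token *t;
	int n = 0;

	if (!d)
		return;
	/* First pass: count tokens that live in the "<pre-processor>" stream. */
	for (t = start; t; t = (t == end) ? NULL : t->next)
		if (t->pos.stream == PP_STREAM)
			n++;
	if (n) {
		d->expansions = malloc(n * sizeof(int));
		if (!d->expansions) {
			free(d);
			return;
		}
	}
	/* Second pass: pos.line is the index of the expansion record. */
	for (t = start; t; t = (t == end) ? NULL : t->next)
		if (t->pos.stream == PP_STREAM)
			d->expansions[d->count++] = t->pos.line;
	sym->aux = d;
}

/* Stand-in for the spot in the parser where a declaration is complete. */
void symbol_done(struct symbol *sym, struct token *start, struct token *end)
{
	if (symbol_hook)
		symbol_hook(sym, start, end);
}

/* Three tokens, two of which came from macro expansions 0 and 3. */
int demo_symbol(void)
{
	struct token c = { { PP_STREAM, 3 }, NULL };
	struct token b = { { 1, 5 }, &c };
	struct token a = { { PP_STREAM, 0 }, &b };
	struct symbol sym = { "v", NULL };
	struct sym_deps *d;
	int ok;

	symbol_hook = record_symbol_deps;
	symbol_done(&sym, &a, &c);
	d = sym.aux;
	ok = d && d->count == 2 &&
	     d->expansions[0] == 0 && d->expansions[1] == 3;
	if (d) {
		free(d->expansions);
		free(d);
	}
	return ok;
}
```

Programs that never set symbol_hook take the NULL branch in
symbol_done() and see no change in behaviour.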

> In my patch I have modeled (2) using 2 structs:
> struct macro_expansion {
>        int nargs;
>        struct symbol *sym;
>        struct token *m;
>        struct arg args[0];
> };
> struct tok_macro_dep {
>        struct macro_expansion *m;
>        unsigned int argi;
>        unsigned int isbody : 1;
>        unsigned int visited : 1;
> };
> Each token from a macro expansion gets tagged with
> tok_macro_dep. If it is an macro argument, <argi> shows the
> index, if it is from the macro body <isbody> is 1.
> Now, I didnt already think about special cases like
> token concaternation, even more data is needed to
> model this. Also when an macro argument is again used as an
> macro argument inside the body expansion, then I kindof
> loose the chain: I would also need a "token *dup_of" pointer
> to point to the original token that the token is a copy
> of (when arguments are created...) etc.
>
> I have read your macro_expand() hook idea, however
> when I understand it right you want to reuse position.stream and
> position.line as a kind of pointer (to save the extra 4 bytes).
> (Your goal is to minimize codebase change, however I wonder
> weather you dont change semantic of struct position and then
> need to change the code that uses struct position anyway...)

Nope, because the position.stream change only happens in
your dependency analysis program. It is the dependency program
that registers the hook. This behaviour is private to the dependency
analysis program. Other programs that use the sparse library don't see
it at all, because they don't register macro_expand hooks to perform
those stream manipulations. They will receive the exact same AST as before.

> Maybe it is possible like this...I doubt it, where should
> all the extra context, that each token has, be saved and
> extracted from? using that sheme...

Two places. One is symbol->aux. Also, the macro_expand record
can be looked up by pos->line, which indexes into the macro_expand
array that stores the context.

Having these two should be enough to produce the exact same
dependency result as you are getting right now.

> Maybe it is possible but I dont want to have as a design
> goal to save 4 bytes (I'd use the void *custom sheme to
> save all my extra data, also the pointers to tokens to
> "sit around") and adujust everything else to
> that. The consequence is that the code-complexity would
> grow on the other end.

It is not only about saving 4 bytes. It is about other programs
not having to suck in the full token struct if they don't need to.
It is about reusable macro hooks and parser hooks with which
external programs can do fancier things like source code transformations
without impacting the other users of the sparse lib.

> Here is my compromise then:
> Keep the orignial "pos". But still grant me for
> each struct a "void *custom" pointer that I can use
> to store extradata i.e. pointer to token.

symbol->aux.

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-04 22:30                   ` Christopher Li
@ 2012-05-05  0:32                     ` Josh Triplett
  2012-05-05  8:59                       ` Konrad Eisele
  2012-05-05  8:56                     ` Konrad Eisele
  1 sibling, 1 reply; 50+ messages in thread
From: Josh Triplett @ 2012-05-05  0:32 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, Konrad Eisele, linux-sparse

On Fri, May 04, 2012 at 03:30:21PM -0700, Christopher Li wrote:
> On Fri, May 4, 2012 at 1:53 PM, Konrad Eisele <eiselekd@gmail.com> wrote:
> > make C=2:
> >
> > original sparse:
> > real    17m54.997s
> > user    15m25.181s
> > sys     2m11.281s
> >
> > decpp-sparse from "git clone git://git.code.sf.net/p/decpp/code decpp "
> > real    18m29.748s
> > user    16m18.155s
> > sys     2m13.221s
> >
> > But decpp is not written with performance in common cases in mind.
> > The 2 runs probably also depend on other factors too.
> > I cant think that 4 bytes extra for each token can have a big impact,
> > if I would implement it that way (it is not in decpp).
> 
> The deal breaker is not able to free token list if other program using
> sparse don't need it.

From the top of token.h:

/*
 * Basic tokenization structures. NOTE! Those tokens had better
 * be pretty small, since we're going to keep them all in memory
 * indefinitely.

Has that changed?  If so, perhaps the comment needs fixing.  If not, I
suspect the problem lies elsewhere; perhaps in the extra levels of
indirection introduced by the patch, rather than in the extra memory
usage.

- Josh Triplett

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-04 23:05                 ` Christopher Li
@ 2012-05-05  8:54                   ` Konrad Eisele
  2012-05-05 11:12                     ` Christopher Li
  0 siblings, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-05  8:54 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, linux-sparse

On 05/05/2012 01:05 AM, Christopher Li wrote:
> On Fri, May 4, 2012 at 2:46 PM, Konrad Eisele<eiselekd@gmail.com>  wrote:
>>
>> Nice to hear this.
>> When I talk about macro dependency I mean not only the
>> macro expansion trace. I mean:
>>   (1). The #if (and #include) nestings (with dependencies
>>        pointing to the macros used in the proprocessor line)
>>   (2). The macro expansion trace
>>   (3). The connection 1+2 into the AST.
>> Your macro_expand() hook addresses (2) only, but I cant
>> see how all the extra context for each token can be saved
>> in that sheme.
>
> That is much better. There is two separate problem here.
> One is keep track of all the macro expand history so you can
> trace back the token back to the original form. I believe my
> description of the macro_expand hook should take care of that.

Ok, I'll try to implement it the way you suggest, encoding the macro
expansion into token.pos; see "Concerning (2)" below. Tell me whether
I can start implementing the scheme stated below (at least for
"Concerning (2)"). I would add 3 hooks as stated in the "Conclusion:" of
section "Concerning (2)". Can you give the OK to go?

Concerning (1): You didnt comment on this point.
------------------------------------------------

I would need a list-based pushdown stack. Each entry would
record the calls to lookup_macro() made while inside a # preprocessor
line. Then a mechanism has to be implemented to tag each
token with an entry in the pushdown stack (which builds up a
tree). I guess that you don't want a pointer in struct token :-)
so maybe each pushdown stack entry can record a start-pos and, when
popped, an end-pos, and these "ranges" can be used to match tokens.

I would need hooks for this at the # preprocessor line locations.

Concerning (2): Macro expansion trace using token.<pos>
-------------------------------------------------------

I've thought about how to fit in macro_expand and stuff the
macro trace into <pos>. Below is my sketch of how I would record
a macro expansion. p[] is the array of preprocessor "lines";
more precisely, it is an array of PP_struct (see below) with the
extra info needed for each line. PP_struct.copy is the copy of the
array of tokens involved.

Annotation: p[x] denotes the stuffing of the macro trace into
position.stream==preprocess,position.line==pp-line.
Token lists are written with "." between tokens: tok0 . tok1 . ...
Below each token of a token list I have written its
token.pos in p[x] notation. When token.pos is from file scope
I have written a range, e.g. [a.h:1:23..a.h:1:45], so as not to
have to write it for each token.

Note that a reference to p[] in p[x] notation only references
the "start" of the PP_struct.copy. A unique identification
of the "source" token might not always be possible because
of ambiguities, so when copying the tokens into
PP_struct.copy I might use an extended version of struct token
that also includes an offset.

----- file a.h start -----
#define D0(d0a0,d0a1) 1 D1(d0a0) 2 D2(d0a1) 3
#define D1(d1a0) 4 d1a0 5
#define D2(d2a0) 6 d2a0 7
#define D3(d3a0) 8 d3a0 9
D0(D3(10),11)
----- file a.h end   .....

Preprocessor output (gcc -E a.h): "1 4 8 10 9 5 2 6 11 7 3"

PreProcessor macro trace on p[]:

p[0]:mdefn_body[D0]     :1.D1.(.d0a0.).2.D2.(.d0a1.).3
                          [ a.h:1:23     ..   a.h:1:45]
p[1]:mdefn_body[D1]     :4   .   d1a0   .    5
                          [ a.h:2:18..a.h:2:25]
p[2]:mdefn_body[D2]     :6   .   d2a0   .    7
                          [ a.h:3:18..a.h:3:25]
p[3]:mdefn_body[D3]     :8   .   d3a0   .    9
                          [ a.h:4:18..a.h:4:25]
p[4]:minst_arg0[D0]     :D3  . (  .   10 . )
                          [ a.h:5:4..a.h:5:9]
p[5]:minst_arg1[D0]     :11
                          [a.h:5:11]
p[6]:minst_arg0[D3]     :10
                          p[4]
p[7]:(args)expand[p[3]] :8    .  10   .  9
                          p[3]    p[4]    p[3]
p[8]:minst_arg0[D2]     :11
                          p[5]
p[9]:(body)expand[p[2]] :6   .   11   .    7
                          p[2]    p[5]      p[2]
p[10]:(body)expand[p[0]]:1  .4  .8  .10 .9  .5  .2  .6  .11 .7  .3
                          p[0]p[1]p[7]p[7]p[7]p[1]p[0]p[9]p[9]p[9]p[0]


p[0]-p[3] are built when the macros are defined.
           A p[] entry is needed to distinguish between
           the different sources of tokens.
p[4],p[5] are built in collect_arguments() for D0(D3(10),11)
p[6]      is built in collect_arguments() for D3(10)
p[7]      is built in the call to the macro_expand() hook with a flag
           that it is an (args)expand
p[8]      is built in collect_arguments() for D2(11)
           (inside D0's expansion)
p[9]      is built in the call to the macro_expand() hook with a flag
           that it is a (body)expand (of D2)
p[10]     is built in the call to the macro_expand() hook with a flag
           that it is a (body)expand (of D0)

struct PP_struct {
	enum {minst_arg, expand_body, expand_arg, mdef_body} typ;
	unsigned int argidx;
	struct symbol *macro;
	struct token copy[];
};

Conclusion:
-----------
Apart from the macro_expand() hook I also need hooks
in macro definition and also in collect_arguments() or expand().


Concerning (3) How to connect (1) and (2) to the AST
----------------------------------------------------

This can maybe wait for a later iteration. There are more complex parts
involved...


>
> Now how to connect the AST tree with those information is a
> very good question. Notice the symbol->aux pointer? That is
> the place to attach extra context or back end related data
> to symbols.
>
> Because each symbol has "pos" and "endpos". If the symbol
> is expand from macro, using the previous scheme, the pos
> should point to a line in the "<pre-processor>" stream.
>
> However, if the macro expand is happen between "pos" and
> "endpos", you will not able to access the token that contain
> the macro expand "pos" easily.
>
> For that, we could, just thinking it out loud, add a parser
> hook for declares when a symbol is complete building.
> That would a very small and straight forward change.
> If the hook is not NULL, the call back function will be call
> with the symbol that just get defined, and the start and end
> token of that symbol.
>
> So your dependence program just need to register the
> symbol parsing hook. In side the call back function, walk
> the token from start to end. Look up macro expand information
> is needed. Build up the dependency struct and store that in
> symbol->aux.
>
> BTW, unrelated to this patch, I can see other program might
> be able to use the same parser hook to perform source code
> transformations as well.
>
> Make sense? In this way, you don't even need the hash
> table to attach a context into the token. You can get it directly
> from symbol->aux.
>
>> In my patch I have modeled (2) using 2 structs:
>> struct macro_expansion {
>>         int nargs;
>>         struct symbol *sym;
>>         struct token *m;
>>         struct arg args[0];
>> };
>> struct tok_macro_dep {
>>         struct macro_expansion *m;
>>         unsigned int argi;
>>         unsigned int isbody : 1;
>>         unsigned int visited : 1;
>> };
>> Each token from a macro expansion gets tagged with
>> tok_macro_dep. If it is an macro argument,<argi>  shows the
>> index, if it is from the macro body<isbody>  is 1.
>> Now, I didnt already think about special cases like
>> token concaternation, even more data is needed to
>> model this. Also when an macro argument is again used as an
>> macro argument inside the body expansion, then I kindof
>> loose the chain: I would also need a "token *dup_of" pointer
>> to point to the original token that the token is a copy
>> of (when arguments are created...) etc.
>>
>> I have read your macro_expand() hook idea, however
>> when I understand it right you want to reuse position.stream and
>> position.line as a kind of pointer (to save the extra 4 bytes).
>> (Your goal is to minimize codebase change, however I wonder
>> weather you dont change semantic of struct position and then
>> need to change the code that uses struct position anyway...)
>
> Nope, because the position.stream change is only happen on
> your dependency analyse program. It is the dependency program
> register the hook to it. This behaviour is private to the dependency
> analyse program. Other program that use sparse library don't see
> it at all, because they don't register macro_expand hooks to perform
> those stream manipulations. It will receive the exact AST as before.
>
>> Maybe it is possible like this...I doubt it, where should
>> all the extra context, that each token has, be saved and
>> extracted from? using that sheme...
>
> Two places, one is symbol->aux. Also the macro_expand
> can be lookup by pos->line. That will index into the macro_expand
> array which store the context.
>
> Having this two should be enough to put the exact same
> dependency result as you are doing right now.
>
>> Maybe it is possible but I dont want to have as a design
>> goal to save 4 bytes (I'd use the void *custom sheme to
>> save all my extra data, also the pointers to tokens to
>> "sit around") and adujust everything else to
>> that. The consequence is that the code-complexity would
>> grow on the other end.
>
> It is not only about saving 4 bytes. It is about other program
> don't have to suck in the full token struct if they don't need to.
> It is about re-usable macro hooks and parser hooks that
> external program can do more fancy stuff like source code transformations
> without impacting the other user of the sparse lib.
>
>> Here is my compromise then:
>> Keep the orignial "pos". But still grant me for
>> each struct a "void *custom" pointer that I can use
>> to store extradata i.e. pointer to token.
>
> symbol->aux.
>
> Chris
>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-04 22:30                   ` Christopher Li
  2012-05-05  0:32                     ` Josh Triplett
@ 2012-05-05  8:56                     ` Konrad Eisele
  1 sibling, 0 replies; 50+ messages in thread
From: Konrad Eisele @ 2012-05-05  8:56 UTC (permalink / raw)
  To: Christopher Li; +Cc: Josh Triplett, Konrad Eisele, linux-sparse

On 05/05/2012 12:30 AM, Christopher Li wrote:
> On Fri, May 4, 2012 at 1:53 PM, Konrad Eisele<eiselekd@gmail.com>  wrote:
>> make C=2:
>>
>> original sparse:
>> real    17m54.997s
>> user    15m25.181s
>> sys     2m11.281s
>>
>> decpp-sparse from "git clone git://git.code.sf.net/p/decpp/code decpp "
>> real    18m29.748s
>> user    16m18.155s
>> sys     2m13.221s
>>
>> But decpp is not written with performance in common cases in mind.
>> The 2 runs probably also depend on other factors too.
>> I cant think that 4 bytes extra for each token can have a big impact,
>> if I would implement it that way (it is not in decpp).
>
> The deal breaker is not able to free token list if other program using
> sparse don't need it. I believe that I have an alternative approach
> allow you do want you want while keeping the impact to sparse
> internal small. In my mind, that is straightly better than your
> current patch. I can share more details if you found a missing link.
>
>> I understand. Actually the code to demonstrate is
>> git://git.code.sf.net/p/decpp/code , then do a
>> $make
>> $./shrinkc t1.c
>> That is kind of the goal.
>> And - it does require some  internal structure
>> change. You dont get this kind of functionality
>> for free. You have to be invasive, isnt this
>> something that is obvious?. And in my view, it
>> can come with penalty. The preprocessing stage
>
> That doesn't mean that we should pay any penalty
> for it. Especially I believe there is better alternative
> approach to allow you do what you want and keep
> the API clean. You need to be a little patient here
> to understand my alternative suggestions. I did spend
> some time here to come up with the approach to make
> it works for you and keep myself happy about the internals.
>
> You dismiss the my suggestion too eagerly without
> considering how to make it work. You are pretty much saying,
> "Nah, this is not going to work, I am calling in the big hammers."
> You should at least consider it, point out where you think
> it doesn't work, so that I can provide addition details how
> to make it work. I did notice that I have some detail left out in
> my suggestions, mostly due to time constrain to write it up.
>
> Chris
>

I'm willing to learn. I'll implement your token.<pos> scheme;
see the other mail I just sent on that subject.
-- Konrad



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-05  0:32                     ` Josh Triplett
@ 2012-05-05  8:59                       ` Konrad Eisele
  0 siblings, 0 replies; 50+ messages in thread
From: Konrad Eisele @ 2012-05-05  8:59 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Christopher Li, Konrad Eisele, linux-sparse

On 05/05/2012 02:32 AM, Josh Triplett wrote:
> On Fri, May 04, 2012 at 03:30:21PM -0700, Christopher Li wrote:
>> On Fri, May 4, 2012 at 1:53 PM, Konrad Eisele<eiselekd@gmail.com>  wrote:
>>> make C=2:
>>>
>>> original sparse:
>>> real    17m54.997s
>>> user    15m25.181s
>>> sys     2m11.281s
>>>
>>> decpp-sparse from "git clone git://git.code.sf.net/p/decpp/code decpp "
>>> real    18m29.748s
>>> user    16m18.155s
>>> sys     2m13.221s
>>>
>>> But decpp is not written with performance in common cases in mind.
>>> The 2 runs probably also depend on other factors too.
>>> I cant think that 4 bytes extra for each token can have a big impact,
>>> if I would implement it that way (it is not in decpp).
>>
>> The deal breaker is not able to free token list if other program using
>> sparse don't need it.
>
>> From the top of token.h:
>
> /*
>   * Basic tokenization structures. NOTE! Those tokens had better
>   * be pretty small, since we're going to keep them all in memory
>   * indefinitely.
>
> Has that changed?  If so, perhaps the comment needs fixing.  If not, I
> suspect the problem lies elsewhere; perhaps in the extra levels of
> indirection introduced by the patch, rather than in the extra memory
> usage.

The benchmark of decpp's sparse is not based on the patch I sent.
Rather, decpp's sparse tags every token with a lot of extra data,
and was written without performance in mind. If it were programmed
with the performance of the common cases in mind, it shouldn't have
an impact at all...
I think I understand Christopher's suggestion now and might implement
it that way.
-- Konrad



>
> - Josh Triplett
>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-05  8:54                   ` Konrad Eisele
@ 2012-05-05 11:12                     ` Christopher Li
  2012-05-05 16:59                       ` Konrad Eisele
  0 siblings, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-05-05 11:12 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, linux-sparse

[-- Attachment #1: Type: text/plain, Size: 11133 bytes --]

On Sat, May 5, 2012 at 1:54 AM, Konrad Eisele <eiselekd@gmail.com> wrote:
>>
>>
>> That is much better. There is two separate problem here.
>> One is keep track of all the macro expand history so you can
>> trace back the token back to the original form. I believe my
>> description of the macro_expand hook should take care of that.
>
>
> Ok, I'll try to implement it the way you suggest, coding macro-
> expansion into token.pos See (Concerning (2)). Tell me weather
> I can start implementing the scheme stated below (at least for
> (Concerning (2)). I would add 3 hooks as stated in "Conclusion:" of
> section "Concerning (2)". Can you give the ok to go?
>
> Concerning (1): You didnt comment on this point.

Oh, obviously you can register a preprocess hook to be notified
of the #ifdef and #include directives. I consider adding this kind of
callback function less invasive, because programs that only use the
standard preprocessor don't need to pay the price for it.

I attach a patch where I am playing with the macro expand hook idea.
It compiles, but I haven't done any testing beyond that.
Feel free to use or modify it as you see fit.


> ------------------------------------------------
>
> I would need a list-based-pushdown-stack. Each entry would
> register calls to lookup_macro() when inside a # preprocessor
> line. Then an mechanism has to be implemented to tag each
> token with an entry in the pushdown stack (which builds up a
> tree). I guess that you dont want a pointer in struct token :-)
> so maybe the pushdown stack can define start-pos and when popped
> end-pos and use these "ranges" to match tokens.

I consider how to track the macro dependencies to be part of the
internal behaviour of the shrink program. Sparse is a library used
by this program. I want to make sure the API provided by the sparse
library exposes enough information to allow dependency analysis.

As for programs that call into the sparse library, I don't have a
strong opinion on how they should be implemented. I care more
about the plumbing part of the sparse library: what API sparse
should provide to enable this kind of analysis.

> I would need hooks for this in the # preprocessor line locations.

Sure. Add it as one of the preprocessor hooks.

>
> Concerning (2): Macro expansion trace using token.<pos>
> -------------------------------------------------------
>
> I've thought about how to fit in macro_expand and stuffing
> macro trace into <pos>. Below is my sketch how I would record
> a macro expansion. p[] is the array of preprocessor-"lines",
> rather, it is an array of PP_struct (see below) with extra info
> needed for each line. PP_struct.copy is the copy of the array of
> tokens involved.
>
> Annotation: p[x] denotes the stuffing of the macrotrace into
> position.stream==preprocess,position.line==pp-line.
> Tokenlists are written with "." between: tok0 . tok1 . ...
> Under the tokenlists I have written below each token its
> token.pos in p[x] notation, when token.pos is from file-scope
> I have written a range, i.e [a.h:1:23..a.h:1:45] so not to
> have write it for each token.

Again, as long as it is part of the program and not the sparse
library, you are relatively free to choose your implementation
within your program. It should not have an impact on other
programs that use the sparse library anyway. I used <pos> to make
the point that the macro_expand hook can work.

I am not sure I understand your range representation yet.

To be continued...

Chris


>
> Note that a reference to p[] in p[x] notation only references
> the "start" of the  PP_struct.copy. An uique identification
> of the "source" token might not always be possible because
> of disambiguities, so when doing a copy of the  tokens in
> PP_struct.copy I might use an extended version of struct token
> to also include an offset.
>
> ----- file a.h start -----
> #define D0(d0a0,d0a1) 1 D1(d0a0) 2 D2(d0a1) 3
> #define D1(d1a0) 4 d1a0 5
> #define D2(d2a0) 6 d2a0 7
> #define D3(d3a0) 8 d3a0 9
> D0(D3(10),11)
> ----- file a.h end   .....
>
> Preprocessor output (gcc -E a.h): "1 4 8 10 9 5 2 6 11 7 3"
>
> PreProcessor macro trace on p[]:
>
> p[0]:mdefn_body[D0]     :1.D1.(.d0a0.).2.D2.(.d0a1.).3
>                         [ a.h:1:23     ..   a.h:1:45]
> p[1]:mdefn_body[D1]     :4   .   d1a0   .    5
>                         [ a.h:2:18..a.h:2:25]
> p[2]:mdefn_body[D2]     :6   .   d2a0   .    7
>                         [ a.h:3:18..a.h:3:25]
> p[3]:mdefn_body[D3]     :8   .   d3a0   .    9
>                         [ a.h:4:18..a.h:4:25]
> p[4]:minst_arg0[D0]     :D3  . (  .   10 . )
>                         [ a.h:5:4..a.h:5:9]
> p[5]:minst_arg1[D0]     :11
>                         [a.h:5:11]
> p[6]:minst_arg0[D3]     :10
>                         p[4]
> p[7]:(args)expand[p[3]] :8    .  10   .  9
>                         p[3]    p[4]    p[3]
> p[8]:minst_arg0[d2]     :11
>                         p[5]
> p[9]:(body)expand[p[2]] :6   .   11   .    7
>                         p[2]    p[5]      p[2]
> p[10]:(body)expand[p[0]]:1  .4  .8  .10 .9  .5  .2  .6  .11 .7  .3
>                         p[0]p[1]p[7]p[7]p[7]p[1]p[0]p[9]p[9]p[9]p[0]
>
>
> p[0]-p[3] are build up when the macro is defined.
>          A p[] entry is needed to destinguish between
>          the different sources of tokens.
> p[4],p[5] is build in collect_arguments() for D0(D3(10),11)
> p[6]      is build in collect_arguments() for D3(10)
> p[7]      is build in call to macro_expand() hook with flag that
>          it is a (args)expand
> p[8]      is build in collect_arguments() for D2(11)
>          (inside D0's expansion
> p[9]      is build in call to macro_expand() hook with flag that
>          it is a (body)expand (of D2)
> p[10]     is build in call to macro_expand() hook with flag that
>          it is a (body)expand (of D0)
>
> PP_struct {
>          enum {minst_arg, expand_body, expand_arg, mdef_body} typ;
>          uint argidx;
>          struct symbol *macro;
>          struct token copy[];
> };
>
> Conclusion:
> -----------
> Apart from the macro_expand() hook I also need hooks
> in macro definition and also in collect_arguments() or expand().
>
>
> Concerning (3) How to connect (1) and (2) to the AST
> ----------------------------------------------------
>
> can maybe wait for later iteration. There are more complex parts
> involved...
>
>
>
>>
>> Now how to connect the AST tree with those information is a
>> very good question. Notice the symbol->aux pointer? That is
>> the place to attach extra context or back end related data
>> to symbols.
>>
>> Because each symbol has "pos" and "endpos". If the symbol
>> is expand from macro, using the previous scheme, the pos
>> should point to a line in the "<pre-processor>" stream.
>>
>> However, if the macro expand is happen between "pos" and
>> "endpos", you will not able to access the token that contain
>> the macro expand "pos" easily.
>>
>> For that, we could, just thinking it out loud, add a parser
>> hook for declares when a symbol is complete building.
>> That would a very small and straight forward change.
>> If the hook is not NULL, the call back function will be call
>> with the symbol that just get defined, and the start and end
>> token of that symbol.
>>
>> So your dependence program just need to register the
>> symbol parsing hook. In side the call back function, walk
>> the token from start to end. Look up macro expand information
>> is needed. Build up the dependency struct and store that in
>> symbol->aux.
>>
>> BTW, unrelated to this patch, I can see other program might
>> be able to use the same parser hook to perform source code
>> transformations as well.
>>
>> Make sense? In this way, you don't even need the hash
>> table to attach a context into the token. You can get it directly
>> from symbol->aux.
>>
>>> In my patch I have modeled (2) using 2 structs:
>>> struct macro_expansion {
>>>        int nargs;
>>>        struct symbol *sym;
>>>        struct token *m;
>>>        struct arg args[0];
>>> };
>>> struct tok_macro_dep {
>>>        struct macro_expansion *m;
>>>        unsigned int argi;
>>>        unsigned int isbody : 1;
>>>        unsigned int visited : 1;
>>> };
>>> Each token from a macro expansion gets tagged with
>>> tok_macro_dep. If it is an macro argument,<argi>  shows the
>>> index, if it is from the macro body<isbody>  is 1.
>>> Now, I didnt already think about special cases like
>>> token concaternation, even more data is needed to
>>> model this. Also when an macro argument is again used as an
>>> macro argument inside the body expansion, then I kindof
>>> loose the chain: I would also need a "token *dup_of" pointer
>>> to point to the original token that the token is a copy
>>> of (when arguments are created...) etc.
>>>
>>> I have read your macro_expand() hook idea, however
>>> when I understand it right you want to reuse position.stream and
>>> position.line as a kind of pointer (to save the extra 4 bytes).
>>> (Your goal is to minimize codebase change, however I wonder
>>> weather you dont change semantic of struct position and then
>>> need to change the code that uses struct position anyway...)
>>
>>
>> Nope, because the position.stream change is only happen on
>> your dependency analyse program. It is the dependency program
>> register the hook to it. This behaviour is private to the dependency
>> analyse program. Other program that use sparse library don't see
>> it at all, because they don't register macro_expand hooks to perform
>> those stream manipulations. It will receive the exact AST as before.
>>
>>> Maybe it is possible like this...I doubt it, where should
>>> all the extra context, that each token has, be saved and
>>> extracted from? using that sheme...
>>
>>
>> Two places, one is symbol->aux. Also the macro_expand
>> can be lookup by pos->line. That will index into the macro_expand
>> array which store the context.
>>
>> Having this two should be enough to put the exact same
>> dependency result as you are doing right now.
>>
>>> Maybe it is possible but I dont want to have as a design
>>> goal to save 4 bytes (I'd use the void *custom sheme to
>>> save all my extra data, also the pointers to tokens to
>>> "sit around") and adujust everything else to
>>> that. The consequence is that the code-complexity would
>>> grow on the other end.
>>
>>
>> It is not only about saving 4 bytes. It is about other program
>> don't have to suck in the full token struct if they don't need to.
>> It is about re-usable macro hooks and parser hooks that
>> external program can do more fancy stuff like source code transformations
>> without impacting the other user of the sparse lib.
>>
>>> Here is my compromise then:
>>> Keep the orignial "pos". But still grant me for
>>> each struct a "void *custom" pointer that I can use
>>> to store extradata i.e. pointer to token.
>>
>>
>> symbol->aux.
>>
>> Chris
>>
>

[-- Attachment #2: 0001-macro-expand-hook.patch --]
[-- Type: application/octet-stream, Size: 1574 bytes --]

From dd9c308543da53138d12bb04261d62b8adc112a8 Mon Sep 17 00:00:00 2001
From: Christopher Li <sparse@chrisli.org>
Date: Sat, 5 May 2012 03:09:18 -0700
Subject: [PATCH] macro expand hook

---
 pre-process.c |    4 ++++
 token.h       |    5 +++++
 2 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/pre-process.c b/pre-process.c
index 8a16f8b..63cd509 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -30,6 +30,8 @@
 
 static int false_nesting = 0;
 
+struct preprocess_hook *preprocess_hook = NULL;
+
 #define INCLUDEPATHS 300
 const char *includepath[INCLUDEPATHS+1] = {
 	"",
@@ -625,6 +627,8 @@ static int expand(struct token **list, struct symbol *sym)
 
 	last = token->next;
 	tail = substitute(list, sym->expansion, args);
+	if (preprocess_hook && preprocess_hook->expand)
+		preprocess_hook->expand(token, list, tail);
 	*tail = last;
 
 	return 0;
diff --git a/token.h b/token.h
index cd29233..2344932 100644
--- a/token.h
+++ b/token.h
@@ -171,6 +171,10 @@ struct token {
 	};
 };
 
+struct preprocess_hook {
+	void (*expand)(struct token *macro, struct token **replace, struct token **replace_tail);
+};
+
 #define MAX_STRING 4095
 
 static inline struct token *containing_token(struct token **p)
@@ -188,6 +192,7 @@ static inline struct token *containing_token(struct token **p)
  */
 extern struct token eof_token_entry;
 #define eof_token(x) ((x) == &eof_token_entry)
+extern struct preprocess_hook *preprocess_hook;
 
 extern int init_stream(const char *, int fd, const char **next_path);
 extern const char *stream_name(int stream);
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: dependency tee from c parser entities downto token
  2012-05-05 11:12                     ` Christopher Li
@ 2012-05-05 16:59                       ` Konrad Eisele
       [not found]                         ` <CANeU7Qn7vUzLQAF6JGRECro_pPDnL7MCswkrNACe1wohLHZu7g@mail.gmail.com>
  0 siblings, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-05 16:59 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, linux-sparse

>
> I am not sure I understand your range representation yet.
>

You need to view it with a fixed-width font. It's not rocket science:
token lists (or arrays) are shown as dot-separated lists. The token.pos
field is listed below each token, either as p[x] or as the file location
when the token comes from file scope.

I'll come up with a patch that implements this scheme and send it when
I have time; it might take a while.
-- Konrad

> To be continue...
>
> Chris
>
>
>>
>> Note that a reference to p[] in p[x] notation only references
>> the "start" of the PP_struct.copy. A unique identification
>> of the "source" token might not always be possible because
>> of ambiguities, so when copying the tokens into
>> PP_struct.copy I might use an extended version of struct token
>> that also includes an offset.
>>
>> ----- file a.h start -----
>> #define D0(d0a0,d0a1) 1 D1(d0a0) 2 D2(d0a1) 3
>> #define D1(d1a0) 4 d1a0 5
>> #define D2(d2a0) 6 d2a0 7
>> #define D3(d3a0) 8 d3a0 9
>> D0(D3(10),11)
>> ----- file a.h end   -----
>>
>> Preprocessor output (gcc -E a.h): "1 4 8 10 9 5 2 6 11 7 3"
>>
>> PreProcessor macro trace on p[]:
>>
>> p[0]:mdefn_body[D0]     :1.D1.(.d0a0.).2.D2.(.d0a1.).3
>>                         [ a.h:1:23     ..   a.h:1:45]
>> p[1]:mdefn_body[D1]     :4   .   d1a0   .    5
>>                         [ a.h:2:18..a.h:2:25]
>> p[2]:mdefn_body[D2]     :6   .   d2a0   .    7
>>                         [ a.h:3:18..a.h:3:25]
>> p[3]:mdefn_body[D3]     :8   .   d3a0   .    9
>>                         [ a.h:4:18..a.h:4:25]
>> p[4]:minst_arg0[D0]     :D3  . (  .   10 . )
>>                         [ a.h:5:4..a.h:5:9]
>> p[5]:minst_arg1[D0]     :11
>>                         [a.h:5:11]
>> p[6]:minst_arg0[D3]     :10
>>                         p[4]
>> p[7]:(args)expand[p[3]] :8    .  10   .  9
>>                         p[3]    p[4]    p[3]
>> p[8]:minst_arg0[D2]     :11
>>                         p[5]
>> p[9]:(body)expand[p[2]] :6   .   11   .    7
>>                         p[2]    p[5]      p[2]
>> p[10]:(body)expand[p[0]]:1  .4  .8  .10 .9  .5  .2  .6  .11 .7  .3
>>                         p[0]p[1]p[7]p[7]p[7]p[1]p[0]p[9]p[9]p[9]p[0]
>>
>>
>> p[0]-p[3] are built when the macro is defined.
>>          A p[] entry is needed to distinguish between
>>          the different sources of tokens.
>> p[4],p[5] are built in collect_arguments() for D0(D3(10),11)
>> p[6]      is built in collect_arguments() for D3(10)
>> p[7]      is built in the call to the macro_expand() hook with a flag
>>          that it is an (args)expand
>> p[8]      is built in collect_arguments() for D2(11)
>>          (inside D0's expansion)
>> p[9]      is built in the call to the macro_expand() hook with a flag
>>          that it is a (body)expand (of D2)
>> p[10]     is built in the call to the macro_expand() hook with a flag
>>          that it is a (body)expand (of D0)
>>
>> PP_struct {
>>          enum {minst_arg, expand_body, expand_arg, mdef_body} typ;
>>          uint argidx;
>>          struct symbol *macro;
>>          struct token copy[];
>> };
>>
>> Conclusion:
>> -----------
>> Apart from the macro_expand() hook I also need hooks
>> in macro definition and also in collect_arguments() or expand().
>>
>>
>> Concerning (3) How to connect (1) and (2) to the AST
>> ----------------------------------------------------
>>
>> can maybe wait for later iteration. There are more complex parts
>> involved...
>>
>>
>>
>>>
>>> Now how to connect the AST tree with those information is a
>>> very good question. Notice the symbol->aux pointer? That is
>>> the place to attach extra context or back end related data
>>> to symbols.
>>>
>>> Because each symbol has "pos" and "endpos". If the symbol
>>> is expand from macro, using the previous scheme, the pos
>>> should point to a line in the "<pre-processor>" stream.
>>>
>>> However, if the macro expand is happen between "pos" and
>>> "endpos", you will not able to access the token that contain
>>> the macro expand "pos" easily.
>>>
>>> For that, we could, just thinking it out loud, add a parser
>>> hook for declares when a symbol is complete building.
>>> That would a very small and straight forward change.
>>> If the hook is not NULL, the call back function will be call
>>> with the symbol that just get defined, and the start and end
>>> token of that symbol.
>>>
>>> So your dependence program just need to register the
>>> symbol parsing hook. In side the call back function, walk
>>> the token from start to end. Look up macro expand information
>>> is needed. Build up the dependency struct and store that in
>>> symbol->aux.
>>>
>>> BTW, unrelated to this patch, I can see other program might
>>> be able to use the same parser hook to perform source code
>>> transformations as well.
>>>
>>> Make sense? In this way, you don't even need the hash
>>> table to attach a context into the token. You can get it directly
>>> from symbol->aux.
>>>
>>>> In my patch I have modeled (2) using 2 structs:
>>>> struct macro_expansion {
>>>>        int nargs;
>>>>        struct symbol *sym;
>>>>        struct token *m;
>>>>        struct arg args[0];
>>>> };
>>>> struct tok_macro_dep {
>>>>        struct macro_expansion *m;
>>>>        unsigned int argi;
>>>>        unsigned int isbody : 1;
>>>>        unsigned int visited : 1;
>>>> };
>>>> Each token from a macro expansion gets tagged with
>>>> tok_macro_dep. If it is an macro argument,<argi>  shows the
>>>> index, if it is from the macro body<isbody>  is 1.
>>>> Now, I didnt already think about special cases like
>>>> token concaternation, even more data is needed to
>>>> model this. Also when an macro argument is again used as an
>>>> macro argument inside the body expansion, then I kindof
>>>> loose the chain: I would also need a "token *dup_of" pointer
>>>> to point to the original token that the token is a copy
>>>> of (when arguments are created...) etc.
>>>>
>>>> I have read your macro_expand() hook idea, however
>>>> when I understand it right you want to reuse position.stream and
>>>> position.line as a kind of pointer (to save the extra 4 bytes).
>>>> (Your goal is to minimize codebase change, however I wonder
>>>> weather you dont change semantic of struct position and then
>>>> need to change the code that uses struct position anyway...)
>>>
>>>
>>> Nope, because the position.stream change is only happen on
>>> your dependency analyse program. It is the dependency program
>>> register the hook to it. This behaviour is private to the dependency
>>> analyse program. Other program that use sparse library don't see
>>> it at all, because they don't register macro_expand hooks to perform
>>> those stream manipulations. It will receive the exact AST as before.
>>>
>>>> Maybe it is possible like this...I doubt it, where should
>>>> all the extra context, that each token has, be saved and
>>>> extracted from? using that sheme...
>>>
>>>
>>> Two places, one is symbol->aux. Also the macro_expand
>>> can be lookup by pos->line. That will index into the macro_expand
>>> array which store the context.
>>>
>>> Having this two should be enough to put the exact same
>>> dependency result as you are doing right now.
>>>
>>>> Maybe it is possible but I dont want to have as a design
>>>> goal to save 4 bytes (I'd use the void *custom sheme to
>>>> save all my extra data, also the pointers to tokens to
>>>> "sit around") and adujust everything else to
>>>> that. The consequence is that the code-complexity would
>>>> grow on the other end.
>>>
>>>
>>> It is not only about saving 4 bytes. It is about other program
>>> don't have to suck in the full token struct if they don't need to.
>>> It is about re-usable macro hooks and parser hooks that
>>> external program can do more fancy stuff like source code transformations
>>> without impacting the other user of the sparse lib.
>>>
>>>> Here is my compromise then:
>>>> Keep the orignial "pos". But still grant me for
>>>> each struct a "void *custom" pointer that I can use
>>>> to store extradata i.e. pointer to token.
>>>
>>>
>>> symbol->aux.
>>>
>>> Chris
>>>
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Fwd: dependency tee from c parser entities downto token
       [not found]                         ` <CANeU7Qn7vUzLQAF6JGRECro_pPDnL7MCswkrNACe1wohLHZu7g@mail.gmail.com>
@ 2012-05-05 19:56                           ` Christopher Li
  2012-05-05 23:38                             ` Konrad Eisele
  0 siblings, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-05-05 19:56 UTC (permalink / raw)
  To: Linux-Sparse; +Cc: Konrad Eisele

Sorry, I forgot to CC the list.

Chris


---------- Forwarded message ----------
From: Christopher Li <sparse@chrisli.org>
Date: Sat, May 5, 2012 at 12:54 PM
Subject: Re: dependency tee from c parser entities downto token
To: Konrad Eisele <eiselekd@gmail.com>


On Sat, May 5, 2012 at 9:59 AM, Konrad Eisele <eiselekd@gmail.com> wrote:
>>
>> I am not sure I understand your range representation yet.
>>
>
> You need to view it with a fixed width font. Its not rocket science,
> token lists (or arrays) are viewed as dotted lists. The token.pos
> field is listed below each token as p[x] or as the file-location
> in file-scope.

I am not sure using ranges is a win, because you need to find
the token within a list of ranges. That is a lot of hassle.
Just walking the replacement tokens in the macro expand hook
and stamping them all with a new stream and line number seems
easier. The lookup is much simpler too: just an index into the array.
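
The stamping idea described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: struct position
and struct token are simplified stand-ins for the sparse structures, and
PP_STREAM, struct expansion_record, record_expansion() and
lookup_expansion() are hypothetical names that do not exist in sparse.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for sparse's structures (the real ones have more fields). */
struct position { int stream; int line; };
struct token { struct position pos; struct token *next; const char *text; };

/* Hypothetical record kept for each macro expansion. */
struct expansion_record {
	const char *macro_name;
	struct position origin;   /* where the macro was invoked */
};

#define PP_STREAM      1000   /* assumed id of a synthetic "<pre-processor>" stream */
#define MAX_EXPANSIONS 64

static struct expansion_record expansions[MAX_EXPANSIONS];
static int n_expansions;

/* Stamp every token in [first, end) with the synthetic stream and the
 * index of a new expansion record; return that index. */
static int record_expansion(const char *name, struct position origin,
			    struct token *first, struct token *end)
{
	struct token *t;
	int idx;

	assert(n_expansions < MAX_EXPANSIONS);
	idx = n_expansions++;
	expansions[idx].macro_name = name;
	expansions[idx].origin = origin;
	for (t = first; t && t != end; t = t->next) {
		t->pos.stream = PP_STREAM;
		t->pos.line = idx;
	}
	return idx;
}

/* Lookup is a plain array index; NULL for tokens that came from a file. */
static const struct expansion_record *lookup_expansion(const struct token *t)
{
	if (t->pos.stream != PP_STREAM)
		return NULL;
	assert(t->pos.line >= 0 && t->pos.line < n_expansions);
	return &expansions[t->pos.line];
}
```

A hook would call something like record_expansion() once per expansion,
and any later consumer that sees a token in the synthetic stream can map
it back to its expansion without searching a list of ranges.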

>
> I'll come up with a patch to implement this scheme when I have
> time to and send it, it might take a while.

Can you keep the changes to the core sparse library (e.g. adding
hooks in the parser and pre-processor) as a separate patch? You can
send that part out for review earlier if the other parts of your
dependency analysis are not ready yet.

Chris

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-05 19:56                           ` Fwd: " Christopher Li
@ 2012-05-05 23:38                             ` Konrad Eisele
  2012-05-06 18:34                               ` Christopher Li
  0 siblings, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-05 23:38 UTC (permalink / raw)
  To: Christopher Li; +Cc: Linux-Sparse

[-- Attachment #1: Type: text/plain, Size: 1101 bytes --]


>> I'll come up with a patch to implement this scheme when I have
>> time to and send it, it might take a while.
>
> Can you keep the change to the core sparse library (e.g. adding
> hook in parser and pre-processor) as a separate patch? You can
> send that part out for review earlier if other part of your dependency
> analyse is not ready yet.

I have appended a diff for review. Is this kind of interface OK?
This is not really a patch to apply; rather, I want to avoid putting
effort into it and having you tell me later that I am too intrusive...:). So
can you give an OK or a comment...

Interface so far:
struct preprocess_hook {
	def      : called when #define is processed
	args_beg : called before argument expansion
	args_end : called after argument expansion
	body_beg : called before body expansion
	body_end : called after body expansion
	post     : called after preprocess
};
I found that all of these are needed. There might be more to add...

I also introduce a token type TOKEN_M_EMPTY so that I can track
empty expansions. To filter these out again I add the post hook.

-- Konrad








>
> Chris
>


[-- Attachment #2: diff.txt --]
[-- Type: text/plain, Size: 5615 bytes --]

diff --git a/lib.c b/lib.c
index 396e9f1..554d6c4 100644
--- a/lib.c
+++ b/lib.c
@@ -974,7 +974,8 @@ struct symbol_list * __sparse(char *filename)
 	res = sparse_keep_tokens(filename);
 
 	/* Drop the tokens for this file after parsing */
-	clear_token_alloc();
+	if (!PPHOOKEN())
+		clear_token_alloc();
 
 	/* And return it */
 	return res;
diff --git a/pre-process.c b/pre-process.c
index 8a16f8b..99ee5a4 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -30,6 +30,8 @@
 
 static int false_nesting = 0;
 
+struct preprocess_hook *preprocess_hook = NULL;
+
 #define INCLUDEPATHS 300
 const char *includepath[INCLUDEPATHS+1] = {
 	"",
@@ -74,9 +76,10 @@ static const char **dirafter_includepath = includepath + 3;
 static struct token *alloc_token(struct position *pos)
 {
 	struct token *token = __alloc_token(0);
-
-	token->pos.stream = pos->stream;
-	token->pos.line = pos->line;
+	if (PPHOOKEN()) {
+		token->pos.stream = pos->stream;
+		token->pos.line = pos->line;
+	}
 	token->pos.pos = pos->pos;
 	token->pos.whitespace = 1;
 	return token;
@@ -106,7 +109,7 @@ static void replace_with_integer(struct token *token, unsigned int val)
 	token->number = buf;
 }
 
-static struct symbol *lookup_macro(struct ident *ident)
+struct symbol *lookup_macro(struct ident *ident)
 {
 	struct symbol *sym = lookup_symbol(ident, NS_MACRO | NS_UNDEF);
 	if (sym && sym->namespace != NS_MACRO)
@@ -231,28 +234,17 @@ static struct token *collect_arg(struct token *prev, int vararg, struct position
 		} else if (match_op(next, ',') && !nesting && !vararg) {
 			break;
 		}
-		next->pos.stream = pos->stream;
-		next->pos.line = pos->line;
-		next->pos.pos = pos->pos;
+		if (!PPHOOKEN()) {
+			next->pos.stream = pos->stream;
+			next->pos.line = pos->line;
+			next->pos.pos = pos->pos;
+		}
 		p = &next->next;
 	}
 	*p = &eof_token_entry;
 	return next;
 }
 
-/*
- * We store arglist as <counter> [arg1] <number of uses for arg1> ... eof
- */
-
-struct arg {
-	struct token *arg;
-	struct token *expanded;
-	struct token *str;
-	int n_normal;
-	int n_quoted;
-	int n_str;
-};
-
 static int collect_arguments(struct token *start, struct token *arglist, struct arg *args, struct token *what)
 {
 	int wanted = arglist->count.normal;
@@ -476,6 +468,8 @@ static struct token *dup_token(struct token *token, struct position *streampos,
 {
 	struct token *alloc = alloc_token(streampos);
 	token_type(alloc) = token_type(token);
+	alloc->pos.stream = pos->stream;
+	alloc->pos.line = pos->line;
 	alloc->pos.newline = pos->newline;
 	alloc->pos.whitespace = pos->whitespace;
 	alloc->number = token->number;
@@ -618,13 +612,21 @@ static int expand(struct token **list, struct symbol *sym)
 			return 1;
 		if (!collect_arguments(token->next, sym->arglist, args, token))
 			return 1;
+
+		PPHOOK (args_beg, token, nargs, args);
 		expand_arguments(nargs, args);
+		PPHOOK (args_end, token, nargs, args);
 	}
 
 	expanding->tainted = 1;
 
 	last = token->next;
+
+	PPHOOK (body_beg, token, sym->expansion);
+
 	tail = substitute(list, sym->expansion, args);
+	PPHOOK (body_end, token, list, tail);
+
 	*tail = last;
 
 	return 0;
@@ -893,6 +895,8 @@ static int token_list_different(struct token *list1, struct token *list2)
 			return 0;
 		if (!list1 || !list2)
 			return 1;
+		while(PPHOOKEN() && token_type(list1) == TOKEN_M_EMPTY) list1 = list1->next;
+		while(PPHOOKEN() && token_type(list2) == TOKEN_M_EMPTY) list2 = list2->next;
 		if (token_different(list1, list2))
 			return 1;
 		list1 = list1->next;
@@ -1140,6 +1144,8 @@ static int do_handle_define(struct stream *stream, struct token **line, struct t
 
 	ret = 1;
 	sym = lookup_symbol(name, NS_MACRO | NS_UNDEF);
+	PPHOOK (def, left, &expansion);
+	
 	if (sym) {
 		int clean;
 
@@ -1835,6 +1841,7 @@ struct token * preprocess(struct token *token)
 	preprocessing = 1;
 	init_preprocessor();
 	do_preprocess(&token);
+	PPHOOK(post, &token);
 
 	// Drop all expressions from preprocessing, they're not used any more.
 	// This is not true when we have multiple files, though ;/
diff --git a/token.h b/token.h
index cd29233..0111d23 100644
--- a/token.h
+++ b/token.h
@@ -84,6 +84,7 @@ enum token_type {
 	TOKEN_IF,
 	TOKEN_SKIP_GROUPS,
 	TOKEN_ELSE,
+	TOKEN_M_EMPTY,
 };
 
 /* Combination tokens */
@@ -171,6 +172,32 @@ struct token {
 	};
 };
 
+/*
+ * We store arglist as <counter> [arg1] <number of uses for arg1> ... eof
+ */
+
+struct arg {
+	struct token *arg;
+	struct token *expanded;
+	struct token *str;
+	int n_normal;
+	int n_quoted;
+	int n_str;
+};
+
+struct preprocess_hook {
+	void (*def)(struct token *macro, struct token **ex);
+	void (*args_beg)(struct token *macro, int count, struct arg *a);
+	void (*args_end)(struct token *macro, int count, struct arg *a);
+	void (*body_beg)(struct token *macro, struct token *body);
+	void (*body_end)(struct token *macro, struct token **rep, struct token **reptail);
+	void (*post)(struct token **token);
+	
+	
+};
+#define PPHOOK(n, args... ) if (preprocess_hook && preprocess_hook->n) preprocess_hook->n( args );
+#define PPHOOKEN() (preprocess_hook != 0) 
+
 #define MAX_STRING 4095
 
 static inline struct token *containing_token(struct token **p)
@@ -188,7 +215,9 @@ static inline struct token *containing_token(struct token **p)
  */
 extern struct token eof_token_entry;
 #define eof_token(x) ((x) == &eof_token_entry)
+extern struct preprocess_hook *preprocess_hook;
 
+extern struct symbol *lookup_macro(struct ident *ident);
 extern int init_stream(const char *, int fd, const char **next_path);
 extern const char *stream_name(int stream);
 extern struct ident *hash_ident(struct ident *);

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-05 23:38                             ` Konrad Eisele
@ 2012-05-06 18:34                               ` Christopher Li
  2012-05-07  6:12                                 ` Konrad Eisele
  0 siblings, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-05-06 18:34 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Linux-Sparse

On Sat, May 5, 2012 at 4:38 PM, Konrad Eisele <eiselekd@gmail.com> wrote:
>
> I appended a diff for review. Is this kind of interface ok?
> This is kind of not a patch to apply, rather I want to avoid to put
> effort in it and you telling me later I am too intrusive...:). So
> can you give a ok or comment...
>
> Interface so far:
> struct preprocess_hook {
>        def      : called when #define is processed

Why do you need a #define hook? The macro symbol
should already have the pos and the macro body.

>        args_beg : called before argument expansion
>        args_end : called after argument expansion
>        body_beg : called before body expansion
>        body_end : called after body expansion

I am wondering why you need 4 callbacks for macro
expansion instead of just one, like in the patch I posted previously.
That macro expand hook gives you the name of the macro that needs to
be expanded and the replacement list.  My guess is that you
want the end position of the macro before expansion. That
can be added to the macro expand hook. Will that be sufficient?


>        post     : called after preprocess

I don't think the post callback is needed. You can call
the preprocessor directly. It will return when it completes.
Then you can do your post work. You can invoke the parser
separately.  The sparse wrapper function does it all in
one function, but that does not stop you from doing it step by step
if you want to.


> All of there I found are needed. There might be more to be added...
>
> I also introduce a tokentype TOKEN_M_EMPTY so that I can track
> empty expansion. To filter these out again I add the post hook.

What purpose does TOKEN_M_EMPTY serve if you filter those tokens out later?
To remember where in the stream there is an invisible macro?

Chris

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-06 18:34                               ` Christopher Li
@ 2012-05-07  6:12                                 ` Konrad Eisele
  2012-05-07 22:06                                   ` Christopher Li
  0 siblings, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-07  6:12 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, Linux-Sparse

Christopher Li wrote:
> On Sat, May 5, 2012 at 4:38 PM, Konrad Eisele<eiselekd@gmail.com>  wrote:
>>
>> I appended a diff for review. Is this kind of interface ok?
>> This is kind of not a patch to apply, rather I want to avoid to put
>> effort in it and you telling me later I am too intrusive...:). So
>> can you give a ok or comment...
>>
>> Interface so far:
>> struct preprocess_hook {
>>         def      : called when #define is processed
>
> Why do you need #define hooks? The macro symbol
> should have pos and macro body already.

I need the hook to insert a TOKEN_M_EMPTY into the substitution body
if the define is an empty define. The "post" hook is then used to
remove all TOKEN_M_EMPTY tokens again after preprocessing (or rather
to save them internally).
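
A minimal sketch of such a "post" pass is shown below. It assumes a
simplified token struct and a NULL-terminated list (the real sparse
lists end in eof_token_entry); the macro_id and after fields and the
strip_empty_markers() helper are hypothetical and only illustrate the
"save, then unlink" step.

```c
#include <assert.h>
#include <stddef.h>

enum token_type { TOKEN_IDENT, TOKEN_NUMBER, TOKEN_M_EMPTY };

/* Simplified stand-in for struct token. */
struct token {
	enum token_type type;
	int macro_id;          /* hypothetical: which macro expanded to nothing */
	struct token *next;
	struct token *after;   /* hypothetical: real token the marker followed */
};

/* Unlink every TOKEN_M_EMPTY marker from *list and push it onto *saved,
 * remembering the last real token that preceded it. Returns the number
 * of markers removed. */
static int strip_empty_markers(struct token **list, struct token **saved)
{
	struct token **p = list;
	struct token *last_real = NULL;
	int removed = 0;

	assert(list != NULL && saved != NULL);
	while (*p) {
		struct token *t = *p;
		if (t->type == TOKEN_M_EMPTY) {
			*p = t->next;           /* drop it from the token stream */
			t->after = last_real;   /* keep where it used to sit */
			t->next = *saved;       /* push onto the side list */
			*saved = t;
			removed++;
		} else {
			last_real = t;
			p = &t->next;
		}
	}
	return removed;
}
```

After this pass the parser sees the same stream it would have seen
without the markers, while the side list still records where each empty
expansion happened.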

>
>>         args_beg : called before argument expansion
>>         args_end : called after argument expansion
>>         body_beg : called before body expansion
>>         body_end : called after body expansion
>
> I am wondering why is do you need 4 call backs for macro
> expand instead of just one like the patch I post previously.
> That macro expand give you the name of the macro need to
> be expand and the replacement list.  My guess is that you
> want the end position of the macro before expand. That
> can be add to the macro expand hook. Will that be sufficient?

I want to trace the complete macro substitution. Using before and after
hooks for both body and args, I can reconstruct what expand() does
(not just the dependencies but the whole expansion).


>
>
>>         post     : called after preprocess
>
> I don't think the post call back is needed. You can call
> the preprocessor directly. It will return when it complete.
> Then you can do your post work. You can invoke the parser
> separately.  The sparse wrapper function does it all in
> one function but that did not stop you it step by step
> if you wants to.

See above. I need the post to filter out the helper TOKEN_M_EMPTY.

>
>
>> All of there I found are needed. There might be more to be added...
>>
>> I also introduce a tokentype TOKEN_M_EMPTY so that I can track
>> empty expansion. To filter these out again I add the post hook.
>
> What purpose does the token_m_empty if you filter those token later?
> Rember where in the stream there is an invisible macro?

See above. I use it as a placeholder. The "post" hook then saves the
information and removes the token again.




>
> Chris
>
>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-07  6:12                                 ` Konrad Eisele
@ 2012-05-07 22:06                                   ` Christopher Li
  2012-05-08  6:38                                     ` Konrad Eisele
  0 siblings, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-05-07 22:06 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, Linux-Sparse

Hi Konrad,

I appreciate all the input. How about this: I am going to implement
the core sparse part of the API change. You will be the first customer
to use that API. I think this API is also useful for people who want to
do source code transformations using sparse. They usually want to
manipulate the source code stream before the pre-processing.

Anyway, I am hacking on a branch starting from the last patch I sent out.
I am adding a test program to demonstrate how to use those hooks to
accomplish different things. You just need to keep telling me what extra
features you want from the API. I can try to incorporate them into the
demo program.

I will let you know when I have the branch ready.

Chris

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-07 22:06                                   ` Christopher Li
@ 2012-05-08  6:38                                     ` Konrad Eisele
  2012-05-09  9:18                                       ` Christopher Li
  0 siblings, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-08  6:38 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, Linux-Sparse

Christopher Li wrote:
> Hi Konrad,
>
> I appreciate all the input. How about this, I am going to implement
> the core sparse
> part of the API change. You will be the first customer that use that
> API. I think this
> API is also useful for people want to do source code transformations
> using sparse.
> They usually want to manipulate the source code stream before the
> pre-processing.
>
> Anyway, I am hacking a branch starting from the last patch I send out.
> I am adding the test program to demonstrate how to use those hooks to accomplish
> different things. You just need to keep telling me what extra feature
> you want from
> the API. I can try to incorporate them with the demo program.

Yeah, I think it is better that way. You should implement it yourself first;
it kind of takes too long otherwise :-). Still, I'm kind of
curious how you can trace macro expansion with just 1 callback,
but I'd like to be surprised.
-- Konrad

>
> I will let you know when I have the branch ready.
>
> Chris
>
>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-08  6:38                                     ` Konrad Eisele
@ 2012-05-09  9:18                                       ` Christopher Li
  2012-05-09  9:48                                         ` Konrad Eisele
  0 siblings, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-05-09  9:18 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, Linux-Sparse

On Mon, May 7, 2012 at 11:38 PM, Konrad Eisele <konrad@gaisler.com> wrote:
>
> Yea, I think it is better that way. You should implement it yourself first,
> it kind of takes too long otherwise :-). Still I'm kind of
> curious how you can trace macro expansion with just 1 callback
> but I'll like to be surprised.

OK, here is the initial version of the preprocessor hook.
I created a branch "unclean-preprocess-hook" for review.

http://git.kernel.org/?p=devel/sparse/chrisl/sparse.git;a=shortlog;h=refs/heads/unclean-preprocess-hook

I ended up using more than one callback, but it is still better than 6.
I also think it is easier for the caller to use, because it receives
the before and after at the same time.


struct preprocess_hook {
	void (*expand_macro)(struct token *macro, struct symbol *sym,
			     struct token **replace, struct token **replace_tail);
	void (*expand_arg)(struct token *macro, struct symbol *sym, int arg,
			   struct token *orig, struct token *expanded);
};
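
As a rough illustration of how a client might use this two-callback
interface, the sketch below registers a hook that only counts calls.
The struct matches the one above; the global pointer follows the earlier
patch, declared static here only to keep the sketch self-contained. The
counting callbacks and the notify_*() wrappers are hypothetical, not
part of the branch.

```c
#include <assert.h>
#include <stddef.h>

struct token;    /* opaque in this sketch */
struct symbol;

struct preprocess_hook {
	void (*expand_macro)(struct token *macro, struct symbol *sym,
			     struct token **replace, struct token **replace_tail);
	void (*expand_arg)(struct token *macro, struct symbol *sym, int arg,
			   struct token *orig, struct token *expanded);
};

/* NULL means no client is listening, so the library behaves as before. */
static struct preprocess_hook *preprocess_hook;

static int macro_calls, arg_calls;

/* Hypothetical client callbacks that just count invocations. */
static void count_macro(struct token *macro, struct symbol *sym,
			struct token **replace, struct token **replace_tail)
{
	(void)macro; (void)sym; (void)replace; (void)replace_tail;
	macro_calls++;
}

static void count_arg(struct token *macro, struct symbol *sym, int arg,
		      struct token *orig, struct token *expanded)
{
	(void)macro; (void)sym; (void)orig; (void)expanded;
	assert(arg >= 0);
	arg_calls++;
}

static struct preprocess_hook counting_hook = {
	.expand_macro = count_macro,
	.expand_arg   = count_arg,
};

/* What a guarded call site inside the preprocessor could look like. */
static void notify_expand_macro(struct token *macro, struct symbol *sym,
				struct token **replace, struct token **tail)
{
	if (preprocess_hook && preprocess_hook->expand_macro)
		preprocess_hook->expand_macro(macro, sym, replace, tail);
}

static void notify_expand_arg(struct token *macro, struct symbol *sym, int arg,
			      struct token *orig, struct token *expanded)
{
	if (preprocess_hook && preprocess_hook->expand_arg)
		preprocess_hook->expand_arg(macro, sym, arg, orig, expanded);
}
```

Because every call site checks both the pointer and the individual
callback, a tool can register only the callbacks it cares about and
other users of the library are unaffected.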

The demo program expands your example macro with the following
results:

<beginning of 't.c'>
#define D0(d0a0,d0a1) 1 D1(d0a0) 2 D2(d0a1) 3
#define D1(d1a0) 4 d1a0 5
#define D2(d2a0) 6 d2a0 7
#define D3(d3a0) 8 d3a0 9
D0(D3(10),11)<end of 't.c'>
arg0 in D3 :10 -> 10
macro D3 inside D0
expand result: 8 10 9 <untaint: D3>
arg0 in D0 :D3(10) -> 8 10 9
arg1 in D0 :11 -> 11
macro D0 inside <noident>
expand result: 1 D1(8 10 9) 2 D2(11) 3 <untaint: D0>
arg0 in D1 :8 10 9 -> 8 10 9
macro D1 inside D0
expand result: 4 8 10 9 5 <untaint: D1>
arg0 in D2 :11 -> 11
macro D2 inside D0
expand result: 6 11 7 <untaint: D2>
After preprocessing
1 4 8 10 9 5 2 6 11 7 3

A few things. I don't think you need to manipulate the define for
empty-body macros any more. You should be able to find out in the hook
that the macro expands to nothing.

I still don't fully understand why you need the empty token type. However,
there is the untaint token, which marks the end of a macro expansion. You
might be able to use that as well.

This branch needs cleanup before merging upstream.
Please let me know what I missed.

Chris

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-09  9:18                                       ` Christopher Li
@ 2012-05-09  9:48                                         ` Konrad Eisele
  2012-05-09 22:50                                           ` Christopher Li
  0 siblings, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-09  9:48 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, Linux-Sparse

Christopher Li wrote:
> On Mon, May 7, 2012 at 11:38 PM, Konrad Eisele<konrad@gaisler.com>  wrote:
>>
>> Yea, I think it is better that way. You should implement it yourself first,
>> it kind of takes too long otherwise :-). Still I'm kind of
>> curious how you can trace macro expansion with just 1 callback
>> but I'll like to be surprised.
>
> OK, there is the initial version of the preprocessor hook.
> I create an branch "unclean-preprocess-hook" for review.
>
> http://git.kernel.org/?p=devel/sparse/chrisl/sparse.git;a=shortlog;h=refs/heads/unclean-preprocess-hook
>
> I end up use more than one call back, but it is still better that 6.
> I also think that is easier for the caller to use. because it receive the
> the before and after at the same time.
>
>
> struct preprocess_hook {
> 	void (*expand_macro)(struct token *macro, struct symbol *sym,
> 			     struct token **replace, struct token **replace_tail);
> 	void (*expand_arg)(struct token *macro, struct symbol *sym, int arg,
> 			   struct token *orig, struct token *expanded);
> };

I don't think it's practical: if you have an argument that is expanded, then
when using 4 callbacks you get the calls:

   arg-expand-begin(a)
     body-expand-begin(c)
     body-expand-end(d)
   arg-expand-end(b)

When using 2 callbacks you get the calls:

     body-expand(c d)
   arg-expand(a b)

But "a" is the source of both "c" and "d". The goal of all this is to
generate a tree. You need to know where a token originated from. The
tokens in "a" might be duplicated; in your case you then don't have
enough information to reason about the origin of "c".
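
To illustrate why the begin/end pairing matters for building a tree,
here is a small sketch that turns nested begin/end calls into parent
links. All names (trace_node, trace_begin(), trace_end()) are
hypothetical and the fixed-size node pool is only there to keep the
example short.

```c
#include <assert.h>
#include <stddef.h>

enum node_kind { NODE_ARG, NODE_BODY };

/* One node per arg or body expansion; parent is the enclosing expansion. */
struct trace_node {
	enum node_kind kind;
	const char *macro;
	struct trace_node *parent;
};

#define MAX_NODES 128
static struct trace_node nodes[MAX_NODES];
static int n_nodes;
static struct trace_node *current;   /* innermost expansion still open */

/* Called from an arg-expand-begin or body-expand-begin hook. */
static struct trace_node *trace_begin(enum node_kind kind, const char *macro)
{
	struct trace_node *n;

	assert(n_nodes < MAX_NODES);
	n = &nodes[n_nodes++];
	n->kind = kind;
	n->macro = macro;
	n->parent = current;   /* whatever is open right now encloses us */
	current = n;
	return n;
}

/* Called from the matching arg-expand-end or body-expand-end hook. */
static void trace_end(void)
{
	assert(current != NULL);
	current = current->parent;
}
```

With only a single "after" callback per expansion, the inner body
expansion is reported before the outer argument expansion is known, so
the parent link would have to be reconstructed some other way.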

I do it like this: inside arg-expand-begin(a) I "dope" all tokens of "a"
by setting token.pos.position (which I understood I can use as I want)
to a unique id (token.pos.stream is my preprocessor stream). When a
token is duplicated in argument replacement etc., token.pos is copied
as well, so the duplicates of "a" always retain the information about
where they came from.
Then I can regenerate the tree.
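
A minimal sketch of this tagging idea, assuming simplified stand-in structs
rather than sparse's real struct token and struct position. The names
tag_origin(), dup_token() and origin_of() are hypothetical and only
illustrate that a plain struct copy carries the origin id along:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins, NOT sparse's real definitions. */
struct position {
	unsigned int stream;   /* stream the token is attributed to */
	unsigned int position; /* reused here as an "origin id" */
};

struct token {
	struct position pos;
	struct token *next;
	int value;
};

static unsigned int next_origin_id = 1;

/* At "arg-expand-begin": stamp every token of the argument with one fresh id. */
static unsigned int tag_origin(struct token *list, unsigned int hook_stream)
{
	unsigned int id = next_origin_id++;

	assert(list != NULL);
	for (struct token *t = list; t; t = t->next) {
		t->pos.stream = hook_stream;
		t->pos.position = id;
	}
	return id;
}

/* Duplicating a token copies pos together with everything else. */
static struct token dup_token(const struct token *src)
{
	struct token copy = *src;

	copy.next = NULL;
	return copy;
}

/* Later passes can ask any duplicate which argument it came from. */
static unsigned int origin_of(const struct token *tok)
{
	return tok->pos.position;
}
```

A tree builder could then group the tokens of a body expansion by
origin_of() to find the argument each one was copied from.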

I think you need to implement this first so that I can see how it could
be done...

>
> The demo program expand your example macro with the following
> results:
>
> <beginning of 't.c'>
> #define D0(d0a0,d0a1) 1 D1(d0a0) 2 D2(d0a1) 3
> #define D1(d1a0) 4 d1a0 5
> #define D2(d2a0) 6 d2a0 7
> #define D3(d3a0) 8 d3a0 9
> D0(D3(10),11)<end of 't.c'>
> arg0 in D3 :10 ->  10
> macro D3 inside D0
> expand result: 8 10 9<untaint: D3>
> arg0 in D0 :D3(10) ->  8 10 9
> arg1 in D0 :11 ->  11
> macro D0 inside<noident>
> expand result: 1 D1(8 10 9) 2 D2(11) 3<untaint: D0>
> arg0 in D1 :8 10 9 ->  8 10 9
> macro D1 inside D0
> expand result: 4 8 10 9 5<untaint: D1>
> arg0 in D2 :11 ->  11
> macro D2 inside D0
> expand result: 6 11 7<untaint: D2>
> After preprocessing
> 1 4 8 10 9 5 2 6 11 7 3
>
> A few things. I don't think you need to manipulate the define for empty
> body macro any more. You should be able to find out the macro expand
> to empty in the hook.
>
> I still haven't fully understand why you need the empty token type. However
> there is the untaint token which mark the end of the a macro expand. You
> might able to use that as well.

I don't think you can, not without patching preprocess.c. And the patching
would be messier than introducing a dedicated token.
Also: TOKEN_M_EMPTY is only used by the hook, and it is
removed again afterwards.

>
> This branch needs cleanup before merge to the upstream.
> Please let me know what I miss.
>
> Chris
>
>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-09  9:48                                         ` Konrad Eisele
@ 2012-05-09 22:50                                           ` Christopher Li
  2012-05-10  6:19                                             ` Konrad Eisele
  0 siblings, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-05-09 22:50 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, Linux-Sparse

On Wed, May 9, 2012 at 2:48 AM, Konrad Eisele <konrad@gaisler.com> wrote:
> I dont think its practical: If you have a argument that is expanded then
> when using 4 callbacks you get the calls:
>
>  arg-expand-begin(a)
>    body-expand-begin(c)
>    body-expand-end(d)
>  arg-expand-end(b)
>
> When using 2 callbacks you get the calls:
>
>    body-expand(c d)
>  arg-expand-begin(a b)
>
> But "a" is the source to both "c" and "d". The goal of all this is to
> generate a tree. You need to know where a token originated from. The
> tokens in "a" might be duplicated, then in your case you dont have
> enough information to reason about the origin of "c".

I was thinking that you can use sym->parent to find out which macro's
scope you are inside. If I change sym->parent to be of token type, i.e.
the token that initiated the macro expansion, then you should have all
the information you need?


> I do it like this: Inside
> arg-expand-begin(a) I "dope" all tokens of "a" by setting
> token.pos.position (which I understood I can use as I want)
> with an unique id (token.pos.stream is my preprocessor stream). When a
> token is duplicated in argument replacement etc. token.pos will also
> be copied. The duplicates of "a" will always retain information where
> they came from.
> Then I can regenerate the tree.

Right, you can still do the doping with the expand_macro callback.

>T think you need to implement this first so that I can see how it could
> be done...

That will take more time. I was hoping you could point out what is missing
in the demo program, which you already did.

>> I still haven't fully understand why you need the empty token type.
>> However
>> there is the untaint token which mark the end of the a macro expand. You
>> might able to use that as well.
>
>
> I dont think you can, not without patching preprocess.c. And the patching
> would be messier than by introducing a dedicated token.
> Also: TOKEN_M_EMPTY is only used by the hook, it is also
>  removed afterwards

I can only guess how TOKEN_M_EMPTY is used in the callback. I only
see the code add it into the stream and remove it again, not how the
client uses it.

Here is another idea. Instead of requiring the client to register a lot of
callback functions, we can have an option to insert macro-expansion
annotation tokens into the token stream. The annotation tokens mark the
beginning and end of a macro expansion. The macro-begin annotation
preserves the original stream before the expansion. Similarly, there will
be begin and end annotations for the argument replacement.

The client can do a custom pass to consume the annotation tokens. It should
be able to build the patch tree that leads to the expansion result. The
annotation tokens will be consumed and removed from the token stream before
it gets passed to the parser.

In this way, no callback is required. It is all data structures. The
comments of the source code can fit nicely into the annotation tokens
as well.

Chris

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-09 22:50                                           ` Christopher Li
@ 2012-05-10  6:19                                             ` Konrad Eisele
  2012-05-10  6:38                                               ` Konrad Eisele
  2012-05-10  9:03                                               ` Christopher Li
  0 siblings, 2 replies; 50+ messages in thread
From: Konrad Eisele @ 2012-05-10  6:19 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, Linux-Sparse

Christopher Li wrote:
> On Wed, May 9, 2012 at 2:48 AM, Konrad Eisele<konrad@gaisler.com>  wrote:
>> I dont think its practical: If you have a argument that is expanded then
>> when using 4 callbacks you get the calls:
>>
>>   arg-expand-begin(a)
>>     body-expand-begin(c)
>>     body-expand-end(d)
>>   arg-expand-end(b)
>>
>> When using 2 callbacks you get the calls:
>>
>>     body-expand(c d)
>>   arg-expand-begin(a b)
>>
>> But "a" is the source to both "c" and "d". The goal of all this is to
>> generate a tree. You need to know where a token originated from. The
>> tokens in "a" might be duplicated, then in your case you dont have
>> enough information to reason about the origin of "c".
>
> I was thinking that you can use the sym->parent to find out you are inside
> which macro's scope. If I change the sym->parent to the token type, which
> is the token initiated the macro expand. Then you should have all the
> information
> you need?

You don't get the point. Is there a pointer token.parent? No. How can
you know which macro expansion you _came from_? You only know
where you _end up_. How can you build the _branches_ of a tree? You can
only build _leaves_.
Funny thing is: when you _are_ a committer you _can_ suddenly add
"sym.parent" when you think it is necessary? All this mess is originally
because you don't think that sym.token can replace sym.position...
Wasn't adding new members a sin? I guess it actually doesn't matter,
if you are a committer ...:-)

>
>
>> I do it like this: Inside
>> arg-expand-begin(a) I "dope" all tokens of "a" by setting
>> token.pos.position (which I understood I can use as I want)
>> with an unique id (token.pos.stream is my preprocessor stream). When a
>> token is duplicated in argument replacement etc. token.pos will also
>> be copied. The duplicates of "a" will always retain information where
>> they came from.
>> Then I can regenerate the tree.
>
> Right, you can still do the the duping with expand_macro call back.
>
>> T think you need to implement this first so that I can see how it could
>> be done...
>
> That will take more time. I am hoping you can point out what is missing
> in the demo program. Which you did already.

Apply my latest patch and you'll see.

Also: I don't see that you do anything with token.pos. Don't
you remember that you came up with the token.pos encoding to avoid adding
a new member to token, as my original patch did?
In the last patch I sent, this idea is used as described: I write
a unique id to token.pos.position that then shows the origin...
You seem to have forgotten what we started from.

>
>>> I still haven't fully understand why you need the empty token type.
>>> However
>>> there is the untaint token which mark the end of the a macro expand. You
>>> might able to use that as well.
>>
>>
>> I dont think you can, not without patching preprocess.c. And the patching
>> would be messier than by introducing a dedicated token.
>> Also: TOKEN_M_EMPTY is only used by the hook, it is also
>>   removed afterwards
>
> I can only guess how to TOKEN_M_EMPTY in the call back. I only
> see the code add into into the stream and removed, not how the client
> use it.
>
> Here is another idea. Instead of require the client to register a lot of
> callback function. We can have option to insert macro expand annotation token
> into the token stream. The annotation token mark the beginning and end of the
> macro expansion. In the macro begin annotation, it will preserve the
> original stream
> before the expansion. Similarly, there will be begin and end
> annotation for the argument
> replacement.

This is all the same thing. It doesn't matter how, just implement it.
I don't understand: here you think you can patch 1000 lines in the
original code (which your new idea would require), yet you
insist on 2 callbacks (callbacks that normal code would ignore anyway)
because you don't want to grant me 4 extra lines (the begin-end hooks+define+post).
This is greedy at least, I don't get it, I think I'm out of here...

>
> The client can do a custom pass to consume the annotation token. It should
> be able to build the patch tree that lead to the expansion result. The
> annotation
> token will be consume and removed from the token stream before it get pass to
> the parser.

Have fun implementing it :-)
-- Konrad

>
> In this way, no call back is required. It is all data structures. The
> comments of
> the source code can fit nicely into the annotation token as well.
>
> Chris
>
>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-10  6:19                                             ` Konrad Eisele
@ 2012-05-10  6:38                                               ` Konrad Eisele
  2012-05-10  9:37                                                 ` Christopher Li
  2012-05-10  9:03                                               ` Christopher Li
  1 sibling, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-10  6:38 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, Linux-Sparse

Konrad Eisele wrote:
> Christopher Li wrote:
>> On Wed, May 9, 2012 at 2:48 AM, Konrad Eisele<konrad@gaisler.com> wrote:
>>> I dont think its practical: If you have a argument that is expanded then
>>> when using 4 callbacks you get the calls:
>>>
>>> arg-expand-begin(a)
>>> body-expand-begin(c)
>>> body-expand-end(d)
>>> arg-expand-end(b)
>>>
>>> When using 2 callbacks you get the calls:
>>>
>>> body-expand(c d)
>>> arg-expand-begin(a b)
>>>
>>> But "a" is the source to both "c" and "d". The goal of all this is to
>>> generate a tree. You need to know where a token originated from. The
>>> tokens in "a" might be duplicated, then in your case you dont have
>>> enough information to reason about the origin of "c".
>>
>> I was thinking that you can use the sym->parent to find out you are inside
>> which macro's scope. If I change the sym->parent to the token type, which
>> is the token initiated the macro expand. Then you should have all the
>> information
>> you need?
>
> You dont get the point. Is there a pointer token.parent? No. How can
> you know which macro expansion you _came from_, you only know
> where you _end up_. How can you build the _branches_ of a tree? You can
> only build _leafs_.
> Funny thing is: When you _are_ a commiter you _can_ suddenly add
> "sym.parent" when you think it is necessary? All this mess is originally
> because you dont think that sym.token can replace sym.position...
> Wasnt adding new members a sin ? I guess it actually doesnt matter,
> if you are a committer ...:-)
>
>>
>>
>>> I do it like this: Inside
>>> arg-expand-begin(a) I "dope" all tokens of "a" by setting
>>> token.pos.position (which I understood I can use as I want)
>>> with an unique id (token.pos.stream is my preprocessor stream). When a
>>> token is duplicated in argument replacement etc. token.pos will also
>>> be copied. The duplicates of "a" will always retain information where
>>> they came from.
>>> Then I can regenerate the tree.
>>
>> Right, you can still do the the duping with expand_macro call back.
>>
>>> T think you need to implement this first so that I can see how it could
>>> be done...
>>
>> That will take more time. I am hoping you can point out what is missing
>> in the demo program. Which you did already.
>
> Apply my latest patch any you'll see.
>
> Also: I dont see that you do anything with token.pos. Dont
> you remember that you came up with token.pos encoding to avoid adding
> a new member to token that my original patch did?
> In the last patch I sent this idea is used, as described, I write
> an unique id to token.pos.position that then shows the origin...
> You seem to have forgotten what we started from.
>
>>
>>>> I still haven't fully understand why you need the empty token type.
>>>> However
>>>> there is the untaint token which mark the end of the a macro expand. You
>>>> might able to use that as well.
>>>
>>>
>>> I dont think you can, not without patching preprocess.c. And the patching
>>> would be messier than by introducing a dedicated token.
>>> Also: TOKEN_M_EMPTY is only used by the hook, it is also
>>> removed afterwards
>>
>> I can only guess how to TOKEN_M_EMPTY in the call back. I only
>> see the code add into into the stream and removed, not how the client
>> use it.
>>
>> Here is another idea. Instead of require the client to register a lot of
>> callback function. We can have option to insert macro expand annotation token
>> into the token stream. The annotation token mark the beginning and end of the
>> macro expansion. In the macro begin annotation, it will preserve the
>> original stream
>> before the expansion. Similarly, there will be begin and end
>> annotation for the argument
>> replacement.
>
> This is all the same thing. It doenst matter how, just implement it.
> I dont understand: Here you think you can patch 1000 lines in the
> original code (which your new idea would requite ) however you
> insist of 2 callbacks (callbacks that normal code would ignore anyway)
> because you dont want to grant me 4 extra lines (the begin-end hooks+define+post).
> This is greedy at least, I dont get it, I think I'm out of here...
>
>>
>> The client can do a custom pass to consume the annotation token. It should
>> be able to build the patch tree that lead to the expansion result. The
>> annotation
>> token will be consume and removed from the token stream before it get pass to
>> the parser.

Reading it 2 times and thinking about what kind of thought
drives you: I can only say this is fucking sick:
You oppose adding one little TOKEN_M_EMPTY, now you use my idea
to add 100 new tokens and yeah ... Hello, hello ... How can
you filter them out: You _cannot_ add a "post" hook. :-) Haha...

>
> Have fun implementing it :-)
> -- Konrad
>
>>
>> In this way, no call back is required. It is all data structures. The
>> comments of
>> the source code can fit nicely into the annotation token as well.
>>
>> Chris
>>
>>
>
>
>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-10  6:19                                             ` Konrad Eisele
  2012-05-10  6:38                                               ` Konrad Eisele
@ 2012-05-10  9:03                                               ` Christopher Li
  1 sibling, 0 replies; 50+ messages in thread
From: Christopher Li @ 2012-05-10  9:03 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, Linux-Sparse

On Wed, May 9, 2012 at 11:19 PM, Konrad Eisele <konrad@gaisler.com> wrote:
>> I was thinking that you can use the sym->parent to find out you are inside
>> which macro's scope. If I change the sym->parent to the token type, which
>> is the token initiated the macro expand. Then you should have all the
>> information
>> you need?
> You dont get the point. Is there a pointer token.parent? No. How can
> you know which macro expansion you _came from_, you only know

Did you look at my patch? It saves the top-level expanding macro in the
parent variable, and the active macro expansion maintains it.
E.g. when you enter a macro, it saves the current macro in parent.
When you exit the macro at the untaint step, it pops it from
symbol->parent.

So during macro expansion it does remember the upper-level
macro, and the macro above that, all the way out of the macro
expansion, where parent will be NULL. My test program demos part of it.
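
The enter/exit bookkeeping described here can be sketched roughly as
follows. This is only an illustration under assumptions: struct macro_sym,
current_macro, enter_macro(), leave_macro() and expansion_depth() are
hypothetical names, not the code in the "unclean-preprocess-hook" branch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a macro symbol, not sparse's struct symbol. */
struct macro_sym {
	const char *name;
	struct macro_sym *parent; /* macro that was active when this one was entered */
};

static struct macro_sym *current_macro; /* NULL outside any macro expansion */

/* On entering an expansion: remember the enclosing macro, then become current. */
static void enter_macro(struct macro_sym *sym)
{
	sym->parent = current_macro;
	current_macro = sym;
}

/* On the untaint step: pop back to the enclosing macro. */
static void leave_macro(struct macro_sym *sym)
{
	assert(current_macro == sym);
	current_macro = sym->parent;
	sym->parent = NULL;
}

/* Walk the parent chain; 0 means we are outside any expansion. */
static int expansion_depth(void)
{
	int depth = 0;

	for (struct macro_sym *m = current_macro; m; m = m->parent)
		depth++;
	return depth;
}
```

Note that this sketch keeps one parent link per symbol, so a symbol that is
entered again while it is still active would overwrite its own link. The
sketch does not try to handle that case.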

> where you _end up_. How can you build the _branches_ of a tree? You can
> only build _leafs_.
> Funny thing is: When you _are_ a commiter you _can_ suddenly add
> "sym.parent" when you think it is necessary? All this mess is originally
> because you dont think that sym.token can replace sym.position...

Please. Sparse carefully separating the tokens from the AST is by
design, so that it does not need to carry the token baggage in the later
parts of the AST processing. That is a design decision and I want to keep
it. It makes the program run faster as well, because it reduces the
compiler's footprint.

> Wasnt adding new members a sin ? I guess it actually doesnt matter,
> if you are a committer ...:-)

Well, in case you did not notice: the place I added it is part of a union
inside symbol. It will not expand the size of the symbol at all. Other
parts of sparse use similar tricks. There is no extra memory expansion
for that.

It is not that I am the committer so I can do what I want. I did not
commit my branch into the official repository, did I? Why do you think
I named the branch "unclean-xxxxx"? It is put out for review just
like your patches, so that I can receive critical comments like yours.
The goal is to find a better way to incorporate this feature without
much negative impact on other clients of sparse. E.g. your patch, with
token replacing pos, will have such a negative impact on other sparse
clients that don't care about the macro expansion history.


> Also: I dont see that you do anything with token.pos. Dont
> you remember that you came up with token.pos encoding to avoid adding
> a new member to token that my original patch did?
> In the last patch I sent this idea is used, as described, I write
> an unique id to token.pos.position that then shows the origin...
> You seem to have forgotten what we started from.

Of course I remember. That trick is done in the callback, in the client
of the sparse library. I am currently focused on getting the right change
for the core sparse library. Think of sparse as a library.
There are many clients that use this library. The sparse checker, ctags
and sparse-llvm are all clients of the sparse library.

I am trying to get the library to support your dependency analysis without
much negative impact on other clients of sparse. Your patch is clearly
focused on getting your program to run. I have more constraints to consider.

> This is all the same thing. It doenst matter how, just implement it.

It doesn't matter to you. But the detail matters to me.

> I dont understand: Here you think you can patch 1000 lines in the
> original code (which your new idea  would requite ) however you
> insist of 2 callbacks (callbacks that normal code would ignore anyway)
> because you dont want to grant me 4 extra lines (the begin-end
> hooks+define+post).
> This is greedy at least, I dont get it, I think I'm out of here...

A change like that clearly needs more discussion and a review
process for feedback. My own patches will need to go through the
same thing. And that is what I am doing.

I am sorry. I understand your frustration at trying to submit a patch
and not being able to get it in as early as you hoped. I am also
frustrated because I haven't found a clean enough way to
support it. That is why I am having this discussion with you, to work
it through.

Keeping the project clean and providing review feedback is
just one of the roles of the maintainer, including rejecting patches
at times.

The question that concerns me most is whether that is the right thing
to do. A 6-callback API is obviously more complex than a 2-callback
one. No callback is obviously simpler than 2 callbacks. It is just
part of the quest for finding the right interface.

There is no need for such negative attitude.

Chris

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-10  6:38                                               ` Konrad Eisele
@ 2012-05-10  9:37                                                 ` Christopher Li
  2012-05-10  9:51                                                   ` Konrad Eisele
  0 siblings, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-05-10  9:37 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, Linux-Sparse

On Wed, May 9, 2012 at 11:38 PM, Konrad Eisele <konrad@gaisler.com> wrote:
>>> The client can do a custom pass to consume the annotation token. It
>>> should
>>> be able to build the patch tree that lead to the expansion result. The
>>> annotation
>>> token will be consume and removed from the token stream before it get
>>> pass to
>>> the parser.
>
>
> Reading it 2 times and thinking about what kind of thought
> drive you: I can only say this is fucking sick:
> You oppose adding one little TOKEN_M_EMPTY, now you use my idea
> to add 100 new tokens and yeah ... Hello, hello ... How can
> you filter them out : You _cannot_ add a "post" hook. :-) Haha...


I only requested some clarification on how you use
TOKEN_M_EMPTY in the callback, which you don't care to explain.

It is clear that I consider no callback better than 2 callbacks, which
is better than 6 callbacks. I am not too worried about changing the
internals of sparse, as long as it "does the right thing".

Where do you get the idea that you can't do custom filtering without
a "post" hook? I already explained to you in a previous email that
you can call the preprocessor and parser step by step yourself.


	token = tokenize(filename, fd, NULL, includepath);
	token = preprocess(token);
	token = my_custom_filter(token);
	while (!eof_token(token))
		token = external_declaration(token, &translation_unit_used_list);

Do I need to use your post hook? No.
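
For illustration, a filter pass of the my_custom_filter() kind could look
roughly like the sketch below. It is an assumption-laden toy: struct tok,
enum tok_kind and TOK_ANNOTATION are hypothetical stand-ins, and the list
is NULL-terminated here, whereas a real sparse token stream is tested with
eof_token():

```c
#include <stddef.h>

/* Hypothetical token kinds; sparse's real enum token_type differs. */
enum tok_kind { TOK_NORMAL, TOK_ANNOTATION, TOK_EOF };

struct tok {
	enum tok_kind kind;
	struct tok *next;
};

/*
 * Unlink every annotation token from the list and return the new head.
 * A real client would record each annotation (e.g. to build its
 * expansion tree) before dropping it.
 */
static struct tok *my_custom_filter(struct tok *head)
{
	struct tok **link = &head;

	while (*link) {
		if ((*link)->kind == TOK_ANNOTATION)
			*link = (*link)->next;  /* skip over the annotation */
		else
			link = &(*link)->next;  /* keep this token, advance */
	}
	return head;
}
```

The parser then only ever sees ordinary tokens, which is why no extra hook
inside the preprocessor is needed for the removal step.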

You don't want to listen.

I have nothing against you. It is pure technical merit I am evaluating.

Chris

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-10  9:37                                                 ` Christopher Li
@ 2012-05-10  9:51                                                   ` Konrad Eisele
  2012-05-10 11:25                                                     ` Christopher Li
  0 siblings, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-10  9:51 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, Linux-Sparse

Christopher Li wrote:
> On Wed, May 9, 2012 at 11:38 PM, Konrad Eisele<konrad@gaisler.com>  wrote:
>>>> The client can do a custom pass to consume the annotation token. It
>>>> should
>>>> be able to build the patch tree that lead to the expansion result. The
>>>> annotation
>>>> token will be consume and removed from the token stream before it get
>>>> pass to
>>>> the parser.
>>
>>
>> Reading it 2 times and thinking about what kind of thought
>> drive you: I can only say this is fucking sick:
>> You oppose adding one little TOKEN_M_EMPTY, now you use my idea
>> to add 100 new tokens and yeah ... Hello, hello ... How can
>> you filter them out : You _cannot_ add a "post" hook. :-) Haha...
>
>
> I only request some clarification on how you use
> the TOKEN_M_EMPTY in the call back. Which you don't care to explain.
>
> It is clear that I consider no call back is better than 2 call back, which
> is better than 6 call back. I am not too worry about changing internals
> of parse, as long as it is "do the right thing".
>
> Where do you get the idea you can't do custom filtering without
> a "post" hook? I already explain to you in previous email
> you can call preprocessor and parser step by step yourself.
>
>
>          token = tokenize(filename, fd, NULL, includepath);
> 	token = preprocess(token);
>          token = my_custom_filter(token);
>          while (!eof_token(token))
> 		token = external_declaration(token,&translation_unit_used_list);
>
> Do I need to use your post hook? No.
>
> You doesn't want to listen.
>
> I have nothing against you. It is pure technical merit I am evaluating.

You didn't get it. The "_cannot_" was ironic. There is always
a way you can fit things. The point is: if you want to implement it
yourself, exactly the way you think it should be done, then do it.
I've nothing to contribute.
I've also nothing against you personally, only against this
ping-pong emailing. It takes too much time.
-- By Konrad

>
> Chris
>
>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-10  9:51                                                   ` Konrad Eisele
@ 2012-05-10 11:25                                                     ` Christopher Li
  2012-05-10 12:14                                                       ` Konrad Eisele
  0 siblings, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-05-10 11:25 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, Linux-Sparse

On Thu, May 10, 2012 at 2:51 AM, Konrad Eisele <konrad@gaisler.com> wrote:
>
> You didnt get it. The "_cannot_" was ironic. There is always
> a way you can fit things. The point is you want to implement it
> yourself, exaclty the way you think it should be done, then do it.
> I've nothing to contribute.
> I've also nothing against you personally, only against this
> ping pong emailing. It takes too much time.

Well, as for the __cannot__ part: based on your reply, it seems you don't
wish to continue this discussion.

A change like this is bound to need some careful discussion and
planning. Yes, I am guilty of only accepting patches that meet some
subjective standard of mine. But so is any self-respecting project
maintainer. I would rather spend some time to do it right than commit
something I would regret later on.

I hear you that this discussion is taking long. That is why I offered
to write up the core sparse part of the change myself and let you
provide feedback to shape it in a way we can both be happy with.
That is the agreement we had earlier, right?

So I did exactly what I said I was going to do, and now you are calling
it my way vs. your way?

My evaluation function is strictly technical merit:

- I prefer a patch that minimizes the performance impact on other clients
that don't use this feature.
- I prefer a simpler interface over a complicated one.

To me, believe it or not, it is never about my way vs. your way.
If you submit a perfect patch, I would be more than happy to apply it.
Applying a patch is much easier than writing one myself.

Chris

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-10 11:25                                                     ` Christopher Li
@ 2012-05-10 12:14                                                       ` Konrad Eisele
  2012-05-10 12:28                                                         ` Konrad Eisele
  0 siblings, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-10 12:14 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, Linux-Sparse

On 05/10/2012 01:25 PM, Christopher Li wrote:
> On Thu, May 10, 2012 at 2:51 AM, Konrad Eisele<konrad@gaisler.com>  wrote:
>>
>> You didnt get it. The "_cannot_" was ironic. There is always
>> a way you can fit things. The point is you want to implement it
>> yourself, exaclty the way you think it should be done, then do it.
>> I've nothing to contribute.
>> I've also nothing against you personally, only against this
>> ping pong emailing. It takes too much time.
>
> Well, the __cannot__ part is base on your reply you seems don't
> wish to continue this discussion.
>
> A change like this is bound to need some careful discussion and
> planing. Yes, I am guilt of only accepting patches meet some subjective
> stander of mine. But so is to any self respect project maintainers.
> I would rather spend some time to do it right than commit some thing
> I would regret later on.

Do a B(B(x)) and your sym->parent linked-list will fail.
-- Konrad

>
> I heard you that this discussion is taking long. That is why I offer
> to write up the core sparse part of the change myself and let you
> provide feed back to shape it the way we both can happy.
> That is the agreement we have earlier right?
>
> So I did exactly what I said I am going to do, now you are calling
> me my way vs your way?
>
> My evaluation function is straightly technical merit:
>
> - I prefer patch minimize performance impact on other clients don't
> use this feature.
> - I prefer simpler interface over complicate one.
>
> To me, believe it or not, It is never about my way vs your way.
> If you submit a perfect patch, I would more than happy to apply it.
> Apply a patch is much easier than writing one myself.
>
> Chris
>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-10 12:14                                                       ` Konrad Eisele
@ 2012-05-10 12:28                                                         ` Konrad Eisele
  2012-05-11 19:40                                                           ` Christopher Li
  0 siblings, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-10 12:28 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, Linux-Sparse


>> A change like this is bound to need some careful discussion and
>> planing. Yes, I am guilt of only accepting patches meet some subjective
>> stander of mine. But so is to any self respect project maintainers.
>> I would rather spend some time to do it right than commit some thing
>> I would regret later on.
>
> Do a B(B(x)) and your sym->parent linked-list will fail.
> -- Konrad

I have to revise my previous assumption though:

#define B(y) A(x)
B(1)

I thought that recursion was involved in the body expansion of B,
so that it would yield:

   expand_macro(A);
expand_macro(B);

but that was wrong, so it is

expand_macro(B);
expand_macro(A);

You might be right here. expand() is flat when substituting
the body part of the macro. It might work.

-- Konrad

>
>>
>> I heard you that this discussion is taking long. That is why I offer
>> to write up the core sparse part of the change myself and let you
>> provide feed back to shape it the way we both can happy.
>> That is the agreement we have earlier right?
>>
>> So I did exactly what I said I am going to do, now you are calling
>> me my way vs your way?
>>
>> My evaluation function is straightly technical merit:
>>
>> - I prefer patch minimize performance impact on other clients don't
>> use this feature.
>> - I prefer simpler interface over complicate one.
>>
>> To me, believe it or not, It is never about my way vs your way.
>> If you submit a perfect patch, I would more than happy to apply it.
>> Apply a patch is much easier than writing one myself.
>>
>> Chris
>>
>



* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-10 12:28                                                         ` Konrad Eisele
@ 2012-05-11 19:40                                                           ` Christopher Li
  2012-05-11 21:48                                                             ` Konrad Eisele
  0 siblings, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-05-11 19:40 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, Linux-Sparse

On Thu, May 10, 2012 at 5:28 AM, Konrad Eisele <eiselekd@gmail.com> wrote:
>
>>> A change like this is bound to need some careful discussion and
>>> planing. Yes, I am guilt of only accepting patches meet some subjective
>>> stander of mine. But so is to any self respect project maintainers.
>>> I would rather spend some time to do it right than commit some thing
>>> I would regret later on.
>>
>>
>> Do a B(B(x)) and your sym->parent linked-list will fail.

You have a valid point that B(B(x)) will break the sym->parent list.
I removed sym->parent and now just use a token_list to track the
macros being expanded. A macro is appended to the list on entry and
removed from the list when its untaint token is reached. The change is
in the same review branch.

Does that solve the problem?

> I have to revise my previous assumption though:
>
> #define B(y) A(x)
> B(1)
>
> i thought that in the body expansion of B recursion is involved,
> so that it would yield:
>
>  expand_macro(A);
> expand_macro(B);
>
> but that was wrong, so its
>
> expand_macro(B);
> expand_macro(A);

That is right. Macro expansion has three stages. The first stage
expands the argument list; the calling macro is not considered
"tainted" during argument expansion, so the macro can recursively
appear in its own argument list. That is something I hadn't considered
previously.

The second stage just substitutes the expanded arguments into the body.

The third stage rescans the replacement list for further macro expansion.
In this third stage the macro itself is considered tainted and can't be
expanded again during the rescan.

Can you take a look at whether this modified version fits your needs?
I am curious whether it removes the need to add an
empty token to the list for expansion.

In other words, can we design the API cleverly enough that you
don't need to jump through hoops to handle the empty expansion
token?

Thanks

Chris


* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-11 19:40                                                           ` Christopher Li
@ 2012-05-11 21:48                                                             ` Konrad Eisele
  2012-05-12 11:02                                                               ` Christopher Li
  0 siblings, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-11 21:48 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, Linux-Sparse

On 05/11/2012 09:40 PM, Christopher Li wrote:
> On Thu, May 10, 2012 at 5:28 AM, Konrad Eisele<eiselekd@gmail.com>  wrote:
>>
>>>> A change like this is bound to need some careful discussion and
>>>> planing. Yes, I am guilt of only accepting patches meet some subjective
>>>> stander of mine. But so is to any self respect project maintainers.
>>>> I would rather spend some time to do it right than commit some thing
>>>> I would regret later on.
>>>
>>>
>>> Do a B(B(x)) and your sym->parent linked-list will fail.
>
> You have a valid point that B(B(x)) will break the sym->parent list.
> I remove the sym->parent and just use a token_list to maintain the
> expanding macro. The macro is append to the list when enter and remove
> from the list when untaint token is reached. The change is in the
> same review branch.
>
> That should solve this problem?

This seems ok. expanding_macro has to be global not static to be
used... (?)

>
>> I have to revise my previous assumption though:
>>
>> #define B(y) A(x)
>> B(1)
>>
>> i thought that in the body expansion of B recursion is involved,
>> so that it would yield:
>>
>>   expand_macro(A);
>> expand_macro(B);
>>
>> but that was wrong, so its
>>
>> expand_macro(B);
>> expand_macro(A);
>
> That is right. The macro expansion has 3 stages. The first stage is
> expand the arguments list while the caller macro does not consider
> "tainted" during the argument expansion. The macro can recursively
> appear in the argument list. That is some thing I haven't consider
> previously.

I think the fact that argument expansion is recursive and
body expansion is non-recursive is one of the things that
make the preprocessor kind of hard to grasp.

>
> The second stage is just replace the expanded arguments into body.
>
> The third stage is rescan the replacement string for macro expand.
> In this third stage, the macro itself is consider tainted and can't be
> expand again during the rescan.
>
> Can you take a look at this modify version will fit your need or not?
> I am curious the part weather that will remove the need to add
> empty token to the list for expansion.

I cannot say this before I've tried it.

I'd like to straighten things out a bit: My last emails
were a bit too harsh and I'd like to apologize. Sorry
for that.

The next step then is: I'll write a patch to add a
test program that uses this API to trace the token generation
and generate a tree for it.
For a start I'll print out, for every token of a preprocessor
run, all macro expansions that generated it.

Now, I've learned not to run too fast towards the
goal (which is still "dependency tee from c parser entities downto
token"). Maybe you can think about how to achieve the next steps
in an API:
- An #include #ifdef #else #endif pushdown stack
   to record the nestings for each token
- How to connect all this to the AST.

-- Konrad

>
> In order words, can we design the API clever enough that you
> don't need to jump through hoops to handle the empty expansion
> token.
>
> Thanks
>
> Chris
>



* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-11 21:48                                                             ` Konrad Eisele
@ 2012-05-12 11:02                                                               ` Christopher Li
  2012-05-12 17:46                                                                 ` Konrad Eisele
  0 siblings, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-05-12 11:02 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, Linux-Sparse

On Fri, May 11, 2012 at 2:48 PM, Konrad Eisele <eiselekd@gmail.com> wrote:
>
> This seems ok. expanding_macro has to be global not static to be
> used... (?)

The expand_macro callback uses the parent argument, which is taken
from the expanding_macro list. The caller should be able to build the
tree from the leaf node using the parent pointer.

Feel free to change it to use expanding_macro directly if that makes
building the tree easier.

> I think the fact that argument expansion is recursive and
> body expansion is non-recursive is one of the things that
> make the preprocessor kindof hard to grasp.

The body expansion can't be recursive on the same macro, otherwise
it could result in unlimited expansion. The C standard specifies
that macros expand this way.

>
> I cannot say this before I've tried it.
>
> I'd like to straighten things out a bit: My last emails
> where a bit too harsh and I'd like to apologize. Sorry
> for that.

No problem at all. I figure you just want the patch to
get included.

> The next step then is: I'll write a patch to add a
> test-prog that uses this api to trace the token generation
> and generate a tree for it.
> For a start I'll printout for all tokens of a preprocessor
> run all macros-expansions that generated them.

That is great. I have a test-macro program in that
branch which is very close to printing out all the tokens.

> Now, I've learned not to run too fast towards the
> goal, (which is still "dependency tee from c parser entities downto
> token"), maybe you can think about how to achieve the next steps
> in an API :
> - An #include #ifdef #else #endif pushdown-stack
>  to record the nestings for each token

Let me think about this. Just thinking out loud:
#include and #ifdef can be considered a special kind
of predefined macro as well.

> - How to connect all this to the AST.

For symbols it is relatively easy, because a symbol has a pos range
and an aux pointer.

Do you need to attach the dependency to statements and
expressions as well?

Chris


* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-12 11:02                                                               ` Christopher Li
@ 2012-05-12 17:46                                                                 ` Konrad Eisele
  2012-05-12 17:57                                                                   ` Konrad Eisele
                                                                                     ` (2 more replies)
  0 siblings, 3 replies; 50+ messages in thread
From: Konrad Eisele @ 2012-05-12 17:46 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, Linux-Sparse

[-- Attachment #1: Type: text/plain, Size: 4762 bytes --]

On 05/12/2012 01:02 PM, Christopher Li wrote:
> On Fri, May 11, 2012 at 2:48 PM, Konrad Eisele<eiselekd@gmail.com>  wrote:
>>
>> This seems ok. expanding_macro has to be global not static to be
>> used... (?)
>
> The expand_macro call back use the parent argument which get
> from expanding_macro list. The caller should be able to create tree
> from the leaf node using the parent pointer.
>
> Feel free to change to use the expanding_macro instead if that make
> building the tree easier.
>
>> I think the fact that argument expansion is recursive and
>> body expansion is non-recursive is one of the things that
>> make the preprocessor kindof hard to grasp.
>
> The body expansion can't be recursive on same macro  otherwise
> it can result in unlimited expansion. The C stander specify
> the macro expand this way.
>
>>
>> I cannot say this before I've tried it.
>>
>> I'd like to straighten things out a bit: My last emails
>> where a bit too harsh and I'd like to apologize. Sorry
>> for that.
>
> No problem at all. I figure you just want to the patch to
> get included.
>
>> The next step then is: I'll write a patch to add a
>> test-prog that uses this api to trace the token generation
>> and generate a tree for it.
>> For a start I'll printout for all tokens of a preprocessor
>> run all macros-expansions that generated them.
>
> That is great. I have a test-macro program in that
> branch which is very close to print out all the tokens.

Appended is a test patch that adds a test-mdep testcase.
The file mdep.c records the macro expansions, so that
each token has a reference to its source.
test-mdep.c pre-processes the input (like test-macro.c), then
prints out the token trace through the macros for each
token: @{ } is used to mark the active path.

An example file is added: a.h
$test-mdep a.h
...
0004: 8
      body in D1 :4 @{8} 10 9 5 <untaint: D1>
      arg0 in D1 :@{8} 10 9
      body in D0 :1 @{D1}(8 10 9) 2 D2(11) 3 <untaint: D0>
      a.h:6:6
...
Token nr 4 of the preprocessed stream is "8". The
generation path of "8" is marked @{8}...
Not 100% yet, but I think it is already readable. (Actually
the printout order should be reversed, starting from file scope
and drilling down the macro expansions...)

I still don't handle empty expansions. I'll see whether I can come up
with something here...


>
>> Now, I've learned not to run too fast towards the
>> goal, (which is still "dependency tee from c parser entities downto
>> token"), maybe you can think about how to achieve the next steps
>> in an API :
>> - An #include #ifdef #else #endif pushdown-stack
>>   to record the nestings for each token
>
> Let me think about this. Just thinking out lound,
> The #include and #ifdef can consider as a special kind
> of predefine macro as well.

No, only a linked list that models the nesting levels.
Then a preprocessor hook that can register lookup_macro()
macro lookups inside # preprocessor lines. An example
makes it clear:

#if defined(a) && defined(b)
#if defined(c)
#endif
#if defined(e)
#endif
#endif

Results in:
[a b]+<-[c]
      +<-[e]

This can easily be done with push-pop brackets
and a callback in lookup_macro().


Also:
#if defined(a)
#elif defined(c)
#endif

[a]+<-[c]

#if defined(a)
#else
#endif

<-[empty]<-[a]

...


Another thing I need is an option so that inside
do_handle_define() the symbol structures are never reused but
alloc_symbol() is always used for undef and define. This is
because I need to be able to also track the undef and define
history of a macro at a certain position. I think this should be
easy to add because you just need to stack the define and undef
entries on top of each other...


>
>> - How to connect all this to the AST.
>
> For symbol, it relative easy because symbol has pos range
> and aux pointer.

I thought about taking "struct symbol_list *syms = sparse(file)"
as the root, then marking all elements that are used by them as dependent.
I don't have enough insight to say how I can determine things like
which "static inline" functions are used or how to traverse the
"typedef" dependencies.
The goal is to have a "shrink" application that can strip away
all C lines (at pre-preprocessor level) that are not used by a specific
command invocation of the compiler. Also a tool that can quickly show,
for a specific identifier, everything that is connected to it, again at
pre-preprocessor source level. Something like:
...
func1() {
	struct string_list *filelist = NULL; int i;
}
..
I point to "string_list" and then all lines that are related
to struct string_list (#ifdef nestings, macros, all member typedefs,
etc.) are shown and all the rest is stripped away, again at human
readable C source level.


>
> Do you need to attach the dependency for the statment and
> expression as well?
>
> Chris
>


[-- Attachment #2: 0001-mdep.c-test.patch --]
[-- Type: text/plain, Size: 15568 bytes --]

From aff7f53ce89d24512c0ba2f66b981718538ae1c8 Mon Sep 17 00:00:00 2001
From: Konrad Eisele <eiselekd@gmail.de>
Date: Sat, 12 May 2012 18:43:16 +0200
Subject: [PATCH] mdep.c test

---
 Makefile      |    5 +-
 a.h           |    6 ++
 a2.h          |    2 +
 lib.c         |   19 +++--
 mdep.c        |  248 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 pre-process.c |    3 +-
 test-macro.c  |   10 +-
 test-mdep.c   |   62 ++++++++++++++
 token.h       |    8 ++-
 tokenize.c    |    8 +-
 10 files changed, 351 insertions(+), 20 deletions(-)
 create mode 100644 a.h
 create mode 100644 a2.h
 create mode 100644 mdep.c
 create mode 100644 test-mdep.c

diff --git a/Makefile b/Makefile
index 4abcbdd..b688054 100644
--- a/Makefile
+++ b/Makefile
@@ -44,7 +44,7 @@ PKGCONFIGDIR=$(LIBDIR)/pkgconfig
 
 PROGRAMS=test-lexing test-parsing obfuscate compile graph sparse \
 	 test-linearize example test-unssa test-dissect ctags \
-	 test-macro
+	 test-mdep test-macro 
 
 INST_PROGRAMS=sparse cgcc
 INST_MAN1=sparse.1 cgcc.1
@@ -96,7 +96,8 @@ LIB_H=    token.h parse.h lib.h symbol.h scope.h expression.h target.h \
 LIB_OBJS= target.o parse.o tokenize.o pre-process.o symbol.o lib.o scope.o \
 	  expression.o show-parse.o evaluate.o expand.o inline.o linearize.o \
 	  sort.o allocate.o compat-$(OS).o ptrlist.o \
-	  flow.o cse.o simplify.o memops.o liveness.o storage.o unssa.o dissect.o
+	  flow.o cse.o simplify.o memops.o liveness.o storage.o unssa.o dissect.o \
+	  mdep.o
 
 LIB_FILE= libsparse.a
 SLIB_FILE= libsparse.so
diff --git a/a.h b/a.h
new file mode 100644
index 0000000..63da9e8
--- /dev/null
+++ b/a.h
@@ -0,0 +1,6 @@
+//#include <stdio.h>
+#define D0(d0a0,d0a1) 1 D1(d0a0) 2 D2(d0a1) 3
+#define D1(d1a0) 4 d1a0 5
+#define D2(d2a0)
+#define D3(d3a0) 8 d3a0 9
+1 2  D0(D3(10),11) 3 4 
diff --git a/a2.h b/a2.h
new file mode 100644
index 0000000..098fbc2
--- /dev/null
+++ b/a2.h
@@ -0,0 +1,2 @@
+#define A(a,b) b 3 
+A(1,2)
diff --git a/lib.c b/lib.c
index 1876fc9..51879d9 100644
--- a/lib.c
+++ b/lib.c
@@ -577,6 +577,7 @@ static char **handle_switch_ftabstop(char *arg, char **next)
 	return next;
 }
 
+int fnobuildin = 0;
 static char **handle_switch_f(char *arg, char **next)
 {
 	arg++;
@@ -588,6 +589,9 @@ static char **handle_switch_f(char *arg, char **next)
 
 	if (!strncmp(arg, "no-", 3)) {
 		arg += 3;
+		if (!strncmp(arg, "buildin", 7)) {
+			fnobuildin = 1;
+		}
 	}
 	/* handle switch here.. */
 	return next;
@@ -875,7 +879,7 @@ static struct symbol_list *sparse_tokenstream(struct token *token)
 	token = preprocess(token);
 
 	if (preprocess_only) {
-		show_tokenstream(token);
+		show_tokenstream(token, 0);
 		putchar('\n');
 		return NULL;
 	}
@@ -963,12 +967,13 @@ struct symbol_list *sparse_initialize(int argc, char **argv, struct string_list
 		// Initialize type system
 		init_ctype();
 
-		create_builtin_stream();
-		add_pre_buffer("#define __CHECKER__ 1\n");
-		if (!preprocess_only)
-			declare_builtin_functions();
-
-		list = sparse_initial();
+		if (!fnobuildin) {
+			create_builtin_stream();
+			add_pre_buffer("#define __CHECKER__ 1\n");
+			if (!preprocess_only)
+				declare_builtin_functions();
+			list = sparse_initial();
+		}
 
 		/*
 		 * Protect the initial token allocations, since
diff --git a/mdep.c b/mdep.c
new file mode 100644
index 0000000..6af8467
--- /dev/null
+++ b/mdep.c
@@ -0,0 +1,248 @@
+/*
+ * Copyright (C) 2012 Konrad Eisele <eiselekd@gmail.com>
+ * BSD-License
+ * Redistribution and use in source and binary forms are permitted
+ * provided that the above copyright notice and this paragraph are
+ * duplicated in all such forms and that any documentation,
+ * advertising materials, and other materials related to such
+ * distribution and use acknowledge that the software was developed
+ * by the <organization>.  The name of the
+ * University may not be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ */
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <assert.h>
+#include "token.h"
+#include "allocate.h"
+#include "compat.h"
+#include "parse.h"
+#include "symbol.h"
+#include "token.h"
+#include "lib.h"
+
+static void expand_macro(struct token *macro, struct symbol *sym, 
+                         struct token *parent, struct token **replace, 
+                         struct token **replace_tail, struct token *last);
+static void expand_arg(struct token *macro, struct symbol *sym, int arg,
+		       struct token *orig, struct token *expanded);
+struct pp;
+struct pp_e;
+
+struct preprocess_hook pp = {
+    .expand_macro = expand_macro,
+    .expand_arg = expand_arg
+};
+
+unsigned int pps = 0;
+
+void mdep_init(void) {
+    preprocess_hook = &pp;
+    pps = init_stream("<pp>", -1, 0);
+}
+
+enum tags {
+    ATTR_TOK = 1,
+    ATTR_TOKP = 2,
+};
+
+struct hash_v {
+    struct hash_v *n;
+    long key;
+    enum tags tag;
+    void *v;
+};
+
+__DECLARE_ALLOCATOR(struct hash_v, hash_v);
+__ALLOCATOR(struct hash_v, "hash value", hash_v);
+
+#define HASH_LEN (1024*4)
+struct hash {
+    struct hash_v *f;
+} h[HASH_LEN];
+
+static int hash_func(long key, enum tags tag) {
+    unsigned int k = ((unsigned int)key) >> 4;
+    return ((k) ^ (k >> 16) ^ (k >> 24) ^ tag) & (HASH_LEN-1);
+}
+
+void **lookup_attr(long key, enum tags tag, int create) {
+    int i = hash_func(key, tag);
+    struct hash *hp = &h[i];
+    struct hash_v *p;
+    struct hash_v **c = &hp->f;
+    while((p = *c)) {
+        if ((p ->tag == tag)
+            && (p ->key == key)) {
+            return &p->v;
+        }
+        c = &p->n;
+    }
+    if (create) {
+        p = __alloc_hash_v(0);
+        p->key = key;
+        p->tag = tag;
+        p->v = 0;
+        *c = p;
+        return &p->v;
+    }
+    return 0;
+}
+
+enum pp_typ {
+    MARG = 1,
+    MBODY,
+};
+
+struct pp {
+    enum pp_typ t;
+    union {
+        unsigned int argi;
+    };
+    struct pp_e *f;
+    struct symbol *sym;
+    struct token *tok;
+    struct token *s, *d;
+};
+
+struct pp_e {
+    struct pp_e *n;
+    struct pp *p;
+    struct position from;
+    int idx;
+};
+
+__DECLARE_ALLOCATOR(struct pp, pp_v);
+__ALLOCATOR(struct pp, "pp trace", pp_v);
+__DECLARE_ALLOCATOR(struct pp_e, pp_e_v);
+__ALLOCATOR(struct pp_e, "pp trace element", pp_e_v);
+int n_tokid = 1;
+
+void pp_dope_list(struct pp *p, struct token **d, int dope, struct token *list, struct token *end, int prt)
+{
+    struct pp_e **e = &p->f;
+    struct token *n;
+    int idx = 0;
+    while ((!eof_token(list)) && list != end ) {
+        if (dope) {
+            void **v;
+            int id = n_tokid++;
+            struct pp_e *n = __alloc_pp_e_v(0);
+            n->from = list->pos;
+            n->idx = idx;
+            n->n = 0;
+            n->p = p;
+            *e = n;
+            v = lookup_attr(id, ATTR_TOK, 1);
+            *v = n;
+            list->pos.line = id;
+            list->pos.stream = pps;
+            e = &n->n;
+        }
+        n = __alloc_token(0);
+        *n = *list;
+        /*printf(" %s\n", show_token(list));*/
+        n->next = &eof_token_entry;
+        *d = n;
+        d = &n->next;
+        list = list->next;
+        idx++;
+        
+	
+    }
+}
+
+struct pp *new_pp(struct token *m, int t) {
+	struct pp *n = __alloc_pp_v(0);
+	n->t = t;
+	n->f = 0;
+	return n;
+}
+
+static void expand_macro(struct token *macro, struct symbol *sym, struct token *parent,
+			 struct token **replace, struct token **replace_tail, struct token *last)
+{
+    struct pp *p = new_pp(macro, MBODY);
+    p->sym = sym;
+    p->tok = macro;
+    pp_dope_list(p, &p->s, 0, sym->expansion, 0, 0);
+    pp_dope_list(p, &p->d, 1, *replace, last, 0);
+}
+
+static void expand_arg(struct token *macro, struct symbol *sym, int arg,
+		       struct token *orig, struct token *expanded)
+{
+    struct pp *p = new_pp(macro, MARG);
+    p->argi = arg;
+    p->sym = sym;
+    p->tok = macro;
+    pp_dope_list(p, &p->s, 0, orig, 0, 0);
+    pp_dope_list(p, &p->d, 1, expanded, 0, 0);
+}
+
+void mdep_show_tokenstream(struct token *token, struct token *end, int idx)
+{
+    int i = 0;
+    while (token != end && !eof_token(token)) {
+        int prec = 1;
+        struct token *next = token->next;
+        const char *separator = "";
+        if (next->pos.whitespace)
+            separator = " ";
+        if (next->pos.newline) {
+            separator = "\n\t\t\t\t\t";
+            prec = next->pos.pos;
+            if (prec > 4)
+                prec = 4;
+        }
+        if (i == idx) 
+            fprintf(stderr,"@{%s}%.*s", show_token(token), prec, separator);
+        else
+            fprintf(stderr,"%s%.*s", show_token(token), prec, separator);
+        token = next;
+        i++;
+    }
+}
+
+void mdep_trace (struct token *tok, char *pre)
+{
+    void **v; int id; struct position pos;
+    struct pp_e *e; struct pp *p;
+    pre = pre ? pre : "";
+    if(!tok || eof_token (tok)) return;
+    pos = tok->pos; /* dereference tok only after the NULL check */
+    while(1) {
+        if (pos.stream != pps) {
+            char *name = stream_name(pos.stream);
+            fprintf(stderr, "%s%s:%d:%d\n", pre,
+                    name, pos.line, pos.pos);
+            break;
+        } 
+        id = pos.line;
+        if (!(v = lookup_attr(id, ATTR_TOK, 0))) {
+            break;
+        }
+        e = (struct pp_e *)*v;
+        p = e->p;
+        fprintf(stderr, "%s",pre); 
+        if (p->t == MARG) {
+            fprintf(stderr,"arg%d in %s :", p->argi, show_token(p->tok));
+        } else {
+            fprintf(stderr,"body in %s :", show_token(p->tok));
+        }
+        mdep_show_tokenstream(p->d, 0,e->idx); fprintf (stderr,"\n");
+        pos = e->from;
+    }
+}
+
+/*
+Local Variables:
+c-basic-offset:4
+indent-tabs-mode:nil
+End:
+*/
diff --git a/pre-process.c b/pre-process.c
index fb3430a..4eee864 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -385,6 +385,7 @@ static void expand_arguments(int count, struct arg *args)
 		struct token *arg = args[i].arg;
 		if (!arg)
 			arg = &eof_token_entry;
+		args[i].expanded = &eof_token_entry;
 		if (args[i].n_str)
 			args[i].str = stringify(arg);
 		if (args[i].n_normal) {
@@ -661,7 +662,7 @@ static int expand(struct token **list, struct symbol *sym)
 	last = token->next;
 	tail = substitute(list, sym->expansion, args);
 	if (preprocess_hook && preprocess_hook->expand_macro)
-		preprocess_hook->expand_macro(token, sym, parent, list, tail);
+		preprocess_hook->expand_macro(token, sym, parent, list, tail, last);
 	*tail = last;
 
 	return 0;
diff --git a/test-macro.c b/test-macro.c
index b30ee50..3115bef 100644
--- a/test-macro.c
+++ b/test-macro.c
@@ -22,9 +22,9 @@
 static void expand_arg(struct token *macro, struct symbol *sym, int i, struct token *orig, struct token *expanded)
 {
 	printf("arg%d in %s :", i, show_token(macro));
-	show_tokenstream(orig);
+	show_tokenstream(orig, 0);
 	printf(" -> ");
-	show_tokenstream(expanded);
+	show_tokenstream(expanded, 0);
 	printf("\n");
 	
 }
@@ -35,7 +35,7 @@ static void expand_macro(struct token *macro, struct symbol *sym, struct token *
 	printf("macro %s inside", show_token(macro));
 	printf(" %s\n", show_token(parent));
 	printf("expand result: ");
-	show_tokenstream(*replace);
+	show_tokenstream(*replace, 0);
 	printf("\n");
 }
 
@@ -55,11 +55,11 @@ void test_macro(char *filename)
 		die("No such file: %s", filename);
 
 	token = tokenize(filename, fd, NULL, includepath);
-	show_tokenstream(token);
+	show_tokenstream(token, 0);
 	printf("\n");
 	token = preprocess(token);
 	printf("After preprocessing\n");
-	show_tokenstream(token);
+	show_tokenstream(token, 0);
 }
 
 int main(int argc, char **argv)
diff --git a/test-mdep.c b/test-mdep.c
new file mode 100644
index 0000000..eca0357
--- /dev/null
+++ b/test-mdep.c
@@ -0,0 +1,62 @@
+/*
+ * Parse and linearize the tree for testing.
+ *
+ * Copyright (C) 2012 Christopher Li
+ *
+ */
+#include <stdarg.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <ctype.h>
+#include <unistd.h>
+#include <fcntl.h>
+
+#include "lib.h"
+#include "allocate.h"
+#include "token.h"
+#include "parse.h"
+#include "symbol.h"
+#include "expression.h"
+
+void test_mdep(char *filename)
+{
+    struct token *token;
+    int fd; int idx = 0;
+    fd = open(filename, O_RDONLY);
+    if (fd < 0)
+        die("No such file: %s", filename);
+	
+    token = tokenize(filename, fd, NULL, includepath);
+    token = preprocess(token);
+    printf("Dump token stream:\n");
+    
+    while (!eof_token(token)) {
+        struct token *next = token->next;
+        printf("%04d: %s\n", idx, show_token(token));
+        mdep_trace (token, "     ");
+        token = next; idx++;
+    }
+    
+}
+
+int main(int argc, char **argv)
+{
+    struct string_list *filelist = NULL;
+    char *file;
+
+    mdep_init();
+	
+    sparse_initialize(argc, argv, &filelist);
+    FOR_EACH_PTR_NOTAG(filelist, file) {
+        test_mdep(file);
+    } END_FOR_EACH_PTR_NOTAG(file);
+    return 0;
+}
+
+/*
+Local Variables:
+c-basic-offset:4
+indent-tabs-mode:nil
+End:
+*/
diff --git a/token.h b/token.h
index 985d1f5..8ddccd5 100644
--- a/token.h
+++ b/token.h
@@ -173,7 +173,7 @@ struct token {
 
 struct preprocess_hook {
 	void (*expand_macro)(struct token *macro, struct symbol *sym, struct token *parent,
-			     struct token **replace, struct token **replace_tail);
+			     struct token **replace, struct token **replace_tail, struct token *last);
 	void (*expand_arg)(struct token *macro, struct symbol *sym, int arg,
 			   struct token *orig, struct token *expanded);
 };
@@ -206,7 +206,7 @@ extern const char *show_special(int);
 extern const char *show_ident(const struct ident *);
 extern const char *show_string(const struct string *string);
 extern const char *show_token(const struct token *);
-extern void show_tokenstream(struct token *token);
+extern void show_tokenstream(struct token *token, struct token *end);
 extern struct token * tokenize(const char *, int, struct token *, const char **next_path);
 extern struct token * tokenize_buffer(void *, unsigned long, struct token **);
 
@@ -223,4 +223,8 @@ static inline int match_ident(struct token *token, struct ident *id)
 	return token->pos.type == TOKEN_IDENT && token->ident == id;
 }
 
+/* mdep.c */
+extern void mdep_init (void);
+extern void mdep_trace (struct token *tok, char *pre);
+
 #endif
diff --git a/tokenize.c b/tokenize.c
index b626f3f..6d0978f 100644
--- a/tokenize.c
+++ b/tokenize.c
@@ -127,6 +127,8 @@ const char *show_token(const struct token *token)
 
 	if (!token)
 		return "<no token>";
+	if (token == &eof_token_entry)
+		return "<eof>";
 	switch (token_type(token)) {
 	case TOKEN_ERROR:
 		return "syntax error";
@@ -180,9 +182,9 @@ const char *show_token(const struct token *token)
 	}
 }
 
-void show_tokenstream(struct token *token)
+void show_tokenstream(struct token *token, struct token *end)
 {
-	while (!eof_token(token)) {
+	while (token != end && !eof_token(token)) {
 		int prec = 1;
 		struct token *next = token->next;
 		const char *separator = "";
@@ -194,7 +196,7 @@ void show_tokenstream(struct token *token)
 			if (prec > 4)
 				prec = 4;
 		}
-		printf("%s%.*s", show_token(token), prec, separator);
+		fprintf(stderr,"%s%.*s", show_token(token), prec, separator);
 		token = next;
 	}
 }
-- 
1.7.4.4



* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-12 17:46                                                                 ` Konrad Eisele
@ 2012-05-12 17:57                                                                   ` Konrad Eisele
  2012-05-13  8:52                                                                   ` Konrad Eisele
  2012-05-14 10:53                                                                   ` Christopher Li
  2 siblings, 0 replies; 50+ messages in thread
From: Konrad Eisele @ 2012-05-12 17:57 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, Linux-Sparse

On 05/12/2012 07:46 PM, Konrad Eisele wrote:
> On 05/12/2012 01:02 PM, Christopher Li wrote:
>> On Fri, May 11, 2012 at 2:48 PM, Konrad Eisele<eiselekd@gmail.com> wrote:
>>>
>>> This seems ok. expanding_macro has to be global not static to be
>>> used... (?)
>>
>> The expand_macro call back use the parent argument which get
>> from expanding_macro list. The caller should be able to create tree
>> from the leaf node using the parent pointer.
>>
>> Feel free to change to use the expanding_macro instead if that make
>> building the tree easier.
>>
>>> I think the fact that argument expansion is recursive and
>>> body expansion is non-recursive is one of the things that
>>> make the preprocessor kindof hard to grasp.
>>
>> The body expansion can't be recursive on same macro otherwise
>> it can result in unlimited expansion. The C stander specify
>> the macro expand this way.
>>
>>>
>>> I cannot say this before I've tried it.
>>>
>>> I'd like to straighten things out a bit: My last emails
>>> where a bit too harsh and I'd like to apologize. Sorry
>>> for that.
>>
>> No problem at all. I figure you just want to the patch to
>> get included.
>>
>>> The next step then is: I'll write a patch to add a
>>> test-prog that uses this api to trace the token generation
>>> and generate a tree for it.
>>> For a start I'll printout for all tokens of a preprocessor
>>> run all macros-expansions that generated them.
>>
>> That is great. I have a test-macro program in that
>> branch which is very close to print out all the tokens.
>
> Appended is a test-patch that adds test-mdep testcase.
> The file mdep.c is used to record that macro
> expansion, each token will have a reference to its
> source.
> test-mdep.c does pre-process (as test-macro.c) then
> prints out the token trace through macros for each
> token: @{ } is used to mark the active path.
>

To explain mdep.c: There are in fact only 3 lines that
are of interest:

...
137:            n->from = list->pos;
...

...
143:            list->pos.line = id;
144:            list->pos.stream = pps;
...

Line 137 saves the token's previous pos; lines 143 and 144 then store a
new id in token.pos. Chaining these ids gives, for each token, its path
through the expansions.
mdep_trace() traverses that path.
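
A minimal sketch of that chaining idea, not the actual mdep.c code:
struct pos, trace_push() and trace_walk() are made-up names, and
TRACE_STREAM only stands in for the "pps" marker stream. Each expansion
step saves the old position in a record and overwrites the position with
the record id, so the walk can follow the ids back to the file position.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a token position (hypothetical layout). */
struct pos {
	int stream;	/* input stream, or TRACE_STREAM for a record reference */
	int line;	/* line number, or record id when stream == TRACE_STREAM */
};

#define TRACE_STREAM	(-2)	/* hypothetical marker, plays the role of "pps" */
#define MAX_RECORDS	64

/* One expansion step: remembers where the token was before the step. */
struct trace_rec {
	struct pos from;	/* previous position, may itself be a record reference */
	const char *what;	/* free-form label, e.g. "body in D1" */
};

static struct trace_rec records[MAX_RECORDS];
static int nr_records;

/* Save the old position, then replace it with a reference to the new record. */
static void trace_push(struct pos *p, const char *what)
{
	int id;

	assert(nr_records < MAX_RECORDS);
	id = nr_records++;
	records[id].from = *p;		/* corresponds to "n->from = list->pos" */
	records[id].what = what;
	p->line = id;			/* corresponds to "list->pos.line = id" */
	p->stream = TRACE_STREAM;	/* corresponds to "list->pos.stream = pps" */
}

/* Follow the chain back to the file position; returns the number of steps. */
static int trace_walk(struct pos p, struct pos *origin)
{
	int depth = 0;

	while (p.stream == TRACE_STREAM) {
		p = records[p.line].from;
		depth++;
	}
	*origin = p;
	return depth;
}
```

Pushing "arg0 in D1" and then "body in D1" onto a token that started at
stream 1, line 6 leaves a chain of two records; trace_walk() then
reports two steps and the original position.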


> An example file is added: a.h
> $test-mdep a.h
> ...
> 0004: 8
> body in D1 :4 @{8} 10 9 5 <untaint: D1>
> arg0 in D1 :@{8} 10 9
> body in D0 :1 @{D1}(8 10 9) 2 D2(11) 3 <untaint: D0>
> a.h:6:6
> ...
> Token nr 4 of the preprocess stream is "8". The
> generation path of "8" is marked @{8}...
> Not 100%, still, I think already readable. (Actually
> the printout order should be reversed (starting from file scope
> and drilling down the macro expansions...)
>
> I still dont handle empty expansions. I'll see weather I can come up
> with something here...
>
>
>>
>>> Now, I've learned not to run too fast towards the
>>> goal, (which is still "dependency tee from c parser entities downto
>>> token"), maybe you can think about how to achieve the next steps
>>> in an API :
>>> - An #include #ifdef #else #endif pushdown-stack
>>> to record the nestings for each token
>>
>> Let me think about this. Just thinking out lound,
>> The #include and #ifdef can consider as a special kind
>> of predefine macro as well.
>
> No, only a linked list that model the nexting levels.
> Then a preprocessor hook that can register lookup_macro()
> macro lookups inside # preprocessor lines. An example
> makes it clear:
>
> #if defined(a) && defined(b)
> #if defined(c)
> #endif
> #if defined(e)
> #endif
> #endif
>
> Result in:
> [a b]+<-[c]
> +<-[e]
>
> This can be easily done with a push-pop brackets
> and a callback in lookup_macro().
>
>
> Also:
> #if defined(a)
> #elif defined(c)
> #endif
>
> [a]+<-[c]
>
> #if defined(a)
> #else
> #endif
>
> <-[empty]<-[a]
>
> ...
>
>
> Another point I also need is to have an option so that inside
> do_handle_define() the symbol structures are never reused but
> alloc_symbol() is always used for undef and define, this is
> because I need to be able to also track the undef and define
> history for a macro at a certain position. I think this should be
> easy to add because you just need to define define-undef on
> top of each other...
>
>
>>
>>> - How to connect all this to the AST.
>>
>> For symbol, it relative easy because symbol has pos range
>> and aux pointer.
>
> I thought about taking "struct symbol_list *syms = sparse(file)"
> as the root. Then mark all elements that are used by them as dependent.
> I dont have enough insight to say how I can determine things like
> which "static inline" are used or how to traverse the
> "typedef" dependency.
> The goal is to have a "shrink" application that can strip away
> all c-lines (pre-pre-process level) that are not used by a specific
> command invocation of the compiler. Also a tool that can quickly show
> for a specific identifier everything that is connected to it, again on
> pre-preprocessor source level. kind-of something like:
> ...
> func1() {
> struct string_list *filelist = NULL; int i;
> }
> ..
> I point to "string_list" and then all lines that are related
> to struct string_list, (#ifdef nestings, macros, all member typedefs)
> etc are shown and all the rest stripped away, again on human
> readable c source level.
>
>
>>
>> Do you need to attach the dependency for the statment and
>> expression as well?
>>
>> Chris
>>
>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-12 17:46                                                                 ` Konrad Eisele
  2012-05-12 17:57                                                                   ` Konrad Eisele
@ 2012-05-13  8:52                                                                   ` Konrad Eisele
  2012-05-15  6:30                                                                     ` Christopher Li
  2012-05-14 10:53                                                                   ` Christopher Li
  2 siblings, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-13  8:52 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, Linux-Sparse

[-- Attachment #1: Type: text/plain, Size: 5550 bytes --]

On 05/12/2012 07:46 PM, Konrad Eisele wrote:
> On 05/12/2012 01:02 PM, Christopher Li wrote:
>> On Fri, May 11, 2012 at 2:48 PM, Konrad Eisele<eiselekd@gmail.com> wrote:
>>>
>>> This seems ok. expanding_macro has to be global not static to be
>>> used... (?)
>>
>> The expand_macro call back use the parent argument which get
>> from expanding_macro list. The caller should be able to create tree
>> from the leaf node using the parent pointer.
>>
>> Feel free to change to use the expanding_macro instead if that make
>> building the tree easier.
>>
>>> I think the fact that argument expansion is recursive and
>>> body expansion is non-recursive is one of the things that
>>> make the preprocessor kindof hard to grasp.
>>
>> The body expansion can't be recursive on same macro otherwise
>> it can result in unlimited expansion. The C stander specify
>> the macro expand this way.
>>
>>>
>>> I cannot say this before I've tried it.
>>>
>>> I'd like to straighten things out a bit: My last emails
>>> where a bit too harsh and I'd like to apologize. Sorry
>>> for that.
>>
>> No problem at all. I figure you just want to the patch to
>> get included.
>>
>>> The next step then is: I'll write a patch to add a
>>> test-prog that uses this api to trace the token generation
>>> and generate a tree for it.
>>> For a start I'll printout for all tokens of a preprocessor
>>> run all macros-expansions that generated them.
>>
>> That is great. I have a test-macro program in that
>> branch which is very close to print out all the tokens.
>
> Appended is a test-patch that adds test-mdep testcase.
> The file mdep.c is used to record that macro
> expansion, each token will have a reference to its
> source.
> test-mdep.c does pre-process (as test-macro.c) then
> prints out the token trace through macros for each
> token: @{ } is used to mark the active path.
>
> An example file is added: a.h
> $test-mdep a.h
> ...
> 0004: 8
> body in D1 :4 @{8} 10 9 5 <untaint: D1>
> arg0 in D1 :@{8} 10 9
> body in D0 :1 @{D1}(8 10 9) 2 D2(11) 3 <untaint: D0>
> a.h:6:6
> ...
> Token nr 4 of the preprocess stream is "8". The
> generation path of "8" is marked @{8}...
> Not 100%, still, I think already readable. (Actually
> the printout order should be reversed (starting from file scope
> and drilling down the macro expansions...)
>
> I still dont handle empty expansions. I'll see weather I can come up
> with something here...

I have thought about how to implement empty expansion tracing without
introducing a new token type. I came up with a solution, but it needs
one more callback, which I called substitute_arg(); see the attached patch.
What do you think, can it be applied?

I think I can use the address of the pointer to the token (struct token
**, which is normally &tok->next) as a hash key to propagate the empty
expansions...

I'm not 100% sure it works, but I need the extra hook to be able
to propagate the empty expansion from the arguments into the
substitution body...
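
A rough sketch of the "address of the link as hash key" idea, under my
own assumptions: struct empty_note, note_empty() and lookup_empty() are
hypothetical names, and struct tok is a cut-down stand-in for struct
token. The point is only that an empty expansion leaves no token behind,
so the note is attached to the slot (&tok->next) where tokens would have
been inserted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HASH_LEN 64

struct tok {
	const char *text;
	struct tok *next;
};

/* Remembers that a macro expanded to nothing at a given link. */
struct empty_note {
	struct tok **slot;		/* key: address of the link left unchanged */
	const char *macro;		/* name of the macro that expanded to nothing */
	struct empty_note *next;	/* bucket chain */
};

static struct empty_note *buckets[HASH_LEN];

/* Hash the address of the link; the low bits are dropped as alignment noise. */
static unsigned int hash_slot(struct tok **slot)
{
	return (unsigned int)(((uintptr_t)slot >> 3) % HASH_LEN);
}

/* Record an empty expansion; the caller provides storage for the note. */
static void note_empty(struct empty_note *n, struct tok **slot, const char *macro)
{
	unsigned int h = hash_slot(slot);

	n->slot = slot;
	n->macro = macro;
	n->next = buckets[h];
	buckets[h] = n;
}

/* Return the macro recorded for this link, or NULL if there is none. */
static const char *lookup_empty(struct tok **slot)
{
	struct empty_note *n;

	for (n = buckets[hash_slot(slot)]; n; n = n->next)
		if (n->slot == slot)
			return n->macro;
	return NULL;
}
```

The weak spot of such a scheme is that the key is only valid while the
token that owns the link stays where it is; if tokens are duplicated
during substitution, the note has to be carried over to the new link,
which is what the extra hook is for.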


>
>
>>
>>> Now, I've learned not to run too fast towards the
>>> goal, (which is still "dependency tee from c parser entities downto
>>> token"), maybe you can think about how to achieve the next steps
>>> in an API :
>>> - An #include #ifdef #else #endif pushdown-stack
>>> to record the nestings for each token
>>
>> Let me think about this. Just thinking out lound,
>> The #include and #ifdef can consider as a special kind
>> of predefine macro as well.
>
> No, only a linked list that model the nexting levels.
> Then a preprocessor hook that can register lookup_macro()
> macro lookups inside # preprocessor lines. An example
> makes it clear:
>
> #if defined(a) && defined(b)
> #if defined(c)
> #endif
> #if defined(e)
> #endif
> #endif
>
> Result in:
> [a b]+<-[c]
> +<-[e]
>
> This can be easily done with a push-pop brackets
> and a callback in lookup_macro().
>
>
> Also:
> #if defined(a)
> #elif defined(c)
> #endif
>
> [a]+<-[c]
>
> #if defined(a)
> #else
> #endif
>
> <-[empty]<-[a]
>
> ...
>
>
> Another point I also need is to have an option so that inside
> do_handle_define() the symbol structures are never reused but
> alloc_symbol() is always used for undef and define, this is
> because I need to be able to also track the undef and define
> history for a macro at a certain position. I think this should be
> easy to add because you just need to define define-undef on
> top of each other...
>
>
>>
>>> - How to connect all this to the AST.
>>
>> For symbol, it relative easy because symbol has pos range
>> and aux pointer.
>
> I thought about taking "struct symbol_list *syms = sparse(file)"
> as the root. Then mark all elements that are used by them as dependent.
> I dont have enough insight to say how I can determine things like
> which "static inline" are used or how to traverse the
> "typedef" dependency.
> The goal is to have a "shrink" application that can strip away
> all c-lines (pre-pre-process level) that are not used by a specific
> command invocation of the compiler. Also a tool that can quickly show
> for a specific identifier everything that is connected to it, again on
> pre-preprocessor source level. kind-of something like:
> ...
> func1() {
> struct string_list *filelist = NULL; int i;
> }
> ..
> I point to "string_list" and then all lines that are related
> to struct string_list, (#ifdef nestings, macros, all member typedefs)
> etc are shown and all the rest stripped away, again on human
> readable c source level.
>
>
>>
>> Do you need to attach the dependency for the statment and
>> expression as well?
>>
>> Chris
>>
>


[-- Attachment #2: hook.diff --]
[-- Type: text/plain, Size: 1453 bytes --]

diff --git a/pre-process.c b/pre-process.c
index fb3430a..73a58be 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -573,6 +573,9 @@ static struct token **substitute(struct token **list, struct token *body, struct
 		case TOKEN_MACRO_ARGUMENT:
 			arg = args[body->argnum].expanded;
 			count = &args[body->argnum].n_normal;
+			if (preprocess_hook && preprocess_hook->substitute_arg) {
+				preprocess_hook->substitute_arg(&added, &args[body->argnum].expanded);
+			}
 			if (eof_token(arg)) {
 				state = Normal;
 				continue;
@@ -650,7 +653,7 @@ static int expand(struct token **list, struct symbol *sym)
 		if (preprocess_hook && preprocess_hook->expand_arg) {
 			int i;
 			for (i = 0; i < nargs; i++) {
-				preprocess_hook->expand_arg(token, sym, i, args[i].orig, args[i].expanded);
+				preprocess_hook->expand_arg(token, sym, i, args[i].orig, &args[i].expanded);
 				free_preprocessor_line(args[i].orig);
 			}
 		}
diff --git a/token.h b/token.h
index 985d1f5..c45d6be 100644
--- a/token.h
+++ b/token.h
@@ -175,7 +175,8 @@ struct preprocess_hook {
 	void (*expand_macro)(struct token *macro, struct symbol *sym, struct token *parent,
 			     struct token **replace, struct token **replace_tail);
 	void (*expand_arg)(struct token *macro, struct symbol *sym, int arg,
-			   struct token *orig, struct token *expanded);
+			   struct token *orig, struct token **expanded);
+	void (*substitute_arg)(struct token **dest, struct token **argp);
 };
 
 #define MAX_STRING 4095

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-12 17:46                                                                 ` Konrad Eisele
  2012-05-12 17:57                                                                   ` Konrad Eisele
  2012-05-13  8:52                                                                   ` Konrad Eisele
@ 2012-05-14 10:53                                                                   ` Christopher Li
  2 siblings, 0 replies; 50+ messages in thread
From: Christopher Li @ 2012-05-14 10:53 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, Linux-Sparse

On Sat, May 12, 2012 at 10:46 AM, Konrad Eisele <eiselekd@gmail.com> wrote:
> Appended is a test-patch that adds test-mdep testcase.
> The file mdep.c is used to record that macro
> expansion, each token will have a reference to its
> source.
> test-mdep.c does pre-process (as test-macro.c) then
> prints out the token trace through macros for each
> token: @{ } is used to mark the active path.

Here is some feedback on the patch you sent out:

>diff --git a/a.h b/a.h
>new file mode 100644

The test file should be in the validation/ directory. Please also give
a.h a name that describes the test.

+               if (!strncmp(arg, "buildin", 7)) {
+                       fnobuildin = 1;
+               }

Is there a standard gcc option to handle nobuildin? I haven't found one.
If gcc has one, we can duplicate the gcc behavior. Otherwise,
I don't see that it is necessary to add one. The dependency program needs
to be able to handle symbols from the builtin stream anyway.
If it is to make debugging easier, your program can trivially skip the
builtin stream.

+ * BSD-License
+ * Redistribution and use in source and binary forms are permitted

Can you make it covered by the sparse license file as well? I don't mind
you granting an extra license to your code. But please at least make it
covered by the license of the project itself, otherwise I won't apply it.
It is pure overhead to figure out which licenses are compatible with sparse
and to maintain different files under different licenses. Having all files
in the project covered by the same license keeps it simple.

+void mdep_init(void) {
+    preprocess_hook = &pp;
+    pps = init_stream("<pp>", -1, 0);
+}

Please use the same coding style as the rest of the project.
It is actually the same as the Linux kernel's. The "{" for a function should
be on a new line. Indentation is one tab, 8 characters wide. There are a lot
of these little issues in this file. I am not going to repeat myself here.

+struct hash {
+    struct hash_v *f;
+} h[HASH_LEN];

+struct pp {
+    enum pp_typ t;
+    union {
+        unsigned int argi;
+    };
+    struct pp_e *f;
+    struct symbol *sym;
+    struct token *tok;
+    struct token *s, *d;
+};
+

I hate those one-letter member names. You have to guess what they are
all the time. pp_dope_list() is full of one-letter variable names as
well, which makes it very hard to read. The variable names need to be a
bit more meaningful.


+       show_tokenstream(token, 0);
Please don't use 0 for a NULL pointer.

-void show_tokenstream(struct token *token)
+void show_tokenstream(struct token *token, struct token *end)
-       while (!eof_token(token)) {
+       while (token != end && !eof_token(token)) {

I don't see you using show_tokenstream() with two arguments.
You wrote your own show_tokenstream for mdep anyway.
You can leave this one unchanged.

@@ -194,7 +196,7 @@ void show_tokenstream(struct token *token)
-               printf("%s%.*s", show_token(token), prec, separator);
+               fprintf(stderr,"%s%.*s", show_token(token), prec, separator);

You can't do that; "-E" uses show_tokenstream() as well. You just
changed its output to stderr for "-E".


To be continued...

Chris

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-13  8:52                                                                   ` Konrad Eisele
@ 2012-05-15  6:30                                                                     ` Christopher Li
  2012-05-15  7:52                                                                       ` Konrad Eisele
  0 siblings, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-05-15  6:30 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, Linux-Sparse

On Sun, May 13, 2012 at 1:52 AM, Konrad Eisele <eiselekd@gmail.com> wrote:
> I have thought about how to implement empty expansion tracing without
> introducing a new token type. I came up with a solution, however I need
> one callback, I called it substitute_arg(), see patch attached.
> What do you think, is it apply-able?
>

I am very sorry that my speed of absorbing patches is much slower than
your speed of producing them :(

About propagating the empty expansions, may I ask a silly question?
Is the goal that, after the recursive expansion, you are able to tell
from the result stream that a macro expanded to nothing at a given
location?

Obviously, if the empty expansion happens in a top-level macro expansion,
you can always keep track of the original location of the expansion using
the macro token->pos. The tricky part is when the empty expansion happens
inside a multi-level macro expansion. Then it is hard to keep track of where
that empty macro would have landed if it were not empty. Is that the problem
you are trying to solve?

About the two example usages you gave. The first one uses HTML to show
how macros expand recursively. I like that demo. It only needs to remember
at which level the macro expanded to empty. It doesn't need to remember
where the empty macro would land in the result stream.

For the second usage example, the shrinking program, I think it only needs
to remember empty macro expansions at the top level. That is easy because
you have the macro token->pos pointing to where the macro is used in the
source stream. For an empty macro expansion inside another macro, you only
need to remember that it is a dependency of the top-level macro, because
when you trim the source code, you can't split a top-level macro.

I think a lot of the complexity comes from trying to remember where the
empty macro would land if it were not empty. However, exactly because the
macro is empty, it will not show up in the result token list. So where it
would land is actually not very useful information; the parser never sees
it anyway. We can relax the requirement a little bit and only remember
where the empty macro lands in the top-level case. That will greatly
simplify the solution.

Is my understanding correct?

Thanks

Chris

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-15  6:30                                                                     ` Christopher Li
@ 2012-05-15  7:52                                                                       ` Konrad Eisele
  2012-05-15  9:44                                                                         ` Christopher Li
  0 siblings, 1 reply; 50+ messages in thread
From: Konrad Eisele @ 2012-05-15  7:52 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, Linux-Sparse

Christopher Li wrote:
> On Sun, May 13, 2012 at 1:52 AM, Konrad Eisele<eiselekd@gmail.com>  wrote:
>> I have thought about how to implement empty expansion tracing without
>> introducing a new token type. I came up with a solution, however I need
>> one callback, I called it substitute_arg(), see patch attached.
>> What do you think, is it apply-able?
>>
>
> I am very sorry that my speed of absorbing patches is much slower than
> your speed of producing it :(
The last patch with mdep.c and test-mdep.c was also not meant to be
applied; it is only a work in progress to go further...
One thing you can see is how I propagate the token sources
using:
137:            n->from = list->pos;
...
143:            list->pos.line = id;
144:            list->pos.stream = pps;

Here you get an argument why it is better to have
(1) ->macro_begin(a)
     ->macro_end(b)
instead of only one
(2) expand_macro(a,b)
If you want to use the preprocessor hooks to output a human-readable macro
expansion history similar to LLVM's, you probably want a pointer
to the "pre-expanded" token location, which is in (a). In (1) you can build
up the token source paths going from the post-expand buffer (b) through the
pre-expanded buffers (a). In (2) you only have the paths going through the
expanded buffers; you can use (a) to reason about where (b) came from, but
not in a simple way...
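
To make variant (1) a bit more concrete, here is a small sketch. Only
the names macro_begin and macro_end come from the notation above; the
signatures, struct tok, the recorder and fake_expand() are assumptions
for illustration and not sparse's preprocess_hook API. The begin hook
sees the pre-expansion token (a), the end hook sees the expanded list
(b) and links every token in it back to (a).

```c
#include <assert.h>
#include <stddef.h>

struct tok {
	const char *text;
	struct tok *origin;	/* pre-expansion token this one was produced from */
	struct tok *next;
};

/* Hypothetical two-callback interface, variant (1). */
struct expand_hooks {
	void (*macro_begin)(struct tok *use);		/* (a) before expansion */
	void (*macro_end)(struct tok *expanded);	/* (b) after expansion */
};

#define MAX_DEPTH 16
static struct tok *pending[MAX_DEPTH];	/* macro uses currently being expanded */
static int depth;

/* Remember the pre-expansion token until the matching end call. */
static void rec_begin(struct tok *use)
{
	assert(depth < MAX_DEPTH);
	pending[depth++] = use;
}

/* Link every not yet attributed token of the result back to the macro use. */
static void rec_end(struct tok *expanded)
{
	struct tok *t;

	assert(depth > 0);
	depth--;
	for (t = expanded; t; t = t->next)
		if (!t->origin)
			t->origin = pending[depth];
}

static const struct expand_hooks recorder = { rec_begin, rec_end };

/* Stand-in for the preprocessor: "expands" use into a ready-made list. */
static struct tok *fake_expand(const struct expand_hooks *hooks,
			       struct tok *use, struct tok *replacement)
{
	hooks->macro_begin(use);
	/* real argument collection and substitution would happen here */
	hooks->macro_end(replacement);
	return replacement;
}
```

With a single expand_macro(a,b) call the same link can be made, but only
after the fact; the begin/end pair lets the recorder keep a stack of
open expansions while the substitution is running.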

>
> About propagating the empty expansions, may I ask a silly question?
> Is that the goal is you are able to track that, after the recursive expansion,
> you are able to tell in the result stream, there is an macro expand to nothing
> in the this location?
>
> Obviously, if the empty expansion is happen in the top level macro expand,
> you can always keep track of where is the original location of the expand using
> macro token->pos. The tricky part is the, if the empty expansion happen inside
> a multi-level macro expand. Then it is hard to keep track of where that empty
> macro should have been landed if it is not empty. Is that the problem you are
> trying to solve?

Kindof something like this:
#define E1
#define E2
#define S1(a) struct a { E1 int d1; };
#define S2(a) struct a { E2 int d2; };
#define xdef S1(sx) S2(sy)
xdef
main() {
	struct sx v;
}

The xdef expands in one line to "struct sx { int d1; }; struct sy { int d2; };".
main() only uses "struct sx". Therefore the dependency analysis should not
have "E2 and S2" as dependencies. I think you need the location of empty expansions...
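
As an illustration of why the positions matter, here is a small sketch
with a hand-written token table for the example above. Nothing here is
produced by sparse: struct ptok, the bit names and deps_of_range() are
made up, and an empty expansion is modelled as a token with empty text
that still carries its macro path.

```c
#include <assert.h>

/* One bit per macro of the example; purely illustrative. */
enum {
	M_XDEF = 1 << 0,
	M_S1   = 1 << 1,
	M_S2   = 1 << 2,
	M_E1   = 1 << 3,
	M_E2   = 1 << 4,
};

struct ptok {
	const char *text;	/* "" marks the place of an empty expansion */
	unsigned int macros;	/* macros on this token's expansion path */
};

#define IN_S1	(M_XDEF | M_S1)
#define IN_S2	(M_XDEF | M_S2)

/* Hand-written result of expanding "xdef", with the empty slots kept. */
static const struct ptok stream[] = {
	{ "struct", IN_S1 }, { "sx", IN_S1 }, { "{", IN_S1 },
	{ "", IN_S1 | M_E1 },			/* E1 expanded to nothing here */
	{ "int", IN_S1 }, { "d1", IN_S1 }, { ";", IN_S1 },
	{ "}", IN_S1 }, { ";", IN_S1 },
	{ "struct", IN_S2 }, { "sy", IN_S2 }, { "{", IN_S2 },
	{ "", IN_S2 | M_E2 },			/* E2 expanded to nothing here */
	{ "int", IN_S2 }, { "d2", IN_S2 }, { ";", IN_S2 },
	{ "}", IN_S2 }, { ";", IN_S2 },
};

/* Union of the macros behind the tokens in [start, end). */
static unsigned int deps_of_range(const struct ptok *toks, int start, int end)
{
	unsigned int deps = 0;
	int i;

	assert(start >= 0 && start <= end);
	for (i = start; i < end; i++)
		deps |= toks[i].macros;
	return deps;
}
```

The declaration of "struct sx" covers entries 0 to 8, so
deps_of_range(stream, 0, 9) gives xdef, S1 and E1 and nothing from S2 or
E2. If the empty slot for E1 were dropped from the table, E1 would
silently disappear from the result.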

>
> About the two example usage you give. The first one is using the html
> to show how macro expand recursively. I like that demo. It only need to remember
> at which level the macro expand to empty. It don't need to remember where
> the empty macro need to land in the result stream.
Do you mean http://cfw.sourceforge.net/htmltag/init_32.c.pinfo.html ? This
is a gcc-based patch; even there you need to trace empty expansion positions.

>
> For the second usage example, the shrinking program. I think it only need to
> remember empty macro expand at the top level. Which is easy because you
> have the macro token->pos pointing to where this macro used in the source
> stream. For the empty macro expand inside another macro, you only need
> to remember it is a dependent of the the top level macro. Because when you
> trim the source code, you can't split a top level macro.

See above example.

>
> I think a lot of the complexity is introduced try to remember where that empty
> macro will land, if it is not being empty. However, exactly because the macro
> is being empty, it will not show up in the result token list. So where it should
> land is actually not very useful information, the parser never see it any way.
> We can relax the requirement a little bit, only need to remember where the
> empty macro will land in the top level case. That will greatly
> simplify the solution.

I'm not sure if it works...

>
> Is my understanding correct?
>
> Thanks
>
> Chris
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-15  7:52                                                                       ` Konrad Eisele
@ 2012-05-15  9:44                                                                         ` Christopher Li
  2012-05-15 13:03                                                                           ` Konrad Eisele
  0 siblings, 1 reply; 50+ messages in thread
From: Christopher Li @ 2012-05-15  9:44 UTC (permalink / raw)
  To: Konrad Eisele; +Cc: Konrad Eisele, Linux-Sparse

On Tue, May 15, 2012 at 12:52 AM, Konrad Eisele <konrad@gaisler.com> wrote:
> The last patch with mdep.c and test-mdep.c was also nothing to
> apply, only a work in progress to go further...

OK, I mistook that for an apply request. Never mind some of the
coding style comments then.

> One thing you can see is how I propagate the token sources
> using:
>
> 137:            n->from = list->pos;
> ...
> 143:            list->pos.line = id;
> 144:            list->pos.stream = pps;
>
> here you get an argument why it is better to have a
> (1) ->macro_begin(a)
>    ->macro_end(b)
> instead of only one
> (2) expand_macro(a,b)
> If you want to use the preprocessorhooks to output human readable macro
> expansion history line similar to LLVM you probably want a pointer
> to the "pre-expanded" token location, that is in (a). In (1) you can buildup
> the token-source-paths going from post-expand buffer (b) through the
> pre-expanded buffers (a)
> in (2) you only have the paths going through the expanded buffers, can use
> (a) to reason
> about where (b) came from, but  not in a simple way...

At this point I think 1) is actually better for your requirement.
I started out hoping that expand_macro() could abstract away the
internal implementation details of macro expansion and just give you the
text before and after the expansion. But all this extra
manipulation of the macro tokens, especially substitute_arg(),
clearly indicates tight coupling to the implementation details.

I would take 1) over substitute_arg() if 1) doesn't need a callback like
substitute_arg().


> Kindof something like this:
> #define E1
> #define E2
> #define S1(a) struct a { E1 int d1; };
> #define S2(a) struct a { E2 int d2; };
> #define xdef S1(sx) S2(sy)
> xdef
> main() {
>        struct sx v;
> }
>
> The xdef expands in one-line to "struct sx { int d1; }; struct sy { int d2;
> };"
> main() only uses "struct sx". Therefore the dependency analysis should not
> have "E2 and S2" as dependencies. I think you need the location of empty

That is surprising to me. I previously had different assumptions.
I assumed you wanted to trace all the way back to the source macro.
I would just say main() depends on xdef, which depends on S1 and S2.
S1 also depends on E1, and S2 depends on E2.

If you bypass xdef and directly extract S1(sx), why can't you go one
step further and bypass S1() as well, and say main depends on "struct sx {
E1 int d1; }"?

This is even more complicated than I originally thought.
How about this case:

#define S1(a) struct a { E1 int d1; }; E3 struct
#define S2(a) a { E2 int d2; };

The rest is the same. Now xdef will expand to the same text.
What should main() depend on?  S1, E1 and E3?
Notice that without S2, the text expanded from S1 can't compile at all.

> Do you mean http://cfw.sourceforge.net/htmltag/init_32.c.pinfo.html ? This
> is a gcc-based patch even there you need to trace empty expansion positions.

Yes. Was that patch submitted to gcc? I can see it traces empty expansions
one level deep. Do you track empty macros through multiple levels of macro
expansion as well?

Chris

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Fwd: dependency tee from c parser entities downto token
  2012-05-15  9:44                                                                         ` Christopher Li
@ 2012-05-15 13:03                                                                           ` Konrad Eisele
  0 siblings, 0 replies; 50+ messages in thread
From: Konrad Eisele @ 2012-05-15 13:03 UTC (permalink / raw)
  To: Christopher Li; +Cc: Konrad Eisele, Linux-Sparse

On 05/15/2012 11:44 AM, Christopher Li wrote:
> On Tue, May 15, 2012 at 12:52 AM, Konrad Eisele<konrad@gaisler.com>  wrote:
>> The last patch with mdep.c and test-mdep.c was also nothing to
>> apply, only a work in progress to go further...
>
> OK, I mistaken that as apply request. Never mind some of the
> coding style comment then.
>
>> One thing you can see is how I propagate the token sources
>> using:
>>
>> 137:            n->from = list->pos;
>> ...
>> 143:            list->pos.line = id;
>> 144:            list->pos.stream = pps;
>>
>> here you get an argument why it is better to have a
>> (1) ->macro_begin(a)
>>     ->macro_end(b)
>> instead of only one
>> (2) expand_macro(a,b)
>> If you want to use the preprocessorhooks to output human readable macro
>> expansion history line similar to LLVM you probably want a pointer
>> to the "pre-expanded" token location, that is in (a). In (1) you can buildup
>> the token-source-paths going from post-expand buffer (b) through the
>> pre-expanded buffers (a)
>> in (2) you only have the paths going through the expanded buffers, can use
>> (a) to reason
>> about where (b) came from, but  not in a simple way...
>
> At this point I think  1) is actually better for your requirement.
> I start out as hoping the expand_macro() can abstract away the
> internal implementation detail of macro expand and just give you the
> text before and after the macro expand. But with all this extra
> manipulations of the macro tokens, especially the substitute_argument()
> is clear indicate tightening into the implementation details.
>
> I would take 1) over substitute_arguments() if 1) don't need call back like
> substitute_arguments().

substitute_arg() exists because I am not allowed an extra token type,
TOKEN_M_EMPTY. It is unrelated to the cases above...

>
>
>> Kindof something like this:
>> #define E1
>> #define E2
>> #define S1(a) struct a { E1 int d1; };
>> #define S2(a) struct a { E2 int d2; };
>> #define xdef S1(sx) S2(sy)
>> xdef
>> main() {
>>         struct sx v;
>> }
>>
>> The xdef expands in one-line to "struct sx { int d1; }; struct sy { int d2;
>> };"
>> main() only uses "struct sx". Therefore the dependency analysis should not
>> have "E2 and S2" as dependencies. I think you need the location of empty
>
> That is surprising to me. I previously have different assumptions.
> I assume you want to back trace all the way back to the source macro.
> I would just say main() depend on xdef, which depend one S1 and S2.
> S1 also depend on E1 and S2 depend on E2.

I'm not sure who is wrong here, or whether maybe my example is not
general enough; maybe you are right...

>
> If you bypass xdef and directly extract S1(sx), why can't you go one
> step further, bypass S1() as well, and say main() depends on "struct sx {
> E1 int d1; }"?
>
> This is even more complicated than I originally thought.
> How about this case:
>
> #define S1(a) struct a { E1 int d1; }; E3 struct
> #define S2(a) a { E2 int d2; };
>
> The rest is the same. Now xdef will expand to the same text.
> What should main() depend on? S1, E1 and E3?
> Notice that without S2, the text expanded from S1 can't compile at all.
>
>> Do you mean http://cfw.sourceforge.net/htmltag/init_32.c.pinfo.html ? This
>> is a gcc-based patch; even there you need to trace empty expansion positions.
>
> Yes. Was that patch submitted to gcc? I can see it traces empty expansions one level deep.

Kind of :-) http://gcc.gnu.org/ml/gcc/2012-03/msg00208.html
It disappeared in the gcc list noise, however...

> Do you track empty macros through multiple levels of macro expansion as well?

I dump everything; however, I use a perl script to reconstruct...

>
> Chris
>



end of thread, other threads:[~2012-05-15 13:00 UTC | newest]

Thread overview: 50+ messages
2012-04-24  9:54 dependency tee from c parser entities downto token Konrad Eisele
2012-04-25 20:10 ` [PATCH] depend.c: build up a dependency tree from c entities downto tokens: entries in the tree are: macro-depend: tree of #if nesting macro-expansions: possible macro expansion source of a token tok->macro-expansions->macro tok->macro-depend->macro c entities are linked in via [stmt|expr|sym]->start-end-token Konrad Eisele
2012-04-30 22:58 ` dependency tee from c parser entities downto token Christopher Li
2012-05-02  7:27   ` Konrad Eisele
2012-05-03 23:52     ` Christopher Li
2012-05-04  7:33       ` Konrad Eisele
2012-05-04  9:25         ` Christopher Li
2012-05-04 10:36           ` Konrad Eisele
2012-05-04 12:36             ` Konrad Eisele
2012-05-04 15:30               ` Josh Triplett
2012-05-04 20:53                 ` Konrad Eisele
2012-05-04 22:30                   ` Christopher Li
2012-05-05  0:32                     ` Josh Triplett
2012-05-05  8:59                       ` Konrad Eisele
2012-05-05  8:56                     ` Konrad Eisele
2012-05-04 18:02             ` Christopher Li
2012-05-04 21:46               ` Konrad Eisele
2012-05-04 21:56                 ` Konrad Eisele
2012-05-04 23:05                 ` Christopher Li
2012-05-05  8:54                   ` Konrad Eisele
2012-05-05 11:12                     ` Christopher Li
2012-05-05 16:59                       ` Konrad Eisele
     [not found]                         ` <CANeU7Qn7vUzLQAF6JGRECro_pPDnL7MCswkrNACe1wohLHZu7g@mail.gmail.com>
2012-05-05 19:56                           ` Fwd: " Christopher Li
2012-05-05 23:38                             ` Konrad Eisele
2012-05-06 18:34                               ` Christopher Li
2012-05-07  6:12                                 ` Konrad Eisele
2012-05-07 22:06                                   ` Christopher Li
2012-05-08  6:38                                     ` Konrad Eisele
2012-05-09  9:18                                       ` Christopher Li
2012-05-09  9:48                                         ` Konrad Eisele
2012-05-09 22:50                                           ` Christopher Li
2012-05-10  6:19                                             ` Konrad Eisele
2012-05-10  6:38                                               ` Konrad Eisele
2012-05-10  9:37                                                 ` Christopher Li
2012-05-10  9:51                                                   ` Konrad Eisele
2012-05-10 11:25                                                     ` Christopher Li
2012-05-10 12:14                                                       ` Konrad Eisele
2012-05-10 12:28                                                         ` Konrad Eisele
2012-05-11 19:40                                                           ` Christopher Li
2012-05-11 21:48                                                             ` Konrad Eisele
2012-05-12 11:02                                                               ` Christopher Li
2012-05-12 17:46                                                                 ` Konrad Eisele
2012-05-12 17:57                                                                   ` Konrad Eisele
2012-05-13  8:52                                                                   ` Konrad Eisele
2012-05-15  6:30                                                                     ` Christopher Li
2012-05-15  7:52                                                                       ` Konrad Eisele
2012-05-15  9:44                                                                         ` Christopher Li
2012-05-15 13:03                                                                           ` Konrad Eisele
2012-05-14 10:53                                                                   ` Christopher Li
2012-05-10  9:03                                               ` Christopher Li
