On Wed, 2012-02-08 at 16:11 -0200, Emílio Wuerges wrote:
Hello,
I've made a small patch including my dominator analysis.
Thanks for sending this!
It is ugly.
It is the first time I did something with python embedded in C, so I'm
aware I did not use the best practices.
A good tutorial for this can be seen here:
http://docs.python.org/extending/index.html
and detailed notes are here:
http://docs.python.org/c-api/index.html
I'm sorry that there isn't yet a guide to contributing to plugin.
Beware: lots of the relevant code within the gcc python plugin is
autogenerated, so often it's best to implement things by patching the
generate-*.py files, rather than the C code directly, and often a
mixture of the two is best.
I inserted all my changes inside the gcc-python.c
If you could check my patch and suggest some changes I'd be happy to
do them before re-submitting my patch.
I'm afraid I'm not yet familiar with GCC's own CFG analysis code, so
please bear with me.
Some high-level API thoughts:
* can the various dominator analysis code be methods of
gcc.BasicBlock, rather than classmethods within gcc? For example,
rather than:
if gcc.dominates(bb1, bb2):
...etc...
it seems more "pythonic" to me to have this instead:
if bb1.dominates(bb2):
...etc...
and:
if bb1.post_dominates(bb2):
...etc...
Similarly, rather than:
gcc.dominator_sons(bb)
it seems more "pythonic" to have
bb.dominator_sons()
As a new parent, I'm biased: can they be daughters, rather than
sons? :)
But maybe the API would read better as something like this:
bb.get_dominated_blocks()
or somesuch?
Adding new methods and attributes to the classes requires editing the
appropriate generate-*.py script to wire up the new entrypoint. For
very simple attributes you can embed the C code directly there, but
anything that's more than a one-liner should have its implementation in
the relevant C file.
So you'd edit:
* generate-cfg-c.py to add the new methods and attributes to the
relevant tables of callbacks
* gcc-python-wrappers.h to add declarations of the new C functions
* gcc-python-cfg.c to add the implementations of the new C functions
Going through the patch itself:
+static PyObject *
+gcc_python_dominates(PyGccBasicBlock *n, PyObject *args)
+{
+ static int calc = 0;
+ PyGccBasicBlock *a, *b;
+
+ if(calc == 0) {
+ calculate_dominance_info (CDI_DOMINATORS);
+ calc = 1;
+ }
I'm a little wary about the global state that "calc" introduces here: it
looks like you're lazily calculating the dominance info. Where does GCC
normally calculate the information? Could the CFG have changed since
the last call to this function?
+
+ gcc_assert(PyArg_ParseTuple(args,
+ "OO",
+ &a, &b));;
There are a couple of issues here:
(a) IIRC, gcc_assert is intended for errors within GCC, and kills the
process with a diagnostic, whereas it's possible for PyArg_ParseTuple to
fail with a script-level error (e.g. if the wrong number of arguments is
passed in). If that happens, it's better to instead do something like
this:
if (!PyArg_ParseTuple(args, ...etc...) {
return NULL;
}
since PyArg_ParseTuple will have set the current thread's exception for
us, thus printing a Python traceback with a good error message, rather
than halting GCC with an inscrutable Internal Compiler Error.
(b) The "O" flag only checks for an object; it doesn't do any type
checking, so if someone passes in something that isn't a gcc.BasicBlock,
there's an implicit cast to the wrong C struct, and we'll likely have a
segfault. The easy way to avoid this is to use the "O!" format code,
which takes a PyTypeObject and gives us type checking against it, so
something like this:
if (!PyArg_ParseTuple(args, "O!O!",
&gcc_BasicBlock_Type, &a,
&gcc_BasicBlock_Type, &b)) {
return NULL;
}
so that if the args aren't of the correct type, a meaningful exception
will be set for us.
+ {"dominates",
+ (PyCFunction)gcc_python_dominates,
+ (METH_VARARGS | METH_KEYWORDS),
+ "Checks if a BB dominates another."},
^^^^ if you use METH_KEYWORDS, the callback function takes 3 arguments
(and you need to use PyArg_ParseTupleAndKeywords() instead):
http://docs.python.org/c-api/structures.html#METH_KEYWORDS
despite needing a cast to (PyCFunction).
This *is* confusing, but note that the plugin's own gcc-with-cpychecker
script actually checks for all of these issues (which may be why I'm
being pedantic about them - sorry!). If you want to try doing so, it's
possible to build the plugin once, then compile it with itself (using
CC=gcc-with-cpychecker as a Makefile variable:
make CC=/path/to/a/clean/build/of/the/plugin/gcc-with-cpychecker
though unfortunately it doesn't quite compile itself cleanly right
now :(
static PyObject *
+gcc_python_calculate_dominance_info(PyObject *n, PyObject *args)
+{
+ calculate_dominance_info (CDI_DOMINATORS);
+ Py_RETURN_NONE;
+}
+
+static PyObject *
+gcc_python_free_dominance_info(PyObject *n, PyObject *args)
+{
+ free_dominance_info (CDI_DOMINATORS);
+ Py_RETURN_NONE;
+}
+
+static PyObject *
gcc_python_dominates(PyGccBasicBlock *n, PyObject *args)
{
- static int calc = 0;
PyGccBasicBlock *a, *b;
-
- if(calc == 0) {
- calculate_dominance_info (CDI_DOMINATORS);
- calc = 1;
- }
-
Looks like you fixed the issue I mentioned above. It may make it easier
to review if you generate a diff combining all of the changes (not
sure).
+static PyObject *
+gcc_python_dominator_sons(PyObject *n, PyObject *args)
+{
+ PyGccBasicBlock *py_root;
+ PyObject *py_son;
+ basic_block root, son;
+ PyObject *result;
+
+ gcc_assert(PyArg_ParseTuple(args,
+ "O",
+ &py_root));
Similar type-safety issues as above (i.e. what if the argument isn't a
basic block?). BTW, if you do this as an instance method of
gcc.BasicBlock (as opposed to a function within the gcc module) then you
get typechecking (on "self") "for free".
+ root = py_root->bb;
+
+ result = PyList_New(0);
+ if (!result) {
+ goto error;
+ }
+
+ for (son = first_dom_son (CDI_DOMINATORS, root);
+ son;
+ son = next_dom_son (CDI_DOMINATORS, son))
+ {
+ py_son = gcc_python_make_wrapper_basic_block(son);
This could fail (e.g. under low memory conditions); you need something
like this:
if (!py_son) {
goto error;
}
At this point, either py_son "owns" a reference on a gcc.BasicBlock:
every path onwards needs to DECREF it, or the object will leak.
+ if (-1 == PyList_Append(result, py_son)) {
Py_DECREF(py_son); /* need one here */
+ goto error;
+ }
Py_DECREF(py_son); /* need another one here */
+ }
@@ -386,6 +421,11 @@ static PyMethodDef GccMethods[] = {
(METH_VARARGS | METH_KEYWORDS),
"Frees dominance info."},
+ {"dominator_sons",
+ (PyCFunction)gcc_python_dominator_sons,
+ (METH_VARARGS | METH_KEYWORDS),
+ "Returns the dominated sons of a bb."},
Is this recursive?
+static PyObject *
+gcc_python_loop_count(PyObject *n, PyObject *args)
+{
+ struct niter_desc *desc;
+ unsigned int result;
+ PyGccBasicBlock *py_bb;
+ basic_block bb;
+ gcc_assert(PyArg_ParseTuple(args, "O", &py_bb));
+ bb = py_bb->bb;
Looks like this would be better as a method of the gcc.BasicBlock,
rather than of the gcc module itself, so that "py_bb" would become
"self".
[I'm sorry, I don't yet fully understand the relevant parts of GCC to
comment further on the rest of this function]
My preference with getters is that if the implementation is a simple
field lookup, it should be an attribute (the "getter" is only implicit,
existing at the C level):
print(bb.loopcount)
whereas if getting the result involves some work, it should be an
explicit method of the class (where the "getter" is explicit at the
Python level):
print(bb.get_loop_count())
It looks like this situation is the latter case.
static PyObject *
+gcc_python_loop_contains(PyObject *n, PyObject *args)
+{
+ int result;
+ PyGccBasicBlock *a, *b;
+ struct loop *A, *B, *step;
+ gcc_assert(PyArg_ParseTuple(args, "OO", &a, &b));
Again, sorry for my lack of knowledge of GCC at this point, but would it
be possible to expose the "loop" type in python form? (would it be a
good idea? I don't know, but it might make it possible to implement
this kind of analysis in Python rather than in C, and it then becomes
much easier to hack on, try new algorithms, etc. But wrapping a new
type does take some effort).
+static PyObject *
+gcc_python_loop_depth(PyObject *n, PyObject *args)
+{
+ PyGccBasicBlock *py_bb;
+ gcc_assert(PyArg_ParseTuple(args, "O", &py_bb));
+
+ return Py_BuildValue("i", py_bb->bb->loop_depth);
}
Probably should be a readonly attribute of the gcc.BasicBlock (as a
getter within the PyGetSetDef table), rather than a function of the
"gcc" module, so that you're dealing with "self".
Sorry if the above came across as negative, thanks again for your work
on this, and I hope the above makes sense.
Dave