Monday, August 17, 2009

Goals Met

Got the doctests all prettified.  They are located in biolib/src/mappings/swig/python/test, and called test_bpp_*.py, where the * is utils, numcalc, seq, and popgen.  I have notes for every class, and also for functions whose use isn't very clear.

Just checked over Hilmar's e-mail with the goals he set for me.  I was slightly mistaken--he wanted 60% of the classes wrapped, with 80% of those tested--not 80% with 60% tested as I thought.  I'll quote him directly:

1. Map for Python with SWIG 60% of classes and templates of all Bio++
modules

2. 80% of all methods in those classes and templates have complete doctests
with a clear description of what the method is supposed to do.

Now, I rechecked my list of objects.  It turns out that there are 416 classes, not 432.  That's because 16 of those are nested classes that SWIG can't handle anyway.  The number I have wrapped is 414--I could never get numcalc/LUDecomposition to work, and utils/BppString was also ill-behaved (and totally useless for this project, anyway).  That's 99.5%. The number I've tested thoroughly kept creeping up and down slightly as I edited my doctests, but it settled out at 208.  That's exactly half of the total that I have completely unit-tested.  So that achieves the 48% (80% × 60%), with a little to spare.
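
A quick sanity check of those percentages, as a minimal Python sketch. The counts come straight from the paragraph above; the variable names are just for illustration.

```python
# Counts taken from the paragraph above; the names are illustrative only.
total_classes = 416   # 432 minus the 16 nested classes SWIG can't handle
wrapped = 414         # everything except LUDecomposition and BppString
tested = 208          # classes with complete doctests

wrapped_fraction = wrapped / total_classes
tested_fraction = tested / total_classes
required_fraction = 0.60 * 0.80   # 60% wrapped, 80% of those tested

print(f"wrapped:  {wrapped_fraction:.1%}")
print(f"tested:   {tested_fraction:.1%}")
print(f"required: {required_fraction:.1%}")
```

The tested fraction comes out at exactly one half, which clears the 48% requirement by two percentage points.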

Am I happy with this?  Yes and no.  I think it's obvious that I managed to put out a lot over the last couple of weeks, and I'm satisfied with my progress.  But it's still short of the goals I set out to do.  Yes, I think those goals were ambitious.  When I set them up, I hadn't looked at the Bio++ code as closely as I should have.  I should have known that doing each library, wrapping and testing, in a week was too much.  But I said I'd do it, and I still feel responsible for it.  So there we go...

I'm also more in touch with Julien and the Bio++ coders, and I think I'd like to switch gears a little now, and maybe work with them to make the code a little more SWIG-friendly.  We'll see how that goes.  I also need to work with Pjotr to get everything incorporated into the whole biolib framework.  So there's still stuff to do.  This autumn I have a teaching gig at a local college, but that's not going to take all of my time.  So I'm going to keep working on bits and pieces of this.  Despite the last month of high stress, I still feel that this project is a good one, with a lot of potential to help scientists do what they need to do, with a minimum of pain.

In the meantime, it's past 3:30 AM, and I'm probably blathering.  So I'll sign off.

שלום

Sunday, August 16, 2009

Not Much to Report

Went through the utils doctest today, adding comments where I thought aspects were unclear.  I'm pretty happy with that document.  Just need to do the other three now...

Also found a place in the doctest where I forgot to demarcate a class.  Thus I have 209 objects, not 208. Woo hoo.  Not that it's a big object--it's just an exception.

Saturday, August 15, 2009

Hilmar's Goal Met...

...at least technically. I'll get to that in a second. Of the 432 objects in Bio++, I have the vast majority of them wrapped (I think all but 3), and 208 of them doc-tested. Hilmar asked me to have 80% of them wrapped, and 60% of those documented.

Managed to uncover some of the formats by doing some fairly involved web hunts. Did Clustal and DCSE. Both of those inherit from a couple other objects, so I could do those too. Also did one of the static "Tools" objects. So yes, I met the quota without "cheating" by unit-testing übersmall Exceptions. (Found out that the Clustal format that Bio++ does is out-of-date, but these can change so fast that that's not totally unexpected. More serious is the off-by-one error that causes it to chop off the first amino acid in a protein.)

Finally figured out why I hadn't been able to post to the various Bio++ message boards, even though I've been getting all their messages. I subscribed with my gmail address, when I was trying to send with my cs.wisc.edu address. It's sorted out now. Actually, I was thinking of doing some coding for them this fall. I'll have some free time, and I think I'll be able to do more good for this project there, working on Bio++ code to make it more palatable for SWIG. At the very least, I'm hoping that I can get all the compilable code into the .cpp files, rather than the .h files. That would make a lot of things easier for SWIG.

So...I think now I've technically met the goals that Hilmar set for me to get a passing grade. However, the doctests are ugly. Basically, they show a ton of examples using the code, with barely any explanation. But I need to make sure that these documents are useful. So...this weekend I'm going to work on beautifying them, so that they might help someone actually use this tool.
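
To make "examples with explanation" concrete, here is a minimal sketch of the style I'm aiming for. The function count_gaps is a hypothetical stand-in written in pure Python; it is not part of Bio++ or the bpp module, and the real doctests exercise the wrapped classes instead.

```python
import doctest


def count_gaps(sequence, gap="-"):
    """Count the gap characters in an aligned sequence.

    Hypothetical stand-in, not part of Bio++ or the bpp module.

    A sequence without gaps gives zero:

    >>> count_gaps("ACGT")
    0

    Gaps are counted wherever they occur, including at the ends:

    >>> count_gaps("-AC--GT-")
    4
    """
    return sequence.count(gap)


if __name__ == "__main__":
    # Runs every example embedded in the docstrings of this module.
    print(doctest.testmod())
```

The point is that each example is preceded by a sentence saying what it demonstrates, so the file reads as documentation and still runs as a test.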

Friday, August 14, 2009

203 (So Close...)

Wanted to finish today, but didn't quite make it.  Did 15 objects today. Many of those were quite large, in fact (hence not quite making it).

I also managed to address the const vector problem. I managed to get constructors that returned null objects, but otherwise didn't crash. (At least, until you tried to use the null object--then it's segfault city.) It turns out that when you use %extend to create a new constructor inside of SWIG, it's not really a constructor. It's more of a pseudo-constructor, that looks like a constructor from inside of your scripting language. Thus, you need to explicitly return something--hence my null value. Once I added that in, it was golden.
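
A loose analogy in pure Python, for anyone who hasn't hit this: a factory function that builds an object but never returns it. This is not SWIG code, and SiteContainer here is a hypothetical stand-in class, not the real bpp object. In Python the failure is a None rather than a segfault, but the cause is the same kind of missing return.

```python
class SiteContainer:
    """Hypothetical stand-in class, not the real bpp object."""

    def __init__(self, names):
        self.names = list(names)


def make_container_broken(names):
    # Builds the object but forgets to hand it back, so callers get None.
    SiteContainer(names)


def make_container(names):
    # A factory-style "constructor" has to return the object explicitly.
    return SiteContainer(names)


broken = make_container_broken(["seq1", "seq2"])
working = make_container(["seq1", "seq2"])
print(broken, working.names)
```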

So what objects am I going to unit test for my last 5? Ideally, I'd like to do more file-formatting stuff, which Bio++ uses to read and write files of specific types. Unfortunately, I haven't been able to find example files for most of those, and the descriptions I've found have been lacking. I asked Julien and company if maybe they had some example files--after all, they had to debug them using something. But, I haven't heard anything back.

Option #2 would be to just knock off some Exception objects inside the phyl library. This seems like cheating, somehow. I mean, Exceptions are quick and easy, and I could have my 208 objects within 20 minutes. But having those unit tested doesn't buy us that much.

Probably, I'm going to slog through the few functions remaining in seq, which involves a large number of static utility functions that do random things. Those will be slow to do, but it seems more responsible. After all, if I didn't have this deadline looming over me, that's what I'd do next. Having those tested is really worth something.

Thursday, August 13, 2009

188 Objects (of Beer on the Wall)

Did 28 today. Part of that is because I started working with popgen. Part is because I had a breakthrough with seq, that let me do several more objects there. The seq library is almost all tested, in fact.

Spent a couple of hours hacking around with the const/vector problem. Got a suggestion that I should try to create an extension to the VectorSiteContainer object, that would take vectors that aren't const. I tried several things...none of them worked. Not sure why--looking at the code, it seems that we don't need to store the original address of the vector anywhere. And yet, the vector is passed by reference. I really don't know why they decided to do that.

Tried to post to the Bio++ boards again, but I don't think it got through.

So...I'm at about 43.5% unit tested.  Hilmar's goal is for 48% to be done. I have 20 to go. I'm going to try to finish that up tomorrow, and then spend the weekend going over my doctests and expanding them, so that they're human-readable documents. Fact is, Bio++ needs much better documentation. And given that I've now deduced about half the methods (often the hard way), I might be the best person for the job.

Wednesday, August 12, 2009

Only 160

Six objects today. Not a good day.

Spent a couple of hours hacking around with VectorTools. This ended up not being a good idea. Well...yeah, I got a lot of it tested. And it is an important class. But it turns out that sections of the thing give SWIG indigestion when it tries to compile the wrapper files. This seems to be a really solid reason to put your code in the .cpp files, not the .h files. If these bits of code were already compiled, tucked away in their object files, I wouldn't be having these problems. Instead, it's in the header, where it gets entangled with SWIG, and SWIG starts having problems with certain templates.

And then I spent a fair amount of time today working with VectorSiteContainer. The main constructor to that requires a const vector as one of its arguments. I tried for a very long time to get SWIG to wrap such an object, with no luck. The closest I could get was a plain (non-const) vector, which isn't good enough. Every time I tried to add that "const" in there, the SWIG-generated wrapper file wouldn't compile. Finally, very late at night, I threw a request for help to the swig mailing list (and also CC'ed to the phyloinformatics people), but I don't expect a response soon.

Update--message came in, as I was writing that last paragraph. It turns out that this is a known bug in SWIG. Which means that it's not going to be fixed this week, which means that there are 27 classes in seq that I won't be able to test.  So...I'm going to have to move on to phyl and popgen.

One other piece of info, though. According to Bio++ docs, there are 432 objects. Multiplying this by 0.48 yields a little less than 208, so that's my goal. That means that I now have 48 to go.
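
The goal arithmetic, as a small Python sketch. The numbers are from this post (432 objects, a 48% target, 160 tested so far); the variable names are made up for illustration.

```python
import math

total_objects = 432      # object count according to the Bio++ docs
target_fraction = 0.48   # 80% x 60%
tested_so_far = 160      # the count in this post's title

# 432 * 0.48 is about 207.36, so round up to the next whole object.
goal = math.ceil(total_objects * target_fraction)
remaining = goal - tested_so_far

print(goal, remaining)
```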

Tuesday, August 11, 2009

154!

Did 5 yesterday (Sunday), and 37 today. So today was a good day. However, I fully expect things to slow down soon. I'm dealing with some issues that are segfaulting on me, and I don't know why.

Also finally managed to find a solution to the input stream problem, thanks to folks on the SWIG mailing list. This involves writing an interface for std::ifstream, whose constructor only needs the name of the file. Suddenly you've got an input stream, in a form that inherits from istream. It's really a very elegant solution.

If I'm lucky, I'll have the minimum number of objects tested by the end of Wednesday. Thursday or Friday is probably more realistic, though.

Sunday, August 9, 2009

Up to 112...

Another late night, and I'm a little more than half-done. Found a couple more bugs, of course.

Finally exchanged some e-mail with Julien, the principal coder on the Bio++ project. My e-mails to the biopp dev list were getting lost in cyberspace somewhere. Not sure what to do about that. Hopefully though, Julien can answer my questions (such as where I should submit bug reports).

Saturday, August 8, 2009

92 Objects Unit-Tested

I've backtracked a bit on my counting method. Before, if a new object to be tested seemed to be abstract, I just called it abstract and counted it toward my total. This made me a little nervous though, and I realized that it also meant duplicated testing. That is, if Object1 and Object2 both inherit from abstract ObjectA, I would be testing ObjectA's functions twice. So I've redone things, so that I use say Object1 to test ObjectA's functions, under ObjectA. So there are a few abstract objects that have been removed from my total--but I think the count is more honest this way.
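
A minimal Python sketch of the counting scheme, using the same placeholder names as above. The classes and their methods are invented for illustration and have nothing to do with the actual Bio++ hierarchy.

```python
import doctest
from abc import ABC, abstractmethod


class ObjectA(ABC):
    """Abstract parent; its shared method is tested once, via Object1.

    >>> Object1().describe()
    'Object1 with 3 items'
    """

    @abstractmethod
    def size(self):
        """Each concrete child supplies its own size."""

    def describe(self):
        # Shared behaviour that lives in the abstract parent.
        return f"{type(self).__name__} with {self.size()} items"


class Object1(ObjectA):
    def size(self):
        return 3


class Object2(ObjectA):
    def size(self):
        return 5


if __name__ == "__main__":
    # Runs the example in ObjectA's docstring; silent when it passes.
    doctest.run_docstring_examples(ObjectA, globals())
```

ObjectA can't be instantiated directly, so its doctest borrows Object1. The tests for Object1 and Object2 then only need to cover what each child adds.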

92 are done, after 5 days. 30 of those are from utils, done in the first 3 days. 62 are from numcalc. At this rate, I'll have ~250 objects after my 14 days have passed.

Tomorrow, I think I'm going to stop working with numcalc, since most of the things in there don't get seen by end-users, anyway. I'm going to start with seq, where some of the real meat is.

Friday, August 7, 2009

Guess What 4*4 Is?

Recounted the number of classes I've done, and then I did a fair number today.  Current count: 84 (though some of those are abstract, and don't need direct wrapping). I don't know if things will speed up or slow down after this. NumCalc is yielding some interesting little features, that are proving interesting to wrap.

I've found at least 5 bugs in Bio++. Most recently, it tried to tell me that 4^2 = 15. I checked this out by coding in C++, without SWIG. Yup--not a SWIG bug. NumTools.pow(4,2) = 15. (I believe the actual calculation is done with doubles, and that it's a rounding error.) I don't know where to report this. I've written the Bio++ dev mailing list, and gotten back zilch in response. I don't know if they're ignoring me, or if my messages just haven't gotten through.
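
If the guess about doubles is right, a result like 4^2 = 15 can arise as follows. This is a sketch of the general mechanism only: I haven't confirmed how NumTools computes its powers, and the value below is picked by hand for illustration.

```python
# The largest double below 16. An exp/log style power routine can land on
# a value like this instead of exactly 16.0. Chosen for illustration; it
# is not taken from the Bio++ source.
almost_sixteen = 16.0 - 2.0 ** -49

truncated = int(almost_sixteen)   # conversion to int drops the fraction
rounded = round(almost_sixteen)   # rounding to nearest recovers 16

print(truncated, rounded)   # prints: 15 16
```

So a double-to-int conversion that truncates instead of rounding would be enough to turn 16 into 15.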

In the meantime, these bugs are slowing me down. Every time I find one, I need to check that it's not my own idiocy with SWIG that's causing problems. That's aggravating. I have a lot of money on the line, and I'm being slowed down by careless bugs. (Of course, I can't be too critical. Bugs are part of the Game, and I have to recognize that I'm under a lot of stress right now.)

Thursday, August 6, 2009

Utils Done, Working on NumCalc

Stuck on the Matrix object right now, and it's not pretty. There are strange bugs that I can't pinpoint, such as that right now double-matrices can compile, but int-matrices can't. It's the same code--just templated. Grrr.

Current score: ~40 objects thoroughly unit tested, over the last 3 days. Extrapolating, that means that after 14 days I'll have 187 done. There are 446 total objects, and 48% of that is 214. I'm going to be working solid during this time.
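
The extrapolation, spelled out as a small Python sketch. The inputs are the numbers in the paragraph above; the variable names are illustrative.

```python
tested = 40          # objects thoroughly tested so far (approximate)
days_elapsed = 3
days_total = 14
total_objects = 446
target_fraction = 0.48   # 80% x 60%

# Simple linear projection: same daily rate for all 14 days.
projected = round(tested / days_elapsed * days_total)   # 186.67 -> 187
goal = round(total_objects * target_fraction)           # 214.08 -> 214
shortfall = goal - projected

print(projected, goal, shortfall)
```

At the current rate the projection falls 27 objects short, hence the need to speed up.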

Wednesday, August 5, 2009

Utils Almost Tested

Well, today I managed to do about 15 objects within utils (much more than that if you count every kind of exception to be a different object). So by simple geometric progression, that means that tomorrow I'll do around 225, on Thursday I'll do 3375, and so on...

Seriously, this is getting faster, and I've found some important bugs and usage issues. (For example, I now have Python IDing many more templated objects, and making meaningful interfaces into them.)  However, there's still a ton of work to be done. I'm honestly not sure if I'm going to make the 60% of 80% goal that Hilmar set for me. (That is, 80% of objects implemented, and 60% of those properly tested. Thus, 48% of the total number of objects should be implemented and tested.) These coming two weeks are going to be rough.

Tomorrow I finish utils, and start with numcalc. If I can possibly get through seq by the end of the weekend, I'll be in good shape.

Tuesday, August 4, 2009

Python Doctesting

The good news--testing hasn't revealed any bugs in the underlying code. SWIG seems to be doing its job admirably.

The bad news--I managed to thoroughly test exactly one object today. Granted, it's the biggest object within utils: TextTools. And it was my first. However, given that it took all day--this really is going to be a race to the finish.

Monday, August 3, 2009

Today Was A Good Day

I had forgotten how fun it is to code when everything works right out of the box. Today was such a day. It took me about an hour or so to get utils working in Perl.  numcalc was a little more difficult, but it followed just fine. So then I tried R, knowing it to be temperamental. SWIG documentation for R is pretty spotty, since I don't think anyone uses it except for academics. But I got everything compiled before too long. The difficult bit was figuring out how to actually call my functions, but I figured that out by reading raw machine-generated R code. Next was Ruby, and that just sailed along smoothly.

So, I now have utils working in 4 languages:  Python, Perl, Ruby, and R. Java doesn't work, and I still have my doubts about it because it doesn't wrap as neatly (because Java's not a scripting language). numcalc works in Perl, and all 5 libraries work in Python. Libraries that haven't been wrapped yet in those languages should be very easy to do.

So then of course Pjotr had to write to ruin my fun. (I exaggerate...I'd already come to the same conclusion he did.) What needs the bulk of my attention right now is the unit testing and documentation. So that's what I'm going to start with in the morning. I read through a lot of Xin's work this evening, and we'll see how much of that I can incorporate.

Sunday, August 2, 2009

Git & Java

Spent some amount of time this evening working with Pjotr, trying to get my cmake files up to standards. A lot of this involved git--git still frustrates me, but I do learn.

Also worked with wrapping files in Java. I'm unsure if Java will really work. As near as I can tell, SWIG's performance in Java is quite different from the way it does scripting languages. And one of the things that's causing a lot of problems is multiple inheritance. SWIG starts screaming bloody murder whenever I try to wrap some object that inherits from multiple other objects. It copes by just not inheriting from more than one parent object--other parents are just ignored. Multiple inheritance is used a lot in Bio++. Hence, the Java environment will be lacking several methods, and also not able to perform some basic assignment operations. Is it even worth having it available in Java, if the library will be severely crippled?
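
To show what gets lost, here is a pure-Python sketch of a class with two parents next to the same class with the second parent dropped. All class and method names are hypothetical; this only mimics the effect described above and is not generated by SWIG.

```python
class Sequence:
    """Hypothetical first parent."""

    def length(self):
        return 4


class Commentable:
    """Hypothetical second parent."""

    def comment(self):
        return "no comment"


class FullSite(Sequence, Commentable):
    """What the C++ class looks like: both parents present."""


class SingleParentSite(Sequence):
    """What is left if only the first parent is kept."""


full = FullSite()
crippled = SingleParentSite()

print(hasattr(full, "comment"), hasattr(crippled, "comment"))
print(isinstance(full, Commentable), isinstance(crippled, Commentable))
```

The single-parent version is missing the second parent's methods, and it no longer counts as an instance of that parent, which is the sort of thing that would break assignments to a variable of the second parent's type.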

I think tomorrow I'm going to try Perl. Hopefully, that'll be highly similar to the Python I've already done.

Friday, July 31, 2009

Significant Milepost

I have a functional module, called bpp, that incorporates all five libraries from Bio++. I've played with it in Python, and it seems to all work fine.

Later this evening and this weekend, I start playing with other languages, and testing.

So Close...

Got PhylLib working. (That is, it runs within Python with no obvious errors.)

Turned my attention to PopGenLib. I got it all compiling, except for one object: CoordsTools. That has mysterious compiling errors that are annoying me.

Here's what I know about the object:

  • It's defined within CoordsTools.h, and there's no CoordsTools.cpp.
  • No other file #includes it.
  • Because of these two facts, it doesn't actually get compiled with anything, when one compiles PopGenLib. Accessing the object would require one to specifically #include CoordsTools.h.
  • It's pretty much a static object--the only function of import is one to calculate the distance between two different "coordinates" that are of templated type.
  • It hasn't been modified in just over 5 years.
  • All the errors I'm having seem to be errors within the .h file itself. There are no issues I can see with the wrapper file.
  • Bio++ documentation doesn't even list it as a proper member of the bpp namespace.

Given all of this, I'm guessing it's an orphaned file. It may be that it didn't get included with some of the project's recent bug fixes, to allow the files to be compiled under current versions of gcc (or to be part of the bpp namespace). In any case, I'm seriously inclined to ignore the stupid thing. We're talking about one function, that's not called anywhere else.

So, with this exception...I think I'm almost done with writing these .i files. I need to put them all together into a single python object, but that should be easy to do. I'll do it in the morning, when my head is clear.

After that...I start the serious bug testing.

Thursday, July 30, 2009

Bug in Bio++

I'm pretty sure that that's the reason for the failure to compile. There are two overloaded functions in TreeTemplateTools--one is called searchNodeWithId, and the other is searchNodeWithName. They're both pretty similar. Each one has two versions. One returns a list of Nodes, the other returns void and has the list of Nodes as one of the arguments, returning the value by reference.  Except...the argument-returned searchNodeWithName calls the return-value-returned searchNodeWithId, and it does it using the name as an argument. I believe this was just cut-and-pasted from the argument-returned searchNodeWithId.

The code is in TreeTemplateTools.h. If I modify it using the above assumption (that one version of searchNodeWithName should call the other version of the same), everything compiles wonderfully. However, I'm reluctant to call a bug report into the project, since I've only been looking through this code for a few months. I sent a query to the mailing list, asking if this is indeed a bug. Hopefully someone will get back to me.

Anyway, I managed to get everything in PhylLib to compile. There are warnings, but I believe that they're fine. A lot of the functions call one particular function, that throws exceptions. And it throws a more general exception before a more specific one. But I looked through the code, and there seem to be good reasons for doing it this way. However, it does mean that the wrapper C++ files complain vociferously, since this appears to be the "wrong way around." I believe it's fine.
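
For what it's worth, this is why the compiler considers that order "the wrong way around". The sketch below is a Python analogy with made-up exception names, and it assumes the warnings are about handler order: a handler for the general type, listed first, swallows the specific type too.

```python
class GeneralError(Exception):
    """Hypothetical general exception."""


class SpecificError(GeneralError):
    """Hypothetical exception derived from the general one."""


def handle(exc):
    try:
        raise exc
    except GeneralError:
        # Listed first, so it also catches every SpecificError.
        return "general"
    except SpecificError:
        # Never reached: the handler above already matched.
        return "specific"


print(handle(SpecificError()), handle(GeneralError()))
```

Both calls end up in the general handler, so nothing breaks as long as the general handling is acceptable for the specific case as well.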

Last step with PhylLib is to make sure that it runs within Python. That should be straight-forward. Basically, I just need to make sure that every function binding that should be defined, is. This means that I'll have to slightly modify the interface files for some of the static functions (basically, anything called *Tools.i), but it should be pretty easy. Then, on to PopGenLib. Hopefully that'll go much more quickly, since it has 1/3 the number of .h files (and therefore SWIG .i files).

Wednesday, July 29, 2009

58 Files Compile

Of the 103 wrapper files generated by SWIG, I've managed to get 58 to compile. The rest rely on Tree objects, with which I've had trouble. One issue--one of the objects (TreeTools) has a function that returns a structure whose definition is nested within the TreeTools class. SWIG doesn't support this, though it's on the list to be supported in a future version. I think the best thing to do there is to just remove that structure and the one function, and wait for the future version.

Otherwise, I'm having issues with templating. I have one templated function that calls another in the file TreeTemplateTools. And for some reason, SWIG isn't able to find the inner function. I've posted to both the SWIG and GSoC mailing lists, and hopefully I'll get an answer.

Pjotr also asked me to move the SWIG interface files to a biopp subdirectory, to mimic the directory structure of the other projects. This was easily done.

Tuesday, July 28, 2009

PhylLib Interface Files

103 of them, to be exact. Automated as best I could, but of course I had to modify every one of them by hand. However, they all run under SWIG without error (except for possible memory leaks for a few of them, but that's probably unavoidable). Next step is to compile all the generated C++ files, but that shouldn't take as long. PhylLib should be all SWIGgy by tomorrow evening.

Monday, July 27, 2009

SeqLib Done

I'm getting better and better at this. The SeqLib library has been SWIGified. I'm running it right now in Python, without any errors that I've been able to find.

As part of getting this done, I had to fix the Matrix.i problem within the NumCalc library. It was convoluted, but basically I had to explicitly define a SWIG %rename for a vector to a vector.

Of course, there is one small change in the Python version of SeqLib that needs to be mentioned. Python doesn't support operator overloading in which the overloaded function isn't a part of the class--and there were two functions like that within the Site object of the SeqLib library. After playing with them for a while, I finally decided to just not include them.
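
The underlying constraint, sketched in pure Python. Here Site is a hypothetical stand-in class and == is used purely as an example operator; I'm not claiming these are the two operators in question. The last step shows one possible workaround, which is not what I did for SeqLib.

```python
class Site:
    """Hypothetical stand-in, not the wrapped bpp Site."""

    def __init__(self, content, position):
        self.content = content
        self.position = position


def sites_equal(a, b):
    # A free function, playing the role of a non-member C++ operator==.
    return a.content == b.content and a.position == b.position


# Python only looks for operators on the class, so the free function is
# ignored and == falls back to comparing object identity.
before = Site("ACGT", 1) == Site("ACGT", 1)

# Attaching the function to the class makes == use it.
Site.__eq__ = sites_equal
after = Site("ACGT", 1) == Site("ACGT", 1)

print(before, after)   # prints: False True
```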

So...this coming week I try to do the final two libraries: PhylLib and PopGenLib. Since both of them are smaller than SeqLib, I hope to have them done pretty quickly.

Saturday, July 25, 2009

I'm Back

Back home yesterday afternoon. Slept a lot, but now I'm back in action. Watching myself much more carefully this time, to make sure that jet lag doesn't turn into an infection, again.

I managed to work for a couple of hours each day I was in Japan, trying to finish up numcalc. Got almost all the interface files working just fine. Two exceptions. First, LUDecomposition.i doesn't work. SWIG seems to run fine, but then it gives me some scoping errors when I try to compile the resulting wrapper C++ files. Tried several variants, but I couldn't get it working. It's a minor issue though--it doesn't seem to be necessary for any of the other wrappers. Second exception is Matrix.i, which I'll touch on in relation to the next problem.

Got all of the interface files for seqlib put together. Running through them now, though a couple still have bugs to work out. Some of it is leading back to Matrix.i inside of numcalc, that I never quite got working. That is, it all compiles, but I'm having trouble wrapping the templates. The RowMatrix object inherits from a vector of vectors. Really not pretty at all, and I'm worried that SWIG can't handle it. This problem's going to need to go to the SWIG mailing list, I'm afraid.

Saturday, July 18, 2009

Special Japan Edition

I'm writing from Kawaguchiko, at the base of Mt. Fuji. Work officially demanded my attention today...with a vengeance.

What I've been trying to do with SWIG might be outright impossible. It has severe problems when trying to inherit from a standard, templatable object.

Unlike other SWIG directives which should be defined prior to reading in a header file, you need to put %template after the *template* class is declared but *before* that template class used. Which creates a problem if the template class is created in the same header file where it is used (and you don't own the header file). I have submitted a bug/enhancement request on this:

http://sourceforge.net/tracker/?func=detail&aid=2502006&group_id=1645&atid=101645

This is still a big issue for me given that I'm wrapping a few thousand classes of which a number of them have this problem. I think that ultimately I'll need to familiarize myself with the SWIG source code and fix this myself (unless other SWIG developer with free time wants to pick this up :)).

Thanks,
-James

The workaround might be making my own vector .i file. This is not exactly pleasing, but it seems it might be the only way to do this.

Otherwise, just spent a few hours ironing out the wrinkles in NumCalc. Managed to get some more .i files to compile with SWIG. But the big problem is still ParameterList, which a ton of things use, and which inherits from vector. This is not going to be pretty.

Tuesday, July 14, 2009

NumCalc

Started the annoying process of going through all the automatically generated .i files, and modifying them so that they run with SWIG without complaint.  It mostly went pretty well, except for the ParameterList object.  That one is called by many others, and unfortunately it doesn't work.  The problem is that it inherits from vector, which seems to confuse SWIG.  It doesn't matter how I try to instantiate things specifically--it always tells me that I forgot to instantiate vector.  I'm afraid that without this, nothing else will work.  I've posted to both the SWIG mailing list and the NESCent SoC mailing lists, but so far nothing.

Flight out for my real summer vacation is in 8 hours. Hopefully, I'll get a bit of work done over this time.

Monday, July 13, 2009

I'm Back

Managed to put in a full workday today, at last. (I've been working in fits and starts over the last few days, but also fighting off a bacterial infection in my lungs. ISMB travel -> jet lag -> contracted a cold & had a depressed immune system -> my new prokaryotic friends decided to colonize me. Fuckers.)

Anyway...here's what's up. I have a package for Python called bpp. bpp contains everything in Utils (minus a couple of objects such as BppString, which the user doesn't need, anyway). As near as I can tell, it works. Haven't done extensive unit testing, but what little I have played with all functions beautifully.

So I started working on NumCalc. Found that there were many more header files I needed to convert to SWIG .i files. And here's where things began to get interesting. SWIG needs these .i interface files in order to define the hooks into the underlying C++ libraries. Basically, these are just the header files with a few minor modifications. Well, they would be except that for some reason the biopp .h files contain an awful lot of source code. I don't know why--maybe there's a good reason that Julien et al. did that. But from my perspective, it means a lot more work on my part, trying to edit everything down so I just have the prototypes of the functions. Seriously...there are thousands of lines of actual algorithmic source code in the header files. I have never seen this be a good idea.

Anyway, today's big accomplishment was to automate a lot of the procedure. Thanks to that, I have every .i file for NumCalc completed, though not tested.

Wednesday, July 8, 2009

Sick...

Fighting off a cold. The timeline's about right to put this as something I picked up on the plane back. Hopefully I'll be more able to concentrate tomorrow.

Did manage to get some work done, during my moments of lucidity. I now have a python module called bpp, that includes every object of utils except BppString, BppVector, KeyvalTools, and ApplicationTools. The first two I think are unnecessary anyway. The last two are having some linking problems that I'm trying to sort out. They compile just fine, it's just that when you try to import them, the thing complains.

Monday, July 6, 2009

I'm Back

Back from Stockholm, which was well worth the trip. Got to discuss actual research, and finally got to meet Pjotr and Hilmar face-to-face.

This Sunday evening I spent trying to get every object within utils into a single library, which I'm calling bpp.  This will hopefully be a single element to run/module to include/whatever.  Right now though, there are four .i files that I'm having a hard time compiling:  ApplicationTools.i, BppString.i, Font.i, and FontManager.i.  Except for those guys, Utils might even be almost complete!

Friday, June 26, 2009

Success!

As of today, I have officially made a working SWIG representation of a Bio++ object. (And it's about bloody time.)

I was having some linking errors, so I spent some time today rereading SWIG docs (specifically the bits about Python). Turns out that the errors I was having were mentioned explicitly in them. When I linked together the .so file, I needed to make sure that all its dependencies were there.

So here are the commands I used:

$ swig -python -c++ -I../../../../contrib/biopp/utils/Utils/ Exceptions.i
$ g++ -c -fpic Exceptions.cpp
$ g++ -c -fpic TextTools.cpp
$ gcc -c -fpic Exceptions_wrap.cxx -I/usr/include/python2.6/
$ g++ -shared Exceptions.o Exceptions_wrap.o TextTools.o -o _bpp_exceptions.so

Exceptions depends on TextTools, hence the inclusion of that object file.

I'm not certain that -fpic is necessary--some of the SWIG docs indicate that this is only for some specific platforms, but not Linux. It certainly doesn't hurt things, though.

I think things will be easier if I can manage to put everything in utils into a single object file.

Wednesday, June 24, 2009

Happy Dance!

It's working! At least, the tiny little C++ program I had written is working. Basically, I needed to include the proper local .h files in the .i file's headers, but only within the %{ and %} delimiters. That way SWIG doesn't choke on them, but they're still included (verbatim) in the created .cxx file.

Now we try this on the actual Bio++ files...

Still Not Working...

Still lots and lots of compiler errors when I try to compile the wrapper function--even though SWIG itself runs without any errors at all.  They look like scoping and syntax errors, but I'm not sure what's causing them.  I've been working with a very simple C++ file, with a single object, and no namespace issues.  Using that, I get the same errors as when I try to compile the Bio++ wrappers.

One thing I have discovered--this is language invariant.  It doesn't matter if I try to use Java, Python, etc.--they all get the same errors.

Reread Chapters 5 and 6 of the SWIG docs today, very carefully.  Found very little of any relevance. Finally posted a note to the SWIG mailing list. I feel like an idiot doing so, but this needs to get done.

Thursday, June 18, 2009

Convention

Spent most of today at the open source convention, thanks to Ellen and Leslie at Google. Managed to spare a couple of hours to back up a bit (again) and start exploring how namespaces interact with SWIG.

Wednesday, June 17, 2009

%#&*@!

And things were going so smoothly.

The xxx_wrap.cxx files that SWIG makes do not compile. I get about two pages' worth of errors, some of which seem to be syntax errors. This makes me unhappy, to say the least. What kind of tool outputs code that can't compile?

Back up a bit...I have to assume that I did something wrong in the .i file. The errors seem to indicate that this might be a namespace issue. (And unfortunately, namespaces are new enough that they weren't teaching them when I learned C++.) It's telling me that classes I try to compile aren't part of the bpp namespace. Looking through the original source, it seems that might be correct. All the .cpp files indicate that we're using the bpp namespace, so that we can access functions in it without having to precede everything with bpp::. But nothing actually seems to be in it. On the other hand, the function prototypes are in bpp in the .hpp files.

Honestly, I'm ready to just try to rip out all references to the namespace from the original source, and see if it works. I'm very unhappy with this--I'd hoped that I was all ready to roll, and to get some serious work done. Instead, I'm thinking that this is going to take me another week to get through. I'm not fulfilling what I've promised to do, and that infuriates me.

Tuesday, June 16, 2009

Operator Issue

Just solved something that's been plaguing me for a week. I figured out why the example code in the SWIG docs wasn't working with operator overloading--namespaces. The docs showed %rename and %extend being used with little difficulty, but all their examples were small enough that they didn't use namespaces. And Xin's notes said to use %rename toward the beginning of the file. No effect...but as soon as I tried putting these statements inside the bpp namespace, it all worked. (And it only took me several hours to figure out...)

Only one issue is left--the "incomplete" string class. I've sent another note to the SWIG mailing list, this time without the problem being buried way down in the note. Hopefully someone will see it.

Going to take a dinner break right now, then come back to this later.

Still SWIGing

Here are the warnings that are left:

ApplicationTools.i:21: Warning(454): Setting a pointer/reference variable may leak memory.
ApplicationTools.i:22: Warning(454): Setting a pointer/reference variable may leak memory.
ApplicationTools.i:23: Warning(454): Setting a pointer/reference variable may leak memory.

BppString.i:13: Warning(402): Base class 'string' is incomplete.
/usr/share/swig1.3/typemaps/std_string.swg:17: Warning(402): Only forward declaration 'string' was found.

Number.i:19: Warning(362): operator= ignored

RGBColor.i:43: Warning(389): operator[] ignored (consider using %extend)
RGBColor.i:44: Warning(389): operator[] ignored (consider using %extend)

I'm inclined to ignore the possible memory leak in ApplicationTools. I still need to figure out the operator issue in RGBColor and Number. And then the BppString issue--I have no idea. My thought is that it's an issue with the SWIG library on which everything relies.
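
For context on the operator[] warning: Python has no operator[] to map onto directly, because indexing goes through the __getitem__ and __setitem__ methods, which is presumably why SWIG suggests adding something with %extend. The sketch below is plain Python, not SWIG output. The class RGBColorSketch and its three-channel layout (0, 1, 2 for red, green, blue) are hypothetical and only show the behaviour a wrapped RGBColor would need to expose for color[0] to work.

```python
class RGBColorSketch:
    """Pure-Python stand-in for a wrapped colour class (hypothetical)."""

    def __init__(self, red, green, blue):
        self._channels = [red, green, blue]

    def _check(self, index):
        # Only the three channel positions are valid indices.
        if index not in (0, 1, 2):
            raise IndexError("channel index must be 0, 1, or 2")

    def __getitem__(self, index):
        # Plays the role of the reading form of C++ operator[].
        self._check(index)
        return self._channels[index]

    def __setitem__(self, index, value):
        # Plays the role of the assigning form of C++ operator[].
        self._check(index)
        self._channels[index] = value


if __name__ == "__main__":
    color = RGBColorSketch(255, 128, 0)
    color[2] = 64
    print(color[0], color[1], color[2])
```

Raising IndexError for out-of-range positions also lets Python iterate over the object with a plain for loop, since iteration through __getitem__ stops at the first IndexError.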

Have some information from Xin about the operator issue, and I have a couple of things to try. (And in any case, I'm not really sure I need to do anything here--if someone can't access Utils from Python or Perl, I don't think it's much of a loss.) That leaves the base 'string' class problem. Have posted to the SWIG mailing list, but have gotten nothing from them. (That might be because I did it as an addendum to an earlier question of mine, that ended up being a dumb question.)

Monday, June 15, 2009

Sunday Work

Did work today, because I missed some of the last work week due to being sick.

Now have .i files for everything in utils. Most of them compile. The ones that don't seem to have some pretty uniform problems, that I need to look into. For example, things that refer to Exceptions.h have problems. This is because Exceptions.h is a local file that refers to the standard C++ exception object. So I get Exceptions.i to compile no problem (by telling it to include the C++ version it inherits), but other .i files don't know where it is. I could tell them explicitly, but I get the feeling that there's some better way to do it. (Perhaps by using the %import keyword rather than %include?)

So yeah, was happy to do some *real* coding, instead of just mucking about with make files.

Friday, June 12, 2009

Interface Files

Lots more reading today, of SWIG docs.

The main concrete thing I did was to succeed in making a SWIG .i file from scratch, and then I used it to get to where I was yesterday--getting SWIG to work (albeit with warnings). And then I managed to figure out how to make it pay attention to dependencies and inheritance. This was a little tricky for the STL, but I found how to do it (there are special .i files that ship with SWIG, that you have to include).

However, now I'm getting this pair of warnings for BppString.i:

BppString.i:13: Warning(402): Base class 'string' is incomplete.
/usr/share/swig1.3/typemaps/std_string.swg:17: Warning(402): Only forward declaration 'string' was found.

And there seems to be very little online with regard to this. Finally ended up shooting off a question to the SWIG mailing list, hoping one of them would have an answer.

Regardless, I think I know how to make .i files now. (Part of that is from reading through Xin's code.) And that represents actual coding (of a sort), so I feel I can actually move forward.

Pjotr also mentioned that Xin dealt with the operator warning messages, so hopefully he's already found an answer of some kind.

Thursday, June 11, 2009

Baby Steps

So Pjotr recommended that I back up, and try to use SWIG on tiny bits of the code. At the same time, I realized that having a 35K-line .i file was insane, when "based" off a 300-line .h file (especially when about half of those 300 lines are comments). Xin has been making his own .i files, so I decided that I'd have a go at that.

And then, looking through SWIG docs, I find that that's not even necessary for some very simple .h files. There's a -module option that lets you go directly off of the .h file. So...can't hurt to try it. Here are the results, by header file (in utils). In the cases where it compiled sans error, I just left it blank:

AttributesTools
ApplicationTools
 ApplicationTools.h:96: Warning(454): Setting a pointer/reference variable may leak memory.
 ApplicationTools.h:100: Warning(454): Setting a pointer/reference variable may leak memory.
 ApplicationTools.h:104: Warning(454): Setting a pointer/reference variable may leak memory.
BppString
 BppString.h:57: Warning(401): Nothing known about base class 'string'. Ignored.
 BppString.h:57: Warning(401): Nothing known about base class 'Clonable'. Ignored.
BppVector
Clonable
ColorManager
 ColorManager.h:109: Warning(401): Nothing known about base class 'ColorManager<>'. Ignored.
 ColorManager.h:109: Warning(401): Maybe you forgot to instantiate 'ColorManager<>' using %template.
ColorSet
ColorTools
DefaultColorSet
 DefaultColorSet.h:54: Warning(401): Nothing known about base class 'AbstractColorSet'. Ignored.
DvipsColorSet
 DvipsColorSet.h:55: Warning(401): Nothing known about base class 'AbstractColorSet'. Ignored.
Exceptions
 Exceptions.h:59: Warning(401): Nothing known about base class 'exception'. Ignored.
FileTools
Font
 Font.h:55: Warning(401): Nothing known about base class 'Clonable'. Ignored.
FontManager
 FontManager.h:107: Warning(401): Nothing known about base class 'FontManager<>'. Ignored.
 FontManager.h:107: Warning(401): Maybe you forgot to instantiate 'FontManager<>' using %template.
 FontManager.h:164: Warning(401): Nothing known about base class 'FontManager<>'. Ignored.
 FontManager.h:164: Warning(401): Maybe you forgot to instantiate 'FontManager<>' using %template.
GraphicDevice
IOFormat
KeyvalTools
 KeyvalTools.h:56: Warning(401): Nothing known about base class 'Exception'. Ignored.
MapTools
MolscriptColorSet
 MolscriptColorSet.h:53: Warning(401): Nothing known about base class 'AbstractColorSet'. Ignored.
Number
 Number.h:75: Warning(362): operator= ignored
PGFGraphicDevice
 PGFGraphicDevice.h:58: Warning(401): Nothing known about base class 'GraphicDevice'. Ignored.
RColorSet
 RColorSet.h:53: Warning(401): Nothing known about base class 'AbstractColorSet'. Ignored.
RGBColor
 RGBColor.h:117: Warning(389): operator[] ignored (consider using %extend)
 RGBColor.h:128: Warning(389): operator[] ignored (consider using %extend)
 RGBColor.h:60: Warning(401): Nothing known about base class 'Clonable'. Ignored.
StringTokenizer
SVGGraphicDevice
 SVGGraphicDevice.h:58: Warning(401): Nothing known about base class 'GraphicDevice'. Ignored.
TextTools
XFigGraphicDevice
 XFigGraphicDevice.h:60: Warning(401): Nothing known about base class 'GraphicDevice'. Ignored.

Everything compiled--these were all warnings. Many of them are just inheritance problems. Some are for other classes in utils (e.g. GraphicDevice, Clonable), and others are for standard C++ classes (e.g. string, exception). This should be easy to fix.
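
A list like this gets easier to read once it is tallied. The following is a minimal sketch, assuming the warning format is exactly as pasted in this post (file:line: Warning(code): message). The function names tally_warnings and missing_bases are hypothetical helpers, not part of SWIG.

```python
import re
from collections import Counter

# Matches lines shaped like the ones in this post, e.g.
#   Font.h:55: Warning(401): Nothing known about base class 'Clonable'. Ignored.
WARNING_RE = re.compile(r"^\s*([^:\s]+):(\d+): Warning\((\d+)\): (.*)$")
BASE_RE = re.compile(r"Nothing known about base class '([^']+)'")


def tally_warnings(text):
    """Count warnings per numeric code; lines that are not warnings are skipped."""
    counts = Counter()
    for line in text.splitlines():
        match = WARNING_RE.match(line)
        if match:
            counts[int(match.group(3))] += 1
    return counts


def missing_bases(text):
    """Return the sorted, de-duplicated base classes named in the warnings."""
    return sorted(set(BASE_RE.findall(text)))


if __name__ == "__main__":
    sample = """\
BppString
 BppString.h:57: Warning(401): Nothing known about base class 'string'. Ignored.
 BppString.h:57: Warning(401): Nothing known about base class 'Clonable'. Ignored.
Number
 Number.h:75: Warning(362): operator= ignored
"""
    print(tally_warnings(sample))
    print(missing_bases(sample))
```

The list of missing base classes is effectively a to-do list of headers that each interface file still needs to be told about.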

What I'm more worried about is the template stuff. Also strange are the warnings about the [] and = operators. For the templates, SWIG suggests using its %template directive. That's a good place to start looking.

Finally, I feel like I'm moving forward again.

Wednesday, June 10, 2009

Backing Up...

Yesterday was a sick day. Got food poisoning from a party over the weekend. Not fun.

Today spent a lot of time trying to figure out why SWIG only gives me a single error line, without any substantial results. Also got reminded by Hilmar to e-mail weekly updates to everyone, and so did so.  (Thought the blog sufficed, but nope.) But then David offered to help, and now we have a discussion going. This might prove to be my lifeline.

Tomorrow's project is to start some smaller projects with SWIG. I've managed to SWIGify some very basic code, but I'm going to try a bit bigger now, and see what I get.

Saturday, June 6, 2009

Git Working

I think I have Git under control. I have my own branch (biopp) inside of my own fork, and I'm successfully compiling things with very little effort. Pushing and pulling all seems to work, as do Pjotr's CMake macros.

The downside--tried using SWIG on more .i files: this time from NumCalc. (Aside: it occurred to me that I don't necessarily need to actually SWIGize Utils, since no one calls it directly, anyway.) The result was exactly the same. Something's not working, somewhere, and I still don't know why. I've sent queries to both the CMake and SWIG mailing lists, but so far no one's replied. Gotta keep plugging, and hope the answer presents itself soon.

Friday, June 5, 2009

God Dammit

I was using Git wrong. This is why many of Pjotr's messages have been confusing to me, and no doubt many of mine have been for him. This is all supposed to be done in a branch of biolib, not in its own tree altogether.

Of course, now that I've done things right, I can't figure out how to push back up. Somewhere, Git is just crashing out. Looking for help online seemed to indicate that there's a mismatch between the version I pulled, and what I'm trying to push. But nope--repulling doesn't help.

Meanwhile, I did manage to recreate my .i files. And guess what? SWIG still claims there's a syntax error, right on the same line.

Bleah. New day tomorrow.

Thursday, June 4, 2009

Enter SWIG

Successfully used SWIG on an example file lifted from the SWIG documentation. It took quite an effort to get it to work, and I'm not really sure what I did to make it function. I just deleted everything and tried again, and maybe I got the order of commands correct. Don't know, and that disturbs me. I don't like inconsistent programs. (Or, more accurately, programs that are perfectly consistent but whose patterns I can't recognize.)

Also managed to use CMake on utils in order to make a bunch of .i files for SWIG. However, I was unable to get SWIG to treat them properly. It kept complaining of a syntax error. The line number varied depending on the exact .i file, but the offending line is always:

namespace std __attribute__ ((__visibility__ ("default"))) {

I'm afraid that I'm going to need to dig deep into the SWIG docs to figure out what the hell this is. Not what I wanted, but oh well.

Wednesday, June 3, 2009

Decent Amount

Actually got a fair amount done today.

  1. Used CMake to make an installation shared library, in my /usr/local/lib/ directory. Incorporated that into the test code I got working yesterday, and it worked perfectly. (I was actually kind of surprised that it did.)
  2. Used code from the CMake FAQ (http://www.cmake.org/Wiki/CMake_FAQ) to modify CMakeLists.txt to do SWIG mappings. Took some effort, and it didn't work. While CMake ran fine, make complained about missing targets. The code they have up there doesn't have much in the way of comments to help me figure out what went wrong.
  3. Learned another trick in Git: downloading submodules. Pjotr has written some packages for biolib to help with the SWIG mapping. With luck, that'll help me do what I need to. Tomorrow I dive fully into trying to understand that.

Tuesday, June 2, 2009

Yesterday's Progress

Was out last night, and didn't get back until late. Hence, the post-facto update now.

Actually managed to use Bio++ for the first time, yesterday. Got some code up and running that calls the library to make a calculation. This was harder than it should have been, because of dependency hell issues.

Also, managed to push my changes to the project (i.e. the addition of CMakeLists.txt) up to github. So I'm making slow progress there, too.

Things still going more slowly than I want them to, though. Hoping that this is just a rough patch that will be over soon.

Saturday, May 30, 2009

More CMake

Printed out the complete documentation, and have been reading that. Have also managed to get .so files out of it. So I'm making progress...

Friday, May 29, 2009

CMake Progress

I think I managed to compile utils with CMake, but I'm not 100% sure. It all came out with .a archive files, which seems to be CMake's default library file format. Trying to figure out how to do the .o, .lo, and .la files that the more traditional make configuration uses. Or perhaps it's not really necessary. Don't know. Figuring that out is the next step.

Otherwise, just spent today running through CMake tutorials of various kinds. Beginning to get a grip on how it all works. They're probably dreading me on the mailing list though, since I sound a bit like a naive idiot.

Thursday, May 28, 2009

And onto CMake...

Docs for SWIG mention CMake explicitly. Given that Pjotr also mentioned it, I'm thinking that the next step might be to try to get everything compiling with that. If I can do that, SWIG might follow really easily.

Main problem right now: CMake's docs are woefully vague. My main complaint is that their "basic tutorial" is written all in the passive voice, making it sound like everything is done automatically. (I suppose you probably have to at least think about it working, first.) Have just written to their mailing list asking for clarification. We'll see if they thought I was too bitchy. I tried to be nice, but I'm tired and annoyed right now.

It's nearing 2:00. Time to sleep.

Wednesday, May 27, 2009

SWIG!

Have downloaded SWIG, and it seems to be working. Now looking through documentation and examples. This is going to be a bit of a slog...

I worry a bit how far we're going to be able to get with SWIG. If the latest version of gcc had problems compiling due to scope issues, it seems that SWIG might choke as well. Oh well--no place to go but forward.

Compiling is all done. All five libraries are good. Took a little bit to get them to recognize each other, but wasn't too bad.

Tuesday, May 26, 2009

Success!

Had to add a couple of packages to my installation (automake and libtool), but I finally got autogen.sh to work.  So...I have working code now.  First step has been accomplished.

Problems Getting Started

A little annoyed right now with (non-) progress.

Essentially, the code downloadable from http://kimura.univ-montp2.fr/BioPP/ doesn't compile. There's a known issue with the most recent version of gcc, which is somehow a little more strict than previous versions.  There are some scoping issues, which make the thing just plain not work. They know about this, but won't have the fix up until June.

It is, apparently, fixed in the "CVS" version, which I haven't gotten the chance to find yet. Pjotr set up a Git repository as I was about to start searching, and I assume that that was the same thing. However, I can't get that to compile either. It has no make file. Tried using the one from the other version, in various combinations, but couldn't get it going.

At this point I backed off, and read a bunch of material on Git. If nothing else, I feel like I understand that better.

So...current plan of attack. First, find the explicit CVS version, and make sure that that's the same as our Git version. Maybe it even has a make file in there that'll work. Second, Pjotr suggests looking at cmake, of which I hadn't heard. Looked at it briefly, and maybe it'll work better than make. The long holiday weekend is behind us, and I'll get to focus on this in the morning.

Monday, May 25, 2009

Blog Created

Here is the first entry of this new blog, recording my progress on using SWIG to translate Bio++ to Java, R, and Python.